Rethinking Causal Relationships Learning in Graph Neural Networks

Authors: Hang Gao, Chengyu Yao, Jiangmeng Li, Lingyu Si, Yifan Jin, Fengge Wu, Changwen Zheng, Huaping Liu

Abstract: Graph Neural Networks (GNNs) demonstrate their significance by effectively modeling complex interrelationships within graph-structured data. To enhance the credibility and robustness of GNNs, it becomes exceptionally crucial to bolster their ability to capture causal relationships. However, despite recent advancements that have indeed strengthened GNNs from a causal learning perspective, conducting an in-depth analysis specifically targeting the causal modeling prowess of GNNs remains an unresolved issue. In order to comprehensively analyze various GNN models from a causal learning perspective, we constructed an artificially synthesized dataset with known and controllable causal relationships between data and labels. The rationality of the generated data is further ensured through theoretical foundations. Drawing insights from analyses conducted using our dataset, we introduce a lightweight and highly adaptable GNN module designed to strengthen GNNs' causal learning capabilities across a diverse range of tasks. Through a series of experiments conducted on both synthetic datasets and other real-world datasets, we empirically validate the effectiveness of the proposed module.

Submitted 15 December, 2023; originally announced December 2023.

arXiv:2208.08584 [pdf, other]

Robust Causal Graph Representation Learning against Confounding Effects

Authors: Hang Gao, Jiangmeng Li, Wenwen Qiang, Lingyu Si, Bing Xu, Changwen Zheng, Fuchun Sun

Abstract: The prevailing graph neural network models have achieved significant progress in graph representation learning. However, in this paper, we uncover an ever-overlooked phenomenon: the pre-trained graph representation learning model tested with full graphs underperforms the model tested with well-pruned graphs. This observation reveals that there exist confounders in graphs, which may interfere with the model learning semantic information, and current graph representation learning methods have not eliminated their influence. To tackle this issue, we propose Robust Causal Graph Representation Learning (RCGRL) to learn robust graph representations against confounding effects. RCGRL introduces an active approach to generate instrumental variables under unconditional moment restrictions, which empowers the graph representation learning model to eliminate confounders, thereby capturing discriminative information that is causally related to downstream predictions. We offer theorems and proofs to guarantee the theoretical effectiveness of the proposed approach. Empirically, we conduct extensive experiments on a synthetic dataset and multiple benchmark datasets. The results demonstrate that compared with state-of-the-art methods, RCGRL achieves better prediction performance and generalization ability.

Submitted 10 February, 2023; v1 submitted 17 August, 2022; originally announced August 2022.

Comments: Accepted by AAAI 2023 as Oral Presentation

arXiv:2104.14281 [pdf]

doi 10.1177/20552076221089092

Leveraging Online Shopping Behaviors as a Proxy for Personal Lifestyle Choices: New Insights into Chronic Disease Prevention Literacy

Authors: Yongzhen Wang, Xiaozhong Liu, Katy Börner, Jun Lin, Yingnan Ju, Changlong Sun, Luo Si

Abstract: Objective: Ubiquitous internet access is reshaping the way we live, but it is accompanied by unprecedented challenges in preventing chronic diseases that are usually planted by long exposure to unhealthy lifestyles. This paper proposes leveraging online shopping behaviors as a proxy for personal lifestyle choices to improve chronic disease prevention literacy, targeted for times when e-commerce user experience has been assimilated into most people's everyday lives. Methods: Longitudinal query logs and purchase records from 15 million online shoppers were accessed, constructing a broad spectrum of lifestyle features covering various product categories and buyer personas. Using the lifestyle-related information preceding online shoppers' first purchases of specific prescription drugs, we could determine associations between their past lifestyle choices and whether they suffered from a particular chronic disease. Results: Novel lifestyle risk factors were discovered in two exemplars--depression and type 2 diabetes, most of which showed reasonable consistency with existing healthcare knowledge. Further, such empirical findings could be adopted to locate online shoppers at higher risk of these chronic diseases with decent accuracy [i.e., (area under the receiver operating characteristic curve) AUC=0.68 for depression and AUC=0.70 for type 2 diabetes], closely matching the performance of screening surveys benchmarked against medical diagnosis. Conclusions: Mining online shopping behaviors can point medical experts to a series of lifestyle issues associated with chronic diseases that are less explored to date. Hopefully, unobtrusive chronic disease surveillance via e-commerce sites can grant consenting individuals a privilege to be connected more readily with the medical profession and sophistication.

Submitted 9 March, 2022; v1 submitted 29 April, 2021; originally announced April 2021.

Comments: 58 pages with appendices, 5 figures, 17 tables

arXiv:1912.05977 [pdf, other]

Tracing the Propagation Path: A Flow Perspective of Representation Learning on Graphs

Authors: Menghan Wang, Kun Zhang, Gulin Li, Keping Yang, Luo Si

Abstract: Graph Convolutional Networks (GCNs) have gained significant developments in representation learning on graphs. However, current GCNs suffer from two common challenges: 1) GCNs are only effective with shallow structures; stacking multiple GCN layers will lead to over-smoothing. 2) GCNs do not scale well with large, dense graphs due to the recursive neighborhood expansion. We generalize the propagation strategies of current GCNs as a \emph{"Sink$\to$Source"} mode, which seems to be an underlying cause of the two challenges. To address these issues intrinsically, in this paper, we study the information propagation mechanism in a \emph{"Source$\to$Sink"} mode. We introduce a new concept "information flow path" that explicitly defines where information originates and how it diffuses. Then a novel framework, namely Flow Graph Network (FlowGN), is proposed to learn node representations. FlowGN is computationally efficient and flexible in propagation strategies. Moreover, FlowGN decouples the layer structure from the information propagation process, removing the interior constraint of applying deep structures in traditional GCNs. Further experiments on public datasets demonstrate the superiority of FlowGN against state-of-the-art GCNs.

Submitted 12 December, 2019; originally announced December 2019.

arXiv:1805.10727 [pdf, other]

Perceive Your Users in Depth: Learning Universal User Representations from Multiple E-commerce Tasks

Authors: Yabo Ni, Dan Ou, Shichen Liu, Xiang Li, Wenwu Ou, Anxiang Zeng, Luo Si

Abstract: Tasks such as search and recommendation have become increasingly important for E-commerce to deal with the information overload problem. To meet the diverse needs of different users, personalization plays an important role. In many large portals such as Taobao and Amazon, there are a bunch of different types of search and recommendation tasks operating simultaneously for personalization. However, most of current techniques address each task separately. This is suboptimal as no information about users shared across different tasks. In this work, we propose to learn universal user representations across multiple tasks for more effective personalization. In particular, user behavior sequences (e.g., click, bookmark or purchase of products) are modeled by LSTM and attention mechanism by integrating all the corresponding content, behavior and temporal information. User representations are shared and learned in an end-to-end setting across multiple tasks. Benefiting from better information utilization of multiple tasks, the user representations are more effective to reflect their interests and are more general to be transferred to new tasks. We refer this work as Deep User Perception Network (DUPN) and conduct an extensive set of offline and online experiments. Across all tested five different tasks, our DUPN consistently achieves better results by giving more effective user representations. Moreover, we deploy DUPN in large scale operational tasks in Taobao. Detailed implementations, e.g., incremental model updating, are also provided to address the practical issues for the real world applications.

Submitted 27 May, 2018; originally announced May 2018.

Comments: 10 pages, accepted an oral paper in sigKDD2018(industry track)

arXiv:1706.02093 [pdf, other]

doi 10.1145/3097983.3098011

Cascade Ranking for Operational E-commerce Search

Authors: Shichen Liu, Fei Xiao, Wenwu Ou, Luo Si

Abstract: In the 'Big Data' era, many real-world applications like search involve the ranking problem for a large number of items. It is important to obtain effective ranking results and at the same time obtain the results efficiently in a timely manner for providing good user experience and saving computational costs. Valuable prior research has been conducted for learning to efficiently rank like the cascade ranking (learning) model, which uses a sequence of ranking functions to progressively filter some items and rank the remaining items. However, most existing research of learning to efficiently rank in search is studied in a relatively small computing environments with simulated user queries. This paper presents novel research and thorough study of designing and deploying a Cascade model in a Large-scale Operational E-commerce Search application (CLOES), which deals with hundreds of millions of user queries per day with hundreds of servers. The challenge of the real-world application provides new insights for research: 1). Real-world search applications often involve multiple factors of preferences or constraints with respect to user experience and computational costs such as search accuracy, search latency, size of search results and total CPU cost, while most existing search solutions only address one or two factors; 2). Effectiveness of e-commerce search involves multiple types of user behaviors such as click and purchase, while most existing cascade ranking in search only models the click behavior. Based on these observations, a novel cascade ranking model is designed and deployed in an operational e-commerce search application. An extensive set of experiments demonstrate the advantage of the proposed work to address multiple factors of effectiveness, efficiency and user experience in the real-world application.
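
Editorial illustration (not part of the arXiv record): the abstract describes cascade ranking as a sequence of ranking functions that progressively filter items and rank the remainder. A minimal sketch of that general idea follows. It is not the CLOES model; the names `cascade_rank`, `catalog`, `cheap`, and `costly`, the toy scores, and the per-stage cutoffs are all hypothetical.

```python
from typing import Callable, List, Sequence, Tuple

Item = str
Scorer = Callable[[Item], float]


def cascade_rank(items: Sequence[Item],
                 stages: Sequence[Tuple[Scorer, int]]) -> List[Item]:
    """Run each (scorer, keep) stage in order.

    Every stage sorts the current survivors by its own score (highest
    first) and passes only the top `keep` items to the next stage.
    """
    survivors = list(items)
    for scorer, keep in stages:
        survivors = sorted(survivors, key=scorer, reverse=True)[:keep]
    return survivors


# Hypothetical toy data: item -> (cheap score, costly score).
catalog = {
    "a": (0.9, 0.2),
    "b": (0.8, 0.9),
    "c": (0.1, 1.0),
    "d": (0.7, 0.5),
}


def cheap(item: Item) -> float:
    # Stand-in for an inexpensive first-stage ranking function.
    return catalog[item][0]


def costly(item: Item) -> float:
    # Stand-in for an expensive, more accurate later-stage function.
    return catalog[item][1]


# Stage 1 keeps 3 of 4 items using the cheap score;
# stage 2 keeps 2 of those 3 using the costly score.
result = cascade_rank(list(catalog), [(cheap, 3), (costly, 2)])
print(result)  # ['b', 'd']
```

In this toy run, item "c" has the highest costly score but is dropped by the cheap first stage, which illustrates the accuracy-versus-cost trade-off a cascade accepts: the expensive scorer is evaluated on three items instead of four.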

Submitted 7 June, 2017; originally announced June 2017.

arXiv:1411.5732 [pdf]

A Joint Probabilistic Classification Model of Relevant and Irrelevant Sentences in Mathematical Word Problems

Authors: Suleyman Cetintas, Luo Si, Yan Ping Xin, Dake Zhang, Joo Young Park, Ron Tzur

Abstract: Estimating the difficulty level of math word problems is an important task for many educational applications. Identification of relevant and irrelevant sentences in math word problems is an important step for calculating the difficulty levels of such problems. This paper addresses a novel application of text categorization to identify two types of sentences in mathematical word problems, namely relevant and irrelevant sentences. A novel joint probabilistic classification model is proposed to estimate the joint probability of classification decisions for all sentences of a math word problem by utilizing the correlation among all sentences along with the correlation between the question sentence and other sentences, and sentence text. The proposed model is compared with i) a SVM classifier which makes independent classification decisions for individual sentences by only using the sentence text and ii) a novel SVM classifier that considers the correlation between the question sentence and other sentences along with the sentence text. An extensive set of experiments demonstrates the effectiveness of the joint probabilistic classification model for identifying relevant and irrelevant sentences as well as the novel SVM classifier that utilizes the correlation between the question sentence and other sentences. Furthermore, empirical results and analysis show that i) it is highly beneficial not to remove stopwords and ii) utilizing part of speech tagging does not make a significant improvement although it has been shown to be effective for the related task of math word problem type classification.

Submitted 20 November, 2014; originally announced November 2014.

Comments: appears in Journal of Educational Data Mining (JEDM, 2010)

arXiv:1207.4146 [pdf]

A Bayesian Approach toward Active Learning for Collaborative Filtering

Authors: Rong Jin, Luo Si

Abstract: Collaborative filtering is a useful technique for exploiting the preference patterns of a group of users to predict the utility of items for the active user. In general, the performance of collaborative filtering depends on the number of rated examples given by the active user. The more the number of rated examples given by the active user, the more accurate the predicted ratings will be. Active learning provides an effective way to acquire the most informative rated examples from active users. Previous work on active learning for collaborative filtering only considers the expected loss function based on the estimated model, which can be misleading when the estimated model is inaccurate. This paper takes one step further by taking into account of the posterior distribution of the estimated model, which results in more robust active learning algorithm. Empirical studies with datasets of movie ratings show that when the number of ratings from the active user is restricted to be small, active learning methods only based on the estimated model don't perform well while the active learning method using the model distribution achieves substantially better performance.

Submitted 11 July, 2012; originally announced July 2012.

Comments: Appears in Proceedings of the Twentieth Conference on Uncertainty in Artificial Intelligence (UAI2004)

Report number: UAI-P-2004-PG-278-285

arXiv:1204.2311 [pdf, other]

Robust Nonnegative Matrix Factorization via $L_1$ Norm Regularization

Authors: Bin Shen, Luo Si, Rongrong Ji, Baodi Liu

Abstract: Nonnegative Matrix Factorization (NMF) is a widely used technique in many applications such as face recognition, motion segmentation, etc. It approximates the nonnegative data in an original high dimensional space with a linear representation in a low dimensional space by using the product of two nonnegative matrices. In many applications data are often partially corrupted with large additive noise. When the positions of noise are known, some existing variants of NMF can be applied by treating these corrupted entries as missing values. However, the positions are often unknown in many real world applications, which prevents the usage of traditional NMF or other existing variants of NMF. This paper proposes a Robust Nonnegative Matrix Factorization (RobustNMF) algorithm that explicitly models the partial corruption as large additive noise without requiring the information of positions of noise. In practice, large additive noise can be used to model outliers. In particular, the proposed method jointly approximates the clean data matrix with the product of two nonnegative matrices and estimates the positions and values of outliers/noise. An efficient iterative optimization algorithm with a solid theoretical justification has been proposed to learn the desired matrix factorization. Experimental results demonstrate the advantages of the proposed algorithm.

Submitted 10 April, 2012; originally announced April 2012.

Showing 1–9 of 9 results for author: Si, L