Search | arXiv e-print repository

Unifying Unsupervised Graph-Level Anomaly Detection and Out-of-Distribution Detection: A Benchmark

Authors: Yili Wang, Yixin Liu, Xu Shen, Chenyu Li, Kaize Ding, Rui Miao, Ying Wang, Shirui Pan, Xin Wang

Abstract: To build safe and reliable graph machine learning systems, unsupervised graph-level anomaly detection (GLAD) and unsupervised graph-level out-of-distribution (OOD) detection (GLOD) have received significant attention in recent years. Though those two lines of research indeed share the same objective, they have been studied independently in the community due to distinct evaluation setups, creating a gap that hinders the application and evaluation of methods from one to the other. To bridge the gap, in this work, we present a Unified Benchmark for unsupervised Graph-level OOD and anomaly Detection (our method), a comprehensive evaluation framework that unifies GLAD and GLOD under the concept of generalized graph-level OOD detection. Our benchmark encompasses 35 datasets spanning four practical anomaly and OOD detection scenarios, facilitating the comparison of 16 representative GLAD/GLOD methods. We conduct multi-dimensional analyses to explore the effectiveness, generalizability, robustness, and efficiency of existing methods, shedding light on their strengths and limitations. Furthermore, we provide an open-source codebase (https://github.com/UB-GOLD/UB-GOLD) of our method to foster reproducible research and outline potential directions for future investigations based on our insights.

Submitted 21 June, 2024; originally announced June 2024.

arXiv:2405.16730 [pdf, other]

Latent Energy-Based Odyssey: Black-Box Optimization via Expanded Exploration in the Energy-Based Latent Space

Authors: Peiyu Yu, Dinghuai Zhang, Hengzhi He, Xiaojian Ma, Ruiyao Miao, Yifan Lu, Yasi Zhang, Deqian Kong, Ruiqi Gao, Jianwen Xie, Guang Cheng, Ying Nian Wu

Abstract: Offline Black-Box Optimization (BBO) aims at optimizing a black-box function using the knowledge from a pre-collected offline dataset of function values and corresponding input designs. However, the high-dimensional and highly-multimodal input design space of black-box functions poses inherent challenges for most existing methods that model and operate directly upon input designs. These issues include but are not limited to high sample complexity, which relates to inaccurate approximation of the black-box function; and insufficient coverage and exploration of input design modes, which leads to suboptimal proposal of new input designs. In this work, we consider finding a latent space that serves as a compressed yet accurate representation of the design-value joint space, enabling effective latent exploration of high-value input design modes. To this end, we formulate a learnable energy-based latent space, and propose a Noise-intensified Telescoping density-Ratio Estimation (NTRE) scheme for variational learning of an accurate latent space model without costly Markov Chain Monte Carlo. The optimization process is then the exploration of high-value designs guided by the learned energy-based model in the latent space, formulated as gradient-based sampling from a latent-variable-parameterized inverse model. We show that our particular parameterization encourages expanded exploration around high-value design modes, motivated by inversion thinking of a fundamental result of conditional covariance matrix typically used for variance reduction. We observe that our method, backed by an accurately learned informative latent space and an expanding-exploration model design, yields significant improvements over strong previous methods on both synthetic and real-world datasets such as the design-bench suite.

Submitted 26 May, 2024; originally announced May 2024.

arXiv:2302.12670 [pdf, ps, other]

Personalized Pricing with Invalid Instrumental Variables: Identification, Estimation, and Policy Learning

Authors: Rui Miao, Zhengling Qi, Cong Shi, Lin Lin

Abstract: Pricing based on individual customer characteristics is widely used to maximize sellers' revenues. This work studies offline personalized pricing under endogeneity using an instrumental variable approach. Standard instrumental variable methods in causal inference/econometrics either focus on a discrete treatment space or require the exclusion restriction of instruments from having a direct effect on the outcome, which limits their applicability in personalized pricing. In this paper, we propose a new policy learning method for Personalized pRicing using Invalid iNsTrumental variables (PRINT) for continuous treatment that allow direct effects on the outcome. Specifically, relying on the structural models of revenue and price, we establish the identifiability condition of an optimal pricing strategy under endogeneity with the help of invalid instrumental variables. Based on this new identification, which leads to solving conditional moment restrictions with generalized residual functions, we construct an adversarial min-max estimator and learn an optimal pricing strategy. Furthermore, we establish an asymptotic regret bound to find an optimal pricing strategy. Finally, we demonstrate the effectiveness of the proposed method via extensive simulation studies as well as a real data application from a US online auto loan company.

Submitted 24 February, 2023; originally announced February 2023.

arXiv:2209.10064 [pdf, other]

Off-Policy Evaluation for Episodic Partially Observable Markov Decision Processes under Non-Parametric Models

Authors: Rui Miao, Zhengling Qi, Xiaoke Zhang

Abstract: We study the problem of off-policy evaluation (OPE) for episodic Partially Observable Markov Decision Processes (POMDPs) with continuous states. Motivated by the recently proposed proximal causal inference framework, we develop a non-parametric identification result for estimating the policy value via a sequence of so-called V-bridge functions with the help of time-dependent proxy variables. We then develop a fitted-Q-evaluation-type algorithm to estimate V-bridge functions recursively, where a non-parametric instrumental variable (NPIV) problem is solved at each step. By analyzing this challenging sequential NPIV problem, we establish the finite-sample error bounds for estimating the V-bridge functions and accordingly that for evaluating the policy value, in terms of the sample size, length of horizon and so-called (local) measure of ill-posedness at each step. To the best of our knowledge, this is the first finite-sample error bound for OPE in POMDPs under non-parametric models.

Submitted 16 October, 2022; v1 submitted 20 September, 2022; originally announced September 2022.

arXiv:2206.11384 [pdf, other]

A joint latent class model of longitudinal and survival data with a time-varying membership probability

Authors: Ruoyu Miao, Christiana Charalambous

Abstract: Joint latent class modelling has been developed considerably in the past two decades. In some instances, the models are linked by the latent class k (i.e. the number of subgroups), in others they are joined by shared random effects or a heterogeneous random covariance matrix. We propose an extension to the joint latent class model (JLCM) in which probabilities of subjects being in latent class k can be set to vary with time. This can be a more flexible way to analyse the effect of treatments on patients. For example, a patient may be in period I at the first visit time and may move to period II at the second visit time, implying the treatment the patient had before might be noneffective at the following visit time. For a dataset with these particular features, the joint latent class model which allows jumps among different subgroups can potentially provide more information as well as more accurate estimation and prediction results compared to the basic JLCM. A Bayesian approach is used to do the estimation and a DIC criterion is used to decide the optimal number of classes. Simulation results indicate that the proposed model produces accurate results and the time-varying JLCM outperforms the basic JLCM. We also illustrate the performance of our proposed JLCM on the aids data (Goldman et al., 1996).

Submitted 1 March, 2023; v1 submitted 22 June, 2022; originally announced June 2022.

Comments: 27 pages, 9 figures

MSC Class: 62H30(Primary); 62N02 (Secondary) ACM Class: G.3; I.6.0

arXiv:2206.05093 [pdf, other]

Federated Momentum Contrastive Clustering

Authors: Runxuan Miao, Erdem Koyuncu

Abstract: We present federated momentum contrastive clustering (FedMCC), a learning framework that can not only extract discriminative representations over distributed local data but also perform data clustering. In FedMCC, a transformed data pair passes through both the online and target networks, resulting in four representations over which the losses are determined. The resulting high-quality representations generated by FedMCC can outperform several existing self-supervised learning methods for linear evaluation and semi-supervised learning tasks. FedMCC can easily be adapted to ordinary centralized clustering through what we call momentum contrastive clustering (MCC). We show that MCC achieves state-of-the-art clustering accuracy results in certain datasets such as STL-10 and ImageNet-10. We also present a method to reduce the memory footprint of our clustering schemes.

Submitted 10 June, 2022; originally announced June 2022.

Comments: Originally submitted March 2022

arXiv:2105.01187 [pdf, ps, other]

Proximal Learning for Individualized Treatment Regimes Under Unmeasured Confounding

Authors: Zhengling Qi, Rui Miao, Xiaoke Zhang

Abstract: Data-driven individualized decision making has recently received increasing research interest. Most existing methods rely on the assumption of no unmeasured confounding, which unfortunately cannot be ensured in practice, especially in observational studies. Motivated by the recently proposed proximal causal inference, we develop several proximal learning approaches to estimating optimal individualized treatment regimes (ITRs) in the presence of unmeasured confounding. In particular, we establish several identification results for different classes of ITRs, exhibiting the trade-off between the risk of making untestable assumptions and the value function improvement in decision making. Based on these results, we propose several classification-based approaches to finding a variety of restricted in-class optimal ITRs and develop their theoretical properties. The appealing numerical performance of our proposed methods is demonstrated via an extensive simulation study and one real data application.

Submitted 22 December, 2022; v1 submitted 3 May, 2021; originally announced May 2021.

arXiv:2101.06388 [pdf, other]

Informative core identification in complex networks

Authors: Ruizhong Miao, Tianxi Li

Abstract: In network analysis, the core structure of modeling interest is usually hidden in a larger network in which most structures are not informative. The noise and bias introduced by the non-informative component in networks can obscure the salient structure and limit many network modeling procedures' effectiveness. This paper introduces a novel core-periphery model for the non-informative periphery structure of networks without imposing a specific form for the informative core structure. We propose spectral algorithms for core identification as a data preprocessing step for general downstream network analysis tasks based on the model. The algorithm enjoys a strong theoretical guarantee of accuracy and is scalable for large networks. We evaluate the proposed method by extensive simulation studies demonstrating various advantages over many traditional core-periphery methods. The method is applied to extract the informative core structure from a citation network and gives more informative results in the downstream hierarchical community detection.

Submitted 16 January, 2021; originally announced January 2021.

arXiv:2009.11452 [pdf, ps, other]

A Wavelet-Based Independence Test for Functional Data with an Application to MEG Functional Connectivity

Authors: Rui Miao, Xiaoke Zhang, Raymond K. W. Wong

Abstract: Measuring and testing the dependency between multiple random functions is often an important task in functional data analysis. In the literature, a model-based method relies on a model which is subject to the risk of model misspecification, while a model-free method only provides a correlation measure which is inadequate to test independence. In this paper, we adopt the Hilbert-Schmidt Independence Criterion (HSIC) to measure the dependency between two random functions. We develop a two-step procedure by first pre-smoothing each function based on its discrete and noisy measurements and then applying the HSIC to recovered functions. To ensure the compatibility between the two steps such that the effect of the pre-smoothing error on the subsequent HSIC is asymptotically negligible, we propose to use wavelet soft-thresholding for pre-smoothing and Besov-norm-induced kernels for HSIC. We also provide the corresponding asymptotic analysis. The superior numerical performance of the proposed method over existing ones is demonstrated in a simulation study. Moreover, in a magnetoencephalography (MEG) data application, the functional connectivity patterns identified by the proposed method are more anatomically interpretable than those by existing methods.

Submitted 23 September, 2020; originally announced September 2020.

arXiv:2004.06166 [pdf, ps, other]

Average Treatment Effect Estimation in Observational Studies with Functional Covariates

Authors: Rui Miao, Wu Xue, Xiaoke Zhang

Abstract: Functional data analysis is an important area in modern statistics and has been successfully applied in many fields. Although many scientific studies aim to find causations, a predominant majority of functional data analysis approaches can only reveal correlations. In this paper, average treatment effect estimation is studied for observational data with functional covariates. This paper generalizes various state-of-the-art propensity score estimation methods for multivariate data to functional data. The resulting average treatment effect estimators via propensity score weighting are numerically evaluated by a simulation study and applied to a real-world dataset to study the causal effect of duloxetine on the pain relief of chronic knee osteoarthritis patients.

Submitted 9 July, 2020; v1 submitted 13 April, 2020; originally announced April 2020.

Comments: Section 3.1.1: added discussions and Remark 1.3; Section 3.1.2: added Eq. (5) and related discussions; Sections 5 and 6: added discussions

MSC Class: 62P10 (Primary) 62G05 (Secondary)

Showing 1–10 of 10 results for author: Miao, R