Showing 1–50 of 213 results for author: Liu, T

Searching in archive stat.
  1. arXiv:2407.01606  [pdf, other]

    cs.LG cs.AI cs.CL cs.CV stat.ML

    On Discrete Prompt Optimization for Diffusion Models

    Authors: Ruochen Wang, Ting Liu, Cho-Jui Hsieh, Boqing Gong

    Abstract: This paper introduces the first gradient-based framework for prompt optimization in text-to-image diffusion models. We formulate prompt engineering as a discrete optimization problem over the language space. Two major challenges arise in efficiently finding a solution to this problem: (1) Enormous Domain Space: Setting the domain to the entire language space poses significant difficulty to the opt…

    Submitted 26 June, 2024; originally announced July 2024.

    Comments: ICML 2024. Code available at https://github.com/ruocwang/dpo-diffusion

    MSC Class: 68T01

    Journal ref: Proceedings of the 41st International Conference on Machine Learning (ICML 2024)

  2. arXiv:2405.00917  [pdf, other]

    stat.ME

    Semiparametric mean and variance joint models with clipped-Laplace link functions for bounded integer-valued time series

    Authors: Tianqing Liu, Xiaohui Yuan

    Abstract: We present a novel approach for modeling bounded count time series data, by deriving accurate upper and lower bounds for the variance of a bounded count random variable while maintaining a fixed mean. Leveraging these bounds, we propose semiparametric mean and variance joint (MVJ) models utilizing a clipped-Laplace link function. These models offer a flexible and feasible structure for both mean a…

    Submitted 1 May, 2024; originally announced May 2024.

    Comments: arXiv admin note: text overlap with arXiv:2404.18421

  3. arXiv:2404.18421  [pdf, other]

    stat.ME math.ST

    Semiparametric mean and variance joint models with Laplace link functions for count time series

    Authors: Tianqing Liu, Xiaohui Yuan

    Abstract: Count time series data are frequently analyzed by modeling their conditional means, and the conditional variance is often considered to be a deterministic function of the corresponding conditional mean and is not typically modeled independently. We propose a semiparametric mean and variance joint model, called random rounded count-valued generalized autoregressive conditional heteroskedastic (RRC-G…

    Submitted 29 April, 2024; originally announced April 2024.

  4. arXiv:2403.08635  [pdf, other]

    cs.LG cs.AI stat.ML

    Human Alignment of Large Language Models through Online Preference Optimisation

    Authors: Daniele Calandriello, Daniel Guo, Remi Munos, Mark Rowland, Yunhao Tang, Bernardo Avila Pires, Pierre Harvey Richemond, Charline Le Lan, Michal Valko, Tianqi Liu, Rishabh Joshi, Zeyu Zheng, Bilal Piot

    Abstract: Ensuring alignment of language models' outputs with human preferences is critical to guarantee a useful, safe, and pleasant user experience. Thus, human alignment has been extensively studied recently and several methods such as Reinforcement Learning from Human Feedback (RLHF), Direct Policy Optimisation (DPO) and Sequence Likelihood Calibration (SLiC) have emerged. In this paper, our contributio…

    Submitted 13 March, 2024; originally announced March 2024.

  5. arXiv:2402.03941  [pdf, other]

    cs.LG cs.AI stat.ME

    Discovery of the Hidden World with Large Language Models

    Authors: Chenxi Liu, Yongqiang Chen, Tongliang Liu, Mingming Gong, James Cheng, Bo Han, Kun Zhang

    Abstract: Science originates with discovering new causal knowledge from a combination of known facts and observations. Traditional causal discovery approaches mainly rely on high-quality measured variables, usually given by human experts, to find causal relations. However, the causal variables are usually unavailable in a wide range of real-world applications. The rise of large language models (LLMs) that a…

    Submitted 6 February, 2024; originally announced February 2024.

    Comments: Preliminary version of an ongoing project; Chenxi and Yongqiang contributed equally; 26 pages, 41 figures; Project page: https://causalcoat.github.io/

  6. arXiv:2310.18910  [pdf, other]

    cs.LG cs.AI cs.CV stat.ML

    InstanT: Semi-supervised Learning with Instance-dependent Thresholds

    Authors: Muyang Li, Runze Wu, Haoyu Liu, Jun Yu, Xun Yang, Bo Han, Tongliang Liu

    Abstract: Semi-supervised learning (SSL) has been a fundamental challenge in machine learning for decades. The primary family of SSL algorithms, known as pseudo-labeling, involves assigning pseudo-labels to confident unlabeled instances and incorporating them into the training set. Therefore, the selection criteria of confident instances are crucial to the success of SSL. Recently, there has been growing in…

    Submitted 29 October, 2023; originally announced October 2023.

    Comments: Accepted as poster for NeurIPS 2023

  7. arXiv:2310.18286  [pdf, other]

    cs.LG stat.AP stat.ML

    Optimal Transport for Treatment Effect Estimation

    Authors: Hao Wang, Zhichao Chen, Jiajun Fan, Haoxuan Li, Tianqiao Liu, Weiming Liu, Quanyu Dai, Yichao Wang, Zhenhua Dong, Ruiming Tang

    Abstract: Estimating conditional average treatment effect from observational data is highly challenging due to the existence of treatment selection bias. Prevalent methods mitigate this issue by aligning distributions of different treatment groups in the latent space. However, there are two critical problems that these methods fail to address: (1) mini-batch sampling effects (MSE), which causes misalignment…

    Submitted 27 October, 2023; originally announced October 2023.

    Comments: Accepted as NeurIPS 2023 Poster

  8. arXiv:2310.13232  [pdf, other]

    stat.ME math.ST stat.ML

    Interaction Screening and Pseudolikelihood Approaches for Tensor Learning in Ising Models

    Authors: Tianyu Liu, Somabha Mukherjee

    Abstract: In this paper, we study two well known methods of Ising structure learning, namely the pseudolikelihood approach and the interaction screening approach, in the context of tensor recovery in $k$-spin Ising models. We show that both these approaches, with proper regularization, retrieve the underlying hypernetwork structure using a sample size logarithmic in the number of network nodes, and exponent…

    Submitted 19 October, 2023; originally announced October 2023.

    Comments: 17 pages, 5 figures

  9. arXiv:2310.07999  [pdf, other]

    cs.LG stat.ML

    LEMON: Lossless model expansion

    Authors: Yite Wang, Jiahao Su, Hanlin Lu, Cong Xie, Tianyi Liu, Jianbo Yuan, Haibin Lin, Ruoyu Sun, Hongxia Yang

    Abstract: Scaling of deep neural networks, especially Transformers, is pivotal for their surging performance and has further led to the emergence of sophisticated reasoning capabilities in foundation models. Such scaling generally requires training large models from scratch with random initialization, failing to leverage the knowledge acquired by their smaller counterparts, which are already resource-intens…

    Submitted 11 October, 2023; originally announced October 2023.

    Comments: Preprint

  10. arXiv:2307.01389  [pdf, other]

    cs.LG stat.ME

    Identification of Causal Relationship between Amyloid-beta Accumulation and Alzheimer's Disease Progression via Counterfactual Inference

    Authors: Haixing Dai, Mengxuan Hu, Qing Li, Lu Zhang, Lin Zhao, Dajiang Zhu, Ibai Diez, Jorge Sepulcre, Fan Zhang, Xingyu Gao, Manhua Liu, Quanzheng Li, Sheng Li, Tianming Liu, Xiang Li

    Abstract: Alzheimer's disease (AD) is a neurodegenerative disorder that begins with amyloidosis, followed by neuronal loss and deterioration in structure, function, and cognition. The accumulation of amyloid-beta in the brain, measured through 18F-florbetapir (AV45) positron emission tomography (PET) imaging, has been widely used for early diagnosis of AD. However, the relationship between amyloid-bet…

    Submitted 3 July, 2023; originally announced July 2023.

  11. arXiv:2306.14019  [pdf, other]

    stat.ME

    Instrumental Variable Approach to Estimating Individual Causal Effects in N-of-1 Trials: Application to ISTOP Study

    Authors: Kexin Qu, Christopher H. Schmid, Tao Liu

    Abstract: An N-of-1 trial is a multiple crossover trial conducted in a single individual to provide evidence to directly inform personalized treatment decisions. Advancements in wearable devices greatly improved the feasibility of adopting these trials to identify optimal individual treatment plans, particularly when treatments differ among individuals and responses are highly heterogeneous. Our work was mo…

    Submitted 24 June, 2023; originally announced June 2023.

  12. arXiv:2306.05751  [pdf, other]

    cs.LG stat.ME

    Advancing Counterfactual Inference through Nonlinear Quantile Regression

    Authors: Shaoan Xie, Biwei Huang, Bin Gu, Tongliang Liu, Kun Zhang

    Abstract: The capacity to address counterfactual "what if" inquiries is crucial for understanding and making use of causal influences. Traditional counterfactual inference, under Pearl's counterfactual framework, typically depends on having access to or estimating a structural causal model. Yet, in practice, this causal model is often unknown and might be challenging to identify. Hence, this paper aims to p…

    Submitted 27 February, 2024; v1 submitted 9 June, 2023; originally announced June 2023.

  13. arXiv:2305.14076  [pdf, other]

    math.ST cs.LG math.PR stat.CO stat.ML

    Towards Understanding the Dynamics of Gaussian-Stein Variational Gradient Descent

    Authors: Tianle Liu, Promit Ghosal, Krishnakumar Balasubramanian, Natesh S. Pillai

    Abstract: Stein Variational Gradient Descent (SVGD) is a nonparametric particle-based deterministic sampling algorithm. Despite its wide usage, understanding the theoretical properties of SVGD has remained a challenging problem. For sampling from a Gaussian target, the SVGD dynamics with a bilinear kernel will remain Gaussian as long as the initializer is Gaussian. Inspired by this fact, we undertake a deta…

    Submitted 27 October, 2023; v1 submitted 23 May, 2023; originally announced May 2023.

    Comments: NeurIPS 2023; 60 pages, 8 figures

  14. arXiv:2305.00876  [pdf, ps, other]

    cs.IT stat.ML

    Exactly Tight Information-Theoretic Generalization Error Bound for the Quadratic Gaussian Problem

    Authors: Ruida Zhou, Chao Tian, Tie Liu

    Abstract: We provide a new information-theoretic generalization error bound that is exactly tight (i.e., matching even the constant) for the canonical quadratic Gaussian (location) problem. Most existing bounds are order-wise loose in this setting, which has raised concerns about the fundamental capability of information-theoretic bounds in reasoning about the generalization behavior for machine learning. The pro…

    Submitted 12 November, 2023; v1 submitted 1 May, 2023; originally announced May 2023.

  15. Detector Design and Performance Analysis for Target Detection in Subspace Interference

    Authors: Weijian Liu, Jun Liu, Tao Liu, Hui Chen, Yong-Liang Wang

    Abstract: It is often difficult to obtain sufficient training data for adaptive signal detection, which is required to calculate the unknown noise covariance matrix. Additionally, interference is frequently present, which complicates the detection problem. We provide a two-step method, termed interference cancellation before detection (ICBD), to address the issue of signal detection in the unknown Gaussian no…

    Submitted 14 April, 2023; originally announced April 2023.

    Comments: This manuscript is submitted to IEEE SPL with paper ID SPL-35580-2023 and the decision "AQ - Publish In Minor, Required Changes"

  16. arXiv:2304.00530  [pdf, other]

    math.ST stat.ME

    Tensor Recovery in High-Dimensional Ising Models

    Authors: Tianyu Liu, Somabha Mukherjee, Rahul Biswas

    Abstract: The $k$-tensor Ising model is an exponential family on a $p$-dimensional binary hypercube for modeling dependent binary data, where the sufficient statistic consists of all $k$-fold products of the observations, and the parameter is an unknown $k$-fold tensor, designed to capture higher-order interactions between the binary variables. In this paper, we describe an approach based on a penalization…

    Submitted 23 July, 2023; v1 submitted 2 April, 2023; originally announced April 2023.

    Comments: 28 pages, 7 figures

  17. arXiv:2303.05506  [pdf, other]

    cs.LG cs.AI stat.ML

    TANGOS: Regularizing Tabular Neural Networks through Gradient Orthogonalization and Specialization

    Authors: Alan Jeffares, Tennison Liu, Jonathan Crabbé, Fergus Imrie, Mihaela van der Schaar

    Abstract: Despite their success with unstructured data, deep neural networks are not yet a panacea for structured tabular data. In the tabular domain, their efficiency crucially relies on various forms of regularization to prevent overfitting and provide strong generalization performance. Existing regularization techniques include broad modelling decisions such as choice of architecture, loss functions, and…

    Submitted 9 March, 2023; originally announced March 2023.

    Comments: Published at International Conference on Learning Representations (ICLR) 2023

  18. arXiv:2212.02125  [pdf, other]

    stat.ML cs.AI cs.LG

    TD3 with Reverse KL Regularizer for Offline Reinforcement Learning from Mixed Datasets

    Authors: Yuanying Cai, Chuheng Zhang, Li Zhao, Wei Shen, Xuyun Zhang, Lei Song, Jiang Bian, Tao Qin, Tieyan Liu

    Abstract: We consider an offline reinforcement learning (RL) setting where the agent needs to learn from a dataset collected by rolling out multiple behavior policies. There are two challenges for this setting: 1) The optimal trade-off between optimizing the RL signal and the behavior cloning (BC) signal changes on different states due to the variation of the action coverage induced by different behavior pol…

    Submitted 5 December, 2022; originally announced December 2022.

    Comments: Accepted by ICDM-22 (Best Student Paper Runner-Up Awards)

  19. arXiv:2211.06812  [pdf, other]

    cs.LG cs.DC stat.ML

    FedRule: Federated Rule Recommendation System with Graph Neural Networks

    Authors: Yuhang Yao, Mohammad Mahdi Kamani, Zhongwei Cheng, Lin Chen, Carlee Joe-Wong, Tianqiang Liu

    Abstract: Much of the value that IoT (Internet-of-Things) devices bring to "smart" homes lies in their ability to automatically trigger other devices' actions: for example, a smart camera triggering a smart lock to unlock a door. Manually setting up these rules for smart devices or applications, however, is time-consuming and inefficient. Rule recommendation systems can automatically suggest rules for use…

    Submitted 12 November, 2022; originally announced November 2022.

  20. arXiv:2211.06138  [pdf, other]

    cs.LG cs.CY stat.ML

    Practical Approaches for Fair Learning with Multitype and Multivariate Sensitive Attributes

    Authors: Tennison Liu, Alex J. Chan, Boris van Breugel, Mihaela van der Schaar

    Abstract: It is important to guarantee that machine learning algorithms deployed in the real world do not result in unfairness or unintended social consequences. Fair ML has largely focused on the protection of single attributes in the simpler setting where both attributes and target outcomes are binary. However, the practical application in many a real-world problem entails the simultaneous protection of m…

    Submitted 11 November, 2022; originally announced November 2022.

  21. arXiv:2211.02315  [pdf, other]

    q-bio.NC cs.CV stat.ML

    Spatial-Temporal Convolutional Attention for Mapping Functional Brain Networks

    Authors: Yiheng Liu, Enjie Ge, Ning Qiang, Tianming Liu, Bao Ge

    Abstract: Using functional magnetic resonance imaging (fMRI) and deep learning to explore functional brain networks (FBNs) has attracted many researchers. However, most of these studies are still based on the temporal correlation between the sources and voxel signals, and lack research on the dynamics of brain function. Due to the widespread local correlations in the volumes, FBNs can be generated dire…

    Submitted 4 November, 2022; originally announced November 2022.

    Comments: 5 pages, 5 figures, submitted to 20th IEEE International Symposium on Biomedical Imaging (ISBI 2023)

  22. arXiv:2210.15801  [pdf, ps, other]

    stat.ME

    Clustering High-dimensional Data via Feature Selection

    Authors: Tianqi Liu, Yu Lu, Biqing Zhu, Hongyu Zhao

    Abstract: High-dimensional clustering analysis is a challenging problem in statistics and machine learning, with broad applications such as the analysis of microarray data and RNA-seq data. In this paper, we propose a new clustering procedure called Spectral Clustering with Feature Selection (SC-FS), where we first obtain an initial estimate of labels via spectral clustering, then select a small fraction of…

    Submitted 27 October, 2022; originally announced October 2022.

    Comments: Accepted at Biometrics Journal (https://onlinelibrary.wiley.com/doi/epdf/10.1111/biom.13665)

  23. arXiv:2210.13258  [pdf, other]

    stat.ME stat.AP

    A comparative study to alternatives to the log-rank test

    Authors: Ina Dormuth, Tiantian Liu, Jin Xu, Markus Pauly, Marc Ditzhaus

    Abstract: Studies to compare the survival of two or more groups using time-to-event data are of high importance in medical research. The gold standard is the log-rank test, which is optimal under proportional hazards. As the latter is no simple regularity assumption, we are interested in evaluating the power of various statistical tests under different settings including proportional and non-proportional ha…

    Submitted 24 October, 2022; originally announced October 2022.

  24. arXiv:2210.08486  [pdf, other]

    cs.LG cs.AI stat.ML

    Streaming PAC-Bayes Gaussian process regression with a performance guarantee for online decision making

    Authors: Tianyu Liu, Jie Lu, Zheng Yan, Guangquan Zhang

    Abstract: As a powerful Bayesian non-parameterized algorithm, the Gaussian process (GP) has played a significant role in Bayesian optimization and signal processing. GPs have also advanced online decision-making systems because their posterior distribution has a closed-form solution. However, its training and inference process requires all historic data to be stored and the GP model to be trained from sc…

    Submitted 26 October, 2022; v1 submitted 16 October, 2022; originally announced October 2022.

  25. arXiv:2210.05955  [pdf, other]

    stat.ML cs.LG

    Identifiability and Asymptotics in Learning Homogeneous Linear ODE Systems from Discrete Observations

    Authors: Yuanyuan Wang, Wei Huang, Mingming Gong, Xi Geng, Tongliang Liu, Kun Zhang, Dacheng Tao

    Abstract: Ordinary Differential Equations (ODEs) have recently gained a lot of attention in machine learning. However, the theoretical aspects, e.g., identifiability and asymptotic properties of statistical estimation are still obscure. This paper derives a sufficient condition for the identifiability of homogeneous linear ODE systems from a sequence of equally-spaced error-free observations sampled from a…

    Submitted 2 June, 2024; v1 submitted 12 October, 2022; originally announced October 2022.

    Journal ref: Journal of Machine Learning Research 25 (2024) 1-50

  26. arXiv:2210.01765  [pdf, other]

    cs.LG q-bio.BM stat.ML

    One Transformer Can Understand Both 2D & 3D Molecular Data

    Authors: Shengjie Luo, Tianlang Chen, Yixian Xu, Shuxin Zheng, Tie-Yan Liu, Liwei Wang, Di He

    Abstract: Unlike vision and language data which usually has a unique format, molecules can naturally be characterized using different chemical formulations. One can view a molecule as a 2D graph or define it as a collection of atoms located in a 3D space. For molecular representation learning, most previous works designed neural networks only for a particular data format, making the learned models likely to…

    Submitted 27 March, 2023; v1 submitted 4 October, 2022; originally announced October 2022.

    Comments: 20 pages; ICLR 2023, Camera Ready Version; Code: https://github.com/lsj2408/Transformer-M

  27. arXiv:2209.15466  [pdf, other]

    stat.ML cs.LG

    Sparsity-Constrained Optimal Transport

    Authors: Tianlin Liu, Joan Puigcerver, Mathieu Blondel

    Abstract: Regularized optimal transport (OT) is now increasingly used as a loss or as a matching layer in neural networks. Entropy-regularized OT can be computed using the Sinkhorn algorithm but it leads to fully-dense transportation plans, meaning that all sources are (fractionally) matched with all targets. To address this issue, several works have investigated quadratic regularization instead. This regul…

    Submitted 14 April, 2023; v1 submitted 30 September, 2022; originally announced September 2022.

    Comments: Camera-ready ICLR 2023

  28. arXiv:2209.07303  [pdf, other]

    cs.LG cs.CR stat.ML

    Differentially Private Estimation of Hawkes Process

    Authors: Simiao Zuo, Tianyi Liu, Tuo Zhao, Hongyuan Zha

    Abstract: Point process models are of great importance in real world applications. In certain critical applications, estimation of point process models involves large amounts of sensitive personal data from users. Privacy concerns naturally arise which have not been addressed in the existing literature. To bridge this glaring gap, we propose the first general differentially private estimation procedure for…

    Submitted 15 September, 2022; originally announced September 2022.

  29. arXiv:2207.07985  [pdf]

    econ.GN physics.soc-ph q-bio.QM stat.AP

    Home-made blues: Residential crowding and mental health in Beijing, China

    Authors: Xize Wang, Tao Liu

    Abstract: Although residential crowding has many well-being implications, its connection to mental health is yet to be widely examined. Using survey data from 1613 residents in Beijing, China, we find that living in a crowded place - measured by both square metres per person and persons per bedroom - is significantly associated with a higher risk of depression. We test for the mechanisms of such association…

    Submitted 16 July, 2022; originally announced July 2022.

    Journal ref: Urban Studies (2022)

  30. arXiv:2207.02540  [pdf, other]

    stat.ME math.ST

    Design-based theory for cluster rerandomization

    Authors: Xin Lu, Tianle Liu, Hanzhong Liu, Peng Ding

    Abstract: Complete randomization balances covariates on average, but covariate imbalance often exists in finite samples. Rerandomization can ensure covariate balance in the realized experiment by discarding the undesired treatment assignments. Many field experiments in public health and social sciences assign the treatment at the cluster level due to logistical constraints or policy considerations. Moreover…

    Submitted 6 July, 2022; originally announced July 2022.

  31. arXiv:2206.13033  [pdf, other]

    cs.LG cs.IT stat.ML

    Normalized/Clipped SGD with Perturbation for Differentially Private Non-Convex Optimization

    Authors: Xiaodong Yang, Huishuai Zhang, Wei Chen, Tie-Yan Liu

    Abstract: By ensuring differential privacy in the learning algorithms, one can rigorously mitigate the risk of large models memorizing sensitive training data. In this paper, we study two algorithms for this purpose, i.e., DP-SGD and DP-NSGD, which first clip or normalize per-sample gradients to bound the sensitivity and then add noise to obfuscate the exact information. We analyze the convergence…

    Submitted 26 June, 2022; originally announced June 2022.

    Comments: 25 pages, under review

  32. arXiv:2206.07769  [pdf, other]

    stat.ML cs.LG

    HyperImpute: Generalized Iterative Imputation with Automatic Model Selection

    Authors: Daniel Jarrett, Bogdan Cebere, Tennison Liu, Alicia Curth, Mihaela van der Schaar

    Abstract: Consider the problem of imputing missing values in a dataset. On the one hand, conventional approaches using iterative imputation benefit from the simplicity and customizability of learning conditional distributions directly, but suffer from the practical requirement for appropriate model specification of each and every variable. On the other hand, recent methods using deep generative modeling be…

    Submitted 15 June, 2022; originally announced June 2022.

    Journal ref: In Proc. 39th International Conference on Machine Learning (ICML 2022)

  33. arXiv:2206.05643  [pdf, other]

    cs.LG stat.ML

    Density Regression and Uncertainty Quantification with Bayesian Deep Noise Neural Networks

    Authors: Daiwei Zhang, Tianci Liu, Jian Kang

    Abstract: Deep neural network (DNN) models have achieved state-of-the-art predictive accuracy in a wide range of supervised learning applications. However, accurately quantifying the uncertainty in DNN predictions remains a challenging task. For continuous outcome variables, an even more difficult problem is to estimate the predictive density function, which not only provides a natural quantification of the…

    Submitted 11 June, 2022; originally announced June 2022.

  34. arXiv:2206.02617  [pdf, other]

    cs.LG cs.CR cs.DS stat.ML

    Individual Privacy Accounting for Differentially Private Stochastic Gradient Descent

    Authors: Da Yu, Gautam Kamath, Janardhan Kulkarni, Tie-Yan Liu, Jian Yin, Huishuai Zhang

    Abstract: Differentially private stochastic gradient descent (DP-SGD) is the workhorse algorithm for recent advances in private deep learning. It provides a single privacy guarantee to all datapoints in the dataset. We propose output-specific $(\varepsilon,\delta)$-DP to characterize privacy guarantees for individual examples when releasing models trained by DP-SGD. We also design an efficient algorithm to inves…

    Submitted 2 September, 2023; v1 submitted 6 June, 2022; originally announced June 2022.

    Comments: Published in Transactions on Machine Learning Research (TMLR)

  35. arXiv:2205.13869  [pdf, other]

    cs.LG stat.ML

    MissDAG: Causal Discovery in the Presence of Missing Data with Continuous Additive Noise Models

    Authors: Erdun Gao, Ignavier Ng, Mingming Gong, Li Shen, Wei Huang, Tongliang Liu, Kun Zhang, Howard Bondell

    Abstract: State-of-the-art causal discovery methods usually assume that the observational data is complete. However, the missing data problem is pervasive in many practical scenarios such as clinical trials, economics, and biology. One straightforward way to address the missing data problem is first to impute the data using off-the-shelf imputation methods and then apply existing causal discovery methods. H…

    Submitted 16 January, 2023; v1 submitted 27 May, 2022; originally announced May 2022.

    Comments: Accepted to NeurIPS22

  36. arXiv:2205.13401  [pdf, other]

    cs.LG cs.CL stat.ML

    Your Transformer May Not be as Powerful as You Expect

    Authors: Shengjie Luo, Shanda Li, Shuxin Zheng, Tie-Yan Liu, Liwei Wang, Di He

    Abstract: Relative Positional Encoding (RPE), which encodes the relative distance between any pair of tokens, is one of the most successful modifications to the original Transformer. As far as we know, theoretical understanding of the RPE-based Transformers is largely unexplored. In this work, we mathematically analyze the power of RPE-based Transformers regarding whether the model is capable of approximati…

    Submitted 28 October, 2022; v1 submitted 26 May, 2022; originally announced May 2022.

    Comments: 22 pages; NeurIPS 2022, Camera Ready Version

  37. arXiv:2205.12418  [pdf, other]

    cs.LG cs.AI stat.ML

    Tiered Reinforcement Learning: Pessimism in the Face of Uncertainty and Constant Regret

    Authors: Jiawei Huang, Li Zhao, Tao Qin, Wei Chen, Nan Jiang, Tie-Yan Liu

    Abstract: We propose a new learning framework that captures the tiered structure of many real-world user-interaction applications, where the users can be divided into two groups based on their different tolerance on exploration risks and should be treated separately. In this setting, we simultaneously maintain two policies $\pi^{\text{O}}$ and $\pi^{\text{E}}$: $\pi^{\text{O}}$ ("O" for "online") interacts with m…

    Submitted 26 February, 2023; v1 submitted 24 May, 2022; originally announced May 2022.

    Comments: 38 pages; NeurIPS 2022

  38. arXiv:2203.17262  [pdf]

    stat.OT

    Length L-function for Network-Constrained Point Data

    Authors: Zidong Fang, Ci Song, Hua Shu, Jie Chen, Tianyu Liu, Xi Wang, Xiao Chen, Tao Pei

    Abstract: Network constrained points are referred to as points restricted to road networks, such as taxi pick up and drop off locations. A significant pattern of network constrained points is referred to as an aggregation; e.g., the aggregation of pick up points may indicate a high taxi demand in a particular area. Although the network K function using the shortest path network distance has been proposed to…

    Submitted 29 March, 2022; originally announced March 2022.

  39. arXiv:2203.07681  [pdf, other]

    cs.LG cs.AI stat.ML

    DEPTS: Deep Expansion Learning for Periodic Time Series Forecasting

    Authors: Wei Fan, Shun Zheng, Xiaohan Yi, Wei Cao, Yanjie Fu, Jiang Bian, Tie-Yan Liu

    Abstract: Periodic time series (PTS) forecasting plays a crucial role in a variety of industries to foster critical tasks, such as early warning, pre-planning, resource scheduling, etc. However, the complicated dependencies of the PTS signal on its inherent periodicity as well as the sophisticated composition of various periods hinder the performance of PTS forecasting. In this paper, we introduce a deep ex…

    Submitted 15 March, 2022; originally announced March 2022.

    Comments: ICLR22 Spotlight

  40. arXiv:2202.08928  [pdf, other]

    q-bio.PE physics.data-an q-bio.QM stat.ME

    "Back to the future" projections for COVID-19 surges

    Authors: J. Sunil Rao, Tianhao Liu, Daniel Andrés Díaz-Pachón

    Abstract: We argue that information from countries that had earlier COVID-19 surges can be used to inform another country's current model, then generating what we call back-to-the-future (BTF) projections. We show that these projections can be used to accurately predict future COVID-19 surges prior to an inflection point of the daily infection curve. We show, across 12 different countries from all populated…

    Submitted 17 February, 2022; originally announced February 2022.

    Comments: 21 pages, 7 figures

    MSC Class: 92D25 (Primary) 92C60 92B15 62P10 62M10 (Secondary)

  41. arXiv:2202.08057  [pdf, other]

    cs.LG cs.CR stat.ML

    Understanding and Improving Graph Injection Attack by Promoting Unnoticeability

    Authors: Yongqiang Chen, Han Yang, Yonggang Zhang, Kaili Ma, Tongliang Liu, Bo Han, James Cheng

    Abstract: Recently, Graph Injection Attack (GIA) has emerged as a practical attack scenario on Graph Neural Networks (GNNs), where the adversary can merely inject a few malicious nodes instead of modifying existing nodes or edges, i.e., Graph Modification Attack (GMA). Although GIA has achieved promising results, little is known about why it is successful and whether there is any pitfall behind the success. To und…

    Submitted 5 April, 2022; v1 submitted 16 February, 2022; originally announced February 2022.

    Comments: ICLR2022, 42 pages, 22 figures

  42. arXiv:2202.06450  [pdf, other]

    cs.LG cs.AI stat.ML

    Towards Deployment-Efficient Reinforcement Learning: Lower Bound and Optimality

    Authors: Jiawei Huang, Jinglin Chen, Li Zhao, Tao Qin, Nan Jiang, Tie-Yan Liu

    Abstract: Deployment efficiency is an important criterion for many real-world applications of reinforcement learning (RL). Despite the community's increasing interest, there lacks a formal theoretical formulation for the problem. In this paper, we propose such a formulation for deployment-efficient RL (DE-RL) from an "optimization with constraints" perspective: we are interested in exploring an MDP and obta…

    Submitted 30 August, 2022; v1 submitted 13 February, 2022; originally announced February 2022.

    Comments: 49 Pages; ICLR 2022

  43. arXiv:2201.12739  [pdf, other]

    cs.LG stat.ML

    Do We Need to Penalize Variance of Losses for Learning with Label Noise?

    Authors: Yexiong Lin, Yu Yao, Yuxuan Du, Jun Yu, Bo Han, Mingming Gong, Tongliang Liu

    Abstract: Algorithms which minimize the averaged loss have been widely designed for dealing with noisy labels. Intuitively, when there is a finite training sample, penalizing the variance of losses will improve the stability and generalization of the algorithms. Interestingly, we found that the variance should be increased for the problem of learning with noisy labels. Specifically, increasing the variance…

    Submitted 30 January, 2022; originally announced January 2022.

  44. arXiv:2112.03555  [pdf, other]

    cs.LG stat.ML

    FedDAG: Federated DAG Structure Learning

    Authors: Erdun Gao, Junjia Chen, Li Shen, Tongliang Liu, Mingming Gong, Howard Bondell

    Abstract: To date, most directed acyclic graphs (DAGs) structure learning approaches require data to be stored in a central server. However, due to the consideration of privacy protection, data owners gradually refuse to share their personalized raw data to avoid private information leakage, making this task more troublesome by cutting off the first step. Thus, a puzzle arises: how do we discover th…

    Submitted 16 January, 2023; v1 submitted 7 December, 2021; originally announced December 2021.

    Comments: Accepted to Transactions on Machine Learning Research

  45. arXiv:2111.13164  [pdf, other]

    cs.LG q-fin.MF stat.ML

    Neural network stochastic differential equation models with applications to financial data forecasting

    Authors: Luxuan Yang, Ting Gao, Yubin Lu, Jinqiao Duan, Tao Liu

    Abstract: In this article, we employ a collection of stochastic differential equations with drift and diffusion coefficients approximated by neural networks to predict the trend of chaotic time series which has big jump properties. Our contributions are, first, we propose a model called Lévy induced stochastic differential equation network, which explores compounded stochastic differential equations with…

    Submitted 3 November, 2022; v1 submitted 25 November, 2021; originally announced November 2021.

    Comments: 18 pages, 38 figures

  46. arXiv:2110.13750  [pdf, other]

    cs.LG stat.ML

    Optimizing Information-theoretical Generalization Bounds via Anisotropic Noise in SGLD

    Authors: Bohan Wang, Huishuai Zhang, Jieyu Zhang, Qi Meng, Wei Chen, Tie-Yan Liu

    Abstract: Recently, the information-theoretical framework has been proven to be able to obtain non-vacuous generalization bounds for large models trained by Stochastic Gradient Langevin Dynamics (SGLD) with isotropic noise. In this paper, we optimize the information-theoretical generalization bound by manipulating the noise structure in SGLD. We prove that with constraint to guarantee low empirical risk, th…

    Submitted 2 November, 2021; v1 submitted 26 October, 2021; originally announced October 2021.

    Comments: Accepted by NeurIPS 2021

  47. arXiv:2110.12088  [pdf, other]

    cs.LG stat.ML

    Learning with Noisy Labels Revisited: A Study Using Real-World Human Annotations

    Authors: Jiaheng Wei, Zhaowei Zhu, Hao Cheng, Tongliang Liu, Gang Niu, Yang Liu

    Abstract: Existing research on learning with noisy labels mainly focuses on synthetic label noise. Synthetic noise, though has clean structures which greatly enabled statistical analyses, often fails to model real-world noise patterns. The recent literature has observed several efforts to offer real-world noisy datasets, yet the existing efforts suffer from two caveats: (1) The lack of ground-truth verifica…

    Submitted 27 March, 2022; v1 submitted 22 October, 2021; originally announced October 2021.

    Comments: Published as a conference paper at ICLR 2022

  48. arXiv:2109.12784  [pdf, other]

    cs.LG stat.ML

    Learning from Few Samples: Transformation-Invariant SVMs with Composition and Locality at Multiple Scales

    Authors: Tao Liu, P. R. Kumar, Ruida Zhou, Xi Liu

    Abstract: Motivated by the problem of learning with small sample sizes, this paper shows how to incorporate into support-vector machines (SVMs) those properties that have made convolutional neural networks (CNNs) successful. Particularly important is the ability to incorporate domain knowledge of invariances, e.g., translational invariance of images. Kernels based on the maximum similarity over a g…

    Submitted 22 October, 2022; v1 submitted 27 September, 2021; originally announced September 2021.

    Comments: Will appear in NeurIPS 2022

  49. arXiv:2109.02986  [pdf, other]

    stat.ML cs.LG

    Instance-dependent Label-noise Learning under a Structural Causal Model

    Authors: Yu Yao, Tongliang Liu, Mingming Gong, Bo Han, Gang Niu, Kun Zhang

    Abstract: Label noise will degenerate the performance of deep learning algorithms because deep neural networks easily overfit label errors. Let X and Y denote the instance and clean label, respectively. When Y is a cause of X, according to which many datasets have been constructed, e.g., SVHN and CIFAR, the distributions of P(X) and P(Y|X) are entangled. This means that the unsupervised instances are helpfu…

    Submitted 3 June, 2022; v1 submitted 7 September, 2021; originally announced September 2021.

  50. arXiv:2108.09042  [pdf]

    cs.CG stat.ME

    Identifying Aggregation Artery Architecture of constrained Origin-Destination flows using Manhattan L-function

    Authors: Zidong Fang, Hua Shu, Ci Song, Jie Chen, Tianyu Liu, Xiaohan Liu, Tao Pei

    Abstract: The movement of humans and goods in cities can be represented by constrained flow, which is defined as the movement of objects between origin and destination in road networks. Flow aggregation, namely origins and destinations aggregated simultaneously, is one of the most common patterns, say the aggregated origin-to-destination flows between two transport hubs may indicate the great traffic demand…

    Submitted 20 August, 2021; originally announced August 2021.

    Comments: 29 pages, 12 figures