
Showing 1–50 of 362 results for author: Li, H

Searching in archive stat.
  1. arXiv:2407.01607  [pdf, other]

    cs.LG cs.IR stat.ML

    Multi-Epoch learning with Data Augmentation for Deep Click-Through Rate Prediction

    Authors: Zhongxiang Fan, Zhaocheng Liu, Jian Liang, Dongying Kong, Han Li, Peng Jiang, Shuang Li, Kun Gai

    Abstract: This paper investigates the one-epoch overfitting phenomenon in Click-Through Rate (CTR) models, where performance notably declines at the start of the second epoch. Despite extensive research, the efficacy of multi-epoch training over the conventional one-epoch approach remains unclear. We identify the overfitting of the embedding layer, caused by high-dimensional data sparsity, as the primary is…

    Submitted 27 June, 2024; originally announced July 2024.

  2. arXiv:2407.01111  [pdf, other]

    cs.LG cs.AI stat.ML

    Proximity Matters: Local Proximity Preserved Balancing for Treatment Effect Estimation

    Authors: Hao Wang, Zhichao Chen, Yuan Shen, Jiajun Fan, Zhaoran Liu, Degui Yang, Xinggao Liu, Haoxuan Li

    Abstract: Heterogeneous treatment effect (HTE) estimation from observational data poses significant challenges due to treatment selection bias. Existing methods address this bias by minimizing distribution discrepancies between treatment groups in latent space, focusing on global alignment. However, the fruitful aspect of local proximity, where similar units exhibit similar outcomes, is often overlooked. In…

    Submitted 1 July, 2024; originally announced July 2024.

    Comments: Code is available at https://anonymous.4open.science/status/ncr-B697

  3. arXiv:2406.15762  [pdf, other]

    cs.LG stat.ML

    Rethinking the Diffusion Models for Numerical Tabular Data Imputation from the Perspective of Wasserstein Gradient Flow

    Authors: Zhichao Chen, Haoxuan Li, Fangyikang Wang, Odin Zhang, Hu Xu, Xiaoyu Jiang, Zhihuan Song, Eric H. Wang

    Abstract: Diffusion models (DMs) have gained attention in Missing Data Imputation (MDI), but there remain two long-neglected issues to be addressed: (1). Inaccurate Imputation, which arises from inherently sample-diversification-pursuing generative process of DMs. (2). Difficult Training, which stems from intricate design required for the mask matrix in model training stage. To address these concerns within…

    Submitted 22 June, 2024; originally announced June 2024.

  4. arXiv:2406.06941  [pdf, other]

    stat.ME math.ST

    Efficient combination of observational and experimental datasets under general restrictions on outcome mean functions

    Authors: Harrison H. Li

    Abstract: A researcher collecting data from a randomized controlled trial (RCT) often has access to an auxiliary observational dataset that may be confounded or otherwise biased for estimating causal effects. Common modeling assumptions impose restrictions on the outcome mean function - the conditional expectation of the outcome of interest given observed covariates - in the two datasets. Running examples f…

    Submitted 11 June, 2024; originally announced June 2024.

    Comments: 52 pages, 4 figures

  5. arXiv:2405.15948  [pdf, other]

    stat.ME stat.AP

    Multicalibration for Censored Survival Data: Towards Universal Adaptability in Predictive Modeling

    Authors: Hanxuan Ye, Hongzhe Li

    Abstract: Traditional statistical and machine learning methods assume identical distribution for the training and test data sets. This assumption, however, is often violated in real applications, particularly in health care research, where the training data (source) may underrepresent specific subpopulations in the testing or target domain. Such disparities, coupled with censored observations, present signi…

    Submitted 24 May, 2024; originally announced May 2024.

    Comments: The supplementary material is excluded

  6. arXiv:2405.10194  [pdf, ps, other]

    stat.CO math.ST

    Multivariate strong invariance principle and uncertainty assessment for time in-homogeneous cyclic MCMC samplers

    Authors: Haoxiang Li, Qian Qin

    Abstract: Time in-homogeneous cyclic Markov chain Monte Carlo (MCMC) samplers, including deterministic scan Gibbs samplers and Metropolis within Gibbs samplers, are extensively used for sampling from multi-dimensional distributions. We establish a multivariate strong invariance principle (SIP) for Markov chains associated with these samplers. The rate of this SIP essentially aligns with the tightest rate av…

    Submitted 16 May, 2024; originally announced May 2024.

  7. arXiv:2405.05596  [pdf, other]

    cs.CY cs.HC cs.IR cs.LG stat.ME

    Measuring Strategization in Recommendation: Users Adapt Their Behavior to Shape Future Content

    Authors: Sarah H. Cen, Andrew Ilyas, Jennifer Allen, Hannah Li, Aleksander Madry

    Abstract: Most modern recommendation algorithms are data-driven: they generate personalized recommendations by observing users' past behaviors. A common assumption in recommendation is that how a user interacts with a piece of content (e.g., whether they choose to "like" it) is a reflection of the content, but not of the algorithm that generated it. Although this assumption is convenient, it fails to captur…

    Submitted 9 May, 2024; originally announced May 2024.

  8. arXiv:2405.00697  [pdf, other]

    q-fin.CP cs.LG q-fin.PR stat.AP

    Pricing Catastrophe Bonds -- A Probabilistic Machine Learning Approach

    Authors: Xiaowei Chen, Hong Li, Yufan Lu, Rui Zhou

    Abstract: This paper proposes a probabilistic machine learning method to price catastrophe (CAT) bonds in the primary market. The proposed method combines machine-learning-based predictive models with Conformal Prediction, an innovative algorithm that generates distribution-free probabilistic forecasts for CAT bond prices. Using primary market CAT bond transaction records between January 1999 and March 2021…

    Submitted 10 April, 2024; originally announced May 2024.

  9. arXiv:2404.19620  [pdf, other]

    cs.LG cs.IR stat.ML

    Be Aware of the Neighborhood Effect: Modeling Selection Bias under Interference

    Authors: Haoxuan Li, Chunyuan Zheng, Sihao Ding, Peng Wu, Zhi Geng, Fuli Feng, Xiangnan He

    Abstract: Selection bias in recommender system arises from the recommendation process of system filtering and the interactive process of user selection. Many previous studies have focused on addressing selection bias to achieve unbiased learning of the prediction model, but ignore the fact that potential outcomes for a given user-item pair may vary with the treatments assigned to other user-item pairs, name…

    Submitted 30 April, 2024; originally announced April 2024.

    Comments: ICLR 24

  10. arXiv:2404.07098  [pdf, other]

    stat.AP

    Unraveling Consumer Purchase Journey Using Neural Network Models

    Authors: Victor Churchill, H. Alice Li, Dongbin Xiu

    Abstract: This study utilizes an ensemble of feedforward neural network models to analyze large-volume and high-dimensional consumer touchpoints and their impact on purchase decisions. When applied to a proprietary dataset of consumer touchpoints and purchases from a global software service provider, the proposed approach demonstrates better predictive accuracy than both traditional models, such as logistic…

    Submitted 10 April, 2024; originally announced April 2024.

  11. arXiv:2404.06064  [pdf, other]

    stat.ME

    Constructing hierarchical time series through clustering: Is there an optimal way for forecasting?

    Authors: Bohan Zhang, Anastasios Panagiotelis, Han Li

    Abstract: Forecast reconciliation has attracted significant research interest in recent years, with most studies taking the hierarchy of time series as given. We extend existing work that uses time series clustering to construct hierarchies, with the goal of improving forecast accuracy, in three ways. First, we investigate multiple approaches to clustering, including not only different clustering algorithms…

    Submitted 9 April, 2024; originally announced April 2024.

    Comments: 28 pages, 13 figures

  12. arXiv:2404.04719  [pdf, other]

    stat.ME

    Generative Model for Change Point Detection in Dynamic Graphs

    Authors: Yik Lun Kei, Jialiang Li, Hangjian Li, Yanzhen Chen, Oscar Hernan Madrid Padilla

    Abstract: This paper proposes a generative model to detect change points in time series of graphs. The proposed framework consists of learnable prior distributions for low-dimensional graph representations and of a decoder that can generate graphs from the latent representations. The informative prior distributions in the latent spaces are learned from the observed data as empirical Bayes, and the expressiv…

    Submitted 21 June, 2024; v1 submitted 6 April, 2024; originally announced April 2024.

  13. arXiv:2404.03878  [pdf, other]

    stat.ME stat.ML

    Wasserstein F-tests for Fréchet regression on Bures-Wasserstein manifolds

    Authors: Haoshu Xu, Hongzhe Li

    Abstract: This paper considers the problem of regression analysis with random covariance matrix as outcome and Euclidean covariates in the framework of Fréchet regression on the Bures-Wasserstein manifold. Such regression problems have many applications in single cell genomics and neuroscience, where we have covariance matrix measured over a large set of samples. Fréchet regression on the Bures-Wasserstein…

    Submitted 5 April, 2024; originally announced April 2024.

  14. arXiv:2403.15280  [pdf, other]

    astro-ph.GA stat.AP

    Polarization Holes as an Indicator of Magnetic Field-Angular Momentum Alignment I. Initial Tests

    Authors: Lijun Wang, Zhuo Cao, Xiaodan Fan, Hua-bai Li

    Abstract: The formation of protostellar disks is still a mystery, largely due to the difficulties in observations that can constrain theories. For example, the 3D alignment between the rotation of the disk and the magnetic fields (B-fields) in the formation environment is critical in some models, but so far impossible to observe. Here, we study the possibility of probing the alignment between B-field and di…

    Submitted 22 March, 2024; originally announced March 2024.

    Comments: accepted by The Astrophysical Journal

  15. arXiv:2403.13197  [pdf, other]

    stat.ME

    Robust inference of cooperative behaviour of multiple ion channels in voltage-clamp recordings

    Authors: Robin Requadt, Manuel Fink, Patrick Kubica, Claudia Steinem, Axel Munk, Housen Li

    Abstract: Recent experimental studies have shed light on the intriguing possibility that ion channels exhibit cooperative behaviour. However, a comprehensive understanding of such cooperativity remains elusive, primarily due to limitations in measuring separately the response of each channel. Rather, only the superimposed channel response can be observed, challenging existing data analysis methods. To addre…

    Submitted 19 March, 2024; originally announced March 2024.

    Comments: Codes in R are available at https://gitlab.gwdg.de/requadt/idc

  16. arXiv:2403.11356  [pdf, other]

    stat.ME math.ST

    Multiscale Quantile Regression with Local Error Control

    Authors: Zhi Liu, Housen Li

    Abstract: For robust and efficient detection of change points, we introduce a novel methodology MUSCLE (multiscale quantile segmentation controlling local error) that partitions serial data into multiple segments, each sharing a common quantile. It leverages multiple tests for quantile changes over different scales and locations, and variational estimation. Unlike the often adopted global error control, MUS…

    Submitted 17 March, 2024; originally announced March 2024.

    Comments: The implementation is in R package muscle, available at https://github.com/liuzhi1993/muscle

  17. arXiv:2403.07310  [pdf, other]

    stat.ML cs.LG

    How does promoting the minority fraction affect generalization? A theoretical study of the one-hidden-layer neural network on group imbalance

    Authors: Hongkang Li, Shuai Zhang, Yihua Zhang, Meng Wang, Sijia Liu, Pin-Yu Chen

    Abstract: Group imbalance has been a known problem in empirical risk minimization (ERM), where the achieved high average accuracy is accompanied by low accuracy in a minority group. Despite algorithmic efforts to improve the minority group accuracy, a theoretical generalization analysis of ERM on individual groups remains elusive. By formulating the group imbalance problem with the Gaussian Mixture Model, t…

    Submitted 19 March, 2024; v1 submitted 12 March, 2024; originally announced March 2024.

  18. arXiv:2402.07193  [pdf, other]

    cs.LG math.OC stat.ML

    Loss Symmetry and Noise Equilibrium of Stochastic Gradient Descent

    Authors: Liu Ziyin, Mingze Wang, Hongchao Li, Lei Wu

    Abstract: Symmetries exist abundantly in the loss function of neural networks. We characterize the learning dynamics of stochastic gradient descent (SGD) when exponential symmetries, a broad subclass of continuous symmetries, exist in the loss function. We establish that when gradient noises do not balance, SGD has the tendency to move the model parameters toward a point where noises from different directio…

    Submitted 3 June, 2024; v1 submitted 11 February, 2024; originally announced February 2024.

    Comments: preprint

  19. arXiv:2402.06915  [pdf, other]

    stat.ME math.ST

    Detection and inference of changes in high-dimensional linear regression with non-sparse structures

    Authors: Haeran Cho, Tobias Kley, Housen Li

    Abstract: For data segmentation in high-dimensional linear regression settings, the regression parameters are often assumed to be sparse segment-wise, which enables many existing methods to estimate the parameters locally via $\ell_1$-regularised maximum likelihood-type estimation and then contrast them for change point detection. Contrary to this common practice, we show that the sparsity of neither regres…

    Submitted 4 March, 2024; v1 submitted 10 February, 2024; originally announced February 2024.

    Comments: Implementation is available at https://github.com/tobiaskley/inferchange. In version 2, an application to FRED-MD data is added

  20. arXiv:2402.02399  [pdf, other]

    cs.LG cs.AI stat.AP stat.ML

    FreDF: Learning to Forecast in Frequency Domain

    Authors: Hao Wang, Licheng Pan, Zhichao Chen, Degui Yang, Sen Zhang, Yifei Yang, Xinggao Liu, Haoxuan Li, Dacheng Tao

    Abstract: Time series modeling is uniquely challenged by the presence of autocorrelation in both historical and label sequences. Current research predominantly focuses on handling autocorrelation within the historical sequence but often neglects its presence in the label sequence. Specifically, emerging forecast models mainly conform to the direct forecast (DF) paradigm, generating multi-step forecasts unde…

    Submitted 4 February, 2024; originally announced February 2024.

  21. arXiv:2401.16308  [pdf, ps, other]

    stat.AP q-bio.PE

    A Comprehensive Study of Covid 19 in Florida

    Authors: Julian Bennett, Lauren Eriksen, Xingjie Helen Li

    Abstract: Within the likes of any highly contagious and unpredictable disease, lies a predictable and attainable growth rate that researchers can find in order to make logistical conclusions about that particular disease and its affected regions' counterparts. The foundation that researchers pull from when studying a particular disease and looking for its growth rate is the Susceptible-Infected-Removed (SIR…

    Submitted 1 February, 2024; v1 submitted 29 January, 2024; originally announced January 2024.

    MSC Class: 93A30

  22. arXiv:2312.05756  [pdf]

    cs.CE cs.AI stat.ME

    A quantitative fusion strategy of stock picking and timing based on Particle Swarm Optimized-Back Propagation Neural Network and Multivariate Gaussian-Hidden Markov Model

    Authors: Huajian Li, Longjian Li, Jiajian Liang, Weinan Dai

    Abstract: In recent years, machine learning (ML) has brought effective approaches and novel techniques to economic decision, investment forecasting, and risk management, etc., coping with the variable and intricate nature of economic and financial environments. For the investment in stock market, this research introduces a pioneering quantitative fusion model combining stock timing and picking strategy by levera…

    Submitted 22 December, 2023; v1 submitted 9 December, 2023; originally announced December 2023.

    Comments: 12 pages, 6 figures, 4 tables, 26 references

  23. arXiv:2312.00540  [pdf, other]

    cs.LG cs.AI stat.ML

    Target-agnostic Source-free Domain Adaptation for Regression Tasks

    Authors: Tianlang He, Zhiqiu Xia, Jierun Chen, Haoliang Li, S. -H. Gary Chan

    Abstract: Unsupervised domain adaptation (UDA) seeks to bridge the domain gap between the target and source using unlabeled target data. Source-free UDA removes the requirement for labeled source data at the target to preserve data privacy and storage. However, work on source-free UDA assumes knowledge of domain gap distribution, and hence is limited to either target-aware or classification task. To overcom…

    Submitted 1 December, 2023; originally announced December 2023.

    Comments: Accepted by ICDE 2024

  24. arXiv:2311.16984  [pdf, other]

    stat.ME cs.DC cs.LG

    FedECA: A Federated External Control Arm Method for Causal Inference with Time-To-Event Data in Distributed Settings

    Authors: Jean Ogier du Terrail, Quentin Klopfenstein, Honghao Li, Imke Mayer, Nicolas Loiseau, Mohammad Hallal, Félix Balazard, Mathieu Andreux

    Abstract: External control arms (ECA) can inform the early clinical development of experimental drugs and provide efficacy evidence for regulatory approval in non-randomized settings. However, the main challenge of implementing ECA lies in accessing real-world data or historical clinical trials. Indeed, data sharing is often not feasible due to privacy considerations related to data leaving the original col…

    Submitted 20 December, 2023; v1 submitted 28 November, 2023; originally announced November 2023.

    Comments: code available at: https://github.com/owkin/fedeca, fixed some typos, figures and acknowledgments in v2

  25. arXiv:2310.19822  [pdf, other]

    cs.LG physics.ao-ph stat.AP

    FuXi-Extreme: Improving extreme rainfall and wind forecasts with diffusion model

    Authors: Xiaohui Zhong, Lei Chen, Jun Liu, Chensen Lin, Yuan Qi, Hao Li

    Abstract: Significant advancements in the development of machine learning (ML) models for weather forecasting have produced remarkable results. State-of-the-art ML-based weather forecast models, such as FuXi, have demonstrated superior statistical forecast performance in comparison to the high-resolution forecasts (HRES) of the European Centre for Medium-Range Weather Forecasts (ECMWF). However, ML models f…

    Submitted 24 October, 2023; originally announced October 2023.

  26. arXiv:2310.18611  [pdf, other]

    stat.AP stat.ME

    Sequential Kalman filter for fast online changepoint detection in longitudinal health records

    Authors: Hanmo Li, Yuedong Wang, Mengyang Gu

    Abstract: This article introduces the sequential Kalman filter, a computationally scalable approach for online changepoint detection with temporally correlated data. The temporal correlation was not considered in the Bayesian online changepoint detection approach due to the large computational cost. Motivated by detecting COVID-19 infections for dialysis patients from massive longitudinal health records wit…

    Submitted 1 January, 2024; v1 submitted 28 October, 2023; originally announced October 2023.

  27. arXiv:2310.18286  [pdf, other]

    cs.LG stat.AP stat.ML

    Optimal Transport for Treatment Effect Estimation

    Authors: Hao Wang, Zhichao Chen, Jiajun Fan, Haoxuan Li, Tianqiao Liu, Weiming Liu, Quanyu Dai, Yichao Wang, Zhenhua Dong, Ruiming Tang

    Abstract: Estimating conditional average treatment effect from observational data is highly challenging due to the existence of treatment selection bias. Prevalent methods mitigate this issue by aligning distributions of different treatment groups in the latent space. However, there are two critical problems that these methods fail to address: (1) mini-batch sampling effects (MSE), which causes misalignment…

    Submitted 27 October, 2023; originally announced October 2023.

    Comments: Accepted as NeurIPS 2023 Poster

  28. arXiv:2310.10245  [pdf]

    cs.CV stat.ML

    Mask wearing object detection algorithm based on improved YOLOv5

    Authors: Peng Wen, Junhu Zhang, Haitao Li

    Abstract: Wearing a mask is one of the important measures to prevent infectious diseases. However, it is difficult to detect people's mask-wearing situation in public places with high traffic flow. To address the above problem, this paper proposes a mask-wearing face detection model based on YOLOv5l. Firstly, Multi-Head Attentional Self-Convolution not only improves the convergence speed of the model but al…

    Submitted 16 October, 2023; originally announced October 2023.

  29. arXiv:2310.08812  [pdf, other]

    stat.ME cs.LG

    A novel decomposed-ensemble time series forecasting framework: capturing underlying volatility information

    Authors: Zhengtao Gui, Haoyuan Li, Sijie Xu, Yu Chen

    Abstract: Time series forecasting represents a significant and challenging task across various fields. Recently, methods based on mode decomposition have dominated the forecasting of complex time series because of the advantages of capturing local characteristics and extracting intrinsic modes from data. Unfortunately, most models fail to capture the implied volatilities that contain significant information…

    Submitted 28 November, 2023; v1 submitted 12 October, 2023; originally announced October 2023.

  30. arXiv:2309.15297  [pdf, other]

    stat.ME econ.EM math.ST

    Double machine learning and design in batch adaptive experiments

    Authors: Harrison H. Li, Art B. Owen

    Abstract: We consider an experiment with at least two stages or batches and $O(N)$ subjects per batch. First, we propose a semiparametric treatment effect estimator that efficiently pools information across the batches, and show it asymptotically dominates alternatives that aggregate single batch estimates. Then, we consider the design problem of learning propensity scores for assigning treatment in the lat…

    Submitted 26 September, 2023; originally announced September 2023.

    Comments: 91 pages, 1 figure

  31. arXiv:2309.13324  [pdf, other]

    stat.ME

    Targeted Learning on Variable Importance Measure for Heterogeneous Treatment Effect

    Authors: Haodong Li, Alan Hubbard, Mark van der Laan

    Abstract: Quantifying the heterogeneity of treatment effect is important for understanding how a commercial product or medical treatment affects different subgroups in a population. Beyond the overall impact reflected by parameters like the average treatment effect, the analysis of treatment effect heterogeneity further reveals details on the importance of different covariates and how they lead to different tr…

    Submitted 23 September, 2023; originally announced September 2023.

  32. arXiv:2309.10817  [pdf, other]

    eess.IV cs.CV cs.LG stat.ML

    Assessing the capacity of a denoising diffusion probabilistic model to reproduce spatial context

    Authors: Rucha Deshpande, Muzaffer Özbey, Hua Li, Mark A. Anastasio, Frank J. Brooks

    Abstract: Diffusion models have emerged as a popular family of deep generative models (DGMs). In the literature, it has been claimed that one class of diffusion models -- denoising diffusion probabilistic models (DDPMs) -- demonstrate superior image synthesis performance as compared to generative adversarial networks (GANs). To date, these claims have been evaluated using either ensemble-based methods desig…

    Submitted 19 September, 2023; originally announced September 2023.

    Comments: This paper is under consideration at IEEE TMI

  33. arXiv:2309.01334  [pdf, other]

    stat.ME

    Average treatment effect on the treated, under lack of positivity

    Authors: Yi Liu, Huiyue Li, Yunji Zhou, Roland Matsouaka

    Abstract: The use of propensity score (PS) methods has become ubiquitous in causal inference. At the heart of these methods is the positivity assumption. Violation of the positivity assumption leads to the presence of extreme PS weights when estimating average causal effects of interest, such as the average treatment effect (ATE) or the average treatment effect on the treated (ATT), which renders invalid re…

    Submitted 19 May, 2024; v1 submitted 3 September, 2023; originally announced September 2023.

  34. arXiv:2308.08148  [pdf, other]

    cs.LG stat.ME

    Hierarchical Topological Ordering with Conditional Independence Test for Limited Time Series

    Authors: Anpeng Wu, Haoxuan Li, Kun Kuang, Keli Zhang, Fei Wu

    Abstract: Learning directed acyclic graphs (DAGs) to identify causal relations underlying observational data is crucial but also poses significant challenges. Recently, topology-based methods have emerged as a two-step approach to discovering DAGs by first learning the topological ordering of variables and then eliminating redundant edges, while ensuring that the graph remains acyclic. However, one limitati…

    Submitted 16 August, 2023; originally announced August 2023.

  35. arXiv:2308.06671  [pdf, other]

    cs.LG cs.AI stat.ML

    Law of Balance and Stationary Distribution of Stochastic Gradient Descent

    Authors: Liu Ziyin, Hongchao Li, Masahito Ueda

    Abstract: The stochastic gradient descent (SGD) algorithm is the algorithm we use to train neural networks. However, it remains poorly understood how the SGD navigates the highly nonlinear and degenerate loss landscape of a neural network. In this work, we prove that the minibatch noise of SGD regularizes the solution towards a balanced solution whenever the loss function contains a rescaling symmetry. Beca…

    Submitted 12 August, 2023; originally announced August 2023.

    Comments: Preprint

  36. arXiv:2308.04246  [pdf, other]

    stat.AP

    Spectrally-Corrected and Regularized Global Minimum Variance Portfolio for Spiked Model

    Authors: Hua Li, Jiafu Huang

    Abstract: Considering the shortcomings of the traditional sample covariance matrix estimation, this paper proposes an improved global minimum variance portfolio model and named spectral corrected and regularized global minimum variance portfolio (SCRGMVP), which is better than the traditional risk model. The key of this method is that under the assumption that the population covariance matrix follows the sp…

    Submitted 29 August, 2023; v1 submitted 8 August, 2023; originally announced August 2023.

  37. arXiv:2307.01748  [pdf, other]

    stat.ME astro-ph.IM stat.CO

    Monotone Cubic B-Splines with a Neural-Network Generator

    Authors: Lijun Wang, Xiaodan Fan, Huabai Li, Jun S. Liu

    Abstract: We present a method for fitting monotone curves using cubic B-splines, which is equivalent to putting a monotonicity constraint on the coefficients. We explore different ways of enforcing this constraint and analyze their theoretical and empirical properties. We propose two algorithms for solving the spline fitting problem: one that uses standard optimization techniques and one that trains a Multi…

    Submitted 17 November, 2023; v1 submitted 4 July, 2023; originally announced July 2023.

  38. arXiv:2306.15915  [pdf, other]

    stat.ML cs.LG stat.ME

    Transfer Learning with Random Coefficient Ridge Regression

    Authors: Hongzhe Zhang, Hongzhe Li

    Abstract: Ridge regression with random coefficients provides an important alternative to fixed coefficients regression in high dimensional setting when the effects are expected to be small but not zeros. This paper considers estimation and prediction of random coefficient ridge regression in the setting of transfer learning, where in addition to observations from the target model, source samples from differ…

    Submitted 28 June, 2023; originally announced June 2023.

    Comments: 16 pages, 5 figures

  39. arXiv:2306.01337  [pdf, other]

    cs.CL stat.ML

    MathChat: Converse to Tackle Challenging Math Problems with LLM Agents

    Authors: Yiran Wu, Feiran Jia, Shaokun Zhang, Hangyu Li, Erkang Zhu, Yue Wang, Yin Tat Lee, Richard Peng, Qingyun Wu, Chi Wang

    Abstract: Employing Large Language Models (LLMs) to address mathematical problems is an intriguing research endeavor, considering the abundance of math problems expressed in natural language across numerous science and engineering fields. LLMs, with their generalized ability, are used as a foundation model to build AI agents for different tasks. In this paper, we study the effectiveness of utilizing LLM age…

    Submitted 28 June, 2024; v1 submitted 2 June, 2023; originally announced June 2023.

    Comments: Update version

  40. arXiv:2306.01264  [pdf, ps, other]

    math.OC cs.LG stat.ML

    Convex and Non-convex Optimization Under Generalized Smoothness

    Authors: Haochuan Li, Jian Qian, Yi Tian, Alexander Rakhlin, Ali Jadbabaie

    Abstract: Classical analysis of convex and non-convex optimization methods often requires the Lipschitzness of the gradient, which limits the analysis to functions bounded by quadratics. Recent work relaxed this requirement to a non-uniform smoothness condition with the Hessian norm bounded by an affine function of the gradient norm, and proved convergence in the non-convex setting via gradient clipping, ass…

    Submitted 3 November, 2023; v1 submitted 2 June, 2023; originally announced June 2023.

    Comments: 37 pages

  41. arXiv:2305.18578  [pdf, other]

    stat.ME cs.LG stat.ML

    Quick Adaptive Ternary Segmentation: An Efficient Decoding Procedure For Hidden Markov Models

    Authors: Alexandre Mösching, Housen Li, Axel Munk

    Abstract: Hidden Markov models (HMMs) are characterized by an unobservable (hidden) Markov chain and an observable process, which is a noisy version of the hidden chain. Decoding the original signal (i.e., hidden chain) from the noisy observations is one of the main goals in nearly all HMM based data analyses. Existing decoding algorithms such as the Viterbi algorithm have computational complexity at best l…

    Submitted 29 May, 2023; originally announced May 2023.

    MSC Class: 62M05

  42. arXiv:2305.16360  [pdf, other]

    cs.LG cs.CE stat.AP

    Modeling Task Relationships in Multi-variate Soft Sensor with Balanced Mixture-of-Experts

    Authors: Yuxin Huang, Hao Wang, Zhaoran Liu, Licheng Pan, Haozhe Li, Xinggao Liu

    Abstract: Accurate estimation of multiple quality variables is critical for building industrial soft sensor models, which have long been confronted with data efficiency and negative transfer issues. Methods sharing backbone parameters among tasks address the data efficiency issue; however, they still fail to mitigate the negative transfer problem. To address this issue, a balanced Mixture-of-Experts (BMoE)…

    Submitted 25 May, 2023; originally announced May 2023.

  43. arXiv:2305.07642  [pdf, other]

    cs.CV cs.AI cs.LG stat.ML

    The ASNR-MICCAI Brain Tumor Segmentation (BraTS) Challenge 2023: Intracranial Meningioma

    Authors: Dominic LaBella, Maruf Adewole, Michelle Alonso-Basanta, Talissa Altes, Syed Muhammad Anwar, Ujjwal Baid, Timothy Bergquist, Radhika Bhalerao, Sully Chen, Verena Chung, Gian-Marco Conte, Farouk Dako, James Eddy, Ivan Ezhov, Devon Godfrey, Fathi Hilal, Ariana Familiar, Keyvan Farahani, Juan Eugenio Iglesias, Zhifan Jiang, Elaine Johanson, Anahita Fathi Kazerooni, Collin Kent, John Kirkpatrick, Florian Kofler, et al. (35 additional authors not shown)

    Abstract: Meningiomas are the most common primary intracranial tumor in adults and can be associated with significant morbidity and mortality. Radiologists, neurosurgeons, neuro-oncologists, and radiation oncologists rely on multiparametric MRI (mpMRI) for diagnosis, treatment planning, and longitudinal treatment monitoring; yet automated, objective, and quantitative tools for non-invasive assessment of men…

    Submitted 12 May, 2023; originally announced May 2023.

  44. arXiv:2305.02542  [pdf, other]

    stat.ME cs.LG stat.AP stat.ML

    Correcting for Interference in Experiments: A Case Study at Douyin

    Authors: Vivek F. Farias, Hao Li, Tianyi Peng, Xinyuyang Ren, Huawei Zhang, Andrew Zheng

    Abstract: Interference is a ubiquitous problem in experiments conducted on two-sided content marketplaces, such as Douyin (China's analog of TikTok). In many cases, creators are the natural unit of experimentation, but creators interfere with each other through competition for viewers' limited time and attention. "Naive" estimators currently used in practice simply ignore the interference, but in doing so i…

    Submitted 4 May, 2023; originally announced May 2023.

  45. arXiv:2304.13972  [pdf, ps, other]

    math.OC cs.LG stat.ML

    Convergence of Adam Under Relaxed Assumptions

    Authors: Haochuan Li, Alexander Rakhlin, Ali Jadbabaie

    Abstract: In this paper, we provide a rigorous proof of convergence of the Adaptive Moment Estimate (Adam) algorithm for a wide class of optimization objectives. Despite the popularity and efficiency of the Adam algorithm in training deep neural networks, its theoretical properties are not yet fully understood, and existing convergence proofs require unrealistically strong assumptions, such as globally boun…

    Submitted 6 November, 2023; v1 submitted 27 April, 2023; originally announced April 2023.

    Comments: 35 pages

  46. arXiv:2303.17642  [pdf, other]

    stat.ME

    Change Point Detection on A Separable Model for Dynamic Networks

    Authors: Yik Lun Kei, Hangjian Li, Yanzhen Chen, Oscar Hernan Madrid Padilla

    Abstract: This paper studies change point detection in time series of networks, with the Separable Temporal Exponential-family Random Graph Model (STERGM). Dynamic network patterns can be inherently complex due to dyadic and temporal dependence. Detection of the change points can identify the discrepancies in the underlying data generating processes and facilitate downstream analysis. The STERGM that utiliz…

    Submitted 21 February, 2024; v1 submitted 30 March, 2023; originally announced March 2023.

  47. arXiv:2303.06423  [pdf, other]

    q-bio.QM cs.LG physics.data-an q-bio.MN stat.ME

    Learning interpretable causal networks from very large datasets, application to 400,000 medical records of breast cancer patients

    Authors: Marcel da Câmara Ribeiro-Dantas, Honghao Li, Vincent Cabeli, Louise Dupuis, Franck Simon, Liza Hettal, Anne-Sophie Hamy, Hervé Isambert

    Abstract: Discovering causal effects is at the core of scientific investigation but remains challenging when only observational data is available. In practice, causal networks are difficult to learn and interpret, and limited to relatively small datasets. We report a more reliable and scalable causal discovery method (iMIIC), based on a general mutual information supremum principle, which greatly improves t…

    Submitted 11 March, 2023; originally announced March 2023.

    Comments: 19 pages, 6 figures, 8 supplementary figures and 5 pages supporting information

  48. arXiv:2303.04030  [pdf, other]

    stat.ML cs.AI cs.LG cs.SE

    PyXAB -- A Python Library for $\mathcal{X}$-Armed Bandit and Online Blackbox Optimization Algorithms

    Authors: Wenjie Li, Haoze Li, Jean Honorio, Qifan Song

    Abstract: We introduce a Python open-source library for $\mathcal{X}$-armed bandit and online blackbox optimization named PyXAB. PyXAB contains the implementations for more than 10 $\mathcal{X}$-armed bandit algorithms, such as HOO, StoSOO, HCT, and the most recent works GPO and VHCT. PyXAB also provides the most commonly-used synthetic objectives to evaluate the performance of different algorithms and the…

    Submitted 7 March, 2023; originally announced March 2023.

  49. arXiv:2303.00883  [pdf, other]

    cs.LG math.OC stat.ML

    Variance-reduced Clipping for Non-convex Optimization

    Authors: Amirhossein Reisizadeh, Haochuan Li, Subhro Das, Ali Jadbabaie

    Abstract: Gradient clipping is a standard training technique used in deep learning applications such as large-scale language modeling to mitigate exploding gradients. Recent experimental studies have demonstrated a fairly special behavior in the smoothness of the training objective along its trajectory when trained with gradient clipping. That is, the smoothness grows with the gradient norm. This is in clea…

    Submitted 2 June, 2023; v1 submitted 1 March, 2023; originally announced March 2023.

  50. arXiv:2302.13231  [pdf]

    eess.SY stat.AP

    A Synthetic Texas Backbone Power System with Climate-Dependent Spatio-Temporal Correlated Profiles

    Authors: ** Lu, Xingpeng Li, Hongyi Li, Taher Chegini, Carlos Gamarra, Y. C. Ethan Yang, Margaret Cook, Gavin Dillingham

    Abstract: Most power system test cases only have electrical parameters and can be used only for studies based on a snapshot of system profiles. To facilitate more comprehensive and practical studies, a synthetic power system including spatio-temporal correlated profiles for the entire year of 2019 at one-hour resolution has been created in this work. This system, referred to as the synthetic Texas 123-bus b…

    Submitted 25 February, 2023; originally announced February 2023.

    Comments: 10 pages, 14 figures, 12 tables