Showing 1–50 of 102 results for author: Xiao, Y

Searching in archive stat.

  1. arXiv:2406.13130  [pdf, other]

    cs.LG stat.ML

    Advancing Retail Data Science: Comprehensive Evaluation of Synthetic Data

    Authors: Yu Xia, Chi-Hua Wang, Joshua Mabry, Guang Cheng

    Abstract: The evaluation of synthetic data generation is crucial, especially in the retail sector where data accuracy is paramount. This paper introduces a comprehensive framework for assessing synthetic retail data, focusing on fidelity, utility, and privacy. Our approach differentiates between continuous and discrete data attributes, providing precise evaluation criteria. Fidelity is measured through stab…

    Submitted 18 June, 2024; originally announced June 2024.

  2. arXiv:2406.08180  [pdf, other]

    stat.CO stat.ME

    Stochastic Process-based Method for Degree-Degree Correlation of Evolving Networks

    Authors: Yue Xiao, Xiaojun Zhang

    Abstract: Existing studies on the degree correlation of evolving networks typically rely on differential equations and statistical analysis, resulting in only approximate solutions due to inherent randomness. To address this limitation, we propose an improved Markov chain method for modeling degree correlation in evolving networks. By redesigning the network evolution rules to reflect actual network dynamic…

    Submitted 12 June, 2024; originally announced June 2024.

  3. arXiv:2406.01864  [pdf, other]

    stat.CO

    Variance-reduced sampling importance resampling

    Authors: Yao Xiao, Kang Fu, Kun Li

    Abstract: The sampling importance resampling method is widely utilized in various fields, such as numerical integration and statistical simulation. In this paper, two modified methods are presented by incorporating two variance reduction techniques commonly used in Monte Carlo simulation, namely antithetic sampling and Latin hypercube sampling, into the process of sampling importance resampling method respe…

    Submitted 3 June, 2024; originally announced June 2024.

  4. arXiv:2405.10469  [pdf, other]

    cs.AI cs.LG econ.EM stat.ML

    Simulation-Based Benchmarking of Reinforcement Learning Agents for Personalized Retail Promotions

    Authors: Yu Xia, Sriram Narayanamoorthy, Zhengyuan Zhou, Joshua Mabry

    Abstract: The development of open benchmarking platforms could greatly accelerate the adoption of AI agents in retail. This paper presents comprehensive simulations of customer shopping behaviors for the purpose of benchmarking reinforcement learning (RL) agents that optimize coupon targeting. The difficulty of this learning problem is largely driven by the sparsity of customer purchase events. We trained a…

    Submitted 16 May, 2024; originally announced May 2024.

  5. arXiv:2403.07213  [pdf, other]

    cs.LG stat.ML

    Which LLM to Play? Convergence-Aware Online Model Selection with Time-Increasing Bandits

    Authors: Yu Xia, Fang Kong, Tong Yu, Liya Guo, Ryan A. Rossi, Sungchul Kim, Shuai Li

    Abstract: Web-based applications such as chatbots, search engines and news recommendations continue to grow in scale and complexity with the recent surge in the adoption of LLMs. Online model selection has thus garnered increasing attention due to the need to choose the best model among a diverse set while balancing task reward and exploration cost. Organizations face decisions like whether to employ a cos…

    Submitted 11 March, 2024; originally announced March 2024.

    Comments: Accepted by WWW'24 (Oral)

  6. arXiv:2401.05784  [pdf, other]

    econ.EM stat.ME

    Covariance Function Estimation for High-Dimensional Functional Time Series with Dual Factor Structures

    Authors: Chenlei Leng, Degui Li, Hanlin Shang, Yingcun Xia

    Abstract: We propose a flexible dual functional factor model for modelling high-dimensional functional time series. In this model, a high-dimensional fully functional factor parametrisation is imposed on the observed functional processes, whereas a low-dimensional version (via series approximation) is assumed for the latent functional factors. We extend the classic principal component analysis technique for…

    Submitted 12 January, 2024; v1 submitted 11 January, 2024; originally announced January 2024.

  7. arXiv:2312.16769  [pdf, other]

    stat.ME q-bio.NC stat.AP

    Estimation and Inference for High-dimensional Multi-response Growth Curve Model

    Authors: Xin Zhou, Yin Xia, Lexin Li

    Abstract: A growth curve model (GCM) aims to characterize how an outcome variable evolves, develops and grows as a function of time, along with other predictors. It provides a particularly useful framework to model growth trends in longitudinal data. However, the estimation and inference of GCM with a large number of response variables face numerous challenges and remain underdeveloped. In this article, w…

    Submitted 27 December, 2023; originally announced December 2023.

  8. arXiv:2312.14095  [pdf, other]

    stat.AP cs.AI cs.LG econ.EM

    RetailSynth: Synthetic Data Generation for Retail AI Systems Evaluation

    Authors: Yu Xia, Ali Arian, Sriram Narayanamoorthy, Joshua Mabry

    Abstract: Significant research effort has been devoted in recent years to developing personalized pricing, promotions, and product recommendation algorithms that can leverage rich customer data to learn and earn. Systematic benchmarking and evaluation of these causal learning systems remains a critical challenge, due to the lack of suitable datasets and simulation environments. In this work, we propose a mu…

    Submitted 21 December, 2023; originally announced December 2023.

    Comments: 30 pages, 8 figures

  9. arXiv:2311.16771  [pdf, other]

    stat.ML cs.LG eess.SP

    The HR-Calculus: Enabling Information Processing with Quaternion Algebra

    Authors: Danilo P. Mandic, Sayed Pouria Talebi, Clive Cheong Took, Yili Xia, Dongpo Xu, Min Xiang, Pauline Bourigault

    Abstract: From their inception, quaternions and their division algebra have proven to be advantageous in modelling rotation/orientation in three-dimensional spaces and have seen use from the initial formulation of electromagnetic field theory through to forming the basis of quantum field theory. Despite their impressive versatility in modelling real-world phenomena, adaptive information processing technique…

    Submitted 28 November, 2023; originally announced November 2023.

  10. arXiv:2310.17845  [pdf, other]

    stat.ME

    A Unified and Optimal Multiple Testing Framework based on rho-values

    Authors: Bowen Gang, Shenghao Qin, Yin Xia

    Abstract: Multiple testing is an important research direction that has gained major attention in recent years. Currently, most multiple testing procedures are designed with p-values or Local false discovery rate (Lfdr) statistics. However, p-values obtained by applying probability integral transform to some well-known test statistics often do not incorporate information from the alternatives, resulting in s…

    Submitted 26 October, 2023; originally announced October 2023.

  11. arXiv:2310.15653  [pdf, other]

    cs.LG cs.SI stat.ML

    Deceptive Fairness Attacks on Graphs via Meta Learning

    Authors: Jian Kang, Yinglong Xia, Ross Maciejewski, Jiebo Luo, Hanghang Tong

    Abstract: We study deceptive fairness attacks on graphs to answer the following question: How can we achieve poisoning attacks on a graph learning model to exacerbate the bias deceptively? We answer this question via a bi-level optimization problem and propose a meta learning-based framework named FATE. FATE is broadly applicable with respect to various fairness definitions and graph learning models, as wel…

    Submitted 24 October, 2023; originally announced October 2023.

    Comments: 23 pages, 11 tables

  12. arXiv:2310.08798  [pdf, other]

    stat.ME stat.AP stat.ML

    Alteration Detection of Tensor Dependence Structure via Sparsity-Exploited Reranking Algorithm

    Authors: Li Ma, Shenghao Qin, Yin Xia

    Abstract: Tensor-valued data arise frequently from a wide variety of scientific applications, and many among them can be translated into an alteration detection problem of tensor dependence structures. In this article, we formulate the problem under the popularly adopted tensor-normal distributions and aim at two-sample correlation/partial correlation comparisons of tensor-valued observations. Through decor…

    Submitted 12 October, 2023; originally announced October 2023.

  13. arXiv:2308.00894  [pdf, other]

    cs.IR cs.LG stat.ME

    User-Controllable Recommendation via Counterfactual Retrospective and Prospective Explanations

    Authors: Juntao Tan, Yingqiang Ge, Yan Zhu, Yinglong Xia, Jiebo Luo, Jianchao Ji, Yongfeng Zhang

    Abstract: Modern recommender systems utilize users' historical behaviors to generate personalized recommendations. However, these systems often lack user controllability, leading to diminished user satisfaction and trust in the systems. Acknowledging the recent advancements in explainable recommender systems that enhance users' understanding of recommendation mechanisms, we propose leveraging these advancem…

    Submitted 1 August, 2023; originally announced August 2023.

    Comments: Accepted for presentation at 26th European Conference on Artificial Intelligence (ECAI2023)

  14. arXiv:2306.08489  [pdf, ps, other]

    stat.ML cs.LG math.SP

    Analysis and Approximate Inference of Large Random Kronecker Graphs

    Authors: Zhenyu Liao, Yuanqian Xia, Chengmei Niu, Yong Xiao

    Abstract: Random graph models are playing an increasingly important role in various fields ranging from social networks and telecommunication systems to physiologic and biological networks. Within this landscape, the random Kronecker graph model emerges as a prominent framework for scrutinizing intricate real-world networks. In this paper, we investigate large random Kronecker graphs, i.e., the number of gra…

    Submitted 5 February, 2024; v1 submitted 14 June, 2023; originally announced June 2023.

    Comments: 27 pages, 5 figures, 2 tables

  15. arXiv:2305.05367  [pdf]

    stat.AP

    Exploring assessment method of technological advancement based on literature cross-citation

    Authors: Shengxuan Tang, Liming Zhang, Shuo Jiang, Ming Cai, Yao Xiao

    Abstract: Assessing advancements of technology is essential for creating science and technology policies and making informed investments in the technology market. However, current methods primarily focus on the characteristics of the technologies themselves, making it difficult to accurately assess technologies across various fields and generations. To address this challenge, we propose a novel approach tha…

    Submitted 9 May, 2023; originally announced May 2023.

    Comments: 15 pages, 6 figures

  16. arXiv:2304.12502  [pdf, ps, other]

    cs.LG cs.IT eess.SP stat.ME

    Causal Semantic Communication for Digital Twins: A Generalizable Imitation Learning Approach

    Authors: Christo Kurisummoottil Thomas, Walid Saad, Yong Xiao

    Abstract: A digital twin (DT) leverages a virtual representation of the physical world, along with communication (e.g., 6G), computing (e.g., edge computing), and artificial intelligence (AI) technologies to enable many connected intelligence services. In order to handle the large amounts of network data based on digital twins (DTs), wireless systems can exploit the paradigm of semantic communication (SC) f…

    Submitted 24 April, 2023; originally announced April 2023.

  17. arXiv:2302.14247  [pdf, ps, other]

    stat.AP cs.LG

    Sequential edge detection using joint hierarchical Bayesian learning

    Authors: Yao Xiao, Anne Gelb, Guohui Song

    Abstract: This paper introduces a new sparse Bayesian learning (SBL) algorithm that jointly recovers a temporal sequence of edge maps from noisy and under-sampled Fourier data. The new method is cast in a Bayesian framework and uses a prior that simultaneously incorporates intra-image information to promote sparsity in each individual edge map with inter-image information to promote similarities in any unch…

    Submitted 27 February, 2023; originally announced February 2023.

    MSC Class: 15A29; 62F15; 65F22; 65K10; 68U10

  18. arXiv:2302.11173  [pdf, other]

    math.NA physics.comp-ph stat.ML

    VI-DGP: A variational inference method with deep generative prior for solving high-dimensional inverse problems

    Authors: Yingzhi Xia, Qifeng Liao, Jinglai Li

    Abstract: Solving high-dimensional Bayesian inverse problems (BIPs) with the variational inference (VI) method is promising but still challenging. The main difficulties arise from two aspects. First, VI methods approximate the posterior distribution using a simple and analytic variational distribution, which makes it difficult to estimate complex spatially-varying parameters in practice. Second, VI methods…

    Submitted 22 February, 2023; originally announced February 2023.

    MSC Class: 35R30; 62F15; 68T07

  19. arXiv:2302.05790  [pdf, other]

    stat.ME stat.ML

    Dimension Reduction and MARS

    Authors: Yu Liu, Degui Li, Yingcun Xia

    Abstract: The multivariate adaptive regression spline (MARS) is one of the popular estimation methods for nonparametric multivariate regressions. However, as MARS is based on marginal splines, to incorporate interactions of covariates, products of the marginal splines must be used, which leads to an unmanageable number of basis functions when the order of interaction is high and results in low estimation ef…

    Submitted 4 July, 2023; v1 submitted 11 February, 2023; originally announced February 2023.

  20. arXiv:2301.10392  [pdf, other]

    stat.ME math.ST

    Statistical Inference and Large-scale Multiple Testing for High-dimensional Regression Models

    Authors: T. Tony Cai, Zijian Guo, Yin Xia

    Abstract: This paper presents a selective survey of recent developments in statistical inference and multiple testing for high-dimensional regression models, including linear and logistic regression. We examine the construction of confidence intervals and hypothesis tests for various low-dimensional objectives such as regression coefficients and linear and quadratic functionals. The key technique is to gene…

    Submitted 24 January, 2023; originally announced January 2023.

  21. arXiv:2301.05708  [pdf, other]

    stat.ML cs.LG

    A domain-decomposed VAE method for Bayesian inverse problems

    Authors: Zhihang Xu, Yingzhi Xia, Qifeng Liao

    Abstract: Bayesian inverse problems are often computationally challenging when the forward model is governed by complex partial differential equations (PDEs). This is typically caused by expensive forward model evaluations and high-dimensional parameterization of priors. This paper proposes a domain-decomposed variational auto-encoder Markov chain Monte Carlo (DD-VAE-MCMC) method to tackle these challenges…

    Submitted 6 February, 2023; v1 submitted 9 January, 2023; originally announced January 2023.

    Comments: arXiv admin note: text overlap with arXiv:2211.04026 by other authors

  22. arXiv:2212.12767  [pdf, other]

    stat.ML cs.LG

    Streaming Traffic Flow Prediction Based on Continuous Reinforcement Learning

    Authors: Yanan Xiao, Minyu Liu, Zichen Zhang, Lu Jiang, Minghao Yin, Jianan Wang

    Abstract: Traffic flow prediction is an important part of smart transportation. The goal is to predict future traffic conditions based on historical data recorded by sensors and the traffic network. As the city continues to build, parts of the transportation network will be added or modified. How to accurately predict expanding and evolving long-term streaming networks is of great significance. To this end,…

    Submitted 24 December, 2022; originally announced December 2022.

  23. Poisson multi-Bernoulli mixture filter with general target-generated measurements and arbitrary clutter

    Authors: Ángel F. García-Fernández, Yuxuan Xia, Lennart Svensson

    Abstract: This paper shows that the Poisson multi-Bernoulli mixture (PMBM) density is a multi-target conjugate prior for general target-generated measurement distributions and arbitrary clutter distributions. That is, for this multi-target measurement model and the standard multi-target dynamic model with Poisson birth model, the predicted and filtering densities are PMBMs. We derive the corresponding PMBM…

    Submitted 24 May, 2023; v1 submitted 24 October, 2022; originally announced October 2022.

    Comments: Matlab code available at https://github.com/Agarciafernandez/MTT and https://github.com/yuhsuansia/Extented-target-PMBM-filter-independent-clutter-sources

    Journal ref: Á. F. García-Fernández, Y. Xia, L. Svensson, "Poisson multi-Bernoulli mixture filter with general target-generated measurements and arbitrary clutter", IEEE Transactions on Signal Processing, vol. 71, 2023

  24. arXiv:2210.04714  [pdf, other]

    cs.CL cs.LG stat.ML

    Uncertainty Quantification with Pre-trained Language Models: A Large-Scale Empirical Analysis

    Authors: Yuxin Xiao, Paul Pu Liang, Umang Bhatt, Willie Neiswanger, Ruslan Salakhutdinov, Louis-Philippe Morency

    Abstract: Pre-trained language models (PLMs) have gained increasing popularity due to their compelling prediction performance in diverse natural language processing (NLP) tasks. When formulating a PLM-based prediction pipeline for NLP tasks, it is also crucial for the pipeline to minimize the calibration error, especially in safety-critical applications. That is, the pipeline should reliably indicate when w…

    Submitted 14 October, 2022; v1 submitted 10 October, 2022; originally announced October 2022.

    Comments: Accepted by EMNLP 2022 (Findings)

  25. arXiv:2209.13281  [pdf, other]

    stat.ME

    Robust Fused Lasso Penalized Huber Regression with Nonasymptotic Property and Implementation Studies

    Authors: Xin Xin, Boyi Xie, Yunhai Xiao

    Abstract: For some special data in reality, such as genetic data, adjacent genes may have similar functions. Thus, ensuring smoothness between adjacent genes is highly necessary. In this case, however, the standard lasso penalty no longer seems appropriate. On the other hand, in high-dimensional statistics, some datasets are easily contaminated by outliers or contain variables with heavy-tai…

    Submitted 27 September, 2022; v1 submitted 27 September, 2022; originally announced September 2022.

  26. Methodological concerns about 'concordance-statistic for benefit' as a measure of discrimination in treatment benefit prediction

    Authors: Yuan Xia, Paul Gustafson, Mohsen Sadatsafavi

    Abstract: Prediction algorithms that quantify the expected benefit of a given treatment conditional on patient characteristics can critically inform medical decisions. Quantifying the performance of treatment benefit prediction algorithms is an active area of research. A recently proposed metric, the concordance statistic for benefit (cfb), evaluates the discriminative ability of a treatment benefit predict…

    Submitted 15 May, 2023; v1 submitted 29 August, 2022; originally announced August 2022.

    Comments: 12 pages, 6 figures

  27. arXiv:2208.08754  [pdf, other]

    stat.ME

    A Decorrelating and Debiasing Approach to Simultaneous Inference for High-Dimensional Confounded Models

    Authors: Yinrui Sun, Li Ma, Yin Xia

    Abstract: Motivated by the simultaneous association analysis with the presence of latent confounders, this paper studies the large-scale hypothesis testing problem for the high-dimensional confounded linear models with both non-asymptotic and asymptotic false discovery control. Such a model covers a wide range of practical settings where both the response and the predictors may be confounded. In the presence…

    Submitted 22 August, 2023; v1 submitted 18 August, 2022; originally announced August 2022.

  28. arXiv:2207.06156  [pdf, other]

    stat.AP cs.CV eess.SY

    A comparison between PMBM Bayesian track initiation and labelled RFS adaptive birth

    Authors: Ángel F. García-Fernández, Yuxuan Xia, Lennart Svensson

    Abstract: This paper provides a comparative analysis between the adaptive birth model used in the labelled random finite set literature and the track initiation in the Poisson multi-Bernoulli mixture (PMBM) filter, with point-target models. The PMBM track initiation is obtained via Bayes' rule applied on the predicted PMBM density, and creates one Bernoulli component for each received measurement, represent…

    Submitted 13 July, 2022; originally announced July 2022.

    Comments: Matlab implementations of PMBM filters can be found at https://github.com/Agarciafernandez/MTT and https://github.com/yuhsuansia

    Journal ref: Proceedings of the 25th International Conference on Information Fusion, 2022

  29. arXiv:2203.11461  [pdf, other]

    stat.ME stat.ML

    Locally Adaptive Algorithms for Multiple Testing with Network Structure, with Application to Genome-Wide Association Studies

    Authors: Ziyi Liang, T. Tony Cai, Wenguang Sun, Yin Xia

    Abstract: Linkage analysis has provided valuable insights to the GWAS studies, particularly in revealing that SNPs in linkage disequilibrium (LD) can jointly influence disease phenotypes. However, the potential of LD network data has often been overlooked or underutilized in the literature. In this paper, we propose a locally adaptive structure learning algorithm (LASLA) that provides a principled and gener…

    Submitted 16 August, 2023; v1 submitted 22 March, 2022; originally announced March 2022.

    Comments: 33 pages, 7 figures

  30. arXiv:2201.10043  [pdf, other]

    stat.ME

    NAPA: Neighborhood-Assisted and Posterior-Adjusted Two-sample Inference

    Authors: Li Ma, Yin Xia, Lexin Li

    Abstract: Two-sample multiple testing problems of sparse spatial data are frequently arising in a variety of scientific applications. In this article, we develop a novel neighborhood-assisted and posterior-adjusted (NAPA) approach to incorporate both the spatial smoothness and sparsity type side information to improve the power of the test while controlling the false discovery of multiple testing. We transl…

    Submitted 31 July, 2023; v1 submitted 24 January, 2022; originally announced January 2022.

  31. arXiv:2112.04243  [pdf]

    stat.AP

    Hybrid Data-driven Framework for Shale Gas Production Performance Analysis via Game Theory, Machine Learning and Optimization Approaches

    Authors: Jin Meng, Yujie Zhou, Tianrui Ye, Yitian Xiao

    Abstract: A comprehensive and precise analysis of shale gas production performance is crucial for evaluating resource potential, designing field development plans, and making investment decisions. However, quantitative analysis can be challenging because production performance is dominated by a complex interaction among a series of geological and engineering factors. In this study, we propose a hybrid data-d…

    Submitted 7 June, 2022; v1 submitted 8 December, 2021; originally announced December 2021.

    Comments: 37 pages, 15 figures, 6 tables

  32. arXiv:2111.15367  [pdf, other]

    q-fin.ST cs.LG stat.AP

    A Review on Graph Neural Network Methods in Financial Applications

    Authors: Jianian Wang, Sheng Zhang, Yanghua Xiao, Rui Song

    Abstract: With multiple components and relations, financial data are often presented as graph data, since it could represent both the individual features and the complicated relations. Due to the complexity and volatility of the financial market, the graph constructed on the financial data is often heterogeneous or time-varying, which imposes challenges on modeling technology. Among the graph modeling techn…

    Submitted 26 April, 2022; v1 submitted 26 November, 2021; originally announced November 2021.

  33. arXiv:2111.10766  [pdf, other]

    stat.ME math.OC

    Semismooth Newton Augmented Lagrangian Algorithm for Adaptive Lasso Penalized Least Squares in Semiparametric Regression

    Authors: Meixia Yang, Yunhai Xiao, Peili Li, Hanbing Zhu

    Abstract: This paper is concerned with a partially linear semiparametric regression model containing an unknown regression coefficient, an unknown nonparametric function, and an unobservable Gaussian distributed random error. We focus on the case of simultaneous variable selection and estimation with a divergent number of covariates under the assumption that the regression coefficient is sparse. We consider…

    Submitted 8 February, 2022; v1 submitted 21 November, 2021; originally announced November 2021.

  34. arXiv:2111.03943  [pdf, ps, other]

    cs.LG stat.CO stat.ML

    A Probit Tensor Factorization Model For Relational Learning

    Authors: Ye Liu, Rui Song, Wenbin Lu, Yanghua Xiao

    Abstract: With the proliferation of knowledge graphs, modeling data with complex multirelational structure has gained increasing attention in the area of statistical relational learning. One of the most important goals of statistical relational learning is link prediction, i.e., predicting whether certain relations exist in the knowledge graph. A large number of models and algorithms have been proposed to p…

    Submitted 8 November, 2021; v1 submitted 6 November, 2021; originally announced November 2021.

    Comments: 30 pages

  35. arXiv:2106.10121  [pdf, other]

    cs.LG stat.ML

    ScoreGrad: Multivariate Probabilistic Time Series Forecasting with Continuous Energy-based Generative Models

    Authors: Tijin Yan, Hongwei Zhang, Tong Zhou, Yufeng Zhan, Yuanqing Xia

    Abstract: Multivariate time series prediction has attracted a lot of attention because of its wide applications such as intelligent transportation and AIOps. Generative models have achieved impressive results in time series modeling because they can model data distribution and take noise into consideration. However, many existing works cannot be widely used because of the constraints of functional form of ge…

    Submitted 18 June, 2021; originally announced June 2021.

    Comments: 12 pages, 10 figures

  36. arXiv:2106.09179  [pdf, other]

    cs.LG cs.AI stat.ML

    Amortized Auto-Tuning: Cost-Efficient Bayesian Transfer Optimization for Hyperparameter Recommendation

    Authors: Yuxin Xiao, Eric P. Xing, Willie Neiswanger

    Abstract: With the surge in the number of hyperparameters and training times of modern machine learning models, hyperparameter tuning is becoming increasingly expensive. However, after assessing 40 tuning methods systematically, we find that each faces certain limitations. In particular, methods that speed up tuning via knowledge transfer typically require the final performance of hyperparameters and do not…

    Submitted 7 April, 2022; v1 submitted 16 June, 2021; originally announced June 2021.

  37. arXiv:2105.00393  [pdf, other]

    math.ST stat.ML

    Directional FDR Control for Sub-Gaussian Sparse GLMs

    Authors: Chang Cui, Jinzhu Jia, Yijun Xiao, Huiming Zhang

    Abstract: High-dimensional sparse generalized linear models (GLMs) have emerged in the setting that the number of samples and the dimension of variables are large, and even the dimension of variables grows faster than the number of samples. False discovery rate (FDR) control aims to identify some small number of statistically significantly nonzero results after getting the sparse penalized estimation of GLM…

    Submitted 2 May, 2021; originally announced May 2021.

    Comments: 37 pages

  38. Bayesian multiscale deep generative model for the solution of high-dimensional inverse problems

    Authors: Yingzhi Xia, Nicholas Zabaras

    Abstract: Estimation of spatially-varying parameters for computationally expensive forward models governed by partial differential equations is addressed. A novel multiscale Bayesian inference approach is introduced based on deep probabilistic generative models. Such generative models provide a flexible representation by inferring on each scale a low-dimensional latent encoding while allowing hierarchical p…

    Submitted 10 February, 2021; v1 submitted 4 February, 2021; originally announced February 2021.

  39. arXiv:2011.04464  [pdf, other]

    stat.ME cs.CV stat.AP

    A Poisson multi-Bernoulli mixture filter for coexisting point and extended targets

    Authors: Ángel F. García-Fernández, Jason L. Williams, Lennart Svensson, Yuxuan Xia

    Abstract: This paper proposes a Poisson multi-Bernoulli mixture (PMBM) filter for coexisting point and extended targets, i.e., for scenarios where there may be simultaneous point and extended targets. The PMBM filter provides a recursion to compute the multi-target filtering posterior based on probabilistic information on data associations, and single-target predictions and updates. In this paper, we first…

    Submitted 18 May, 2021; v1 submitted 9 November, 2020; originally announced November 2020.

    Comments: Matlab files can be found at https://github.com/Agarciafernandez/Coexisting-point-extended-target-PMBM-filter and https://github.com/yuhsuansia/Coexisting-point-extended-target-PMBM-filter. A relevant multi-object tracking course can be found at https://www.youtube.com/channel/UCa2-fpj6AV8T6JK1uTRuFpw

    Journal ref: in IEEE Transactions on Signal Processing, vol. 69, pp. 2600-2610, 2021

  40. arXiv:2010.10698  [pdf, other]

    cs.LG math.OC stat.CO

    Batch Sequential Adaptive Designs for Global Optimization

    Authors: Jianhui Ning, Yao Xiao, Zikang Xiong

    Abstract: Compared with the fixed-run designs, the sequential adaptive designs (SAD) are thought to be more efficient and effective. Efficient global optimization (EGO) is one of the most popular SAD methods for expensive black-box optimization problems. A well-recognized weakness of the original EGO in complex computer experiments is that it is serial, and hence the modern parallel computing techniques can…

    Submitted 20 October, 2020; originally announced October 2020.

    Comments: 20 pages, 4 figures, 8 tables

    MSC Class: G.3

  41. arXiv:2009.11469  [pdf, other]

    cs.LG cs.AI stat.ML

    Revisiting Graph Convolutional Network on Semi-Supervised Node Classification from an Optimization Perspective

    Authors: Hongwei Zhang, Tijin Yan, Zenjun Xie, Yuanqing Xia, Yuan Zhang

    Abstract: Graph convolutional networks (GCNs) have achieved promising performance on various graph-based tasks. However, they suffer from over-smoothing when stacking more layers. In this paper, we present a quantitative study on this observation and develop novel insights towards the deeper GCN. First, we interpret the current graph convolutional operations from an optimization perspective and argue that ov…

    Submitted 24 September, 2020; v1 submitted 23 September, 2020; originally announced September 2020.

  42. arXiv:2009.00538  [pdf, other]

    stat.ML cs.LG

    Stochastic Graph Recurrent Neural Network

    Authors: Tijin Yan, Hongwei Zhang, Zirui Li, Yuanqing Xia

    Abstract: Representation learning over graph structure data has been widely studied due to its wide application prospects. However, previous methods mainly focus on static graphs while many real-world graphs evolve over time. Modeling such evolution is important for predicting properties of unseen networks. To resolve this challenge, we propose SGRNN, a novel neural architecture that applies stochastic late…

    Submitted 1 September, 2020; originally announced September 2020.

  43. arXiv:2009.00399  [pdf, other

    stat.ME

    Accounting for correlated horizontal pleiotropy in two-sample Mendelian randomization using correlated instrumental variants

    Authors: Qing Cheng, Baoluo Sun, Yingcun Xia, Jin Liu

    Abstract: Mendelian randomization (MR) is a powerful approach to examine the causal relationships between health risk factors and outcomes from observational studies. Due to the proliferation of genome-wide association studies (GWASs) and abundant fully accessible GWASs summary statistics, a variety of two-sample MR methods for summary data have been developed to either detect or account for horizontal plei…

    Submitted 1 September, 2020; originally announced September 2020.

  44. arXiv:2009.00236  [pdf, other]

    cs.LG stat.ML

    A Survey of Deep Active Learning

    Authors: Pengzhen Ren, Yun Xiao, Xiaojun Chang, Po-Yao Huang, Zhihui Li, Brij B. Gupta, Xiaojiang Chen, Xin Wang

    Abstract: Active learning (AL) attempts to maximize the performance gain of the model by marking the fewest samples. Deep learning (DL) is greedy for data and requires a large amount of data supply to optimize massive parameters, so that the model learns how to extract high-quality features. In recent years, due to the rapid development of internet technology, we are in an era of information torrents and we…

    Submitted 5 December, 2021; v1 submitted 30 August, 2020; originally announced September 2020.

  45. arXiv:2008.07298  [pdf, other]

    cs.CR cs.DC cs.LG stat.ML

    WAFFLE: Watermarking in Federated Learning

    Authors: Buse Gul Atli, Yuxi Xia, Samuel Marchal, N. Asokan

    Abstract: Federated learning is a distributed learning technique where machine learning models are trained on client devices in which the local training data resides. The training is coordinated via a central server which is, typically, controlled by the intended owner of the resulting model. By avoiding the need to transport the training data to the central server, federated learning improves privacy and e…

    Submitted 22 July, 2021; v1 submitted 17 August, 2020; originally announced August 2020.

    Comments: Will appear in the proceedings of SRDS 2021; 14 pages, 11 figures, 10 tables

  46. arXiv:2007.12349  [pdf, other]

    cs.LG stat.ML

    Adversarial Mixture Of Experts with Category Hierarchy Soft Constraint

    Authors: Zhuojian Xiao, Yunjiang jiang, Guoyu Tang, Lin Liu, Sulong Xu, Yun Xiao, Weipeng Yan

    Abstract: Product search is the most common way for people to satisfy their shopping needs on e-commerce websites. Products are typically annotated with one of several broad categorical tags, such as "Clothing" or "Electronics", as well as finer-grained categories like "Refrigerator" or "TV", both under "Electronics". These tags are used to construct a hierarchy of query categories. Distributions of feature…

    Submitted 2 March, 2021; v1 submitted 24 July, 2020; originally announced July 2020.

  47. arXiv:2007.04649  [pdf, other]

    cs.LG stat.ML

    Learning to Reweight with Deep Interactions

    Authors: Yang Fan, Yingce Xia, Lijun Wu, Shufang Xie, Weiqing Liu, Jiang Bian, Tao Qin, Xiang-Yang Li

    Abstract: Recently, the concept of teaching has been introduced into machine learning, in which a teacher model is used to guide the training of a student model (which will be used in real tasks) through data selection, loss function design, etc. Learning to reweight, which is a specific kind of teaching that reweights training data using a teacher model, receives much attention due to its simplicity and ef…

    Submitted 12 January, 2021; v1 submitted 9 July, 2020; originally announced July 2020.

    Comments: Accepted to AAAI-2021

  48. arXiv:2006.16501  [pdf, other]

    stat.ME math.ST

    Testing and Support Recovery of Correlation Structures for Matrix-Valued Observations with an Application to Stock Market Data

    Authors: Xin Chen, Dan Yang, Yan Xu, Yin Xia, Dong Wang, Haipeng Shen

    Abstract: Estimation of the covariance matrix of asset returns is crucial to portfolio construction. As suggested by economic theories, the correlation structure among assets differs between emerging markets and developed countries. It is therefore imperative to make rigorous statistical inference on correlation matrix equality between the two groups of countries. However, if the traditional vector-valued a…

    Submitted 27 September, 2021; v1 submitted 29 June, 2020; originally announced June 2020.

  49. arXiv:2006.10932  [pdf, other]

    cs.IR cs.LG stat.ML

    Convolutional Gaussian Embeddings for Personalized Recommendation with Uncertainty

    Authors: Junyang Jiang, Deqing Yang, Yanghua Xiao, Chenlu Shen

    Abstract: Most of existing embedding based recommendation models use embeddings (vectors) corresponding to a single fixed point in low-dimensional space, to represent users and items. Such embeddings fail to precisely represent the users/items with uncertainty often observed in recommender systems. Addressing this problem, we propose a unified deep recommendation framework employing Gaussian embeddings, whi…

    Submitted 18 June, 2020; originally announced June 2020.

    Journal ref: IJCAI 2019

  50. arXiv:2006.10783  [pdf, other]

    hep-th math.AG math.CO stat.ML

    Quiver Mutations, Seiberg Duality and Machine Learning

    Authors: Jiakang Bao, Sebastián Franco, Yang-Hui He, Edward Hirst, Gregg Musiker, Yan Xiao

    Abstract: We initiate the study of applications of machine learning to Seiberg duality, focusing on the case of quiver gauge theories, a problem also of interest in mathematics in the context of cluster algebras. Within the general theme of Seiberg duality, we define and explore a variety of interesting questions, broadly divided into the binary determination of whether a pair of theories picked from a seri…

    Submitted 18 June, 2020; originally announced June 2020.

    Comments: 57 pages

    MSC Class: 13F60; 81T13; 81T30

    Journal ref: Phys. Rev. D 102, 086013 (2020)