Showing 1–50 of 101 results for author: Yu, J

Searching in archive stat.
  1. arXiv:2406.05745  [pdf, other]

    stat.ML cs.AI cs.LG

    Structured Learning of Compositional Sequential Interventions

    Authors: Jialin Yu, Andreas Koukorinis, Nicolò Colombo, Yuchen Zhu, Ricardo Silva

    Abstract: We consider sequential treatment regimes where each unit is exposed to combinations of interventions over time. When interventions are described by qualitative labels, such as ``close schools for a month due to a pandemic'' or ``promote this podcast to this user during this week'', it is unclear which appropriate structural assumptions allow us to generalize behavioral predictions to previously un…

    Submitted 9 June, 2024; originally announced June 2024.

  2. arXiv:2405.05695  [pdf, other]

    cs.LG cs.AI cs.CV stat.ML

    Aux-NAS: Exploiting Auxiliary Labels with Negligibly Extra Inference Cost

    Authors: Yuan Gao, Weizhong Zhang, Wenhan Luo, Lin Ma, Jin-Gang Yu, Gui-Song Xia, Jiayi Ma

    Abstract: We aim at exploiting additional auxiliary labels from an independent (auxiliary) task to boost the primary task performance which we focus on, while preserving a single task inference cost of the primary task. While most existing auxiliary learning methods are optimization-based relying on loss weights/gradients manipulation, our method is architecture-based with a flexible asymmetric structure fo…

    Submitted 9 May, 2024; originally announced May 2024.

    Comments: Accepted to ICLR 2024

    Journal ref: International Conference on Learning Representations (ICLR), 2024

  3. arXiv:2405.02087  [pdf, other]

    econ.EM stat.ME

    Testing for an Explosive Bubble using High-Frequency Volatility

    Authors: H. Peter Boswijk, Jun Yu, Yang Zu

    Abstract: Based on a continuous-time stochastic volatility model with a linear drift, we develop a test for explosive behavior in financial asset prices at a low frequency when prices are sampled at a higher frequency. The test exploits the volatility information in the high-frequency data. The method consists of devolatizing log-asset price increments with realized volatility measures and performing a supr…

    Submitted 3 May, 2024; originally announced May 2024.

  4. arXiv:2404.13522  [pdf, other]

    cs.AI cs.LG stat.ML

    Error Analysis of Shapley Value-Based Model Explanations: An Informative Perspective

    Authors: Ningsheng Zhao, Jia Yuan Yu, Krzysztof Dzieciolowski, Trang Bui

    Abstract: Shapley value attribution (SVA) is an increasingly popular explainable AI (XAI) method, which quantifies the contribution of each feature to the model's output. However, recent work has shown that most existing methods to implement SVAs have some drawbacks, resulting in biased or unreliable explanations that fail to correctly capture the true intrinsic relationships between features and model outp…

    Submitted 29 May, 2024; v1 submitted 21 April, 2024; originally announced April 2024.

  5. arXiv:2401.16099  [pdf, other]

    stat.ME eess.IV

    A Ridgelet Approach to Poisson Denoising

    Authors: Ali Dadras, Klara Leffler, Jun Yu

    Abstract: This paper introduces a novel ridgelet transform-based method for Poisson image denoising. Our work focuses on harnessing the Poisson noise's unique non-additive and signal-dependent properties, distinguishing it from Gaussian noise. The core of our approach is a new thresholding scheme informed by theoretical insights into the ridgelet coefficients of Poisson-distributed images and adaptive thres…

    Submitted 29 January, 2024; originally announced January 2024.

    Comments: 11 pages, 8 figures

  6. arXiv:2312.12206  [pdf, other]

    cs.LG cs.AI stat.ME

    Identification of Causal Structure in the Presence of Missing Data with Additive Noise Model

    Authors: Jie Qiao, Zhengming Chen, Jianhua Yu, Ruichu Cai, Zhifeng Hao

    Abstract: Missing data are an unavoidable complication frequently encountered in many causal discovery tasks. When a missing process depends on the missing values themselves (known as self-masking missingness), the recovery of the joint distribution becomes unattainable, and detecting the presence of such self-masking missingness remains a perplexing challenge. Consequently, due to the inability to reconstr…

    Submitted 19 December, 2023; originally announced December 2023.

    Comments: Accepted by AAAI-2024

  7. arXiv:2312.01634  [pdf, ps, other]

    cs.LG cs.GT stat.ML

    Robust Streaming, Sampling, and a Perspective on Online Learning

    Authors: Evan Dogariu, Jiatong Yu

    Abstract: In this work we present an overview of statistical learning, followed by a survey of robust streaming techniques and challenges, culminating in several rigorous results proving the relationship that we motivate and hint at throughout the journey. Furthermore, we unify often disjoint theorems in a shared framework and notation to clarify the deep connections that are discovered. We hope that by app…

    Submitted 4 December, 2023; originally announced December 2023.

  8. arXiv:2311.14079  [pdf, other]

    cs.LG stat.ML

    Empirical Comparison between Cross-Validation and Mutation-Validation in Model Selection

    Authors: Jinyang Yu, Sami Hamdan, Leonard Sasse, Abigail Morrison, Kaustubh R. Patil

    Abstract: Mutation validation (MV) is a recently proposed approach for model selection, garnering significant interest due to its unique characteristics and potential benefits compared to the widely used cross-validation (CV) method. In this study, we empirically compared MV and $k$-fold CV using benchmark and real-world datasets. By employing Bayesian tests, we compared generalization estimates yielding th…

    Submitted 15 February, 2024; v1 submitted 23 November, 2023; originally announced November 2023.

  9. arXiv:2310.18910  [pdf, other]

    cs.LG cs.AI cs.CV stat.ML

    InstanT: Semi-supervised Learning with Instance-dependent Thresholds

    Authors: Muyang Li, Runze Wu, Haoyu Liu, Jun Yu, Xun Yang, Bo Han, Tongliang Liu

    Abstract: Semi-supervised learning (SSL) has been a fundamental challenge in machine learning for decades. The primary family of SSL algorithms, known as pseudo-labeling, involves assigning pseudo-labels to confident unlabeled instances and incorporating them into the training set. Therefore, the selection criteria of confident instances are crucial to the success of SSL. Recently, there has been growing in…

    Submitted 29 October, 2023; originally announced October 2023.

    Comments: Accepted as poster for NeurIPS 2023

  10. arXiv:2309.09564  [pdf, other]

    stat.ML cs.LG

    New Bounds on the Accuracy of Majority Voting for Multi-Class Classification

    Authors: Sina Aeeneh, Nikola Zlatanov, Jiangshan Yu

    Abstract: Majority voting is a simple mathematical function that returns the value that appears most often in a set. As a popular decision fusion technique, the majority voting function (MVF) finds applications in resolving conflicts, where a number of independent voters report their opinions on a classification problem. Despite its importance and its various applications in ensemble learning, data crowd-so…

    Submitted 18 September, 2023; originally announced September 2023.

  11. arXiv:2309.01161  [pdf, other]

    math.OC eess.SY stat.ME

    Probabilistic Reduced-Dimensional Vector Autoregressive Modeling for Dynamics Prediction and Reconstruction with Oblique Projections

    Authors: Yanfang Mo, Jiaxin Yu, S. Joe Qin

    Abstract: In this paper, we propose a probabilistic reduced-dimensional vector autoregressive (PredVAR) model with oblique projections. This model partitions the measurement space into a dynamic subspace and a static subspace that do not need to be orthogonal. The partition allows us to apply an oblique projection to extract dynamic latent variables (DLVs) from high-dimensional data with maximized predictab…

    Submitted 3 September, 2023; originally announced September 2023.

  12. arXiv:2308.12444  [pdf, other]

    stat.ME

    Leverage classifier: Another look at support vector machine

    Authors: Yixin Han, Jun Yu, Nan Zhang, Cheng Meng, Ping Ma, Wenxuan Zhong, Changliang Zou

    Abstract: Support vector machine (SVM) is a popular classifier known for accuracy, flexibility, and robustness. However, its intensive computation has hindered its application to large-scale datasets. In this paper, we propose a new optimal leverage classifier based on linear SVM under a nonseparable setting. Our classifier aims to select an informative subset of the training sample to reduce data size, ena…

    Submitted 23 August, 2023; originally announced August 2023.

    Comments: 50 pages, 9 figures, 3 tables

  13. CoxKnockoff: Controlled Feature Selection for the Cox Model Using Knockoffs

    Authors: Daoji Li, Jinzhao Yu, Hui Zhao

    Abstract: Although there is a huge literature on feature selection for the Cox model, none of the existing approaches can control the false discovery rate (FDR) unless the sample size tends to infinity. In addition, there is no formal power analysis of the knockoffs framework for survival data in the literature. To address those issues, in this paper, we propose a novel controlled feature selection approach…

    Submitted 1 August, 2023; originally announced August 2023.

    Comments: 22 pages including the Supporting Information

    Journal ref: Stat, 12(1), e607 (2023)

  14. arXiv:2307.02159  [pdf, other]

    stat.ML cs.CV cs.LG math.AP

    DiffFlow: A Unified SDE Framework for Score-Based Diffusion Models and Generative Adversarial Networks

    Authors: Jingwei Zhang, Han Shi, Jincheng Yu, Enze Xie, Zhenguo Li

    Abstract: Generative models can be categorized into two types: explicit generative models that define explicit density forms and allow exact likelihood inference, such as score-based diffusion models (SDMs) and normalizing flows; implicit generative models that directly learn a transformation from the prior to the data distribution, such as generative adversarial nets (GANs). While these two types of models…

    Submitted 5 July, 2023; originally announced July 2023.

    Comments: Tech Report

  15. arXiv:2306.12925  [pdf, other]

    cs.CL cs.AI cs.SD eess.AS stat.ML

    AudioPaLM: A Large Language Model That Can Speak and Listen

    Authors: Paul K. Rubenstein, Chulayuth Asawaroengchai, Duc Dung Nguyen, Ankur Bapna, Zalán Borsos, Félix de Chaumont Quitry, Peter Chen, Dalia El Badawy, Wei Han, Eugene Kharitonov, Hannah Muckenhirn, Dirk Padfield, James Qin, Danny Rozenberg, Tara Sainath, Johan Schalkwyk, Matt Sharifi, Michelle Tadmor Ramanovich, Marco Tagliasacchi, Alexandru Tudor, Mihajlo Velimirović, Damien Vincent, Jiahui Yu, Yongqiang Wang, Vicky Zayats, et al. (5 additional authors not shown)

    Abstract: We introduce AudioPaLM, a large language model for speech understanding and generation. AudioPaLM fuses text-based and speech-based language models, PaLM-2 [Anil et al., 2023] and AudioLM [Borsos et al., 2022], into a unified multimodal architecture that can process and generate text and speech with applications including speech recognition and speech-to-speech translation. AudioPaLM inherits the…

    Submitted 22 June, 2023; originally announced June 2023.

    Comments: Technical report

  16. arXiv:2306.06581  [pdf, other]

    stat.ML cs.DS cs.LG math.OC

    Importance Sparsification for Sinkhorn Algorithm

    Authors: Mengyu Li, Jun Yu, Tao Li, Cheng Meng

    Abstract: Sinkhorn algorithm has been used pervasively to approximate the solution to optimal transport (OT) and unbalanced optimal transport (UOT) problems. However, its practical application is limited due to the high computational complexity. To alleviate the computational burden, we propose a novel importance sparsification method, called Spar-Sink, to efficiently approximate entropy-regularized OT and…

    Submitted 11 June, 2023; originally announced June 2023.

    Comments: Accepted by Journal of Machine Learning Research

  17. arXiv:2306.04027  [pdf, other]

    stat.ML cs.AI cs.LG

    Intervention Generalization: A View from Factor Graph Models

    Authors: Gecia Bravo-Hermsdorff, David S. Watson, Jialin Yu, Jakob Zeitler, Ricardo Silva

    Abstract: One of the goals of causal inference is to generalize from past experiments and observational data to novel conditions. While it is in principle possible to eventually learn a mapping from a novel experimental condition to an outcome of interest, provided a sufficient variety of experiments is available in the training data, coping with a large combinatorial space of possible interventions is hard…

    Submitted 8 November, 2023; v1 submitted 6 June, 2023; originally announced June 2023.

    Comments: Camera ready version (NeurIPS 2023)

  18. arXiv:2305.15577  [pdf, other]

    stat.ML cs.LG

    Minimizing $f$-Divergences by Interpolating Velocity Fields

    Authors: Song Liu, Jiahao Yu, Jack Simons, Mingxuan Yi, Mark Beaumont

    Abstract: Many machine learning problems can be seen as approximating a \textit{target} distribution using a \textit{particle} distribution by minimizing their statistical discrepancy. Wasserstein Gradient Flow can move particles along a path that minimizes the $f$-divergence between the target and particle distributions. To move particles, we need to calculate the corresponding velocity fields derived from…

    Submitted 6 June, 2024; v1 submitted 24 May, 2023; originally announced May 2023.

    Comments: This manuscript is an extended version of the ICML2024 version. The code for reproducing our results can be found at https://github.com/anewgithubname/gradest2

  19. Approximate Gibbs Sampler for Efficient Inference of Hierarchical Bayesian Models for Grouped Count Data

    Authors: Jin-Zhu Yu, Hiba Baroud

    Abstract: Hierarchical Bayesian Poisson regression models (HBPRMs) provide a flexible modeling approach of the relationship between predictors and count response variables. The applications of HBPRMs to large-scale datasets require efficient inference algorithms due to the high computational cost of inferring many model parameters based on random sampling. Although Markov Chain Monte Carlo (MCMC) algorithms…

    Submitted 1 July, 2024; v1 submitted 28 November, 2022; originally announced November 2022.

  20. arXiv:2209.01018  [pdf, other]

    cs.LG math.PR stat.AP stat.ML

    Normalization effects on deep neural networks

    Authors: Jiahui Yu, Konstantinos Spiliopoulos

    Abstract: We study the effect of normalization on the layers of deep neural networks of feed-forward type. A given layer $i$ with $N_{i}$ hidden units is allowed to be normalized by $1/N_{i}^{γ_{i}}$ with $γ_{i}\in[1/2,1]$ and we study the effect of the choice of the $γ_{i}$ on the statistical behavior of the neural network's output (such as variance) as well as on the test accuracy on the MNIST data set. W…

    Submitted 2 September, 2022; originally announced September 2022.

    Comments: arXiv admin note: text overlap with arXiv:2011.10487

    MSC Class: 60F05; 68T01; 60G99

  21. arXiv:2208.04459  [pdf, other]

    eess.SY stat.AP

    Bullwhip Effect of Supply Networks: Joint Impact of Network Structure and Market Demand

    Authors: Jin-Zhu Yü, Chencheng Cai, Jianxi Gao

    Abstract: The progressive amplification of fluctuations in demand as the demand travels upstream the supply chains is known as the bullwhip effect. We first analytically characterize the bullwhip effect in general supply chain networks in two cases: (i) all suppliers have a unique layer position, where our method is founded on the control-theoretic approach, and (ii) not all suppliers have a unique layer po…

    Submitted 8 August, 2022; originally announced August 2022.

  22. arXiv:2207.03281  [pdf, ps, other]

    stat.ME

    Semiparametric Estimation of Average Treatment Effect with Sieve Method

    Authors: Jichang Yu, Haibo Zhou, Jianwen Cai

    Abstract: Correctly identifying treatment effects in observational studies is very difficult due to the fact that the outcome model or the treatment assignment model must be correctly specified. Taking advantages of semiparametric models in this article, we use single-index models to establish the outcome model and the treatment assignment model, which can allow the link function to be unbounded and have un…

    Submitted 7 July, 2022; originally announced July 2022.

  23. arXiv:2206.10240  [pdf, other]

    stat.CO stat.ME

    Core-Elements for Classical Linear Regression

    Authors: Mengyu Li, Jun Yu, Tao Li, Cheng Meng

    Abstract: The coresets approach, also called subsampling or subset selection, aims to select a subsample as a surrogate for the observed sample. Such an approach has been used pervasively in large-scale data analysis. Existing coresets methods construct the subsample using a subset of rows from the predictor matrix. Such methods can be significantly inefficient when the predictor matrix is sparse or numeric…

    Submitted 17 March, 2023; v1 submitted 21 June, 2022; originally announced June 2022.

  24. arXiv:2206.01182  [pdf, other]

    stat.ML math.ST

    An optimal transport approach for selecting a representative subsample with application in efficient kernel density estimation

    Authors: Jingyi Zhang, Cheng Meng, Jun Yu, Mengrui Zhang, Wenxuan Zhong, Ping Ma

    Abstract: Subsampling methods aim to select a subsample as a surrogate for the observed sample. Such methods have been used pervasively in large-scale data analytics, active learning, and privacy-preserving analysis in recent decades. Instead of model-based methods, in this paper, we study model-free subsampling methods, which aim to identify a subsample that is not confined by model assumptions. Existing m…

    Submitted 31 May, 2022; originally announced June 2022.

  25. arXiv:2205.15059  [pdf, other]

    cs.LG stat.ML

    Hilbert Curve Projection Distance for Distribution Comparison

    Authors: Tao Li, Cheng Meng, Hongteng Xu, Jun Yu

    Abstract: Distribution comparison plays a central role in many machine learning tasks like data classification and generative modeling. In this study, we propose a novel metric, called Hilbert curve projection (HCP) distance, to measure the distance between two probability distributions with low complexity. In particular, we first project two high-dimensional probability distributions using Hilbert curve to…

    Submitted 6 February, 2024; v1 submitted 30 May, 2022; originally announced May 2022.

    Comments: 33 pages, 11 figures

  26. arXiv:2205.13573  [pdf, other]

    cs.LG stat.ME stat.ML

    Efficient Approximation of Gromov-Wasserstein Distance Using Importance Sparsification

    Authors: Mengyu Li, Jun Yu, Hongteng Xu, Cheng Meng

    Abstract: As a valid metric of metric-measure spaces, Gromov-Wasserstein (GW) distance has shown the potential for matching problems of structured data like point clouds and graphs. However, its application in practice is limited due to the high computational complexity. To overcome this challenge, we propose a novel importance sparsification method, called \textsc{Spar-GW}, to approximate GW distance effic…

    Submitted 9 January, 2023; v1 submitted 26 May, 2022; originally announced May 2022.

  27. arXiv:2205.09860  [pdf, other]

    cs.LG math.AP stat.ML

    Mean-Field Analysis of Two-Layer Neural Networks: Global Optimality with Linear Convergence Rates

    Authors: Jingwei Zhang, Xunpeng Huang, Jincheng Yu

    Abstract: We consider optimizing two-layer neural networks in the mean-field regime where the learning dynamics of network weights can be approximated by the evolution in the space of probability measures over the weight parameters associated with the neurons. The mean-field regime is a theoretically attractive alternative to the NTK (lazy training) regime which is only restricted locally in the so-called n…

    Submitted 17 October, 2022; v1 submitted 19 May, 2022; originally announced May 2022.

    Comments: revision on presentation; add experiments; more discussions on previous independent works of chizat2022mean and nitanda2022convex

  28. Modeling Ride-Sourcing Matching and Pickup Processes based on Additive Gaussian Process Models

    Authors: Zheng Zhu, Meng Xu, Yining Di, Xiqun Chen, Jingru Yu

    Abstract: Matching and pickup processes are core features of ride-sourcing services. Previous studies have adopted abundant analytical models to depict the two processes and obtain operational insights; while the goodness of fit between models and data was dismissed. To simultaneously consider the fitness between models and data and analytically tractable formations, we propose a data-driven approach based…

    Submitted 29 April, 2022; originally announced April 2022.

    Comments: 30 pages, 8 figures, 4 tables. Submitted and under review in Transportmetrica B: Transport Dynamics

  29. arXiv:2203.06956  [pdf, other]

    stat.AP

    Statistical learning for train delays and influence of winter climate and atmospheric icing

    Authors: Jianfeng Wang, Roberto Mantas Nakhai, Jun Yu

    Abstract: This study investigated the climate effect under consecutive winters on the arrival delay of high-speed passenger trains in northern Sweden. Novel statistical learning approaches, including inhomogeneous Markov chain model and stratified Cox model, were adopted to account for the time-varying risks of train delays. The inhomogeneous Markov chain modelling for the arrival delays has used several co…

    Submitted 14 March, 2022; originally announced March 2022.

  30. arXiv:2201.12739  [pdf, other]

    cs.LG stat.ML

    Do We Need to Penalize Variance of Losses for Learning with Label Noise?

    Authors: Yexiong Lin, Yu Yao, Yuxuan Du, Jun Yu, Bo Han, Mingming Gong, Tongliang Liu

    Abstract: Algorithms which minimize the averaged loss have been widely designed for dealing with noisy labels. Intuitively, when there is a finite training sample, penalizing the variance of losses will improve the stability and generalization of the algorithms. Interestingly, we found that the variance should be increased for the problem of learning with noisy labels. Specifically, increasing the variance…

    Submitted 30 January, 2022; originally announced January 2022.

  31. arXiv:2110.14374  [pdf, other]

    physics.comp-ph cond-mat.dis-nn stat.ML

    A2I Transformer: Permutation-equivariant attention network for pairwise and many-body interactions with minimal featurization

    Authors: Ji Woong Yu, Min Young Ha, Bumjoon Seo, Won Bo Lee

    Abstract: The combination of neural network potential (NNP) with molecular simulations plays an important role in an efficient and thorough understanding of a molecular system's potential energy surface (PES). However, grasping the interplay between input features and their local contribution to NNP is growingly evasive due to heavy featurization. In this work, we suggest an end-to-end model which directly…

    Submitted 27 October, 2021; originally announced October 2021.

  32. arXiv:2110.08850  [pdf]

    physics.soc-ph cs.LG cs.SI q-bio.MN stat.ML

    Understanding the network formation pattern for better link prediction

    Authors: Jiating Yu, Ling-Yun Wu

    Abstract: As a classical problem in the field of complex networks, link prediction has attracted much attention from researchers, which is of great significance to help us understand the evolution and dynamic development mechanisms of networks. Although various network type-specific algorithms have been proposed to tackle the link prediction problem, most of them suppose that the network structure is domina…

    Submitted 17 October, 2021; originally announced October 2021.

    Comments: 21 pages, 3 figures, 18 tables, and 29 references

    Journal ref: Physica A: Statistical Mechanics and its Applications, 600 (2022) 127522

  33. arXiv:2109.11727  [pdf, other]

    stat.ME

    Smoothing splines approximation using Hilbert curve basis selection

    Authors: Cheng Meng, Jun Yu, Yongkai Chen, Wenxuan Zhong, Ping Ma

    Abstract: Smoothing splines have been used pervasively in nonparametric regressions. However, the computational burden of smoothing splines is significant when the sample size $n$ is large. When the number of predictors $d\geq2$, the computational cost for smoothing splines is at the order of $O(n^3)$ using the standard approach. Many methods have been developed to approximate smoothing spline estimators by…

    Submitted 11 October, 2022; v1 submitted 23 September, 2021; originally announced September 2021.

  34. arXiv:2108.01931  [pdf, other]

    stat.AP

    Train performance analysis using heterogeneous statistical models

    Authors: Jianfeng Wang, Jun Yu

    Abstract: This study investigated the effect of harsh winter climate on the performance of high speed passenger trains in northern Sweden. Novel approaches based on heterogeneous statistical models were introduced to analyse the train performance in order to take the time-varying risks of train delays into consideration. Specifically, stratified Cox model and heterogeneous Markov chain model were used for m…

    Submitted 4 August, 2021; originally announced August 2021.

    Comments: arXiv admin note: substantial text overlap with arXiv:2009.10426

  35. arXiv:2103.13565  [pdf, other]

    cs.LG cs.AI stat.ML

    Jointly Modeling Heterogeneous Student Behaviors and Interactions Among Multiple Prediction Tasks

    Authors: Haobing Liu, Yanmin Zhu, Tianzi Zang, Yanan Xu, Jiadi Yu, Feilong Tang

    Abstract: Prediction tasks about students have practical significance for both student and college. Making multiple predictions about students is an important part of a smart campus. For instance, predicting whether a student will fail to graduate can alert the student affairs office to take predictive measures to help the student improve his/her academic performance. With the development of information tec…

    Submitted 24 March, 2021; originally announced March 2021.

    Journal ref: ACM TKDD 2022

  36. arXiv:2103.05059  [pdf, other]

    stat.ME stat.ML

    Bias-Corrected Peaks-Over-Threshold Estimation of the CVaR

    Authors: Dylan Troop, Frédéric Godin, Jia Yuan Yu

    Abstract: The conditional value-at-risk (CVaR) is a useful risk measure in fields such as machine learning, finance, insurance, energy, etc. When measuring very extreme risk, the commonly used CVaR estimation method of sample averaging does not work well due to limited data above the value-at-risk (VaR), the quantile corresponding to the CVaR level. To mitigate this problem, the CVaR can be estimated by ext…

    Submitted 8 March, 2021; originally announced March 2021.

  37. arXiv:2102.05960  [pdf]

    stat.ML cs.AI cs.LG

    Comparative Analysis of Machine Learning Approaches to Analyze and Predict the Covid-19 Outbreak

    Authors: Muhammad Naeem, Jian Yu, Muhammad Aamir, Sajjad Ahmad Khan, Olayinka Adeleye, Zardad Khan

    Abstract: Background. Forecasting the time of forthcoming pandemic reduces the impact of diseases by taking precautionary steps such as public health messaging and raising the consciousness of doctors. With the continuous and rapid increase in the cumulative incidence of COVID-19, statistical and outbreak prediction models including various machine learning (ML) models are being used by the research communi…

    Submitted 11 February, 2021; originally announced February 2021.

    Comments: 22 pages, 10 figures

  38. arXiv:2011.10487  [pdf, other]

    stat.ML cs.LG math.PR

    Normalization effects on shallow neural networks and related asymptotic expansions

    Authors: Jiahui Yu, Konstantinos Spiliopoulos

    Abstract: We consider shallow (single hidden layer) neural networks and characterize their performance when trained with stochastic gradient descent as the number of hidden units $N$ and gradient descent steps grow to infinity. In particular, we investigate the effect of different scaling schemes, which lead to different normalizations of the neural network, on the network's statistical output, closing the…

    Submitted 1 June, 2022; v1 submitted 20 November, 2020; originally announced November 2020.

    Comments: Added link to code on GitHub: https://github.com/kspiliopoulos/NormalizationEffectsNeuralNetworks

    MSC Class: 60F05; 68T01; 60G99

    Journal ref: AIMS Journal on Foundations of Data Science, June 2021, Vol. 3, Issue 2, pp. 151-200

  39. arXiv:2010.09921  [pdf, other]

    cs.LG stat.ME stat.ML

    Sufficient dimension reduction for classification using principal optimal transport direction

    Authors: Cheng Meng, Jun Yu, Jingyi Zhang, Ping Ma, Wenxuan Zhong

    Abstract: Sufficient dimension reduction is used pervasively as a supervised dimension reduction approach. Most existing sufficient dimension reduction methods are developed for data with a continuous response and may have an unsatisfactory performance for the categorical response, especially for the binary-response. To address this issue, we propose a novel estimation method of sufficient dimension reducti…

    Submitted 1 February, 2021; v1 submitted 19 October, 2020; originally announced October 2020.

    Comments: 18 pages, 4 figures, to be published in 34th Conference on Neural Information Processing Systems (NeurIPS 2020), add the supplementary material

  40. arXiv:2010.05563  [pdf, other]

    cs.LG stat.ML

    Graph Information Bottleneck for Subgraph Recognition

    Authors: Junchi Yu, Tingyang Xu, Yu Rong, Yatao Bian, Junzhou Huang, Ran He

    Abstract: Given the input graph and its label/property, several key problems of graph learning, such as finding interpretable subgraphs, graph denoising and graph compression, can be attributed to the fundamental problem of recognizing a subgraph of the original one. This subgraph shall be as informative as possible, yet contains less redundant and noisy structure. This problem setting is closely related to…

    Submitted 12 October, 2020; originally announced October 2020.

  41. arXiv:2009.10426  [pdf, other]

    stat.AP

    Effects of winter climate on high speed passenger trains in Botnia-Atlantica region

    Authors: Jianfeng Wang, Markus Granlöf, Jun Yu

    Abstract: Harsh winter climate can cause various problems for both public and private sectors in Sweden, especially in the northern part for railway industry. To have a better understanding of winter climate impacts, this study investigates effects of the winter climate including atmospheric icing on the performance of high speed passenger trains in the Botnia-Atlantica region. The investigation is done wit…

    Submitted 24 September, 2020; v1 submitted 22 September, 2020; originally announced September 2020.

  42. arXiv:2009.06237  [pdf, other]

    cs.LG cs.AI cs.AR stat.ML

    DANCE: Differentiable Accelerator/Network Co-Exploration

    Authors: Kanghyun Choi, Deokki Hong, Hojae Yoon, Joonsang Yu, Youngsok Kim, Jinho Lee

    Abstract: To cope with the ever-increasing computational demand of the DNN execution, recent neural architecture search (NAS) algorithms consider hardware cost metrics into account, such as GPU latency. To further pursue a fast, efficient execution, DNN-specialized hardware accelerators are being designed for multiple purposes, which far-exceeds the efficiency of the GPUs. However, those hardware-related me…

    Submitted 15 February, 2021; v1 submitted 14 September, 2020; originally announced September 2020.

    Comments: Accepted to DAC 2021

  43. arXiv:2009.06182  [pdf, other]

    stat.ML cs.LG

    Density Estimation via Bayesian Inference Engines

    Authors: M. P. Wand, J. C. F. Yu

    Abstract: We explain how effective automatic probability density function estimates can be constructed using contemporary Bayesian inference engines such as those based on no-U-turn sampling and expectation propagation. Extensive simulation studies demonstrate that the proposed density estimates have excellent comparative performance and scale well to very large sample sizes due to a binning strategy. Moreo…

    Submitted 26 September, 2021; v1 submitted 14 September, 2020; originally announced September 2020.

  44. arXiv:2007.09370  [pdf, other]

    cs.CR cs.DC cs.LG stat.ML

    How to Democratise and Protect AI: Fair and Differentially Private Decentralised Deep Learning

    Authors: Lingjuan Lyu, Yitong Li, Karthik Nandakumar, Jiangshan Yu, Xingjun Ma

    Abstract: This paper firstly considers the research problem of fairness in collaborative deep learning, while ensuring privacy. A novel reputation system is proposed through digital tokens and local credibility to ensure fairness, in combination with differential privacy to guarantee privacy. In particular, we build a fair and differentially private decentralised deep learning framework called FDPDDL, which…

    Submitted 18 July, 2020; originally announced July 2020.

    Comments: Accepted for publication in TDSC

  45. arXiv:2007.00169  [pdf, other]

    cs.LG stat.ML

    Regularly Updated Deterministic Policy Gradient Algorithm

    Authors: Shuai Han, Wenbo Zhou, Shuai Lü, Jiayu Yu

    Abstract: Deep Deterministic Policy Gradient (DDPG) algorithm is one of the most well-known reinforcement learning methods. However, this method is inefficient and unstable in practical applications. On the other hand, the bias and variance of the Q estimation in the target function are sometimes difficult to control. This paper proposes a Regularly Updated Deterministic (RUD) policy gradient algorithm for…

    Submitted 30 June, 2020; originally announced July 2020.

  46. arXiv:2006.07663  [pdf, other]

    stat.ME

    Bayesian causal inference with some invalid instrumental variables

    Authors: Gyuhyeong Goh, Jisang Yu

    Abstract: In observational studies, instrumental variables estimation is greatly utilized to identify causal effects. One of the key conditions for the instrumental variables estimator to be consistent is the exclusion restriction, which indicates that instruments affect the outcome of interest only via the exposure variable of interest. We propose a likelihood-free Bayesian approach to make consistent infe…

    Submitted 13 June, 2020; originally announced June 2020.

  47. arXiv:2005.13719  [pdf, other]

    stat.ME

    Synthetic control method with convex hull restrictions: A Bayesian maximum a posteriori approach

    Authors: Gyuhyeong Goh, Jisang Yu

    Abstract: Synthetic control methods have gained popularity among causal studies with observational data, particularly when estimating the impacts of the interventions that are implemented to a small number of large units. Implementing the synthetic control methods faces two major challenges: a) estimating weights for each control unit to create a synthetic control and b) providing statistical inferences. To…

    Submitted 27 May, 2020; originally announced May 2020.

  48. arXiv:2005.10435  [pdf, other]

    stat.ME cs.DC stat.CO stat.ML

    Optimal Distributed Subsampling for Maximum Quasi-Likelihood Estimators with Massive Data

    Authors: Jun Yu, HaiYing Wang, Mingyao Ai, Huiming Zhang

    Abstract: Nonuniform subsampling methods are effective to reduce computational burden and maintain estimation efficiency for massive data. Existing methods mostly focus on subsampling with replacement due to its high computational efficiency. If the data volume is so large that nonuniform subsampling probabilities cannot be calculated all at once, then subsampling with replacement is infeasible to implement…

    Submitted 5 July, 2021; v1 submitted 20 May, 2020; originally announced May 2020.

  49. arXiv:2004.13824  [pdf, other]

    cs.CV cs.LG eess.IV stat.ML

    Pyramid Attention Networks for Image Restoration

    Authors: Yiqun Mei, Yuchen Fan, Yulun Zhang, Jiahui Yu, Yuqian Zhou, Ding Liu, Yun Fu, Thomas S. Huang, Humphrey Shi

    Abstract: Self-similarity refers to the image prior widely used in image restoration algorithms that small but similar patterns tend to occur at different locations and scales. However, recent advanced deep convolutional neural network based methods for image restoration do not take full advantage of self-similarities by relying on self-attention neural modules that only process information at the same scal…

    Submitted 3 June, 2020; v1 submitted 28 April, 2020; originally announced April 2020.

  50. arXiv:2004.12570  [pdf, other]

    cs.LG cs.RO stat.ML

    The Ingredients of Real-World Robotic Reinforcement Learning

    Authors: Henry Zhu, Justin Yu, Abhishek Gupta, Dhruv Shah, Kristian Hartikainen, Avi Singh, Vikash Kumar, Sergey Levine

    Abstract: The success of reinforcement learning for real world robotics has been, in many cases limited to instrumented laboratory scenarios, often requiring arduous human effort and oversight to enable continuous learning. In this work, we discuss the elements that are needed for a robotic learning system that can continually and autonomously improve with data collected in the real world. We propose a part…

    Submitted 26 April, 2020; originally announced April 2020.

    Comments: First three authors contributed equally. Accepted as a spotlight presentation at ICLR 2020