Showing 1–48 of 48 results for author: Yu, D

  1. arXiv:2407.00307  [pdf, other]

    math.OC stat.CO

    Deterministic and Stochastic Frank-Wolfe Recursion on Probability Spaces

    Authors: Di Yu, Shane G. Henderson, Raghu Pasupathy

    Abstract: Motivated by applications in emergency response and experimental design, we consider smooth stochastic optimization problems over probability measures supported on compact subsets of the Euclidean space. With the influence function as the variational object, we construct a deterministic Frank-Wolfe (dFW) recursion for probability spaces, made especially possible by a lemma that identifies a ``clos…

    Submitted 29 June, 2024; originally announced July 2024.

  2. arXiv:2403.03562  [pdf, other]

    cs.LG stat.ML

    Efficient Algorithms for Empirical Group Distributional Robust Optimization and Beyond

    Authors: Dingzhi Yu, Yunuo Cai, Wei Jiang, Lijun Zhang

    Abstract: We investigate the empirical counterpart of group distributionally robust optimization (GDRO), which aims to minimize the maximal empirical risk across $m$ distinct groups. We formulate empirical GDRO as a $\textit{two-level}$ finite-sum convex-concave minimax optimization problem and develop a stochastic variance reduced mirror prox algorithm. Unlike existing methods, we construct the stochastic…

    Submitted 6 March, 2024; originally announced March 2024.

    Comments: 30 pages, 1 figure

  3. arXiv:2401.17518  [pdf, other]

    stat.ME math.ST

    Model Uncertainty and Selection of Risk Models for Left-Truncated and Right-Censored Loss Data

    Authors: Qian Zhao, Sahadeb Upretee, Daoping Yu

    Abstract: Insurance loss data are usually in the form of left-truncation and right-censoring due to deductibles and policy limits respectively. This paper investigates the model uncertainty and selection procedure when various parametric models are constructed to accommodate such left-truncated and right-censored data. The joint asymptotic properties of the estimators have been established using the Delta m…

    Submitted 30 January, 2024; originally announced January 2024.

    Journal ref: Risks, 2023, 11(11),188

  4. arXiv:2310.08843  [pdf]

    stat.AP

    A Longitudinal Analysis about the Effect of Air Pollution on Astigmatism for Children and Young Adults

    Authors: Lin An, Qiuyue Hu, Jieying Guan, Yingting Zhu, Chenyao Jiang, Xiaoyun Zhong, Shuyue Ma, Dongmei Yu, Canyang Zhang, Yehong Zhuo, Peiwu Qin

    Abstract: Purpose: This study aimed to investigate the correlation between air pollution and astigmatism, considering the detrimental effects of air pollution on respiratory, cardiovascular, and eye health. Methods: A longitudinal study was conducted with 127,709 individuals aged 4-27 years from 9 cities in Guangdong Province, China, spanning from 2019 to 2021. Astigmatism was measured using cylinder values…

    Submitted 13 October, 2023; originally announced October 2023.

  5. arXiv:2305.10042  [pdf, other]

    stat.ML cs.LG

    Optimal Weighted Random Forests

    Authors: Xinyu Chen, Dalei Yu, Xinyu Zhang

    Abstract: The random forest (RF) algorithm has become a very popular prediction method for its great flexibility and promising accuracy. In RF, it is conventional to put equal weights on all the base learners (trees) to aggregate their predictions. However, the predictive performances of different trees within the forest can be very different due to the randomization of the embedded bootstrap sampling and f…

    Submitted 17 May, 2023; originally announced May 2023.

    Comments: 29 pages, 4 figures

  6. arXiv:2304.02723  [pdf, ps, other]

    stat.AP econ.GN stat.ME

    Measuring Discrete Risks on Infinite Domains: Theoretical Foundations, Conditional Five Number Summaries, and Data Analyses

    Authors: Daoping Yu, Vytaras Brazauskas, Ricardas Zitikis

    Abstract: To accommodate numerous practical scenarios, in this paper we extend statistical inference for smoothed quantile estimators from finite domains to infinite domains. We accomplish the task with the help of a newly designed truncation methodology for discrete loss distributions with infinite domains. A simulation study illustrates the methodology in the case of several distributions, such as Poisson…

    Submitted 5 April, 2023; originally announced April 2023.

    Comments: 22 pages, 1 figure, 7 tables

  7. arXiv:2212.01539  [pdf, other]

    cs.LG stat.ML

    Exploring the Limits of Differentially Private Deep Learning with Group-wise Clipping

    Authors: Jiyan He, Xuechen Li, Da Yu, Huishuai Zhang, Janardhan Kulkarni, Yin Tat Lee, Arturs Backurs, Nenghai Yu, Jiang Bian

    Abstract: Differentially private deep learning has recently witnessed advances in computational efficiency and privacy-utility trade-off. We explore whether further improvements along the two axes are possible and provide affirmative answers leveraging two instantiations of \emph{group-wise clipping}. To reduce the compute time overhead of private learning, we show that \emph{per-layer clipping}, where the…

    Submitted 3 December, 2022; originally announced December 2022.

    Comments: 25 pages

  8. arXiv:2211.12213  [pdf, other]

    stat.AP

    eDNAPlus: A unifying modelling framework for DNA-based biodiversity monitoring

    Authors: Alex Diana, Eleni Matechou, Jim Griffin, Douglas Yu, Mingjie Luo, Marie Tosa, Alex Bush, Richard Griffiths

    Abstract: DNA-based biodiversity surveys involve collecting physical samples from survey sites and assaying the contents in the laboratory to detect species via their diagnostic DNA sequences. DNA-based surveys are increasingly being adopted for biodiversity monitoring. The most commonly employed method is metabarcoding, which combines PCR with high-throughput DNA sequencing to amplify and then read `DNA ba…

    Submitted 22 November, 2022; originally announced November 2022.

    Comments: The paper is 35 pages long and it has 8 figures

  9. arXiv:2211.02912  [pdf, other]

    stat.ML cs.LG

    New Definitions and Evaluations for Saliency Methods: Staying Intrinsic, Complete and Sound

    Authors: Arushi Gupta, Nikunj Saunshi, Dingli Yu, Kaifeng Lyu, Sanjeev Arora

    Abstract: Saliency methods compute heat maps that highlight portions of an input that were most {\em important} for the label assigned to it by a deep net. Evaluations of saliency methods convert this heat map into a new {\em masked input} by retaining the $k$ highest-ranked pixels of the original input and replacing the rest with \textquotedblleft uninformative\textquotedblright\ pixels, and checking if th…

    Submitted 5 November, 2022; originally announced November 2022.

    Comments: NeurIPS 2022 (Oral)

  10. arXiv:2210.02015  [pdf, other]

    stat.ML cs.CY cs.LG

    Conformalized Fairness via Quantile Regression

    Authors: Meichen Liu, Lei Ding, Dengdeng Yu, Wulong Liu, Linglong Kong, Bei Jiang

    Abstract: Algorithmic fairness has received increased attention in socially sensitive domains. While rich literature on mean fairness has been established, research on quantile fairness remains sparse but vital. To fulfill great needs and advocate the significance of quantile fairness, we propose a novel framework to learn a real-valued quantile function under the fairness requirement of Demographic Parity…

    Submitted 14 October, 2022; v1 submitted 5 October, 2022; originally announced October 2022.

    Comments: 18 pages, 5 figures, 2 tables

  11. arXiv:2207.05180  [pdf, other]

    stat.ME stat.CO

    Testing Independence of Bivariate Censored Data using Random Walk on Restricted Permutation Graph

    Authors: Seonghun Cho, Donghyeon Yu, Johan Lim

    Abstract: In this paper, we propose a procedure to test the independence of bivariate censored data, which is generic and applicable to any censoring types in the literature. To test the hypothesis, we consider a rank-based statistic, Kendall's tau statistic. The censored data defines a restricted permutation space of all possible ranks of the observations. We propose the statistic, the average of Kendall's…

    Submitted 11 July, 2022; originally announced July 2022.

  12. arXiv:2206.04316  [pdf, other]

    cs.LG cs.AI stat.ML

    Adversarial Noises Are Linearly Separable for (Nearly) Random Neural Networks

    Authors: Huishuai Zhang, Da Yu, Yiping Lu, Di He

    Abstract: Adversarial examples, which are usually generated for specific inputs with a specific model, are ubiquitous for neural networks. In this paper we unveil a surprising property of adversarial noises when they are put together, i.e., adversarial noises crafted by one-step gradient methods are linearly separable if equipped with the corresponding labels. We theoretically prove this property for a two-…

    Submitted 9 June, 2022; originally announced June 2022.

    Comments: 13 pages

  13. arXiv:2206.02617  [pdf, other]

    cs.LG cs.CR cs.DS stat.ML

    Individual Privacy Accounting for Differentially Private Stochastic Gradient Descent

    Authors: Da Yu, Gautam Kamath, Janardhan Kulkarni, Tie-Yan Liu, Jian Yin, Huishuai Zhang

    Abstract: Differentially private stochastic gradient descent (DP-SGD) is the workhorse algorithm for recent advances in private deep learning. It provides a single privacy guarantee to all datapoints in the dataset. We propose output-specific $(\varepsilon,\delta)$-DP to characterize privacy guarantees for individual examples when releasing models trained by DP-SGD. We also design an efficient algorithm to inves…

    Submitted 2 September, 2023; v1 submitted 6 June, 2022; originally announced June 2022.

    Comments: Published in Transactions on Machine Learning Research (TMLR)

  14. arXiv:2203.15031  [pdf, ps, other]

    stat.CO

    An efficient GPU-Parallel Coordinate Descent Algorithm for Sparse Precision Matrix Estimation via Scaled Lasso

    Authors: Seunghwan Lee, Sang Cheol Kim, Donghyeon Yu

    Abstract: The sparse precision matrix plays an essential role in the Gaussian graphical model since a zero off-diagonal element indicates conditional independence of the corresponding two variables given others. In the Gaussian graphical model, many methods have been proposed, and their theoretical properties are given as well. Among these, the sparse precision matrix estimation via scaled lasso (SPMESL) ha…

    Submitted 28 March, 2022; originally announced March 2022.

    Comments: To appear in Computational Statistics

  15. arXiv:2110.06500  [pdf, other]

    cs.LG cs.CL cs.CR stat.ML

    Differentially Private Fine-tuning of Language Models

    Authors: Da Yu, Saurabh Naik, Arturs Backurs, Sivakanth Gopi, Huseyin A. Inan, Gautam Kamath, Janardhan Kulkarni, Yin Tat Lee, Andre Manoel, Lukas Wutschitz, Sergey Yekhanin, Huishuai Zhang

    Abstract: We give simpler, sparser, and faster algorithms for differentially private fine-tuning of large-scale pre-trained language models, which achieve the state-of-the-art privacy versus utility tradeoffs on many standard NLP tasks. We propose a meta-framework for this problem, inspired by the recent success of highly parameter-efficient methods for fine-tuning. Our experiments show that differentially…

    Submitted 14 July, 2022; v1 submitted 13 October, 2021; originally announced October 2021.

    Comments: ICLR 2022. Code available at https://github.com/huseyinatahaninan/Differentially-Private-Fine-tuning-of-Language-Models

  16. arXiv:2106.09382  [pdf, ps, other]

    stat.CO

    An efficient parallel block coordinate descent algorithm for large-scale precision matrix estimation using graphics processing units

    Authors: Young-Geun Choi, Seunghwan Lee, Donghyeon Yu

    Abstract: Large-scale sparse precision matrix estimation has attracted wide interest from the statistics community. The convex partial correlation selection method (CONCORD) developed by Khare et al. (2015) has recently been credited with some theoretical properties for estimating sparse precision matrices. The CONCORD obtains its solution by a coordinate descent algorithm (CONCORD-CD) based on the convexit…

    Submitted 17 June, 2021; originally announced June 2021.

  17. arXiv:2103.11338  [pdf]

    cs.AI cs.DB stat.ML

    Mining GIS Data to Predict Urban Sprawl

    Authors: Anita Pampoore-Thampi, Aparna S. Varde, Danlin Yu

    Abstract: This paper addresses the interesting problem of processing and analyzing data in geographic information systems (GIS) to achieve a clear perspective on urban sprawl. The term urban sprawl refers to overgrowth and expansion of low-density areas with issues such as car dependency and segregation between residential versus commercial use. Sprawl has impacts on the environment and public health. In ou…

    Submitted 21 March, 2021; originally announced March 2021.

    Comments: 8 Pages, 13 figures, KDD 2014 conference Bloomberg track

    ACM Class: H.2.8; I.2.1

  18. arXiv:2010.03700  [pdf, other]

    stat.ME stat.AP

    Multivariate functional responses low rank regression with an application to brain imaging data

    Authors: Xiucai Ding, Dengdeng Yu, Zhengwu Zhang, Dehan Kong

    Abstract: We propose a multivariate functional responses low rank regression model with possible high dimensional functional responses and scalar covariates. By expanding the slope functions on a set of sieve basis, we reconstruct the basis coefficients as a matrix. To estimate these coefficients, we propose an efficient procedure using nuclear norm regularization. We also derive error bounds for our estima…

    Submitted 7 October, 2020; originally announced October 2020.

    Comments: Canadian Journal of Statistics (accepted)

  19. arXiv:2008.07438  [pdf, ps, other]

    cs.IT cs.NI eess.SY math.PR stat.AP

    Analysis and Optimization for Large-Scale LoRa Networks: Throughput Fairness and Scalability

    Authors: Jiangbin Lyu, Dan Yu, Liqun Fu

    Abstract: LoRa networks are pivotally enabling Long Range connectivity to low-cost and power-constrained user equipments (UEs) in a wide area, whereas a critical issue is to effectively allocate wireless resources to support potentially massive UEs while resolving the prominent near-far fairness issue, which is challenging due to the lack of tractable analytical model and the practical requirement for low-c…

    Submitted 5 November, 2021; v1 submitted 17 August, 2020; originally announced August 2020.

    Comments: To appear in IEEE IOT Journal. Stochastic geometry-based framework to model/analyze large-scale LoRa networks with channel fading/aggregate interference/packet overlapping/multi-GW reception. Jointly optimize SF/Tx-power/duty-cycle based on channel statistics and UE distribution. Achieve both fairness/power savings and improve cell-edge throughput and spatial (sum) throughput for majority of UEs. arXiv admin note: text overlap with arXiv:1904.12300

  20. arXiv:2007.10567  [pdf, other]

    cs.LG stat.ML

    How Does Data Augmentation Affect Privacy in Machine Learning?

    Authors: Da Yu, Huishuai Zhang, Wei Chen, Jian Yin, Tie-Yan Liu

    Abstract: It is observed in the literature that data augmentation can significantly mitigate membership inference (MI) attack. However, in this work, we challenge this observation by proposing new MI attacks to utilize the information of augmented data. MI attack is widely used to measure the model's information leakage of the training set. We establish the optimal membership inference when the model is tra…

    Submitted 26 February, 2021; v1 submitted 20 July, 2020; originally announced July 2020.

    Comments: AAAI Conference on Artificial Intelligence (AAAI-21). Source code available at: https://github.com/dayu11/MI_with_DA

  21. arXiv:2007.04558  [pdf, other]

    stat.AP stat.ME

    Mapping the Genetic-Imaging-Clinical Pathway with Applications to Alzheimer's Disease

    Authors: Dengdeng Yu, Linbo Wang, Dehan Kong, Hongtu Zhu

    Abstract: Alzheimer's disease is a progressive form of dementia that results in problems with memory, thinking, and behavior. It often starts with abnormal aggregation and deposition of beta amyloid and tau, followed by neuronal damage such as atrophy of the hippocampi, leading to Alzheimer's Disease (AD). The aim of this paper is to map the genetic-imaging-clinical pathway for AD in order to delineate the…

    Submitted 2 June, 2022; v1 submitted 9 July, 2020; originally announced July 2020.

  22. arXiv:2006.07331  [pdf, other]

    cs.LG stat.ML

    Knowledge Embedding Based Graph Convolutional Network

    Authors: Donghan Yu, Yiming Yang, Ruohong Zhang, Yuexin Wu

    Abstract: Recently, a considerable literature has grown up around the theme of Graph Convolutional Network (GCN). How to effectively leverage the rich structural information in complex graphs, such as knowledge graphs with heterogeneous types of entities and relations, is a primary open challenge in the field. Most GCN methods are either restricted to graphs with a homogeneous type of edges (e.g., citation…

    Submitted 23 April, 2021; v1 submitted 12 June, 2020; originally announced June 2020.

    Comments: WWW 2021

  23. arXiv:2004.11934  [pdf, other]

    cs.LG stat.ML

    Correlation-aware Unsupervised Change-point Detection via Graph Neural Networks

    Authors: Ruohong Zhang, Yu Hao, Donghan Yu, Wei-Cheng Chang, Guokun Lai, Yiming Yang

    Abstract: Change-point detection (CPD) aims to detect abrupt changes over time series data. Intuitively, effective CPD over multivariate time series should require explicit modeling of the dependencies across input variables. However, existing CPD methods either ignore the dependency structures entirely or rely on the (unrealistic) assumption that the correlation structures are static over time. In this pap…

    Submitted 13 September, 2020; v1 submitted 24 April, 2020; originally announced April 2020.

    Comments: Accepted for publication in the International Conference on Neural Information Processing (ICONIP) 2020. Original paper is 12 pages; additional appendix is available on arXiv

    MSC Class: I.2.6

    Journal ref: ICONIP 2020: Neural Information Processing

  24. arXiv:1912.07814  [pdf, other]

    cs.LG eess.AS stat.ML

    A Unified Framework for Speech Separation

    Authors: Fahimeh Bahmaninezhad, Shi-Xiong Zhang, Yong Xu, Meng Yu, John H. L. Hansen, Dong Yu

    Abstract: Speech separation refers to extracting each individual speech source in a given mixed signal. Recent advancements in speech separation and ongoing research in this area, have made these approaches as promising techniques for pre-processing of naturalistic audio streams. After incorporating deep learning techniques into speech separation, performance on these systems is improving faster. The initia…

    Submitted 16 December, 2019; originally announced December 2019.

  25. arXiv:1911.11363  [pdf, other]

    cs.LG stat.ML

    Gradient Perturbation is Underrated for Differentially Private Convex Optimization

    Authors: Da Yu, Huishuai Zhang, Wei Chen, Tie-Yan Liu, Jian Yin

    Abstract: Gradient perturbation, widely used for differentially private optimization, injects noise at every iterative update to guarantee differential privacy. Previous work first determines the noise level that can satisfy the privacy requirement and then analyzes the utility of noisy gradient updates as in the non-private case. In contrast, we explore how privacy noise affects optimization property. We s…

    Submitted 26 October, 2020; v1 submitted 26 November, 2019; originally announced November 2019.

    Comments: International Joint Conference on Artificial Intelligence (IJCAI) 2020; 7 pages, 2 figures, and 4 tables

  26. arXiv:1911.07123  [pdf, other]

    cs.LG stat.ML

    Graph-Revised Convolutional Network

    Authors: Donghan Yu, Ruohong Zhang, Zhengbao Jiang, Yuexin Wu, Yiming Yang

    Abstract: Graph Convolutional Networks (GCNs) have received increasing attention in the machine learning community for effectively leveraging both the content features of nodes and the linkage patterns across graphs in various applications. As real-world graphs are often incomplete and noisy, treating them as ground-truth information, which is a common practice in most GCNs, unavoidably leads to sub-optimal…

    Submitted 30 December, 2020; v1 submitted 16 November, 2019; originally announced November 2019.

    Comments: ECML-PKDD 2020

  27. arXiv:1911.00809  [pdf, other]

    cs.LG cs.CV cs.NE stat.ML

    Enhanced Convolutional Neural Tangent Kernels

    Authors: Zhiyuan Li, Ruosong Wang, Dingli Yu, Simon S. Du, Wei Hu, Ruslan Salakhutdinov, Sanjeev Arora

    Abstract: Recent research shows that for training with $\ell_2$ loss, convolutional neural networks (CNNs) whose width (number of channels in convolutional layers) goes to infinity correspond to regression with respect to the CNN Gaussian Process kernel (CNN-GP) if only the last layer is trained, and correspond to regression with respect to the Convolutional Neural Tangent Kernel (CNTK) if all layers are tr…

    Submitted 2 November, 2019; originally announced November 2019.

  28. arXiv:1910.02866  [pdf, ps, other]

    math.ST stat.ME

    Nonparametric principal subspace regression

    Authors: Mark Koudstaal, Dengdeng Yu, Dehan Kong, Fang Yao

    Abstract: In scientific applications, multivariate observations often come in tandem with temporal or spatial covariates, with which the underlying signals vary smoothly. The standard approaches such as principal component analysis and factor analysis neglect the smoothness of the data, while multivariate linear or nonparametric regression fail to leverage the correlation information among multivariate resp…

    Submitted 12 October, 2019; v1 submitted 7 October, 2019; originally announced October 2019.

  29. arXiv:1910.01793  [pdf, other]

    stat.ME

    Automating Data Monitoring: Detecting Structural Breaks in Time Series Data Using Bayesian Minimum Description Length

    Authors: Yingbo Li, Robert Cezeaux, Di Yu

    Abstract: In modern business modeling and analytics, data monitoring plays a critical role. Nowadays, sophisticated models often rely on hundreds or even thousands of input variables. Over time, structural changes such as abrupt level shifts or trend slope changes may occur among some of these variables, likely due to changes in economy or government policies. As a part of data monitoring, it is important t…

    Submitted 4 October, 2019; originally announced October 2019.

  30. arXiv:1910.01663  [pdf, ps, other]

    cs.LG stat.ML

    Harnessing the Power of Infinitely Wide Deep Nets on Small-data Tasks

    Authors: Sanjeev Arora, Simon S. Du, Zhiyuan Li, Ruslan Salakhutdinov, Ruosong Wang, Dingli Yu

    Abstract: Recent research shows that the following two models are equivalent: (a) infinitely wide neural networks (NNs) trained under l2 loss by gradient descent with infinitesimally small learning rate (b) kernel regression with respect to so-called Neural Tangent Kernels (NTKs) (Jacot et al., 2018). An efficient algorithm to compute the NTK, as well as its convolutional counterparts, appears in Arora et a…

    Submitted 27 October, 2019; v1 submitted 3 October, 2019; originally announced October 2019.

    Comments: Code for UCI experiments: https://github.com/LeoYu/neural-tangent-kernel-UCI

  31. arXiv:1905.11368  [pdf, other]

    cs.LG cs.NE stat.ML

    Simple and Effective Regularization Methods for Training on Noisily Labeled Data with Generalization Guarantee

    Authors: Wei Hu, Zhiyuan Li, Dingli Yu

    Abstract: Over-parameterized deep neural networks trained by simple first-order methods are known to be able to fit any labeling of data. Such over-fitting ability hinders generalization when mislabeled training examples are present. On the other hand, simple regularization methods like early-stopping can often achieve highly nontrivial performance on clean test data in these scenarios, a phenomenon not the…

    Submitted 2 October, 2020; v1 submitted 27 May, 2019; originally announced May 2019.

    Comments: International Conference on Learning Representations (ICLR) 2020

  32. arXiv:1905.08900  [pdf, other]

    cs.LG cs.IR stat.ML

    Enhancing Domain Word Embedding via Latent Semantic Imputation

    Authors: Shibo Yao, Dantong Yu, Keli Xiao

    Abstract: We present a novel method named Latent Semantic Imputation (LSI) to transfer external knowledge into semantic space for enhancing word embedding. The method integrates graph theory to extract the latent manifold structure of the entities in the affinity space and leverages non-negative least squares with standard simplex constraints and power iteration method to derive spectral embeddings. It prov…

    Submitted 21 May, 2019; originally announced May 2019.

    Comments: ACM SIGKDD 2019

  33. arXiv:1905.05605  [pdf, other]

    cs.CR cs.CL cs.SD eess.AS stat.ML

    Encrypted Speech Recognition using Deep Polynomial Networks

    Authors: Shi-Xiong Zhang, Yifan Gong, Dong Yu

    Abstract: The cloud-based speech recognition/API provides developers or enterprises an easy way to create speech-enabled features in their applications. However, sending audios about personal or company internal information to the cloud, raises concerns about the privacy and security issues. The recognition results generated in cloud may also reveal some sensitive information. This paper proposes a deep pol…

    Submitted 10 May, 2019; originally announced May 2019.

    Comments: ICASSP 2019, slides@ https://www.researchgate.net/publication/333005422_Encrypted_Speech_Recognition_using_deep_polynomial_networks

  34. arXiv:1904.08361  [pdf, other]

    cs.LG cs.RO eess.SY stat.ML

    Decoupled Data Based Approach for Learning to Control Nonlinear Dynamical Systems

    Authors: Ran Wang, Karthikeya Parunandi, Dan Yu, Dileep Kalathil, Suman Chakravorty

    Abstract: This paper addresses the problem of learning the optimal control policy for a nonlinear stochastic dynamical system with continuous state space, continuous action space and unknown dynamics. This class of problems are typically addressed in stochastic adaptive control and reinforcement learning literature using model-based and model-free approaches respectively. Both methods rely on solving a dyna…

    Submitted 17 April, 2019; originally announced April 2019.

  35. arXiv:1903.07120  [pdf, other]

    cs.LG stat.ML

    Stabilize Deep ResNet with A Sharp Scaling Factor $\tau$

    Authors: Huishuai Zhang, Da Yu, Mingyang Yi, Wei Chen, Tie-Yan Liu

    Abstract: We study the stability and convergence of training deep ResNets with gradient descent. Specifically, we show that the parametric branch in the residual block should be scaled down by a factor $\tau=O(1/\sqrt{L})$ to guarantee stable forward/backward process, where $L$ is the number of residual blocks. Moreover, we establish a converse result that the forward process is unbounded when…

    Submitted 30 January, 2023; v1 submitted 17 March, 2019; originally announced March 2019.

    Comments: Journal version (Published in Machine Learning Journal), 26 pages

    Journal ref: Machine Learning, 111(9), 3359-3392 (2022)

  36. arXiv:1902.02397  [pdf]

    stat.AP

    Winning Is Not Everything: A contextual analysis of hockey face-offs

    Authors: Nick Czuzoj-Shulman, David Yu, Christopher Boucher, Luke Bornn, Mehrsan Javan

    Abstract: This paper takes a different approach to evaluating face-offs in ice hockey. Instead of looking at win percentages, the de facto measure of successful face-off takers for decades, focuses on the game events following the face-off and how directionality, clean wins, and player handedness play a significant role in creating value. This will demonstrate how not all face-off wins are made equal: some…

    Submitted 6 February, 2019; originally announced February 2019.

    Comments: Accepted paper for the 2019 Sloan Sports Analytics Conference

  37. arXiv:1902.02020  [pdf]

    stat.AP

    Playing Fast Not Loose: Evaluating team-level pace of play in ice hockey using spatio-temporal possession data

    Authors: David Yu, Christopher Boucher, Luke Bornn, Mehrsan Javan

    Abstract: Pace of play is an important characteristic in hockey as well as other team sports. We provide the first comprehensive study of pace within the sport of hockey, focusing on how teams and players impact pace in different regions of the ice, and the resultant effect on other aspects of the game. First we examined how pace of play varies across the surface of the rink, across different periods, at…

    Submitted 5 February, 2019; originally announced February 2019.

    Comments: Accepted Paper for the 2019 Sloan Sports Analytics Conference

  38. arXiv:1812.09323  [pdf, other]

    eess.AS cs.CL cs.LG cs.SD stat.ML

    Unsupervised Speech Recognition via Segmental Empirical Output Distribution Matching

    Authors: Chih-Kuan Yeh, Jianshu Chen, Chengzhu Yu, Dong Yu

    Abstract: We consider the problem of training speech recognition systems without using any labeled data, under the assumption that the learner can only access to the input utterances and a phoneme language model estimated from a non-overlapping corpus. We propose a fully unsupervised learning algorithm that alternates between solving two sub-problems: (i) learn a phoneme classifier for a given set of phonem…

    Submitted 22 December, 2018; originally announced December 2018.

    Comments: Published as a conference paper at ICLR 2019

  39. arXiv:1811.03700  [pdf, ps, other]

    cs.LG cs.AI cs.CL eess.AS stat.ML

    A Comparison of Lattice-free Discriminative Training Criteria for Purely Sequence-Trained Neural Network Acoustic Models

    Authors: Chao Weng, Dong Yu

    Abstract: In this work, three lattice-free (LF) discriminative training criteria for purely sequence-trained neural network acoustic models are compared on LVCSR tasks, namely maximum mutual information (MMI), boosted maximum mutual information (bMMI) and state-level minimum Bayes risk (sMBR). We demonstrate that, analogous to LF-MMI, a neural network acoustic model can also be trained from scratch using LF…

    Submitted 17 November, 2018; v1 submitted 8 November, 2018; originally announced November 2018.

    Comments: under review ICASSP2019

  40. arXiv:1810.11701  [pdf, other]

    stat.ML cs.LG cs.NE

    Hull Form Optimization with Principal Component Analysis and Deep Neural Network

    Authors: Dongchi Yu, Lu Wang

    Abstract: Designing and modifying complex hull forms for optimal vessel performances have been a major challenge for naval architects. In the present study, Principal Component Analysis (PCA) is introduced to compress the geometric representation of a group of existing vessels, and the resulting principal scores are manipulated to generate a large number of derived hull forms, which are evaluated computatio…

    Submitted 27 October, 2018; originally announced October 2018.

    Comments: 20 pages

  41. arXiv:1709.02069  [pdf, other]

    math.ST stat.CO stat.ME

    An Alternative Approach to Functional Linear Partial Quantile Regression

    Authors: Dengdeng Yu, Matthew Pietrosanu, Ivan Mizera, Bei Jiang, Linglong Kong, Wei Tu

    Abstract: Functional data such as curves and surfaces have become more and more common with modern technological advancements. The use of functional predictors remains challenging due to its inherent infinite-dimensionality. The common practice is to project functional data into a finite dimensional space. The popular partial least square (PLS) method has been well studied for the functional linear model [1…

    Submitted 30 January, 2023; v1 submitted 7 September, 2017; originally announced September 2017.

  42. arXiv:1707.01647  [pdf, ps, other]

    stat.ML cs.AI cs.LG math.OC

    Convergence Analysis of Optimization Algorithms

    Authors: HyoungSeok Kim, JiHoon Kang, WooMyoung Park, SukHyun Ko, YoonHo Cho, DaeSung Yu, YoungSook Song, JungWon Choi

    Abstract: The regret bound of an optimization algorithms is one of the basic criteria for evaluating the performance of the given algorithm. By inspecting the differences between the regret bounds of traditional algorithms and adaptive one, we provide a guide for choosing an optimizer with respect to the given data set and the loss function. For analysis, we assume that the loss function is convex and its g…

    Submitted 6 July, 2017; originally announced July 2017.

  43. arXiv:1706.02353  [pdf, other]

    math.ST stat.AP stat.CO stat.ME

    Sparse Wavelet Estimation in Quantile Regression with Multiple Functional Predictors

    Authors: Dengdeng Yu, Li Zhang, Ivan Mizera, Bei Jiang, Linglong Kong

    Abstract: In this manuscript, we study quantile regression in partial functional linear model where response is scalar and predictors include both scalars and multiple functions. Wavelet basis are adopted to better approximate functional slopes while effectively detect local features. The sparse group lasso penalty is imposed to select important functional predictors while capture shared information among t…

    Submitted 2 December, 2017; v1 submitted 7 June, 2017; originally announced June 2017.

  44. Easily parallelizable and distributable class of algorithms for structured sparsity, with optimal acceleration

    Authors: Seyoon Ko, Donghyeon Yu, Joong-Ho Won

    Abstract: Many statistical learning problems can be posed as minimization of a sum of two convex functions, one typically a composition of non-smooth and linear functions. Examples include regression under structured sparsity assumptions. Popular algorithms for solving such problems, e.g., ADMM, often involve non-trivial optimization subproblems or smoothing approximation. We consider two classes of primal-…

    Submitted 19 June, 2018; v1 submitted 20 February, 2017; originally announced February 2017.

    Comments: 57 pages (30 pages excluding appendix), 4 figures (2 excluding appendix)

    Journal ref: Journal of Computational and Graphical Statistics 28.4 (2019): pp.821-833

  45. arXiv:1511.00632  [pdf, other]

    stat.ME

    Partial Functional Linear Quantile Regression for Neuroimaging Data Analysis

    Authors: Dengdeng Yu, Linglong Kong, Ivan Mizera

    Abstract: We propose a prediction procedure for the functional linear quantile regression model by using partial quantile covariance techniques and develop a simple partial quantile regression (SIMPQR) algorithm to efficiently extract partial quantile regression (PQR) basis for estimating functional coefficients. We further extend our partial quantile covariance techniques to functional composite quantile r…

    Submitted 2 November, 2015; originally announced November 2015.

  46. arXiv:1306.1970  [pdf, ps, other]

    stat.ME

    High-dimensional Fused Lasso Regression using Majorization-Minimization and Parallel Processing

    Authors: Donghyeon Yu, Joong-Ho Won, Taehoon Lee, Johan Lim, Sungroh Yoon

    Abstract: In this paper, we propose a majorization-minimization (MM) algorithm for high-dimensional fused lasso regression (FLR) suitable for parallelization using graphics processing units (GPUs). The MM algorithm is stable and flexible as it can solve the FLR problems with various types of design matrices and penalty structures within a few tens of iterations. We also show that the convergence of the prop…

    Submitted 14 December, 2013; v1 submitted 8 June, 2013; originally announced June 2013.

  47. arXiv:1305.6340  [pdf, ps, other]

    stat.ME

    Monotone false discovery rate

    Authors: Joong-Ho Won, Johan Lim, Donghyeon Yu, Byung Soo Kim, Kyunga Kim

    Abstract: This paper proposes a procedure to obtain monotone estimates of both the local and the tail false discovery rates that arise in large-scale multiple testing. The proposed monotonization is asymptotically optimal for controlling the false discovery rate and also has many attractive finite-sample properties.

    Submitted 13 December, 2013; v1 submitted 27 May, 2013; originally announced May 2013.

  48. arXiv:1302.0256  [pdf, ps, other]

    stat.ML

    Regression shrinkage and grouping of highly correlated predictors with HORSES

    Authors: Woncheol Jang, Johan Lim, Nicole A. Lazar, Ji Meng Loh, Donghyeon Yu

    Abstract: Identifying homogeneous subgroups of variables can be challenging in high dimensional data analysis with highly correlated predictors. We propose a new method called Hexagonal Operator for Regression with Shrinkage and Equality Selection, HORSES for short, that simultaneously selects positively correlated variables and identifies them as predictive clusters. This is achieved via a constrained leas…

    Submitted 1 February, 2013; originally announced February 2013.

    MSC Class: 62J07; 62P10