Showing 1–50 of 98 results for author: Yu, H

Searching in archive stat.
  1. arXiv:2406.02948  [pdf, other]

    stat.ME stat.AP

    Copula-based semiparametric nonnormal transformed linear model for survival data with dependent censoring

    Authors: Huazhen Yu, Lixin Zhang

    Abstract: Although the independent censoring assumption is commonly used in survival analysis, it can be violated when the censoring time is related to the survival time, which often happens in many practical applications. To address this issue, we propose a flexible semiparametric method for dependent censored data. Our approach involves fitting the survival time and the censoring time with a joint transfo…

    Submitted 5 June, 2024; originally announced June 2024.

  2. arXiv:2405.19902  [pdf, other]

    cs.LG stat.ML

    Learning Discriminative Dynamics with Label Corruption for Noisy Label Detection

    Authors: Suyeon Kim, Dongha Lee, SeongKu Kang, Sukang Chae, Sanghwan Jang, Hwanjo Yu

    Abstract: Label noise, commonly found in real-world datasets, has a detrimental impact on a model's generalization. To effectively detect incorrectly labeled instances, previous works have mostly relied on distinguishable training signals, such as training loss, as indicators to differentiate between clean and noisy labels. However, they have limitations in that the training signals incompletely reveal the…

    Submitted 30 May, 2024; originally announced May 2024.

    Comments: Accepted to CVPR 2024

  3. arXiv:2404.16209  [pdf]

    stat.ME stat.AP stat.CO

    Exploring Spatial Context: A Comprehensive Bibliography of GWR and MGWR

    Authors: A. Stewart Fotheringham, Chen-Lun Kao, Hanchen Yu, Sarah Bardin, Taylor Oshan, Ziqi Li, Mehak Sachdeva, Wei Luo

    Abstract: Local spatial models such as Geographically Weighted Regression (GWR) and Multiscale Geographically Weighted Regression (MGWR) serve as instrumental tools to capture intrinsic contextual effects through the estimates of the local intercepts and behavioral contextual effects through estimates of the local slope parameters. GWR and MGWR provide simple implementation yet powerful frameworks that coul…

    Submitted 24 April, 2024; originally announced April 2024.

    Comments: 372 pages

  4. arXiv:2403.03058  [pdf, other]

    stat.ME stat.ML

    Machine Learning Assisted Adjustment Boosts Inferential Efficiency of Randomized Controlled Trials

    Authors: Han Yu, Alan D. Hutson

    Abstract: In this work, we proposed a novel inferential procedure assisted by machine learning based adjustment for randomized controlled trials. The method was developed under Rosenbaum's framework of exact tests in randomized experiments with covariate adjustments. Through extensive simulation experiments, we showed the proposed method can robustly control the type I error and can boost the inference eff…

    Submitted 5 March, 2024; originally announced March 2024.

  5. arXiv:2402.17096  [pdf, other]

    stat.CO

    Simple rejection Monte Carlo algorithm and its application to multivariate statistical inference

    Authors: Fengyu Li, Huijiao Yu, Jun Yan, Xianyong Meng

    Abstract: The Monte Carlo algorithm is increasingly utilized, with its central step involving computer-based random sampling from stochastic models. While both Markov Chain Monte Carlo (MCMC) and Reject Monte Carlo serve as sampling methods, the latter finds fewer applications compared to the former. Hence, this paper initially provides a concise introduction to the theory of the Reject Monte Carlo algorith…

    Submitted 26 February, 2024; originally announced February 2024.

  6. arXiv:2401.16667  [pdf, other]

    math.ST stat.AP stat.ME

    Sharp variance estimator and causal bootstrap in stratified randomized experiments

    Authors: Haoyang Yu, Ke Zhu, Hanzhong Liu

    Abstract: The design-based finite-population asymptotic theory provides a normal approximation for the sampling distribution of the average treatment effect estimator in stratified randomized experiments. The asymptotic variance could be estimated by a Neyman-type conservative variance estimator. However, the variance estimator can be overly conservative, and the asymptotic theory may fail in small samples.…

    Submitted 26 June, 2024; v1 submitted 29 January, 2024; originally announced January 2024.

  7. arXiv:2401.02203  [pdf, other]

    stat.ML cs.LG

    Robust bilinear factor analysis based on the matrix-variate $t$ distribution

    Authors: Xuan Ma, Jianhua Zhao, Changchun Shang, Fen Jiang, Philip L. H. Yu

    Abstract: Factor Analysis based on multivariate $t$ distribution ($t$fa) is a useful robust tool for extracting common factors on heavy-tailed or contaminated data. However, $t$fa is only applicable to vector data. When $t$fa is applied to matrix data, it is common to first vectorize the matrix observations. This introduces two challenges for $t$fa: (i) the inherent matrix structure of the data is broken, a…

    Submitted 4 January, 2024; originally announced January 2024.

  8. arXiv:2311.02766  [pdf, other]

    cs.LG stat.ME stat.ML

    Riemannian Laplace Approximation with the Fisher Metric

    Authors: Hanlin Yu, Marcelo Hartmann, Bernardo Williams, Mark Girolami, Arto Klami

    Abstract: Laplace's method approximates a target density with a Gaussian distribution at its mode. It is computationally efficient and asymptotically exact for Bayesian inference due to the Bernstein-von Mises theorem, but for complex targets and finite-data posteriors it is often too crude an approximation. A recent generalization of the Laplace Approximation transforms the Gaussian approximation according…

    Submitted 7 May, 2024; v1 submitted 5 November, 2023; originally announced November 2023.

    Comments: AISTATS 2024, with additional fixes and improvements

  9. arXiv:2308.08305  [pdf, other]

    stat.ML cs.LG

    Warped geometric information on the optimisation of Euclidean functions

    Authors: Marcelo Hartmann, Bernardo Williams, Hanlin Yu, Mark Girolami, Alessandro Barp, Arto Klami

    Abstract: We consider the fundamental task of optimising a real-valued function defined in a potentially high-dimensional Euclidean space, such as the loss function in many machine-learning tasks or the logarithm of the probability distribution in statistical inference. We use Riemannian geometry notions to redefine the optimisation problem of a function on the Euclidean space to a Riemannian manifold with…

    Submitted 18 March, 2024; v1 submitted 16 August, 2023; originally announced August 2023.

  10. arXiv:2304.03928  [pdf]

    cs.LG stat.AP

    Interpretable machine learning-accelerated seed treatment by nanomaterials for environmental stress alleviation

    Authors: Hengjie Yu, Dan Luo, Sam F. Y. Li, Maozhen Qu, Da Liu, Yingchao He, Fang Cheng

    Abstract: Crops are constantly challenged by different environmental conditions. Seed treatment by nanomaterials is a cost-effective and environmentally-friendly solution for environmental stress mitigation in crop plants. Here, 56 seed nanopriming treatments are used to alleviate environmental stresses in maize. Seven selected nanopriming treatments significantly increase the stress resistance index (SRI)…

    Submitted 8 April, 2023; originally announced April 2023.

    Comments: 30 pages, 6 figures

  11. arXiv:2303.05101  [pdf, other]

    cs.LG stat.CO

    Scalable Stochastic Gradient Riemannian Langevin Dynamics in Non-Diagonal Metrics

    Authors: Hanlin Yu, Marcelo Hartmann, Bernardo Williams, Arto Klami

    Abstract: Stochastic-gradient sampling methods are often used to perform Bayesian inference on neural networks. It has been observed that the methods in which notions of differential geometry are included tend to have better performances, with the Riemannian metric improving posterior exploration by accounting for the local curvature. However, the existing methods often resort to simple diagonal metrics to…

    Submitted 31 March, 2024; v1 submitted 9 March, 2023; originally announced March 2023.

    Comments: Adjust the template and minor fixes

  12. arXiv:2212.00992  [pdf, other]

    cs.LG stat.ML

    Stable Learning via Sparse Variable Independence

    Authors: Han Yu, Peng Cui, Yue He, Zheyan Shen, Yong Lin, Renzhe Xu, Xingxuan Zhang

    Abstract: The problem of covariate-shift generalization has attracted intensive research attention. Previous stable learning algorithms employ sample reweighting schemes to decorrelate the covariates when there is no explicit domain information about training data. However, with finite samples, it is difficult to achieve the desirable weights that ensure perfect independence to get rid of the unstable varia…

    Submitted 2 December, 2022; originally announced December 2022.

    Comments: Accepted by AAAI 2023

  13. arXiv:2210.09913  [pdf, ps, other]

    stat.ML cs.LG

    Measure-Theoretic Probability of Complex Co-occurrence and E-Integral

    Authors: Jian-Yong Wang, Han Yu

    Abstract: Complex high-dimensional co-occurrence data are increasingly popular from a complex system of interacting physical, biological and social processes in discretely indexed modifiable areal units or continuously indexed locations of a study region for landscape-based mechanism. Modeling, predicting and interpreting complex co-occurrences are very general and fundamental problems of statistical and ma…

    Submitted 18 October, 2022; originally announced October 2022.

  14. arXiv:2210.00635  [pdf, other]

    cs.LG stat.ML

    Robust Empirical Risk Minimization with Tolerance

    Authors: Robi Bhattacharjee, Max Hopkins, Akash Kumar, Hantao Yu, Kamalika Chaudhuri

    Abstract: Developing simple, sample-efficient learning algorithms for robust classification is a pressing issue in today's tech-dominated world, and current theoretical techniques requiring exponential sample complexity and complicated improper learning rules fall far from answering the need. In this work we study the fundamental paradigm of (robust) $\textit{empirical risk minimization}$ (RERM), a simple p…

    Submitted 4 February, 2023; v1 submitted 2 October, 2022; originally announced October 2022.

    Comments: 22 pages, 1 figure, To appear at ALT'23

  15. arXiv:2209.12313  [pdf, other]

    cs.DS math.ST stat.ML

    Random graph matching at Otter's threshold via counting chandeliers

    Authors: Cheng Mao, Yihong Wu, Jiaming Xu, Sophie H. Yu

    Abstract: We propose an efficient algorithm for graph matching based on similarity scores constructed from counting a certain family of weighted trees rooted at each vertex. For two Erdős-Rényi graphs $\mathcal{G}(n,q)$ whose edges are correlated through a latent vertex correspondence, we show that this algorithm correctly matches all but a vanishing fraction of the vertices with high probability, provided…

    Submitted 13 February, 2023; v1 submitted 25 September, 2022; originally announced September 2022.

  16. arXiv:2204.09086  [pdf, other]

    stat.ML cs.LG

    Choosing the number of factors in factor analysis with incomplete data via a hierarchical Bayesian information criterion

    Authors: Jianhua Zhao, Changchun Shang, Shulan Li, Ling Xin, Philip L. H. Yu

    Abstract: The Bayesian information criterion (BIC), defined as the observed data log likelihood minus a penalty term based on the sample size $N$, is a popular model selection criterion for factor analysis with complete data. This definition has also been suggested for incomplete data. However, the penalty term based on the 'complete' sample size $N$ is the same no matter whether in a complete or incomplete…

    Submitted 19 April, 2022; originally announced April 2022.

    Comments: 16 pages, 4 figures

    MSC Class: 62H25 ACM Class: G.3; I.2.6

  17. arXiv:2202.10506  [pdf, other]

    math.OC cs.LG stat.ML

    Accelerating Primal-dual Methods for Regularized Markov Decision Processes

    Authors: Haoya Li, Hsiang-fu Yu, Lexing Ying, Inderjit Dhillon

    Abstract: Entropy regularized Markov decision processes have been widely used in reinforcement learning. This paper is concerned with the primal-dual formulation of the entropy regularized problems. Standard first-order methods suffer from slow convergence due to the lack of strict convexity and concavity. To address this issue, we first introduce a new quadratically convexified primal-dual formulation. The…

    Submitted 12 June, 2023; v1 submitted 21 February, 2022; originally announced February 2022.

  18. arXiv:2201.09433  [pdf, other]

    cs.LG cs.CC stat.ML

    Active Learning Polynomial Threshold Functions

    Authors: Omri Ben-Eliezer, Max Hopkins, Chutong Yang, Hantao Yu

    Abstract: We initiate the study of active learning polynomial threshold functions (PTFs). While traditional lower bounds imply that even univariate quadratics cannot be non-trivially actively learned, we show that allowing the learner basic access to the derivatives of the underlying classifier circumvents this issue and leads to a computationally efficient algorithm for active learning degree-$d$ univariat…

    Submitted 1 October, 2022; v1 submitted 23 January, 2022; originally announced January 2022.

    MSC Class: 68Q32

  19. Assessing Deep Neural Networks as Probability Estimators

    Authors: Yu Pan, Kwo-Sen Kuo, Michael L. Rilee, Hongfeng Yu

    Abstract: Deep Neural Networks (DNNs) have performed admirably in classification tasks. However, the characterization of their classification uncertainties, required for certain applications, has been lacking. In this work, we investigate the issue by assessing DNNs' ability to estimate conditional probabilities and propose a framework for systematic uncertainty characterization. Denoting the input sample a…

    Submitted 27 November, 2023; v1 submitted 16 November, 2021; originally announced November 2021.

    Comments: Y. Pan, K. Kuo, M. Rilee and H. Yu, "Assessing Deep Neural Networks as Probability Estimators," in 2021 IEEE International Conference on Big Data (Big Data), Orlando, FL, USA, 2021 pp. 1083-1091. doi: 10.1109/BigData52589.2021.9671328

  20. arXiv:2111.05708  [pdf]

    cs.LG stat.ML

    STNN-DDI: A Substructure-aware Tensor Neural Network to Predict Drug-Drug Interactions

    Authors: Hui Yu, ShiYu Zhao, JianYu Shi

    Abstract: Motivation: Computational prediction of multiple-type drug-drug interaction (DDI) helps reduce unexpected side effects in poly-drug treatments. Although existing computational approaches achieve inspiring results, they ignore that the action of a drug is mainly caused by its chemical substructures. In addition, their interpretability is still weak. Results: In this paper, by supposing that the int…

    Submitted 5 December, 2021; v1 submitted 10 November, 2021; originally announced November 2021.

  21. arXiv:2110.11816  [pdf, other]

    math.ST stat.ML

    Testing network correlation efficiently via counting trees

    Authors: Cheng Mao, Yihong Wu, Jiaming Xu, Sophie H. Yu

    Abstract: We propose a new procedure for testing whether two networks are edge-correlated through some latent vertex correspondence. The test statistic is based on counting the co-occurrences of signed trees for a family of non-isomorphic trees. When the two networks are Erdős-Rényi random graphs $\mathcal{G}(n,q)$ that are either independent or correlated with correlation coefficient $ρ$, our test runs in…

    Submitted 1 April, 2022; v1 submitted 22 October, 2021; originally announced October 2021.

  22. arXiv:2110.00685  [pdf, other]

    cs.LG cs.AI cs.IR stat.ML

    Fast Multi-Resolution Transformer Fine-tuning for Extreme Multi-label Text Classification

    Authors: Jiong Zhang, Wei-cheng Chang, Hsiang-fu Yu, Inderjit S. Dhillon

    Abstract: Extreme multi-label text classification (XMC) seeks to find relevant labels from an extremely large label collection for a given text input. Many real-world applications can be formulated as XMC problems, such as recommendation systems, document tagging and semantic search. Recently, transformer-based XMC methods, such as X-Transformer and LightXML, have shown significant improvement over other XMC…

    Submitted 28 October, 2021; v1 submitted 1 October, 2021; originally announced October 2021.

  23. arXiv:2108.00819  [pdf, other]

    cs.LG cs.AI stat.ML

    Active Learning in Gaussian Process State Space Model

    Authors: Hon Sum Alec Yu, Dingling Yao, Christoph Zimmer, Marc Toussaint, Duy Nguyen-Tuong

    Abstract: We investigate active learning in Gaussian Process state-space models (GPSSM). Our problem is to actively steer the system through latent states by determining its inputs such that the underlying dynamics can be optimally learned by a GPSSM. In order that the most informative inputs are selected, we employ mutual information as our active learning criterion. In particular, we present two approache…

    Submitted 30 July, 2021; originally announced August 2021.

    Comments: European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML PKDD) 2021

  24. arXiv:2106.13275  [pdf, other]

    cs.LG stat.ML

    Multitask Learning for Citation Purpose Classification

    Authors: Alex Oesterling, Angikar Ghosal, Haoyang Yu, Rui Xin, Yasa Baig, Lesia Semenova, Cynthia Rudin

    Abstract: We present our entry into the 2021 3C Shared Task Citation Context Classification based on Purpose competition. The goal of the competition is to classify a citation in a scientific article based on its purpose. This task is important because it could potentially lead to more comprehensive ways of summarizing the purpose and uses of scientific articles, but it is also difficult, mainly due to the…

    Submitted 24 June, 2021; originally announced June 2021.

    Comments: Second Workshop on Scholarly Document Processing

    Journal ref: Proceedings of the Second Workshop on Scholarly Document Processing, 2021

  25. arXiv:2106.12751  [pdf, other]

    stat.ML cs.LG

    Label Disentanglement in Partition-based Extreme Multilabel Classification

    Authors: Xuanqing Liu, Wei-Cheng Chang, Hsiang-Fu Yu, Cho-Jui Hsieh, Inderjit S. Dhillon

    Abstract: Partition-based methods are increasingly used in extreme multi-label classification (XMC) problems due to their scalability to large output spaces (e.g., millions or more). However, existing methods partition the large label space into mutually exclusive clusters, which is sub-optimal when labels have multi-modality and rich semantics. For instance, the label "Apple" can be the fruit or the brand…

    Submitted 23 June, 2021; originally announced June 2021.

  26. arXiv:2103.16438  [pdf, other]

    stat.ME

    A General Framework of Nonparametric Feature Selection in High-Dimensional Data

    Authors: Hang Yu, Yuanjia Wang, Donglin Zeng

    Abstract: Nonparametric feature selection in high-dimensional data is an important and challenging problem in statistics and machine learning fields. Most of the existing methods for feature selection focus on parametric or additive models which may suffer from model misspecification. In this paper, we propose a new framework to perform nonparametric feature selection for both regression and classification…

    Submitted 30 March, 2021; originally announced March 2021.

    Comments: 25 pages, 3 figures

  27. arXiv:2102.00082  [pdf, ps, other]

    math.ST cs.IT stat.ML

    Settling the Sharp Reconstruction Thresholds of Random Graph Matching

    Authors: Yihong Wu, Jiaming Xu, Sophie H. Yu

    Abstract: This paper studies the problem of recovering the hidden vertex correspondence between two edge-correlated random graphs. We focus on the Gaussian model where the two graphs are complete graphs with correlated Gaussian weights and the Erdős-Rényi model where the two graphs are subsampled from a common parent Erdős-Rényi graph $\mathcal{G}(n,p)$. For dense graphs with $p=n^{-o(1)}$, we prove that th…

    Submitted 16 February, 2022; v1 submitted 29 January, 2021; originally announced February 2021.

    MSC Class: 94A15; 62B10; 68Q87; 05C80; 05C60

  28. arXiv:2101.00484  [pdf, other]

    stat.ME stat.AP

    Marginal modeling of cluster-period means and intraclass correlations in stepped wedge designs with binary outcomes

    Authors: Fan Li, Hengshi Yu, Paul J. Rathouz, Elizabeth L. Turner, John S. Preisser

    Abstract: Stepped wedge cluster randomized trials (SW-CRTs) with binary outcomes are increasingly used in prevention and implementation studies. Marginal models represent a flexible tool for analyzing SW-CRTs with population-averaged interpretations, but the joint estimation of the mean and intraclass correlation coefficients (ICCs) can be computationally intensive due to large cluster-period sizes. Motivat…

    Submitted 2 January, 2021; originally announced January 2021.

    Comments: 28 pages, 2 figures, 3 tables

    Journal ref: Biostatistics (2021)

  29. arXiv:2012.07190   

    q-bio.PE stat.AP

    A prognostic dynamic model applicable to infectious diseases providing easily visualized guides -- A case study of COVID-19 in the UK

    Authors: Yuxuan Zhang, Chen Gong, Dawei Li, Zhi-Wei Wang, Shengda D Pu, Alex W Robertson, Hong Yu, John Parrington

    Abstract: A reasonable prediction of infectious diseases transmission process under different disease control strategies is an important reference point for policy makers. Here we established a dynamic transmission model via Python and realized comprehensive regulation of disease control measures. We classified government interventions into three categories and introduced three parameters as descriptions fo…

    Submitted 22 February, 2021; v1 submitted 13 December, 2020; originally announced December 2020.

    Comments: Errors appear in Results; data changed

  30. arXiv:2009.07703  [pdf, other]

    stat.ML cs.LG stat.ME

    Efficient Variational Bayes Learning of Graphical Models with Smooth Structural Changes

    Authors: Hang Yu, Songwei Wu, Justin Dauwels

    Abstract: Estimating time-varying graphical models is of paramount importance in various social, financial, biological, and engineering systems, since the evolution of such networks can be utilized, for example, to spot trends, detect anomalies, predict vulnerability, and evaluate the impact of interventions. Existing methods require extensive tuning of parameters that control the graph sparsity and temporal…

    Submitted 4 February, 2023; v1 submitted 16 September, 2020; originally announced September 2020.

    Journal ref: IEEE Transactions on Pattern Analysis and Machine Intelligence 45 (2023)

  31. arXiv:2009.07022  [pdf, other]

    cs.LG cs.CL cs.DB cs.IR stat.ML

    The Devil is the Classifier: Investigating Long Tail Relation Classification with Decoupling Analysis

    Authors: Haiyang Yu, Ningyu Zhang, Shumin Deng, Zonggang Yuan, Yantao Jia, Huajun Chen

    Abstract: Long-tailed relation classification is a challenging problem as the head classes may dominate the training phase, thereby leading to the deterioration of the tail performance. Existing solutions usually address this issue via class-balancing strategies, e.g., data re-sampling and loss re-weighting, but all these methods adhere to the schema of entangling learning of the representation and classifi…

    Submitted 15 September, 2020; originally announced September 2020.

  32. arXiv:2009.02909  [pdf, other]

    cs.LG stat.ML

    Sparse Network Inversion for Key Instance Detection in Multiple Instance Learning

    Authors: Beomjo Shin, Junsu Cho, Hwanjo Yu, Seungjin Choi

    Abstract: Multiple Instance Learning (MIL) involves predicting a single label for a bag of instances, given positive or negative labels at bag-level, without access to the label of each instance in the training phase. Since a positive bag contains both positive and negative instances, it is often required to detect positive instances (key instances) when a set of instances is categorized as a positive bag.…

    Submitted 7 September, 2020; v1 submitted 7 September, 2020; originally announced September 2020.

    Comments: 8 pages, 4 figures, in Proceedings of the 25th International Conference on Pattern Recognition (ICPR-2020)

  33. arXiv:2008.10097  [pdf, other]

    math.ST math.CO math.PR stat.ML

    Testing correlation of unlabeled random graphs

    Authors: Yihong Wu, Jiaming Xu, Sophie H. Yu

    Abstract: We study the problem of detecting the edge correlation between two random graphs with $n$ unlabeled nodes. This is formalized as a hypothesis testing problem, where under the null hypothesis, the two graphs are independently generated; under the alternative, the two graphs are edge-correlated under some latent node correspondence, but have the same marginal distributions as the null. For both Gaus…

    Submitted 7 February, 2021; v1 submitted 23 August, 2020; originally announced August 2020.

  34. arXiv:2008.01200  [pdf, other]

    stat.ME stat.AP

    A Robust Spearman Correlation Coefficient Permutation Test

    Authors: Han Yu, Alan D. Hutson

    Abstract: In this work, we show that Spearman's correlation coefficient test about $H_0:ρ_s=0$ found in most statistical software packages is theoretically incorrect and performs poorly when bivariate normality assumptions are not met or the sample size is small. The historical works about these tests make an unverifiable assumption that the approximate bivariate normality of original data justifies using c…

    Submitted 3 August, 2020; originally announced August 2020.

    Comments: 10 pages, 3 figures

  35. arXiv:2006.06903  [pdf, other]

    cs.LG stat.ML

    On Correctness of Automatic Differentiation for Non-Differentiable Functions

    Authors: Wonyeol Lee, Hangyeol Yu, Xavier Rival, Hongseok Yang

    Abstract: Differentiation lies at the core of many machine-learning algorithms, and is well-supported by popular autodiff systems, such as TensorFlow and PyTorch. Originally, these systems have been developed to compute derivatives of differentiable functions, but in practice, they are commonly applied to functions with non-differentiabilities. For instance, neural networks using ReLU define non-differentia…

    Submitted 26 October, 2020; v1 submitted 11 June, 2020; originally announced June 2020.

    Comments: To appear at NeurIPS 2020

  36. arXiv:2006.06762  [pdf, other]

    cs.LG cs.NE cs.PF cs.PL stat.ML

    Ansor: Generating High-Performance Tensor Programs for Deep Learning

    Authors: Lianmin Zheng, Chengfan Jia, Minmin Sun, Zhao Wu, Cody Hao Yu, Ameer Haj-Ali, Yida Wang, Jun Yang, Danyang Zhuo, Koushik Sen, Joseph E. Gonzalez, Ion Stoica

    Abstract: High-performance tensor programs are crucial to guarantee efficient execution of deep neural networks. However, obtaining performant tensor programs for different operators on various hardware platforms is notoriously challenging. Currently, deep learning systems rely on vendor-provided kernel libraries or various search strategies to get performant tensor programs. These approaches either require…

    Submitted 15 October, 2023; v1 submitted 11 June, 2020; originally announced June 2020.

    Comments: OSDI 2020

  37. arXiv:2006.06141  [pdf]

    cond-mat.mtrl-sci stat.ML

    On-the-fly Closed-loop Autonomous Materials Discovery via Bayesian Active Learning

    Authors: A. Gilad Kusne, Heshan Yu, Changming Wu, Huairuo Zhang, Jason Hattrick-Simpers, Brian DeCost, Suchismita Sarker, Corey Oses, Cormac Toher, Stefano Curtarolo, Albert V. Davydov, Ritesh Agarwal, Leonid A. Bendersky, Mo Li, Apurva Mehta, Ichiro Takeuchi

    Abstract: Active learning, the field of machine learning (ML) dedicated to optimal experiment design, has played a part in science as far back as the 18th century when Laplace used it to guide his discovery of celestial mechanics [1]. In this work we focus a closed-loop, active learning-driven autonomous system on another major challenge, the discovery of advanced materials against the exceedingly complex…

    Submitted 10 November, 2020; v1 submitted 10 June, 2020; originally announced June 2020.

    Comments: 30 pages and 13 figures in PDF including Methods section

    Journal ref: Nat Commun 11, 5966 (2020)

  38. arXiv:2006.04239  [pdf, other]

    cs.LG cs.IR stat.ML

    Unsupervised Differentiable Multi-aspect Network Embedding

    Authors: Chanyoung Park, Carl Yang, Qi Zhu, Donghyun Kim, Hwanjo Yu, Jiawei Han

    Abstract: Network embedding is an influential graph mining technique for representing nodes in a graph as distributed vectors. However, the majority of network embedding methods focus on learning a single vector representation for each node, which has been recently criticized for not being capable of modeling multiple aspects of a node. To capture the multiple aspects of each node, existing studies mainly r…

    Submitted 7 July, 2020; v1 submitted 7 June, 2020; originally announced June 2020.

    Comments: KDD 2020 (Research Track). 9 Pages + Appendix (2 Pages). Source code can be found at https://github.com/pcy1302/asp2vec. Typo fixed in Fig.2

  39. arXiv:2004.00198  [pdf, other]

    cs.LG stat.ML

    Extreme Multi-label Classification from Aggregated Labels

    Authors: Yanyao Shen, Hsiang-fu Yu, Sujay Sanghavi, Inderjit Dhillon

    Abstract: Extreme multi-label classification (XMC) is the problem of finding the relevant labels for an input, from a very large universe of possible labels. We consider XMC in the setting where labels are available only for groups of samples - but not for individual ones. Current XMC approaches are not built for such multi-instance multi-label (MIML) training data, and MIML approaches do not scale to XMC s…

    Submitted 31 March, 2020; originally announced April 2020.

  40. arXiv:2003.11941  [pdf, other]

    cs.LG cs.AI stat.ML

    AliExpress Learning-To-Rank: Maximizing Online Model Performance without Going Online

    Authors: Guangda Huzhang, Zhen-Jia Pang, Yongqing Gao, Yawen Liu, Weijie Shen, Wen-Ji Zhou, Qing Da, An-Xiang Zeng, Han Yu, Yang Yu, Zhi-Hua Zhou

    Abstract: Learning-to-rank (LTR) has become a key technology in E-commerce applications. Most existing LTR approaches follow a supervised learning paradigm from offline labeled data collected from the online system. However, it has been noticed that previous LTR models can have a good validation performance over offline validation data but have a poor online performance, and vice versa, which implies a poss…

    Submitted 31 December, 2020; v1 submitted 25 March, 2020; originally announced March 2020.

  41. arXiv:2003.09229  [pdf, other]

    cs.LG cs.CL stat.ML

    Learning to Encode Position for Transformer with Continuous Dynamical Model

    Authors: Xuanqing Liu, Hsiang-Fu Yu, Inderjit Dhillon, Cho-Jui Hsieh

    Abstract: We introduce a new way of learning to encode position information for non-recurrent models, such as Transformer models. Unlike RNN and LSTM, which contain inductive bias by loading the input tokens sequentially, non-recurrent models are less sensitive to position. The main reason is that position information among input units is not inherently encoded, i.e., the models are permutation equivalent;…

    Submitted 12 March, 2020; originally announced March 2020.

    Comments: Code to be released in https://github.com/xuanqing94/FLOATER

  42. arXiv:2003.02133  [pdf, other]

    cs.CR cs.LG stat.ML

    Threats to Federated Learning: A Survey

    Authors: Lingjuan Lyu, Han Yu, Qiang Yang

    Abstract: With the emergence of data silos and popular privacy awareness, the traditional centralized approach of training artificial intelligence (AI) models is facing strong challenges. Federated learning (FL) has recently emerged as a promising solution under this new reality. Existing FL protocol design has been shown to exhibit vulnerabilities which can be exploited by adversaries both within and witho…

    Submitted 4 March, 2020; originally announced March 2020.

    Comments: 7 pages, 4 figures, 2 tables

  43. arXiv:2002.12663  [pdf, other]

    cs.LG cs.CV stat.ML

    HOTCAKE: Higher Order Tucker Articulated Kernels for Deeper CNN Compression

    Authors: Rui Lin, Ching-Yun Ko, Zhuolun He, Cong Chen, Yuan Cheng, Hao Yu, Graziano Chesi, Ngai Wong

    Abstract: The emergence of edge computing has promoted immense interest in compacting a neural network without sacrificing much accuracy. In this regard, low-rank tensor decomposition constitutes a powerful tool to compress convolutional neural networks (CNNs) by decomposing the 4-way kernel tensor into multi-stage smaller ones. Building on top of Tucker-2 decomposition, we propose a generalized Higher Order T…

    Submitted 28 February, 2020; originally announced February 2020.

    Comments: 6 pages, 5 figures

  44. arXiv:2002.11711  [pdf, other]

    cs.CR cs.LG stat.ML

    FedCoin: A Peer-to-Peer Payment System for Federated Learning

    Authors: Yuan Liu, Shuai Sun, Zhengpeng Ai, Shuangfeng Zhang, Zelei Liu, Han Yu

    Abstract: Federated learning (FL) is an emerging collaborative machine learning method to train models on distributed datasets with privacy concerns. To properly incentivize data owners to contribute their efforts, Shapley Value (SV) is often adopted to fairly assess their contribution. However, the calculation of SV is time-consuming and computationally costly. In this paper, we propose FedCoin, a blockcha…

    Submitted 25 February, 2020; originally announced February 2020.

    Comments: 7 pages, 6 figures, 21 references

  45. arXiv:2002.02829  [pdf, other]

    cs.LG cs.AI stat.ML

    Off-policy Maximum Entropy Reinforcement Learning : Soft Actor-Critic with Advantage Weighted Mixture Policy(SAC-AWMP)

    Authors: Zhimin Hou, Kuangen Zhang, Yi Wan, Dongyu Li, Chenglong Fu, Haoyong Yu

    Abstract: The optimal policy of a reinforcement learning problem is often discontinuous and non-smooth, i.e., for two states with similar representations, their optimal policies can be significantly different. In this case, representing the entire policy with a function approximator (FA) with shared parameters for all states may not be desirable, as the generalization ability of parameter sharing makes repr…

    Submitted 7 February, 2020; originally announced February 2020.

  46. arXiv:2001.11359  [pdf, other]

    cs.LG stat.ML

    FOCUS: Dealing with Label Quality Disparity in Federated Learning

    Authors: Yiqiang Chen, Xiaodong Yang, Xin Qin, Han Yu, Biao Chen, Zhiqi Shen

    Abstract: Ubiquitous systems with End-Edge-Cloud architecture are increasingly being used in healthcare applications. Federated Learning (FL) is highly useful for such applications, due to silo effect and privacy preserving. Existing FL approaches generally do not account for disparities in the quality of local data labels. However, the clients in ubiquitous systems tend to suffer from label noise due to va…

    Submitted 29 January, 2020; originally announced January 2020.

    Comments: 7 pages

  47. arXiv:2001.11154  [pdf, other]

    cs.LG stat.ML

    Multi-Participant Multi-Class Vertical Federated Learning

    Authors: Siwei Feng, Han Yu

    Abstract: Federated learning (FL) is a privacy-preserving paradigm for training collective machine learning models with locally stored data from multiple participants. Vertical federated learning (VFL) deals with the case where participants share the same sample ID space but have different feature spaces, while label information is owned by one participant. Current studies of VFL only support two partic…

    Submitted 29 January, 2020; originally announced January 2020.

  48. arXiv:2001.06202  [pdf, other]

    cs.LG cs.CV stat.ML

    FedVision: An Online Visual Object Detection Platform Powered by Federated Learning

    Authors: Yang Liu, Anbu Huang, Yun Luo, He Huang, Youzhi Liu, Yuanyuan Chen, Lican Feng, Tianjian Chen, Han Yu, Qiang Yang

    Abstract: Visual object detection is a computer vision-based artificial intelligence (AI) technique which has many practical applications (e.g., fire hazard monitoring). However, due to privacy concerns and the high cost of transmitting video data, it is highly challenging to build object detection models on centrally stored large training datasets following the current approach. Federated learning (FL) is…

    Submitted 17 January, 2020; originally announced January 2020.

  49. arXiv:1912.04977  [pdf, other]

    cs.LG cs.CR stat.ML

    Advances and Open Problems in Federated Learning

    Authors: Peter Kairouz, H. Brendan McMahan, Brendan Avent, Aurélien Bellet, Mehdi Bennis, Arjun Nitin Bhagoji, Kallista Bonawitz, Zachary Charles, Graham Cormode, Rachel Cummings, Rafael G. L. D'Oliveira, Hubert Eichner, Salim El Rouayheb, David Evans, Josh Gardner, Zachary Garrett, Adrià Gascón, Badih Ghazi, Phillip B. Gibbons, Marco Gruteser, Zaid Harchaoui, Chaoyang He, Lie He, Zhouyuan Huo, Ben Hutchinson, et al. (34 additional authors not shown)

    Abstract: Federated learning (FL) is a machine learning setting where many clients (e.g. mobile devices or whole organizations) collaboratively train a model under the orchestration of a central server (e.g. service provider), while keeping the training data decentralized. FL embodies the principles of focused data collection and minimization, and can mitigate many of the systemic privacy risks and costs re…

    Submitted 8 March, 2021; v1 submitted 10 December, 2019; originally announced December 2019.

    Comments: Published in Foundations and Trends in Machine Learning Vol 4 Issue 1. See: https://www.nowpublishers.com/article/Details/MAL-083

  50. arXiv:1911.07346  [pdf, other]

    cs.LG cs.CV stat.ML

    Any-Precision Deep Neural Networks

    Authors: Haichao Yu, Haoxiang Li, Honghui Shi, Thomas S. Huang, Gang Hua

    Abstract: We present any-precision deep neural networks (DNNs), which are trained with a new method that allows the learned DNNs to be flexible in numerical precision during inference. The same model in runtime can be flexibly and directly set to different bit-widths, by truncating the least significant bits, to support dynamic speed and accuracy trade-off. When all layers are set to low-bits, we show that…

    Submitted 15 January, 2021; v1 submitted 17 November, 2019; originally announced November 2019.

    Comments: AAAI 2021