Showing 1–50 of 73 results for author: Lin, H

Searching in archive stat.
  1. arXiv:2406.12017  [pdf, other]

    stat.ML cs.LG stat.CO

    Sparsity-Constraint Optimization via Splicing Iteration

    Authors: Zezhi Wang, ** Zhu, Junxian Zhu, Borui Tang, Hongmei Lin, Xueqin Wang

    Abstract: Sparsity-constraint optimization has wide applicability in signal processing, statistics, and machine learning. Existing fast algorithms must burdensomely tune parameters, such as the step size or the implementation of precise stop criteria, which may be challenging to determine in practice. To address this issue, we develop an algorithm named Sparsity-Constraint Optimization via sPlicing itEratio…

    Submitted 17 June, 2024; originally announced June 2024.

    Comments: 34 pages

  2. arXiv:2406.01252  [pdf, other]

    cs.CL cs.AI stat.ML

    Towards Scalable Automated Alignment of LLMs: A Survey

    Authors: Boxi Cao, Keming Lu, Xinyu Lu, Jiawei Chen, Mengjie Ren, Hao Xiang, Peilin Liu, Yaojie Lu, Ben He, Xianpei Han, Le Sun, Hongyu Lin, Bowen Yu

    Abstract: Alignment is the most critical step in building large language models (LLMs) that meet human needs. With the rapid development of LLMs gradually surpassing human capabilities, traditional alignment methods based on human-annotation are increasingly unable to meet the scalability demands. Therefore, there is an urgent need to explore new sources of automated alignment signals and technical approach…

    Submitted 3 June, 2024; originally announced June 2024.

  3. arXiv:2405.11284  [pdf, other]

    cs.AI stat.OT

    The Logic of Counterfactuals and the Epistemology of Causal Inference

    Authors: Hanti Lin

    Abstract: The 2021 Nobel Prize in Economics recognized a theory of causal inference, which deserves more attention from philosophers. To that end, I develop a dialectic that extends the Lewis-Stalnaker debate on a logical principle called Conditional Excluded Middle (CEM). I first play the good cop for CEM, and give a new argument for it: a Quine-Putnam indispensability argument based on the Nobel-Prize win…

    Submitted 18 May, 2024; originally announced May 2024.

  4. arXiv:2405.03723  [pdf, other]

    cs.LG stat.ME stat.ML

    Generative adversarial learning with optimal input dimension and its adaptive generator architecture

    Authors: Zhiyao Tan, Ling Zhou, Huazhen Lin

    Abstract: We investigate the impact of the input dimension on the generalization error in generative adversarial networks (GANs). In particular, we first provide both theoretical and practical evidence to validate the existence of an optimal input dimension (OID) that minimizes the generalization error. Then, to identify the OID, we introduce a novel framework called generalized GANs (G-GANs), which include…

    Submitted 5 May, 2024; originally announced May 2024.

  5. arXiv:2404.16954  [pdf, other]

    cs.LG cs.AI stat.ML

    Taming False Positives in Out-of-Distribution Detection with Human Feedback

    Authors: Harit Vishwakarma, Heguang Lin, Ramya Korlakai Vinayak

    Abstract: Robustness to out-of-distribution (OOD) samples is crucial for safely deploying machine learning models in the open world. Recent works have focused on designing scoring functions to quantify OOD uncertainty. Setting appropriate thresholds for these scoring functions for OOD detection is challenging as OOD samples are often unavailable up front. Typically, thresholds are set to achieve a desired t…

    Submitted 25 April, 2024; originally announced April 2024.

    Comments: Appeared in the 27th International Conference on Artificial Intelligence and Statistics (AISTATS 2024)

    Journal ref: PMLR 238:1486-1494, 2024

  6. arXiv:2404.13309  [pdf, ps, other]

    stat.ML cs.LG

    Latent Schrödinger Bridge Diffusion Model for Generative Learning

    Authors: Yuling Jiao, Lican Kang, Huazhen Lin, ** Liu, Heng Zuo

    Abstract: This paper aims to conduct a comprehensive theoretical analysis of current diffusion models. We introduce a novel generative learning methodology utilizing the Schrödinger bridge diffusion model in latent space as the framework for theoretical exploration in this domain. Our approach commences with the pre-training of an encoder-decoder architecture using data originating from a distribution tha…

    Submitted 20 April, 2024; originally announced April 2024.

  7. arXiv:2402.14966  [pdf, other]

    stat.ML cs.LG stat.ME

    Smoothness Adaptive Hypothesis Transfer Learning

    Authors: Haotian Lin, Matthew Reimherr

    Abstract: Many existing two-phase kernel-based hypothesis transfer learning algorithms employ the same kernel regularization across phases and rely on the known smoothness of functions to obtain optimality. Therefore, they fail to adapt to the varying and unknown smoothness between the target/source and their offset in practice. In this paper, we address these problems by proposing Smoothness Adaptive Trans…

    Submitted 22 February, 2024; originally announced February 2024.

  8. arXiv:2311.13768  [pdf, other]

    stat.ME

    Valid confidence intervals for regression with best subset selection

    Authors: Huiming Lin, Meng Li

    Abstract: Classical confidence intervals after best subset selection are widely implemented in statistical software and are routinely used to guide practitioners in scientific fields to conclude significance. However, there are increasing concerns in the recent literature about the validity of these confidence intervals in that the intended frequentist coverage is not attained. In the context of the Akaike…

    Submitted 22 November, 2023; originally announced November 2023.

  9. arXiv:2310.14608  [pdf, other]

    stat.ML cs.LG

    CAD-DA: Controllable Anomaly Detection after Domain Adaptation by Statistical Inference

    Authors: Vo Nguyen Le Duy, Hsuan-Tien Lin, Ichiro Takeuchi

    Abstract: We propose a novel statistical method for testing the results of anomaly detection (AD) under domain adaptation (DA), which we call CAD-DA -- controllable AD under DA. The distinct advantage of the CAD-DA lies in its ability to control the probability of misidentifying anomalies under a pre-specified level $α$ (e.g., 0.05). The challenge within this DA setting is the necessity to account for the i…

    Submitted 23 October, 2023; originally announced October 2023.

  10. arXiv:2310.10048  [pdf, other]

    stat.ME

    Evaluation of transplant benefits with the U.S. Scientific Registry of Transplant Recipients by semiparametric regression of mean residual life

    Authors: Ge Zhao, Yanyuan Ma, Huazhen Lin, Yi Li

    Abstract: Kidney transplantation is the most effective renal replacement therapy for end stage renal disease patients. With the severe shortage of kidney supplies and for the clinical effectiveness of transplantation, patient's life expectancy post transplantation is used to prioritize patients for transplantation; however, severe comorbidity conditions and old age are the most dominant factors that negativ…

    Submitted 17 October, 2023; v1 submitted 16 October, 2023; originally announced October 2023.

    Comments: 68 pages, 13 figures. arXiv admin note: text overlap with arXiv:2011.04067

  11. arXiv:2310.07999  [pdf, other]

    cs.LG stat.ML

    LEMON: Lossless model expansion

    Authors: Yite Wang, Jiahao Su, Hanlin Lu, Cong Xie, Tianyi Liu, Jianbo Yuan, Haibin Lin, Ruoyu Sun, Hongxia Yang

    Abstract: Scaling of deep neural networks, especially Transformers, is pivotal for their surging performance and has further led to the emergence of sophisticated reasoning capabilities in foundation models. Such scaling generally requires training large models from scratch with random initialization, failing to leverage the knowledge acquired by their smaller counterparts, which are already resource-intens…

    Submitted 11 October, 2023; originally announced October 2023.

    Comments: Preprint

  12. arXiv:2309.12872  [pdf, other]

    stat.ME

    Deep regression learning with optimal loss function

    Authors: Xuancheng Wang, Ling Zhou, Huazhen Lin

    Abstract: In this paper, we develop a novel efficient and robust nonparametric regression estimator under a framework of feedforward neural network. There are several interesting characteristics for the proposed estimator. First, the loss function is built upon an estimated maximum likelihood function, who integrates the information from observed data, as well as the information from data structure. Consequ…

    Submitted 22 September, 2023; originally announced September 2023.

  13. arXiv:2309.00125  [pdf, other]

    stat.ML cs.CR cs.LG

    Pure Differential Privacy for Functional Summaries via a Laplace-like Process

    Authors: Haotian Lin, Matthew Reimherr

    Abstract: Many existing mechanisms to achieve differential privacy (DP) on infinite-dimensional functional summaries often involve embedding these summaries into finite-dimensional subspaces and applying traditional DP techniques. Such mechanisms generally treat each dimension uniformly and struggle with complex, structured summaries. This work introduces a novel mechanism for DP functional summary release:…

    Submitted 3 March, 2024; v1 submitted 31 August, 2023; originally announced September 2023.

  14. arXiv:2308.00251  [pdf, other]

    stat.ML cs.LG stat.CO stat.ME

    Best-Subset Selection in Generalized Linear Models: A Fast and Consistent Algorithm via Splicing Technique

    Authors: Junxian Zhu, ** Zhu, Borui Tang, Xuanyu Chen, Hongmei Lin, Xueqin Wang

    Abstract: In high-dimensional generalized linear models, it is crucial to identify a sparse model that adequately accounts for response variation. Although the best subset section has been widely regarded as the Holy Grail of problems of this type, achieving either computational efficiency or statistical guarantees is challenging. In this article, we intend to surmount this obstacle by utilizing a fast algo…

    Submitted 31 July, 2023; originally announced August 2023.

  15. arXiv:2211.12620  [pdf, other]

    cs.LG cs.AI stat.ML

    Promises and Pitfalls of Threshold-based Auto-labeling

    Authors: Harit Vishwakarma, Heguang Lin, Frederic Sala, Ramya Korlakai Vinayak

    Abstract: Creating large-scale high-quality labeled datasets is a major bottleneck in supervised machine learning workflows. Threshold-based auto-labeling (TBAL), where validation data obtained from humans is used to find a confidence threshold above which the data is machine-labeled, reduces reliance on manual annotation. TBAL is emerging as a widely-used solution in practice. Given the long shelf-life and…

    Submitted 21 February, 2024; v1 submitted 22 November, 2022; originally announced November 2022.

    Comments: NeurIPS 2023 (Spotlight)

    Journal ref: Thirty Seventh Conference on Neural Information Processing Systems (NeurIPS 2023)

  16. arXiv:2211.12012  [pdf, other]

    stat.ME math.ST

    Factor-guided functional PCA for high-dimensional functional data

    Authors: Shoudao Wen, Huazhen Lin

    Abstract: The literature on high-dimensional functional data focuses on either the dependence over time or the correlation among functional variables. In this paper, we propose a factor-guided functional principal component analysis (FaFPCA) method to consider both temporal dependence and correlation of variables so that the extracted features are as sufficient as possible. In particular, we use a factor pr…

    Submitted 22 November, 2022; originally announced November 2022.

    Comments: 34 pages, 5 figures, 3 tables

  17. arXiv:2207.09081  [pdf, other]

    cs.LG cs.AI cs.RO stat.ME

    Generalizing Goal-Conditioned Reinforcement Learning with Variational Causal Reasoning

    Authors: Wenhao Ding, Haohong Lin, Bo Li, Ding Zhao

    Abstract: As a pivotal component to attaining generalizable solutions in human intelligence, reasoning provides great potential for reinforcement learning (RL) agents' generalization towards varied goals by summarizing part-to-whole arguments and discovering cause-and-effect relations. However, how to discover and represent causalities remains a huge gap that hinders the development of causal RL. In this pa…

    Submitted 17 May, 2023; v1 submitted 19 July, 2022; originally announced July 2022.

    Comments: Accepted to NeurIPS 2022

  18. arXiv:2206.04277  [pdf, other]

    stat.ML cs.LG

    On Hypothesis Transfer Learning of Functional Linear Models

    Authors: Haotian Lin, Matthew Reimherr

    Abstract: We study the transfer learning (TL) for the functional linear regression (FLR) under the Reproducing Kernel Hilbert Space (RKHS) framework, observing the TL techniques in existing high-dimensional linear regression is not compatible with the truncation-based FLR methods as functional data are intrinsically infinite-dimensional and generated by smooth underlying processes. We measure the similarity…

    Submitted 22 February, 2024; v1 submitted 9 June, 2022; originally announced June 2022.

    Comments: The results are extended to functional GLM

  19. arXiv:2202.08180  [pdf, other]

    stat.ML cs.AI cs.IT cs.LG

    Geometry of the Minimum Volume Confidence Sets

    Authors: Heguang Lin, Mengze Li, Daniel Pimentel-Alarcón, Matthew Malloy

    Abstract: Computation of confidence sets is central to data science and machine learning, serving as the workhorse of A/B testing and underpinning the operation and analysis of reinforcement learning algorithms. This paper studies the geometry of the minimum-volume confidence sets for the multinomial parameter. When used in place of more standard confidence sets and intervals based on bounds and asymptotic…

    Submitted 16 February, 2022; originally announced February 2022.

  20. arXiv:2110.09823  [pdf, other]

    cs.LG stat.AP stat.ME

    An Empirical Study: Extensive Deep Temporal Point Process

    Authors: Haitao Lin, Cheng Tan, Lirong Wu, Zhangyang Gao, Stan. Z. Li

    Abstract: Temporal point process as the stochastic process on continuous domain of time is commonly used to model the asynchronous event sequence featuring with occurrence timestamps. Thanks to the strong expressivity of deep neural networks, they are emerging as a promising choice for capturing the patterns in asynchronous sequences, in the context of temporal point process. In this paper, we first review…

    Submitted 21 December, 2021; v1 submitted 19 October, 2021; originally announced October 2021.

    Comments: 22 pages, 8 figures

  21. arXiv:2110.04367  [pdf, other]

    cs.LG stat.ML

    Hybrid Random Features

    Authors: Krzysztof Choromanski, Haoxian Chen, Han Lin, Yuanzhe Ma, Arijit Sehanobish, Deepali Jain, Michael S Ryoo, Jake Varley, Andy Zeng, Valerii Likhosherstov, Dmitry Kalashnikov, Vikas Sindhwani, Adrian Weller

    Abstract: We propose a new class of random feature methods for linearizing softmax and Gaussian kernels called hybrid random features (HRFs) that automatically adapt the quality of kernel estimation to provide most accurate approximation in the defined regions of interest. Special instantiations of HRFs lead to well-known methods such as trigonometric (Rahimi and Recht, 2007) or (recently introduced in the…

    Submitted 30 January, 2022; v1 submitted 8 October, 2021; originally announced October 2021.

    Comments: Published as a conference paper at ICLR 2022

  22. arXiv:2105.07829  [pdf, other]

    cs.DC cs.LG stat.ML

    Compressed Communication for Distributed Training: Adaptive Methods and System

    Authors: Yuchen Zhong, Cong Xie, Shuai Zheng, Haibin Lin

    Abstract: Communication overhead severely hinders the scalability of distributed machine learning systems. Recently, there has been a growing interest in using gradient compression to reduce the communication overhead of the distributed training. However, there is little understanding of applying gradient compression to adaptive gradient methods. Moreover, its performance benefits are often limited by the n…

    Submitted 17 May, 2021; originally announced May 2021.

  23. arXiv:2105.05555  [pdf, ps, other]

    cs.LG cs.DS math.ST stat.ML

    Robust Learning of Fixed-Structure Bayesian Networks in Nearly-Linear Time

    Authors: Yu Cheng, Honghao Lin

    Abstract: We study the problem of learning Bayesian networks where an $ε$-fraction of the samples are adversarially corrupted. We focus on the fully-observable case where the underlying graph structure is known. In this work, we present the first nearly-linear time algorithm for this problem with a dimension-independent error guarantee. Previous robust algorithms with comparable error guarantees are slower…

    Submitted 12 May, 2021; originally announced May 2021.

  24. arXiv:2012.11100  [pdf, other]

    stat.ME

    Two-directional simultaneous inference for high-dimensional models

    Authors: Wei Liu, Huazhen Lin, ** Liu, Shurong Zheng

    Abstract: This paper proposes a general two directional simultaneous inference (TOSI) framework for high-dimensional models with a manifest variable or latent variable structure, for example, high-dimensional mean models, high-dimensional sparse regression models, and high-dimensional latent factors models. TOSI performs simultaneous inference on a set of parameters from two directions, one to test whether…

    Submitted 6 February, 2023; v1 submitted 20 December, 2020; originally announced December 2020.

  25. arXiv:2011.04067  [pdf, ps, other]

    math.ST stat.ME

    Semiparametric regression of mean residual life with censoring and covariate dimension reduction

    Authors: Ge Zhao, Yanyuan Ma, Huazhen Lin, Yi Li

    Abstract: We propose a new class of semiparametric regression models of mean residual life for censored outcome data. The models, which enable us to estimate the expected remaining survival time and generalize commonly used mean residual life models, also conduct covariate dimension reduction. Using the geometric approaches in semiparametrics literature and the martingale properties with survival data, we p…

    Submitted 8 November, 2020; originally announced November 2020.

    Comments: 73 pages, 9 figures

  26. arXiv:2009.11612  [pdf, other]

    cs.LG stat.ML

    Clustering Based on Graph of Density Topology

    Authors: Zhangyang Gao, Haitao Lin, Stan. Z Li

    Abstract: Data clustering with uneven distribution in high level noise is challenging. Currently, HDBSCAN is considered as the SOTA algorithm for this problem. In this paper, we propose a novel clustering algorithm based on what we call graph of density topology (GDT). GDT jointly considers the local and global structures of data samples: firstly forming local clusters based on a density growing process wit…

    Submitted 24 September, 2020; originally announced September 2020.

  27. arXiv:2009.06795  [pdf, other]

    cs.LG stat.ML

    DynamicVAE: Decoupling Reconstruction Error and Disentangled Representation Learning

    Authors: Huajie Shao, Haohong Lin, Qinmin Yang, Shuochao Yao, Han Zhao, Tarek Abdelzaher

    Abstract: This paper challenges the common assumption that the weight $β$, in $β$-VAE, should be larger than $1$ in order to effectively disentangle latent factors. We demonstrate that $β$-VAE, with $β< 1$, can not only attain good disentanglement but also significantly improve reconstruction accuracy via dynamic control. The paper removes the inherent trade-off between reconstruction accuracy and disentang…

    Submitted 30 September, 2020; v1 submitted 14 September, 2020; originally announced September 2020.

  28. arXiv:2007.13221  [pdf, other]

    cs.LG cs.DC stat.ML

    CSER: Communication-efficient SGD with Error Reset

    Authors: Cong Xie, Shuai Zheng, Oluwasanmi Koyejo, Indranil Gupta, Mu Li, Haibin Lin

    Abstract: The scalability of Distributed Stochastic Gradient Descent (SGD) is today limited by communication bottlenecks. We propose a novel SGD variant: Communication-efficient SGD with Error Reset, or CSER. The key idea in CSER is first a new technique called "error reset" that adapts arbitrary compressors for SGD, producing bifurcated local models with periodic reset of resulting local residual errors. S…

    Submitted 4 December, 2020; v1 submitted 26 July, 2020; originally announced July 2020.

  29. arXiv:2007.04387  [pdf, other]

    stat.ME

    Double spike Dirichlet priors for structured weighting

    Authors: Huiming Lin, Meng Li

    Abstract: Assigning weights to a large pool of objects is a fundamental task in a wide variety of applications. In this article, we introduce the concept of structured high-dimensional probability simplexes, in which most components are zero or near zero and the remaining ones are close to each other. Such structure is well motivated by (i) high-dimensional weights that are common in modern applications, an…

    Submitted 16 September, 2022; v1 submitted 8 July, 2020; originally announced July 2020.

  30. arXiv:2007.02235  [pdf, other]

    cs.LG stat.ML

    Unbiased Risk Estimators Can Mislead: A Case Study of Learning with Complementary Labels

    Authors: Yu-Ting Chou, Gang Niu, Hsuan-Tien Lin, Masashi Sugiyama

    Abstract: In weakly supervised learning, unbiased risk estimator(URE) is a powerful tool for training classifiers when training and test data are drawn from different distributions. Nevertheless, UREs lead to overfitting in many problem settings when the models are complex like deep networks. In this paper, we investigate reasons for such overfitting by studying a weakly supervised problem called learning w…

    Submitted 21 August, 2020; v1 submitted 5 July, 2020; originally announced July 2020.

    Comments: Accepted at ICML 2020

  31. arXiv:2006.13484  [pdf, other]

    cs.LG cs.CL cs.DC stat.ML

    Accelerated Large Batch Optimization of BERT Pretraining in 54 minutes

    Authors: Shuai Zheng, Haibin Lin, Sheng Zha, Mu Li

    Abstract: BERT has recently attracted a lot of attention in natural language understanding (NLU) and achieved state-of-the-art results in various NLU tasks. However, its success requires large deep neural networks and huge amount of data, which result in long training time and impede development progress. Using stochastic gradient methods with large mini-batch has been advocated as an efficient tool to redu…

    Submitted 18 September, 2020; v1 submitted 24 June, 2020; originally announced June 2020.

    Comments: Technical Report (not under reviewed in any venue)

  32. arXiv:2005.13590  [pdf, other]

    cs.LG stat.ML

    Demystifying Orthogonal Monte Carlo and Beyond

    Authors: Han Lin, Haoxian Chen, Tianyi Zhang, Clement Laroche, Krzysztof Choromanski

    Abstract: Orthogonal Monte Carlo (OMC) is a very effective sampling algorithm imposing structural geometric conditions (orthogonality) on samples for variance reduction. Due to its simplicity and superior performance as compared to its Quasi Monte Carlo counterparts, OMC is used in a wide spectrum of challenging machine learning applications ranging from scalable kernel methods to predictive recurrent neura…

    Submitted 27 May, 2020; originally announced May 2020.

    Comments: 22 pages, 4 figures

  33. arXiv:2005.09159  [pdf, other]

    cs.CV stat.ML

    Sketch-BERT: Learning Sketch Bidirectional Encoder Representation from Transformers by Self-supervised Learning of Sketch Gestalt

    Authors: Hangyu Lin, Yanwei Fu, Yu-Gang Jiang, Xiangyang Xue

    Abstract: Previous researches of sketches often considered sketches in pixel format and leveraged CNN based models in the sketch understanding. Fundamentally, a sketch is stored as a sequence of data points, a vector format representation, rather than the photo-realistic image of pixels. SketchRNN studied a generative neural representation for sketches of vector format by Long Short Term Memory networks (LS…

    Submitted 18 May, 2020; originally announced May 2020.

    Comments: Accepted to CVPR 2020

  34. arXiv:2002.03273  [pdf, ps, other]

    cs.LG math.OC stat.ML

    On the Complexity of Minimizing Convex Finite Sums Without Using the Indices of the Individual Functions

    Authors: Yossi Arjevani, Amit Daniely, Stefanie Jegelka, Hongzhou Lin

    Abstract: Recent advances in randomized incremental methods for minimizing $L$-smooth $μ$-strongly convex finite sums have culminated in tight complexity of $\tilde{O}((n+\sqrt{n L/μ})\log(1/ε))$ and $O(n+\sqrt{nL/ε})$, where $μ>0$ and $μ=0$, respectively, and $n$ denotes the number of individual functions. Unlike incremental methods, stochastic methods for finite sums do not rely on an explicit knowledge o…

    Submitted 8 February, 2020; originally announced February 2020.

  35. arXiv:2001.09832  [pdf, other]

    cs.LG stat.ML

    Polygames: Improved Zero Learning

    Authors: Tristan Cazenave, Yen-Chi Chen, Guan-Wei Chen, Shi-Yu Chen, Xian-Dong Chiu, Julien Dehos, Maria Elsa, Qucheng Gong, Hengyuan Hu, Vasil Khalidov, Cheng-Ling Li, Hsin-I Lin, Yu-** Lin, Xavier Martinet, Vegard Mella, Jeremy Rapin, Baptiste Roziere, Gabriel Synnaeve, Fabien Teytaud, Olivier Teytaud, Shi-Cheng Ye, Yi-Jun Ye, Shi-Jim Yen, Sergey Zagoruyko

    Abstract: Since DeepMind's AlphaZero, Zero learning quickly became the state-of-the-art method for many board games. It can be improved using a fully convolutional structure (no fully connected layer). Using such an architecture plus global pooling, we can create bots independent of the board size. The training can be made more robust by keeping track of the best checkpoints during the training and by train…

    Submitted 27 January, 2020; originally announced January 2020.

  36. arXiv:2001.04345  [pdf, ps, other]

    cs.IR cs.CL cs.LG stat.ML

    Shareable Representations for Search Query Understanding

    Authors: Mukul Kumar, Youna Hu, Will Headden, Rahul Goutam, Heran Lin, Bing Yin

    Abstract: Understanding search queries is critical for shopping search engines to deliver a satisfying customer experience. Popular shopping search engines receive billions of unique queries yearly, each of which can depict any of hundreds of user preferences or intents. In order to get the right results to customers it must be known queries like "inexpensive prom dresses" are intended to not only surface r…

    Submitted 20 December, 2019; originally announced January 2020.

  37. arXiv:1912.07663  [pdf, other]

    cs.LG cs.AI stat.ML

    Spatial-Temporal Self-Attention Network for Flow Prediction

    Authors: Haoxing Lin, Weijia Jia, Yi** Sun, Yongjian You

    Abstract: Flow prediction (e.g., crowd flow, traffic flow) with features of spatial-temporal is increasingly investigated in AI research field. It is very challenging due to the complicated spatial dependencies between different locations and dynamic temporal dependencies among different time intervals. Although measurements of both dependencies are employed, existing methods suffer from the following two p…

    Submitted 22 December, 2019; v1 submitted 13 December, 2019; originally announced December 2019.

    Comments: 8 pages

  38. arXiv:1911.09030  [pdf, other]

    cs.LG cs.DC stat.ML

    Local AdaAlter: Communication-Efficient Stochastic Gradient Descent with Adaptive Learning Rates

    Authors: Cong Xie, Oluwasanmi Koyejo, Indranil Gupta, Haibin Lin

    Abstract: When scaling distributed training, the communication overhead is often the bottleneck. In this paper, we propose a novel SGD variant with reduced communication and adaptive learning rates. We prove the convergence of the proposed algorithm for smooth but non-convex problems. Empirical results show that the proposed algorithm significantly reduces the communication overhead, which, in turn, reduces…

    Submitted 4 December, 2020; v1 submitted 20 November, 2019; originally announced November 2019.

  39. arXiv:1910.13188  [pdf, other]

    cs.LG stat.ML

    Learning from Label Proportions with Consistency Regularization

    Authors: Kuen-Han Tsai, Hsuan-Tien Lin

    Abstract: The problem of learning from label proportions (LLP) involves training classifiers with weak labels on bags of instances, rather than strong labels on individual instances. The weak labels only contain the label proportion of each bag. The LLP problem is important for many practical applications that only allow label proportions to be collected because of data privacy or annotation cost, and has r…

    Submitted 29 October, 2019; originally announced October 2019.

  40. arXiv:1910.08664  [pdf, ps, other]

    stat.ME

    Latent Variable Model for Multivariate Data with Measure-specific Sample Weights and Its Application in Hospital Compare

    Authors: Chengan Du, Shu-Xia Li, Zhenqiu Lin, Haiqun Lin

    Abstract: We developed a single factor model with measure-specific sample weights for multivariate data with multiple observed indicators clustered within a higher level subject. The factor is therefore a latent variable shared by multiple indicators within a same subject and the sample weights are different across different indicators and different subjects. Even after integrating out the latent variable,…

    Submitted 18 October, 2019; originally announced October 2019.

  41. arXiv:1909.11616  [pdf, other]

    cs.LG stat.ML

    Benchmarking Tropical Cyclone Rapid Intensification with Satellite Images and Attention-based Deep Models

    Authors: Ching-Yuan Bai, Buo-Fu Chen, Hsuan-Tien Lin

    Abstract: Rapid intensification (RI) of tropical cyclones often causes major destruction to human civilization due to short response time. It is an important yet challenging task to accurately predict this kind of extreme weather event in advance. Traditionally, meteorologists tackle the task with human-driven feature extraction and predictor correction procedures. Nevertheless, these procedures do not leve…

    Submitted 24 September, 2020; v1 submitted 25 September, 2019; originally announced September 2019.

    Comments: In Proceedings of the The European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML/PKDD), September 2020

  42. arXiv:1909.10582  [pdf, other]

    stat.ML cs.LG cs.RO eess.SY

    Kalman Filtering with Gaussian Processes Measurement Noise

    Authors: Vince Kurtz, Hai Lin

    Abstract: Real-world measurement noise in applications like robotics is often correlated in time, but we typically assume i.i.d. Gaussian noise for filtering. We propose general Gaussian Processes as a non-parametric model for correlated measurement noise that is flexible enough to accurately reflect correlation in time, yet simple enough to enable efficient computation. We show that this model accurately r…

    Submitted 23 September, 2019; originally announced September 2019.

  43. arXiv:1909.08417  [pdf, other]

    cs.LG cs.CG math.AT stat.ML

    Persistence B-Spline Grids: Stable Vector Representation of Persistence Diagrams Based on Data Fitting

    Authors: Zhetong Dong, Hongwei Lin, Chi Zhou

    Abstract: Many attempts have been made in recent decades to integrate machine learning (ML) and topological data analysis. A prominent problem in applying persistent homology to ML tasks is finding a vector representation of a persistence diagram (PD), which is a summary diagram for representing topological features. From the perspective of data fitting, a stable vector representation, namely, persistence B…

    Submitted 22 April, 2022; v1 submitted 17 September, 2019; originally announced September 2019.

  44. arXiv:1909.04323  [pdf]

    cs.CY stat.OT

    Investigating the completeness and omission roads of OpenStreetMap data in Hubei, China by comparing with Street Map and Street View

    Authors: Qi Zhou, Hao Lin

    Abstract: OpenStreetMap (OSM) is a free map of the world which can be edited by global volunteers. Existing studies have showed that completeness of OSM road data in some developing countries (e.g. China) is much lower, resulting in concern in utilizing the data in various applications. But very few have focused on investigating what types of road are still poorly mapped. This study aims not only to investi…

    Submitted 10 September, 2019; originally announced September 2019.

  45. arXiv:1907.04433  [pdf, other]

    cs.LG cs.CL cs.CV stat.ML

    GluonCV and GluonNLP: Deep Learning in Computer Vision and Natural Language Processing

    Authors: Jian Guo, He He, Tong He, Leonard Lausen, Mu Li, Haibin Lin, Xingjian Shi, Chenguang Wang, Junyuan Xie, Sheng Zha, Aston Zhang, Hang Zhang, Zhi Zhang, Zhongyue Zhang, Shuai Zheng, Yi Zhu

    Abstract: We present GluonCV and GluonNLP, the deep learning toolkits for computer vision and natural language processing based on Apache MXNet (incubating). These toolkits provide state-of-the-art pre-trained models, training scripts, and training logs, to facilitate rapid prototyping and promote reproducible research. We also provide modular APIs with flexible building blocks to enable efficient customiza…

    Submitted 12 February, 2020; v1 submitted 9 July, 2019; originally announced July 2019.

    Journal ref: Journal of Machine Learning Research 21 (2020) 1-7

  46. arXiv:1905.10834  [pdf, other]

    q-bio.NC stat.AP

    ABCD Neurocognitive Prediction Challenge 2019: Predicting individual residual fluid intelligence scores from cortical grey matter morphology

    Authors: Neil P. Oxtoby, Fabio S. Ferreira, Agoston Mihalik, Tong Wu, Mikael Brudfors, Hongxiang Lin, Anita Rau, Stefano B. Blumberg, Maria Robu, Cemre Zor, Maira Tariq, Maria Del Mar Estarellas Garcia, Baris Kanber, Daniil I. Nikitichev, Janaina Mourao-Miranda

    Abstract: We predicted residual fluid intelligence scores from T1-weighted MRI data available as part of the ABCD NP Challenge 2019, using morphological similarity of grey-matter regions across the cortex. Individual structural covariance networks (SCN) were abstracted into graph-theory metrics averaged over nodes across the brain and in data-driven communities/modules. Metrics included degree, path length,…

    Submitted 26 May, 2019; originally announced May 2019.

    Comments: 8 pages plus references, 3 figures, 2 tables. Submission to the ABCD Neurocognitive Prediction Challenge at MICCAI 2019

  47. arXiv:1905.10831  [pdf, other]

    q-bio.NC stat.AP

    ABCD Neurocognitive Prediction Challenge 2019: Predicting individual fluid intelligence scores from structural MRI using probabilistic segmentation and kernel ridge regression

    Authors: Agoston Mihalik, Mikael Brudfors, Maria Robu, Fabio S. Ferreira, Hongxiang Lin, Anita Rau, Tong Wu, Stefano B. Blumberg, Baris Kanber, Maira Tariq, Maria Del Mar Estarellas Garcia, Cemre Zor, Daniil I. Nikitichev, Janaina Mourao-Miranda, Neil P. Oxtoby

    Abstract: We applied several regression and deep learning methods to predict fluid intelligence scores from T1-weighted MRI scans as part of the ABCD Neurocognitive Prediction Challenge (ABCD-NP-Challenge) 2019. We used voxel intensities and probabilistic tissue-type labels derived from these as features to train the models. The best predictive performance (lowest mean-squared error) came from Kernel Ridge…

    Submitted 26 May, 2019; originally announced May 2019.

    Comments: Winning entry in the ABCD Neurocognitive Prediction Challenge at MICCAI 2019. 7 pages plus references, 3 figures, 1 table

  48. arXiv:1904.12043  [pdf, other]

    cs.LG cs.CV cs.DC stat.ML

    Dynamic Mini-batch SGD for Elastic Distributed Training: Learning in the Limbo of Resources

    Authors: Haibin Lin, Hang Zhang, Yifei Ma, Tong He, Zhi Zhang, Sheng Zha, Mu Li

    Abstract: With an increasing demand for training powers for deep learning algorithms and the rapid growth of computation resources in data centers, it is desirable to dynamically schedule different distributed deep learning tasks to maximize resource utilization and reduce cost. In this process, different tasks may receive varying numbers of machines at different time, a setting we call elastic distributed…

    Submitted 2 May, 2019; v1 submitted 26 April, 2019; originally announced April 2019.

  49. arXiv:1904.00284  [pdf, other]

    cs.LG cs.CV stat.ML

    COCO-GAN: Generation by Parts via Conditional Coordinating

    Authors: Chieh Hubert Lin, Chia-Che Chang, Yu-Sheng Chen, Da-Cheng Juan, Wei Wei, Hwann-Tzong Chen

    Abstract: Humans can only interact with part of the surrounding environment due to biological restrictions. Therefore, we learn to reason the spatial relationships across a series of observations to piece together the surrounding environment. Inspired by such behavior and the fact that machines also have computational constraints, we propose \underline{CO}nditional \underline{CO}ordinate GAN (COCO-GAN) of w…

    Submitted 5 January, 2020; v1 submitted 30 March, 2019; originally announced April 2019.

    Comments: Accepted to ICCV'19 (oral). All images are compressed due to size limit, please access the full-resolution version via Google Drive: http://bit.ly/COCO-GAN-full

  50. arXiv:1812.06600  [pdf, other]

    q-fin.TR cs.LG q-fin.CP stat.ML

    Double Deep Q-Learning for Optimal Execution

    Authors: Brian Ning, Franco Ho Ting Lin, Sebastian Jaimungal

    Abstract: Optimal trade execution is an important problem faced by essentially all traders. Much research into optimal execution uses stringent model assumptions and applies continuous time stochastic control to solve them. Here, we instead take a model free approach and develop a variation of Deep Q-Learning to estimate the optimal actions of a trader. The model is a fully connected Neural Network trained…

    Submitted 8 June, 2020; v1 submitted 16 December, 2018; originally announced December 2018.

    Comments: 20 pages, 7 figures, 1 table. Updated minor typos

    MSC Class: 91G99; 93E35