Showing 1–50 of 129 results for author: Chen, B

Searching in archive stat.
  1. arXiv:2406.17278

    stat.ME econ.EM math.ST

    Estimation and Inference for CP Tensor Factor Models

    Authors: Bin Chen, Yuefeng Han, Qiyang Yu

    Abstract: High-dimensional tensor-valued data have recently gained attention from researchers in economics and finance. We consider the estimation and inference of high-dimensional tensor factor models, where each dimension of the tensor diverges. Our focus is on a factor model that admits CP-type tensor decomposition, which allows for non-orthogonal loading vectors. Based on the contemporary covariance mat…

    Submitted 25 June, 2024; originally announced June 2024.

  2. arXiv:2406.09357

    cs.LG stat.ML

    Advancing Graph Generation through Beta Diffusion

    Authors: Yilin He, Xinyang Liu, Bo Chen, Mingyuan Zhou

    Abstract: Diffusion models have demonstrated effectiveness in generating natural images and have been extended to generate diverse data types, including graphs. This new generation of diffusion-based graph generative models has demonstrated significant performance improvements over methods that rely on variational autoencoders or generative adversarial networks. It's important to recognize, however, that mo…

    Submitted 13 June, 2024; originally announced June 2024.

  3. arXiv:2406.02847

    cs.LG stat.ML

    Exact Conversion of In-Context Learning to Model Weights in Linearized-Attention Transformers

    Authors: Brian K Chen, Tianyang Hu, Hui Jin, Hwee Kuan Lee, Kenji Kawaguchi

    Abstract: In-Context Learning (ICL) has been a powerful emergent property of large language models that has attracted increasing attention in recent years. In contrast to regular gradient-based learning, ICL is highly interpretable and does not require parameter updates. In this paper, we show that, for linearized transformer networks, ICL can be made explicit and permanent through the inclusion of bias ter…

    Submitted 6 June, 2024; v1 submitted 4 June, 2024; originally announced June 2024.

    Comments: Accepted to ICML 2024

  4. arXiv:2405.16413

    cs.AI cs.CL cs.LG stat.AP

    Augmented Risk Prediction for the Onset of Alzheimer's Disease from Electronic Health Records with Large Language Models

    Authors: Jiankun Wang, Sumyeong Ahn, Taykhoom Dalal, Xiaodan Zhang, Weishen Pan, Qiannan Zhang, Bin Chen, Hiroko H. Dodge, Fei Wang, Jiayu Zhou

    Abstract: Alzheimer's disease (AD) is the fifth-leading cause of death among Americans aged 65 and older. Screening and early detection of AD and related dementias (ADRD) are critical for timely intervention and for identifying clinical trial participants. The widespread adoption of electronic health records (EHRs) offers an important resource for developing ADRD screening tools such as machine learning bas…

    Submitted 25 May, 2024; originally announced May 2024.

  5. arXiv:2404.03900

    stat.ML cs.AI cs.LG cs.NE

    Nonparametric Modern Hopfield Models

    Authors: Jerry Yao-Chieh Hu, Bo-Yu Chen, Dennis Wu, Feng Ruan, Han Liu

    Abstract: We present a nonparametric construction for deep learning compatible modern Hopfield models and utilize this framework to debut an efficient variant. Our key contribution stems from interpreting the memory storage and retrieval processes in modern Hopfield models as a nonparametric regression problem subject to a set of query-memory pairs. Crucially, our framework not only recovers the known resul…

    Submitted 5 April, 2024; originally announced April 2024.

    Comments: 59 pages; Code available at https://github.com/MAGICS-LAB/NonparametricHopfield

  6. arXiv:2404.01546

    stat.ME

    Time-Varying Matrix Factor Models

    Authors: Bin Chen, Elynn Y. Chen, Stevenson Bolivar, Rong Chen

    Abstract: Matrix-variate data of high dimensions are frequently observed in finance and economics, spanning extended time periods, such as the long-term data on international trade flows among numerous countries. To address potential structural shifts and explore the matrix structure's informational context, we propose a time-varying matrix factor model. This model accommodates changing factor loadings over…

    Submitted 1 April, 2024; originally announced April 2024.

  7. arXiv:2312.17346

    cs.LG cs.AI cs.CV cs.NE stat.ML

    STanHop: Sparse Tandem Hopfield Model for Memory-Enhanced Time Series Prediction

    Authors: Dennis Wu, Jerry Yao-Chieh Hu, Weijian Li, Bo-Yu Chen, Han Liu

    Abstract: We present STanHop-Net (Sparse Tandem Hopfield Network) for multivariate time series prediction with memory-enhanced capabilities. At the heart of our approach is STanHop, a novel Hopfield-based neural network block, which sparsely learns and stores both temporal and cross-series representations in a data-dependent fashion. In essence, STanHop sequentially learns temporal representation and cross-s…

    Submitted 28 December, 2023; originally announced December 2023.

  8. arXiv:2311.14910

    math.DS cs.LG stat.ML

    A latent linear model for nonlinear coupled oscillators on graphs

    Authors: Agam Goyal, Zhaoxing Wu, Richard P. Yim, Binhao Chen, Zihong Xu, Hanbaek Lyu

    Abstract: A system of coupled oscillators on an arbitrary graph is locally driven by the tendency to mutual synchronization between nearby oscillators, but can, and often does, exhibit nonlinear behavior on the whole graph. Understanding such nonlinear behavior has been a key challenge in predicting whether all oscillators in such a system will eventually synchronize. In this paper, we demonstrate that, surprising…

    Submitted 24 November, 2023; originally announced November 2023.

    Comments: 23 pages, 14 figures

  9. arXiv:2310.13769

    q-bio.QM stat.ML

    Compositional Deep Probabilistic Models of DNA Encoded Libraries

    Authors: Benson Chen, Mohammad M. Sultan, Theofanis Karaletsos

    Abstract: DNA-Encoded Library (DEL) has proven to be a powerful tool that utilizes combinatorially constructed small molecules to facilitate highly-efficient screening assays. These selection experiments, involving multiple stages of washing, elution, and identification of potent binders via unique DNA barcodes, often generate complex data. This complexity can potentially mask the underlying signals, necess…

    Submitted 13 February, 2024; v1 submitted 20 October, 2023; originally announced October 2023.

  10. arXiv:2309.12673

    cs.LG cs.AI cs.CV stat.ML

    On Sparse Modern Hopfield Model

    Authors: Jerry Yao-Chieh Hu, Donglin Yang, Dennis Wu, Chenwei Xu, Bo-Yu Chen, Han Liu

    Abstract: We introduce the sparse modern Hopfield model as a sparse extension of the modern Hopfield model. Like its dense counterpart, the sparse modern Hopfield model equips a memory-retrieval dynamics whose one-step approximation corresponds to the sparse attention mechanism. Theoretically, our key contribution is a principled derivation of a closed-form sparse Hopfield energy using the convex conjugate…

    Submitted 29 November, 2023; v1 submitted 22 September, 2023; originally announced September 2023.

    Comments: 37 pages, accepted at NeurIPS 2023. [v2] updated to match with camera-ready version. Code is available at https://github.com/MAGICS-LAB/SparseModernHopfield

  11. arXiv:2307.13290

    stat.ML cs.LG math.OC

    Modify Training Directions in Function Space to Reduce Generalization Error

    Authors: Yi Yu, Wenlian Lu, Boyu Chen

    Abstract: We propose theoretical analyses of a modified natural gradient descent method in the neural network function space based on the eigendecompositions of the neural tangent kernel and the Fisher information matrix. We first present an analytical expression for the function learned by this modified natural gradient under the assumptions of Gaussian distribution and infinite width limit. Thus, we explicitly der…

    Submitted 25 July, 2023; originally announced July 2023.

  12. arXiv:2303.15464

    cs.LG cs.AI math.ST stat.ML

    Mathematical Challenges in Deep Learning

    Authors: Vahid Partovi Nia, Guojun Zhang, Ivan Kobyzev, Michael R. Metel, Xinlin Li, Ke Sun, Sobhan Hemati, Masoud Asgharian, Linglong Kong, Wulong Liu, Boxing Chen

    Abstract: Deep models have dominated the artificial intelligence (AI) industry since the ImageNet challenge in 2012. The size of deep models has been increasing ever since, which brings new challenges to this field with applications in cell phones, personal computers, autonomous cars, and wireless base stations. Here we list a set of problems, ranging from training, inference, generalization bound, and optimizati…

    Submitted 24 March, 2023; originally announced March 2023.

  13. arXiv:2302.13059

    stat.ME

    Intrinsic minimum average variance estimation for sufficient dimension reduction with symmetric positive definite matrices and beyond

    Authors: B. Chen, S. Dai, Z. Yu

    Abstract: In this paper, we target the problem of sufficient dimension reduction with symmetric positive definite matrix-valued responses. We propose the intrinsic minimum average variance estimation method and the intrinsic outer product gradient method which fully exploit the geometric structure of the Riemannian manifold where responses lie. We present the algorithms for our newly developed methods und…

    Submitted 25 February, 2023; originally announced February 2023.

    Comments: 35 pages, 4 tables, 2 figures

  14. Adaptive sparseness for correntropy-based robust regression via automatic relevance determination

    Authors: Yuanhao Li, Badong Chen, Okito Yamashita, Natsue Yoshimura, Yasuharu Koike

    Abstract: Sparseness and robustness are two important properties for many machine learning scenarios. In the present study, regarding the maximum correntropy criterion (MCC) based robust regression algorithm, we investigate integrating the MCC method with the automatic relevance determination (ARD) technique in a Bayesian framework, so that MCC-based robust regression could be implemented with adaptive spa…

    Submitted 31 January, 2023; originally announced February 2023.

    Journal ref: 2023 International Joint Conference on Neural Networks (IJCNN)

  15. arXiv:2301.01642

    stat.ML cs.LG q-bio.NC

    CI-GNN: A Granger Causality-Inspired Graph Neural Network for Interpretable Brain Network-Based Psychiatric Diagnosis

    Authors: Kaizhong Zheng, Shujian Yu, Badong Chen

    Abstract: There is a recent trend to leverage the power of graph neural networks (GNNs) for brain-network based psychiatric diagnosis, which, in turn, also motivates an urgent need for psychiatrists to fully understand the decision behavior of the used GNNs. However, most of the existing GNN explainers are either post-hoc in which another interpretive model needs to be created to explain a well-trained GNN,…

    Submitted 28 January, 2024; v1 submitted 4 January, 2023; originally announced January 2023.

    Comments: Manuscript is accepted by Neural Networks. The source code and implementation details are freely available at GitHub repository (https://github.com/ZKZ-Brain/CI-GNN/). 45 pages, 14 figures

  16. arXiv:2210.08231

    stat.ME

    Assessing Spatial Stationarity and Segmenting Spatial Processes into Stationary Components

    Authors: ShengLi Tzeng, Bo-Yu Chen, Hsin-Cheng Huang

    Abstract: In this research, we propose a novel technique for visualizing nonstationarity in geostatistics, particularly when confronted with a single realization of data at irregularly spaced locations. Our method hinges on formulating a statistic that tracks a stable microergodic parameter of the exponential covariance function, allowing us to address the intricate challenges of nonstationary processes tha…

    Submitted 28 August, 2023; v1 submitted 15 October, 2022; originally announced October 2022.

    Comments: 25 pages, 5 tables, 11 figures

    MSC Class: 62M30

  17. arXiv:2207.00818

    cs.LG math.DG math.ST stat.CO

    Geometric Learning of Hidden Markov Models via a Method of Moments Algorithm

    Authors: Berlin Chen, Cyrus Mostajeran, Salem Said

    Abstract: We present a novel algorithm for learning the parameters of hidden Markov models (HMMs) in a geometric setting where the observations take values in Riemannian manifolds. In particular, we elevate a recent second-order method of moments algorithm that incorporates non-consecutive correlations to a more general setting where observations take place in a Riemannian symmetric space of non-positive cu…

    Submitted 2 July, 2022; originally announced July 2022.

  18. arXiv:2203.01570

    cs.LG stat.ME stat.ML

    Representing Mixtures of Word Embeddings with Mixtures of Topic Embeddings

    Authors: Dongsheng Wang, Dandan Guo, He Zhao, Huangjie Zheng, Korawat Tanwisuth, Bo Chen, Mingyuan Zhou

    Abstract: A topic model is often formulated as a generative model that explains how each word of a document is generated given a set of topics and document-specific topic proportions. It is focused on capturing the word co-occurrences in a document and hence often suffers from poor performance in analyzing short documents. In addition, its parameter estimation often relies on approximate posterior inference…

    Submitted 14 March, 2022; v1 submitted 3 March, 2022; originally announced March 2022.

    Comments: Proceedings of ICLR, 2022

  19. arXiv:2202.11778

    stat.ME stat.AP stat.CO

    baker: An R package for Nested Partially-Latent Class Models

    Authors: Irena B Chen, Qiyuan Shi, Scott L Zeger, Zhenke Wu

    Abstract: This paper describes and illustrates the functionality of the baker R package. The package estimates a suite of nested partially-latent class models (NPLCM) for multivariate binary responses that are observed under a case-control design. The baker package allows researchers to flexibly estimate population-level class prevalences and posterior probabilities of class membership for individual cases.…

    Submitted 23 February, 2022; originally announced February 2022.

    Comments: 30 pages

  20. arXiv:2202.03233

    stat.ML cs.LG stat.CO stat.ME

    A Variational Edge Partition Model for Supervised Graph Representation Learning

    Authors: Yilin He, Chaojie Wang, Hao Zhang, Bo Chen, Mingyuan Zhou

    Abstract: Graph neural networks (GNNs), which propagate the node features through the edges and learn how to transform the aggregated features under label supervision, have achieved great success in supervised feature extraction for both node-level and graph-level classification tasks. However, GNNs typically treat the graph structure as given and ignore how the edges are formed. This paper introduces a gra…

    Submitted 31 October, 2022; v1 submitted 7 February, 2022; originally announced February 2022.

    Comments: 10 pages, 5 figures, 14 pages of appendix, accepted to NeurIPS 2022

  21. arXiv:2110.12024

    cs.LG cs.CV stat.ML

    A Prototype-Oriented Framework for Unsupervised Domain Adaptation

    Authors: Korawat Tanwisuth, Xinjie Fan, Huangjie Zheng, Shujian Zhang, Hao Zhang, Bo Chen, Mingyuan Zhou

    Abstract: Existing methods for unsupervised domain adaptation often rely on minimizing some statistical distance between the source and target samples in the latent space. To avoid the sampling variability, class imbalance, and data-privacy concerns that often plague these methods, we instead provide a memory and computation-efficient probabilistic framework to extract class prototypes and align the target…

    Submitted 22 October, 2021; originally announced October 2021.

    Comments: NeurIPS 2021

  22. arXiv:2106.05251

    cs.LG cs.CL stat.ML

    Bayesian Attention Belief Networks

    Authors: Shujian Zhang, Xinjie Fan, Bo Chen, Mingyuan Zhou

    Abstract: Attention-based neural networks have achieved state-of-the-art results on a wide range of tasks. Most such models use deterministic attention while stochastic attention is less explored due to the optimization difficulties or complicated model design. This paper introduces Bayesian attention belief networks, which construct a decoder network by modeling unnormalized attention weights with a hierar…

    Submitted 9 June, 2021; originally announced June 2021.

    Comments: ICML 2021

  23. arXiv:2105.14367

    cs.LG stat.ML

    Deconvolutional Density Network: Modeling Free-Form Conditional Distributions

    Authors: Bing Chen, Mazharul Islam, Jisuo Gao, Lin Wang

    Abstract: Conditional density estimation (CDE) is the task of estimating the probability of an event conditioned on some inputs. A neural network (NN) can also be used to compute the output distribution for a continuous domain, which can be viewed as an extension of the regression task. Nevertheless, it is difficult to explicitly approximate a distribution without knowing the information of its general form a pri…

    Submitted 28 December, 2021; v1 submitted 29 May, 2021; originally announced May 2021.

    Comments: 10 pages, 5 figures, 2 tables

  24. arXiv:2105.08677

    stat.ME

    Maximum profile binomial likelihood estimation for the semiparametric Box-Cox power transformation model

    Authors: Pengfei Li, Tao Yu, Baojiang Chen, Jing Qin

    Abstract: The Box-Cox transformation model has been widely applied for many years. The parametric version of this model assumes that the random error follows a parametric distribution, say the normal distribution, and estimates the model parameters using the maximum likelihood method. The semiparametric version assumes that the distribution of the random error is completely unknown; existing methods either…

    Submitted 18 May, 2021; v1 submitted 18 May, 2021; originally announced May 2021.

    Comments: 70 pages, 1 figure

  25. Matching Visual Features to Hierarchical Semantic Topics for Image Paragraph Captioning

    Authors: Dandan Guo, Ruiying Lu, Bo Chen, Zequn Zeng, Mingyuan Zhou

    Abstract: Observing a set of images and their corresponding paragraph-captions, a challenging task is to learn how to produce a semantically coherent paragraph to describe the visual content of an image. Inspired by recent successes in integrating semantic topics into this task, this paper develops a plug-and-play hierarchical-topic-guided image paragraph generation framework, which couples a visual extract…

    Submitted 25 July, 2022; v1 submitted 10 May, 2021; originally announced May 2021.

  26. arXiv:2104.14379

    cs.LG stat.ML

    Learning Robust Variational Information Bottleneck with Reference

    Authors: Weizhu Qian, Bowei Chen, Xiaowei Huang

    Abstract: We propose a new approach to train a variational information bottleneck (VIB) that improves its robustness to adversarial perturbations. Unlike the traditional methods where the hard labels are usually used for the classification task, we refine the categorical class information in the training phase with soft labels which are obtained from a pre-trained reference neural network and can reflect th…

    Submitted 29 April, 2021; originally announced April 2021.

    Comments: 8 pages, 5 figures

  27. arXiv:2103.13435

    stat.ME math.ST

    Maximum pairwise-rank-likelihood-based inference for the semiparametric transformation model

    Authors: Tao Yu, Pengfei Li, Baojiang Chen, Ao Yuan, Jing Qin

    Abstract: In this paper, we study the linear transformation model in the most general setup. This model includes many important and popular models in statistics and econometrics as special cases. Although it has been studied for many years, the methods in the literature are based on kernel-smoothing techniques or make use of only the ranks of the responses in the estimation of the parametric components. The…

    Submitted 24 March, 2021; originally announced March 2021.

    Comments: 6 tables and 2 figures

    MSC Class: 62J02; 62J05; 62J86

  28. arXiv:2010.12648

    cs.LG stat.ML

    An Investigation of how Label Smoothing Affects Generalization

    Authors: Blair Chen, Liu Ziyin, Zihao Wang, Paul Pu Liang

    Abstract: It has been hypothesized that label smoothing can reduce overfitting and improve generalization, and current empirical evidence seems to corroborate these effects. However, there is a lack of mathematical understanding of when and why such empirical improvements occur. In this paper, as a step towards understanding why label smoothing is effective, we propose a theoretical framework to show how la…

    Submitted 23 October, 2020; originally announced October 2020.

  29. arXiv:2010.10604

    stat.ML cs.LG cs.NE

    Bayesian Attention Modules

    Authors: Xinjie Fan, Shujian Zhang, Bo Chen, Mingyuan Zhou

    Abstract: Attention modules, as simple and effective tools, have not only enabled deep neural networks to achieve state-of-the-art results in many domains, but also enhanced their interpretability. Most current models use deterministic attention modules due to their simplicity and ease of optimization. Stochastic counterparts, on the other hand, are less popular despite their potential benefits. The main re…

    Submitted 20 October, 2020; originally announced October 2020.

  30. Variational Temporal Deep Generative Model for Radar HRRP Target Recognition

    Authors: Dandan Guo, Bo Chen, Wenchao Chen, Chaojie Wang, Hongwei Liu, Mingyuan Zhou

    Abstract: We develop a recurrent gamma belief network (rGBN) for radar automatic target recognition (RATR) based on high-resolution range profile (HRRP), which characterizes the temporal dependence across the range cells of HRRP. The proposed rGBN adopts a hierarchy of gamma distributions to build its temporal deep generative model. For scalable training and fast out-of-sample prediction, we propose the hyb…

    Submitted 27 September, 2020; originally announced September 2020.

  31. arXiv:2009.08311

    cs.LG cs.RO stat.ML

    Multimodal Safety-Critical Scenarios Generation for Decision-Making Algorithms Evaluation

    Authors: Wenhao Ding, Baiming Chen, Bo Li, Kim Ji Eun, Ding Zhao

    Abstract: Existing neural network-based autonomous systems are shown to be vulnerable to adversarial attacks; therefore, sophisticated evaluation of their robustness is of great importance. However, evaluating the robustness only under the worst-case scenarios based on known attacks is not comprehensive, not to mention that some of them even rarely occur in the real world. In addition, the distribution…

    Submitted 26 December, 2020; v1 submitted 16 September, 2020; originally announced September 2020.

    Comments: 8 pages, 7 figures

  32. arXiv:2009.03259

    cs.LG cs.CV cs.GR stat.ML

    Implicit Multidimensional Projection of Local Subspaces

    Authors: Rongzheng Bian, Yumeng Xue, Liang Zhou, Jian Zhang, Baoquan Chen, Daniel Weiskopf, Yunhai Wang

    Abstract: We propose a visualization method to understand the effect of multidimensional projection on local subspaces, using implicit function differentiation. Here, we understand the local subspace as the multidimensional local neighborhood of data points. Existing methods focus on the projection of multidimensional data points, and the neighborhood information is ignored. Our method is able to analyze th…

    Submitted 20 July, 2023; v1 submitted 7 September, 2020; originally announced September 2020.

    Journal ref: in IEEE Transactions on Visualization and Computer Graphics, vol. 27, no. 2, pp. 1558-1568, Feb. 2021

  33. arXiv:2008.13225

    cs.LG cs.AI cs.DC cs.IR stat.ML

    SOLAR: Sparse Orthogonal Learned and Random Embeddings

    Authors: Tharun Medini, Beidi Chen, Anshumali Shrivastava

    Abstract: Dense embedding models are commonly deployed in commercial search engines, wherein all the document vectors are pre-computed, and near-neighbor search (NNS) is performed with the query vector to find relevant documents. However, the bottleneck of indexing a large number of dense vectors and performing an NNS hurts the query time and accuracy of these models. In this paper, we argue that high-dimen…

    Submitted 30 August, 2020; originally announced August 2020.

    Comments: Under review at NeurIPS 2020

  34. Causal mediation analysis decomposition of between-hospital variance

    Authors: Bo Chen, Keith A. Lawson, Antonio Finelli, Olli Saarela

    Abstract: Causal variance decompositions for a given disease-specific quality indicator can be used to quantify differences in performance between hospitals or health care providers. While variance decompositions can demonstrate variation in quality of care, causal mediation analysis can be used to study care pathways leading to the differences in performance between the institutions. This raises the questi…

    Submitted 24 January, 2023; v1 submitted 28 August, 2020; originally announced August 2020.

    Journal ref: Health Services and Outcomes Research Methodology volume 22, pages 118-144 (2022)

  35. arXiv:2008.06120

    cs.LG cs.CV stat.ML

    Can weight sharing outperform random architecture search? An investigation with TuNAS

    Authors: Gabriel Bender, Hanxiao Liu, Bo Chen, Grace Chu, Shuyang Cheng, Pieter-Jan Kindermans, Quoc Le

    Abstract: Efficient Neural Architecture Search methods based on weight sharing have shown good promise in democratizing Neural Architecture Search for computer vision models. There is, however, an ongoing debate whether these efficient methods are significantly better than random search. Here we perform a thorough comparison between efficient and random search methods on a family of progressively larger and…

    Submitted 13 August, 2020; originally announced August 2020.

    Comments: Published at CVPR 2020

    ACM Class: I.2.10

    Journal ref: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2020, pp. 14323-14332

  36. arXiv:2007.00339

    cs.LG stat.ML

    Multi-Task Variational Information Bottleneck

    Authors: Weizhu Qian, Bowei Chen, Yichao Zhang, Guanghui Wen, Franck Gechter

    Abstract: Multi-task learning (MTL) is an important subject in machine learning and artificial intelligence. Its applications to computer vision, signal processing, and speech recognition are ubiquitous. Although this subject has attracted considerable attention recently, the performance and robustness of the existing models to different tasks have not been well balanced. This article proposes an MTL model…

    Submitted 1 March, 2021; v1 submitted 1 July, 2020; originally announced July 2020.

    Comments: 10 pages

  37. arXiv:2006.15820

    cs.LG cs.AI stat.ML

    Retro*: Learning Retrosynthetic Planning with Neural Guided A* Search

    Authors: Binghong Chen, Chengtao Li, Hanjun Dai, Le Song

    Abstract: Retrosynthetic planning is a critical task in organic chemistry which identifies a series of reactions that can lead to the synthesis of a target product. The vast number of possible chemical transformations makes the size of the search space very big, and retrosynthetic planning is challenging even for experienced chemists. However, existing methods either require expensive return estimation by r…

    Submitted 29 June, 2020; originally announced June 2020.

    Comments: Presented at ICML 2020

  38. arXiv:2006.11441

    cs.LG cs.AI cs.RO stat.ML

    Task-Agnostic Online Reinforcement Learning with an Infinite Mixture of Gaussian Processes

    Authors: Mengdi Xu, Wenhao Ding, Jiacheng Zhu, Zuxin Liu, Baiming Chen, Ding Zhao

    Abstract: Continuously learning to solve unseen tasks with limited experience has been extensively pursued in meta-learning and continual learning, but with restricted assumptions such as accessible task distributions, independently and identically distributed tasks, and clear task delineations. However, real-world physical tasks frequently violate these assumptions, resulting in performance degradation. Th…

    Submitted 30 November, 2020; v1 submitted 19 June, 2020; originally announced June 2020.

    Comments: 16 pages, 6 figures

  39. arXiv:2006.08804

    cs.LG stat.AP stat.CO stat.ML

    Deep Autoencoding Topic Model with Scalable Hybrid Bayesian Inference

    Authors: Hao Zhang, Bo Chen, Yulai Cong, Dandan Guo, Hongwei Liu, Mingyuan Zhou

    Abstract: To build a flexible and interpretable model for document analysis, we develop a deep autoencoding topic model (DATM) that uses a hierarchy of gamma distributions to construct its multi-stochastic-layer generative network. In order to provide scalable posterior inference for the parameters of the generative network, we develop topic-layer-adaptive stochastic gradient Riemannian MCMC that jointly lear…

    Submitted 15 June, 2020; originally announced June 2020.

    Comments: To appear in IEEE Transactions on Pattern Analysis and Machine Intelligence. arXiv admin note: text overlap with arXiv:1803.01328

  40. arXiv:2006.04804

    stat.ML cs.LG

    Optimal Transport Graph Neural Networks

    Authors: Benson Chen, Gary Bécigneul, Octavian-Eugen Ganea, Regina Barzilay, Tommi Jaakkola

    Abstract: Current graph neural network (GNN) architectures naively average or sum node embeddings into an aggregated graph representation, potentially losing structural or semantic information. We here introduce OT-GNN, a model that computes graph embeddings using parametric prototypes that highlight key facets of different graph aspects. Towards this goal, we successfully combine optimal transport (OT) w…

    Submitted 8 October, 2021; v1 submitted 8 June, 2020; originally announced June 2020.

  41. Hierarchical causal variance decomposition for institution and provider comparisons in healthcare

    Authors: Bo Chen, Olli Saarela

    Abstract: Disease-specific quality indicators (QIs) are used to compare institutions and health care providers in terms of processes or outcomes relevant to treatment of a particular condition. In the context of surgical cancer treatments, the performance variations can be due to hospital and/or surgeon level differences, creating a hierarchical clustering. We consider how the observed variation in care receiv…

    Submitted 21 May, 2020; v1 submitted 14 May, 2020; originally announced May 2020.

    Journal ref: Health Services and Outcomes Research Methodology (2023)

  42. arXiv:2005.05441

    cs.LG cs.MA stat.ML

    Delay-Aware Multi-Agent Reinforcement Learning for Cooperative and Competitive Environments

    Authors: Baiming Chen, Mengdi Xu, Zuxin Liu, Liang Li, Ding Zhao

    Abstract: Action and observation delays are prevalent in real-world cyber-physical systems, which may pose challenges in reinforcement learning design. It is a particularly arduous task when handling multi-agent systems where the delay of one agent could spread to other agents. To resolve this problem, this paper proposes a novel framework to deal with delays as well as the non-stationary training i…

    Submitted 28 August, 2020; v1 submitted 11 May, 2020; originally announced May 2020.

  43. arXiv:2005.05440

    cs.LG cs.AI stat.ML

    Delay-Aware Model-Based Reinforcement Learning for Continuous Control

    Authors: Baiming Chen, Mengdi Xu, Liang Li, Ding Zhao

    Abstract: Action delays degrade the performance of reinforcement learning in many real-world systems. This paper proposes a formal definition of delay-aware Markov Decision Process and proves it can be transformed into standard MDP with augmented states using the Markov reward process. We develop a delay-aware model-based reinforcement learning framework that can incorporate the multi-step delay into the le…

    Submitted 11 May, 2020; originally announced May 2020.

    Journal ref: Neurocomputing Volume 450, 25 August 2021, Pages 119-128

  44. arXiv:2005.04806  [pdf, other]

    cs.SI cs.IR cs.LG stat.ML

    Comparison and Benchmark of Graph Clustering Algorithms

    Authors: Lizhen Shi, Bo Chen

    Abstract: Graph clustering is widely used in the analysis of biological networks, social networks, etc. For over a decade many graph clustering algorithms have been published; however, a comprehensive and consistent performance comparison is not available. In this paper we benchmarked more than 70 graph clustering programs to evaluate their runtime and quality performance for both weighted and unweighted grap…

    Submitted 10 May, 2020; originally announced May 2020.

    Comments: 32 pages, 4 figures

  45. arXiv:2004.14774  [pdf, other]

    cs.CV cs.LG cs.RO eess.IV stat.ML

    IROS 2019 Lifelong Robotic Vision Challenge -- Lifelong Object Recognition Report

    Authors: Qi She, Fan Feng, Qi Liu, Rosa H. M. Chan, Xinyue Hao, Chuanlin Lan, Qihan Yang, Vincenzo Lomonaco, German I. Parisi, Heechul Bae, Eoin Brophy, Baoquan Chen, Gabriele Graffieti, Vidit Goel, Hyonyoung Han, Sathursan Kanagarajah, Somesh Kumar, Siew-Kei Lam, Tin Lun Lam, Liang Ma, Davide Maltoni, Lorenzo Pellegrini, Duvindu Piyasena, Shiliang Pu, Debdoot Sheet, et al. (11 additional authors not shown)

    Abstract: This report summarizes IROS 2019-Lifelong Robotic Vision Competition (Lifelong Object Recognition Challenge) with methods and results from the top 8 finalists (out of over 150 teams). The competition dataset (L)ifel(O)ng (R)obotic V(IS)ion (OpenLORIS) - Object Recognition (OpenLORIS-object) is designed for driving lifelong/continual learning research and application in robotic vision domain, w…

    Submitted 26 April, 2020; originally announced April 2020.

    Comments: 9 pages, 11 figures, 3 tables, accepted into IEEE Robotics and Automation Magazine. arXiv admin note: text overlap with arXiv:1911.06487

  46. arXiv:2004.06531  [pdf, other]

    cs.RO cs.LG stat.ML

    Adversarial Evaluation of Autonomous Vehicles in Lane-Change Scenarios

    Authors: Baiming Chen, Xiang Chen, Wu Qiong, Liang Li

    Abstract: Autonomous vehicles must be comprehensively evaluated before being deployed in cities and highways. However, most existing evaluation approaches for autonomous vehicles are static and lack adaptability, so they are usually inefficient in generating challenging scenarios for tested vehicles. In this paper, we propose an adaptive evaluation framework to efficiently evaluate autonomous vehicles in adversar…

    Submitted 23 November, 2020; v1 submitted 14 April, 2020; originally announced April 2020.

  47. arXiv:2004.00378  [pdf, other]

    cs.LG eess.SP stat.ML

    Time-Frequency Analysis based Blind Modulation Classification for Multiple-Antenna Systems

    Authors: Weiheng Jiang, Xiaogang Wu, Bolin Chen, Wenjiang Feng, Yi **

    Abstract: Blind modulation classification is an important step to implement cognitive radio networks. The multiple-input multiple-output (MIMO) technique is widely used in military and civil communication systems. Due to the lack of prior information about channel parameters and the overlapping of signals in the MIMO systems, the traditional likelihood-based and feature-based approaches cannot be applied in…

    Submitted 1 April, 2020; originally announced April 2020.

    Comments: 12 pages, 11 figures

  48. arXiv:2003.01751  [pdf, other]

    cs.LG stat.ML

    Automatic Hyper-Parameter Optimization Based on Mapping Discovery from Data to Hyper-Parameters

    Authors: Bozhou Chen, Kaixin Zhang, Longshen Ou, Chenmin Ba, Hongzhi Wang, Chunnan Wang

    Abstract: Machine learning algorithms have made remarkable achievements in the field of artificial intelligence. However, most machine learning algorithms are sensitive to the hyper-parameters. Manually optimizing the hyper-parameters is a common method of hyper-parameter tuning. However, it is costly and empirically dependent. Automatic hyper-parameter optimization (autoHPO) is favored due to its effective…

    Submitted 3 March, 2020; originally announced March 2020.

  49. arXiv:2002.06541  [pdf, other]

    cs.LG cs.IT stat.ML

    Learning Not to Learn in the Presence of Noisy Labels

    Authors: Liu Ziyin, Blair Chen, Ru Wang, Paul Pu Liang, Ruslan Salakhutdinov, Louis-Philippe Morency, Masahito Ueda

    Abstract: Learning in the presence of label noise is a challenging yet important task: it is crucial to design models that are robust in the presence of mislabeled datasets. In this paper, we discover that a new class of loss functions called the gambler's loss provides strong robustness to label noise across various levels of corruption. We show that training with this loss function encourages the model to…

    Submitted 16 February, 2020; originally announced February 2020.

  50. arXiv:2002.05770  [pdf, ps, other]

    cs.LG eess.SP stat.ML

    Harvesting Ambient RF for Presence Detection Through Deep Learning

    Authors: Yang Liu, Tiexing Wang, Yuexin Jiang, Biao Chen

    Abstract: This paper explores the use of ambient radio frequency (RF) signals for human presence detection through deep learning. Using WiFi signal as an example, we demonstrate that the channel state information (CSI) obtained at the receiver contains rich information about the propagation environment. Through judicious pre-processing of the estimated CSI followed by deep learning, reliable presence detect…

    Submitted 9 December, 2020; v1 submitted 13 February, 2020; originally announced February 2020.

    Comments: Source code and datasets are available at Github: https://github.com/bigtreeyanger/presence_detection_cnn