Showing 1–46 of 46 results for author: Chen, E

Searching in archive stat.
  1. arXiv:2407.01770  [pdf, other]

    stat.ME

    Exploring causal effects of hormone- and radio-treatments in an observational study of breast cancer using copula-based semi-competing risks models

    Authors: Tonghui Yu, Mengjiao Peng, Yifan Cui, Elynn Chen, Chixiang Chen

    Abstract: Breast cancer patients may experience relapse or death after surgery during the follow-up period, leading to dependent censoring of relapse. This phenomenon, known as semi-competing risk, imposes challenges in analyzing treatment effects on breast cancer and necessitates advanced statistical tools for unbiased analysis. Despite progress in estimation and inference within semi-competing risks regre…

    Submitted 1 July, 2024; originally announced July 2024.

  2. arXiv:2407.00561  [pdf, ps, other]

    stat.ME stat.AP

    Advancing Information Integration through Empirical Likelihood: Selective Reviews and a New Idea

    Authors: Chixiang Chen, Jia Liang, Elynn Chen, Ming Wang

    Abstract: Information integration plays a pivotal role in biomedical studies by facilitating the combination and analysis of independent datasets from multiple studies, thereby uncovering valuable insights that might otherwise remain obscured due to the limited sample size in individual studies. However, sharing raw data from independent studies presents significant challenges, primarily due to the need to…

    Submitted 29 June, 2024; originally announced July 2024.

  3. arXiv:2405.17744  [pdf, other]

    stat.ME

    Factor Augmented Matrix Regression

    Authors: Elynn Chen, Jianqing Fan, Xiaonan Zhu

    Abstract: We introduce Factor-Augmented Matrix Regression (FAMAR) to address the growing applications of matrix-variate data and their associated challenges, particularly with high-dimensionality and covariate correlations. FAMAR encompasses two key algorithms. The first is a novel non-iterative approach that efficiently estimates the factors and loadings of t…

    Submitted 27 May, 2024; originally announced May 2024.

  4. arXiv:2405.11681  [pdf, other]

    stat.ME math.ST

    Distributed Tensor Principal Component Analysis

    Authors: Elynn Chen, Xi Chen, Wenbo Jing, Yichen Zhang

    Abstract: As tensors become widespread in modern data analysis, Tucker low-rank Principal Component Analysis (PCA) has become essential for dimensionality reduction and structural discovery in tensor datasets. Motivated by the common scenario where large-scale tensors are distributed across diverse geographic locations, this paper investigates tensor PCA within a distributed framework where direct data pool…

    Submitted 19 May, 2024; originally announced May 2024.

  5. arXiv:2405.06866  [pdf, other]

    stat.ME

    Dynamic Contextual Pricing with Doubly Non-Parametric Random Utility Models

    Authors: Elynn Chen, Xi Chen, Lan Gao, Jiayu Li

    Abstract: In the evolving landscape of digital commerce, adaptive dynamic pricing strategies are essential for gaining a competitive edge. This paper introduces novel doubly nonparametric random utility models that eschew traditional parametric assumptions used in estimating consumer demand's mean utility function and noise distribution. Existing nonparametric methods like multi-scale Distributio…

    Submitted 10 June, 2024; v1 submitted 10 May, 2024; originally announced May 2024.

  6. arXiv:2404.15209  [pdf, other]

    cs.LG stat.ME stat.ML

    Data-Driven Knowledge Transfer in Batch $Q^*$ Learning

    Authors: Elynn Chen, Xi Chen, Wenbo Jing

    Abstract: In data-driven decision-making in marketing, healthcare, and education, it is desirable to utilize a large amount of data from existing ventures to navigate high-dimensional feature spaces and address data scarcity in new ventures. We explore knowledge transfer in dynamic decision-making by concentrating on batch stationary environments and formally defining task discrepancies through the lens of…

    Submitted 31 March, 2024; originally announced April 2024.

  7. arXiv:2404.01546  [pdf, other]

    stat.ME

    Time-Varying Matrix Factor Models

    Authors: Bin Chen, Elynn Y. Chen, Stevenson Bolivar, Rong Chen

    Abstract: Matrix-variate data of high dimensions are frequently observed in finance and economics, spanning extended time periods, such as the long-term data on international trade flows among numerous countries. To address potential structural shifts and explore the matrix structure's informational context, we propose a time-varying matrix factor model. This model accommodates changing factor loadings over…

    Submitted 1 April, 2024; originally announced April 2024.

  8. arXiv:2310.10013  [pdf, other]

    stat.ML cs.LG

    Riemannian Residual Neural Networks

    Authors: Isay Katsman, Eric Ming Chen, Sidhanth Holalkere, Anna Asch, Aaron Lou, Ser-Nam Lim, Christopher De Sa

    Abstract: Recent methods in geometric deep learning have introduced various neural networks to operate over data that lie on Riemannian manifolds. Such networks are often necessary to learn well over graphs with a hierarchical structure or to learn over manifold-valued data encountered in the natural sciences. These networks are often inspired by and directly generalize standard Euclidean neural networks. H…

    Submitted 15 October, 2023; originally announced October 2023.

    Comments: Published at NeurIPS 2023

  9. Aggregating human judgment probabilistic predictions of COVID-19 transmission, burden, and preventative measures

    Authors: Allison Codi, Damon Luk, David Braun, Juan Cambeiro, Tamay Besiroglu, Eva Chen, Luis Enrique Urtubey de Cesaris, Paolo Bocchini, Thomas McAndrew

    Abstract: Aggregated human judgment forecasts for COVID-19 targets of public health importance are accurate, often outperforming computational models. Our work shows aggregated human judgment forecasts for infectious agents are timely, accurate, and adaptable, and can be used as a tool to aid public health decision making during outbreaks.

    Submitted 14 April, 2022; v1 submitted 5 April, 2022; originally announced April 2022.

  10. arXiv:2202.09820  [pdf, other]

    stat.AP

    Chimeric forecasting: combining probabilistic predictions from computational models and human judgment

    Authors: Thomas McAndrew, Allison Codi, Juan Cambeiro, Tamay Besiroglu, David Braun, Eva Chen, Luis Enrique Urtubey de Cesaris, Damon Luk

    Abstract: Forecasts of the trajectory of an infectious agent can help guide public health decision making. A traditional approach to forecasting fits a computational model to structured data and generates a predictive distribution. However, human judgment has access to the same data as computational models plus experience, intuition, and subjective data. We propose a chimeric ensemble -- a combination of co…

    Submitted 20 February, 2022; originally announced February 2022.

  11. arXiv:2202.04709  [pdf, other]

    cs.LG stat.ME

    Transferred Q-learning

    Authors: Elynn Y. Chen, Michael I. Jordan, Sai Li

    Abstract: We consider $Q$-learning with knowledge transfer, using samples from a target reinforcement learning (RL) task as well as source samples from different but related RL tasks. We propose transfer learning algorithms for both batch and online $Q$-learning with offline source studies. The proposed transferred $Q$-learning algorithm contains a novel re-targeting step that enables vertical information-c…

    Submitted 9 February, 2022; originally announced February 2022.

  12. arXiv:2202.00088  [pdf, other]

    cs.LG stat.ME

    Reinforcement Learning with Heterogeneous Data: Estimation and Inference

    Authors: Elynn Y. Chen, Rui Song, Michael I. Jordan

    Abstract: Reinforcement Learning (RL) has the promise of providing data-driven support for decision-making in a wide range of problems in healthcare, education, business, and other domains. Classical RL methods focus on the mean of the total return and, thus, may provide misleading results in the setting of the heterogeneous populations that commonly underlie large-scale datasets. We introduce the K-Heterog…

    Submitted 31 January, 2022; originally announced February 2022.

  13. arXiv:2110.05368  [pdf, other]

    stat.ME stat.AP

    Truncated Rank-Based Tests for Two-Part Models with Excessive Zeros and Applications to Microbiome Data

    Authors: Wanjie Wang, Eric Z. Chen, Hongzhe Li

    Abstract: High-throughput sequencing technology allows us to test the compositional difference of bacteria in different populations. One important feature of human microbiome data is that it often includes a large number of zeros. Such data can be treated as being generated from a two-part model that includes a zero point-mass. Motivated by analysis of such non-negative data with excessive zeros, we introdu…

    Submitted 21 August, 2022; v1 submitted 11 October, 2021; originally announced October 2021.

    MSC Class: 62P10 (Primary) 62G10; 92D10 (Secondary)

  14. arXiv:2109.00582  [pdf, other]

    stat.ML cs.LG stat.ME

    Information-theoretic Classification Accuracy: A Criterion that Guides Data-driven Combination of Ambiguous Outcome Labels in Multi-class Classification

    Authors: Chihao Zhang, Yiling Elaine Chen, Shihua Zhang, Jingyi Jessica Li

    Abstract: Outcome labeling ambiguity and subjectivity are ubiquitous in real-world datasets. While practitioners commonly combine ambiguous outcome labels for all data points (instances) in an ad hoc way to improve the accuracy of multi-class classification, there lacks a principled approach to guide the label combination for all data points by any optimality criterion. To address this problem, we propose t…

    Submitted 2 July, 2022; v1 submitted 1 September, 2021; originally announced September 2021.

    MSC Class: 62-08

  15. arXiv:2007.05521  [pdf, other]

    stat.ME

    Community Network Auto-Regression for High-Dimensional Time Series

    Authors: Elynn Y. Chen, Jianqing Fan, Xuening Zhu

    Abstract: Modeling responses on the nodes of a large-scale network is an important task that arises commonly in practice. This paper proposes a community network vector autoregressive (CNAR) model, which utilizes the network structure to characterize the dependence and intra-community homogeneity of the high dimensional time series. The CNAR model greatly increases the flexibility and generality of the netw…

    Submitted 10 July, 2020; originally announced July 2020.

    Comments: 59 pages; 10 figures; submitted

  16. arXiv:2007.04785  [pdf, other]

    cs.LG cs.AI cs.CV stat.ML

    Accuracy Prediction with Non-neural Model for Neural Architecture Search

    Authors: Renqian Luo, Xu Tan, Rui Wang, Tao Qin, Enhong Chen, Tie-Yan Liu

    Abstract: Neural architecture search (NAS) with an accuracy predictor that predicts the accuracy of candidate architectures has drawn increasing attention due to its simplicity and effectiveness. Previous works usually employ neural network-based predictors which require more delicate design and are easy to overfit. Considering that most architectures are represented as sequences of discrete symbols which a…

    Submitted 19 July, 2021; v1 submitted 9 July, 2020; originally announced July 2020.

    Comments: Code is available at https://github.com/renqianluo/GBDT-NAS

  17. ASGN: An Active Semi-supervised Graph Neural Network for Molecular Property Prediction

    Authors: Zhongkai Hao, Chengqiang Lu, Zheyuan Hu, Hao Wang, Zhenya Huang, Qi Liu, Enhong Chen, Cheekong Lee

    Abstract: Molecular property prediction (e.g., energy) is an essential problem in chemistry and biology. Unfortunately, many supervised learning methods usually suffer from the problem of scarce labeled molecules in the chemical space, where such property labels are generally obtained by Density Functional Theory (DFT) calculation which is extremely computational costly. An effective solution is to incorpor…

    Submitted 7 July, 2020; originally announced July 2020.

    Comments: 9 pages

  18. Semi-parametric TEnsor Factor Analysis by Iteratively Projected Singular Value Decomposition

    Authors: Elynn Y. Chen, Dong Xia, Chencheng Cai, Jianqing Fan

    Abstract: This paper introduces a general framework of Semi-parametric TEnsor Factor Analysis (STEFA) that focuses on the methodology and theory of low-rank tensor decomposition with auxiliary covariates. Semi-parametric TEnsor Factor Analysis models extend tensor factor models by incorporating auxiliary covariates in the loading matrices. We propose an algorithm of iteratively projected singular value deco…

    Submitted 2 April, 2024; v1 submitted 5 July, 2020; originally announced July 2020.

    Comments: 44 pages, 8 figures

    Journal ref: Journal of the Royal Statistical Society Series B: Statistical Methodology, 2024

  19. arXiv:2006.13737  [pdf]

    stat.AP cs.IR cs.LG

    Diagnosis Prevalence vs. Efficacy in Machine-learning Based Diagnostic Decision Support

    Authors: Gil Alon, Elizabeth Chen, Guergana Savova, Carsten Eickhoff

    Abstract: Many recent studies use machine learning to predict a small number of ICD-9-CM codes. In practice, on the other hand, physicians have to consider a broader range of diagnoses. This study aims to put these previously incongruent evaluation settings on a more equal footing by predicting ICD-9-CM codes based on electronic health record properties and demonstrating the relationship between diagnosis p…

    Submitted 24 June, 2020; originally announced June 2020.

    Comments: AMIA Joint Summits in Translational Science, 2020

  20. arXiv:2006.12301  [pdf, other]

    math.ST cs.LG stat.ML

    On Projection Robust Optimal Transport: Sample Complexity and Model Misspecification

    Authors: Tianyi Lin, Zeyu Zheng, Elynn Y. Chen, Marco Cuturi, Michael I. Jordan

    Abstract: Optimal transport (OT) distances are increasingly used as loss functions for statistical inference, notably in the learning of generative models or supervised learning. Yet, the behavior of minimum Wasserstein estimators is poorly understood, notably in high-dimensional regimes or under model misspecification. In this work we adopt the viewpoint of projection robust (PR) OT, which seeks to maximiz…

    Submitted 17 July, 2021; v1 submitted 22 June, 2020; originally announced June 2020.

    Comments: Accepted by AISTATS 2021; Fix some inaccuracy in the definition and proof; 49 Pages, 41 figures

  21. arXiv:2006.02983  [pdf, other]

    stat.CO cs.LG stat.ML

    Median regression with differential privacy

    Authors: E Chen, Ying Miao, Yu Tang

    Abstract: Median regression analysis has robustness properties which make it attractive compared with regression based on the mean, while differential privacy can protect individual privacy during statistical analysis of certain datasets. In this paper, three privacy preserving methods are proposed for median regression. The first algorithm is based on a finite smoothing method, the second provides an itera…

    Submitted 4 June, 2020; originally announced June 2020.

    MSC Class: 62F30; 68W20

  22. arXiv:2002.10389  [pdf, other]

    cs.LG stat.ML

    Semi-Supervised Neural Architecture Search

    Authors: Renqian Luo, Xu Tan, Rui Wang, Tao Qin, Enhong Chen, Tie-Yan Liu

    Abstract: Neural architecture search (NAS) relies on a good controller to generate better architectures or predict the accuracy of given architectures. However, training the controller requires both abundant and high-quality pairs of architectures and their accuracy, while it is costly to evaluate an architecture and obtain its accuracy. In this paper, we propose SemiNAS, a semi-supervised NAS approach that…

    Submitted 3 November, 2020; v1 submitted 24 February, 2020; originally announced February 2020.

    Comments: NeurIPS 2020

  23. arXiv:2002.01305  [pdf, other]

    stat.ME

    Modeling Multivariate Spatial-Temporal Data with Latent Low-Dimensional Dynamics

    Authors: Elynn Y. Chen, Xin Yun, Rong Chen, Qiwei Yao

    Abstract: High-dimensional multivariate spatial-temporal data arise frequently in a wide range of applications; however, there are relatively few statistical methods that can simultaneously deal with spatial, temporal and variable-wise dependencies in large data sets. In this paper, we propose a new approach to utilize the correlations in variable, space and time to achieve dimension reduction and to facili…

    Submitted 1 February, 2020; originally announced February 2020.

    Comments: arXiv admin note: text overlap with arXiv:1710.06351

  24. Deep Technology Tracing for High-tech Companies

    Authors: Han Wu, Kun Zhang, Guangyi Lv, Qi Liu, Runlong Yu, Weihao Zhao, Enhong Chen, Jianhui Ma

    Abstract: Technological change and innovation are vitally important, especially for high-tech companies. However, the factors influencing their future research and development (R&D) trends are both complicated and varied, making technology tracing for high-tech companies quite a difficult task. To this end, in this paper, we develop a novel data-driven solution, i.e., Deep Technology Forecasting (D…

    Submitted 2 January, 2020; originally announced January 2020.

    Comments: 6 pages, 7 figures

  25. Statistical Inference for High-Dimensional Matrix-Variate Factor Model

    Authors: Elynn Y. Chen, Jianqing Fan

    Abstract: This paper considers the estimation and inference of the low-rank components in high-dimensional matrix-variate factor models, where each dimension of the matrix-variates ($p \times q$) is comparable to or greater than the number of observations ($T$). We propose an estimation method called $α$-PCA that preserves the matrix structure and aggregates mean and contemporary covariance through a hyper-…

    Submitted 19 October, 2022; v1 submitted 7 January, 2020; originally announced January 2020.

    Journal ref: Journal of the American Statistical Association, 2021

  26. arXiv:1912.12844  [pdf, other]

    cs.LG cs.DC math.OC stat.ML

    Variance Reduced Local SGD with Lower Communication Complexity

    Authors: Xianfeng Liang, Shuheng Shen, Jingchang Liu, Zhen Pan, Enhong Chen, Yifei Cheng

    Abstract: To accelerate the training of machine learning models, distributed stochastic gradient descent (SGD) and its variants have been widely adopted, which apply multiple workers in parallel to speed up training. Among them, Local SGD has gained much attention due to its lower communication cost. Nevertheless, when the data distribution on workers is non-identical, Local SGD requires…

    Submitted 30 December, 2019; originally announced December 2019.

    Comments: 25 pages, 6 figures. The paper presents a novel variance reduction algorithm for Local SGD

  27. arXiv:1912.06767  [pdf, other]

    cs.LG stat.ML

    Estimating Early Fundraising Performance of Innovations via Graph-based Market Environment Model

    Authors: Likang Wu, Zhi Li, Hongke Zhao, Zhen Pan, Qi Liu, Enhong Chen

    Abstract: Well begun is half done. In the crowdfunding market, the early fundraising performance of the project is a concerned issue for both creators and platforms. However, estimating the early fundraising performance before the project published is very challenging and still under-explored. To that end, in this paper, we present a focused study on this important problem in a market modeling view. Specifi…

    Submitted 13 December, 2019; originally announced December 2019.

  28. arXiv:1912.00543  [pdf, other]

    eess.IV cs.CV cs.LG stat.ML

    Pyramid Convolutional RNN for MRI Image Reconstruction

    Authors: Eric Z. Chen, Puyang Wang, Xiao Chen, Terrence Chen, Shanhui Sun

    Abstract: Fast and accurate MRI image reconstruction from undersampled data is crucial in clinical practice. Deep learning based reconstruction methods have shown promising advances in recent years. However, recovering fine details from undersampled data is still challenging. In this paper, we introduce a novel deep learning based method, Pyramid Convolutional RNN (PC-RNN), to reconstruct images from multip…

    Submitted 21 February, 2022; v1 submitted 1 December, 2019; originally announced December 2019.

    Comments: Copyright©2022 IEEE. Personal use of this material is permitted. However, permission to use this material for any other purposes must be obtained from the IEEE by sending a request to [email protected]

  29. arXiv:1911.08717  [pdf, other]

    cs.LG stat.ML

    Fine-Tuning by Curriculum Learning for Non-Autoregressive Neural Machine Translation

    Authors: Junliang Guo, Xu Tan, Linli Xu, Tao Qin, Enhong Chen, Tie-Yan Liu

    Abstract: Non-autoregressive translation (NAT) models remove the dependence on previous target tokens and generate all target tokens in parallel, resulting in significant inference speedup but at the cost of inferior translation accuracy compared to autoregressive translation (AT) models. Considering that AT models have higher accuracy and are easier to train than NAT models, and both of them share the same…

    Submitted 21 November, 2019; v1 submitted 20 November, 2019; originally announced November 2019.

    Comments: AAAI 2020

  30. arXiv:1909.10815  [pdf, other]

    cs.LG cs.NE stat.ML

    Balanced One-shot Neural Architecture Optimization

    Authors: Renqian Luo, Tao Qin, Enhong Chen

    Abstract: The ability to rank candidate architectures is the key to the performance of neural architecture search (NAS). One-shot NAS is proposed to reduce the expense but shows inferior performance against conventional NAS and is not adequately stable. We investigate into this and find that the ranking correlation between architectures under one-shot training and the ones under stand-alone full training is…

    Submitted 31 March, 2020; v1 submitted 24 September, 2019; originally announced September 2019.

    Comments: Code and model checkpoints are publicly available at https://github.com/renqianluo/NAO_pytorch

  31. arXiv:1908.08733  [pdf, other]

    cs.LG cs.CY stat.ML

    Neural Cognitive Diagnosis for Intelligent Education Systems

    Authors: Fei Wang, Qi Liu, Enhong Chen, Zhenya Huang, Yuying Chen, Yu Yin, Zai Huang, Shijin Wang

    Abstract: Cognitive diagnosis is a fundamental issue in intelligent education, which aims to discover the proficiency level of students on specific knowledge concepts. Existing approaches usually mine linear interactions of student exercising process by manual-designed function (e.g., logistic function), which is not sufficient for capturing complex relations between students and exercises. In this paper, w…

    Submitted 3 March, 2020; v1 submitted 23 August, 2019; originally announced August 2019.

  32. arXiv:1905.12470  [pdf, other]

    cs.CY cs.LG stat.ML

    Exploiting Cognitive Structure for Adaptive Learning

    Authors: Qi Liu, Shiwei Tong, Chuanren Liu, Hongke Zhao, Enhong Chen, Haiping Ma, Shijin Wang

    Abstract: Adaptive learning, also known as adaptive teaching, relies on learning path recommendation, which sequentially recommends personalized learning items (e.g., lectures, exercises) to satisfy the unique needs of each learner. Although it is well known that modeling the cognitive structure including knowledge level of learners and knowledge structure (e.g., the prerequisite relations) of learning item…

    Submitted 23 May, 2019; originally announced May 2019.

    Comments: Accepted by KDD 2019 Research Track. In Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining (KDD'19)

  33. arXiv:1905.10954  [pdf, other]

    cs.LG cs.CV cs.SD stat.ML

    Transcribing Content from Structural Images with Spotlight Mechanism

    Authors: Yu Yin, Zhenya Huang, Enhong Chen, Qi Liu, Fuzheng Zhang, Xing Xie, Guoping Hu

    Abstract: Transcribing content from structural images, e.g., writing notes from music scores, is a challenging task as not only the content objects should be recognized, but the internal structure should also be preserved. Existing image recognition methods mainly work on images with simple content (e.g., text lines with characters), but are not capable to identify ones with more complex content (e.g., stru…

    Submitted 26 May, 2019; originally announced May 2019.

    Comments: Accepted by KDD2018 Research Track. In proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD'18)

  34. arXiv:1905.10949  [pdf, other]

    cs.LG cs.CL stat.ML

    QuesNet: A Unified Representation for Heterogeneous Test Questions

    Authors: Yu Yin, Qi Liu, Zhenya Huang, Enhong Chen, Wei Tong, Shijin Wang, Yu Su

    Abstract: Understanding learning materials (e.g. test questions) is a crucial issue in online learning systems, which can promote many applications in education domain. Unfortunately, many supervised approaches suffer from the problem of scarce human labeled data, whereas abundant unlabeled resources are highly underutilized. To alleviate this problem, an effective solution is to use pre-trained representat…

    Submitted 26 May, 2019; originally announced May 2019.

  35. arXiv:1904.12369  [pdf, other]

    stat.ML cs.LG stat.ME

    Low-Rank Principal Eigenmatrix Analysis

    Authors: Krishna Balasubramanian, Elynn Y. Chen, Jianqing Fan, Xiang Wu

    Abstract: Sparse PCA is a widely used technique for high-dimensional data analysis. In this paper, we propose a new method called low-rank principal eigenmatrix analysis. Different from sparse PCA, the dominant eigenvectors are allowed to be dense but are assumed to have a low-rank structure when matricized appropriately. Such a structure arises naturally in several practical cases: Indeed the top eigenvect…

    Submitted 28 April, 2019; originally announced April 2019.

  36. arXiv:1904.07383  [pdf, other]

    stat.ME

    Helping Effects Against Curse of Dimensionality in Threshold Factor Models for Matrix Time Series

    Authors: Xialu Liu, Elynn Chen

    Abstract: As is known, factor analysis is a popular method to reduce dimension for high-dimensional data. For matrix data, the dimension reduction can be more effectively achieved through both row and column directions. In this paper, we introduce a threshold factor model to analyze matrix-valued high-dimensional time series data. The factor loadings are allowed to switch between regimes, controlling by a…

    Submitted 15 April, 2019; originally announced April 2019.

    Comments: 70 pages

  37. arXiv:1902.00006  [pdf, other]

    cs.LG stat.ML

    An Evaluation of the Human-Interpretability of Explanation

    Authors: Isaac Lage, Emily Chen, Jeffrey He, Menaka Narayanan, Been Kim, Sam Gershman, Finale Doshi-Velez

    Abstract: Recent years have seen a boom in interest in machine learning systems that can provide a human-understandable rationale for their predictions or decisions. However, exactly what kinds of explanation are truly human-interpretable remains poorly understood. This work advances our understanding of what makes explanations interpretable under three specific tasks that users may perform with machine lea…

    Submitted 28 August, 2019; v1 submitted 30 January, 2019; originally announced February 2019.

    Comments: arXiv admin note: substantial text overlap with arXiv:1802.00682

  38. arXiv:1901.00769  [pdf, other]

    econ.EM stat.ME

    Modeling Dynamic Transport Network with Matrix Factor Models: with an Application to International Trade Flow

    Authors: Elynn Y. Chen, Rong Chen

    Abstract: International trade research plays an important role to inform trade policy and shed light on wider issues relating to poverty, development, migration, productivity, and economy. With recent advances in information technology, global and regional agencies distribute an enormous amount of internationally comparable trading data among a large number of countries over time, providing a goldmine for e…

    Submitted 2 January, 2019; originally announced January 2019.

    Comments: arXiv admin note: text overlap with arXiv:1710.06325

  39. arXiv:1808.07233  [pdf, other]

    cs.LG stat.ML

    Neural Architecture Optimization

    Authors: Renqian Luo, Fei Tian, Tao Qin, Enhong Chen, Tie-Yan Liu

    Abstract: Automatic neural architecture design has shown its potential in discovering powerful neural network architectures. Existing methods, no matter based on reinforcement learning or evolutionary algorithms (EA), conduct architecture search in a discrete space, which is highly inefficient. In this paper, we propose a simple and efficient method to automatic neural architecture design based on continuou…

    Submitted 4 September, 2019; v1 submitted 22 August, 2018; originally announced August 2018.

    Comments: NeurIPS 2018. Code available at: https://github.com/renqianluo/NAO

  40. arXiv:1808.06296  [pdf, ps, other]

    math.OC stat.ML

    Universal Stagewise Learning for Non-Convex Problems with Convergence on Averaged Solutions

    Authors: Zaiyi Chen, Zhuoning Yuan, Jinfeng Yi, Bowen Zhou, Enhong Chen, Tianbao Yang

    Abstract: Although stochastic gradient descent (SGD) method and its variants (e.g., stochastic momentum methods, AdaGrad) are the choice of algorithms for solving non-convex problems (especially deep learning), there still remain big gaps between the theory and the practice with many questions unresolved. For example, there is still a lack of theories of convergence for SGD and its variants that use stagewi…

    Submitted 5 March, 2019; v1 submitted 19 August, 2018; originally announced August 2018.

    Comments: added more experimental results

  41. arXiv:1711.04094  [pdf, other]

    cs.SI cs.LG stat.ML

    Enhancing Network Embedding with Auxiliary Information: An Explicit Matrix Factorization Perspective

    Authors: Junliang Guo, Linli Xu, Xunpeng Huang, Enhong Chen

    Abstract: Recent advances in the field of network embedding have shown the low-dimensional network representation is playing a critical role in network analysis. However, most of the existing principles of network embedding do not incorporate auxiliary information such as content and labels of nodes flexibly. In this paper, we take a matrix factorization perspective of network embedding, and incorporate str…

    Submitted 4 March, 2018; v1 submitted 11 November, 2017; originally announced November 2017.

    Comments: DASFAA 2018

  42. arXiv:1710.06351  [pdf, other]

    stat.ME

    Multivariate Spatial-temporal Prediction on Latent Low-dimensional Functional Structure with Non-stationarity

    Authors: Elynn Yi Chen, Qiwei Yao, Rong Chen

    Abstract: Multivariate spatio-temporal data arise more and more frequently in a wide range of applications; however, there are relatively few general statistical methods that can readily use that incorporate spatial, temporal and variable dependencies simultaneously. In this paper, we propose a new approach to represent non-parametrically the linear dependence structure of a multivariate spatio-temporal pro…

    Submitted 11 November, 2017; v1 submitted 17 October, 2017; originally announced October 2017.

  43. arXiv:1710.06325  [pdf, other]

    stat.ME

    Factor Models for High-Dimensional Dynamic Networks: with Application to International Trade Flow Time Series 1981-2015

    Authors: Elynn Yi Chen, Rong Chen

    Abstract: Dynamic network analysis has found an increasing interest in the literature because of the importance of different kinds of dynamic social networks, biological networks, and economic networks. Most available probability and statistical models for dynamic network data are deduced from random graph theory where the networks are characterized on the node and edge level. They are often very restrictiv…

    Submitted 17 October, 2017; originally announced October 2017.

  44. Constrained Factor Models for High-Dimensional Matrix-Variate Time Series

    Authors: Elynn Y. Chen, Ruey S. Tsay, Rong Chen

    Abstract: High-dimensional matrix-variate time series data are becoming widely available in many scientific fields, such as economics, biology, and meteorology. To achieve significant dimension reduction while preserving the intrinsic matrix structure and temporal dynamics in such data, Wang et al. (2017) proposed a matrix factor model that is shown to provide effective analysis. In this paper, we establish…

    Submitted 19 October, 2022; v1 submitted 16 October, 2017; originally announced October 2017.

    Journal ref: Journal of the American Statistical Association, 2019

  45. arXiv:1706.08222  [pdf]

    stat.ML

    YouTube-8M Video Understanding Challenge Approach and Applications

    Authors: Edward Chen

    Abstract: This paper introduces the YouTube-8M Video Understanding Challenge hosted as a Kaggle competition and also describes my approach to experimenting with various models. For each of my experiments, I provide the score result as well as possible improvements to be made. Towards the end of the paper, I discuss the various ensemble learning techniques that I applied on the dataset which significantly bo…

    Submitted 26 June, 2017; originally announced June 2017.

    Comments: YouTube-8M Workshop submission, 8 pages

  46. arXiv:1307.8366  [pdf]

    stat.AP q-bio.QM

    Characteristic Direction Approach to Identify Differentially Expressed Genes

    Authors: Neil R. Clark, Kevin Hu, Edward Y. Chen, Qioanan Duan, Avi Ma'ayan

    Abstract: Genome-wide gene expression profiles, as measured with microarrays or RNA-Seq experiments, have revolutionized biological and biomedical research by providing a quantitative measure of the entire mRNA transcriptome. Typically, researchers set up experiments where control samples are compared to a treatment condition, and using the t-test they identify differentially expressed genes upon which furt…

    Submitted 31 July, 2013; originally announced July 2013.

    Comments: 22 pages, 8 figures