
Showing 1–16 of 16 results for author: Tang, D

  1. arXiv:2406.13814  [pdf, other]

    stat.AP stat.ME stat.ML

    Evaluation of Missing Data Analytical Techniques in Longitudinal Research: Traditional and Machine Learning Approaches

    Authors: Dandan Tang, Xin Tong

    Abstract: Missing Not at Random (MNAR) and nonnormal data are challenging to handle. Traditional missing data analytical techniques such as full information maximum likelihood estimation (FIML) may fail with nonnormal data as they are built on normal distribution assumptions. Two-Stage Robust Estimation (TSRE) does manage nonnormal data, but both FIML and TSRE are less explored in longitudinal studies under…

    Submitted 19 June, 2024; originally announced June 2024.

    Comments: 47 pages, 3 tables, 8 figures

  2. arXiv:2405.15090  [pdf, other]

    cs.LG stat.ML

    Pure Exploration for Constrained Best Mixed Arm Identification with a Fixed Budget

    Authors: Dengwang Tang, Rahul Jain, Ashutosh Nayyar, Pierluigi Nuzzo

    Abstract: In this paper, we introduce the constrained best mixed arm identification (CBMAI) problem with a fixed budget. This is a pure exploration problem in a stochastic finite armed bandit model. Each arm is associated with a reward and multiple types of costs from unknown distributions. Unlike the unconstrained best arm identification problem, the optimal solution for the CBMAI problem may be a randomiz…

    Submitted 23 May, 2024; originally announced May 2024.

    Comments: 7 pages, 5 figures, 1 table

  3. Are the Signs of Factor Loadings Arbitrary in Confirmatory Factor Analysis? Problems and Solutions

    Authors: Dandan Tang, Steven M. Boker, Xin Tong

    Abstract: The replication crisis in social and behavioral sciences has raised concerns about the reliability and validity of empirical studies. While research in the literature has explored contributing factors to this crisis, the issues related to analytical tools have received less attention. This study focuses on a widely used analytical tool - confirmatory factor analysis (CFA) - and investigates one is…

    Submitted 23 January, 2024; originally announced January 2024.

    Comments: 35 pages, 3 figures, 8 tables

    Journal ref: Structural Equation Modeling: A Multidisciplinary Journal 2024

  4. arXiv:2312.17363  [pdf, other]

    stat.AP

    A Comparison of Full Information Maximum Likelihood and Machine Learning Missing Data Analytical Methods in Growth Curve Modeling

    Authors: Dandan Tang, Xin Tong

    Abstract: Missing data are inevitable in longitudinal studies. Traditional methods, such as the full information maximum likelihood (FIML), are commonly used to handle ignorable missing data. However, they may lead to biased model estimation due to missing not at random data that often appear in longitudinal studies. Recently, machine learning methods, such as random forests (RF) and K-nearest neighbors (KN…

    Submitted 28 December, 2023; originally announced December 2023.

    Comments: 8 pages, 2 figures. This proceeding was accepted by the Annual Meeting of the Psychometric Society

    Journal ref: The Annual Meeting of the Psychometric Society 2023

  5. arXiv:2310.11531  [pdf, ps, other]

    cs.LG cs.AI eess.SY stat.ML

    Efficient Online Learning with Offline Datasets for Infinite Horizon MDPs: A Bayesian Approach

    Authors: Dengwang Tang, Rahul Jain, Botao Hao, Zheng Wen

    Abstract: In this paper, we study the problem of efficient online reinforcement learning in the infinite horizon setting when there is an offline dataset to start with. We assume that the offline dataset is generated by an expert but with unknown level of competence, i.e., it is not perfect and not necessarily using the optimal policy. We show that if the learning agent models the behavioral policy (paramet…

    Submitted 1 February, 2024; v1 submitted 17 October, 2023; originally announced October 2023.

    Comments: 22 pages

    MSC Class: 93E35

  6. arXiv:2310.10107  [pdf, other]

    cs.LG cs.AI eess.SY stat.ML

    Posterior Sampling-based Online Learning for Episodic POMDPs

    Authors: Dengwang Tang, Dongze Ye, Rahul Jain, Ashutosh Nayyar, Pierluigi Nuzzo

    Abstract: Learning in POMDPs is known to be significantly harder than MDPs. In this paper, we consider the online learning problem for episodic POMDPs with unknown transition and observation models. We propose a Posterior Sampling-based reinforcement learning algorithm for POMDPs (PS4POMDPs), which is much simpler and more implementable compared to state-of-the-art optimism-based online learning algorithms…

    Submitted 23 May, 2024; v1 submitted 16 October, 2023; originally announced October 2023.

    Comments: 32 pages, 4 figures

    MSC Class: 93E35

  7. arXiv:2304.01098  [pdf, other]

    stat.ME

    The synthetic instrument: From sparse association to sparse causation

    Authors: Dingke Tang, Dehan Kong, Linbo Wang

    Abstract: In many observational studies, researchers are often interested in studying the effects of multiple exposures on a single outcome. Standard approaches for high-dimensional data such as the lasso assume the associations between the exposures and the outcome are sparse. These methods, however, do not estimate the causal effects in the presence of unmeasured confounding. In this paper, we consider an…

    Submitted 3 April, 2023; originally announced April 2023.

  8. arXiv:2303.01954  [pdf, other]

    stat.ML cs.AI cs.LG

    Synthetic Data Generator for Adaptive Interventions in Global Health

    Authors: Aditya Rastogi, Juan Francisco Garamendi, Ana Fernández del Río, Anna Guitart, Moiz Hassan Khan, Dexian Tang, África Periáñez

    Abstract: Artificial Intelligence and digital health have the potential to transform global health. However, having access to representative data to test and validate algorithms in realistic production environments is essential. We introduce HealthSyn, an open-source synthetic data generator of user behavior for testing reinforcement learning algorithms in the context of mobile health interventions. The gen…

    Submitted 27 April, 2023; v1 submitted 3 March, 2023; originally announced March 2023.

  9. arXiv:2012.05849  [pdf, ps, other]

    stat.ME

    The Promises of Parallel Outcomes

    Authors: Ying Zhou, Dingke Tang, Dehan Kong, Linbo Wang

    Abstract: A key challenge in causal inference from observational studies is the identification and estimation of causal effects in the presence of unmeasured confounding. In this paper, we introduce a novel approach for causal inference that leverages information in multiple outcomes to deal with unmeasured confounding. The key assumption in our approach is conditional independence among multiple outcomes.…

    Submitted 14 October, 2022; v1 submitted 10 December, 2020; originally announced December 2020.

  10. arXiv:2007.14190  [pdf, other]

    stat.ME

    Ultra-high Dimensional Variable Selection for Doubly Robust Causal Inference

    Authors: Dingke Tang, Dehan Kong, Wenliang Pan, Linbo Wang

    Abstract: Causal inference has been increasingly reliant on observational studies with rich covariate information. To build tractable causal procedures, such as the doubly robust estimators, it is imperative to first extract important features from high or even ultra-high dimensional data. In this paper, we propose causal ball screening for confounder selection from modern ultra-high dimensional data sets.…

    Submitted 6 February, 2022; v1 submitted 28 July, 2020; originally announced July 2020.

    Comments: To appear in Biometrics

  11. arXiv:1906.06419  [pdf, other]

    cs.LG stat.ML

    Learning Correlated Latent Representations with Adaptive Priors

    Authors: Da Tang, Dawen Liang, Nicholas Ruozzi, Tony Jebara

    Abstract: Variational Auto-Encoders (VAEs) have been widely applied for learning compact, low-dimensional latent representations of high-dimensional data. When the correlation structure among data points is available, previous work proposed Correlated Variational Auto-Encoders (CVAEs), which employ a structured mixture model as prior and a structured variational posterior for each mixture component to enfor…

    Submitted 18 December, 2019; v1 submitted 14 June, 2019; originally announced June 2019.

    Comments: 16 pages, 1 figure, 5 tables

  12. arXiv:1905.05335  [pdf, other]

    cs.LG stat.ML

    Correlated Variational Auto-Encoders

    Authors: Da Tang, Dawen Liang, Tony Jebara, Nicholas Ruozzi

    Abstract: Variational Auto-Encoders (VAEs) are capable of learning latent representations for high dimensional data. However, due to the i.i.d. assumption, VAEs only optimize the singleton variational distributions and fail to account for the correlations between data points, which might be crucial for learning latent representations from dataset where a priori we know correlations exist. We propose Correla…

    Submitted 17 April, 2020; v1 submitted 13 May, 2019; originally announced May 2019.

    Comments: International Conference on Machine Learning (ICML), 2019

  13. arXiv:1903.02984  [pdf, other]

    cs.LG stat.ML

    The Variational Predictive Natural Gradient

    Authors: Da Tang, Rajesh Ranganath

    Abstract: Variational inference transforms posterior inference into parametric optimization thereby enabling the use of latent variable models where otherwise impractical. However, variational inference can be finicky when different variational parameters control variables that are strongly correlated under the model. Traditional natural gradients based on the variational approximation fail to correct for c…

    Submitted 29 November, 2019; v1 submitted 7 March, 2019; originally announced March 2019.

    Comments: International Conference on Machine Learning (ICML), 2019

  14. arXiv:1807.06651  [pdf, other]

    stat.ML cs.IR cs.LG

    Item Recommendation with Variational Autoencoders and Heterogenous Priors

    Authors: Giannis Karamanolakis, Kevin Raji Cherian, Ananth Ravi Narayan, Jie Yuan, Da Tang, Tony Jebara

    Abstract: In recent years, Variational Autoencoders (VAEs) have been shown to be highly effective in both standard collaborative filtering applications and extensions such as incorporation of implicit feedback. We extend VAEs to collaborative filtering with side information, for instance when ratings are combined with explicit text feedback from the user. Instead of using a user-agnostic standard Gaussian p…

    Submitted 6 October, 2018; v1 submitted 17 July, 2018; originally announced July 2018.

    Comments: Accepted for the 3rd Workshop on Deep Learning for Recommender Systems (DLRS 2018), held in conjunction with the 12th ACM Conference on Recommender Systems (RecSys 2018) in Vancouver, Canada

  15. arXiv:1611.00838  [pdf, other]

    stat.ML cs.CV cs.LG

    Initialization and Coordinate Optimization for Multi-way Matching

    Authors: Da Tang, Tony Jebara

    Abstract: We consider the problem of consistently matching multiple sets of elements to each other, which is a common task in fields such as computer vision. To solve the underlying NP-hard objective, existing methods often relax or approximate it, but end up with unsatisfying empirical performance due to a misaligned objective. We propose a coordinate update algorithm that directly optimizes the target obj…

    Submitted 18 July, 2019; v1 submitted 2 November, 2016; originally announced November 2016.

    Comments: Artificial Intelligence and Statistics (AISTATS), 2017

  16. arXiv:1512.06273  [pdf, ps, other]

    stat.AP

    A Marked Cox Model for IBNR Claims: Model and Theory

    Authors: Andrei L. Badescu, X. Sheldon Lin, Dameng Tang

    Abstract: Incurred but not reported (IBNR) loss reserving is an important issue for Property & Casualty (P&C) insurers. The modeling of the claim arrival process, especially its temporal dependence, has not been closely examined in many of the current loss reserving models. In this paper, we propose modeling the claim arrival process together with its reporting delays as a marked Cox process. Our model is…

    Submitted 19 December, 2015; originally announced December 2015.

    Comments: 25 pages, working paper