Showing 1–12 of 12 results for author: Do, K

Searching in archive stat.
  1. arXiv:2406.01557  [pdf, other]

    stat.ME stat.AP

    Bayesian compositional regression with flexible microbiome feature aggregation and selection

    Authors: Satabdi Saha, Liangliang Zhang, Kim-Anh Do, Christine B. Peterson

    Abstract: Ongoing advances in microbiome profiling have allowed unprecedented insights into the molecular activities of microbial communities. This has fueled a strong scientific interest in understanding the critical role the microbiome plays in governing human health, by identifying microbial features associated with clinical outcomes of interest. Several aspects of microbiome data limit the applicability…

    Submitted 3 June, 2024; originally announced June 2024.

  2. arXiv:2309.08109  [pdf, other]

    stat.ME

    CAT: a conditional association test for microbiome data using a leave-out approach

    Authors: Yushu Shi, Liangliang Zhang, Kim-Anh Do, Robert R. Jenq, Christine B. Peterson

    Abstract: In microbiome analysis, researchers often seek to identify taxonomic features associated with an outcome of interest. However, microbiome features are intercorrelated and linked by phylogenetic relationships, making it challenging to assess the association between an individual feature and an outcome. Researchers have developed global tests for the association of microbiome profiles with outcomes…

    Submitted 14 September, 2023; originally announced September 2023.

  3. arXiv:2308.13737  [pdf, other]

    stat.AP

    survivalContour: Visualizing predicted survival via colored contour plots

    Authors: Yushu Shi, Liangliang Zhang, Kim-Anh Do, Robert R. Jenq, Christine B. Peterson

    Abstract: Advances in survival analysis have facilitated unprecedented flexibility in data modeling, yet there remains a lack of tools for graphically illustrating the influence of continuous covariates on predicted survival outcomes. We propose the utilization of a colored contour plot to depict the predicted survival probabilities over time, and provide a Shiny app and R package as implementations of this…

    Submitted 12 January, 2024; v1 submitted 25 August, 2023; originally announced August 2023.

  4. arXiv:2207.14753  [pdf, other]

    stat.ME

    Estimating Causal Effects with Hidden Confounding using Instrumental Variables and Environments

    Authors: James P. Long, Hongxu Zhu, Kim-Anh Do, Min Jin Ha

    Abstract: Recent works have proposed regression models which are invariant across data collection environments. These estimators often have a causal interpretation under conditions on the environments and type of invariance imposed. One recent example, the Causal Dantzig (CD), is consistent under hidden confounding and represents an alternative to classical instrumental variable estimators such as Two Stage…

    Submitted 9 November, 2023; v1 submitted 29 July, 2022; originally announced July 2022.

    Comments: 32 pages, 7 figures, 4 tables

  5. arXiv:2207.09991  [pdf, other]

    stat.AP q-bio.QM

    Causal Models, Prediction, and Extrapolation in Cell Line Perturbation Experiments

    Authors: James P. Long, Yumeng Yang, Kim-Anh Do

    Abstract: In cell line perturbation experiments, a collection of cells is perturbed with external agents (e.g. drugs) and responses such as protein expression measured. Due to cost constraints, only a small fraction of all possible perturbations can be tested in vitro. This has led to the development of computational (in silico) models which can predict cellular responses to perturbations. Perturbations wit…

    Submitted 20 July, 2022; originally announced July 2022.

    Comments: 13 pages, 4 figures

  6. arXiv:2011.06061  [pdf, other]

    stat.ME

    A Framework for Mediation Analysis with Multiple Exposures, Multivariate Mediators, and Non-Linear Response Models

    Authors: James P. Long, Ehsan Irajizad, James D. Doecke, Kim-Anh Do, Min Jin Ha

    Abstract: Mediation analysis seeks to identify and quantify the paths by which an exposure affects an outcome. Intermediate variables which are affected by the exposure and which affect the outcome are known as mediators. There exists extensive work on mediation analysis in the context of models with a single mediator and continuous and binary outcomes. However, these methods are often not suitable for multi…

    Submitted 11 November, 2020; originally announced November 2020.

    Comments: 17 pages, 5 figures

  7. arXiv:2007.15812  [pdf, other]

    stat.AP

    Sparse tree-based clustering of microbiome data to characterize microbiome heterogeneity in pancreatic cancer

    Authors: Yushu Shi, Liangliang Zhang, Kim-Anh Do, Robert Jenq, Christine Peterson

    Abstract: There is a keen interest in characterizing variation in the microbiome across cancer patients, given increasing evidence of its important role in determining treatment outcomes. Here our goal is to discover subgroups of patients with similar microbiome profiles. We propose a novel unsupervised clustering approach in the Bayesian framework that innovates over existing model-based clustering approac…

    Submitted 2 December, 2022; v1 submitted 30 July, 2020; originally announced July 2020.

  8. arXiv:2005.01169  [pdf, ps, other]

    stat.ME stat.AP

    ProgPermute: Progressive permutation for a dynamic representation of the robustness of microbiome discoveries

    Authors: Liangliang Zhang, Yushu Shi, Kim-Anh Do, Christine B. Peterson, Robert R. Jenq

    Abstract: Identification of features is a critical task in microbiome studies that is complicated by the fact that microbial data are high dimensional and heterogeneous. Masked by the complexity of the data, the problem of separating signals from noise becomes challenging and troublesome. For instance, when performing differential abundance tests, multiple testing adjustments tend to be overconservative, as…

    Submitted 17 September, 2020; v1 submitted 3 May, 2020; originally announced May 2020.

    Comments: 8 pages, 6 figures

  9. arXiv:1908.09961  [pdf, other]

    cs.LG stat.ML

    Theory and Evaluation Metrics for Learning Disentangled Representations

    Authors: Kien Do, Truyen Tran

    Abstract: We make two theoretical contributions to disentanglement learning by (a) defining precise semantics of disentangled representations, and (b) establishing robust metrics for evaluation. First, we characterize the concept "disentangled representations" used in supervised and unsupervised methods along three dimensions-informativeness, separability and interpretability - which can be expressed and qu…

    Submitted 18 March, 2021; v1 submitted 26 August, 2019; originally announced August 2019.

  10. NExUS: Bayesian simultaneous network estimation across unequal sample sizes

    Authors: Priyam Das, Christine Peterson, Kim-Anh Do, Rehan Akbani, Veerabhadran Baladandayuthapani

    Abstract: Network-based analyses of high-throughput genomics data provide a holistic, systems-level understanding of various biological mechanisms for a common population. However, when estimating multiple networks across heterogeneous sub-populations, varying sample sizes pose a challenge in the estimation and inference, as network differences may be driven by differences in power. We are particularly inte…

    Submitted 6 November, 2018; originally announced November 2018.

    Comments: 8 pages, 8 figures

  11. arXiv:1804.00293  [pdf, other]

    cs.LG cs.AI stat.ML

    Attentional Multilabel Learning over Graphs: A Message Passing Approach

    Authors: Kien Do, Truyen Tran, Thin Nguyen, Svetha Venkatesh

    Abstract: We address a largely open problem of multilabel classification over graphs. Unlike traditional vector input, a graph has rich variable-size substructures which are related to the labels in some ways. We believe that uncovering these relations might hold the key to classification performance and explainability. We introduce GAML (Graph Attentional Multi-Label learning), a novel graph neural network…

    Submitted 11 April, 2018; v1 submitted 1 April, 2018; originally announced April 2018.

  12. arXiv:1608.04830  [pdf, other]

    stat.ML cs.LG

    Outlier Detection on Mixed-Type Data: An Energy-based Approach

    Authors: Kien Do, Truyen Tran, Dinh Phung, Svetha Venkatesh

    Abstract: Outlier detection amounts to finding data points that differ significantly from the norm. Classic outlier detection methods are largely designed for single data type such as continuous or discrete. However, real world data is increasingly heterogeneous, where a data point can have both discrete and continuous attributes. Handling mixed-type data in a disciplined way remains a great challenge. In t…

    Submitted 16 August, 2016; originally announced August 2016.