Showing 1–15 of 15 results for author: You, K

  1. arXiv:2312.10072

    cs.HC cs.AI cs.LG stat.AP

    Assessing the Usability of GutGPT: A Simulation Study of an AI Clinical Decision Support System for Gastrointestinal Bleeding Risk

    Authors: Colleen Chan, Kisung You, Sunny Chung, Mauro Giuffrè, Theo Saarinen, Niroop Rajashekar, Yuan Pu, Yeo Eun Shin, Loren Laine, Ambrose Wong, René Kizilcec, Jasjeet Sekhon, Dennis Shung

    Abstract: Applications of large language models (LLMs) like ChatGPT have potential to enhance clinical decision support through conversational interfaces. However, challenges of human-algorithmic interaction and clinician trust are poorly understood. GutGPT, an LLM for gastrointestinal (GI) bleeding risk prediction and management guidance, was deployed in clinical simulation scenarios alongside the electroni…

    Submitted 6 December, 2023; originally announced December 2023.

    Comments: Extended Abstract presented at Machine Learning for Health (ML4H) symposium 2023, December 10, 2023, New Orleans, United States, 11 pages

  2. arXiv:2307.15213

    stat.ME stat.ML

    PCA, SVD, and Centering of Data

    Authors: Donggun Kim, Kisung You

    Abstract: The research detailed in this paper scrutinizes Principal Component Analysis (PCA), a seminal method employed in statistics and machine learning for the purpose of reducing data dimensionality. Singular Value Decomposition (SVD) is often employed as the primary means for computing PCA, a process that indispensably includes the step of centering - the subtraction of the mean location from the data…

    Submitted 1 April, 2024; v1 submitted 27 July, 2023; originally announced July 2023.

    Comments: 16 pages, 2 figures

  3. arXiv:2209.03318

    stat.ME stat.CO

    On the Wasserstein median of probability measures

    Authors: Kisung You, Dennis Shung

    Abstract: Measures of central tendency such as mean and median are a primary way to summarize a given collection of random objects. In the field of optimal transport, the Wasserstein barycenter corresponds to Fréchet or geometric mean of a set of probability measures, which is defined as a minimizer of the sum of squared distances to each element in a given set when the order is 2. We present the Wasserstei…

    Submitted 8 September, 2022; v1 submitted 7 September, 2022; originally announced September 2022.

    Comments: 27 pages, 9 figures

    MSC Class: 49Q22

  4. arXiv:2208.12435

    stat.ME stat.AP

    Comparing multiple latent space embeddings using topological analysis

    Authors: Kisung You, Ilmun Kim, Ick Hoon Jin, Minjeong Jeon, Dennis Shung

    Abstract: The latent space model is one of the well-known methods for statistical inference of network data. While the model has been much studied for a single network, it has not attracted much attention to analyze collectively when multiple networks and their latent embeddings are present. We adopt a topology-based representation of latent space embeddings to learn over a population of network model fits,…

    Submitted 26 August, 2022; originally announced August 2022.

    Comments: 46 pages, 11 figures

  5. arXiv:2208.11929

    stat.ME

    On the spherical Laplace distribution

    Authors: Kisung You, Dennis Shung

    Abstract: The von Mises-Fisher (vMF) distribution has long been a mainstay for inference with data on the unit hypersphere in directional statistics. The performance of statistical inference based on the vMF distribution, however, may suffer when there are significant outliers and noise in the data. Based on an analogy of the median as a robust measure of central tendency and its relationship to the Laplace…

    Submitted 7 September, 2022; v1 submitted 25 August, 2022; originally announced August 2022.

    Comments: 28 pages, 6 figures

    MSC Class: 62F10; 62H11; 62H12; 62H30; 62R30

  6. arXiv:2112.02580

    stat.ME math.ST

    Bayesian Optimal Two-sample Tests in High-dimension

    Authors: Kyoungjae Lee, Kisung You, Lizhen Lin

    Abstract: We propose optimal Bayesian two-sample tests for testing equality of high-dimensional mean vectors and covariance matrices between two populations. In many applications including genomics and medical imaging, it is natural to assume that only a few entries of two mean vectors or covariance matrices are different. Many existing tests that rely on aggregating the difference between empirical means o…

    Submitted 5 December, 2021; originally announced December 2021.

  7. Parameter Estimation and Model-Based Clustering with Spherical Normal Distribution on the Unit Hypersphere

    Authors: Kisung You

    Abstract: In directional statistics, the von Mises-Fisher (vMF) distribution is one of the most basic and popular probability distributions for data on the unit hypersphere. Recently, the spherical normal (SN) distribution was proposed as an intrinsic counterpart to the vMF distribution by replacing the standard Euclidean norm with the great-circle distance, which is the shortest path joining two points on…

    Submitted 11 June, 2021; originally announced June 2021.

  8. arXiv:2106.02096

    stat.ML cs.LG

    Shape-Preserving Dimensionality Reduction: An Algorithm and Measures of Topological Equivalence

    Authors: Byeongsu Yu, Kisung You

    Abstract: We introduce a linear dimensionality reduction technique preserving topological features via persistent homology. The method is designed to find linear projection $L$ which preserves the persistent diagram of a point cloud $\mathbb{X}$ via simulated annealing. The projection $L$ induces a set of canonical simplicial maps from the Rips (or Čech) filtration of $\mathbb{X}$ to that of $L\mathbb{X}$.…

    Submitted 13 June, 2021; v1 submitted 3 June, 2021; originally announced June 2021.

    Comments: 18 pages, 2 figures

  9. Rdimtools: An R package for Dimension Reduction and Intrinsic Dimension Estimation

    Authors: Kisung You

    Abstract: Discovering patterns of the complex high-dimensional data is a long-standing problem. Dimension Reduction (DR) and Intrinsic Dimension Estimation (IDE) are two fundamental thematic programs that facilitate geometric understanding of the data. We present Rdimtools - an R package that supports 133 DR and 17 IDE algorithms whose extent makes multifaceted scrutiny of the data in one place easier. Rdim…

    Submitted 22 May, 2020; originally announced May 2020.

  10. arXiv:2003.00433

    cs.LG cs.MA math.OC stat.ML

    Fully Asynchronous Policy Evaluation in Distributed Reinforcement Learning over Networks

    Authors: Xingyu Sha, Jiaqi Zhang, Keyou You, Kaiqing Zhang, Tamer Başar

    Abstract: This paper proposes a \emph{fully asynchronous} scheme for the policy evaluation problem of distributed reinforcement learning (DisRL) over directed peer-to-peer networks. Without waiting for any other node of the network, each node can locally update its value function at any time by using (possibly delayed) information from its neighbors. This is in sharp contrast to the gossip-based scheme wher…

    Submitted 22 January, 2021; v1 submitted 1 March, 2020; originally announced March 2020.

  11. Data transforming augmentation for heteroscedastic models

    Authors: Hyungsuk Tak, Kisung You, Sujit K. Ghosh, Bingyue Su, Joseph Kelly

    Abstract: Data augmentation (DA) turns seemingly intractable computational problems into simple ones by augmenting latent missing data. In addition to computational simplicity, it is now well-established that DA equipped with a deterministic transformation can improve the convergence speed of iterative algorithms such as an EM algorithm or Gibbs sampler. In this article, we outline a framework for the trans…

    Submitted 27 January, 2020; v1 submitted 6 November, 2019; originally announced November 2019.

  12. arXiv:1909.02712

    cs.LG cs.DC cs.MA eess.SY stat.ML

    Decentralized Stochastic Gradient Tracking for Non-convex Empirical Risk Minimization

    Authors: Jiaqi Zhang, Keyou You

    Abstract: This paper studies a decentralized stochastic gradient tracking (DSGT) algorithm for non-convex empirical risk minimization problems over a peer-to-peer network of nodes, which is in sharp contrast to the existing DSGT only for convex problems. To ensure exact convergence and handle the variance among decentralized datasets, each node performs a stochastic gradient (SG) tracking step by using a mi…

    Submitted 28 August, 2020; v1 submitted 6 September, 2019; originally announced September 2019.

    Comments: This paper has been revised and theoretical results are improved

  13. arXiv:1908.01878

    cs.LG stat.ML

    How Does Learning Rate Decay Help Modern Neural Networks?

    Authors: Kaichao You, Mingsheng Long, Jianmin Wang, Michael I. Jordan

    Abstract: Learning rate decay (lrDecay) is a \emph{de facto} technique for training modern neural networks. It starts with a large learning rate and then decays it multiple times. It is empirically observed to help both optimization and generalization. Common beliefs in how lrDecay works come from the optimization analysis of (Stochastic) Gradient Descent: 1) an initially large learning rate accelerates tra…

    Submitted 26 September, 2019; v1 submitted 5 August, 2019; originally announced August 2019.

    Comments: title changed

  14. arXiv:1810.05297

    stat.AP

    Bayesian Hierarchical Spatial Model for Small Area Estimation with Non-ignorable Nonresponses and Its Applications to the NHANES Dental Caries Assessments

    Authors: Ick Hoon Jin, Fang Liu, Evercita C. Eugenio, Kisung You, Suyu Liu

    Abstract: The National Health and Nutrition Examination Survey (NHANES) is a major program of the National Center for Health Statistics, designed to assess the health and nutritional status of adults and children in the United States. The analysis of NHANES dental caries data faces several challenges, including (1) the data were collected using a complex, multistage, stratified, unequal-probability sampling…

    Submitted 14 October, 2019; v1 submitted 11 October, 2018; originally announced October 2018.

  15. arXiv:1810.02906

    stat.ML cs.LG

    Network Distance Based on Laplacian Flows on Graphs

    Authors: Dianbin Bao, Kisung You, Lizhen Lin

    Abstract: Distance plays a fundamental role in measuring similarity between objects. Various visualization techniques and learning tasks in statistics and machine learning such as shape matching, classification, dimension reduction and clustering often rely on some distance or similarity measure. It is of tremendous importance to have a distance that can incorporate the underlying structure of the object. I…

    Submitted 5 October, 2018; originally announced October 2018.