Showing 1–50 of 264 results for author: Kim, J

Searching in archive stat.

  1. arXiv:2406.12409  [pdf, other]

    stat.ML cs.LG

    Translation Equivariant Transformer Neural Processes

    Authors: Matthew Ashman, Cristiana Diaconu, Junhyuck Kim, Lakee Sivaraya, Stratis Markou, James Requeima, Wessel P. Bruinsma, Richard E. Turner

    Abstract: The effectiveness of neural processes (NPs) in modelling posterior prediction maps -- the mapping from data to posterior predictive distributions -- has significantly improved since their inception. This improvement can be attributed to two principal factors: (1) advancements in the architecture of permutation invariant set functions, which are intrinsic to all NPs; and (2) leveraging symmetries p…

    Submitted 18 June, 2024; originally announced June 2024.

  2. arXiv:2406.10087  [pdf]

    cs.LG cs.AI stat.ML

    Biomarker based Cancer Classification using an Ensemble with Pre-trained Models

    Authors: Chongmin Lee, Jihie Kim

    Abstract: Certain cancer types, namely pancreatic cancer, are difficult to detect at an early stage, sparking the importance of discovering the causal relationship between biomarkers and cancer to identify cancer efficiently. By allowing for the detection and monitoring of specific biomarkers through a non-invasive method, liquid biopsies enhance the precision and efficacy of medical interventions, advocating…

    Submitted 14 June, 2024; originally announced June 2024.

    Comments: Accepted to the AIAA Workshop at IJCAI 2024

  3. arXiv:2406.08097  [pdf, other]

    cs.LG stat.AP stat.ME

    Inductive Global and Local Manifold Approximation and Projection

    Authors: Jungeum Kim, Xiao Wang

    Abstract: Nonlinear dimensional reduction with the manifold assumption, often called manifold learning, has proven its usefulness in a wide range of high-dimensional data analysis. The significant impact of t-SNE and UMAP has catalyzed intense research interest, seeking further innovations toward visualizing not only the local but also the global structure information of the data. Moreover, there have been…

    Submitted 12 June, 2024; originally announced June 2024.

  4. arXiv:2406.02840  [pdf, other]

    stat.ME math.OC math.ST

    Statistical inference of convex order by Wasserstein projection

    Authors: Jakwang Kim, Young-Heon Kim, Yuanlong Ruan, Andrew Warren

    Abstract: Ranking distributions according to a stochastic order has wide applications in diverse areas. Although stochastic dominance has received much attention, convex order, particularly in general dimensions, has yet to be investigated from a statistical point of view. This article addresses this gap by introducing a simple statistical test for convex order based on the Wasserstein projection distance. T…

    Submitted 4 June, 2024; originally announced June 2024.

    MSC Class: 62G10; 49K27

  5. arXiv:2405.19703  [pdf, other]

    cs.LG cs.CV stat.ML

    Towards a Better Evaluation of Out-of-Domain Generalization

    Authors: Duhun Hwang, Suhyun Kang, Moonjung Eo, Jimyeong Kim, Wonjong Rhee

    Abstract: The objective of Domain Generalization (DG) is to devise algorithms and models capable of achieving high performance on previously unseen test distributions. In the pursuit of this objective, average measure has been employed as the prevalent measure for evaluating models and comparing algorithms in the existing DG studies. Despite its significance, a comprehensive exploration of the average measu…

    Submitted 2 June, 2024; v1 submitted 30 May, 2024; originally announced May 2024.

  6. arXiv:2405.03083  [pdf, other]

    stat.ME cs.LG stat.ML

    Causal K-Means Clustering

    Authors: Kwangho Kim, Jisu Kim, Edward H. Kennedy

    Abstract: Causal effects are often characterized with population summaries. These might provide an incomplete picture when there are heterogeneous treatment effects across subgroups. Since the subgroup structure is typically unknown, it is more challenging to identify and evaluate subgroup effects than population effects. We propose a new solution to this problem: Causal k-Means Clustering, which harnesses…

    Submitted 29 June, 2024; v1 submitted 5 May, 2024; originally announced May 2024.

  7. arXiv:2404.16168  [pdf, other]

    cs.LG cs.AI stat.ML

    The Over-Certainty Phenomenon in Modern UDA Algorithms

    Authors: Fin Amin, Jung-Eun Kim

    Abstract: When neural networks are confronted with unfamiliar data that deviate from their training set, this signifies a domain shift. While these networks output predictions on their inputs, they typically fail to account for their level of familiarity with these novel observations. This challenge becomes even more pronounced in resource-constrained settings, such as embedded systems or edge devices. To a…

    Submitted 27 May, 2024; v1 submitted 24 April, 2024; originally announced April 2024.

  8. arXiv:2404.14202  [pdf, other]

    cs.LG stat.ML

    An Adaptive Approach for Infinitely Many-armed Bandits under Generalized Rotting Constraints

    Authors: Jung-hun Kim, Milan Vojnovic, Se-Young Yun

    Abstract: In this study, we consider the infinitely many-armed bandit problems in a rested rotting setting, where the mean reward of an arm may decrease with each pull, while otherwise, it remains unchanged. We explore two scenarios regarding the rotting of rewards: one in which the cumulative amount of rotting is bounded by $V_T$, referred to as the slow-rotting case, and the other in which the cumulative…

    Submitted 24 May, 2024; v1 submitted 22 April, 2024; originally announced April 2024.

  9. arXiv:2404.10436  [pdf, other]

    cs.LG stat.CO stat.ME

    Tree Bandits for Generative Bayes

    Authors: Sean O'Hagan, Jungeum Kim, Veronika Rockova

    Abstract: In generative models with obscured likelihood, Approximate Bayesian Computation (ABC) is often the tool of last resort for inference. However, ABC demands many prior parameter trials to keep only a small fraction that passes an acceptance test. To accelerate ABC rejection sampling, this paper develops a self-aware framework that learns from past trials and errors. We apply recursive partitioning c…

    Submitted 16 April, 2024; originally announced April 2024.

  10. arXiv:2404.01076  [pdf, other]

    stat.ME

    Debiased calibration estimation using generalized entropy in survey sampling

    Authors: Yonghyun Kwon, Jae Kwang Kim, Yumou Qiu

    Abstract: Incorporating the auxiliary information into the survey estimation is a fundamental problem in survey sampling. Calibration weighting is a popular tool for incorporating the auxiliary information. The calibration weighting method of Deville and Sarndal (1992) uses a distance measure between the design weights and the final weights to solve the optimization problem with calibration constraints. Thi…

    Submitted 2 April, 2024; v1 submitted 1 April, 2024; originally announced April 2024.

  11. arXiv:2402.07341  [pdf, other]

    stat.ML cs.LG

    Noise-Adaptive Confidence Sets for Linear Bandits and Application to Bayesian Optimization

    Authors: Kwang-Sung Jun, Jungtaek Kim

    Abstract: Adapting to a priori unknown noise level is a very important but challenging problem in sequential decision-making as efficient exploration typically requires knowledge of the noise level, which is often loosely specified. We report significant progress in addressing this issue for linear bandits in two respects. First, we propose a novel confidence set that is `semi-adaptive' to the unknown sub-G…

    Submitted 7 June, 2024; v1 submitted 11 February, 2024; originally announced February 2024.

    Comments: accepted to ICML'24; fixed typos

  12. arXiv:2402.04582  [pdf, other]

    stat.AP stat.ML

    Dimensionality reduction can be used as a surrogate model for high-dimensional forward uncertainty quantification

    Authors: Jungho Kim, Sang-ri Yi, Ziqi Wang

    Abstract: We introduce a method to construct a stochastic surrogate model from the results of dimensionality reduction in forward uncertainty quantification. The hypothesis is that the high-dimensional input augmented by the output of a computational model admits a low-dimensional representation. This assumption can be met by numerous uncertainty quantification applications with physics-based computational…

    Submitted 6 February, 2024; originally announced February 2024.

  13. arXiv:2402.01258  [pdf, other]

    stat.ML cs.LG

    Transformers Learn Nonlinear Features In Context: Nonconvex Mean-field Dynamics on the Attention Landscape

    Authors: Juno Kim, Taiji Suzuki

    Abstract: Large language models based on the Transformer architecture have demonstrated impressive capabilities to learn in context. However, existing theoretical studies on how this phenomenon arises are limited to the dynamics of a single layer of attention trained on linear regression tasks. In this paper, we study the optimization of a Transformer consisting of a fully connected layer followed by a line…

    Submitted 2 June, 2024; v1 submitted 2 February, 2024; originally announced February 2024.

    Comments: ICML 2024 Oral

  14. arXiv:2401.12517  [pdf, other]

    cs.LG stat.ML

    DDMI: Domain-Agnostic Latent Diffusion Models for Synthesizing High-Quality Implicit Neural Representations

    Authors: Dogyun Park, Sihyeon Kim, Sojin Lee, Hyunwoo J. Kim

    Abstract: Recent studies have introduced a new class of generative models for synthesizing implicit neural representations (INRs) that capture arbitrary continuous signals in various domains. These models opened the door for domain-agnostic generative models, but they often fail to achieve high-quality generation. We observed that the existing methods generate the weights of neural networks to parameterize…

    Submitted 20 March, 2024; v1 submitted 23 January, 2024; originally announced January 2024.

  15. arXiv:2401.09191  [pdf, other]

    cs.LG math.OC stat.ML

    An Optimal Transport Approach for Computing Adversarial Training Lower Bounds in Multiclass Classification

    Authors: Nicolas Garcia Trillos, Matt Jacobs, Jakwang Kim, Matthew Werenski

    Abstract: Despite the success of deep learning-based algorithms, it is widely known that neural networks may fail to be robust. A popular paradigm to enforce robustness is adversarial training (AT), however, this introduces many computational and theoretical difficulties. Recent works have developed a connection between AT in the multiclass classification setting and multimarginal optimal transport (MOT), u…

    Submitted 17 January, 2024; originally announced January 2024.

  16. arXiv:2401.09028  [pdf, other]

    stat.AP

    A Novel Interpretable Fusion Analytic Framework for Investigating Functional Brain Connectivity Differences in Cognitive Impairments

    Authors: Yeseul Jeon, Jeong-Jae Kim, SuMin Yu, Junggu Choi, Sanghoon Han

    Abstract: Functional magnetic resonance imaging (fMRI) data is characterized by its complexity and high-dimensionality, encompassing signals from various regions of interest (ROIs) that exhibit intricate correlations. Analyzing fMRI data directly proves challenging due to its intricate structure. Nevertheless, ROIs convey crucial information about brain activities through their connections, offering insig…

    Submitted 17 January, 2024; originally announced January 2024.

    Comments: arXiv admin note: text overlap with arXiv:2207.01581

  17. arXiv:2401.07625  [pdf, ps, other]

    stat.ME

    Statistics in Survey Sampling

    Authors: Jae Kwang Kim

    Abstract: Survey sampling theory and methods are introduced. Sampling designs and estimation methods are carefully discussed as a textbook for survey sampling. Topics include Horvitz-Thompson estimation, simple random sampling, stratified sampling, cluster sampling, ratio estimation, regression estimation, variance estimation, two-phase sampling, and nonresponse adjustment methods.

    Submitted 11 June, 2024; v1 submitted 15 January, 2024; originally announced January 2024.

  18. arXiv:2401.01981  [pdf, other]

    cs.LG stat.ML

    Beyond Regrets: Geometric Metrics for Bayesian Optimization

    Authors: Jungtaek Kim

    Abstract: Bayesian optimization is a principled optimization strategy for a black-box objective function. It shows its effectiveness in a wide variety of real-world applications such as scientific discovery and experimental design. In general, the performance of Bayesian optimization is reported through regret-based metrics such as instantaneous, simple, and cumulative regrets. These metrics only rely on fu…

    Submitted 11 March, 2024; v1 submitted 3 January, 2024; originally announced January 2024.

  19. arXiv:2312.09634  [pdf, other]

    stat.ML cs.LG

    Vectorizing string entries for data processing on tables: when are larger language models better?

    Authors: Léo Grinsztajn, Edouard Oyallon, Myung Jun Kim, Gaël Varoquaux

    Abstract: There are increasingly efficient data processing pipelines that work on vectors of numbers, for instance most machine learning models, or vector databases for fast similarity search. These require converting the data to numbers. While this conversion is easy for simple numerical and categorical entries, databases are rife with text entries, such as names or descriptions. In the age of large lang…

    Submitted 15 December, 2023; originally announced December 2023.

  20. arXiv:2312.05411  [pdf, other]

    stat.ME stat.CO stat.ML

    Deep Bayes Factors

    Authors: Jungeum Kim, Veronika Rockova

    Abstract: There is no other model or hypothesis verification tool in Bayesian statistics that is as widely used as the Bayes factor. We focus on generative models that are likelihood-free and, therefore, render the computation of Bayes factors (marginal likelihood ratios) far from obvious. We propose a deep learning estimator of the Bayes factor based on simulated data from two competing models using the like…

    Submitted 12 June, 2024; v1 submitted 8 December, 2023; originally announced December 2023.

  21. arXiv:2312.01133  [pdf, other]

    stat.ML cs.LG

    $t^3$-Variational Autoencoder: Learning Heavy-tailed Data with Student's t and Power Divergence

    Authors: Juno Kim, Jaehyuk Kwon, Mincheol Cho, Hyunjong Lee, Joong-Ho Won

    Abstract: The variational autoencoder (VAE) typically employs a standard normal prior as a regularizer for the probabilistic latent encoder. However, the Gaussian tail often decays too quickly to effectively accommodate the encoded points, failing to preserve crucial structures hidden in the data. In this paper, we explore the use of heavy-tailed models to combat over-regularization. Drawing upon insights f…

    Submitted 3 March, 2024; v1 submitted 2 December, 2023; originally announced December 2023.

    Comments: ICLR 2024; 27 pages, 7 figures, 8 tables

  22. arXiv:2312.01127  [pdf, other]

    math.OC stat.ML

    Symmetric Mean-field Langevin Dynamics for Distributional Minimax Problems

    Authors: Juno Kim, Kakei Yamamoto, Kazusato Oko, Zhuoran Yang, Taiji Suzuki

    Abstract: In this paper, we extend mean-field Langevin dynamics to minimax optimization over probability distributions for the first time with symmetric and provably convergent updates. We propose mean-field Langevin averaged gradient (MFL-AG), a single-loop algorithm that implements gradient descent ascent in the distribution spaces with a novel weighted averaging, and establish average-iterate convergence…

    Submitted 16 February, 2024; v1 submitted 2 December, 2023; originally announced December 2023.

    Comments: ICLR 2024 spotlight

  23. Supervised low-rank semi-nonnegative matrix factorization with frequency regularization for forecasting spatio-temporal data

    Authors: Keunsu Kim, Hanbaek Lyu, Jinsu Kim, Jae-Hun Jung

    Abstract: We propose a novel methodology for forecasting spatio-temporal data using supervised semi-nonnegative matrix factorization (SSNMF) with frequency regularization. Matrix factorization is employed to decompose spatio-temporal data into spatial and temporal components. To improve clarity in the temporal patterns, we introduce a nonnegativity constraint on the time domain along with regularization in…

    Submitted 19 June, 2024; v1 submitted 14 November, 2023; originally announced November 2023.

    Comments: 35 pages, Final version

    MSC Class: 65F22; 65F55 and 86A04

    Journal ref: Journal of Scientific Computing (2024)

  24. Discordance Minimization-based Imputation Algorithms for Missing Values in Rating Data

    Authors: Young Woong Park, Jinhak Kim, Dan Zhu

    Abstract: Ratings are frequently used to evaluate and compare subjects in various applications, from education to healthcare, because ratings provide succinct yet credible measures for comparing subjects. However, when multiple rating lists are combined or considered together, subjects often have missing ratings, because most rating lists do not rate every subject in the combined list. In this study, we pro…

    Submitted 7 November, 2023; originally announced November 2023.

  25. Topological Learning for Motion Data via Mixed Coordinates

    Authors: Hengrui Luo, Jisu Kim, Alice Patania, Mikael Vejdemo-Johansson

    Abstract: Topology can extract the structural information in a dataset efficiently. In this paper, we attempt to incorporate topological information into a multiple output Gaussian process model for transfer learning purposes. To achieve this goal, we extend the framework of circular coordinates into a novel framework of mixed valued coordinates to take linear trends in the time series into consideration.…

    Submitted 30 October, 2023; originally announced October 2023.

    Comments: 7 pages, 4 figures

    Journal ref: 2021 IEEE International Conference on Big Data (Big Data)

  26. arXiv:2310.19053  [pdf, other]

    cs.LG physics.optics stat.ML

    Datasets and Benchmarks for Nanophotonic Structure and Parametric Design Simulations

    Authors: Jungtaek Kim, Mingxuan Li, Oliver Hinder, Paul W. Leu

    Abstract: Nanophotonic structures have versatile applications including solar cells, anti-reflective coatings, electromagnetic interference shielding, optical filters, and light emitting diodes. To design and understand these nanophotonic structures, electrodynamic simulations are essential. These simulations enable us to model electromagnetic fields over time and calculate optical properties. In this work,…

    Submitted 29 October, 2023; originally announced October 2023.

    Comments: 31 pages, 31 figures, 4 tables. Accepted at the 37th Conference on Neural Information Processing Systems (NeurIPS 2023), Datasets and Benchmarks Track

  27. arXiv:2310.07174  [pdf, other]

    cs.LG stat.ML

    Generalized Neural Sorting Networks with Error-Free Differentiable Swap Functions

    Authors: Jungtaek Kim, Jeongbeen Yoon, Minsu Cho

    Abstract: Sorting is a fundamental operation of all computer systems, having been a long-standing significant research topic. Beyond the problem formulation of traditional sorting algorithms, we consider sorting problems for more abstract yet expressive inputs, e.g., multi-digit images and image fragments, through a neural sorting network. To learn a mapping from a high-dimensional input to an ordinal varia…

    Submitted 13 March, 2024; v1 submitted 10 October, 2023; originally announced October 2023.

    Comments: Accepted at the 12th International Conference on Learning Representations (ICLR 2024)

  28. arXiv:2310.04283  [pdf, other]

    cs.LG math.OC stat.ML

    On the Error-Propagation of Inexact Hotelling's Deflation for Principal Component Analysis

    Authors: Fangshuo Liao, Junhyung Lyle Kim, Cruz Barnum, Anastasios Kyrillidis

    Abstract: Principal Component Analysis (PCA) aims to find subspaces spanned by the so-called principal components that best represent the variance in the dataset. The deflation method is a popular meta-algorithm that sequentially finds individual principal components, starting from the most important ones and working towards the less important ones. However, as deflation proceeds, numerical errors from the…

    Submitted 29 May, 2024; v1 submitted 6 October, 2023; originally announced October 2023.

    Comments: ICML2024

  29. arXiv:2310.00803  [pdf, other]

    stat.ME

    A Bayesian joint model for mediation analysis with matrix-valued mediators

    Authors: Zijin Liu, Zhihui Liu, Ali Hosni, John Kim, Bei Jiang, Olli Saarela

    Abstract: Unscheduled treatment interruptions may lead to reduced quality of care in radiation therapy (RT). Identifying the RT prescription dose effects on the outcome of treatment interruptions, mediated through doses distributed into different organs-at-risk (OARs), can inform future treatment planning. The radiation exposure to OARs can be summarized by a matrix of dose-volume histograms (DVH) for each…

    Submitted 27 June, 2024; v1 submitted 1 October, 2023; originally announced October 2023.

  30. arXiv:2308.05957  [pdf, other]

    cs.SI cs.LG stat.ML

    Node Embedding for Homophilous Graphs with ARGEW: Augmentation of Random walks by Graph Edge Weights

    Authors: Jun Hee Kim, Jaeman Son, Hyunsoo Kim, Eunjo Lee

    Abstract: Representing nodes in a network as dense vectors (node embeddings) is important for understanding a given network and solving many downstream tasks. In particular, for weighted homophilous graphs where similar nodes are connected with larger edge weights, we desire node embeddings where node pairs with strong weights have closer embeddings. Although random walk based node embedding methods like node…

    Submitted 11 August, 2023; originally announced August 2023.

  31. arXiv:2307.11651  [pdf, other]

    stat.ME

    Multiple bias-calibration for adjusting selection bias of non-probability samples using data integration

    Authors: Zhonglei Wang, Shu Yang, Jae Kwang Kim

    Abstract: Valid statistical inference is challenging when the sample is subject to unknown selection bias. Data integration can be used to correct for selection bias when we have a parallel probability sample from the same population with some common measurements. How to model and estimate the selection probability or the propensity score (PS) of a non-probability sample using an independent probability sam…

    Submitted 21 July, 2023; originally announced July 2023.

  32. arXiv:2306.15173  [pdf, other]

    stat.ME

    Robust propensity score weighting estimation under missing at random

    Authors: Hengfang Wang, Jae Kwang Kim, Jeongseop Han, Youngjo Lee

    Abstract: Missing data is frequently encountered in many areas of statistics. Propensity score weighting is a popular method for handling missing data. The propensity score method employs a response propensity model, but correct specification of the statistical model can be challenging in the presence of missing data. Doubly robust estimation is attractive, as the consistency of the estimator is guaranteed…

    Submitted 27 March, 2024; v1 submitted 26 June, 2023; originally announced June 2023.

  33. arXiv:2306.00126  [pdf, other]

    math.ST stat.ML

    On Mixing Rates for Bayesian CART

    Authors: Jungeum Kim, Veronika Rockova

    Abstract: The success of Bayesian inference with MCMC depends critically on Markov chains rapidly reaching the posterior distribution. Despite the plentitude of inferential theory for posteriors in Bayesian non-parametrics, convergence properties of MCMC algorithms that simulate from such ideal inferential targets are not thoroughly understood. This work focuses on the Bayesian CART algorithm which forms a…

    Submitted 31 May, 2023; originally announced June 2023.

  34. arXiv:2305.19809  [pdf, other]

    cs.CV cs.AI cs.LG stat.ML

    Direct Diffusion Bridge using Data Consistency for Inverse Problems

    Authors: Hyungjin Chung, Jeongsol Kim, Jong Chul Ye

    Abstract: Diffusion model-based inverse problem solvers have shown impressive performance, but are limited in speed, mostly as they require reverse diffusion sampling starting from noise. Several recent works have tried to alleviate this problem by building a diffusion process, directly bridging the clean and the corrupted for specific inverse problems. In this paper, we first unify these existing works und…

    Submitted 24 October, 2023; v1 submitted 31 May, 2023; originally announced May 2023.

    Comments: NeurIPS 2023 camera-ready. 16 pages, 6 figures

  35. arXiv:2305.15612  [pdf, other]

    cs.LG stat.ML

    Density Ratio Estimation-based Bayesian Optimization with Semi-Supervised Learning

    Authors: Jungtaek Kim

    Abstract: Bayesian optimization has attracted huge attention from diverse research areas in science and engineering, since it is capable of finding a global optimum of an expensive-to-evaluate black-box function efficiently. In general, a probabilistic regression model, e.g., Gaussian processes and Bayesian neural networks, is widely used as a surrogate function to model an explicit distribution over functi…

    Submitted 6 October, 2023; v1 submitted 24 May, 2023; originally announced May 2023.

    Comments: 20 pages, 14 figures, 2 tables

  36. arXiv:2305.00075  [pdf, ps, other]

    cs.LG math.OC stat.ML

    On the existence of solutions to adversarial training in multiclass classification

    Authors: Nicolas Garcia Trillos, Matt Jacobs, Jakwang Kim

    Abstract: We study three models of the problem of adversarial training in multiclass classification designed to construct robust classifiers against adversarial perturbations of data in the agnostic-classifier setting. We prove the existence of Borel measurable robust classifiers in each model and provide a unified perspective of the adversarial training problem, expanding the connections with optimal trans…

    Submitted 29 May, 2023; v1 submitted 28 April, 2023; originally announced May 2023.

  37. Adaptive active subspace-based metamodeling for high-dimensional reliability analysis

    Authors: Jungho Kim, Ziqi Wang, Junho Song

    Abstract: To address the challenges of reliability analysis in high-dimensional probability spaces, this paper proposes a new metamodeling method that couples active subspace, heteroscedastic Gaussian process, and active learning. The active subspace is leveraged to identify low-dimensional salient features of a high-dimensional computational model. A surrogate computational model is built in the low-dimens…

    Submitted 13 April, 2023; originally announced April 2023.

  38. arXiv:2303.05659  [pdf, other]

    stat.ME

    A marginal structural model for normal tissue complication probability

    Authors: Thai-Son Tang, Zhihui Liu, Ali Hosni, John Kim, Olli Saarela

    Abstract: The goal of radiation therapy for cancer is to deliver prescribed radiation dose to the tumor while minimizing dose to the surrounding healthy tissues. To evaluate treatment plans, the dose distribution to healthy organs is commonly summarized as dose-volume histograms (DVHs). Normal tissue complication probability (NTCP) modelling has centered around making patient-level risk predictions with fea…

    Submitted 23 May, 2024; v1 submitted 9 March, 2023; originally announced March 2023.

  39. arXiv:2303.02201  [pdf, other]

    stat.ME

    Causal Inference using Multivariate Generalized Linear Mixed-Effects Models with Longitudinal Data

    Authors: Yizhen Xu, Jisoo Kim, Laura K. Hummers, Ami A. Shah, Scott Zeger

    Abstract: Dynamic prediction of causal effects under different treatment regimes conditional on an individual's characteristics and longitudinal history is an essential problem in precision medicine. This is challenging in practice because outcomes and treatment assignment mechanisms are unknown in observational studies, an individual's treatment efficacy is a counterfactual, and the existence of selection…

    Submitted 3 March, 2023; originally announced March 2023.

    Comments: 19 pages, 5 figures

  40. arXiv:2303.01052  [pdf, other]

    cs.LG cs.AI cs.CV stat.ME

    Demystifying Causal Features on Adversarial Examples and Causal Inoculation for Robust Network by Adversarial Instrumental Variable Regression

    Authors: Junho Kim, Byung-Kwan Lee, Yong Man Ro

    Abstract: The origin of adversarial examples is still inexplicable in research fields, and it arouses arguments from various viewpoints, despite comprehensive investigations. In this paper, we propose a way of delving into the unexpected vulnerability in adversarially trained networks from a causal perspective, namely adversarial instrumental variable (IV) regression. By deploying it, we estimate the causal…

    Submitted 2 March, 2023; originally announced March 2023.

    Comments: Accepted in CVPR 2023

  41. arXiv:2302.05025  [pdf, ps, other]

    stat.ML cs.LG

    Hessian Based Smoothing Splines for Manifold Learning

    Authors: Juno Kim

    Abstract: We propose a multidimensional smoothing spline algorithm in the context of manifold learning. We generalize the bending energy penalty of thin-plate splines to a quadratic form on the Sobolev space of a flat manifold, based on the Frobenius norm of the Hessian matrix. This leads to a natural definition of smoothing splines on manifolds, which minimizes square error while optimizing a global curvat…

    Submitted 9 February, 2023; originally announced February 2023.

    Comments: 20 pages

  42. arXiv:2211.15063  [pdf, other]

    stat.ME math.ST

    High dimensional discriminant rules with shrinkage estimators of the covariance matrix and mean vector

    Authors: Jaehoan Kim, Hoyoung Park, Junyong Park

    Abstract: Linear discriminant analysis (LDA) is a typical method for classification problems with large dimensions and small samples. There are various types of LDA methods that are based on the different types of estimators for the covariance matrices and mean vectors. In this paper, we consider shrinkage methods based on a non-parametric approach. For the precision matrix, methods based on the sparsity st…

    Submitted 5 March, 2023; v1 submitted 27 November, 2022; originally announced November 2022.

    Comments: 43 pages, 2 figures

  43. arXiv:2211.10656  [pdf, other]

    cs.CV cs.LG stat.ML

    Parallel Diffusion Models of Operator and Image for Blind Inverse Problems

    Authors: Hyungjin Chung, Jeongsol Kim, Sehui Kim, Jong Chul Ye

    Abstract: Diffusion model-based inverse problem solvers have demonstrated state-of-the-art performance in cases where the forward operator is known (i.e. non-blind). However, the applicability of the method to blind inverse problems has yet to be explored. In this work, we show that we can indeed solve a family of blind inverse problems by constructing another diffusion prior for the forward operator. Speci…

    Submitted 19 November, 2022; originally announced November 2022.

    Comments: 25 pages, 13 figures

  44. arXiv:2211.04659  [pdf, other]

    cs.LG math.OC stat.ML

    When is Momentum Extragradient Optimal? A Polynomial-Based Analysis

    Authors: Junhyung Lyle Kim, Gauthier Gidel, Anastasios Kyrillidis, Fabian Pedregosa

    Abstract: The extragradient method has gained popularity due to its robust convergence properties for differentiable games. Unlike single-objective optimization, game dynamics involve complex interactions reflected by the eigenvalues of the game vector field's Jacobian scattered across the complex plane. This complexity can cause the simple gradient method to diverge, even for bilinear games, while the extr…

    Submitted 10 February, 2024; v1 submitted 8 November, 2022; originally announced November 2022.

  45. arXiv:2211.02998  [pdf, ps, other]

    stat.ME math.ST

    An empirical likelihood approach to reduce selection bias in voluntary samples

    Authors: Jae Kwang Kim, Kosuke Morikawa

    Abstract: We address the weighting problem in voluntary samples under a nonignorable sample selection model. Under the assumption that the sample selection model is correctly specified, we can compute a consistent estimator of the model parameter and construct the propensity score estimator of the population mean. We use the empirical likelihood method to construct the final weights for voluntary samples by…

    Submitted 11 May, 2023; v1 submitted 5 November, 2022; originally announced November 2022.

  46. Applications of Machine Learning in Pharmacogenomics: Clustering Plasma Concentration-Time Curves

    Authors: Jackson P. Lautier, Stella Grosser, Jessica Kim, Hyewon Kim, Junghi Kim

    Abstract: Pharmaceutical researchers are continually searching for techniques to improve both drug development processes and patient outcomes. An area of recent interest is the potential for machine learning (ML) applications within pharmacology. One such application not yet given close study is the unsupervised clustering of plasma concentration-time curves, hereafter, pharmacokinetic (PK) curves. In this…

    Submitted 4 September, 2023; v1 submitted 24 October, 2022; originally announced October 2022.

    Comments: 38 pages, 14 figures, 3 tables

  47. arXiv:2209.14687  [pdf, other]

    stat.ML cs.AI cs.CV cs.LG

    Diffusion Posterior Sampling for General Noisy Inverse Problems

    Authors: Hyungjin Chung, Jeongsol Kim, Michael T. Mccann, Marc L. Klasky, Jong Chul Ye

    Abstract: Diffusion models have been recently studied as powerful generative inverse problem solvers, owing to their high quality reconstructions and the ease of combining existing iterative solvers. However, most works focus on solving simple linear inverse problems in noiseless settings, which significantly under-represents the complexity of real-world problems. In this work, we extend diffusion solvers t…

    Submitted 20 May, 2024; v1 submitted 29 September, 2022; originally announced September 2022.

    Comments: ICLR 2023 spotlight

  48. arXiv:2209.10092  [pdf, other]

    stat.CO

    A Fast Algorithm for Implementation of Koul's Minimum Distance Estimators and Their Application to Image Segmentation

    Authors: Jiwoong Kim

    Abstract: Minimum distance estimation methodology based on an empirical distribution function has been popular due to its desirable properties including robustness. Even though the statistical literature is awash with the research on the minimum distance estimation, the most of it is confined to the theoretical findings: only few statisticians conducted research on the application of the method to real worl…

    Submitted 28 March, 2023; v1 submitted 20 September, 2022; originally announced September 2022.

  49. arXiv:2208.09819  [pdf, other]

    stat.ML cs.LG

    Robust Tests in Online Decision-Making

    Authors: Gi-Soo Kim, Hyun-Joon Yang, Jane P. Kim

    Abstract: Bandit algorithms are widely used in sequential decision problems to maximize the cumulative reward. One potential application is mobile health, where the goal is to promote the user's health through personalized interventions based on user specific information acquired through wearable devices. Important considerations include the type of, and frequency with which data is collected (e.g. GPS, or…

    Submitted 21 August, 2022; originally announced August 2022.

    Comments: 17 pages, 1 figure, supplementary material for "Robust Tests in Online Decision-Making" published in Proceedings of the AAAI Conference on Artificial Intelligence (2022)

  50. arXiv:2208.07535  [pdf, other]

    stat.ME

    Semiparametric imputation using latent sparse conditional Gaussian mixtures for multivariate mixed outcomes

    Authors: Shonosuke Sugasawa, Jae Kwang Kim, Kosuke Morikawa

    Abstract: This paper proposes a flexible Bayesian approach to multiple imputation using conditional Gaussian mixtures. We introduce novel shrinkage priors for covariate-dependent mixing proportions in the mixture models to automatically select the suitable number of components used in the imputation step. We develop an efficient sampling algorithm for posterior computation and multiple imputation via Markov…

    Submitted 16 August, 2022; originally announced August 2022.

    Comments: 29 pages, 5 figures