Showing 1–10 of 10 results for author: Choe, Y J

  1. arXiv:2406.01506

    cs.CL cs.AI cs.LG stat.ML

    The Geometry of Categorical and Hierarchical Concepts in Large Language Models

    Authors: Kiho Park, Yo Joong Choe, Yibo Jiang, Victor Veitch

    Abstract: Understanding how semantic meaning is encoded in the representation spaces of large language models is a fundamental problem in interpretability. In this paper, we study the two foundational questions in this area. First, how are categorical concepts, such as {'mammal', 'bird', 'reptile', 'fish'}, represented? Second, how are hierarchical relations between concepts encoded? For example, how is the…

    Submitted 3 June, 2024; originally announced June 2024.

    Comments: Code is available at https://github.com/KihoPark/LLM_Categorical_Hierarchical_Representations

  2. arXiv:2402.09698

    stat.ME cs.LG math.PR math.ST stat.ML

    Combining Evidence Across Filtrations Using Adjusters

    Authors: Yo Joong Choe, Aaditya Ramdas

    Abstract: In anytime-valid sequential inference, it is known that any admissible procedure must be based on e-processes, which are composite generalizations of test martingales that quantify the accumulated evidence against a composite null hypothesis at any arbitrary stopping time. This paper studies methods for combining e-processes constructed using different information sets (filtrations) for the same n…

    Submitted 28 May, 2024; v1 submitted 14 February, 2024; originally announced February 2024.

    Comments: Substantially revised with new results in Sections 5 and 6. Code is available at https://github.com/yjchoe/CombiningEvidenceAcrossFiltrations

  3. arXiv:2311.03658

    cs.CL cs.AI cs.LG stat.ML

    The Linear Representation Hypothesis and the Geometry of Large Language Models

    Authors: Kiho Park, Yo Joong Choe, Victor Veitch

    Abstract: Informally, the 'linear representation hypothesis' is the idea that high-level concepts are represented linearly as directions in some representation space. In this paper, we address two closely related questions: What does "linear representation" actually mean? And, how do we make sense of geometric notions (e.g., cosine similarity or projection) in the representation space? To answer these, we u…

    Submitted 6 November, 2023; originally announced November 2023.

    Comments: Accepted for an oral presentation at NeurIPS 2023 Workshop on Causal Representation Learning. Code is available at https://github.com/KihoPark/linear_rep_geometry

  4. arXiv:2305.10564

    stat.ML cs.AI cs.LG stat.ME

    Counterfactually Comparing Abstaining Classifiers

    Authors: Yo Joong Choe, Aditya Gangrade, Aaditya Ramdas

    Abstract: Abstaining classifiers have the option to abstain from making predictions on inputs that they are unsure about. These classifiers are becoming increasingly popular in high-stakes decision-making problems, as they can withhold uncertain predictions to improve their reliability and safety. When evaluating black-box abstaining classifier(s), however, we lack a principled approach that accounts for wh…

    Submitted 9 November, 2023; v1 submitted 17 May, 2023; originally announced May 2023.

    Comments: Accepted to NeurIPS 2023. Preliminary work presented at the ICML 2023 Workshop on Counterfactuals in Minds and Machines. Code available at https://github.com/yjchoe/ComparingAbstainingClassifiers

  5. arXiv:2110.00115

    stat.ME cs.LG math.ST stat.AP stat.ML

    Comparing Sequential Forecasters

    Authors: Yo Joong Choe, Aaditya Ramdas

    Abstract: Consider two forecasters, each making a single prediction for a sequence of events over time. We ask a relatively basic question: how might we compare these forecasters, either online or post-hoc, while avoiding unverifiable assumptions on how the forecasts and outcomes were generated? In this paper, we present a rigorous answer to this question by designing novel sequential inference procedures f…

    Submitted 9 November, 2023; v1 submitted 30 September, 2021; originally announced October 2021.

    Comments: Published in Operations Research. Code and data sources available at https://github.com/yjchoe/ComparingForecasters

  6. arXiv:2010.01243

    cs.LG cs.DC stat.ML

    Client Selection in Federated Learning: Convergence Analysis and Power-of-Choice Selection Strategies

    Authors: Yae Jee Cho, Jianyu Wang, Gauri Joshi

    Abstract: Federated learning is a distributed optimization paradigm that enables a large number of resource-limited client nodes to cooperatively train a model without data sharing. Several works have analyzed the convergence of federated learning by accounting of data heterogeneity, communication and computation limitations, and partial client participation. However, they assume unbiased client participati…

    Submitted 2 October, 2020; originally announced October 2020.

  7. arXiv:2004.05007

    stat.ML cs.LG

    An Empirical Study of Invariant Risk Minimization

    Authors: Yo Joong Choe, Jiyeon Ham, Kyubyong Park

    Abstract: Invariant risk minimization (IRM) (Arjovsky et al., 2019) is a recently proposed framework designed for learning predictors that are invariant to spurious correlations across different training environments. Yet, despite its theoretical justifications, IRM has not been extensively tested across various settings. In an attempt to gain a better understanding of the framework, we empirically investig…

    Submitted 6 July, 2020; v1 submitted 10 April, 2020; originally announced April 2020.

    Comments: Presented at the ICML 2020 Workshop on Uncertainty and Robustness in Deep Learning. Code at https://github.com/kakaobrain/irm-empirical-study

  8. arXiv:1904.08144

    cs.LG stat.ML

    Predicting drug-target interaction using 3D structure-embedded graph representations from graph neural networks

    Authors: Jaechang Lim, Seongok Ryu, Kyubyong Park, Yo Joong Choe, Jiyeon Ham, Woo Youn Kim

    Abstract: Accurate prediction of drug-target interaction (DTI) is essential for in silico drug design. For the purpose, we propose a novel approach for predicting DTI using a GNN that directly incorporates the 3D structure of a protein-ligand complex. We also apply a distance-aware graph attention algorithm with gate augmentation to increase the performance of our model. As a result, our model shows better…

    Submitted 17 April, 2019; originally announced April 2019.

    Comments: 20 pages, 2 figures

  9. arXiv:1902.07249

    cs.CL cs.LG stat.ML

    Discovery of Natural Language Concepts in Individual Units of CNNs

    Authors: Seil Na, Yo Joong Choe, Dong-Hyun Lee, Gunhee Kim

    Abstract: Although deep convolutional networks have achieved improved performance in many natural language tasks, they have been treated as black boxes because they are difficult to interpret. Especially, little is known about how they represent language in their intermediate layers. In an attempt to understand the representations of deep convolutional networks trained on language tasks, we show that indivi…

    Submitted 28 February, 2019; v1 submitted 18 February, 2019; originally announced February 2019.

    Comments: Published as a conference paper at ICLR 2019

  10. arXiv:1804.08154

    stat.AP q-bio.NC stat.ML

    Local White Matter Architecture Defines Functional Brain Dynamics

    Authors: Yo Joong Choe, Sivaraman Balakrishnan, Aarti Singh, Jean M. Vettel, Timothy Verstynen

    Abstract: Large bundles of myelinated axons, called white matter, anatomically connect disparate brain regions together and compose the structural core of the human connectome. We recently proposed a method of measuring the local integrity along the length of each white matter fascicle, termed the local connectome. If communication efficiency is fundamentally constrained by the integrity along the entire le…

    Submitted 16 September, 2018; v1 submitted 22 April, 2018; originally announced April 2018.

    Comments: Accepted to the 2018 IEEE International Conference on Systems, Man, and Cybernetics (SMC 2018)