Showing 1–50 of 90 results for author: Yoo, K

  1. arXiv:2406.16275

    cs.CL

    Investigating the Influence of Prompt-Specific Shortcuts in AI Generated Text Detection

    Authors: Choonghyun Park, Hyuhng Joon Kim, Junyeob Kim, Youna Kim, Taeuk Kim, Hyunsoo Cho, Hwiyeol Jo, Sang-goo Lee, Kang Min Yoo

    Abstract: AI Generated Text (AIGT) detectors are developed with texts from humans and LLMs of common tasks. Despite the diversity of plausible prompt choices, these datasets are generally constructed with a limited number of prompts. The lack of prompt variation can introduce prompt-specific shortcut features that exist in data collected with the chosen prompt, but do not generalize to others. In this paper…

    Submitted 23 June, 2024; originally announced June 2024.

    Comments: 19 pages, 3 figures, 13 tables, under review

  2. arXiv:2404.11972

    cs.CL

    Aligning Language Models to Explicitly Handle Ambiguity

    Authors: Hyuhng Joon Kim, Youna Kim, Cheonbok Park, Junyeob Kim, Choonghyun Park, Kang Min Yoo, Sang-goo Lee, Taeuk Kim

    Abstract: In interactions between users and language model agents, user utterances frequently exhibit ellipsis (omission of words or phrases) or imprecision (lack of exactness) to prioritize efficiency. This can lead to varying interpretations of the same input based on different assumptions or background knowledge. It is thus crucial for agents to adeptly handle the inherent ambiguity in queries to ensure…

    Submitted 16 June, 2024; v1 submitted 18 April, 2024; originally announced April 2024.

  3. arXiv:2404.01954

    cs.CL cs.AI

    HyperCLOVA X Technical Report

    Authors: Kang Min Yoo, Jaegeun Han, Sookyo In, Heewon Jeon, Jisu Jeong, Jaewook Kang, Hyunwook Kim, Kyung-Min Kim, Munhyong Kim, Sungju Kim, Donghyun Kwak, Hanock Kwak, Se Jung Kwon, Bado Lee, Dongsoo Lee, Gichang Lee, Jooho Lee, Baeseong Park, Seongjin Shin, Joonsang Yu, Seolki Baek, Sumin Byeon, Eungsup Cho, Dooseok Choe, Jeesung Han, et al. (371 additional authors not shown)

    Abstract: We introduce HyperCLOVA X, a family of large language models (LLMs) tailored to the Korean language and culture, along with competitive capabilities in English, math, and coding. HyperCLOVA X was trained on a balanced mix of Korean, English, and code data, followed by instruction-tuning with high-quality human-annotated datasets while abiding by strict safety guidelines reflecting our commitment t…

    Submitted 13 April, 2024; v1 submitted 2 April, 2024; originally announced April 2024.

    Comments: 44 pages; updated authors list and fixed author names

  4. arXiv:2403.19254

    cs.CV

    Imperceptible Protection against Style Imitation from Diffusion Models

    Authors: Namhyuk Ahn, Wonhyuk Ahn, KiYoon Yoo, Daesik Kim, Seung-Hun Nam

    Abstract: Recent progress in diffusion models has profoundly enhanced the fidelity of image generation. However, this has raised concerns about copyright infringements. While prior methods have introduced adversarial perturbations to prevent style imitation, most are accompanied by the degradation of artworks' visual quality. Recognizing the importance of maintaining this, we develop a visually improved pro…

    Submitted 28 March, 2024; originally announced March 2024.

  5. arXiv:2402.11548

    cs.CL

    KMMLU: Measuring Massive Multitask Language Understanding in Korean

    Authors: Guijin Son, Hanwool Lee, Sungdong Kim, Seungone Kim, Niklas Muennighoff, Taekyoon Choi, Cheonbok Park, Kang Min Yoo, Stella Biderman

    Abstract: We propose KMMLU, a new Korean benchmark with 35,030 expert-level multiple-choice questions across 45 subjects ranging from humanities to STEM. While prior Korean benchmarks are translated from existing English benchmarks, KMMLU is collected from original Korean exams, capturing linguistic and cultural aspects of the Korean language. We test 27 public and proprietary LLMs and observe the best publ…

    Submitted 6 June, 2024; v1 submitted 18 February, 2024; originally announced February 2024.

    Comments: Under Review

  6. arXiv:2402.11253

    cs.LG cs.AI cs.CL

    Aligning Large Language Models by On-Policy Self-Judgment

    Authors: Sangkyu Lee, Sungdong Kim, Ashkan Yousefpour, Minjoon Seo, Kang Min Yoo, Youngjae Yu

    Abstract: Existing approaches for aligning large language models with human preferences face a trade-off that requires a separate reward model (RM) for on-policy learning. In this paper, we present a novel alignment framework, SELF-JUDGE that (1) does on-policy learning and (2) is parameter efficient, as it does not require an additional RM for evaluating the samples for on-policy learning. To this end, we p…

    Submitted 25 June, 2024; v1 submitted 17 February, 2024; originally announced February 2024.

    Comments: Published as a main conference paper at ACL 2024

  7. arXiv:2402.08093

    cs.LG cs.CL eess.AS

    BASE TTS: Lessons from building a billion-parameter Text-to-Speech model on 100K hours of data

    Authors: Mateusz Łajszczak, Guillermo Cámbara, Yang Li, Fatih Beyhan, Arent van Korlaar, Fan Yang, Arnaud Joly, Álvaro Martín-Cortinas, Ammar Abbas, Adam Michalski, Alexis Moinet, Sri Karlapati, Ewa Muszyńska, Haohan Guo, Bartosz Putrycz, Soledad López Gambino, Kayeon Yoo, Elena Sokolova, Thomas Drugman

    Abstract: We introduce a text-to-speech (TTS) model called BASE TTS, which stands for $\textbf{B}$ig $\textbf{A}$daptive $\textbf{S}$treamable TTS with $\textbf{E}$mergent abilities. BASE TTS is the largest TTS model to-date, trained on 100K hours of public domain speech data, achieving a new state-of-the-art in speech naturalness. It deploys a 1-billion-parameter autoregressive Transformer that converts ra…

    Submitted 15 February, 2024; v1 submitted 12 February, 2024; originally announced February 2024.

    Comments: v1.1 (fixed typos)

  8. arXiv:2402.05706

    cs.CL cs.SD eess.AS

    Unified Speech-Text Pretraining for Spoken Dialog Modeling

    Authors: Heeseung Kim, Soonshin Seo, Kyeongseok Jeong, Ohsung Kwon, Jungwhan Kim, Jaehong Lee, Eunwoo Song, Myungwoo Oh, Sungroh Yoon, Kang Min Yoo

    Abstract: While recent work shows promising results in expanding the capabilities of large language models (LLM) to directly understand and synthesize speech, an LLM-based strategy for modeling spoken dialogs remains elusive and calls for further investigation. This work proposes an extensive speech-text LLM framework, named the Unified Spoken Dialog Model (USDM), to generate coherent spoken responses with…

    Submitted 8 February, 2024; originally announced February 2024.

  9. arXiv:2312.05141

    cs.CV

    Open Domain Generalization with a Single Network by Regularization Exploiting Pre-trained Features

    Authors: Inseop Chung, KiYoon Yoo, Nojun Kwak

    Abstract: Open Domain Generalization (ODG) is a challenging task as it not only deals with distribution shifts but also category shifts between the source and target datasets. To handle this task, the model has to learn a generalizable representation that can be applied to unseen domains while also identifying unknown classes that were not present during training. Previous work has used multiple source-specifi…

    Submitted 8 December, 2023; originally announced December 2023.

  10. arXiv:2311.08766

    physics.optics

    Dirac Bilayer Metasurfaces as an Inverse Gires-Tournois Etalon

    Authors: Ki Young Lee, Kwang Wook Yoo, Francesco Monticone, Jae Woong Yoon

    Abstract: Efficient transmissive pure-phase resonances are highly desirable for optical modulation and wavefront engineering. Here, we propose a novel principle to realize a pure-phase resonance in an extremely broad transmission band, as opposed to previous approaches restricted to operating in reflection mode or over a narrow spectral band. We show that a glide-symmetric bilayer metasurface mathematically…

    Submitted 15 November, 2023; originally announced November 2023.

    Comments: 11 pages, 5 figures

  11. arXiv:2311.07820

    cs.CL

    On the Analysis of Cross-Lingual Prompt Tuning for Decoder-based Multilingual Model

    Authors: Nohil Park, Joonsuk Park, Kang Min Yoo, Sungroh Yoon

    Abstract: An exciting advancement in the field of multilingual models is the emergence of autoregressive models with zero- and few-shot capabilities, a phenomenon widely reported in large-scale language models. To further improve model adaptation to cross-lingual tasks, another trend is to further fine-tune the language models with either full fine-tuning or parameter-efficient tuning. However, the interact…

    Submitted 13 November, 2023; originally announced November 2023.

  12. arXiv:2310.14849

    cs.CL

    Universal Domain Adaptation for Robust Handling of Distributional Shifts in NLP

    Authors: Hyuhng Joon Kim, Hyunsoo Cho, Sang-Woo Lee, Junyeob Kim, Choonghyun Park, Sang-goo Lee, Kang Min Yoo, Taeuk Kim

    Abstract: When deploying machine learning systems to the wild, it is highly desirable for them to effectively leverage prior knowledge to the unfamiliar domain while also firing alarms to anomalous inputs. In order to address these requirements, Universal Domain Adaptation (UniDA) has emerged as a novel research area in computer vision, focusing on achieving both adaptation ability and robustness (i.e., the…

    Submitted 23 October, 2023; originally announced October 2023.

    Comments: Findings of EMNLP 2023

  13. arXiv:2310.09518

    cs.CL cs.AI cs.LG

    Instruction Tuning with Human Curriculum

    Authors: Bruce W. Lee, Hyunsoo Cho, Kang Min Yoo

    Abstract: In this work, we (1) introduce Curriculum Instruction Tuning, (2) explore the potential advantages of employing diverse curriculum strategies, and (3) delineate a synthetic instruction-response generation framework that complements our theoretical approach. Distinct from the existing instruction tuning dataset, our generation pipeline is systematically structured to emulate the sequential and orde…

    Submitted 16 June, 2024; v1 submitted 14 October, 2023; originally announced October 2023.

    Comments: NAACL 2024

  14. arXiv:2308.00221

    cs.CL cs.AI cs.CR

    Advancing Beyond Identification: Multi-bit Watermark for Large Language Models

    Authors: KiYoon Yoo, Wonhyuk Ahn, Nojun Kwak

    Abstract: We show the viability of tackling misuses of large language models beyond the identification of machine-generated text. While existing zero-bit watermark methods focus on detection only, some malicious misuses demand tracing the adversary user for counteracting them. To address this, we propose Multi-bit Watermark via Position Allocation, embedding traceable multi-bit information during language m…

    Submitted 19 March, 2024; v1 submitted 31 July, 2023; originally announced August 2023.

    Comments: NAACL 2024 main. 9 pages and appendix

  15. arXiv:2306.15801

    hep-ex physics.ins-det

    Production of antihydrogen atoms by 6 keV antiprotons through a positronium cloud

    Authors: P. Adrich, P. Blumer, G. Caratsch, M. Chung, P. Cladé, P. Comini, P. Crivelli, O. Dalkarov, P. Debu, A. Douillet, D. Drapier, P. Froelich, N. Garroum, S. Guellati-Khelifa, J. Guyomard, P-A. Hervieux, L. Hilico, P. Indelicato, S. Jonsell, J-P. Karr, B. Kim, S. Kim, E-S. Kim, Y. J. Ko, T. Kosinski, et al. (39 additional authors not shown)

    Abstract: We report on the first production of an antihydrogen beam by charge exchange of 6.1 keV antiprotons with a cloud of positronium in the GBAR experiment at CERN. The antiproton beam was delivered by the AD/ELENA facility. The positronium target was produced from a positron beam itself obtained from an electron linear accelerator. We observe an excess over background indicating antihydrogen productio…

    Submitted 3 July, 2023; v1 submitted 27 June, 2023; originally announced June 2023.

    Journal ref: European Physical Journal C 83, 1004 (2023)

  16. arXiv:2305.14152

    cs.LG cs.AI

    Memory-Efficient Fine-Tuning of Compressed Large Language Models via sub-4-bit Integer Quantization

    Authors: Jeonghoon Kim, Jung Hyun Lee, Sungdong Kim, Joonsuk Park, Kang Min Yoo, Se Jung Kwon, Dongsoo Lee

    Abstract: Large language models (LLMs) face the challenges in fine-tuning and deployment due to their high memory demands and computational costs. While parameter-efficient fine-tuning (PEFT) methods aim to reduce the memory usage of the optimizer state during fine-tuning, the inherent size of pre-trained LLM weights continues to be a pressing concern. Even though quantization techniques are widely proposed…

    Submitted 28 October, 2023; v1 submitted 23 May, 2023; originally announced May 2023.

    Comments: Published at NeurIPS 2023. Camera-ready version

  17. arXiv:2305.13735

    cs.CL cs.AI cs.LG

    Aligning Large Language Models through Synthetic Feedback

    Authors: Sungdong Kim, Sanghwan Bae, Jamin Shin, Soyoung Kang, Donghyun Kwak, Kang Min Yoo, Minjoon Seo

    Abstract: Aligning large language models (LLMs) to human values has become increasingly important as it enables sophisticated steering of LLMs. However, it requires significant human demonstrations and feedback or distillation from proprietary LLMs such as ChatGPT. In this work, we propose a novel alignment learning framework with synthetic feedback not dependent on extensive human annotations and proprieta…

    Submitted 20 October, 2023; v1 submitted 23 May, 2023; originally announced May 2023.

    Comments: Accepted to EMNLP 2023 main conference

  18. arXiv:2305.01904

    cs.CL cs.AI

    Robust Multi-bit Natural Language Watermarking through Invariant Features

    Authors: KiYoon Yoo, Wonhyuk Ahn, Jiho Jang, Nojun Kwak

    Abstract: Recent years have witnessed a proliferation of valuable original natural language contents found in subscription-based media outlets, web novel platforms, and outputs of large language models. However, these contents are susceptible to illegal piracy and potential misuse without proper security measures. This calls for a secure watermarking system to guarantee copyright protection through leakage…

    Submitted 9 June, 2023; v1 submitted 3 May, 2023; originally announced May 2023.

    Comments: ACL 2023 long

  19. arXiv:2305.00215

    cond-mat.str-el cond-mat.mtrl-sci

    Observation of linear magnetoelectric effect in a Dirac magnon antiferromagnet Cu$_3$TeO$_6$

    Authors: Aga Shahee, Kyongjun Yoo, B. Koteswararao, N. V. Ter-Oganessian, Kee Hoon Kim

    Abstract: Cu$_3$TeO$_6$, a three-dimensional antiferromagnet forming a unique spin-web lattice of spin-1/2 Cu$^{2+}$ ions below the Néel temperature T$_N$ = 62 K, has recently been found to exhibit topological Dirac or nodal magnon dispersion. In this study, we report the discovery of the linear magnetoelectric (ME) effects in Cu$_3$TeO$_6$ below T$_N$. Our pyroelectric current measurements at a constant magnetic f…

    Submitted 29 April, 2023; originally announced May 2023.

  20. arXiv:2301.11660

    cs.CL

    Probing Out-of-Distribution Robustness of Language Models with Parameter-Efficient Transfer Learning

    Authors: Hyunsoo Cho, Choonghyun Park, Junyeop Kim, Hyuhng Joon Kim, Kang Min Yoo, Sang-goo Lee

    Abstract: As the size of the pre-trained language model (PLM) continues to increase, numerous parameter-efficient transfer learning methods have been proposed recently to compensate for the tremendous cost of fine-tuning. Despite the impressive results achieved by large pre-trained language models (PLMs) and various parameter-efficient transfer learning (PETL) methods on sundry benchmarks, it remains unclea…

    Submitted 13 June, 2023; v1 submitted 27 January, 2023; originally announced January 2023.

    Comments: *SEM 2023

  21. arXiv:2212.10938

    cs.CL

    Critic-Guided Decoding for Controlled Text Generation

    Authors: Minbeom Kim, Hwanhee Lee, Kang Min Yoo, Joonsuk Park, Hwaran Lee, Kyomin Jung

    Abstract: Steering language generation towards objectives or away from undesired content has been a long-standing goal in utilizing language models (LM). Recent work has demonstrated reinforcement learning and weighted decoding as effective approaches to achieve a higher level of language control and quality with pros and cons. In this work, we propose a novel critic decoding method for controlled language…

    Submitted 21 December, 2022; originally announced December 2022.

    Comments: 11 pages, 6 figures

  22. arXiv:2212.10873

    cs.CL cs.LG

    Prompt-Augmented Linear Probing: Scaling beyond the Limit of Few-shot In-Context Learners

    Authors: Hyunsoo Cho, Hyuhng Joon Kim, Junyeob Kim, Sang-Woo Lee, Sang-goo Lee, Kang Min Yoo, Taeuk Kim

    Abstract: Through in-context learning (ICL), large-scale language models are effective few-shot learners without additional model fine-tuning. However, the ICL performance does not scale well with the number of available training samples as it is limited by the inherent input length constraint of the underlying language model. Meanwhile, many studies have revealed that language models are also powerful feat…

    Submitted 13 June, 2023; v1 submitted 21 December, 2022; originally announced December 2022.

    Comments: AAAI 2023

  23. arXiv:2210.11034

    cs.CL cs.LG

    Enhancing Out-of-Distribution Detection in Natural Language Understanding via Implicit Layer Ensemble

    Authors: Hyunsoo Cho, Choonghyun Park, Jaewook Kang, Kang Min Yoo, Taeuk Kim, Sang-goo Lee

    Abstract: Out-of-distribution (OOD) detection aims to discern outliers from the intended data distribution, which is crucial to maintaining high reliability and a good user experience. Most recent studies in OOD detection utilize the information from a single representation that resides in the penultimate layer to determine whether the input is anomalous or not. Although such a method is straightforward, th…

    Submitted 20 October, 2022; originally announced October 2022.

    Comments: EMNLP Findings 2022

  24. arXiv:2210.06870

    cond-mat.mtrl-sci

    Direct visualization and control of SrOx segregation on semiconducting Nb doped SrTiO3 (100) surface

    Authors: Hyang Keun Yoo, Daniel Schwarz, Soren Ulstrup, Woojin Kim, Chris Jozwiak, Aaron Bostwick, Tae Won Noh, Eli Rotenberg, Young Jun Chang

    Abstract: We investigated how SrOx segregates on a Nb doped SrTiO3 (100) surface by in air annealing. Using atomic force and photoemission electron microscopes, we can directly visualize the morphology and the electronic phase changes with SrOx segregation. SrOx islands less than 2 micron meter in size and 1-5 unit cells thick nucleate first and grow in a labyrinth domain pattern. After prolonged annealing,…

    Submitted 13 October, 2022; originally announced October 2022.

    Journal ref: Journal of the Korean Physical Society 22, 4715 (2022)

  25. arXiv:2210.03858

    cs.LG cs.CL

    AlphaTuning: Quantization-Aware Parameter-Efficient Adaptation of Large-Scale Pre-Trained Language Models

    Authors: Se Jung Kwon, Jeonghoon Kim, Jeongin Bae, Kang Min Yoo, Jin-Hwa Kim, Baeseong Park, Byeongwook Kim, Jung-Woo Ha, Nako Sung, Dongsoo Lee

    Abstract: There are growing interests in adapting large-scale language models using parameter-efficient fine-tuning methods. However, accelerating the model itself and achieving better inference efficiency through model compression has not been thoroughly explored yet. Model compression could provide the benefits of reducing memory footprints, enabling low-precision computations, and ultimately achieving co…

    Submitted 7 October, 2022; originally announced October 2022.

    Comments: Findings of EMNLP 2022

  26. arXiv:2209.01765

    cs.CL

    Continuous Decomposition of Granularity for Neural Paraphrase Generation

    Authors: Xiaodong Gu, Zhaowei Zhang, Sang-Woo Lee, Kang Min Yoo, Jung-Woo Ha

    Abstract: While Transformers have had significant success in paragraph generation, they treat sentences as linear sequences of tokens and often neglect their hierarchical information. Prior work has shown that decomposing the levels of granularity (e.g., word, phrase, or sentence) for input tokens has produced substantial improvements, suggesting the possibility of enhancing Transformers via more fine-grain…

    Submitted 16 September, 2022; v1 submitted 5 September, 2022; originally announced September 2022.

    Comments: Accepted to be published in COLING 2022

  27. arXiv:2207.07754

    physics.optics physics.app-ph

    Lab-on-a-Chip Optical Biosensor Platform: Micro Ring Resonator Integrated with Near-Infrared Fourier Transform Spectrometer

    Authors: Kyoung Min Yoo, May Hlaing, Sourabh Jain, James Fan, Yue An, Ray T. Chen

    Abstract: A micro-ring-resonator (MRR) optical biosensor based on the evanescent field sensing mechanism has been extensively studied due to its high sensitivity and compact device size. However, a suitable on-chip integrated spectrometer device has to be demonstrated for the lab-on-a-chip applications, which can read the resonance wavelength shift from MRR biosensors based on minuscule changes in refractiv…

    Submitted 15 July, 2022; originally announced July 2022.

    Comments: 23 pages, 9 figures including supplementary

  28. arXiv:2206.08082

    cs.CL

    Self-Generated In-Context Learning: Leveraging Auto-regressive Language Models as a Demonstration Generator

    Authors: Hyuhng Joon Kim, Hyunsoo Cho, Junyeob Kim, Taeuk Kim, Kang Min Yoo, Sang-goo Lee

    Abstract: Large-scale pre-trained language models (PLMs) are well-known for being capable of solving a task simply by conditioning a few input-label pairs dubbed demonstrations on a prompt without being explicitly tuned for the desired downstream task. Such a process (i.e., in-context learning), however, naturally leads to high reliance on the demonstrations which are usually selected from external datasets…

    Submitted 16 June, 2022; originally announced June 2022.

    Comments: NAACL 2022 Workshop on Large-scale Pre-trained Language Models

  29. arXiv:2205.13445

    cs.CV cs.AI cs.CL cs.IT cs.LG

    Mutual Information Divergence: A Unified Metric for Multimodal Generative Models

    Authors: Jin-Hwa Kim, Yunji Kim, Jiyoung Lee, Kang Min Yoo, Sang-Woo Lee

    Abstract: Text-to-image generation and image captioning have recently emerged as a new experimental paradigm to assess machine intelligence. They predict continuous quantity accompanied by their sampling techniques in the generation, making evaluation complicated and intractable to get marginal distributions. Based on a recent trend that multimodal generative evaluations exploit a vision-and-language pre-trai…

    Submitted 25 May, 2022; originally announced May 2022.

  30. arXiv:2205.12685

    cs.CL cs.AI cs.LG

    Ground-Truth Labels Matter: A Deeper Look into Input-Label Demonstrations

    Authors: Kang Min Yoo, Junyeob Kim, Hyuhng Joon Kim, Hyunsoo Cho, Hwiyeol Jo, Sang-Woo Lee, Sang-goo Lee, Taeuk Kim

    Abstract: Despite recent explosion of interests in in-context learning, the underlying mechanism and the precise impact of the quality of demonstrations remain elusive. Intuitively, ground-truth labels should have as much impact in in-context learning (ICL) as supervised learning, but recent work reported that the input-label correspondence is significantly less important than previously thought. Intrigued…

    Submitted 24 October, 2022; v1 submitted 25 May, 2022; originally announced May 2022.

    Comments: Accepted to EMNLP Long. Kang Min Yoo and Junyeob Kim contributed equally. Kang Min Yoo and Taeuk Kim are the corresponding authors

  31. arXiv:2205.12609

    cs.CL

    Generating Information-Seeking Conversations from Unlabeled Documents

    Authors: Gangwoo Kim, Sungdong Kim, Kang Min Yoo, Jaewoo Kang

    Abstract: In this paper, we introduce a novel framework, SIMSEEK, (Simulating information-Seeking conversation from unlabeled documents), and compare its two variants. In our baseline SIMSEEK-SYM, a questioner generates follow-up questions upon the predetermined answer by an answerer. On the contrary, SIMSEEK-ASYM first generates the question and then finds its corresponding answer under the conversational…

    Submitted 24 October, 2022; v1 submitted 25 May, 2022; originally announced May 2022.

    Comments: Accepted to EMNLP 2022 main conference

  32. Positron accumulation in the GBAR experiment

    Authors: P. Blumer, M. Charlton, M. Chung, P. Clade, P. Comini, P. Crivelli, O. Dalkarov, P. Debu, L. Dodd, A. Douillet, S. Guellati, P. -A Hervieux, L. Hilico, P. Indelicato, G. Janka, S. Jonsell, J. -P. Karr, B. H. Kim, E. S. Kim, S. K. Kim, Y. Ko, T. Kosinski, N. Kuroda, B. M. Latacz, B. Lee, et al. (45 additional authors not shown)

    Abstract: We present a description of the GBAR positron (e$^+$) trapping apparatus, which consists of a three stage Buffer Gas Trap (BGT) followed by a High Field Penning Trap (HFT), and discuss its performance. The overall goal of the GBAR experiment is to measure the acceleration of the neutral antihydrogen ($\overline{\mathrm{H}}$) atom in the terrestrial gravitational field by neutralising a positive antihydrogen ion ($\overline{\mathrm{H}}^+$), whic…

    Submitted 9 May, 2022; originally announced May 2022.

    Journal ref: Nuclear Instruments and Methods in Physics Research Section A, Volume 1040, 2022, 167263

  33. arXiv:2205.02035

    cs.CL

    Masked Summarization to Generate Factually Inconsistent Summaries for Improved Factual Consistency Checking

    Authors: Hwanhee Lee, Kang Min Yoo, Joonsuk Park, Hwaran Lee, Kyomin Jung

    Abstract: Despite the recent advances in abstractive summarization systems, it is still difficult to determine whether a generated summary is factually consistent with the source text. To this end, the latest approach is to train a factual consistency classifier on factually consistent and inconsistent summaries. Luckily, the former is readily available as reference summaries in existing summarization dataset…

    Submitted 4 May, 2022; originally announced May 2022.

    Comments: NAACL 2022 Findings

  34. arXiv:2204.14017

    cs.LG cs.AI cs.CL

    Backdoor Attacks in Federated Learning by Rare Embeddings and Gradient Ensembling

    Authors: KiYoon Yoo, Nojun Kwak

    Abstract: Recent advances in federated learning have demonstrated its promising capability to learn on decentralized datasets. However, a considerable amount of work has raised concerns due to the potential risks of adversaries participating in the framework to poison the global model for an adversarial purpose. This paper investigates the feasibility of model poisoning for backdoor attacks through rare wor…

    Submitted 23 October, 2022; v1 submitted 29 April, 2022; originally announced April 2022.

    Comments: Accepted to EMNLP 2022, 9 pages and Appendix

  35. arXiv:2203.01677

    cs.CL cs.CR cs.LG

    Detection of Word Adversarial Examples in Text Classification: Benchmark and Baseline via Robust Density Estimation

    Authors: KiYoon Yoo, Jangho Kim, Jiho Jang, Nojun Kwak

    Abstract: Word-level adversarial attacks have shown success in NLP models, drastically decreasing the performance of transformer-based models in recent years. As a countermeasure, adversarial defense has been explored, but relatively few efforts have been made to detect adversarial examples. However, detecting adversarial examples may be crucial for automated tasks (e.g. review sentiment analysis) that wish…

    Submitted 3 March, 2022; originally announced March 2022.

    Comments: Findings of ACL 2022

  36. arXiv:2112.07027

    physics.optics physics.app-ph physics.ins-det

    Dual-Polarization Bandwidth-Bridged On-Chip Bandpass Sampling Fourier Transform Spectrometer from Visible to Near-Infrared

    Authors: Kyoung Min Yoo, Ray T. Chen

    Abstract: The on-chip broadband optical spectrometers which cover the entire tissue transparency window (λ=650-1050 nm) with high resolution are highly demanded for the miniaturized bio-sensing and bio-imaging applications. Here, we propose a novel type of spatial heterodyne Fourier transform spectrometer (SHFTS) integrated with a sub-wavelength grating coupler (SWGC) for the dual-polarization bandpass samp…

    Submitted 13 December, 2021; originally announced December 2021.

    Comments: 48 Pages, 6 figures, 14 supportive figures

  37. arXiv:2111.12958

    cs.CV

    Self-Distilled Self-Supervised Representation Learning

    Authors: Jiho Jang, Seonhoon Kim, Kiyoon Yoo, Chaerin Kong, Jangho Kim, Nojun Kwak

    Abstract: State-of-the-art frameworks in self-supervised learning have recently shown that fully utilizing transformer-based models can lead to performance boost compared to conventional CNN models. Striving to maximize the mutual information of two views of an image, existing works apply a contrastive loss to the final representations. Motivated by self-distillation in the supervised regime, we further exp…

    Submitted 23 November, 2022; v1 submitted 25 November, 2021; originally announced November 2021.

    Comments: WACV 23, 11 pages

  38. arXiv:2111.02643

    cs.CL

    Response Generation with Context-Aware Prompt Learning

    Authors: Xiaodong Gu, Kang Min Yoo, Sang-Woo Lee

    Abstract: Pre-trained language models (PLM) have marked a huge leap in neural dialogue modeling. While PLMs are pre-trained on large-scale text corpora, they are usually fine-tuned on scarce dialogue data with specific domain knowledge and dialogue styles. However, tailoring the language models while fully utilizing prior knowledge in large pre-trained models remains a challenge. In this paper, we present a…

    Submitted 13 December, 2021; v1 submitted 4 November, 2021; originally announced November 2021.

  39. arXiv:2110.03461

    cs.AI

    Self-Evolutionary Optimization for Pareto Front Learning

    Authors: Simyung Chang, KiYoon Yoo, Jiho Jang, Nojun Kwak

    Abstract: Multi-task learning (MTL), which aims to improve performance by learning multiple tasks simultaneously, inherently presents an optimization challenge due to multiple objectives. Hence, multi-objective optimization (MOO) approaches have been proposed for multitasking problems. Recent MOO methods approximate multiple optimal solutions (Pareto front) with a single unified model, which is collectively…

    Submitted 7 October, 2021; originally announced October 2021.

    Comments: 16 pages

  40. arXiv:2109.07953

    cs.CL

    Efficient Attribute Injection for Pretrained Language Models

    Authors: Reinald Kim Amplayo, Kang Min Yoo, Sang-Woo Lee

    Abstract: Metadata attributes (e.g., user and product IDs from reviews) can be incorporated as additional inputs to neural-based NLP models, by modifying the architecture of the models, in order to improve their performance. Recent models however rely on pretrained language models (PLMs), where previously used techniques for attribute injection are either nontrivial or ineffective. In this paper, we propose…

    Submitted 16 September, 2021; originally announced September 2021.

  41. arXiv:2109.04660

    cs.LG cs.AI

    Dynamic Collective Intelligence Learning: Finding Efficient Sparse Model via Refined Gradients for Pruned Weights

    Authors: Jangho Kim, Jayeon Yoo, Yeji Song, KiYoon Yoo, Nojun Kwak

    Abstract: With the growth of deep neural networks (DNN), the number of DNN parameters has drastically increased. This makes DNN models hard to be deployed on resource-limited embedded systems. To alleviate this problem, dynamic pruning methods have emerged, which try to find diverse sparsity patterns during training by utilizing Straight-Through-Estimator (STE) to approximate gradients of pruned weights. ST…

    Submitted 31 July, 2023; v1 submitted 10 September, 2021; originally announced September 2021.

    Comments: Accepted to ACM MM 2023, code is in https://github.com/Jangho-Kim/DCIL-pytorch

  42. arXiv:2109.04650  [pdf, other]

    cs.CL

    What Changes Can Large-scale Language Models Bring? Intensive Study on HyperCLOVA: Billions-scale Korean Generative Pretrained Transformers

    Authors: Boseop Kim, HyoungSeok Kim, Sang-Woo Lee, Gichang Lee, Donghyun Kwak, Dong Hyeon Jeon, Sunghyun Park, Sungju Kim, Seonhoon Kim, Dongpil Seo, Heungsub Lee, Minyoung Jeong, Sungjae Lee, Minsub Kim, Suk Hyun Ko, Seokhun Kim, Taeyong Park, **uk Kim, Soyoung Kang, Na-Hyeon Ryu, Kang Min Yoo, Minsuk Chang, Soobin Suh, Sookyo In, **seong Park, et al. (12 additional authors not shown)

    Abstract: GPT-3 shows remarkable in-context learning ability of large-scale language models (LMs) trained on hundreds of billion scale data. Here we address some remaining issues less reported by the GPT-3 paper, such as a non-English LM, the performances of different sized models, and the effect of recently introduced prompt optimization on in-context learning. To achieve this, we introduce HyperCLOVA, a K…

    Submitted 28 November, 2021; v1 submitted 9 September, 2021; originally announced September 2021.

    Comments: Accepted to EMNLP2021 as a long paper. Fixed some typos

  43. arXiv:2106.07345  [pdf, other]

    cs.CL cs.AI

    Self-Guided Contrastive Learning for BERT Sentence Representations

    Authors: Taeuk Kim, Kang Min Yoo, Sang-goo Lee

    Abstract: Although BERT and its variants have reshaped the NLP landscape, it still remains unclear how best to derive sentence embeddings from such pre-trained Transformers. In this work, we propose a contrastive learning method that utilizes self-guidance for improving the quality of BERT sentence representations. Our method fine-tunes BERT in a self-supervised fashion, does not rely on data augmentation,…

    Submitted 3 June, 2021; originally announced June 2021.

    Comments: ACL 2021

  44. In situ investigation of conducting interface formation in LaAlO3/SrTiO3 heterostructure

    Authors: Hyang Keun Yoo, Luca Moreschini, Aaron Bostwick, Andrew L. Walter, Tae Won Noh, Eli Rotenberg, Young Jun Chang

    Abstract: The high-mobility conducting interface (CI) between LaAlO3 (LAO) and SrTiO3 (STO) has revealed many fascinating phenomena, including exotic magnetism and superconductivity. But, the formation mechanism of the CI has not been conclusively explained. Here, using in situ angle-resolved photoemission spectroscopy, we elucidated the mechanisms for the CI formation. In as-grown samples, we observed…

    Submitted 12 May, 2021; originally announced May 2021.

    Comments: 18 pages, 4 figures

  45. Enhanced tunability of two-dimensional electron gas on SrTiO3 through heterostructuring

    Authors: Hyang Keun Yoo, Luca Moreschini, Andrew L. Walter, Aaron Bostwick, Karsten Horn, Eli Rotenberg, Young Jun Chang

    Abstract: Two-dimensional electron gases (2DEGs) on the SrTiO3 (STO) surface or in STO-based heterostructures have exhibited many intriguing phenomena, which are strongly dependent on the 2DEG-carrier density. We report that the tunability of the 2DEG-carrier density is significantly enhanced by adding a monolayer LaTiO3 (LTO) onto the STO. Ultraviolet (UV) irradiation induced maximum carrier density of the…

    Submitted 12 May, 2021; originally announced May 2021.

    Comments: 19 pages, 4 figures

    Journal ref: Current Applied Physics 20, 1268 (2020)

  46. arXiv:2104.08826  [pdf, other]

    cs.CL cs.AI

    GPT3Mix: Leveraging Large-scale Language Models for Text Augmentation

    Authors: Kang Min Yoo, Dongju Park, Jaewook Kang, Sang-Woo Lee, Woomyeong Park

    Abstract: Large-scale language models such as GPT-3 are excellent few-shot learners, allowing them to be controlled via natural text prompts. Recent studies report that prompt-based direct classification eliminates the need for fine-tuning but lacks data and inference scalability. This paper proposes a novel data augmentation technique that leverages large-scale language models to generate realistic text sa…

    Submitted 18 November, 2021; v1 submitted 18 April, 2021; originally announced April 2021.

    Comments: Accepted to EMNLP2021 Findings; 11 pages, 7 tables, 2 figures

  47. arXiv:2104.07541  [pdf, other]

    cs.CL cs.LG

    Reward Optimization for Neural Machine Translation with Learned Metrics

    Authors: Raphael Shu, Kang Min Yoo, Jung-Woo Ha

    Abstract: Neural machine translation (NMT) models are conventionally trained with token-level negative log-likelihood (NLL), which does not guarantee that the generated translations will be optimized for a selected sequence-level evaluation metric. Multiple approaches are proposed to train NMT with BLEU as the reward, in order to directly improve the metric. However, it was reported that the gain in BLEU do…

    Submitted 15 April, 2021; originally announced April 2021.

  48. arXiv:2012.01775  [pdf, other]

    cs.CL cs.AI cs.LG

    DialogBERT: Discourse-Aware Response Generation via Learning to Recover and Rank Utterances

    Authors: Xiaodong Gu, Kang Min Yoo, Jung-Woo Ha

    Abstract: Recent advances in pre-trained language models have significantly improved neural response generation. However, existing methods usually view the dialogue context as a linear sequence of tokens and learn to generate the next word through token-level self-attention. Such token-level encoding hinders the exploration of discourse-level coherence among utterances. This paper presents DialogBERT, a nov…

    Submitted 13 December, 2021; v1 submitted 3 December, 2020; originally announced December 2020.

    Comments: Published as a conference paper at AAAI 2021

  49. arXiv:2010.10338  [pdf, other]

    cs.LG cs.CV

    Edge Bias in Federated Learning and its Solution by Buffered Knowledge Distillation

    Authors: Sangho Lee, Kiyoon Yoo, Nojun Kwak

    Abstract: Federated learning (FL), which utilizes communication between the server (core) and local devices (edges) to indirectly learn from more data, is an emerging field in deep learning research. Recently, Knowledge Distillation-based FL methods with notable performance and high applicability have been suggested. In this paper, we choose knowledge distillation-based FL method as our baseline and tackle…

    Submitted 9 February, 2021; v1 submitted 20 October, 2020; originally announced October 2020.

    Comments: 10 pages

  50. arXiv:2010.06690  [pdf, other]

    physics.flu-dyn physics.med-ph

    Disease transmission through expiratory aerosols on an urban bus

    Authors: Zhihang Zhang, Taehoon Han, Kwang Hee Yoo, Jesse Capecelatro, Andre Boehman, Kevin Maki

    Abstract: Airborne respiratory diseases such as SARS-CoV-2 (COVID-19) pose significant challenges for public transportation. Several recent outbreaks of SARS-CoV-2 indicate the high risk of transmission among passengers on public buses if special precautions are not taken. This study presents a combined experimental and numerical analysis to identify transmission mechanisms on an urban bus and assess strate…

    Submitted 15 November, 2020; v1 submitted 13 October, 2020; originally announced October 2020.

    Comments: 22 pages, 15 figures