Showing 1–50 of 55 results for author: Oh, A

Searching in archive cs.
  1. arXiv:2406.09948

    cs.CL

    BLEnD: A Benchmark for LLMs on Everyday Knowledge in Diverse Cultures and Languages

    Authors: Junho Myung, Nayeon Lee, Yi Zhou, Jiho Jin, Rifki Afina Putri, Dimosthenis Antypas, Hsuvas Borkakoty, Eunsu Kim, Carla Perez-Almendros, Abinew Ali Ayele, Víctor Gutiérrez-Basulto, Yazmín Ibáñez-García, Hwaran Lee, Shamsuddeen Hassan Muhammad, Kiwoong Park, Anar Sabuhi Rzayev, Nina White, Seid Muhie Yimam, Mohammad Taher Pilehvar, Nedjma Ousidhoum, Jose Camacho-Collados, Alice Oh

    Abstract: Large language models (LLMs) often lack culture-specific knowledge of daily life, especially across diverse regions and non-English languages. Existing benchmarks for evaluating LLMs' cultural sensitivities are limited to a single language or collected from online sources such as Wikipedia, which do not reflect the mundane everyday lifestyles of diverse regions. That is, information about the food…

    Submitted 14 June, 2024; originally announced June 2024.

  2. arXiv:2405.19691

    cs.HC

    Designing Prompt Analytics Dashboards to Analyze Student-ChatGPT Interactions in EFL Writing

    Authors: Minsun Kim, SeonGyeom Kim, Suyoun Lee, Yoosang Yoon, Junho Myung, Haneul Yoo, Hyungseung Lim, Jieun Han, Yoonsu Kim, So-Yeon Ahn, Juho Kim, Alice Oh, Hwajung Hong, Tak Yeon Lee

    Abstract: While ChatGPT has significantly impacted education by offering personalized resources for students, its integration into educational settings poses unprecedented risks, such as inaccuracies and biases in AI-generated content, plagiarism and over-reliance on AI, and privacy and security issues. To help teachers address such risks, we conducted a two-phase iterative design process that comprises sur…

    Submitted 30 May, 2024; originally announced May 2024.

  3. arXiv:2403.10900

    cs.CL

    BEnQA: A Question Answering and Reasoning Benchmark for Bengali and English

    Authors: Sheikh Shafayat, H M Quamran Hasan, Minhajur Rahman Chowdhury Mahim, Rifki Afina Putri, James Thorne, Alice Oh

    Abstract: In this study, we introduce BEnQA, a dataset comprising parallel Bengali and English exam questions for middle and high school levels in Bangladesh. Our dataset consists of approximately 5K questions covering several subjects in science with different types of questions, including factual, application, and reasoning-based questions. We benchmark several Large Language Models (LLMs) with our parall…

    Submitted 16 March, 2024; originally announced March 2024.

  4. arXiv:2403.08272

    cs.CL

    RECIPE4U: Student-ChatGPT Interaction Dataset in EFL Writing Education

    Authors: Jieun Han, Haneul Yoo, Junho Myung, Minsun Kim, Tak Yeon Lee, So-Yeon Ahn, Alice Oh

    Abstract: The integration of generative AI in education is expanding, yet empirical analyses of large-scale and real-world interactions between students and AI systems still remain limited. Addressing this gap, we present RECIPE4U (RECIPE for University), a dataset sourced from a semester-long experiment with 212 college students in English as Foreign Language (EFL) writing courses. During the study, studen…

    Submitted 13 March, 2024; originally announced March 2024.

    Comments: arXiv admin note: text overlap with arXiv:2309.13243

  5. arXiv:2403.06412

    cs.CL

    CLIcK: A Benchmark Dataset of Cultural and Linguistic Intelligence in Korean

    Authors: Eunsu Kim, Juyoung Suk, Philhoon Oh, Haneul Yoo, James Thorne, Alice Oh

    Abstract: Despite the rapid development of large language models (LLMs) for the Korean language, there remains an obvious lack of benchmark datasets that test the requisite Korean cultural and linguistic knowledge. Because many existing Korean benchmark datasets are derived from the English counterparts through translation, they often overlook the different cultural contexts. For the few benchmark datasets…

    Submitted 15 March, 2024; v1 submitted 10 March, 2024; originally announced March 2024.

  6. arXiv:2402.18045

    cs.CL

    Multi-FAct: Assessing Multilingual LLMs' Multi-Regional Knowledge using FActScore

    Authors: Sheikh Shafayat, Eunsu Kim, Juhyun Oh, Alice Oh

    Abstract: Large Language Models (LLMs) are prone to factuality hallucination, generating text that contradicts established knowledge. While extensive research has addressed this in English, little is known about multilingual LLMs. This paper systematically evaluates multilingual LLMs' factual accuracy across languages and geographic regions. We introduce a novel pipeline for multilingual factuality evaluati…

    Submitted 1 March, 2024; v1 submitted 27 February, 2024; originally announced February 2024.

  7. arXiv:2402.17302

    cs.CL

    Can LLM Generate Culturally Relevant Commonsense QA Data? Case Study in Indonesian and Sundanese

    Authors: Rifki Afina Putri, Faiz Ghifari Haznitrama, Dea Adhista, Alice Oh

    Abstract: Large Language Models (LLMs) are increasingly being used to generate synthetic data for training and evaluating models. However, it is unclear whether they can generate a good quality of question answering (QA) dataset that incorporates knowledge and cultural nuance embedded in a language, especially for low-resource languages. In this study, we investigate the effectiveness of using LLMs in gener…

    Submitted 16 April, 2024; v1 submitted 27 February, 2024; originally announced February 2024.

  8. arXiv:2402.16733

    cs.CL cs.AI

    DREsS: Dataset for Rubric-based Essay Scoring on EFL Writing

    Authors: Haneul Yoo, Jieun Han, So-Yeon Ahn, Alice Oh

    Abstract: Automated essay scoring (AES) is a useful tool in English as a Foreign Language (EFL) writing education, offering real-time essay scores for students and instructors. However, previous AES models were trained on essays and scores irrelevant to the practical scenarios of EFL writing education and usually provided a single holistic score due to the lack of appropriate datasets. In this paper, we rel…

    Submitted 21 February, 2024; originally announced February 2024.

    Comments: arXiv admin note: substantial text overlap with arXiv:2310.05191

  9. arXiv:2402.06204

    cs.CL cs.AI

    The Generative AI Paradox on Evaluation: What It Can Solve, It May Not Evaluate

    Authors: Juhyun Oh, Eunsu Kim, Inha Cha, Alice Oh

    Abstract: This paper explores the assumption that Large Language Models (LLMs) skilled in generation tasks are equally adept as evaluators. We assess the performance of three LLMs and one open-source LM in Question-Answering (QA) and evaluation tasks using the TriviaQA (Joshi et al., 2017) dataset. Results indicate a significant disparity, with LLMs exhibiting lower performance in evaluation tasks compared…

    Submitted 9 February, 2024; originally announced February 2024.

  10. arXiv:2312.02093

    cs.CY

    Cultural Differences in Students' Privacy Concerns in Learning Analytics across Germany, South Korea, Spain, Sweden, and the United States

    Authors: Olga Viberg, René F. Kizilcec, Ioana Jivet, Alejandra Martínez Monés, Alice Oh, Chantal Mutimukwe, Stefan Hrastinski, Maren Scheffel

    Abstract: Applications of learning analytics (LA) can raise concerns from students about their privacy in higher education contexts. Developing effective privacy-enhancing practices requires a systematic understanding of students' privacy concerns and how they vary across national and cultural dimensions. We conducted a survey study with established instruments to measure privacy concerns and cultural value…

    Submitted 11 February, 2024; v1 submitted 4 December, 2023; originally announced December 2023.

  11. arXiv:2311.09497

    cs.DL cs.GT

    Peer Reviews of Peer Reviews: A Randomized Controlled Trial and Other Experiments

    Authors: Alexander Goldberg, Ivan Stelmakh, Kyunghyun Cho, Alice Oh, Alekh Agarwal, Danielle Belgrave, Nihar B. Shah

    Abstract: Is it possible to reliably evaluate the quality of peer reviews? We study this question driven by two primary motivations -- incentivizing high-quality reviewing using assessed quality of reviews and measuring changes to review quality in experiments. We conduct a large scale study at the NeurIPS 2022 conference, a top-tier conference in machine learning, in which we invited (meta)-reviewers and a…

    Submitted 15 November, 2023; originally announced November 2023.

  12. arXiv:2310.12585

    cs.CL cs.AI cs.LG

    Time-Aware Representation Learning for Time-Sensitive Question Answering

    Authors: Jungbin Son, Alice Oh

    Abstract: Time is one of the crucial factors in real-world question answering (QA) problems. However, language models have difficulty understanding the relationships between time specifiers, such as 'after' and 'before', and numbers, since existing QA datasets do not include sufficient time expressions. To address this issue, we propose a Time-Context aware Question Answering (TCQA) framework. We suggest a…

    Submitted 19 October, 2023; originally announced October 2023.

    Comments: 2023 EMNLP Findings

  13. arXiv:2310.05191

    cs.CL

    FABRIC: Automated Scoring and Feedback Generation for Essays

    Authors: Jieun Han, Haneul Yoo, Junho Myung, Minsun Kim, Hyunseung Lim, Yoonsu Kim, Tak Yeon Lee, Hwajung Hong, Juho Kim, So-Yeon Ahn, Alice Oh

    Abstract: Automated essay scoring (AES) provides a useful tool for students and instructors in writing classes by generating essay scores in real-time. However, previous AES models do not provide more specific rubric-based scores nor feedback on how to improve the essays, which can be even more important than the overall scores for learning. We present FABRIC, a pipeline to help students and instructors in…

    Submitted 8 October, 2023; originally announced October 2023.

  14. arXiv:2309.13243   

    cs.CL

    ChEDDAR: Student-ChatGPT Dialogue in EFL Writing Education

    Authors: Jieun Han, Haneul Yoo, Junho Myung, Minsun Kim, Tak Yeon Lee, So-Yeon Ahn, Alice Oh

    Abstract: The integration of generative AI in education is expanding, yet empirical analyses of large-scale, real-world interactions between students and AI systems still remain limited. In this study, we present ChEDDAR, ChatGPT & EFL Learner's Dialogue Dataset As Revising an essay, which is collected from a semester-long longitudinal experiment involving 212 college students enrolled in English as Foreign…

    Submitted 20 March, 2024; v1 submitted 22 September, 2023; originally announced September 2023.

    Comments: The new version of this paper is on arXiv as arXiv:2403.08272

  15. arXiv:2309.10419

    cs.HC cs.AI

    Learning from Teaching Assistants to Program with Subgoals: Exploring the Potential for AI Teaching Assistants

    Authors: Changyoon Lee, Junho Myung, Jieun Han, Jiho Jin, Alice Oh

    Abstract: With recent advances in generative AI, conversational models like ChatGPT have become feasible candidates for TAs. We investigate the practicality of using generative AI as TAs in introductory programming education by examining novice learners' interaction with TAs in a subgoal learning environment. To compare the learners' interaction and perception of the AI and human TAs, we conducted a between…

    Submitted 19 September, 2023; originally announced September 2023.

    Comments: 15 pages, 6 figures, submitted to CHI 2024

  16. arXiv:2308.16705

    cs.CL cs.AI

    Exploring Cross-Cultural Differences in English Hate Speech Annotations: From Dataset Construction to Analysis

    Authors: Nayeon Lee, Chani Jung, Junho Myung, Jiho Jin, Jose Camacho-Collados, Juho Kim, Alice Oh

    Abstract: Warning: this paper contains content that may be offensive or upsetting. Most hate speech datasets neglect the cultural diversity within a single language, resulting in a critical shortcoming in hate speech detection. To address this, we introduce CREHate, a CRoss-cultural English Hate speech dataset. To construct CREHate, we follow a two-step procedure: 1) cultural post collection and 2) cross-…

    Submitted 3 April, 2024; v1 submitted 31 August, 2023; originally announced August 2023.

    Comments: Accepted to NAACL 2024 Main Conference

  17. arXiv:2307.16778

    cs.CL cs.AI

    KoBBQ: Korean Bias Benchmark for Question Answering

    Authors: Jiho Jin, Jiseon Kim, Nayeon Lee, Haneul Yoo, Alice Oh, Hwaran Lee

    Abstract: The Bias Benchmark for Question Answering (BBQ) is designed to evaluate social biases of language models (LMs), but it is not simple to adapt this benchmark to cultural contexts other than the US because social biases depend heavily on the cultural context. In this paper, we present KoBBQ, a Korean bias benchmark dataset, and we propose a general framework that addresses considerations for cultura…

    Submitted 25 January, 2024; v1 submitted 31 July, 2023; originally announced July 2023.

    Comments: TACL 2024 (pre-MIT Press publication version)

  18. arXiv:2305.17696

    cs.CL

    SQuARe: A Large-Scale Dataset of Sensitive Questions and Acceptable Responses Created Through Human-Machine Collaboration

    Authors: Hwaran Lee, Seokhee Hong, Joonsuk Park, Takyoung Kim, Meeyoung Cha, Yejin Choi, Byoung Pil Kim, Gunhee Kim, Eun-Ju Lee, Yong Lim, Alice Oh, Sangchul Park, Jung-Woo Ha

    Abstract: The potential social harms that large language models pose, such as generating offensive content and reinforcing biases, are steeply rising. Existing works focus on coping with this concern while interacting with ill-intentioned users, such as those who explicitly make hate speech or elicit harmful responses. However, discussions on sensitive issues can become toxic even if the users are well-inte…

    Submitted 28 May, 2023; originally announced May 2023.

    Comments: 19 pages, 10 figures, ACL 2023

  19. RECIPE: How to Integrate ChatGPT into EFL Writing Education

    Authors: Jieun Han, Haneul Yoo, Yoonsu Kim, Junho Myung, Minsun Kim, Hyunseung Lim, Juho Kim, Tak Yeon Lee, Hwajung Hong, So-Yeon Ahn, Alice Oh

    Abstract: The integration of generative AI in the field of education is actively being explored. In particular, ChatGPT has garnered significant interest, offering an opportunity to examine its effectiveness in English as a foreign language (EFL) education. To address this need, we present a novel learning platform called RECIPE (Revising an Essay with ChatGPT on an Interactive Platform for EFL learners). O…

    Submitted 19 May, 2023; originally announced May 2023.

  20. arXiv:2212.10504

    cs.CL

    Can Current Task-oriented Dialogue Models Automate Real-world Scenarios in the Wild?

    Authors: Sang-Woo Lee, Sungdong Kim, Donghyeon Ko, Donghoon Ham, Youngki Hong, Shin Ah Oh, Hyunhoon Jung, Wangkyo Jung, Kyunghyun Cho, Donghyun Kwak, Hyungsuk Noh, Woomyoung Park

    Abstract: Task-oriented dialogue (TOD) systems are mainly based on the slot-filling-based TOD (SF-TOD) framework, in which dialogues are broken down into smaller, controllable units (i.e., slots) to fulfill a specific task. A series of approaches based on this framework achieved remarkable success on various TOD benchmarks. However, we argue that the current TOD benchmarks are limited to surrogate real-worl…

    Submitted 24 May, 2023; v1 submitted 20 December, 2022; originally announced December 2022.

  21. CS1QA: A Dataset for Assisting Code-based Question Answering in an Introductory Programming Course

    Authors: Changyoon Lee, Yeon Seonwoo, Alice Oh

    Abstract: We introduce CS1QA, a dataset for code-based question answering in the programming education domain. CS1QA consists of 9,237 question-answer pairs gathered from chat logs in an introductory programming class using Python, and 17,698 unannotated chat data with code. Each question is accompanied with the student's code, and the portion of the code relevant to answering the question. We carefully des…

    Submitted 26 October, 2022; originally announced October 2022.

  22. arXiv:2210.14389

    cs.CL cs.AI

    Towards standardizing Korean Grammatical Error Correction: Datasets and Annotation

    Authors: Soyoung Yoon, Sungjoon Park, Gyuwan Kim, Junhee Cho, Kihyo Park, Gyutae Kim, Minjoon Seo, Alice Oh

    Abstract: Research on Korean grammatical error correction (GEC) is limited, compared to other major languages such as English. We attribute this problematic circumstance to the lack of a carefully designed evaluation benchmark for Korean GEC. In this work, we collect three datasets from different sources (Kor-Lang8, Kor-Native, and Kor-Learner) that cover a wide range of Korean grammatical errors. Consider…

    Submitted 24 May, 2023; v1 submitted 25 October, 2022; originally announced October 2022.

    Comments: Accepted at ACL 2023 (main)

  23. arXiv:2210.13778

    cs.CL

    IDK-MRC: Unanswerable Questions for Indonesian Machine Reading Comprehension

    Authors: Rifki Afina Putri, Alice Oh

    Abstract: Machine Reading Comprehension (MRC) has become one of the essential tasks in Natural Language Understanding (NLU) as it is often included in several NLU benchmarks (Liang et al., 2020; Wilie et al., 2020). However, most MRC datasets only have answerable question type, overlooking the importance of unanswerable questions. MRC models trained only on answerable questions will select the span that is…

    Submitted 25 October, 2022; originally announced October 2022.

    Comments: EMNLP 2022

  24. arXiv:2210.06828

    cs.CL

    Rethinking Annotation: Can Language Learners Contribute?

    Authors: Haneul Yoo, Rifki Afina Putri, Changyoon Lee, Youngin Lee, So-Yeon Ahn, Dongyeop Kang, Alice Oh

    Abstract: Researchers have traditionally recruited native speakers to provide annotations for widely used benchmark datasets. However, there are languages for which recruiting native speakers can be difficult, and it would help to find learners of those languages to annotate the data. In this paper, we investigate whether language learners can contribute annotations to benchmark datasets. In a carefully con…

    Submitted 29 May, 2023; v1 submitted 13 October, 2022; originally announced October 2022.

    Comments: ACL 2023

  25. HUE: Pretrained Model and Dataset for Understanding Hanja Documents of Ancient Korea

    Authors: Haneul Yoo, Jiho Jin, Juhee Son, JinYeong Bak, Kyunghyun Cho, Alice Oh

    Abstract: Historical records in Korea before the 20th century were primarily written in Hanja, an extinct language based on Chinese characters and not understood by modern Korean or Chinese speakers. Historians with expertise in this time period have been analyzing the documents, but that process is very difficult and time-consuming, and language models would significantly speed up the process. Toward build…

    Submitted 10 October, 2022; originally announced October 2022.

    Comments: Findings of NAACL 2022

  26. arXiv:2209.04333

    cs.CL

    Ranking-Enhanced Unsupervised Sentence Representation Learning

    Authors: Yeon Seonwoo, Guoyin Wang, Changmin Seo, Sajal Choudhary, Jiwei Li, Xiang Li, Puyang Xu, Sunghyun Park, Alice Oh

    Abstract: Unsupervised sentence representation learning has progressed through contrastive learning and data augmentation methods such as dropout masking. Despite this progress, sentence encoders are still limited to using only an input sentence when predicting its semantic vector. In this work, we show that the semantic meaning of a sentence is also determined by nearest-neighbor sentences that are similar…

    Submitted 18 May, 2023; v1 submitted 9 September, 2022; originally announced September 2022.

    Comments: ACL 2023

  27. arXiv:2209.00508

    cs.LG cs.AI cs.SI

    Models and Benchmarks for Representation Learning of Partially Observed Subgraphs

    Authors: Dongkwan Kim, Jiho Jin, Jaimeen Ahn, Alice Oh

    Abstract: Subgraphs are rich substructures in graphs, and their nodes and edges can be partially observed in real-world tasks. Under partial observation, existing node- or subgraph-level message-passing produces suboptimal representations. In this paper, we formulate a novel task of learning representations of partially observed subgraphs. To solve this problem, we propose Partial Subgraph InfoMax (PSI) fra…

    Submitted 19 September, 2022; v1 submitted 1 September, 2022; originally announced September 2022.

    Comments: CIKM 2022 Short Paper (Camera-ready + Appendix)

  28. arXiv:2205.11315

    cs.CL cs.AI

    KOLD: Korean Offensive Language Dataset

    Authors: Younghoon Jeong, Juhyun Oh, Jaimeen Ahn, Jongwon Lee, Jihyung Moon, Sungjoon Park, Alice Oh

    Abstract: Recent directions for offensive language detection are hierarchical modeling, identifying the type and the target of offensive language, and interpretability with offensive span annotation and prediction. These improvements are focused on English and do not transfer well to other languages because of cultural and linguistic differences. In this paper, we present the Korean Offensive Language Datas…

    Submitted 4 November, 2022; v1 submitted 23 May, 2022; originally announced May 2022.

    Comments: 9 pages, 2 figures

  29. arXiv:2205.10019

    cs.CL cs.AI cs.LG

    Translating Hanja Historical Documents to Contemporary Korean and English

    Authors: Juhee Son, Jiho Jin, Haneul Yoo, JinYeong Bak, Kyunghyun Cho, Alice Oh

    Abstract: The Annals of Joseon Dynasty (AJD) contain the daily records of the Kings of Joseon, the 500-year kingdom preceding the modern nation of Korea. The Annals were originally written in an archaic Korean writing system, 'Hanja', and were translated into Korean from 1968 to 1993. The resulting translation was however too literal and contained many archaic Korean words; thus, a new expert translation ef…

    Submitted 29 December, 2023; v1 submitted 20 May, 2022; originally announced May 2022.

    Comments: EMNLP Findings 2022

  30. arXiv:2205.09393

    cs.CL

    Two-Step Question Retrieval for Open-Domain QA

    Authors: Yeon Seonwoo, Juhee Son, Jiho Jin, Sang-Woo Lee, Ji-Hoon Kim, Jung-Woo Ha, Alice Oh

    Abstract: The retriever-reader pipeline has shown promising performance in open-domain QA but suffers from a very slow inference speed. Recently proposed question retrieval models tackle this problem by indexing question-answer pairs and searching for similar questions. These models have shown a significant increase in inference speed, but at the cost of lower QA performance compared to the retriever-reader…

    Submitted 19 May, 2022; originally announced May 2022.

    Comments: ACL2022-Findings

  31. arXiv:2204.04879

    cs.LG cs.AI cs.SI stat.ML

    How to Find Your Friendly Neighborhood: Graph Attention Design with Self-Supervision

    Authors: Dongkwan Kim, Alice Oh

    Abstract: Attention mechanism in graph neural networks is designed to assign larger weights to important neighbor nodes for better representation. However, what graph attention learns is not understood well, particularly when graphs are noisy. In this paper, we propose a self-supervised graph attention network (SuperGAT), an improved graph attention model for noisy graphs. Specifically, we exploit two atten…

    Submitted 11 April, 2022; originally announced April 2022.

    Comments: ICLR 2021

  32. arXiv:2204.04510

    cs.LG cs.AI cs.SI

    Translating Subgraphs to Nodes Makes Simple GNNs Strong and Efficient for Subgraph Representation Learning

    Authors: Dongkwan Kim, Alice Oh

    Abstract: Subgraph representation learning has emerged as an important problem, but it is by default approached with specialized graph neural networks on a large global graph. These models demand extensive memory and computational resources but challenge modeling hierarchical structures of subgraphs. In this paper, we propose Subgraph-To-Node (S2N) translation, a novel formulation for learning representatio…

    Submitted 23 May, 2024; v1 submitted 9 April, 2022; originally announced April 2022.

    Comments: ICML 2024 Camera Ready (22 pages)

  33. arXiv:2109.09057

    cs.CL

    Knowledge-Enhanced Evidence Retrieval for Counterargument Generation

    Authors: Yohan Jo, Haneul Yoo, JinYeong Bak, Alice Oh, Chris Reed, Eduard Hovy

    Abstract: Finding counterevidence to statements is key to many tasks, including counterargument generation. We build a system that, given a statement, retrieves counterevidence from diverse sources on the Web. At the core of this system is a natural language inference (NLI) model that determines whether a candidate sentence is valid counterevidence or not. Most NLI models to date, however, lack proper reaso…

    Submitted 19 September, 2021; originally announced September 2021.

    Comments: To appear in Findings of EMNLP 2021

  34. arXiv:2109.06527

    cs.CL

    Learning Bill Similarity with Annotated and Augmented Corpora of Bills

    Authors: Jiseon Kim, Elden Griggs, In Song Kim, Alice Oh

    Abstract: Bill writing is a critical element of representative democracy. However, it is often overlooked that most legislative bills are derived, or even directly copied, from other bills. Despite the significance of bill-to-bill linkages for understanding the legislative process, existing approaches fail to address semantic similarities across bills, let alone reordering or paraphrasing which are prevalen…

    Submitted 14 September, 2021; originally announced September 2021.

    Comments: Accepted at EMNLP 2021(Long paper)

  35. arXiv:2109.05941

    cs.CL cs.LG

    Efficient Contrastive Learning via Novel Data Augmentation and Curriculum Learning

    Authors: Seonghyeon Ye, Jiseon Kim, Alice Oh

    Abstract: We introduce EfficientCL, a memory-efficient continual pretraining method that applies contrastive learning with novel data augmentation and curriculum learning. For data augmentation, we stack two types of operation sequentially: cutoff and PCA jittering. While pretraining steps proceed, we apply curriculum learning by incrementing the augmentation degree for each difficulty step. After data augm…

    Submitted 18 October, 2021; v1 submitted 10 September, 2021; originally announced September 2021.

    Comments: EMNLP 2021

  36. arXiv:2109.05704

    cs.CL cs.AI

    Mitigating Language-Dependent Ethnic Bias in BERT

    Authors: Jaimeen Ahn, Alice Oh

    Abstract: BERT and other large-scale language models (LMs) contain gender and racial bias. They also exhibit other dimensions of social bias, most of which have not been studied in depth, and some of which vary depending on the language. In this paper, we study ethnic bias and how it varies across languages by analyzing and mitigating ethnic bias in monolingual BERT for English, German, Spanish, Korean, Tur…

    Submitted 14 September, 2021; v1 submitted 13 September, 2021; originally announced September 2021.

    Comments: 17 pages including references and appendix. To appear in EMNLP 2021 (camera-ready ver.)

  37. arXiv:2106.09983

    cs.CL

    Weakly Supervised Pre-Training for Multi-Hop Retriever

    Authors: Yeon Seonwoo, Sang-Woo Lee, Ji-Hoon Kim, Jung-Woo Ha, Alice Oh

    Abstract: In multi-hop QA, answering complex questions entails iterative document retrieval for finding the missing entity of the question. The main steps of this process are sub-question detection, document retrieval for the sub-question, and generation of a new query for the final document retrieval. However, building a dataset that contains complex questions with sub-questions and their corresponding doc…

    Submitted 18 June, 2021; originally announced June 2021.

    Comments: ACL-Findings 2021

  38. arXiv:2105.09680

    cs.CL

    KLUE: Korean Language Understanding Evaluation

    Authors: Sungjoon Park, Jihyung Moon, Sungdong Kim, Won Ik Cho, Jiyoon Han, Jangwon Park, Chisung Song, Junseong Kim, Yongsook Song, Taehwan Oh, Joohong Lee, Juhyun Oh, Sungwon Lyu, Younghoon Jeong, Inkwon Lee, Sangwoo Seo, Dongjun Lee, Hyunwoo Kim, Myeonghwa Lee, Seongbo Jang, Seungwon Do, Sunkyoung Kim, Kyungtae Lim, Jongwon Lee, Kyumin Park, et al. (6 additional authors not shown)

    Abstract: We introduce Korean Language Understanding Evaluation (KLUE) benchmark. KLUE is a collection of 8 Korean natural language understanding (NLU) tasks, including Topic Classification, Semantic Textual Similarity, Natural Language Inference, Named Entity Recognition, Relation Extraction, Dependency Parsing, Machine Reading Comprehension, and Dialogue State Tracking. We build all of the tasks from scrat…

    Submitted 2 November, 2021; v1 submitted 20 May, 2021; originally announced May 2021.

    Comments: 76 pages, 10 figures, 36 tables

  39. arXiv:2011.02687

    cs.CL

    Context-Aware Answer Extraction in Question Answering

    Authors: Yeon Seonwoo, Ji-Hoon Kim, Jung-Woo Ha, Alice Oh

    Abstract: Extractive QA models have shown very promising performance in predicting the correct answer to a question for a given passage. However, they sometimes result in predicting the correct answer text but in a context irrelevant to the given question. This discrepancy becomes especially important as the number of occurrences of the answer text in a passage increases. To resolve this issue, we propose \…

    Submitted 5 November, 2020; originally announced November 2020.

    Comments: EMNLP 2020

  40. arXiv:2006.07015

    cs.CL

    Speaker Sensitive Response Evaluation Model

    Authors: JinYeong Bak, Alice Oh

    Abstract: Automatic evaluation of open-domain dialogue response generation is very challenging because there are many appropriate responses for a given context. Existing evaluation models merely compare the generated response with the ground truth response and rate many of the appropriate responses as inappropriate if they deviate from the ground truth. One approach to resolve this problem is to consider th…

    Submitted 12 June, 2020; originally announced June 2020.

    Comments: Accepted at ACL 2020

  41. K-EmoCon, a multimodal sensor dataset for continuous emotion recognition in naturalistic conversations

    Authors: Cheul Young Park, Narae Cha, Soowon Kang, Auk Kim, Ahsan Habib Khandoker, Leontios Hadjileontiadis, Alice Oh, Yong Jeong, Uichin Lee

    Abstract: Recognizing emotions during social interactions has many potential applications with the popularization of low-cost mobile sensors, but a challenge remains with the lack of naturalistic affective interaction data. Most existing emotion datasets do not support studying idiosyncratic emotions arising in the wild as they were collected in constrained environments. Therefore, studying emotions in the…

    Submitted 19 May, 2020; v1 submitted 8 May, 2020; originally announced May 2020.

    Comments: 20 pages, 4 figures, for associated dataset, see https://doi.org/10.5281/zenodo.3814370

    Journal ref: Sci Data 7, (2020) 293

  42. arXiv:1911.02499

    cs.CL cs.SD eess.AS

    Dimensional Emotion Detection from Categorical Emotion

    Authors: Sungjoon Park, Jiseon Kim, Seonghyeon Ye, Jaeyeol Jeon, Hee Young Park, Alice Oh

    Abstract: We present a model to predict fine-grained emotions along the continuous dimensions of valence, arousal, and dominance (VAD) with a corpus with categorical emotion annotations. Our model is trained by minimizing the EMD (Earth Mover's Distance) loss between the predicted VAD score distribution and the categorical emotion distributions sorted along VAD, and it can simultaneously classify the emotio…

    Submitted 10 September, 2021; v1 submitted 6 November, 2019; originally announced November 2019.

    Comments: 9 pages, 2 figures

  43. arXiv:1904.00350  [pdf, other]

    cs.CL cs.LG stat.ML

    Conversation Model Fine-Tuning for Classifying Client Utterances in Counseling Dialogues

    Authors: Sungjoon Park, Donghyun Kim, Alice Oh

    Abstract: The recent surge of text-based online counseling applications enables us to collect and analyze interactions between counselors and clients. A dataset of those interactions can be used to learn to automatically classify the client utterances into categories that help counselors in diagnosing client status and predicting counseling outcome. With proper anonymization, we collect counselor-client dia…

    Submitted 31 March, 2019; originally announced April 2019.

    Comments: 9 pages, 2 figures, NAACL 2019

  44. arXiv:1811.09702  [pdf, other]

    cs.CY cs.LG cs.SI stat.ML

    Homogeneity-Based Transmissive Process to Model True and False News in Social Networks

    Authors: Jooyeon Kim, Dongkwan Kim, Alice Oh

    Abstract: An overwhelming number of true and false news stories are posted and shared in social networks, and users diffuse the stories based on multiple factors. Diffusion of news stories from one user to another depends not only on the stories' content and the genuineness but also on the alignment of the topical interests between the users. In this paper, we propose a novel Bayesian nonparametric model th…

    Submitted 15 November, 2018; originally announced November 2018.

    Comments: To appear in proceedings of the 12th ACM International Conference on Web Search and Data Mining (WSDM 2019)

    ACM Class: I.2.1; I.2.6; I.2.7; I.7; K.4.2; G.1.2; G.1.6

  45. arXiv:1711.09918  [pdf, other]

    cs.SI cs.HC stat.ML

    Leveraging the Crowd to Detect and Reduce the Spread of Fake News and Misinformation

    Authors: Jooyeon Kim, Behzad Tabibian, Alice Oh, Bernhard Schoelkopf, Manuel Gomez-Rodriguez

    Abstract: Online social networking sites are experimenting with the following crowd-powered procedure to reduce the spread of fake news and misinformation: whenever a user is exposed to a story through her feed, she can flag the story as misinformation and, if the story receives enough flags, it is sent to a trusted third party for fact checking. If this party identifies the story as misinformation, it is m…

    Submitted 27 November, 2017; originally announced November 2017.

    Comments: To appear at the 11th ACM International Conference on Web Search and Data Mining (WSDM 2018)

  46. arXiv:1709.05828  [pdf, other]

    cs.HC

    Non-Linear Editor for Text-Based Screencast

    Authors: Jungkook Park, Yeong Hoon Park, Alice Oh

    Abstract: Screencasts, where a computer screen is broadcast to a large audience on the web, are becoming popular as an online educational tool. Among the various types of screencast content, those that involve text editing, including computer programming, are popular. There are emerging platforms that support such text-based screencasts by recording every character insertion/deletion from the creator and re…

    Submitted 18 September, 2017; originally announced September 2017.

    Comments: To appear in Adjunct Proceedings of the 30th Annual ACM Symposium on User Interface Software & Technology (UIST 2017, Poster)

  47. arXiv:1706.00593  [pdf, other]

    cs.CL cs.DL cs.SI

    Joint Modeling of Topics, Citations, and Topical Authority in Academic Corpora

    Authors: Jooyeon Kim, Dongwoo Kim, Alice Oh

    Abstract: Much of scientific progress stems from previously published findings, but searching through the vast sea of scientific publications is difficult. We often rely on metrics of scholarly authority to find the prominent authors, but these authority indices do not differentiate authority based on research topics. We present Latent Topical-Authority Indexing (LTAI) for jointly modeling the topics, citati…

    Submitted 2 June, 2017; originally announced June 2017.

    Comments: Accepted by Transactions of the Association for Computational Linguistics (TACL); to appear

  48. arXiv:1512.08321  [pdf, other]

    cs.HC cs.CY

    The Proficiency-Congruency Dilemma: Virtual Team Design and Performance in Multiplayer Online Games

    Authors: Jooyeon Kim, Brian C. Keegan, Sungjoon Park, Alice Oh

    Abstract: Multiplayer online battle arena games provide an excellent opportunity to study team performance. When designing a team, players must negotiate a proficiency-congruency dilemma between selecting roles that best match their experience and roles that best complement the existing roles on the team. We adopt a mixed-methods approach to explore how users negotiate this dilemma. Using data from…

    Submitted 28 December, 2015; originally announced December 2015.

    Comments: To appear in Proceedings of the 34th Annual ACM Conference on Human Factors in Computing Systems (CHI 2016)

  49. Understanding Editing Behaviors in Multilingual Wikipedia

    Authors: Suin Kim, Sungjoon Park, Scott A. Hale, Sooyoung Kim, Jeongmin Byun, Alice Oh

    Abstract: Multilingualism is common offline, but we have a more limited understanding of the ways multilingualism is displayed online and the roles that multilinguals play in the spread of content between speakers of different languages. We take a computational approach to studying multilingualism using one of the largest user-generated content platforms, Wikipedia. We study multilingualism by collecting an…

    Submitted 28 August, 2015; originally announced August 2015.

    Comments: 34 pages, 7 figures

  50. Hierarchical Dirichlet Scaling Process

    Authors: Dongwoo Kim, Alice Oh

    Abstract: We present the hierarchical Dirichlet scaling process (HDSP), a Bayesian nonparametric mixed membership model. The HDSP generalizes the hierarchical Dirichlet process (HDP) to model the correlation structure between metadata in the corpus and mixture components. We construct the HDSP based on the normalized gamma representation of the Dirichlet process, and this construction allows incorp…

    Submitted 11 February, 2015; v1 submitted 22 March, 2014; originally announced April 2014.