Showing 1–50 of 133 results for author: Kwon, Y

Searching in archive cs.
  1. arXiv:2406.16341

    cs.CL

    EHRCon: Dataset for Checking Consistency between Unstructured Notes and Structured Tables in Electronic Health Records

    Authors: Yeonsu Kwon, Jiho Kim, Gyubok Lee, Seongsu Bae, Daeun Kyung, Wonchul Cha, Tom Pollard, Alistair Johnson, Edward Choi

    Abstract: Electronic Health Records (EHRs) are integral for storing comprehensive patient medical records, combining structured data (e.g., medications) with detailed clinical notes (e.g., physician notes). These elements are essential for straightforward data retrieval and provide deep, contextual insights into patient care. However, they often suffer from discrepancies due to unintuitive EHR system design…

    Submitted 24 June, 2024; originally announced June 2024.

  2. arXiv:2406.09047

    cs.CG

    DeepJEB: 3D Deep Learning-based Synthetic Jet Engine Bracket Dataset

    Authors: Seongjun Hong, Yongmin Kwon, Dongju Shin, Jangseop Park, Namwoo Kang

    Abstract: Recent advancements in artificial intelligence (AI) have significantly influenced various fields, including mechanical engineering. Nonetheless, the development of high-quality, diverse datasets for structural analysis still needs to be improved. Although traditional datasets, such as simulated jet engine bracket dataset, are useful, they are constrained by a small number of samples, which must be…

    Submitted 12 June, 2024; originally announced June 2024.

  3. arXiv:2406.06947

    cs.AI cs.HC

    CAAP: Context-Aware Action Planning Prompting to Solve Computer Tasks with Front-End UI Only

    Authors: Junhee Cho, Jihoon Kim, Daseul Bae, Jinho Choo, Youngjune Gwon, Yeong-Dae Kwon

    Abstract: Software robots have long been deployed in Robotic Process Automation (RPA) to automate mundane and repetitive computer tasks. The advent of Large Language Models (LLMs) with advanced reasoning capabilities has set the stage for these agents to now undertake more complex and even previously unseen tasks. However, the LLM-based automation techniques in recent literature frequently rely on HTML sour…

    Submitted 11 June, 2024; originally announced June 2024.

    Comments: 10 pages, 5 figures; (19 pages and 6 figures more in appendix)

  4. arXiv:2406.06650

    eess.IV cs.CV

    Predicting the risk of early-stage breast cancer recurrence using H&E-stained tissue images

    Authors: Geongyu Lee, Joonho Lee, Tae-Yeong Kwak, Sun Woo Kim, Youngmee Kwon, Chungyeul Kim, Hyeyoon Chang

    Abstract: Accurate prediction of the likelihood of recurrence is important in the selection of postoperative treatment for patients with early-stage breast cancer. In this study, we investigated whether deep learning algorithms can predict patients' risk of recurrence by analyzing the pathology images of their cancer histology. A total of 125 hematoxylin and eosin stained breast cancer whole slide images la…

    Submitted 10 June, 2024; originally announced June 2024.

    Comments: 12 pages, 7 figures

  5. arXiv:2406.01339

    cs.HC cs.OS cs.SE

    Recover as It is Designed to Be: Recovering from Compatibility Mobile App Crashes by Reusing User Flows

    Authors: Donghwi Kim, Hyungjun Yoon, Chang Min Park, Sujin Han, Youngjin Kwon, Steven Y. Ko, Sung-Ju Lee

    Abstract: Android OS is severely fragmented by API updates and device vendors' OS customization, creating a market condition where vastly different OS versions coexist. This gives rise to compatibility crash problems where Android apps crash on certain Android versions but not on others. Although well-known, this problem is extremely challenging for app developers to overcome due to the sheer number of Andr…

    Submitted 3 June, 2024; originally announced June 2024.

  6. arXiv:2405.18253

    cs.LG cs.GT

    Truthful Dataset Valuation by Pointwise Mutual Information

    Authors: Shuran Zheng, Yongchan Kwon, Xuan Qi, James Zou

    Abstract: A common way to evaluate a dataset in ML involves training a model on this dataset and assessing the model's performance on a test set. However, this approach has two issues: (1) it may incentivize undesirable data manipulation in data marketplaces, as the self-interested data providers seek to modify the dataset to maximize their evaluation scores; (2) it may select datasets that overfit to poten…

    Submitted 28 May, 2024; originally announced May 2024.

  7. arXiv:2405.06424

    cs.CL cs.AI cs.LG

    Improving Instruction Following in Language Models through Proxy-Based Uncertainty Estimation

    Authors: JoonHo Lee, Jae Oh Woo, Juree Seok, Parisa Hassanzadeh, Wooseok Jang, JuYoun Son, Sima Didari, Baruch Gutow, Heng Hao, Hankyu Moon, Wenjun Hu, Yeong-Dae Kwon, Taehee Lee, Seungjai Min

    Abstract: Assessing response quality to instructions in language models is vital but challenging due to the complexity of human language across different contexts. This complexity often results in ambiguous or inconsistent interpretations, making accurate assessment difficult. To address this issue, we propose a novel Uncertainty-aware Reward Model (URM) that introduces a robust uncertainty estimation for t…

    Submitted 19 May, 2024; v1 submitted 10 May, 2024; originally announced May 2024.

    Comments: Accepted to ICML 2024

  8. arXiv:2405.03875

    cs.LG stat.ML

    Rethinking Data Shapley for Data Selection Tasks: Misleads and Merits

    Authors: Jiachen T. Wang, Tianji Yang, James Zou, Yongchan Kwon, Ruoxi Jia

    Abstract: Data Shapley provides a principled approach to data valuation and plays a crucial role in data-centric machine learning (ML) research. Data selection is considered a standard application of Data Shapley. However, its data selection performance has shown to be inconsistent across settings in the literature. This study aims to deepen our understanding of this phenomenon. We introduce a hypothesis te…

    Submitted 6 May, 2024; originally announced May 2024.

    Comments: ICML 2024

  9. arXiv:2404.10933

    cs.AI cs.CL cs.LG

    LLMem: Estimating GPU Memory Usage for Fine-Tuning Pre-Trained LLMs

    Authors: Taeho Kim, Yanming Wang, Vatshank Chaturvedi, Lokesh Gupta, Seyeon Kim, Yongin Kwon, Sangtae Ha

    Abstract: Fine-tuning pre-trained large language models (LLMs) with limited hardware presents challenges due to GPU memory constraints. Various distributed fine-tuning methods have been proposed to alleviate memory constraints on GPU. However, determining the most effective method for achieving rapid fine-tuning while preventing GPU out-of-memory issues in a given environment remains unclear. To address thi…

    Submitted 16 April, 2024; originally announced April 2024.

    Comments: 9 pages, 9 figures, accepted to IJCAI 2024

  10. arXiv:2404.08847

    cs.IR cs.CR cs.LG

    LazyDP: Co-Designing Algorithm-Software for Scalable Training of Differentially Private Recommendation Models

    Authors: Juntaek Lim, Youngeun Kwon, Ranggi Hwang, Kiwan Maeng, G. Edward Suh, Minsoo Rhu

    Abstract: Differential privacy (DP) is widely being employed in the industry as a practical standard for privacy protection. While private training of computer vision or natural language processing applications has been studied extensively, the computational challenges of training of recommender systems (RecSys) with DP have not been explored. In this work, we first present our detailed characterization of…

    Submitted 12 April, 2024; originally announced April 2024.

    Journal ref: Published at 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS-29), 2024

  11. arXiv:2404.01954

    cs.CL cs.AI

    HyperCLOVA X Technical Report

    Authors: Kang Min Yoo, Jaegeun Han, Sookyo In, Heewon Jeon, Jisu Jeong, Jaewook Kang, Hyunwook Kim, Kyung-Min Kim, Munhyong Kim, Sungju Kim, Donghyun Kwak, Hanock Kwak, Se Jung Kwon, Bado Lee, Dongsoo Lee, Gichang Lee, Jooho Lee, Baeseong Park, Seongjin Shin, Joonsang Yu, Seolki Baek, Sumin Byeon, Eungsup Cho, Dooseok Choe, Jeesung Han, et al. (371 additional authors not shown)

    Abstract: We introduce HyperCLOVA X, a family of large language models (LLMs) tailored to the Korean language and culture, along with competitive capabilities in English, math, and coding. HyperCLOVA X was trained on a balanced mix of Korean, English, and code data, followed by instruction-tuning with high-quality human-annotated datasets while abiding by strict safety guidelines reflecting our commitment t…

    Submitted 13 April, 2024; v1 submitted 2 April, 2024; originally announced April 2024.

    Comments: 44 pages; updated authors list and fixed author names

  12. arXiv:2403.16049

    cs.LG physics.soc-ph

    Improving Demand Forecasting in Open Systems with Cartogram-Enhanced Deep Learning

    Authors: Sangjoon Park, Yongsung Kwon, Hyungjoon Soh, Mi Jin Lee, Seung-Woo Son

    Abstract: Predicting temporal patterns across various domains poses significant challenges due to their nuanced and often nonlinear trajectories. To address this challenge, prediction frameworks have been continuously refined, employing data-driven statistical methods, mathematical models, and machine learning. Recently, as one of the challenging systems, shared transport systems such as public bicycles hav…

    Submitted 26 May, 2024; v1 submitted 24 March, 2024; originally announced March 2024.

    Comments: 11 pages, 7 figures

  13. arXiv:2403.12098

    cs.CV cs.AI cs.LG eess.IV

    Deep Generative Design for Mass Production

    Authors: Jihoon Kim, Yongmin Kwon, Namwoo Kang

    Abstract: Generative Design (GD) has evolved as a transformative design approach, employing advanced algorithms and AI to create diverse and innovative solutions beyond traditional constraints. Despite its success, GD faces significant challenges regarding the manufacturability of complex designs, often necessitating extensive manual modifications due to limitations in standard manufacturing processes and t…

    Submitted 15 March, 2024; originally announced March 2024.

  14. arXiv:2403.11513

    cs.RO

    Visual Preference Inference: An Image Sequence-Based Preference Reasoning in Tabletop Object Manipulation

    Authors: Joonhyung Lee, Sangbeom Park, Yongin Kwon, Jemin Lee, Minwook Ahn, Sungjoon Choi

    Abstract: In robotic object manipulation, human preferences can often be influenced by the visual attributes of objects, such as color and shape. These properties play a crucial role in operating a robot to interact with objects and align with human intention. In this paper, we focus on the problem of inferring underlying human preferences from a sequence of raw visual observations in tabletop manipulation…

    Submitted 18 March, 2024; originally announced March 2024.

    Comments: 8 pages

  15. arXiv:2402.09264

    cs.LG cs.HC

    UR2M: Uncertainty and Resource-Aware Event Detection on Microcontrollers

    Authors: Hong Jia, Young D. Kwon, Dong Ma, Nhat Pham, Lorena Qendro, Tam Vu, Cecilia Mascolo

    Abstract: Traditional machine learning techniques are prone to generating inaccurate predictions when confronted with shifts in the distribution of data between the training and testing phases. This vulnerability can lead to severe consequences, especially in applications such as mobile healthcare. Uncertainty estimation has the potential to mitigate this issue by assessing the reliability of a model's outp…

    Submitted 12 March, 2024; v1 submitted 14 February, 2024; originally announced February 2024.

  16. arXiv:2401.16875

    quant-ph cs.ET

    Quantum Circuit Mapping for Universal and Scalable Computing in MZI-based Integrated Photonics

    Authors: Yong Kwon, Alessio Baldazzi, Lorenzo Pavesi, Byung-Soo Choi

    Abstract: Linear optical quantum computing (LOQC) offers a quantum computation paradigm based on well-established and robust technology and flexible environmental conditions following DiVincenzo's criteria. Within this framework, integrated photonics can be utilized to achieve gate-based quantum computing, defining qubits by path-encoding, quantum gates through the use of Mach-Zehnder interferometers (MZIs)…

    Submitted 30 January, 2024; originally announced January 2024.

    Journal ref: Opt. Express 32, 12852-12881 (2024)

  17. arXiv:2312.08400

    cs.CL cs.AI

    Beyond English: Evaluating LLMs for Arabic Grammatical Error Correction

    Authors: Sang Yun Kwon, Gagan Bhatia, El Moatez Billah Nagoudi, Muhammad Abdul-Mageed

    Abstract: Large language models (LLMs) finetuned to follow human instruction have recently exhibited significant capabilities in various English NLP tasks. However, their performance in grammatical error correction (GEC), especially on languages other than English, remains significantly unexplored. In this work, we evaluate the abilities of instruction finetuned LLMs in Arabic GEC, a complex task due to Ara…

    Submitted 13 December, 2023; originally announced December 2023.

    Comments: arXiv admin note: text overlap with arXiv:2308.04492

  18. arXiv:2312.01511

    cs.CY cs.DB

    SoK: The Gap Between Data Rights Ideals and Reality

    Authors: Yujin Kwon, Ella Corren, Gonzalo Munilla Garrido, Chris Hoofnagle, Dawn Song

    Abstract: As information economies burgeon, they unlock innovation and economic wealth while posing novel threats to civil liberties and altering power dynamics between individuals, companies, and governments. Legislatures have reacted with privacy laws designed to empower individuals over their data. These laws typically create rights for "data subjects" (individuals) to make requests of data collectors (c…

    Submitted 3 December, 2023; originally announced December 2023.

  19. arXiv:2311.13712

    cs.AI

    Data Acquisition: A New Frontier in Data-centric AI

    Authors: Lingjiao Chen, Bilge Acun, Newsha Ardalani, Yifan Sun, Feiyang Kang, Hanrui Lyu, Yongchan Kwon, Ruoxi Jia, Carole-Jean Wu, Matei Zaharia, James Zou

    Abstract: As Machine Learning (ML) systems continue to grow, the demand for relevant and comprehensive datasets becomes imperative. There is limited study on the challenges of data acquisition due to ad-hoc processes and lack of consistent methodologies. We first present an investigation of current data marketplaces, revealing lack of platforms offering detailed information about datasets, transparent prici…

    Submitted 22 November, 2023; originally announced November 2023.

  20. arXiv:2311.11420

    cs.LG cs.AI cs.CV

    LifeLearner: Hardware-Aware Meta Continual Learning System for Embedded Computing Platforms

    Authors: Young D. Kwon, Jagmohan Chauhan, Hong Jia, Stylianos I. Venieris, Cecilia Mascolo

    Abstract: Continual Learning (CL) allows applications such as user personalization and household robots to learn on the fly and adapt to context. This is an important feature when context, actions, and users change. However, enabling CL on resource-constrained embedded systems is challenging due to the limited labeled data, memory, and computing capacity. In this paper, we propose LifeLearner, a hardware-aw…

    Submitted 19 November, 2023; originally announced November 2023.

    Comments: Accepted for publication at SenSys 2023

  21. arXiv:2310.11449

    cs.CV cs.GR cs.LG

    DELIFFAS: Deformable Light Fields for Fast Avatar Synthesis

    Authors: Youngjoong Kwon, Lingjie Liu, Henry Fuchs, Marc Habermann, Christian Theobalt

    Abstract: Generating controllable and photorealistic digital human avatars is a long-standing and important problem in Vision and Graphics. Recent methods have shown great progress in terms of either photorealism or inference speed while the combination of the two desired properties still remains unsolved. To this end, we propose a novel method, called DELIFFAS, which parameterizes the appearance of the hum…

    Submitted 17 October, 2023; originally announced October 2023.

  22. arXiv:2310.11220

    cs.CL

    KG-GPT: A General Framework for Reasoning on Knowledge Graphs Using Large Language Models

    Authors: Jiho Kim, Yeonsu Kwon, Yohan Jo, Edward Choi

    Abstract: While large language models (LLMs) have made considerable advancements in understanding and generating unstructured text, their application in structured data remains underexplored. Particularly, using LLMs for complex reasoning tasks on knowledge graphs (KGs) remains largely untouched. To address this, we propose KG-GPT, a multi-purpose framework leveraging LLMs for tasks employing KGs. KG-GPT co…

    Submitted 17 October, 2023; originally announced October 2023.

    Comments: Accepted to EMNLP 2023 Findings

  23. arXiv:2310.00902

    cs.LG stat.ML

    DataInf: Efficiently Estimating Data Influence in LoRA-tuned LLMs and Diffusion Models

    Authors: Yongchan Kwon, Eric Wu, Kevin Wu, James Zou

    Abstract: Quantifying the impact of training data points is crucial for understanding the outputs of machine learning models and for improving the transparency of the AI pipeline. The influence function is a principled and popular data attribution method, but its computational cost often makes it challenging to use. This issue becomes more pronounced in the setting of large language models and text-to-image…

    Submitted 13 March, 2024; v1 submitted 2 October, 2023; originally announced October 2023.

    Comments: ICLR 2024

  24. arXiv:2309.14741

    eess.AS cs.SD

    Rethinking Session Variability: Leveraging Session Embeddings for Session Robustness in Speaker Verification

    Authors: Hee-Soo Heo, KiHyun Nam, Bong-Jin Lee, Youngki Kwon, Minjae Lee, You Jin Kim, Joon Son Chung

    Abstract: In the field of speaker verification, session or channel variability poses a significant challenge. While many contemporary methods aim to disentangle session information from speaker embeddings, we introduce a novel approach using an additional embedding to represent the session information. This is achieved by training an auxiliary network appended to the speaker embedding extractor which remain…

    Submitted 26 September, 2023; originally announced September 2023.

  25. arXiv:2309.04655

    cs.RO cs.LG eess.SP eess.SY

    Intelligent upper-limb exoskeleton integrated with soft wearable bioelectronics and deep-learning for human intention-driven strength augmentation based on sensory feedback

    Authors: Jinwoo Lee, Kangkyu Kwon, Ira Soltis, Jared Matthews, Yoonjae Lee, Hojoong Kim, Lissette Romero, Nathan Zavanelli, Youngjin Kwon, Shinjae Kwon, Jimin Lee, Yewon Na, Sung Hoon Lee, Ki Jun Yu, Minoru Shinohara, Frank L. Hammond, Woon-Hong Yeo

    Abstract: The age and stroke-associated decline in musculoskeletal strength degrades the ability to perform daily human tasks using the upper extremities. Although there are a few examples of exoskeletons, they need manual operations due to the absence of sensor feedback and no intention prediction of movements. Here, we introduce an intelligent upper-limb exoskeleton system that uses cloud-based deep learn…

    Submitted 26 January, 2024; v1 submitted 8 September, 2023; originally announced September 2023.

    Comments: 15 pages, 6 figures, 1 table, published in npj flexible electronics journals

    MSC Class: 68T40 (Primary) 92C55; 68T99 (Secondary)

  26. arXiv:2309.01310

    cs.CV cs.AI

    ExMobileViT: Lightweight Classifier Extension for Mobile Vision Transformer

    Authors: Gyeongdong Yang, Yungwook Kwon, Hyunjin Kim

    Abstract: The paper proposes an efficient structure for enhancing the performance of mobile-friendly vision transformer with small computational overhead. The vision transformer (ViT) is very attractive in that it reaches outperforming results in image classification, compared to conventional convolutional neural networks (CNNs). Due to its need of high computational resources, MobileNet-based ViT models su…

    Submitted 3 September, 2023; originally announced September 2023.

    Comments: Under Review

  27. arXiv:2308.04492

    cs.AI

    ChatGPT for Arabic Grammatical Error Correction

    Authors: Sang Yun Kwon, Gagan Bhatia, El Moatez Billah Nagoudi, Muhammad Abdul-Mageed

    Abstract: Recently, large language models (LLMs) fine-tuned to follow human instruction have exhibited significant capabilities in various English NLP tasks. However, their performance in grammatical error correction (GEC) tasks, particularly in non-English languages, remains significantly unexplored. In this paper, we delve into abilities of instruction fine-tuned LLMs in Arabic GEC, a task made complex du…

    Submitted 8 August, 2023; originally announced August 2023.

  28. arXiv:2307.11754

    cs.GT cs.CR

    What Drives the (In)stability of a Stablecoin?

    Authors: Yujin Kwon, Kornrapat Pongmala, Kaihua Qin, Ariah Klages-Mundt, Philipp Jovanovic, Christine Parlour, Arthur Gervais, Dawn Song

    Abstract: In May 2022, an apparent speculative attack, followed by market panic, led to the precipitous downfall of UST, one of the most popular stablecoins at that time. However, UST is not the only stablecoin to have been depegged in the past. Designing resilient and long-term stable coins, therefore, appears to present a hard challenge. To further scrutinize existing stablecoin designs and ultimately l…

    Submitted 25 July, 2023; v1 submitted 14 June, 2023; originally announced July 2023.

  29. arXiv:2307.09988

    cs.LG cs.CV

    TinyTrain: Resource-Aware Task-Adaptive Sparse Training of DNNs at the Data-Scarce Edge

    Authors: Young D. Kwon, Rui Li, Stylianos I. Venieris, Jagmohan Chauhan, Nicholas D. Lane, Cecilia Mascolo

    Abstract: On-device training is essential for user personalisation and privacy. With the pervasiveness of IoT devices and microcontroller units (MCUs), this task becomes more challenging due to the constrained memory and compute resources, and the limited availability of labelled user data. Nonetheless, prior works neglect the data scarcity issue, require excessively long training time (e.g. a few hours), o…

    Submitted 10 June, 2024; v1 submitted 19 July, 2023; originally announced July 2023.

    Comments: Accepted by ICML 2024

  30. arXiv:2307.02784

    cs.IT cs.NI eess.SP

    On the Spatial-Wideband Effects in Millimeter-Wave Cell-Free Massive MIMO

    Authors: Seyoung Ahn, Soohyeong Kim, Yongseok Kwon, Joohan Park, Jiseung Youn, Sunghyun Cho

    Abstract: In this paper, we investigate the spatial-wideband effects in cell-free massive MIMO (CF-mMIMO) systems in mmWave bands. The utilization of mmWave frequencies brings challenges such as signal attenuation and the need for denser networks like ultra-dense networks (UDN) to maintain communication performance. CF-mMIMO is introduced as a solution, where distributed access points (APs) transmit signals…

    Submitted 6 July, 2023; originally announced July 2023.

  31. arXiv:2306.10577

    cs.LG stat.ML

    OpenDataVal: a Unified Benchmark for Data Valuation

    Authors: Kevin Fu Jiang, Weixin Liang, James Zou, Yongchan Kwon

    Abstract: Assessing the quality and impact of individual data points is critical for improving model performance and mitigating undesirable biases within the training dataset. Several data valuation algorithms have been proposed to quantify data quality, however, there lacks a systemic and standardized benchmarking system for data valuation. In this paper, we introduce OpenDataVal, an easy-to-use and unifie…

    Submitted 13 October, 2023; v1 submitted 18 June, 2023; originally announced June 2023.

    Comments: 25 pages, NeurIPS 2023 Track on Datasets and Benchmarks

  32. arXiv:2306.00680

    cs.SD cs.AI eess.AS

    Encoder-decoder multimodal speaker change detection

    Authors: Jee-weon Jung, Soonshin Seo, Hee-Soo Heo, Geonmin Kim, You Jin Kim, Young-ki Kwon, Minjae Lee, Bong-Jin Lee

    Abstract: The task of speaker change detection (SCD), which detects points where speakers change in an input, is essential for several applications. Several studies solved the SCD task using audio inputs only and have shown limited performance. Recently, multimodal SCD (MMSCD) models, which utilise text modality in addition to audio, have shown improved performance. In this study, the proposed model are bui…

    Submitted 1 June, 2023; originally announced June 2023.

    Comments: 5 pages, accepted for presentation at INTERSPEECH 2023

  33. arXiv:2305.13114

    cs.CY cs.HC

    Exploring User Perspectives on ChatGPT: Applications, Perceptions, and Implications for AI-Integrated Education

    Authors: Reza Hadi Mogavi, Chao Deng, Justin Juho Kim, Pengyuan Zhou, Young D. Kwon, Ahmed Hosny Saleh Metwally, Ahmed Tlili, Simone Bassanelli, Antonio Bucchiarone, Sujit Gujar, Lennart E. Nacke, Pan Hui

    Abstract: To foster the development of pedagogically potent and ethically sound AI-integrated learning landscapes, it is pivotal to critically explore the perceptions and experiences of the users immersed in these contexts. In this study, we perform a thorough qualitative content analysis across four key social media platforms. Our goal is to understand the user experience (UX) and views of early adopters o…

    Submitted 25 November, 2023; v1 submitted 22 May, 2023; originally announced May 2023.

    Comments: This is the authors' preprint version of the paper accepted by the Journal of Computers in Human Behavior: Artificial Humans (doi: https://doi.org/10.1016/j.chbah.2023.100027)

  34. arXiv:2305.07288

    cs.CL

    Open-WikiTable: Dataset for Open Domain Question Answering with Complex Reasoning over Table

    Authors: Sunjun Kweon, Yeonsu Kwon, Seonhee Cho, Yohan Jo, Edward Choi

    Abstract: Despite recent interest in open domain question answering (ODQA) over tables, many studies still rely on datasets that are not truly optimal for the task with respect to utilizing structural nature of table. These datasets assume answers reside as a single cell value and do not necessitate exploring over multiple cells such as aggregation, comparison, and sorting. Thus, we release Open-WikiTable,…

    Submitted 12 May, 2023; originally announced May 2023.

    Comments: ACL 2023 (Findings)

  35. arXiv:2305.06590

    cs.CL cs.AI

    FactKG: Fact Verification via Reasoning on Knowledge Graphs

    Authors: Jiho Kim, Sungjin Park, Yeonsu Kwon, Yohan Jo, James Thorne, Edward Choi

    Abstract: In real world applications, knowledge graphs (KG) are widely used in various domains (e.g. medical applications and dialogue agents). However, for fact verification, KGs have not been adequately utilized as a knowledge source. KGs can be a valuable knowledge source in fact verification due to their reliability and broad applicability. A KG consists of nodes and edges which makes it clear how conce…

    Submitted 18 May, 2023; v1 submitted 11 May, 2023; originally announced May 2023.

    Comments: Accepted to ACL 2023

  36. arXiv:2305.02995

    cs.LG cs.AI cs.CL cs.CV

    Accuracy on the Curve: On the Nonlinear Correlation of ML Performance Between Data Subpopulations

    Authors: Weixin Liang, Yining Mao, Yongchan Kwon, Xinyu Yang, James Zou

    Abstract: Understanding the performance of machine learning (ML) models across diverse data distributions is critically important for reliable applications. Despite recent empirical studies positing a near-perfect linear correlation between in-distribution (ID) and out-of-distribution (OOD) accuracies, we empirically demonstrate that this correlation is more nuanced under subpopulation shifts. Through rigor…

    Submitted 31 May, 2023; v1 submitted 4 May, 2023; originally announced May 2023.

    Comments: Accepted to the main conference of ICML 2023

  37. arXiv:2304.13292

    cs.CL

    Zero-Shot Slot and Intent Detection in Low-Resource Languages

    Authors: Sang Yun Kwon, Gagan Bhatia, El Moatez Billah Nagoudi, Alcides Alcoba Inciarte, Muhammad Abdul-Mageed

    Abstract: Intent detection and slot filling are critical tasks in spoken and natural language understanding for task-oriented dialog systems. In this work we describe our participation in the slot and intent detection for low-resource language varieties (SID4LR; Aepli et al. (2023)). We investigate the slot and intent detection (SID) tasks using a wide range of models and settings. Given the recent success…

    Submitted 26 April, 2023; originally announced April 2023.

    Comments: VarDial @ EACL

  38. arXiv:2304.09822

    cs.CY cs.SI

    Unpacking How Decentralized Autonomous Organizations (DAOs) Work in Practice

    Authors: Tanusree Sharma, Yujin Kwon, Kornrapat Pongmala, Henry Wang, Andrew Miller, Dawn Song, Yang Wang

    Abstract: Decentralized Autonomous Organizations (DAOs) have emerged as a novel way to coordinate a group of (pseudonymous) entities towards a shared vision (e.g., promoting sustainability), utilizing self-executing smart contracts on blockchains to support decentralized governance and decision-making. In just a few years, over 4,000 DAOs have been launched in various domains, such as investment, education,…

    Submitted 16 April, 2023; originally announced April 2023.

  39. arXiv:2304.07718

    cs.LG stat.ML

    Data-OOB: Out-of-bag Estimate as a Simple and Efficient Data Value

    Authors: Yongchan Kwon, James Zou

    Abstract: Data valuation is a powerful framework for providing statistical insights into which data are beneficial or detrimental to model training. Many Shapley-based data valuation methods have shown promising results in various downstream tasks, however, they are well known to be computationally challenging as it requires training a large number of models. As a result, it has been recognized as infeasibl…

    Submitted 1 June, 2023; v1 submitted 16 April, 2023; originally announced April 2023.

    Comments: 18 pages, to be published at ICML 2023

  40. arXiv:2304.04897

    cs.CV

    Neural Image-based Avatars: Generalizable Radiance Fields for Human Avatar Modeling

    Authors: Youngjoong Kwon, Dahun Kim, Duygu Ceylan, Henry Fuchs

    Abstract: We present a method that enables synthesizing novel views and novel poses of arbitrary human performers from sparse multi-view images. A key ingredient of our method is a hybrid appearance blending module that combines the advantages of the implicit body NeRF representation and image-based rendering. Existing generalizable human NeRF methods that are conditioned on the body model have shown robust…

    Submitted 10 April, 2023; originally announced April 2023.

  41. Tensor Slicing and Optimization for Multicore NPUs

    Authors: Rafael Sousa, Marcio Pereira, Yongin Kwon, Taeho Kim, Namsoon Jung, Chang Soo Kim, Michael Frank, Guido Araujo

    Abstract: Although code generation for Convolution Neural Network (CNN) models has been extensively studied, performing efficient data slicing and parallelization for highly-constrained Multicore Neural Processor Units (NPUs) is still a challenging problem. Given the size of convolutions' input/output tensors and the small footprint of NPU on-chip memories, minimizing memory transactions while maximizing…

    Submitted 6 April, 2023; originally announced April 2023.

    Journal ref: Journal of Parallel and Distributed Computing, Volume 175, May 2023, Pages 66-79

  42. arXiv:2304.01197  [pdf, other]

    cs.CV cs.GR

    Bringing Telepresence to Every Desk

    Authors: Shengze Wang, Ziheng Wang, Ryan Schmelzle, Liujie Zheng, YoungJoong Kwon, Soumyadip Sengupta, Henry Fuchs

    Abstract: In this paper, we work to bring telepresence to every desktop. Unlike commercial systems, personal 3D video conferencing systems must render high-quality videos while remaining financially and computationally viable for the average consumer. To this end, we introduce a capturing and rendering system that only requires 4 consumer-grade RGBD cameras and synthesizes high-quality free-viewpoint videos…

    Submitted 3 April, 2023; originally announced April 2023.

  43. GPT-4 can pass the Korean National Licensing Examination for Korean Medicine Doctors

    Authors: Dongyeop Jang, Tae-Rim Yun, Choong-Yeol Lee, Young-Kyu Kwon, Chang-Eop Kim

    Abstract: Traditional Korean medicine (TKM) emphasizes individualized diagnosis and treatment. This uniqueness makes AI modeling difficult due to limited data and implicit processes. Large language models (LLMs) have demonstrated impressive medical inference, even without advanced training in medical texts. This study assessed the capabilities of GPT-4 in TKM, using the Korean National Licensing Examination…

    Submitted 16 November, 2023; v1 submitted 31 March, 2023; originally announced March 2023.

    Comments: 23 pages, 4 figures

    ACM Class: J.3

  44. arXiv:2303.12557  [pdf, other]

    cs.CV cs.AI

    Q-HyViT: Post-Training Quantization of Hybrid Vision Transformers with Bridge Block Reconstruction for IoT Systems

    Authors: Jemin Lee, Yongin Kwon, Sihyeong Park, Misun Yu, Jeman Park, Hwanjun Song

    Abstract: Recently, vision transformers (ViTs) have superseded convolutional neural networks in numerous applications, including classification, detection, and segmentation. However, the high computational requirements of ViTs hinder their widespread implementation. To address this issue, researchers have proposed efficient hybrid transformer architectures that combine convolutional and transformer layers w…

    Submitted 16 May, 2024; v1 submitted 22 March, 2023; originally announced March 2023.

    Comments: 14 pages, 9 figures, accepted in IEEE Internet of Things Journal

  45. arXiv:2302.13750  [pdf, other]

    eess.AS cs.CL cs.SD

    MoLE : Mixture of Language Experts for Multi-Lingual Automatic Speech Recognition

    Authors: Yoohwan Kwon, Soo-Whan Chung

    Abstract: Multi-lingual speech recognition aims to distinguish linguistic expressions in different languages and integrate acoustic processing simultaneously. In contrast, current multi-lingual speech recognition research follows a language-aware paradigm, mainly targeted to improve recognition performance rather than discriminate language characteristics. In this paper, we present a multi-lingual speech re…

    Submitted 27 February, 2023; originally announced February 2023.

    Comments: Accepted by ICASSP 2023

  46. arXiv:2302.09765  [pdf, other]

    cs.CV

    ENInst: Enhancing Weakly-supervised Low-shot Instance Segmentation

    Authors: Moon Ye-Bin, Dongmin Choi, Yongjin Kwon, Junsik Kim, Tae-Hyun Oh

    Abstract: We address a weakly-supervised low-shot instance segmentation, an annotation-efficient training method to deal with novel classes effectively. Since it is an under-explored problem, we first investigate the difficulty of the problem and identify the performance bottleneck by conducting systematic analyses of model components and individual sub-tasks with a simple baseline model. Based on the analy…

    Submitted 30 July, 2023; v1 submitted 20 February, 2023; originally announced February 2023.

    Comments: Accepted at Pattern Recognition (PR)

  47. arXiv:2302.09716  [pdf, other]

    cs.RO cs.CV

    Seeing the Fruit for the Leaves: Towards Automated Apple Fruitlet Thinning

    Authors: Ans Qureshi, Neville Loh, Young Min Kwon, David Smith, Trevor Gee, Oliver Bachelor, Josh McCulloch, Mahla Nejati, JongYoon Lim, Richard Green, Ho Seok Ahn, Bruce MacDonald, Henry Williams

    Abstract: Following a global trend, the lack of reliable access to skilled labour is causing critical issues for the effective management of apple orchards. One of the primary challenges is maintaining skilled human operators capable of making precise fruitlet thinning decisions. Thinning requires accurately measuring the true crop load for individual apple trees to provide optimal thinning decisions on an…

    Submitted 19 February, 2023; originally announced February 2023.

    Comments: Accepted and Presented at the Australasian Conference on Robotics and Automation (ACRA 2022)

  48. arXiv:2302.07352  [pdf, other]

    cs.RO

    Reachability-based Trajectory Design with Neural Implicit Safety Constraints

    Authors: Jonathan Michaux, Qingyi Chen, Yongseok Kwon, Ram Vasudevan

    Abstract: Generating safe motion plans in real-time is a key requirement for deploying robot manipulators to assist humans in collaborative settings. In particular, robots must satisfy strict safety requirements to avoid self-damage or harming nearby humans. Satisfying these requirements is particularly challenging if the robot must also operate in real-time to adjust to changes in its environment. This pape…

    Submitted 14 February, 2023; originally announced February 2023.

  49. arXiv:2302.01493  [pdf]

    eess.IV cs.CV physics.med-ph

    Deep Learning (DL)-based Automatic Segmentation of the Internal Pudendal Artery (IPA) for Reduction of Erectile Dysfunction in Definitive Radiotherapy of Localized Prostate Cancer

    Authors: Anjali Balagopal, Michael Dohopolski, Young Suk Kwon, Steven Montalvo, Howard Morgan, Ti Bai, Dan Nguyen, Xiao Liang, Xinran Zhong, Mu-Han Lin, Neil Desai, Steve Jiang

    Abstract: Background and purpose: Radiation-induced erectile dysfunction (RiED) is commonly seen in prostate cancer patients. Clinical trials have been developed in multiple institutions to investigate whether dose-sparing to the internal-pudendal-arteries (IPA) will improve retention of sexual potency. The IPA is usually not considered a conventional organ-at-risk (OAR) due to segmentation difficulty. In t…

    Submitted 2 February, 2023; originally announced February 2023.

  50. arXiv:2301.07695  [pdf, other]

    cs.CL cs.AI

    EHRSQL: A Practical Text-to-SQL Benchmark for Electronic Health Records

    Authors: Gyubok Lee, Hyeonji Hwang, Seongsu Bae, Yeonsu Kwon, Woncheol Shin, Seongjun Yang, Minjoon Seo, Jong-Yeup Kim, Edward Choi

    Abstract: We present a new text-to-SQL dataset for electronic health records (EHRs). The utterances were collected from 222 hospital staff members, including physicians, nurses, and insurance review and health records teams. To construct the QA dataset on structured EHR data, we conducted a poll at a university hospital and used the responses to create seed questions. We then manually linked these questions…

    Submitted 25 December, 2023; v1 submitted 16 January, 2023; originally announced January 2023.

    Comments: Published as a conference paper at NeurIPS 2022 (Track on Datasets and Benchmarks)