Showing 1–50 of 93 results for author: Cha, M

Searching in archive cs.
  1. arXiv:2406.11260  [pdf, other]

    cs.CL cs.AI

    Adversarial Style Augmentation via Large Language Model for Robust Fake News Detection

    Authors: Sungwon Park, Sungwon Han, Meeyoung Cha

    Abstract: The spread of fake news negatively impacts individuals and is regarded as a significant social challenge that needs to be addressed. A number of algorithmic and insightful features have been identified for detecting fake news. However, with the recent LLMs and their advanced generation capabilities, many of the detectable features (e.g., style-conversion attacks) can be altered, making it more cha…

    Submitted 17 June, 2024; originally announced June 2024.

    Comments: 8 pages

  2. arXiv:2406.09799  [pdf, other]

    cs.CY

    GeoSEE: Regional Socio-Economic Estimation With a Large Language Model

    Authors: Sungwon Han, Donghyun Ahn, Seungeon Lee, Minhyuk Song, Sungwon Park, Sangyoon Park, Jihee Kim, Meeyoung Cha

    Abstract: Moving beyond traditional surveys, combining heterogeneous data sources with AI-driven inference models brings new opportunities to measure socio-economic conditions, such as poverty and population, over expansive geographic areas. The current research presents GeoSEE, a method that can estimate various socio-economic indicators using a unified pipeline powered by a large language model (LLM). Pre…

    Submitted 14 June, 2024; originally announced June 2024.

  3. arXiv:2406.08020  [pdf, other]

    cs.CV

    Generalizable Disaster Damage Assessment via Change Detection with Vision Foundation Model

    Authors: Kyeongjin Ahn, Sungwon Han, Sungwon Park, Jihee Kim, Sangyoon Park, Meeyoung Cha

    Abstract: The increasing frequency and intensity of natural disasters demand more sophisticated approaches for rapid and precise damage assessment. To tackle this issue, researchers have developed various methods on disaster benchmark datasets from satellite imagery to aid in detecting disaster damage. However, the diverse nature of geographical landscapes and disasters makes it challenging to apply existin…

    Submitted 12 June, 2024; originally announced June 2024.

    Comments: 9 pages, 4 figures, 2 tables

  4. arXiv:2405.18986  [pdf, other]

    cs.LG q-bio.BM q-bio.QM

    Robust Optimization in Protein Fitness Landscapes Using Reinforcement Learning in Latent Space

    Authors: Minji Lee, Luiz Felipe Vecchietti, Hyunkyu Jung, Hyun Joo Ro, Meeyoung Cha, Ho Min Kim

    Abstract: Proteins are complex molecules responsible for different functions in nature. Enhancing the functionality of proteins and cellular fitness can significantly impact various industries. However, protein optimization using computational methods remains challenging, especially when starting from low-fitness sequences. We propose LatProtRL, an optimization method to efficiently traverse a latent space…

    Submitted 29 May, 2024; originally announced May 2024.

    Comments: ICML 2024

  5. arXiv:2405.04990  [pdf, other]

    cs.LG cs.AI

    Health Index Estimation Through Integration of General Knowledge with Unsupervised Learning

    Authors: Kristupas Bajarunas, Marcia L. Baptista, Kai Goebel, Manuel A. Chao

    Abstract: Accurately estimating a Health Index (HI) from condition monitoring data (CM) is essential for reliable and interpretable prognostics and health management (PHM) in complex systems. In most scenarios, complex systems operate under varying operating conditions and can exhibit different fault modes, making unsupervised inference of an HI from CM data a significant challenge. Hybrid models combining…

    Submitted 8 May, 2024; originally announced May 2024.

  6. arXiv:2404.11905  [pdf, other]

    cs.LG cs.CR

    FedMID: A Data-Free Method for Using Intermediate Outputs as a Defense Mechanism Against Poisoning Attacks in Federated Learning

    Authors: Sungwon Han, Hyeonho Song, Sungwon Park, Meeyoung Cha

    Abstract: Federated learning combines local updates from clients to produce a global model, which is susceptible to poisoning attacks. Most previous defense strategies relied on vectors derived from projections of local updates on a Euclidean space; however, these methods fail to accurately represent the functionality and structure of local models, resulting in inconsistent performance. Here, we present a n…

    Submitted 18 April, 2024; originally announced April 2024.

  7. arXiv:2403.05861  [pdf, ps, other]

    cs.DC

    DeepVM: Integrating Spot and On-Demand VMs for Cost-Efficient Deep Learning Clusters in the Cloud

    Authors: Yoochan Kim, Kihyun Kim, Yonghyeon Cho, Jinwoo Kim, Awais Khan, Ki-Dong Kang, Baik-Song An, Myung-Hoon Cha, Hong-Yeon Kim, Youngjae Kim

    Abstract: Distributed Deep Learning (DDL), as a paradigm, dictates the use of GPU-based clusters as the optimal infrastructure for training large-scale Deep Neural Networks (DNNs). However, the high cost of such resources makes them inaccessible to many users. Public cloud services, particularly Spot Virtual Machines (VMs), offer a cost-effective alternative, but their unpredictable availability poses a sig…

    Submitted 14 March, 2024; v1 submitted 9 March, 2024; originally announced March 2024.

    Comments: 14 pages, 8 figures

  8. arXiv:2402.10436  [pdf, other]

    cs.CL

    I Am Not Them: Fluid Identities and Persistent Out-group Bias in Large Language Models

    Authors: Wenchao Dong, Assem Zhunis, Hyojin Chin, Jiyoung Han, Meeyoung Cha

    Abstract: We explored cultural biases-individualism vs. collectivism-in ChatGPT across three Western languages (i.e., English, German, and French) and three Eastern languages (i.e., Chinese, Japanese, and Korean). When ChatGPT adopted an individualistic persona in Western languages, its collectivism scores (i.e., out-group values) exhibited a more negative trend, surpassing their positive orientation toward…

    Submitted 15 February, 2024; originally announced February 2024.

  9. arXiv:2401.09466  [pdf, other]

    physics.ao-ph cs.AI cs.CV cs.LG

    Self Supervised Vision for Climate Downscaling

    Authors: Karandeep Singh, Chaeyoon Jeong, Naufal Shidqi, Sungwon Park, Arjun Nellikkattil, Elke Zeller, Meeyoung Cha

    Abstract: Climate change is one of the most critical challenges that our planet is facing today. Rising global temperatures are already bringing noticeable changes to Earth's weather and climate patterns with an increased frequency of unpredictable and extreme weather events. Future projections for climate change research are based on Earth System Models (ESMs), the computer models that simulate the Earth's…

    Submitted 9 January, 2024; originally announced January 2024.

  10. arXiv:2401.08939  [pdf, other]

    cs.RO

    Enhancing Campus Mobility: Achievements and Challenges of Autonomous Shuttle "Snow Lion"

    Authors: Yingbing Chen, Jie Cheng, Sheng Wang, Hongji Liu, Xiaodong Mei, Xiaoyang Yan, Mingkai Tang, Ge Sun, Ya Wen, Junwei Cai, Xupeng Xie, Lu Gan, Mandan Chao, Ren Xin, Ming Liu, Jianhao Jiao, Kangcheng Liu, Lujia Wang

    Abstract: The rapid evolution of autonomous vehicles (AVs) has significantly influenced global transportation systems. In this context, we present "Snow Lion", an autonomous shuttle meticulously designed to revolutionize on-campus transportation, offering a safer and more efficient mobility solution for students, faculty, and visitors. The primary objective of this research is to enhance campus mobility b…

    Submitted 16 January, 2024; originally announced January 2024.

    Comments: 9 pages, 9 figures

  11. arXiv:2312.15166  [pdf, other]

    cs.CL cs.AI cs.LG

    SOLAR 10.7B: Scaling Large Language Models with Simple yet Effective Depth Up-Scaling

    Authors: Dahyun Kim, Chanjun Park, Sanghoon Kim, Wonsung Lee, Wonho Song, Yunsu Kim, Hyeonwoo Kim, Yungi Kim, Hyeonju Lee, Jihoo Kim, Changbae Ahn, Seonghoon Yang, Sukyung Lee, Hyunbyung Park, Gyoungjin Gim, Mikyoung Cha, Hwalsuk Lee, Sunghun Kim

    Abstract: We introduce SOLAR 10.7B, a large language model (LLM) with 10.7 billion parameters, demonstrating superior performance in various natural language processing (NLP) tasks. Inspired by recent efforts to efficiently up-scale LLMs, we present a method for scaling LLMs called depth up-scaling (DUS), which encompasses depthwise scaling and continued pretraining. In contrast to other LLM up-scaling meth…

    Submitted 3 April, 2024; v1 submitted 23 December, 2023; originally announced December 2023.

    Comments: accepted to NAACL 2024 Industry Track

  12. arXiv:2311.10922  [pdf, other]

    cs.AI cs.CL cs.DB cs.IR

    Explainable Product Classification for Customs

    Authors: Eunji Lee, Sihyeon Kim, Sundong Kim, Soyeon Jung, Heeja Kim, Meeyoung Cha

    Abstract: The task of assigning internationally accepted commodity codes (aka HS codes) to traded goods is a critical function of customs offices. Like court decisions made by judges, this task follows the doctrine of precedent and can be nontrivial even for experienced officers. Together with the Korea Customs Service (KCS), we propose a first-ever explainable decision supporting model that suggests the mo…

    Submitted 17 November, 2023; originally announced November 2023.

    Comments: 24 pages, Accepted to ACM Transactions on Intelligent Systems and Technology

  13. arXiv:2310.19635  [pdf, other]

    cs.CV

    Bidirectional Captioning for Clinically Accurate and Interpretable Models

    Authors: Keegan Quigley, Miriam Cha, Josh Barua, Geeticka Chauhan, Seth Berkowitz, Steven Horng, Polina Golland

    Abstract: Vision-language pretraining has been shown to produce high-quality visual encoders which transfer efficiently to downstream computer vision tasks. While generative language models have gained widespread attention, image captioning has thus far been mostly overlooked as a form of cross-modal pretraining in favor of contrastive learning, especially in medical image analysis. In this paper, we experi…

    Submitted 30 October, 2023; originally announced October 2023.

    Comments: 12 pages, 7 figures. Code release to follow

  14. arXiv:2310.05189  [pdf, ps, other]

    cs.CL cs.AI cs.LG

    Factuality Challenges in the Era of Large Language Models

    Authors: Isabelle Augenstein, Timothy Baldwin, Meeyoung Cha, Tanmoy Chakraborty, Giovanni Luca Ciampaglia, David Corney, Renee DiResta, Emilio Ferrara, Scott Hale, Alon Halevy, Eduard Hovy, Heng Ji, Filippo Menczer, Ruben Miguez, Preslav Nakov, Dietram Scheufele, Shivam Sharma, Giovanni Zagni

    Abstract: The emergence of tools based on Large Language Models (LLMs), such as OpenAI's ChatGPT, Microsoft's Bing Chat, and Google's Bard, has garnered immense public attention. These incredibly useful, natural-sounding tools mark significant advances in natural language generation, yet they exhibit a propensity to generate false, erroneous, or misleading content -- commonly referred to as "hallucinations.…

    Submitted 9 October, 2023; v1 submitted 8 October, 2023; originally announced October 2023.

    Comments: Our article offers a comprehensive examination of the challenges and risks associated with Large Language Models (LLMs), focusing on their potential impact on the veracity of information in today's digital landscape

  15. A Comparative Study of Reference Reliability in Multiple Language Editions of Wikipedia

    Authors: Aitolkyn Baigutanova, Diego Saez-Trumper, Miriam Redi, Meeyoung Cha, Pablo Aragón

    Abstract: Information presented in Wikipedia articles must be attributable to reliable published sources in the form of references. This study examines over 5 million Wikipedia articles to assess the reliability of references in multiple language editions. We quantify the cross-lingual patterns of the perennial sources list, a collection of reliability labels for web domains identified and collaboratively a…

    Submitted 4 September, 2023; v1 submitted 31 August, 2023; originally announced September 2023.

    Comments: Conference on Information & Knowledge Management (CIKM '23)

  16. Fine-Grained Socioeconomic Prediction from Satellite Images with Distributional Adjustment

    Authors: Donghyun Ahn, Minhyuk Song, Seungeon Lee, Yubin Choi, Jihee Kim, Sangyoon Park, Hyunjoo Yang, Meeyoung Cha

    Abstract: While measuring socioeconomic indicators is critical for local governments to make informed policy decisions, such measurements are often unavailable at fine-grained levels like municipality. This study employs deep learning-based predictions from satellite images to close the gap. We propose a method that assigns a socioeconomic score to each satellite image by capturing the distributional behavi…

    Submitted 4 September, 2023; v1 submitted 30 August, 2023; originally announced August 2023.

    ACM Class: J.4

  17. arXiv:2308.09318  [pdf, other]

    cs.LG cs.AI cs.CR

    Towards Attack-tolerant Federated Learning via Critical Parameter Analysis

    Authors: Sungwon Han, Sungwon Park, Fangzhao Wu, Sundong Kim, Bin Zhu, Xing Xie, Meeyoung Cha

    Abstract: Federated learning is used to train a shared model in a decentralized way without clients sharing private data with each other. Federated learning systems are susceptible to poisoning attacks when malicious clients send false updates to the central server. Existing defense strategies are ineffective under non-IID data settings. This paper proposes a new defense strategy, FedCPA (Federated learning…

    Submitted 18 August, 2023; originally announced August 2023.

    Comments: ICCV'23 Accepted

  18. arXiv:2307.09048  [pdf, other]

    cs.CR cs.AI

    FedDefender: Client-Side Attack-Tolerant Federated Learning

    Authors: Sungwon Park, Sungwon Han, Fangzhao Wu, Sundong Kim, Bin Zhu, Xing Xie, Meeyoung Cha

    Abstract: Federated learning enables learning from decentralized data sources without compromising privacy, which makes it a crucial technique. However, it is vulnerable to model poisoning attacks, where malicious clients interfere with the training process. Previous defense mechanisms have focused on the server-side by using careful model aggregation, but this may not be effective when the data is not iden…

    Submitted 18 July, 2023; originally announced July 2023.

    Comments: KDD'23 research track accepted

  19. arXiv:2306.06176  [pdf, other]

    cs.SI cs.CY

    Quantitative Analysis of Cultural Dynamics Seen from an Event-based Social Network

    Authors: Bayu Adhi Tama, Jaehong Kim, Jaehyuk Park, Lev Manovich, Meeyoung Cha

    Abstract: Culture is a collection of connected and potentially interactive patterns that characterize a social group or a passed-on idea that people acquire as members of society. While offline activities can provide a better picture of the geographical association of cultural traits than online activities, gathering such data on a large scale has been challenging. Here, we use multi-decade longitudinal rec…

    Submitted 9 June, 2023; originally announced June 2023.

  20. arXiv:2306.04738  [pdf, other]

    cs.CV cs.AI

    MultiEarth 2023 -- Multimodal Learning for Earth and Environment Workshop and Challenge

    Authors: Miriam Cha, Gregory Angelides, Mark Hamilton, Andy Soszynski, Brandon Swenson, Nathaniel Maidel, Phillip Isola, Taylor Perron, Bill Freeman

    Abstract: The Multimodal Learning for Earth and Environment Workshop (MultiEarth 2023) is the second annual CVPR workshop aimed at the monitoring and analysis of the health of Earth ecosystems by leveraging the vast amount of remote sensing data that is continuously being collected. The primary objective of this workshop is to bring together the Earth and environmental science communities as well as the mul…

    Submitted 7 June, 2023; originally announced June 2023.

  21. arXiv:2305.17696  [pdf, other]

    cs.CL

    SQuARe: A Large-Scale Dataset of Sensitive Questions and Acceptable Responses Created Through Human-Machine Collaboration

    Authors: Hwaran Lee, Seokhee Hong, Joonsuk Park, Takyoung Kim, Meeyoung Cha, Yejin Choi, Byoung Pil Kim, Gunhee Kim, Eun-Ju Lee, Yong Lim, Alice Oh, Sangchul Park, Jung-Woo Ha

    Abstract: The potential social harms that large language models pose, such as generating offensive content and reinforcing biases, are steeply rising. Existing works focus on coping with this concern while interacting with ill-intentioned users, such as those who explicitly make hate speech or elicit harmful responses. However, discussions on sensitive issues can become toxic even if the users are well-inte…

    Submitted 28 May, 2023; originally announced May 2023.

    Comments: 19 pages, 10 figures, ACL 2023

  22. arXiv:2305.11377  [pdf, other]

    cs.LG cs.CY

    GraphFC: Customs Fraud Detection with Label Scarcity

    Authors: Karandeep Singh, Yu-Che Tsai, Cheng-Te Li, Meeyoung Cha, Shou-De Lin

    Abstract: Custom officials across the world encounter huge volumes of transactions. With increased connectivity and globalization, the customs transactions continue to grow every year. Associated with customs transactions is the customs fraud - the intentional manipulation of goods declarations to avoid the taxes and duties. With limited manpower, the custom offices can only undertake manual inspection of a…

    Submitted 19 August, 2023; v1 submitted 18 May, 2023; originally announced May 2023.

  23. arXiv:2304.06237  [pdf, other]

    cs.LG eess.SP

    Deep learning based ECG segmentation for delineation of diverse arrhythmias

    Authors: Chankyu Joung, Mijin Kim, Taejin Paik, Seong-Ho Kong, Seung-Young Oh, Won Kyeong Jeon, Jae-hu Jeon, Joong-Sik Hong, Wan-Joong Kim, Woong Kook, Myung-Jin Cha, Otto van Koert

    Abstract: Accurate delineation of key waveforms in an ECG is a critical initial step in extracting relevant features to support the diagnosis and treatment of heart conditions. Although deep learning based methods using a segmentation model to locate the P, QRS, and T waves have shown promising results, their ability to handle signals exhibiting arrhythmia remains unclear. This study builds on existing rese…

    Submitted 6 September, 2023; v1 submitted 12 April, 2023; originally announced April 2023.

  24. arXiv:2304.02176  [pdf, other]

    cs.CY cs.AI cs.HC

    Blaming Humans and Machines: What Shapes People's Reactions to Algorithmic Harm

    Authors: Gabriel Lima, Nina Grgić-Hlača, Meeyoung Cha

    Abstract: Artificial intelligence (AI) systems can cause harm to people. This research examines how individuals react to such harm through the lens of blame. Building upon research suggesting that people blame AI systems, we investigated how several factors influence people's reactive attitudes towards machines, designers, and users. The results of three studies (N = 1,153) indicate differences in how blame…

    Submitted 4 April, 2023; originally announced April 2023.

    Comments: ACM CHI 2023

  25. arXiv:2303.08403  [pdf, other]

    cs.LG cs.AI cs.CY

    DualFair: Fair Representation Learning at Both Group and Individual Levels via Contrastive Self-supervision

    Authors: Sungwon Han, Seungeon Lee, Fangzhao Wu, Sundong Kim, Chuhan Wu, Xiting Wang, Xing Xie, Meeyoung Cha

    Abstract: Algorithmic fairness has become an important machine learning problem, especially for mission-critical Web applications. This work presents a self-supervised model, called DualFair, that can debias sensitive attributes like gender and race from learned representations. Unlike existing models that target a single type of fairness, our model jointly optimizes for two fairness criteria - group fairne…

    Submitted 15 March, 2023; originally announced March 2023.

    Comments: Accepted and will be published at TheWebConf2023 (WWW2023)

  26. Longitudinal Assessment of Reference Quality on Wikipedia

    Authors: Aitolkyn Baigutanova, Jaehyeon Myung, Diego Saez-Trumper, Ai-Jou Chou, Miriam Redi, Changwook Jung, Meeyoung Cha

    Abstract: Wikipedia plays a crucial role in the integrity of the Web. This work analyzes the reliability of this global encyclopedia through the lens of its references. We operationalize the notion of reference quality by defining reference need (RN), i.e., the percentage of sentences missing a citation, and reference risk (RR), i.e., the proportion of non-authoritative references. We release Citation Detec…

    Submitted 9 March, 2023; originally announced March 2023.

    Comments: Published at the Web Conference 2023 (WWW '23)

    Journal ref: Proceedings of the ACM Web Conference 2023 (WWW '23), May 1-5, 2023, Austin, TX, USA. ACM

  27. arXiv:2302.04730  [pdf, other]

    cs.LG stat.ML

    A Benchmark on Uncertainty Quantification for Deep Learning Prognostics

    Authors: Luis Basora, Arthur Viens, Manuel Arias Chao, Xavier Olive

    Abstract: Reliable uncertainty quantification on RUL prediction is crucial for informative decision-making in predictive maintenance. In this context, we assess some of the latest developments in the field of uncertainty quantification for prognostics deep learning. This includes the state-of-the-art variational inference algorithms for Bayesian neural networks (BNN) as well as popular alternatives such as…

    Submitted 9 February, 2023; originally announced February 2023.

    Comments: 29 pages, 16 figures

  28. arXiv:2302.04465  [pdf, other]

    cs.CL cs.CY

    Detecting Contextomized Quotes in News Headlines by Contrastive Learning

    Authors: Seonyeong Song, Hyeonho Song, Kunwoo Park, Jiyoung Han, Meeyoung Cha

    Abstract: Quotes are critical for establishing credibility in news articles. A direct quote enclosed in quotation marks has a strong visual appeal and is a sign of a reliable citation. Unfortunately, this journalistic practice is not strictly followed, and a quote in the headline is often "contextomized." Such a quote uses words out of context in a way that alters the speaker's intention so that there is no…

    Submitted 9 February, 2023; originally announced February 2023.

    Comments: 8 pages, EACL 2023 (Findings)

  29. arXiv:2211.02784  [pdf, other]

    physics.optics cs.HC

    Waveguide Holography: Towards True 3D Holographic Glasses

    Authors: Changwon Jang, Kiseung Bang, Minseok Chae, Byoungho Lee, Douglas Lanman

    Abstract: We present a novel near-eye display concept which consists of a waveguide combiner, a spatial light modulator, and a laser light source. The proposed system can display true 3D holographic images through see-through pupil-replicating waveguide combiner as well as providing a large eye-box. By modeling the coherent light interaction inside of the waveguide combiner, we demonstrate that the output w…

    Submitted 4 November, 2022; originally announced November 2022.

  30. arXiv:2210.07024  [pdf, other]

    cs.AI cs.LG cs.LO

    Self-explaining deep models with logic rule reasoning

    Authors: Seungeon Lee, Xiting Wang, Sungwon Han, Xiaoyuan Yi, Xing Xie, Meeyoung Cha

    Abstract: We present SELOR, a framework for integrating self-explaining capabilities into a given deep model to achieve both high prediction performance and human precision. By "human precision", we refer to the degree to which humans agree with the reasons models provide for their predictions. Human precision affects user trust and allows users to collaborate closely with the model. We demonstrate that log…

    Submitted 18 October, 2022; v1 submitted 13 October, 2022; originally announced October 2022.

    Comments: 26 pages including reference, checklist, and appendix. Accepted in NeurIPS 2022

  31. RadTex: Learning Efficient Radiograph Representations from Text Reports

    Authors: Keegan Quigley, Miriam Cha, Ruizhi Liao, Geeticka Chauhan, Steven Horng, Seth Berkowitz, Polina Golland

    Abstract: Automated analysis of chest radiography using deep learning has tremendous potential to enhance the clinical diagnosis of diseases in patients. However, deep learning models typically require large amounts of annotated data to achieve high performance -- often an obstacle to medical domain adaptation. In this paper, we build a data-efficient learning framework that utilizes radiology reports to im…

    Submitted 7 April, 2023; v1 submitted 5 August, 2022; originally announced August 2022.

    Comments: Awarded Best Paper at Resource Efficient Medical Image Analysis (REMIA) Workshop, MICCAI 2022

  32. arXiv:2207.13184  [pdf, other]

    cs.CV eess.IV

    SAR-to-EO Image Translation with Multi-Conditional Adversarial Networks

    Authors: Armando Cabrera, Miriam Cha, Prafull Sharma, Michael Newey

    Abstract: This paper explores the use of multi-conditional adversarial networks for SAR-to-EO image translation. Previous methods condition adversarial networks only on the input SAR. We show that incorporating multiple complementary modalities such as Google maps and IR can further improve SAR-to-EO image translation especially on preserving sharp edges of manmade objects. We demonstrate effectiveness of o…

    Submitted 26 July, 2022; originally announced July 2022.

  33. arXiv:2207.09158  [pdf, other]

    cs.CV cs.LG

    FedX: Unsupervised Federated Learning with Cross Knowledge Distillation

    Authors: Sungwon Han, Sungwon Park, Fangzhao Wu, Sundong Kim, Chuhan Wu, Xing Xie, Meeyoung Cha

    Abstract: This paper presents FedX, an unsupervised federated learning framework. Our model learns unbiased representation from decentralized and heterogeneous local data. It employs a two-sided knowledge distillation with contrastive learning as a core component, allowing the federated system to function without requiring clients to share any data features. Furthermore, its adaptable architecture can be us…

    Submitted 19 July, 2022; originally announced July 2022.

    Comments: Accepted and will be published at ECCV2022

  34. arXiv:2207.07033  [pdf, other]

    cs.AI cs.CY

    Developing a Series of AI Challenges for the United States Department of the Air Force

    Authors: Vijay Gadepally, Gregory Angelides, Andrei Barbu, Andrew Bowne, Laura J. Brattain, Tamara Broderick, Armando Cabrera, Glenn Carl, Ronisha Carter, Miriam Cha, Emilie Cowen, Jesse Cummings, Bill Freeman, James Glass, Sam Goldberg, Mark Hamilton, Thomas Heldt, Kuan Wei Huang, Phillip Isola, Boris Katz, Jamie Koerner, Yen-Chen Lin, David Mayo, Kyle McAlpin, Taylor Perron, et al. (17 additional authors not shown)

    Abstract: Through a series of federal initiatives and orders, the U.S. Government has been making a concerted effort to ensure American leadership in AI. These broad strategy documents have influenced organizations such as the United States Department of the Air Force (DAF). The DAF-MIT AI Accelerator is an initiative between the DAF and MIT to bridge the gap between AI researchers and DAF mission requireme…

    Submitted 14 July, 2022; originally announced July 2022.

  35. arXiv:2206.13246  [pdf, other]

    cs.LG cs.AI cs.IR

    Prediction of Football Player Value using Bayesian Ensemble Approach

    Authors: Hansoo Lee, Bayu Adhi Tama, Meeyoung Cha

    Abstract: The transfer fees of sports players have become astronomical. This is because bringing players of great future value to the club is essential for their survival. We present a case study on the key factors affecting the world's top soccer players' transfer fees based on the FIFA data analysis. To predict each player's market value, we propose an improved LightGBM model by optimizing its hyperparame…

    Submitted 24 June, 2022; originally announced June 2022.

    Comments: 17 pages, 4 figures, 6 tables, will be published in Journal of Expert Systems with Applications

  36. The Conflict Between Explainable and Accountable Decision-Making Algorithms

    Authors: Gabriel Lima, Nina Grgić-Hlača, Jin Keun Jeong, Meeyoung Cha

    Abstract: Decision-making algorithms are being used in important decisions, such as who should be enrolled in health care programs and be hired. Even though these systems are currently deployed in high-stakes scenarios, many of them cannot explain their decisions. This limitation has prompted the Explainable Artificial Intelligence (XAI) initiative, which aims to make algorithms explainable to comply with l…

    Submitted 11 May, 2022; originally announced May 2022.

    Comments: To appear in the FAccT 2022 proceedings

  37. arXiv:2205.01472  [pdf, other]

    cs.CY

    Learning Economic Indicators by Aggregating Multi-Level Geospatial Information

    Authors: Sungwon Park, Sungwon Han, Donghyun Ahn, Jaeyeon Kim, Jeasurk Yang, Susang Lee, Seunghoon Hong, Jihee Kim, Sangyoon Park, Hyunjoo Yang, Meeyoung Cha

    Abstract: High-resolution daytime satellite imagery has become a promising source to study economic activities. These images display detailed terrain over large areas and allow zooming into smaller neighborhoods. Existing methods, however, have utilized images only in a single-level geographical unit. This research presents a deep learning model to predict economic indicators via aggregating traits observed…

    Submitted 3 May, 2022; originally announced May 2022.

    Comments: Accepted at AAAI2022

  38. arXiv:2204.13508  [pdf, other]

    cs.CE

    Design of Blockchain-based Travel Rule Compliance System

    Authors: Chaehyeon Lee, Changhoon Kang, Wonseok Choi, Jehoon Lee, Myunghun Cha, Jongsoo Woo, James Won-Ki Hong

    Abstract: In accordance with the guidelines of the Financial Action Task Force (FATF), Virtual Asset Service Providers (VASPs) should comply with a 'travel rule', which requires them to exchange originator's and beneficiary's personal information when transferring virtual assets. In this paper, we propose a novel blockchain-based travel rule compliance system that supports fully-decentralized data exchange.…

    Submitted 28 April, 2022; originally announced April 2022.

    Comments: 3 pages, 1 figure, 1 table. Accepted to IEEE ICBC 2022 as a poster paper

  39. arXiv:2204.11095  [pdf, ps, other]

    cs.DC

    EdgeKeeper: Resilient and Lightweight Coordination for Mobile Edge Computing Systems

    Authors: S. Bhunia, R. Stoleru, M. Sagor, A. Haroon, A. Altaweel, M. Chao, M. Maurice, R. Blalock

    Abstract: Mobile Edge Computing (MEC) has been gaining significant interest from first responders and tactical teams, primarily because they can employ handheld mobile devices to form a computing cluster (for computing tasks like face/scene recognition, virtual assistance) when connectivity to the cloud is not present or it is limited. High user mobility in first responder or tactical environments makes MEC…

    Submitted 23 April, 2022; originally announced April 2022.

    Comments: 12 Pages

  40. arXiv:2204.10823  [pdf, ps, other]

    cs.DC

    R-Drive: Resilient Data Storage and Sharing for Mobile Edge Computing Systems

    Authors: M. Sagor, R. Stoleru, S. Bhunia, M. Chao, A. Haroon, A. Altaweel, M. Maurice, R. Blalock

    Abstract: Mobile edge computing (MEC) systems (in which intensive computation and data storage tasks are performed locally, due to the absence of communication infrastructure for connectivity to the cloud) are currently being developed for disaster response applications and for tactical environments. MEC applications for these scenarios generate and process significant mission-critical and personal data tha…

    Submitted 22 April, 2022; originally announced April 2022.

    Comments: 13 pages

  41. arXiv:2204.07649  [pdf, other]

    cs.CV

    MultiEarth 2022 -- Multimodal Learning for Earth and Environment Workshop and Challenge

    Authors: Miriam Cha, Kuan Wei Huang, Morgan Schmidt, Gregory Angelides, Mark Hamilton, Sam Goldberg, Armando Cabrera, Phillip Isola, Taylor Perron, Bill Freeman, Yen-Chen Lin, Brandon Swenson, Jean Piou

    Abstract: The Multimodal Learning for Earth and Environment Challenge (MultiEarth 2022) will be the first competition aimed at the monitoring and analysis of deforestation in the Amazon rainforest at any time and in any weather conditions. The goal of the Challenge is to provide a common benchmark for multimodal information processing and to bring together the earth and environmental science communities as…

    Submitted 31 May, 2022; v1 submitted 15 April, 2022; originally announced April 2022.

  42. arXiv:2201.06759  [pdf, other]

    cs.AI cs.CY cs.LG

    Knowledge Sharing via Domain Adaptation in Customs Fraud Detection

    Authors: Sungwon Park, Sundong Kim, Meeyoung Cha

    Abstract: Knowledge of the changing traffic is critical in risk management. Customs offices worldwide have traditionally relied on local resources to accumulate knowledge and detect tax fraud. This naturally poses countries with weak infrastructure to become tax havens of potentially illicit trades. The current paper proposes DAS, a memory bank platform to facilitate knowledge sharing across multi-national…

    Submitted 18 January, 2022; originally announced January 2022.

    Comments: AAAI2022, Special track on AI for Social Impact

  43. arXiv:2111.01663  [pdf, other]

    cs.AI cs.IR

    Classification of Goods Using Text Descriptions With Sentences Retrieval

    Authors: Eunji Lee, Sundong Kim, Sihyun Kim, Sungwon Park, Meeyoung Cha, Soyeon Jung, Suyoung Yang, Yeonsoo Choi, Sungdae Ji, Minsoo Song, Heeja Kim

    Abstract: The task of assigning and validating internationally accepted commodity code (HS code) to traded goods is one of the critical functions at the customs office. This decision is crucial to importers and exporters, as it determines the tariff rate. However, similar to court decisions made by judges, the task can be non-trivial even for experienced customs officers. The current paper proposes a deep l…

    Submitted 2 November, 2021; originally announced November 2021.

  44. arXiv:2106.10147  [pdf, other]

    cs.CR cs.LG

    Evaluating the Robustness of Trigger Set-Based Watermarks Embedded in Deep Neural Networks

    Authors: Suyoung Lee, Wonho Song, Suman Jana, Meeyoung Cha, Sooel Son

    Abstract: Trigger set-based watermarking schemes have gained emerging attention as they provide a means to prove ownership for deep neural network model owners. In this paper, we argue that state-of-the-art trigger set-based watermarking algorithms do not achieve their designed goal of proving ownership. We posit that this impaired capability stems from two common experimental flaws that the existing resear…

    Submitted 19 January, 2023; v1 submitted 18 June, 2021; originally announced June 2021.

    Comments: 15 pages, accepted at IEEE TDSC

  45. arXiv:2105.04046  [pdf, other]

    stat.ML cs.LG

    A likelihood approach to nonparametric estimation of a singular distribution using deep generative models

    Authors: Minwoo Chae, Dongha Kim, Yongdai Kim, Lizhen Lin

    Abstract: We investigate statistical properties of a likelihood approach to nonparametric estimation of a singular distribution using deep generative models. More specifically, a deep generative model is used to model high-dimensional data that are assumed to concentrate around some low-dimensional structure. Estimating the distribution supported on this low-dimensional structure, such as a low-dimensional…

    Submitted 28 March, 2023; v1 submitted 9 May, 2021; originally announced May 2021.

    Comments: 42 pages, 13 figures, 1 table

    MSC Class: 62G05 (Primary); 62G20 (Secondary)

  46. Misinformation, Believability, and Vaccine Acceptance Over 40 Countries: Takeaways From the Initial Phase of The COVID-19 Infodemic

    Authors: Karandeep Singh, Gabriel Lima, Meeyoung Cha, Chiyoung Cha, Juhi Kulshrestha, Yong-Yeol Ahn, Onur Varol

    Abstract: The COVID-19 pandemic has been damaging to the lives of people all around the world. Accompanied by the pandemic is an infodemic, an abundant and uncontrolled spreading of potentially harmful misinformation. The infodemic may severely change the pandemic's course by interfering with public health interventions such as wearing masks, social distancing, and vaccination. In particular, the impact of…

    Submitted 22 April, 2021; originally announced April 2021.

  47. arXiv:2104.03613  [pdf, other]

    cs.LG cs.AI stat.AP

    Uncertainty-aware Remaining Useful Life predictor

    Authors: Luca Biggio, Alexander Wieland, Manuel Arias Chao, Iason Kastanis, Olga Fink

    Abstract: Remaining Useful Life (RUL) estimation is the problem of inferring how long a certain industrial asset can be expected to operate within its defined specifications. Deploying successful RUL prediction methods in real-life applications is a prerequisite for the design of intelligent maintenance strategies with the potential of drastically reducing maintenance costs and machine downtimes. In light o…

    Submitted 8 April, 2021; originally announced April 2021.

  48. arXiv:2103.15296  [pdf, other]

    cs.CV cs.LG

    Elsa: Energy-based learning for semi-supervised anomaly detection

    Authors: Sungwon Han, Hyeonho Song, Seungeon Lee, Sungwon Park, Meeyoung Cha

    Abstract: Anomaly detection aims at identifying deviant instances from the normal data distribution. Many advances have been made in the field, including the innovative use of unsupervised contrastive learning. However, existing methods generally assume clean training data and are limited when the data contain unknown anomalies. This paper presents Elsa, a novel semi-supervised anomaly detection approach th…

    Submitted 3 January, 2022; v1 submitted 28 March, 2021; originally announced March 2021.

    Comments: Accepted and published at BMVC2021

  49. Multimodal Representation Learning via Maximization of Local Mutual Information

    Authors: Ruizhi Liao, Daniel Moyer, Miriam Cha, Keegan Quigley, Seth Berkowitz, Steven Horng, Polina Golland, William M. Wells

    Abstract: We propose and demonstrate a representation learning approach by maximizing the mutual information between local features of images and text. The goal of this approach is to learn useful image representations by taking advantage of the rich information contained in the free text that describes the findings in the image. Our method trains image and text encoders by encouraging the resulting represe…

    Submitted 14 December, 2021; v1 submitted 7 March, 2021; originally announced March 2021.

    Comments: In Proceedings of International Conference on Medical Image Computing and Computer Assisted Intervention (MICCAI), 2021

    Journal ref: In International Conference on Medical Image Computing and Computer-Assisted Intervention, pp. 273-283. Springer, Cham, 2021

  50. arXiv:2102.07650  [pdf, other]

    cs.LG

    Learning Student-Friendly Teacher Networks for Knowledge Distillation

    Authors: Dae Young Park, Moon-Hyun Cha, Changwook Jeong, Dae Sin Kim, Bohyung Han

    Abstract: We propose a novel knowledge distillation approach to facilitate the transfer of dark knowledge from a teacher to a student. Contrary to most of the existing methods that rely on effective training of student models given pretrained teachers, we aim to learn the teacher models that are friendly to students and, consequently, more appropriate for knowledge transfer. In other words, at the time of o…

    Submitted 23 January, 2022; v1 submitted 12 February, 2021; originally announced February 2021.

    Comments: Accepted by NeurIPS 2021