-
Coderivative-Based Newton Methods with Wolfe Linesearch for Nonsmooth Optimization
Authors:
Miantao Chao,
Boris S. Mordukhovich,
Zijian Shi,
** Zhang
Abstract:
This paper introduces and develops novel coderivative-based Newton methods with Wolfe linesearch conditions to solve various classes of problems in nonsmooth optimization. We first propose a generalized regularized Newton method with Wolfe linesearch (GRNM-W) for unconstrained $C^{1,1}$ minimization problems (which are second-order nonsmooth) and establish global as well as local superlinear convergence of their iterates. To deal with convex composite minimization problems (which are first-order nonsmooth and can be constrained), we combine the proposed GRNM-W with two algorithmic frameworks: the forward-backward envelope and the augmented Lagrangian method, resulting in two new algorithms called CNFB and CNAL, respectively. Finally, we present numerical results to solve Lasso and support vector machine problems appearing in, e.g., machine learning and statistics, which demonstrate the efficiency of the proposed algorithms.
Submitted 3 July, 2024; v1 submitted 2 July, 2024;
originally announced July 2024.
-
Adversarial Style Augmentation via Large Language Model for Robust Fake News Detection
Authors:
Sungwon Park,
Sungwon Han,
Meeyoung Cha
Abstract:
The spread of fake news negatively impacts individuals and is regarded as a significant social challenge that needs to be addressed. A number of algorithmic and insightful features have been identified for detecting fake news. However, with the recent LLMs and their advanced generation capabilities, many of the detectable features (e.g., style-conversion attacks) can be altered, making it more challenging to distinguish from real news. This study proposes adversarial style augmentation, AdStyle, to train a fake news detector that remains robust against various style-conversion attacks. Our model's key mechanism is the careful use of LLMs to automatically generate a diverse yet coherent range of style-conversion attack prompts. This improves the generation of prompts that are particularly difficult for the detector to handle. Experiments show that our augmentation strategy improves robustness and detection performance when tested on fake news benchmark datasets.
Submitted 17 June, 2024;
originally announced June 2024.
-
GeoSEE: Regional Socio-Economic Estimation With a Large Language Model
Authors:
Sungwon Han,
Donghyun Ahn,
Seungeon Lee,
Minhyuk Song,
Sungwon Park,
Sangyoon Park,
Jihee Kim,
Meeyoung Cha
Abstract:
Moving beyond traditional surveys, combining heterogeneous data sources with AI-driven inference models brings new opportunities to measure socio-economic conditions, such as poverty and population, over expansive geographic areas. The current research presents GeoSEE, a method that can estimate various socio-economic indicators using a unified pipeline powered by a large language model (LLM). Presented with a diverse set of information modules, including those pre-constructed from satellite imagery, GeoSEE selects which modules to use in estimation, for each indicator and country. This selection is guided by the LLM's prior socio-geographic knowledge, which functions similarly to the insights of a domain expert. The system then computes target indicators via in-context learning after aggregating results from selected modules in the format of natural language-based texts. Comprehensive evaluation across countries at various stages of development reveals that our method outperforms other predictive models in both unsupervised and low-shot contexts. This reliable performance under data-scarce setting in under-developed or developing countries, combined with its cost-effectiveness, underscores its potential to continuously support and monitor the progress of Sustainable Development Goals, such as poverty alleviation and equitable growth, on a global scale.
Submitted 14 June, 2024;
originally announced June 2024.
-
Generalizable Disaster Damage Assessment via Change Detection with Vision Foundation Model
Authors:
Kyeong** Ahn,
Sungwon Han,
Sungwon Park,
Jihee Kim,
Sangyoon Park,
Meeyoung Cha
Abstract:
The increasing frequency and intensity of natural disasters demand more sophisticated approaches for rapid and precise damage assessment. To tackle this issue, researchers have developed various methods on disaster benchmark datasets from satellite imagery to aid in detecting disaster damage. However, the diverse nature of geographical landscapes and disasters makes it challenging to apply existing methods to regions unseen during training. We present DAVI (Disaster Assessment with VIsion foundation model), which overcomes domain disparities and detects structural damage (e.g., building) without requiring ground-truth labels of the target region. DAVI integrates task-specific knowledge from a model trained on source regions with an image segmentation foundation model to generate pseudo labels of possible damage in the target region. It then employs a two-stage refinement process, targeting both the pixel and overall image, to more accurately pinpoint changes in disaster-struck areas based on before-and-after images. Comprehensive evaluations demonstrate that DAVI achieves exceptional performance across diverse terrains (e.g., USA and Mexico) and disaster types (e.g., wildfires, hurricanes, and earthquakes). This confirms its robustness in assessing disaster impact without dependence on ground-truth labels.
Submitted 12 June, 2024;
originally announced June 2024.
-
Robust Optimization in Protein Fitness Landscapes Using Reinforcement Learning in Latent Space
Authors:
Minji Lee,
Luiz Felipe Vecchietti,
Hyunkyu Jung,
Hyun Joo Ro,
Meeyoung Cha,
Ho Min Kim
Abstract:
Proteins are complex molecules responsible for different functions in nature. Enhancing the functionality of proteins and cellular fitness can significantly impact various industries. However, protein optimization using computational methods remains challenging, especially when starting from low-fitness sequences. We propose LatProtRL, an optimization method to efficiently traverse a latent space learned by an encoder-decoder leveraging a large protein language model. To escape local optima, our optimization is modeled as a Markov decision process using reinforcement learning acting directly in latent space. We evaluate our approach on two important fitness optimization tasks, demonstrating its ability to achieve comparable or superior fitness over baseline methods. Our findings and in vitro evaluation show that the generated sequences can reach high-fitness regions, suggesting a substantial potential of LatProtRL in lab-in-the-loop scenarios.
Submitted 29 May, 2024;
originally announced May 2024.
-
Health Index Estimation Through Integration of General Knowledge with Unsupervised Learning
Authors:
Kristupas Bajarunas,
Marcia L. Baptista,
Kai Goebel,
Manuel A. Chao
Abstract:
Accurately estimating a Health Index (HI) from condition monitoring data (CM) is essential for reliable and interpretable prognostics and health management (PHM) in complex systems. In most scenarios, complex systems operate under varying operating conditions and can exhibit different fault modes, making unsupervised inference of an HI from CM data a significant challenge. Hybrid models combining prior knowledge about degradation with deep learning models have been proposed to overcome this challenge. However, previously suggested hybrid models for HI estimation usually rely heavily on system-specific information, limiting their transferability to other systems. In this work, we propose an unsupervised hybrid method for HI estimation that integrates general knowledge about degradation into the convolutional autoencoder's model architecture and learning algorithm, enhancing its applicability across various systems. The effectiveness of the proposed method is demonstrated in two case studies from different domains: turbofan engines and lithium batteries. The results show that the proposed method outperforms other competitive alternatives, including residual-based methods, in terms of HI quality and their utility for Remaining Useful Life (RUL) predictions. The case studies also highlight the comparable performance of our proposed method with a supervised model trained with HI labels.
Submitted 8 May, 2024;
originally announced May 2024.
-
Unveiling dynamic bifurcation of Resch-patterned origami for self-adaptive impact mitigation structure
Authors:
Yasuhiro Miyazawa,
Chia-Yung Chang,
Qixun Li,
Ryan Tenu Ahn,
Koshiro Yamaguchi,
Seonghyun Kim,
Minho Cha,
Junseo Kim,
Yuyang Song,
Shinnosuke Shimokawa,
Umesh Gandhi,
**kyu Yang
Abstract:
In the classic realm of impact mitigation, targeting different impact scenarios with a universally designed device still remains an unassailable challenge. In this study, we delve into the untapped potential of Resch-patterned origami for impact mitigation, specifically considering the adaptively reconfigurable nature of the Resch origami structure. Our unit-cell-level analyses reveal two distinctive modes of deformation, each characterized by contrasting mechanical responses: the folding mode that displays monostability coupled with strain-hardening, and the unfolding mode that manifests bistability, facilitating energy absorption through snap-through dynamics. Drop tests further unveil a novel dynamic bifurcation phenomenon, where the origami switches between folding and unfolding depending on impact speed, thereby showcasing its innate self-reconfigurability in a wide range of dynamic events. The tessellated meter-scale Resch structure mimicking an automotive bumper inherits this dynamically bifurcating behavior, demonstrating the instantaneous morphing into favorable deformation mode to minimize the peak acceleration upon impact. This suggests a self-adaptive and universally applicable impact-absorbing nature of the Resch-patterned origami system. We believe that our findings pave the way for developing smart, origami-inspired impact mitigation devices capable of real-time response and adaptation to external stimuli, offering insights into designing universally protective structures with enhanced performance in response to various impact scenarios.
Submitted 23 April, 2024;
originally announced April 2024.
-
FedMID: A Data-Free Method for Using Intermediate Outputs as a Defense Mechanism Against Poisoning Attacks in Federated Learning
Authors:
Sungwon Han,
Hyeonho Song,
Sungwon Park,
Meeyoung Cha
Abstract:
Federated learning combines local updates from clients to produce a global model, which is susceptible to poisoning attacks. Most previous defense strategies relied on vectors derived from projections of local updates on a Euclidean space; however, these methods fail to accurately represent the functionality and structure of local models, resulting in inconsistent performance. Here, we present a new paradigm to defend against poisoning attacks in federated learning using functional mappings of local models based on intermediate outputs. Experiments show that our mechanism is robust under a broad range of computing conditions and advanced attack scenarios, enabling safer collaboration among data-sensitive participants via federated learning.
Submitted 18 April, 2024;
originally announced April 2024.
-
DeepVM: Integrating Spot and On-Demand VMs for Cost-Efficient Deep Learning Clusters in the Cloud
Authors:
Yoochan Kim,
Kihyun Kim,
Yonghyeon Cho,
**woo Kim,
Awais Khan,
Ki-Dong Kang,
Baik-Song An,
Myung-Hoon Cha,
Hong-Yeon Kim,
Youngjae Kim
Abstract:
Distributed Deep Learning (DDL), as a paradigm, dictates the use of GPU-based clusters as the optimal infrastructure for training large-scale Deep Neural Networks (DNNs). However, the high cost of such resources makes them inaccessible to many users. Public cloud services, particularly Spot Virtual Machines (VMs), offer a cost-effective alternative, but their unpredictable availability poses a significant challenge to the crucial checkpointing process in DDL. To address this, we introduce DeepVM, a novel solution that recommends cost-effective cluster configurations by intelligently balancing the use of Spot and On-Demand VMs. DeepVM leverages a four-stage process that analyzes instance performance using the FLOPP (FLoating-point Operations Per Price) metric, performs architecture-level analysis with linear programming, and identifies the optimal configuration for the user-specific needs. Extensive simulations and real-world deployments in the AWS environment demonstrate that DeepVM consistently outperforms other policies, reducing training costs and overall makespan. By enabling cost-effective checkpointing with Spot VMs, DeepVM opens up DDL to a wider range of users and facilitates a more efficient training of complex DNNs.
Submitted 14 March, 2024; v1 submitted 9 March, 2024;
originally announced March 2024.
-
Cluster structure of 3$α$+p states in $^{13}$N
Authors:
J. Bishop,
G. V. Rogachev,
S. Ahn,
M. Barbui,
S. M. Cha,
E. Harris,
C. Hunt,
C. H. Kim,
D. Kim,
S. H. Kim,
E. Koshchiy,
Z. Luo,
C. Park,
C. E. Parker,
E. C. Pollacco,
B. T. Roeder,
M. Roosa,
A. Saastamoinen,
D. P. Scriven
Abstract:
Background: Cluster states in $^{13}$N are extremely difficult to measure due to the unavailability of $^{9}$B+$α$ elastic scattering data. Purpose: Using $β$-delayed charged-particle spectroscopy of $^{13}$O, clustered states in $^{13}$N can be populated and measured in the 3$α$+p decay channel. Method: One-at-a-time implantation/decay of $^{13}$O was performed with the Texas Active Target Time Projection Chamber (TexAT TPC). 149 $β3αp$ decay events were observed and the excitation function in $^{13}$N reconstructed. Results: Four previously unknown $α$-decaying excited states were observed in $^{13}$N at an excitation energy of 11.3 MeV, 12.4 MeV, 13.1 MeV and 13.7 MeV decaying via the 3$α$+p channel. Conclusion: These states are seen to have a [$^{9}\mathrm{B}(\mathrm{g.s}) \bigotimes α$/ $p+^{12}\mathrm{C}(0_{2}^{+})$], [$^{9}\mathrm{B}(\frac{1}{2}^{+}) \bigotimes α$], [$^{9}\mathrm{B}(\frac{5}{2}^{+}) \bigotimes α$] and [$^{9}\mathrm{B}(\frac{5}{2}^{+}) \bigotimes α$] structure respectively. A previously-seen state at 11.8 MeV was also determined to have a [$p+^{12}\mathrm{C}(\mathrm{g.s.})$/ $p+^{12}\mathrm{C}(0_{2}^{+})$] structure. The overall magnitude of the clustering is not able to be extracted however due to the lack of a total width measurement. Clustered states in $^{13}$N (with unknown magnitude) seem to persist from the addition of a proton to the highly $α$-clustered $^{12}$C. Evidence of the $\frac{1}{2}^{+}$ state in $^{9}$B was also seen to be populated by decays from $^{13}$N$^{\star}$.
Submitted 26 February, 2024;
originally announced February 2024.
-
I Am Not Them: Fluid Identities and Persistent Out-group Bias in Large Language Models
Authors:
Wenchao Dong,
Assem Zhunis,
Hyo** Chin,
Jiyoung Han,
Meeyoung Cha
Abstract:
We explored cultural biases (individualism vs. collectivism) in ChatGPT across three Western languages (i.e., English, German, and French) and three Eastern languages (i.e., Chinese, Japanese, and Korean). When ChatGPT adopted an individualistic persona in Western languages, its collectivism scores (i.e., out-group values) exhibited a more negative trend, surpassing their positive orientation towards individualism (i.e., in-group values). Conversely, when a collectivistic persona was assigned to ChatGPT in Eastern languages, a similar pattern emerged with more negative responses toward individualism (i.e., out-group values) as compared to collectivism (i.e., in-group values). The results indicate that when imbued with a particular social identity, ChatGPT discerns in-group and out-group, embracing in-group values while eschewing out-group values. Notably, the negativity towards the out-group, from which prejudices and discrimination arise, exceeded the positivity towards the in-group. The experiment was replicated in the political domain, and the results remained consistent. Furthermore, this replication unveiled an intrinsic Democratic bias in Large Language Models (LLMs), aligning with earlier findings and providing integral insights into mitigating such bias through prompt engineering. Extensive robustness checks were performed using varying hyperparameter and persona setup methods, with or without social identity labels, across other popular language models.
Submitted 15 February, 2024;
originally announced February 2024.
-
Self Supervised Vision for Climate Downscaling
Authors:
Karandeep Singh,
Chaeyoon Jeong,
Naufal Shidqi,
Sungwon Park,
Arjun Nellikkattil,
Elke Zeller,
Meeyoung Cha
Abstract:
Climate change is one of the most critical challenges that our planet is facing today. Rising global temperatures are already bringing noticeable changes to Earth's weather and climate patterns with an increased frequency of unpredictable and extreme weather events. Future projections for climate change research are based on Earth System Models (ESMs), the computer models that simulate the Earth's climate system. ESMs provide a framework to integrate various physical systems, but their output is bound by the enormous computational resources required for running and archiving higher-resolution simulations. For a given resource budget, the ESMs are generally run on a coarser grid, followed by a computationally lighter $downscaling$ process to obtain a finer-resolution output. In this work, we present a deep-learning model for downscaling ESM simulation data that does not require high-resolution ground truth data for model optimization. This is realized by leveraging salient data distribution patterns and the hidden dependencies between weather variables for an $\textit{individual}$ data point at $\textit{runtime}$. Extensive evaluation with $2$x, $3$x, and $4$x scaling factors demonstrates that the proposed model consistently obtains superior performance over that of various baselines. The improved downscaling performance and no dependence on high-resolution ground truth data make the proposed method a valuable tool for climate research and mark it as a promising direction for future research.
Submitted 9 January, 2024;
originally announced January 2024.
-
Enhancing Campus Mobility: Achievements and Challenges of Autonomous Shuttle "Snow Lion"
Authors:
Yingbing Chen,
Jie Cheng,
Sheng Wang,
Hongji Liu,
Xiaodong Mei,
Xiaoyang Yan,
Mingkai Tang,
Ge Sun,
Ya Wen,
Junwei Cai,
Xupeng Xie,
Lu Gan,
Mandan Chao,
Ren Xin,
Ming Liu,
Jianhao Jiao,
Kangcheng Liu,
Lujia Wang
Abstract:
The rapid evolution of autonomous vehicles (AVs) has significantly influenced global transportation systems. In this context, we present "Snow Lion", an autonomous shuttle meticulously designed to revolutionize on-campus transportation, offering a safer and more efficient mobility solution for students, faculty, and visitors. The primary objective of this research is to enhance campus mobility by providing a reliable, efficient, and eco-friendly transportation solution that seamlessly integrates with existing infrastructure and meets the diverse needs of a university setting. To achieve this goal, we delve into the intricacies of the system design, encompassing sensing, perception, localization, planning, and control aspects. We evaluate the autonomous shuttle's performance in real-world scenarios, involving a 1146-kilometer road haul and the transportation of 442 passengers over a two-month period. These experiments demonstrate the effectiveness of our system and offer valuable insights into the intricate process of integrating an autonomous vehicle within campus shuttle operations. Furthermore, a thorough analysis of the lessons derived from this experience furnishes a valuable real-world case study, accompanied by recommendations for future research and development in the field of autonomous driving.
Submitted 16 January, 2024;
originally announced January 2024.
-
SOLAR 10.7B: Scaling Large Language Models with Simple yet Effective Depth Up-Scaling
Authors:
Dahyun Kim,
Chanjun Park,
Sanghoon Kim,
Wonsung Lee,
Wonho Song,
Yunsu Kim,
Hyeonwoo Kim,
Yungi Kim,
Hyeonju Lee,
Jihoo Kim,
Changbae Ahn,
Seonghoon Yang,
Sukyung Lee,
Hyunbyung Park,
Gyoung** Gim,
Mikyoung Cha,
Hwalsuk Lee,
Sunghun Kim
Abstract:
We introduce SOLAR 10.7B, a large language model (LLM) with 10.7 billion parameters, demonstrating superior performance in various natural language processing (NLP) tasks. Inspired by recent efforts to efficiently up-scale LLMs, we present a method for scaling LLMs called depth up-scaling (DUS), which encompasses depthwise scaling and continued pretraining. In contrast to other LLM up-scaling methods that use mixture-of-experts, DUS does not require complex changes to train and inference efficiently. We show experimentally that DUS is simple yet effective in scaling up high-performance LLMs from small ones. Building on the DUS model, we additionally present SOLAR 10.7B-Instruct, a variant fine-tuned for instruction-following capabilities, surpassing Mixtral-8x7B-Instruct. SOLAR 10.7B is publicly available under the Apache 2.0 license, promoting broad access and application in the LLM field.
Submitted 3 April, 2024; v1 submitted 23 December, 2023;
originally announced December 2023.
-
Explainable Product Classification for Customs
Authors:
Eunji Lee,
Sihyeon Kim,
Sundong Kim,
Soyeon Jung,
Heeja Kim,
Meeyoung Cha
Abstract:
The task of assigning internationally accepted commodity codes (aka HS codes) to traded goods is a critical function of customs offices. Like court decisions made by judges, this task follows the doctrine of precedent and can be nontrivial even for experienced officers. Together with the Korea Customs Service (KCS), we propose a first-ever explainable decision supporting model that suggests the most likely subheadings (i.e., the first six digits) of the HS code. The model also provides reasoning for its suggestion in the form of a document that is interpretable by customs officers. We evaluated the model using 5,000 cases that recently received a classification request. The results showed that the top-3 suggestions made by our model had an accuracy of 93.9% when classifying 925 challenging subheadings. A user study with 32 customs experts further confirmed that our algorithmic suggestions accompanied by explainable reasonings, can substantially reduce the time and effort taken by customs officers for classification reviews.
Submitted 17 November, 2023;
originally announced November 2023.
-
Bidirectional Captioning for Clinically Accurate and Interpretable Models
Authors:
Keegan Quigley,
Miriam Cha,
Josh Barua,
Geeticka Chauhan,
Seth Berkowitz,
Steven Horng,
Polina Golland
Abstract:
Vision-language pretraining has been shown to produce high-quality visual encoders which transfer efficiently to downstream computer vision tasks. While generative language models have gained widespread attention, image captioning has thus far been mostly overlooked as a form of cross-modal pretraining in favor of contrastive learning, especially in medical image analysis. In this paper, we experiment with bidirectional captioning of radiology reports as a form of pretraining and compare the quality and utility of learned embeddings with those from contrastive pretraining methods. We optimize a CNN encoder, transformer decoder architecture named RadTex for the radiology domain. Results show that not only does captioning pretraining yield visual encoders that are competitive with contrastive pretraining (CheXpert competition multi-label AUC of 89.4%), but also that our transformer decoder is capable of generating clinically relevant reports (captioning macro-F1 score of 0.349 using CheXpert labeler) and responding to prompts with targeted, interactive outputs.
Submitted 30 October, 2023;
originally announced October 2023.
-
Factuality Challenges in the Era of Large Language Models
Authors:
Isabelle Augenstein,
Timothy Baldwin,
Meeyoung Cha,
Tanmoy Chakraborty,
Giovanni Luca Ciampaglia,
David Corney,
Renee DiResta,
Emilio Ferrara,
Scott Hale,
Alon Halevy,
Eduard Hovy,
Heng Ji,
Filippo Menczer,
Ruben Miguez,
Preslav Nakov,
Dietram Scheufele,
Shivam Sharma,
Giovanni Zagni
Abstract:
The emergence of tools based on Large Language Models (LLMs), such as OpenAI's ChatGPT, Microsoft's Bing Chat, and Google's Bard, has garnered immense public attention. These incredibly useful, natural-sounding tools mark significant advances in natural language generation, yet they exhibit a propensity to generate false, erroneous, or misleading content -- commonly referred to as "hallucinations." Moreover, LLMs can be exploited for malicious applications, such as generating false but credible-sounding content and profiles at scale. This poses a significant challenge to society in terms of the potential deception of users and the increasing dissemination of inaccurate information. In light of these risks, we explore the kinds of technological innovations, regulatory reforms, and AI literacy initiatives needed from fact-checkers, news organizations, and the broader research and policy communities. By identifying the risks, the imminent threats, and some viable solutions, we seek to shed light on navigating various aspects of veracity in the era of generative AI.
Submitted 9 October, 2023; v1 submitted 8 October, 2023;
originally announced October 2023.
-
A Comparative Study of Reference Reliability in Multiple Language Editions of Wikipedia
Authors:
Aitolkyn Baigutanova,
Diego Saez-Trumper,
Miriam Redi,
Meeyoung Cha,
Pablo Aragón
Abstract:
Information presented in Wikipedia articles must be attributable to reliable published sources in the form of references. This study examines over 5 million Wikipedia articles to assess the reliability of references in multiple language editions. We quantify the cross-lingual patterns of the perennial sources list, a collection of reliability labels for web domains identified and collaboratively agreed upon by Wikipedia editors. We discover that some sources (or web domains) deemed untrustworthy in one language (i.e., English) continue to appear in articles in other languages. This trend is especially evident with sources tailored for smaller communities. Furthermore, non-authoritative sources found in the English version of a page tend to persist in other language versions of that page. We finally present a case study on the Chinese, Russian, and Swedish Wikipedias to demonstrate a discrepancy in reference reliability across cultures. Our finding highlights future challenges in coordinating global knowledge on source reliability.
Submitted 4 September, 2023; v1 submitted 31 August, 2023;
originally announced September 2023.
-
Fine-Grained Socioeconomic Prediction from Satellite Images with Distributional Adjustment
Authors:
Donghyun Ahn,
Minhyuk Song,
Seungeon Lee,
Yubin Choi,
Jihee Kim,
Sangyoon Park,
Hyunjoo Yang,
Meeyoung Cha
Abstract:
While measuring socioeconomic indicators is critical for local governments to make informed policy decisions, such measurements are often unavailable at fine-grained levels like municipality. This study employs deep learning-based predictions from satellite images to close the gap. We propose a method that assigns a socioeconomic score to each satellite image by capturing the distributional behavior observed in larger areas based on the ground truth. We train an ordinal regression scoring model and adjust the scores to follow the common power law within and across regions. Evaluation based on official statistics in South Korea shows that our method outperforms previous models in predicting population and employment size at both the municipality and grid levels. Our method also demonstrates robust performance in districts with uneven development, suggesting its potential use in developing countries where reliable, fine-grained data is scarce.
Submitted 4 September, 2023; v1 submitted 30 August, 2023;
originally announced August 2023.
-
Towards Attack-tolerant Federated Learning via Critical Parameter Analysis
Authors:
Sungwon Han,
Sungwon Park,
Fangzhao Wu,
Sundong Kim,
Bin Zhu,
Xing Xie,
Meeyoung Cha
Abstract:
Federated learning is used to train a shared model in a decentralized way without clients sharing private data with each other. Federated learning systems are susceptible to poisoning attacks when malicious clients send false updates to the central server. Existing defense strategies are ineffective under non-IID data settings. This paper proposes a new defense strategy, FedCPA (Federated learning with Critical Parameter Analysis). Our attack-tolerant aggregation method is based on the observation that benign local models have similar sets of top-k and bottom-k critical parameters, whereas poisoned local models do not. Experiments with different attack scenarios on multiple datasets demonstrate that our model outperforms existing defense strategies in defending against poisoning attacks.
Submitted 18 August, 2023;
originally announced August 2023.
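The observation in this abstract, that benign local models share similar top-k and bottom-k critical parameter sets while poisoned ones do not, can be illustrated with a minimal sketch. This is not the authors' FedCPA implementation: the per-parameter importance scores, the Jaccard overlap, and the function names `critical_sets` and `critical_similarity` are all assumptions made here for illustration.

```python
from typing import List, Set, Tuple


def critical_sets(importance: List[float], k: int) -> Tuple[Set[int], Set[int]]:
    """Return the index sets of the k most and k least important parameters.

    `importance` is a hypothetical per-parameter score; the abstract does
    not say how importance is measured.
    """
    if not 1 <= k <= len(importance):
        raise ValueError("k must be between 1 and the number of parameters")
    order = sorted(range(len(importance)), key=lambda i: importance[i])
    return set(order[-k:]), set(order[:k])


def jaccard(a: Set[int], b: Set[int]) -> float:
    """Jaccard overlap of two index sets (1.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def critical_similarity(imp_a: List[float], imp_b: List[float], k: int) -> float:
    """Average overlap of the top-k and bottom-k sets of two local models."""
    top_a, bottom_a = critical_sets(imp_a, k)
    top_b, bottom_b = critical_sets(imp_b, k)
    return 0.5 * (jaccard(top_a, top_b) + jaccard(bottom_a, bottom_b))


# Toy importance vectors (made up): two benign clients and one poisoned client.
benign_a = [0.9, 0.1, 0.8, 0.2, 0.5, 0.05]
benign_b = [0.85, 0.15, 0.75, 0.1, 0.4, 0.02]
poisoned = [0.05, 0.9, 0.1, 0.8, 0.5, 0.7]

sim_benign = critical_similarity(benign_a, benign_b, k=2)
sim_poisoned = critical_similarity(benign_a, poisoned, k=2)
```

In this toy setup the benign pair scores higher than the benign/poisoned pair, so an aggregator could, under these assumptions, down-weight updates whose critical sets disagree with the majority.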
-
FedDefender: Client-Side Attack-Tolerant Federated Learning
Authors:
Sungwon Park,
Sungwon Han,
Fangzhao Wu,
Sundong Kim,
Bin Zhu,
Xing Xie,
Meeyoung Cha
Abstract:
Federated learning enables learning from decentralized data sources without compromising privacy, which makes it a crucial technique. However, it is vulnerable to model poisoning attacks, where malicious clients interfere with the training process. Previous defense mechanisms have focused on the server-side by using careful model aggregation, but this may not be effective when the data is not identically distributed or when attackers can access the information of benign clients. In this paper, we propose a new defense mechanism that focuses on the client-side, called FedDefender, to help benign clients train robust local models and avoid the adverse impact of malicious model updates from attackers, even when a server-side defense cannot identify or remove adversaries. Our method consists of two main components: (1) attack-tolerant local meta update and (2) attack-tolerant global knowledge distillation. These components are used to find noise-resilient model parameters while accurately extracting knowledge from a potentially corrupted global model. Our client-side defense strategy has a flexible structure and can work in conjunction with any existing server-side strategies. Evaluations of real-world scenarios across multiple datasets show that the proposed method enhances the robustness of federated learning against model poisoning attacks.
Submitted 18 July, 2023;
originally announced July 2023.
-
Quantitative Analysis of Cultural Dynamics Seen from an Event-based Social Network
Authors:
Bayu Adhi Tama,
Jaehong Kim,
Jaehyuk Park,
Lev Manovich,
Meeyoung Cha
Abstract:
Culture is a collection of connected and potentially interactive patterns that characterize a social group or a passed-on idea that people acquire as members of society. While offline activities can provide a better picture of the geographical association of cultural traits than online activities, gathering such data on a large scale has been challenging. Here, we use multi-decade longitudinal records of cultural events from Meetup.com, the largest event-based social networking service, to examine the landscape of offline cultural events. We analyze the temporal and categorical event dynamics driven by cultural diversity using over 2 million event logs collected over 17 years in 90 countries. Our results show that the national economic status explains 44.6 percent of the variance in total event count, while cultural characteristics such as individualism and long-term orientation explain 32.8 percent of the variance in topic categories. Furthermore, our analysis using hierarchical clustering reveals cultural proximity between the topics of socio-cultural activities (e.g., politics, leisure, health, technology). We expect that this work provides a landscape of social and cultural activities across the world, which allows us to better understand their dynamical patterns as well as their associations with cultural characteristics.
Submitted 9 June, 2023;
originally announced June 2023.
-
MultiEarth 2023 -- Multimodal Learning for Earth and Environment Workshop and Challenge
Authors:
Miriam Cha,
Gregory Angelides,
Mark Hamilton,
Andy Soszynski,
Brandon Swenson,
Nathaniel Maidel,
Phillip Isola,
Taylor Perron,
Bill Freeman
Abstract:
The Multimodal Learning for Earth and Environment Workshop (MultiEarth 2023) is the second annual CVPR workshop aimed at the monitoring and analysis of the health of Earth ecosystems by leveraging the vast amount of remote sensing data that is continuously being collected. The primary objective of this workshop is to bring together the Earth and environmental science communities as well as the multimodal representation learning communities to explore new ways of harnessing technological advancements in support of environmental monitoring. The MultiEarth Workshop also seeks to provide a common benchmark for processing multimodal remote sensing information by organizing public challenges focused on monitoring the Amazon rainforest. These challenges include estimating deforestation, detecting forest fires, translating synthetic aperture radar (SAR) images to the visible domain, and projecting environmental trends. This paper presents the challenge guidelines, datasets, and evaluation metrics. Our challenge website is available at https://sites.google.com/view/rainforest-challenge/multiearth-2023.
Submitted 7 June, 2023;
originally announced June 2023.
-
SQuARe: A Large-Scale Dataset of Sensitive Questions and Acceptable Responses Created Through Human-Machine Collaboration
Authors:
Hwaran Lee,
Seokhee Hong,
Joonsuk Park,
Takyoung Kim,
Meeyoung Cha,
Yejin Choi,
Byoung Pil Kim,
Gunhee Kim,
Eun-Ju Lee,
Yong Lim,
Alice Oh,
Sangchul Park,
Jung-Woo Ha
Abstract:
The potential social harms that large language models pose, such as generating offensive content and reinforcing biases, are steeply rising. Existing works focus on coping with this concern while interacting with ill-intentioned users, such as those who explicitly make hate speech or elicit harmful responses. However, discussions on sensitive issues can become toxic even if the users are well-intentioned. For safer models in such scenarios, we present the Sensitive Questions and Acceptable Response (SQuARe) dataset, a large-scale Korean dataset of 49k sensitive questions with 42k acceptable and 46k non-acceptable responses. The dataset was constructed leveraging HyperCLOVA in a human-in-the-loop manner based on real news headlines. Experiments show that acceptable response generation significantly improves for HyperCLOVA and GPT-3, demonstrating the efficacy of this dataset.
Submitted 28 May, 2023;
originally announced May 2023.
-
GraphFC: Customs Fraud Detection with Label Scarcity
Authors:
Karandeep Singh,
Yu-Che Tsai,
Cheng-Te Li,
Meeyoung Cha,
Shou-De Lin
Abstract:
Customs officials across the world encounter huge volumes of transactions. With increased connectivity and globalization, customs transactions continue to grow every year. Associated with customs transactions is customs fraud - the intentional manipulation of goods declarations to avoid taxes and duties. With limited manpower, customs offices can only undertake manual inspection of a limited number of declarations. This necessitates automating customs fraud detection with machine learning (ML) techniques. Due to the limited manual inspection available for labeling newly incoming declarations, the ML approach should perform robustly when labeled data is scarce. However, current approaches for customs fraud detection are not well suited to or designed for this real-world setting. In this work, we propose $\textbf{GraphFC}$ ($\textbf{Graph}$ neural networks for $\textbf{C}$ustoms $\textbf{F}$raud), a model-agnostic, domain-specific, semi-supervised graph neural network based customs fraud detection algorithm that has strong semi-supervised and inductive capabilities. With up to 252% relative increase in recall over the present state-of-the-art, extensive experimentation on real customs data from customs administrations of three different countries demonstrates that GraphFC consistently outperforms various baselines and the present state-of-the-art by a large margin.
Submitted 19 August, 2023; v1 submitted 18 May, 2023;
originally announced May 2023.
-
Minimax optimal density estimation using a shallow generative model with a one-dimensional latent variable
Authors:
Hyeok Kyu Kwon,
Minwoo Chae
Abstract:
A deep generative model yields an implicit estimator for the unknown distribution or density function of the observation. This paper investigates some statistical properties of the implicit density estimator pursued by VAE-type methods from a nonparametric density estimation framework. More specifically, we obtain convergence rates of the VAE-type density estimator under the assumption that the underlying true density function belongs to a locally Hölder class. Remarkably, a near minimax optimal rate with respect to the Hellinger metric can be achieved by the simplest network architecture, a shallow generative model with a one-dimensional latent variable.
Submitted 8 February, 2024; v1 submitted 11 May, 2023;
originally announced May 2023.
-
A descent method for nonsmooth multiobjective optimization problems on Riemannian manifolds
Authors:
Chunming Tang,
Hao He,
Jinbao Jian,
Miantao Chao
Abstract:
In this paper, a descent method for nonsmooth multiobjective optimization problems on complete Riemannian manifolds is proposed. The objective functions are only assumed to be locally Lipschitz continuous, rather than convex as in existing methods. A necessary condition for Pareto optimality in Euclidean space is generalized to the Riemannian setting. At every iteration, an acceptable descent direction is obtained by constructing a convex hull of some Riemannian $\varepsilon$-subgradients, and a Riemannian Armijo-type line search is then executed to produce the next iterate. The convergence result is established in the sense that a point satisfying the necessary condition for Pareto optimality can be generated by the algorithm in a finite number of iterations. Finally, some preliminary numerical results are reported, which show that the proposed method is efficient.
△ Less
Submitted 24 April, 2023;
originally announced April 2023.
-
Deep learning based ECG segmentation for delineation of diverse arrhythmias
Authors:
Chankyu Joung,
Mijin Kim,
Taejin Paik,
Seong-Ho Kong,
Seung-Young Oh,
Won Kyeong Jeon,
Jae-hu Jeon,
Joong-Sik Hong,
Wan-Joong Kim,
Woong Kook,
Myung-Jin Cha,
Otto van Koert
Abstract:
Accurate delineation of key waveforms in an ECG is a critical initial step in extracting relevant features to support the diagnosis and treatment of heart conditions. Although deep learning based methods using a segmentation model to locate the P, QRS, and T waves have shown promising results, their ability to handle signals exhibiting arrhythmia remains unclear. This study builds on existing research by introducing a U-Net-like segmentation model for ECG delineation, with a particular focus on diverse arrhythmias. For this purpose, we curate an internal dataset containing waveform boundary annotations for various arrhythmia types to train and validate our model. Our key contributions include identifying segmentation model failures in different arrhythmia types, developing a robust model using a diverse training set, achieving comparable performance on benchmark datasets, and introducing a classification guided strategy to reduce false P wave predictions for specific arrhythmias. This study advances deep learning based ECG delineation in the context of arrhythmias and highlights its challenges.
Submitted 6 September, 2023; v1 submitted 12 April, 2023;
originally announced April 2023.
-
Blaming Humans and Machines: What Shapes People's Reactions to Algorithmic Harm
Authors:
Gabriel Lima,
Nina Grgić-Hlača,
Meeyoung Cha
Abstract:
Artificial intelligence (AI) systems can cause harm to people. This research examines how individuals react to such harm through the lens of blame. Building upon research suggesting that people blame AI systems, we investigated how several factors influence people's reactive attitudes towards machines, designers, and users. The results of three studies (N = 1,153) indicate differences in how blame is attributed to these actors. Whether AI systems were explainable did not impact blame directed at them, their developers, and their users. Considerations about fairness and harmfulness increased blame towards designers and users but had little to no effect on judgments of AI systems. Instead, what determined people's reactive attitudes towards machines was whether people thought blaming them would be a suitable response to algorithmic harm. We discuss implications, such as how future decisions about including AI systems in the social and moral spheres will shape laypeople's reactions to AI-caused harm.
Submitted 4 April, 2023;
originally announced April 2023.
-
DualFair: Fair Representation Learning at Both Group and Individual Levels via Contrastive Self-supervision
Authors:
Sungwon Han,
Seungeon Lee,
Fangzhao Wu,
Sundong Kim,
Chuhan Wu,
Xiting Wang,
Xing Xie,
Meeyoung Cha
Abstract:
Algorithmic fairness has become an important machine learning problem, especially for mission-critical Web applications. This work presents a self-supervised model, called DualFair, that can debias sensitive attributes like gender and race from learned representations. Unlike existing models that target a single type of fairness, our model jointly optimizes for two fairness criteria - group fairness and counterfactual fairness - and hence makes fairer predictions at both the group and individual levels. Our model uses contrastive loss to generate embeddings that are indistinguishable for each protected group, while forcing the embeddings of counterfactual pairs to be similar. It then uses a self-knowledge distillation method to maintain the quality of representation for the downstream tasks. Extensive analysis over multiple datasets confirms the model's validity and further shows the synergy of jointly addressing two fairness criteria, suggesting the model's potential value in fair intelligent Web applications.
Submitted 15 March, 2023;
originally announced March 2023.
-
Longitudinal Assessment of Reference Quality on Wikipedia
Authors:
Aitolkyn Baigutanova,
Jaehyeon Myung,
Diego Saez-Trumper,
Ai-Jou Chou,
Miriam Redi,
Changwook Jung,
Meeyoung Cha
Abstract:
Wikipedia plays a crucial role in the integrity of the Web. This work analyzes the reliability of this global encyclopedia through the lens of its references. We operationalize the notion of reference quality by defining reference need (RN), i.e., the percentage of sentences missing a citation, and reference risk (RR), i.e., the proportion of non-authoritative references. We release Citation Detective, a tool for automatically calculating the RN score, and discover that the RN score has dropped by 20 percentage points in the last decade, with more than half of verifiable statements now accompanied by references. The RR score has remained below 1% over the years as a result of the efforts of the community to eliminate unreliable references. We propose pairing novice and experienced editors on the same Wikipedia article as a strategy to enhance reference quality. Our quasi-experiment indicates that such a co-editing experience can result in a lasting advantage in identifying unreliable sources in future edits. As Wikipedia is frequently used as the ground truth for numerous Web applications, our findings and suggestions on its reliability can have a far-reaching impact. We discuss the possibility of other Web services adopting Wiki-style user collaboration to eliminate unreliable content.
Submitted 9 March, 2023;
originally announced March 2023.
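The two scores defined in this abstract can be written down directly. The sketch below follows only the abstract's wording (RN as the share of sentences missing a citation, RR as the share of non-authoritative references); it is not the Citation Detective tool, and the input representations and the function names `reference_need` and `reference_risk` are assumptions made for illustration.

```python
from typing import List, Set


def reference_need(sentence_has_citation: List[bool]) -> float:
    """Fraction of sentences that lack a citation (0.0 for an empty article).

    Which sentences count as needing a citation is decided upstream;
    here every listed sentence is assumed to be a verifiable statement.
    """
    if not sentence_has_citation:
        return 0.0
    missing = sum(1 for cited in sentence_has_citation if not cited)
    return missing / len(sentence_has_citation)


def reference_risk(reference_domains: List[str], non_authoritative: Set[str]) -> float:
    """Fraction of references whose domain is on a non-authoritative list."""
    if not reference_domains:
        return 0.0
    risky = sum(1 for domain in reference_domains if domain in non_authoritative)
    return risky / len(reference_domains)


# Toy article: four sentences, two uncited; four references, one flagged.
rn = reference_need([True, False, False, True])
rr = reference_risk(
    ["a.example", "b.example", "a.example", "c.example"],
    non_authoritative={"b.example"},
)
```

Multiplying either value by 100 gives the percentage form used in the abstract.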
-
First observation of the $β$3$α$p decay of $^{13}\mathrm{O}$ via $β$-delayed charged-particle spectroscopy
Authors:
Jack Bishop,
G. V. Rogachev,
S. Ahn,
M. Barbui,
S. M. Cha,
E. Harris,
C. Hunt,
C. H. Kim,
D. Kim,
S. H. Kim,
E. Koshchiy,
Z. Luo,
C. Park,
C. E. Parker,
E. C. Pollacco,
B. T. Roeder,
M. Roosa,
A. Saastamoinen,
D. P. Scriven
Abstract:
Background: The $β$-delayed proton-decay of $^{13}\mathrm{O}$ has previously been studied, but the direct observation of $β$-delayed $α$+$α$+$α$+p decay has not been reported. Purpose: Observing rare 3$α$+p events from the decay of excited states in $^{13}\mathrm{N}^{\star}$ allows for a sensitive probe of exotic highly-clustered configurations in $^{13}$N. Method: To measure the low-energy products following $β$-delayed 3$α$p-decay, the TexAT Time Projection Chamber was employed using the one-at-a-time $β$-delayed charged-particle spectroscopy technique at the Cyclotron Institute, Texas A&M University. Results: A total of $1.9 \times 10^{5}$ $^{13}\mathrm{O}$ implantations were made inside the TexAT Time Projection Chamber. 149 3$α$+p events were observed yielding a $β$-delayed 3$α+p$ branching ratio of 0.078(6)%. Conclusion: Four previously unknown $α$-decaying states were observed, one with a strong $^{9}\mathrm{B(g.s)}+α$ characteristic at 11.3 MeV, one with a $^{9}\mathrm{B}(\frac{1}{2}^{+})+α$ nature at 12.4 MeV, and another two that are dominated by $^{9}\mathrm{B}({\frac{5}{2}}^{+})+α$ at 13.1 and 13.7 MeV. Population of the $\frac{1}{2}^{+}$ state in $^{9}\mathrm{B}$ has been unambiguously seen, cementing the predicted existence of the mirror-state based on the states observed in $^{9}\mathrm{Be}$.
Submitted 12 May, 2023; v1 submitted 27 February, 2023;
originally announced February 2023.
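As a rough consistency check on the numbers quoted in this abstract, the branching ratio can be approximated as observed events divided by implantations, with a Poisson counting uncertainty on the event count. This is only a sketch: it assumes perfect detection efficiency and an exact implantation count, whereas the published analysis may apply corrections not described in the abstract. The function name `branching_ratio` is hypothetical.

```python
import math
from typing import Tuple


def branching_ratio(n_events: int, n_implantations: float) -> Tuple[float, float]:
    """Return (ratio, statistical uncertainty) for a simple counting estimate.

    The uncertainty is sqrt(N) / implantations, i.e. Poisson statistics on
    the event count only.
    """
    ratio = n_events / n_implantations
    stat_err = math.sqrt(n_events) / n_implantations
    return ratio, stat_err


# Values taken from the abstract: 149 events out of 1.9e5 implantations.
ratio, stat_err = branching_ratio(149, 1.9e5)
print(f"{100 * ratio:.3f}% +/- {100 * stat_err:.3f}%")  # prints "0.078% +/- 0.006%"
```

Under these simplifying assumptions the estimate agrees with the quoted 0.078(6)%.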
-
A Benchmark on Uncertainty Quantification for Deep Learning Prognostics
Authors:
Luis Basora,
Arthur Viens,
Manuel Arias Chao,
Xavier Olive
Abstract:
Reliable uncertainty quantification on RUL prediction is crucial for informative decision-making in predictive maintenance. In this context, we assess some of the latest developments in the field of uncertainty quantification for prognostics deep learning. This includes the state-of-the-art variational inference algorithms for Bayesian neural networks (BNN) as well as popular alternatives such as Monte Carlo Dropout (MCD), deep ensembles (DE) and heteroscedastic neural networks (HNN). All the inference techniques share the same inception deep learning architecture as a functional model. We performed hyperparameter search to optimize the main variational and learning parameters of the algorithms. The performance of the methods is evaluated on a subset of the large NASA NCMAPSS dataset for aircraft engines. The assessment includes RUL prediction accuracy, the quality of predictive uncertainty, and the possibility to break down the total predictive uncertainty into its aleatoric and epistemic parts. The results show no method clearly outperforms the others in all the situations. Although all methods are close in terms of accuracy, we find differences in the way they estimate uncertainty. Thus, DE and MCD generally provide more conservative predictive uncertainty than BNN. Surprisingly, HNN can achieve strong results without the added training complexity and extra parameters of the BNN. For tasks like active learning where a separation of epistemic and aleatoric uncertainty is required, radial BNN and MCD seem the best options.
Submitted 9 February, 2023;
originally announced February 2023.
-
Detecting Contextomized Quotes in News Headlines by Contrastive Learning
Authors:
Seonyeong Song,
Hyeonho Song,
Kunwoo Park,
Jiyoung Han,
Meeyoung Cha
Abstract:
Quotes are critical for establishing credibility in news articles. A direct quote enclosed in quotation marks has a strong visual appeal and is a sign of a reliable citation. Unfortunately, this journalistic practice is not strictly followed, and a quote in the headline is often "contextomized." Such a quote uses words out of context in a way that alters the speaker's intention so that there is no semantically matching quote in the body text. We present QuoteCSE, a contrastive learning framework that represents the embedding of news quotes based on domain-driven positive and negative samples to identify such an editorial strategy. The dataset and code are available at https://github.com/ssu-humane/contextomized-quote-contrastive.
Submitted 9 February, 2023;
originally announced February 2023.
-
Waveguide Holography: Towards True 3D Holographic Glasses
Authors:
Changwon Jang,
Kiseung Bang,
Minseok Chae,
Byoungho Lee,
Douglas Lanman
Abstract:
We present a novel near-eye display concept which consists of a waveguide combiner, a spatial light modulator, and a laser light source. The proposed system can display true 3D holographic images through a see-through pupil-replicating waveguide combiner while also providing a large eye-box. By modeling the coherent light interaction inside the waveguide combiner, we demonstrate that the output wavefront from the waveguide can be controlled by modulating the wavefront of the input light using a spatial light modulator. This new possibility allows combining a holographic display, which is considered the ultimate 3D display technology, with state-of-the-art pupil-replicating waveguides, enabling the path towards true 3D holographic augmented reality glasses.
Submitted 4 November, 2022;
originally announced November 2022.
-
Self-explaining deep models with logic rule reasoning
Authors:
Seungeon Lee,
Xiting Wang,
Sungwon Han,
Xiaoyuan Yi,
Xing Xie,
Meeyoung Cha
Abstract:
We present SELOR, a framework for integrating self-explaining capabilities into a given deep model to achieve both high prediction performance and human precision. By "human precision", we refer to the degree to which humans agree with the reasons models provide for their predictions. Human precision affects user trust and allows users to collaborate closely with the model. We demonstrate that logic rule explanations naturally satisfy human precision with the expressive power required for good predictive performance. We then illustrate how to enable a deep model to predict and explain with logic rules. Our method does not require predefined logic rule sets or human annotations and can be learned efficiently and easily with widely-used deep learning modules in a differentiable way. Extensive experiments show that our method gives explanations closer to human decision logic than other methods while maintaining the performance of deep learning models.
Submitted 18 October, 2022; v1 submitted 13 October, 2022;
originally announced October 2022.
-
The Stochastic Schwarz lemma on Kähler Manifolds by Couplings and Its Applications
Authors:
Myeongju Chae,
Gunhee Cho,
Maria Gordina,
Guang Yang
Abstract:
We first provide a stochastic formula for the Carathéodory distance in terms of general Markovian couplings and prove a comparison result between the Carathéodory distance and the complete Kähler metric with a negative lower curvature bound using the Kendall-Cranston coupling. This probabilistic approach gives a version of the Schwarz lemma on complete non-compact Kähler manifolds with a further decomposition of Ricci curvature into the orthogonal Ricci curvature and the holomorphic sectional curvature, which cannot be obtained by using Yau-Royden's Schwarz lemma. We also prove coupling estimates on quaternionic Kähler manifolds. As a byproduct, we obtain an improved gradient estimate of positive harmonic functions on Kähler manifolds and quaternionic Kähler manifolds under lower curvature bounds.
Submitted 30 November, 2023; v1 submitted 21 August, 2022;
originally announced August 2022.
-
RadTex: Learning Efficient Radiograph Representations from Text Reports
Authors:
Keegan Quigley,
Miriam Cha,
Ruizhi Liao,
Geeticka Chauhan,
Steven Horng,
Seth Berkowitz,
Polina Golland
Abstract:
Automated analysis of chest radiography using deep learning has tremendous potential to enhance the clinical diagnosis of diseases in patients. However, deep learning models typically require large amounts of annotated data to achieve high performance -- often an obstacle to medical domain adaptation. In this paper, we build a data-efficient learning framework that utilizes radiology reports to improve medical image classification performance with limited labeled data (fewer than 1000 examples). Specifically, we examine image-captioning pretraining to learn high-quality medical image representations that train on fewer examples. Following joint pretraining of a convolutional encoder and transformer decoder, we transfer the learned encoder to various classification tasks. Averaged over 9 pathologies, we find that our model achieves higher classification performance than ImageNet-supervised and in-domain supervised pretraining when labeled training data is limited.
Submitted 7 April, 2023; v1 submitted 5 August, 2022;
originally announced August 2022.
-
SAR-to-EO Image Translation with Multi-Conditional Adversarial Networks
Authors:
Armando Cabrera,
Miriam Cha,
Prafull Sharma,
Michael Newey
Abstract:
This paper explores the use of multi-conditional adversarial networks for SAR-to-EO image translation. Previous methods condition adversarial networks only on the input SAR. We show that incorporating multiple complementary modalities such as Google Maps and IR can further improve SAR-to-EO image translation, especially in preserving sharp edges of man-made objects. We demonstrate the effectiveness of our approach on a diverse set of datasets including SEN12MS, DFC2020, and SpaceNet6. Our experimental results suggest that additional information provided by complementary modalities improves the performance of SAR-to-EO image translation compared to models trained on paired SAR and EO data only. To the best of our knowledge, our approach is the first to leverage multiple modalities for improving SAR-to-EO image translation performance.
Submitted 26 July, 2022;
originally announced July 2022.
-
FedX: Unsupervised Federated Learning with Cross Knowledge Distillation
Authors:
Sungwon Han,
Sungwon Park,
Fangzhao Wu,
Sundong Kim,
Chuhan Wu,
Xing Xie,
Meeyoung Cha
Abstract:
This paper presents FedX, an unsupervised federated learning framework. Our model learns unbiased representation from decentralized and heterogeneous local data. It employs a two-sided knowledge distillation with contrastive learning as a core component, allowing the federated system to function without requiring clients to share any data features. Furthermore, its adaptable architecture can be used as an add-on module for existing unsupervised algorithms in federated settings. Experiments show that our model improves performance significantly (1.58--5.52pp) on five unsupervised algorithms.
Submitted 19 July, 2022;
originally announced July 2022.
-
Developing a Series of AI Challenges for the United States Department of the Air Force
Authors:
Vijay Gadepally,
Gregory Angelides,
Andrei Barbu,
Andrew Bowne,
Laura J. Brattain,
Tamara Broderick,
Armando Cabrera,
Glenn Carl,
Ronisha Carter,
Miriam Cha,
Emilie Cowen,
Jesse Cummings,
Bill Freeman,
James Glass,
Sam Goldberg,
Mark Hamilton,
Thomas Heldt,
Kuan Wei Huang,
Phillip Isola,
Boris Katz,
Jamie Koerner,
Yen-Chen Lin,
David Mayo,
Kyle McAlpin,
Taylor Perron
, et al. (17 additional authors not shown)
Abstract:
Through a series of federal initiatives and orders, the U.S. Government has been making a concerted effort to ensure American leadership in AI. These broad strategy documents have influenced organizations such as the United States Department of the Air Force (DAF). The DAF-MIT AI Accelerator is an initiative between the DAF and MIT to bridge the gap between AI researchers and DAF mission requirements. Several projects supported by the DAF-MIT AI Accelerator are developing public challenge problems that address numerous Federal AI research priorities. These challenges target priorities by making large, AI-ready datasets publicly available, incentivizing open-source solutions, and creating a demand signal for dual use technologies that can stimulate further research. In this article, we describe these public challenges being developed and how their application contributes to scientific advances.
Submitted 14 July, 2022;
originally announced July 2022.
-
Prediction of Football Player Value using Bayesian Ensemble Approach
Authors:
Hansoo Lee,
Bayu Adhi Tama,
Meeyoung Cha
Abstract:
The transfer fees of sports players have become astronomical. This is because bringing players of great future value to the club is essential for their survival. We present a case study on the key factors affecting the world's top soccer players' transfer fees based on the FIFA data analysis. To predict each player's market value, we propose an improved LightGBM model by optimizing its hyperparameter using a Tree-structured Parzen Estimator (TPE) algorithm. We identify prominent features by the SHapley Additive exPlanations (SHAP) algorithm. The proposed method has been compared against the baseline regression models (e.g., linear regression, lasso, elastic net, kernel ridge regression) and gradient boosting model without hyperparameter optimization. The optimized LightGBM model showed an excellent accuracy of approximately 3.8, 1.4, and 1.8 times on average compared to the regression baseline models, GBDT, and LightGBM model in terms of RMSE. Our model offers interpretability in deciding what attributes football clubs should consider in recruiting players in the future.
Submitted 24 June, 2022;
originally announced June 2022.
-
The Conflict Between Explainable and Accountable Decision-Making Algorithms
Authors:
Gabriel Lima,
Nina Grgić-Hlača,
** Keun Jeong,
Meeyoung Cha
Abstract:
Decision-making algorithms are being used in important decisions, such as who should be enrolled in health care programs and be hired. Even though these systems are currently deployed in high-stakes scenarios, many of them cannot explain their decisions. This limitation has prompted the Explainable Artificial Intelligence (XAI) initiative, which aims to make algorithms explainable to comply with legal requirements, promote trust, and maintain accountability. This paper questions whether and to what extent explainability can help solve the responsibility issues posed by autonomous AI systems. We suggest that XAI systems that provide post-hoc explanations could be seen as blameworthy agents, obscuring the responsibility of developers in the decision-making process. Furthermore, we argue that XAI could result in incorrect attributions of responsibility to vulnerable stakeholders, such as those who are subjected to algorithmic decisions (i.e., patients), due to a misguided perception that they have control over explainable algorithms. This conflict between explainability and accountability can be exacerbated if designers choose to use algorithms and patients as moral and legal scapegoats. We conclude with a set of recommendations for how to approach this tension in the socio-technical process of algorithmic decision-making and a defense of hard regulation to prevent designers from escaping responsibility.
Submitted 11 May, 2022;
originally announced May 2022.
-
Learning Economic Indicators by Aggregating Multi-Level Geospatial Information
Authors:
Sungwon Park,
Sungwon Han,
Donghyun Ahn,
Jaeyeon Kim,
Jeasurk Yang,
Susang Lee,
Seunghoon Hong,
Jihee Kim,
Sangyoon Park,
Hyunjoo Yang,
Meeyoung Cha
Abstract:
High-resolution daytime satellite imagery has become a promising source to study economic activities. These images display detailed terrain over large areas and allow zooming into smaller neighborhoods. Existing methods, however, have utilized images only in a single-level geographical unit. This research presents a deep learning model to predict economic indicators via aggregating traits observed from multiple levels of geographical units. The model first measures hyperlocal economy over small communities via ordinal regression. The next step extracts district-level features by summarizing interconnection among hyperlocal economies. In the final step, the model estimates economic indicators of districts via aggregating the hyperlocal and district information. Our new multi-level learning model substantially outperforms strong baselines in predicting key indicators such as population, purchasing power, and energy consumption. The model is also robust against data shortage; the trained features from one country can generalize to other countries when evaluated with data gathered from Malaysia, the Philippines, Thailand, and Vietnam. We discuss the multi-level model's implications for measuring inequality, which is the essential first step in policy and social science research on inequality and poverty.
Submitted 3 May, 2022;
originally announced May 2022.
-
Design of Blockchain-based Travel Rule Compliance System
Authors:
Chaehyeon Lee,
Changhoon Kang,
Wonseok Choi,
Jehoon Lee,
Myunghun Cha,
Jongsoo Woo,
James Won-Ki Hong
Abstract:
In accordance with the guidelines of the Financial Action Task Force (FATF), Virtual Asset Service Providers (VASPs) should comply with a 'travel rule', which requires them to exchange the originator's and beneficiary's personal information when transferring virtual assets. In this paper, we propose a novel blockchain-based travel rule compliance system that supports fully-decentralized data exchange. The proposed system uses a permissioned blockchain, and thereby eliminates the possibility of leakage of personal information to third parties or even to travel rule service providers, and ensures that travel rule data can be managed securely.
Submitted 28 April, 2022;
originally announced April 2022.
-
EdgeKeeper: Resilient and Lightweight Coordination for Mobile Edge Computing Systems
Authors:
S. Bhunia,
R. Stoleru,
M. Sagor,
A. Haroon,
A. Altaweel,
M. Chao,
M. Maurice,
R. Blalock
Abstract:
Mobile Edge Computing (MEC) has been gaining significant interest from first responders and tactical teams, primarily because they can employ handheld mobile devices to form a computing cluster (for computing tasks like face/scene recognition, virtual assistance) when connectivity to the cloud is not present or it is limited. High user mobility in first responder or tactical environments makes MEC challenging, as wireless links observe substantial fluctuations. Typical cloud-based coordination (e.g., ZooKeeper-based service discovery and coordination, device naming, security) needed by edge computing tasks cannot work in these environments. Driven by the need for a resilient and lightweight coordination service, in this paper, we design and implement EdgeKeeper to provide cloud-like coordination for MEC systems. It provides naming, network management, application coordination, and security to distributed edge computing applications. It maintains an edge cluster among devices and intelligently stores its data on a group of replicas to guard against node failure and disconnections. We provide a full-system implementation of EdgeKeeper for Android and Linux platforms. We have integrated EdgeKeeper with existing MEC applications and performed real-world performance evaluations in a wide-area search and rescue operation conducted by first responders, which proves it to be lightweight and suitable for mobile devices.
Submitted 23 April, 2022;
originally announced April 2022.
-
R-Drive: Resilient Data Storage and Sharing for Mobile Edge Computing Systems
Authors:
M. Sagor,
R. Stoleru,
S. Bhunia,
M. Chao,
A. Haroon,
A. Altaweel,
M. Maurice,
R. Blalock
Abstract:
Mobile edge computing (MEC) systems (in which intensive computation and data storage tasks are performed locally, due to the absence of communication infrastructure for connectivity to the cloud) are currently being developed for disaster response applications and for tactical environments. MEC applications for these scenarios generate and process significant mission-critical and personal data that require resilient and secure storage and sharing. In this paper, we present the design, implementation, and evaluation of R-Drive, a resilient data storage and sharing framework for disaster response and tactical MEC applications. R-Drive employs erasure coding and data encryption, ensuring resilient and secure data storage against device failure. R-Drive adaptively chooses erasure coding parameters to ensure the highest data availability with a minimal storage cost. R-Drive's distributed directory service provides a resilient and secure namespace for files with rigorous access control management. R-Drive leverages opportunistic networking, allowing data storage and sharing in mobile and loosely connected edge computing environments. We implemented R-Drive on Android, and integrated it with existing MEC applications. Performance evaluation results show that R-Drive enables resilient and secure data storage and sharing.
Submitted 22 April, 2022;
originally announced April 2022.
-
MultiEarth 2022 -- Multimodal Learning for Earth and Environment Workshop and Challenge
Authors:
Miriam Cha,
Kuan Wei Huang,
Morgan Schmidt,
Gregory Angelides,
Mark Hamilton,
Sam Goldberg,
Armando Cabrera,
Phillip Isola,
Taylor Perron,
Bill Freeman,
Yen-Chen Lin,
Brandon Swenson,
Jean Piou
Abstract:
The Multimodal Learning for Earth and Environment Challenge (MultiEarth 2022) will be the first competition aimed at the monitoring and analysis of deforestation in the Amazon rainforest at any time and in any weather conditions. The goal of the Challenge is to provide a common benchmark for multimodal information processing and to bring together the earth and environmental science communities as well as multimodal representation learning communities to compare the relative merits of the various multimodal learning methods to deforestation estimation under well-defined and strictly comparable conditions. MultiEarth 2022 will have three sub-challenges: 1) matrix completion, 2) deforestation estimation, and 3) image-to-image translation. This paper presents the challenge guidelines, datasets, and evaluation metrics for the three sub-challenges. Our challenge website is available at https://sites.google.com/view/rainforest-challenge.
Submitted 31 May, 2022; v1 submitted 15 April, 2022;
originally announced April 2022.
-
Rates of convergence for nonparametric estimation of singular distributions using generative adversarial networks
Authors:
Minwoo Chae
Abstract:
We consider generative adversarial networks (GAN) for estimating parameters in a deep generative model. The data-generating distribution is assumed to concentrate around some low-dimensional structure, making the target distribution singular to the Lebesgue measure. Under this assumption, we obtain convergence rates of a GAN type estimator with respect to the Wasserstein metric. The convergence rate depends only on the noise level, intrinsic dimension and smoothness of the underlying structure. Furthermore, the rate is faster than that obtained by likelihood approaches, which provides insights into why GAN approaches perform better in many real problems. A lower bound of the minimax optimal rate is also investigated.
Submitted 6 February, 2022;
originally announced February 2022.
-
Knowledge Sharing via Domain Adaptation in Customs Fraud Detection
Authors:
Sungwon Park,
Sundong Kim,
Meeyoung Cha
Abstract:
Knowledge of the changing traffic is critical in risk management. Customs offices worldwide have traditionally relied on local resources to accumulate knowledge and detect tax fraud. This naturally poses countries with weak infrastructure to become tax havens of potentially illicit trades. The current paper proposes DAS, a memory bank platform to facilitate knowledge sharing across multi-national customs administrations to support each other. We propose a domain adaptation method to share transferable knowledge of frauds as prototypes while safeguarding the local trade information. Data encompassing over 8 million import declarations have been used to test the feasibility of this new system, which shows that participating countries may benefit up to 2-11 times in fraud detection with the help of shared knowledge. We discuss implications for substantial tax revenue potential and strengthened policy against illicit trades.
Submitted 18 January, 2022;
originally announced January 2022.