Showing 1–50 of 127 results for author: Cha, S

Searching in archive cs.
  1. arXiv:2406.03450  [pdf, other]

    cs.CL cs.AI

    What is the Best Way for ChatGPT to Translate Poetry?

    Authors: Shanshan Wang, Derek F. Wong, Jingming Yao, Lidia S. Chao

    Abstract: Machine translation (MT) has historically faced significant challenges when applied to literary works, particularly in the domain of poetry translation. The advent of Large Language Models such as ChatGPT holds potential for innovation in this field. This study examines ChatGPT's capabilities in English-Chinese poetry translation tasks, utilizing targeted prompts and small sample scenarios to asce…

    Submitted 5 June, 2024; originally announced June 2024.

    Comments: 19 pages, 1 figure. The paper has been accepted by ACL 2024 (Main Conference)

  2. arXiv:2406.00839  [pdf, other]

    cs.CL cs.AI

    FOCUS: Forging Originality through Contrastive Use in Self-Plagiarism for Language Models

    Authors: Kaixin Lan, Tao Fang, Derek F. Wong, Yabo Xu, Lidia S. Chao, Cecilia G. Zhao

    Abstract: Pre-trained Language Models (PLMs) have shown impressive results in various Natural Language Generation (NLG) tasks, such as powering chatbots and generating stories. However, an ethical concern arises due to their potential to produce verbatim copies of paragraphs from their training data. This is problematic as PLMs are trained on corpora constructed by human authors. As such, there is a pressin…

    Submitted 2 June, 2024; originally announced June 2024.

    Comments: 16 pages, 8 figures. The paper has been accepted by ACL 2024 (Findings), with Kaixin Lan and Tao Fang contributing equally, and Derek F. Wong serving as the corresponding author

  3. arXiv:2405.19902  [pdf, other]

    cs.LG stat.ML

    Learning Discriminative Dynamics with Label Corruption for Noisy Label Detection

    Authors: Suyeon Kim, Dongha Lee, SeongKu Kang, Sukang Chae, Sanghwan Jang, Hwanjo Yu

    Abstract: Label noise, commonly found in real-world datasets, has a detrimental impact on a model's generalization. To effectively detect incorrectly labeled instances, previous works have mostly relied on distinguishable training signals, such as training loss, as indicators to differentiate between clean and noisy labels. However, they have limitations in that the training signals incompletely reveal the…

    Submitted 30 May, 2024; originally announced May 2024.

    Comments: Accepted to CVPR 2024

  4. arXiv:2405.13396  [pdf, other]

    cs.LG stat.ML

    Why In-Context Learning Transformers are Tabular Data Classifiers

    Authors: Felix den Breejen, Sangmin Bae, Stephen Cha, Se-Young Yun

    Abstract: The recently introduced TabPFN pretrains an In-Context Learning (ICL) transformer on synthetic data to perform tabular data classification. As synthetic data does not share features or labels with real-world data, the underlying mechanism that contributes to the success of this method remains unclear. This study provides an explanation by demonstrating that ICL-transformers acquire the ability to…

    Submitted 22 May, 2024; originally announced May 2024.

    Comments: 9 pages main body, 22 pages total. Preprint under review

  5. arXiv:2405.09858  [pdf, other]

    cs.CV cs.LG

    Towards Realistic Incremental Scenario in Class Incremental Semantic Segmentation

    Authors: Jihwan Kwak, Sungmin Cha, Taesup Moon

    Abstract: This paper addresses the unrealistic aspect of the commonly adopted Continuous Incremental Semantic Segmentation (CISS) scenario, termed overlapped. We point out that overlapped allows the same image to reappear in future tasks with different pixel labels, which is far from practical incremental learning scenarios. Moreover, we identified that this flawed scenario may lead to biased results for tw…

    Submitted 16 May, 2024; originally announced May 2024.

  6. arXiv:2405.05749  [pdf, other]

    cs.CV

    NeRFFaceSpeech: One-shot Audio-driven 3D Talking Head Synthesis via Generative Prior

    Authors: Gihoon Kim, Kwanggyoon Seo, Sihun Cha, Junyong Noh

    Abstract: Audio-driven talking head generation is advancing from 2D to 3D content. Notably, Neural Radiance Field (NeRF) is in the spotlight as a means to synthesize high-quality 3D talking head outputs. Unfortunately, this NeRF-based approach typically requires a large amount of paired audio-visual data for each identity, thereby limiting the scalability of the method. Although there have been attempts to…

    Submitted 10 May, 2024; v1 submitted 9 May, 2024; originally announced May 2024.

    Comments: 11 pages, 5 figures

  7. arXiv:2405.04286  [pdf, other]

    cs.CL

    Who Wrote This? The Key to Zero-Shot LLM-Generated Text Detection Is GECScore

    Authors: Junchao Wu, Runzhe Zhan, Derek F. Wong, Shu Yang, Xuebo Liu, Lidia S. Chao, Min Zhang

    Abstract: The efficacy of a large language model (LLM) generated text detector depends substantially on the availability of sizable training data. White-box zero-shot detectors, which require no such data, are nonetheless limited by the accessibility of the source model of the LLM-generated text. In this paper, we propose a simple but effective black-box zero-shot detection approach, predicated on the obs…

    Submitted 7 May, 2024; originally announced May 2024.

  8. arXiv:2405.02925  [pdf, other]

    cs.CL

    A Two-Stage Prediction-Aware Contrastive Learning Framework for Multi-Intent NLU

    Authors: Guanhua Chen, Yutong Yao, Derek F. Wong, Lidia S. Chao

    Abstract: Multi-intent natural language understanding (NLU) presents a formidable challenge due to the model confusion arising from multiple intents within a single utterance. While previous works train the model contrastively to increase the margin between different multi-intent labels, they are less suited to the nuances of multi-intent NLU. They ignore the rich information between the shared intents, whi…

    Submitted 5 May, 2024; originally announced May 2024.

    Comments: LREC-COLING 2024

  9. arXiv:2404.18413  [pdf, other]

    cs.CV cs.AI

    3AM: An Ambiguity-Aware Multi-Modal Machine Translation Dataset

    Authors: Xinyu Ma, Xuebo Liu, Derek F. Wong, Jun Rao, Bei Li, Liang Ding, Lidia S. Chao, Dacheng Tao, Min Zhang

    Abstract: Multimodal machine translation (MMT) is a challenging task that seeks to improve translation quality by incorporating visual information. However, recent studies have indicated that the visual information provided by existing MMT datasets is insufficient, causing models to disregard it and overestimate their capabilities. This issue presents a significant obstacle to the development of MMT researc…

    Submitted 29 April, 2024; originally announced April 2024.

  10. arXiv:2404.16766  [pdf, other]

    cs.CL cs.AI

    Prefix Text as a Yarn: Eliciting Non-English Alignment in Foundation Language Model

    Authors: Runzhe Zhan, Xinyi Yang, Derek F. Wong, Lidia S. Chao, Yue Zhang

    Abstract: While supervised fine-tuning (SFT) has been a straightforward approach for tailoring the output of foundation large language model (LLM) to specific preferences, concerns have been raised about the depth of this alignment, with some critiques suggesting it is merely "superficial". We critically examine this hypothesis within the scope of cross-lingual generation tasks, proposing that the effective…

    Submitted 25 April, 2024; originally announced April 2024.

  11. arXiv:2404.09475  [pdf, other]

    cs.CV cs.AI

    Improving Weakly-Supervised Object Localization Using Adversarial Erasing and Pseudo Label

    Authors: Byeongkeun Kang, Sinhae Cha, Yeejin Lee

    Abstract: Weakly-supervised learning approaches have gained significant attention due to their ability to reduce the effort required for human annotations in training neural networks. This paper investigates a framework for weakly-supervised object localization, which aims to train a neural network capable of predicting both the object class and its location using only images and their image-level class lab…

    Submitted 15 April, 2024; originally announced April 2024.

    Comments: 15 pages

    Journal ref: Engineering Applications of Artificial Intelligence, 2024

  12. arXiv:2404.08327  [pdf, other]

    cs.CV

    Salience-Based Adaptive Masking: Revisiting Token Dynamics for Enhanced Pre-training

    Authors: Hyesong Choi, Hyejin Park, Kwang Moo Yi, Sungmin Cha, Dongbo Min

    Abstract: In this paper, we introduce Saliency-Based Adaptive Masking (SBAM), a novel and cost-effective approach that significantly enhances the pre-training performance of Masked Image Modeling (MIM) approaches by prioritizing token salience. Our method provides robustness against variations in masking ratios, effectively mitigating the performance instability issues common in existing methods. This relax…

    Submitted 12 April, 2024; originally announced April 2024.

  13. arXiv:2403.15227  [pdf, other]

    cs.CV cs.GR

    LeGO: Leveraging a Surface Deformation Network for Animatable Stylized Face Generation with One Example

    Authors: Soyeon Yoon, Kwan Yun, Kwanggyoon Seo, Sihun Cha, Jung Eun Yoo, Junyong Noh

    Abstract: Recent advances in 3D face stylization have made significant strides in few to zero-shot settings. However, the degree of stylization achieved by existing methods is often not sufficient for practical applications because they are mostly based on statistical 3D Morphable Models (3DMM) with limited variations. To this end, we propose a method that can produce a highly stylized 3D face model with de…

    Submitted 22 March, 2024; originally announced March 2024.

    Comments: 8 pages

    MSC Class: 68T45; ACM Class: I.4.9

  14. arXiv:2403.13680  [pdf, other]

    eess.IV cs.CV

    Step-Calibrated Diffusion for Biomedical Optical Image Restoration

    Authors: Yiwei Lyu, Sung Jik Cha, Cheng Jiang, Asadur Chowdury, Xinhai Hou, Edward Harake, Akhil Kondepudi, Christian Freudiger, Honglak Lee, Todd C. Hollon

    Abstract: High-quality, high-resolution medical imaging is essential for clinical care. Raman-based biomedical optical imaging uses non-ionizing infrared radiation to evaluate human tissues in real time and is used for early cancer detection, brain tumor diagnosis, and intraoperative tissue analysis. Unfortunately, optical imaging is vulnerable to image degradation due to laser scattering and absorption, wh…

    Submitted 16 May, 2024; v1 submitted 20 March, 2024; originally announced March 2024.

  15. arXiv:2403.11621  [pdf, other]

    cs.CL

    Let's Focus on Neuron: Neuron-Level Supervised Fine-tuning for Large Language Model

    Authors: Haoyun Xu, Runzhe Zhan, Derek F. Wong, Lidia S. Chao

    Abstract: Large Language Models (LLMs) are composed of neurons that exhibit various behaviors and roles, which become increasingly diversified as models scale. Recent studies have revealed that not all neurons are active across different datasets, and this sparsity correlates positively with the task-specific ability, leading to advancements in model pruning and training efficiency. Traditional fine-tuning…

    Submitted 18 March, 2024; originally announced March 2024.

  16. arXiv:2403.09066  [pdf, other]

    cs.LG cs.CV

    Hyperparameters in Continual Learning: a Reality Check

    Authors: Sungmin Cha, Kyunghyun Cho

    Abstract: Various algorithms for continual learning (CL) have been designed with the goal of effectively alleviating the trade-off between stability and plasticity during the CL process. To achieve this goal, tuning appropriate hyperparameters for each algorithm is essential. As an evaluation protocol, it has been common practice to train a CL algorithm using diverse hyperparameter values on a CL scenario c…

    Submitted 13 March, 2024; originally announced March 2024.

    Comments: Preprint

  17. arXiv:2402.15923  [pdf, other]

    cs.LG cs.AI cs.MM

    Predicting Outcomes in Video Games with Long Short Term Memory Networks

    Authors: Kittimate Chulajata, Sean Wu, Fabien Scalzo, Eun Sang Cha

    Abstract: Forecasting winners in E-sports with real-time analytics has the potential to further engage audiences watching major tournament events. However, making such real-time predictions is challenging due to unpredictable variables within the game involving diverse player strategies and decision-making. Our work attempts to enhance audience engagement within video game tournaments by introducing a real-…

    Submitted 24 February, 2024; originally announced February 2024.

    Comments: 7 pages, 2 Figures, 2 Tables. Kittimate Chulajata and Sean Wu are considered co-first authors

  18. arXiv:2402.15188  [pdf, other]

    cs.LG math.OC

    Parameter-Free Algorithms for Performative Regret Minimization under Decision-Dependent Distributions

    Authors: Sungwoo Park, Junyeop Kwon, Byeongnoh Kim, Suhyun Chae, Jeeyong Lee, Dabeen Lee

    Abstract: This paper studies performative risk minimization, a formulation of stochastic optimization under decision-dependent distributions. We consider the general case where the performative risk can be non-convex, for which we develop efficient parameter-free optimistic optimization-based methods. Our algorithms significantly improve upon the existing Lipschitz bandit-based method in many aspects. In pa…

    Submitted 23 February, 2024; originally announced February 2024.

  19. arXiv:2402.09717  [pdf, other]

    cs.CV

    Visually Dehallucinative Instruction Generation: Know What You Don't Know

    Authors: Sungguk Cha, Jusung Lee, Younghyun Lee, Cheoljong Yang

    Abstract: "When did the emperor Napoleon invented iPhone?" Such a hallucination-inducing question is a well-known challenge in generative language modeling. In this study, we present an innovative concept of visual hallucination, referred to as "I Know (IK)" hallucination, to address scenarios where "I Don't Know" is the desired response. To effectively tackle this issue, we propose the VQAv2-IDK benchmark, the…

    Submitted 15 February, 2024; originally announced February 2024.

  20. arXiv:2402.08360  [pdf, other]

    cs.CV

    Visual Question Answering Instruction: Unlocking Multimodal Large Language Model To Domain-Specific Visual Multitasks

    Authors: Jusung Lee, Sungguk Cha, Younghyun Lee, Cheoljong Yang

    Abstract: Having revolutionized natural language processing (NLP) applications, large language models (LLMs) are expanding into the realm of multimodal inputs. Owing to their ability to interpret images, multimodal LLMs (MLLMs) have been primarily used for vision-language tasks. Currently, MLLMs have not yet been extended for domain-specific visual tasks, which require a more explicit understanding of visua…

    Submitted 13 February, 2024; originally announced February 2024.

  21. arXiv:2402.08348  [pdf, other]

    cs.CV

    Visually Dehallucinative Instruction Generation

    Authors: Sungguk Cha, Jusung Lee, Younghyun Lee, Cheoljong Yang

    Abstract: In recent years, synthetic visual instructions produced by generative language models have demonstrated plausible text generation performance on visual question-answering tasks. However, challenges persist in the hallucination of generative language models, i.e., the generated image-text data contains unintended content. This paper presents a novel and scalable method for generating visually dehallucin…

    Submitted 13 February, 2024; originally announced February 2024.

    Comments: Accepted at ICASSP 2024

  22. arXiv:2312.09584  [pdf, other]

    cs.CV cs.AI

    Multiscale Vision Transformer With Deep Clustering-Guided Refinement for Weakly Supervised Object Localization

    Authors: David Kim, Sinhae Cha, Byeongkeun Kang

    Abstract: This work addresses the task of weakly-supervised object localization. The goal is to learn object localization using only image-level class labels, which are much easier to obtain compared to bounding box annotations. This task is important because it reduces the need for labor-intensive ground-truth annotations. However, methods for object localization trained using weak supervision often suffer…

    Submitted 15 December, 2023; originally announced December 2023.

    Comments: 5 pages

    Journal ref: IEEE International Conference on Visual Communications and Image Processing, 2023

  23. arXiv:2311.17878  [pdf, other]

    cs.CV

    TSDF-Sampling: Efficient Sampling for Neural Surface Field using Truncated Signed Distance Field

    Authors: Chaerin Min, Sehyun Cha, Changhee Won, Jongwoo Lim

    Abstract: Multi-view neural surface reconstruction has exhibited impressive results. However, a notable limitation is the prohibitively slow inference time when compared to traditional techniques, primarily attributed to the dense sampling required to maintain the rendering quality. This paper introduces a novel approach that substantially reduces the number of samplings by incorporating the Truncated Sign…

    Submitted 29 November, 2023; originally announced November 2023.

  24. arXiv:2311.07343  [pdf, other]

    cs.LG

    Fine-Tuning the Retrieval Mechanism for Tabular Deep Learning

    Authors: Felix den Breejen, Sangmin Bae, Stephen Cha, Tae-Young Kim, Seoung Hyun Koh, Se-Young Yun

    Abstract: While interest in tabular deep learning has grown significantly, conventional tree-based models still outperform deep learning methods. To narrow this performance gap, we explore the innovative retrieval mechanism, a methodology that allows neural networks to refer to other data points while making predictions. Our experiments reveal that retrieval-based training, especially when fine-tuning the…

    Submitted 13 November, 2023; originally announced November 2023.

    Comments: Table Representation Learning Workshop at NeurIPS 2023

  25. arXiv:2310.14724  [pdf, other]

    cs.CL cs.AI

    A Survey on LLM-Generated Text Detection: Necessity, Methods, and Future Directions

    Authors: Junchao Wu, Shu Yang, Runzhe Zhan, Yulin Yuan, Derek F. Wong, Lidia S. Chao

    Abstract: The powerful ability to understand, follow, and generate complex language emerging from large language models (LLMs) makes LLM-generated text flood many areas of our daily lives at an incredible speed and is widely accepted by humans. As LLMs continue to expand, there is an imperative need to develop detectors that can detect LLM-generated text. This is crucial to mitigate potential misuse of LLMs…

    Submitted 19 April, 2024; v1 submitted 23 October, 2023; originally announced October 2023.

  26. arXiv:2310.11479  [pdf, other]

    cs.LG stat.ML

    On the Temperature of Bayesian Graph Neural Networks for Conformal Prediction

    Authors: Seohyeon Cha, Honggu Kang, Joonhyuk Kang

    Abstract: Accurate uncertainty quantification in graph neural networks (GNNs) is essential, especially in high-stakes domains where GNNs are frequently employed. Conformal prediction (CP) offers a promising framework for quantifying uncertainty by providing $\textit{valid}$ prediction sets for any black-box model. CP ensures formal probabilistic guarantees that a prediction set contains a true label with a…

    Submitted 3 December, 2023; v1 submitted 17 October, 2023; originally announced October 2023.

  27. arXiv:2310.08908  [pdf, other]

    cs.CL

    Human-in-the-loop Machine Translation with Large Language Model

    Authors: Xinyi Yang, Runzhe Zhan, Derek F. Wong, Junchao Wu, Lidia S. Chao

    Abstract: The large language model (LLM) has garnered significant attention due to its in-context learning mechanisms and emergent capabilities. The research community has conducted several pilot studies to apply LLMs to machine translation tasks and evaluate their performance from diverse perspectives. However, previous research has primarily focused on the LLM itself and has not explored human interventio…

    Submitted 13 October, 2023; originally announced October 2023.

    Comments: Accepted to MT Summit 2023

  28. arXiv:2308.07761  [pdf, other]

    cs.LG cs.AI

    NeFL: Nested Federated Learning for Heterogeneous Clients

    Authors: Honggu Kang, Seohyeon Cha, Jinwoo Shin, Jongmyeong Lee, Joonhyuk Kang

    Abstract: Federated learning (FL) is a promising approach in distributed learning keeping privacy. However, during the training pipeline of FL, slow or incapable clients (i.e., stragglers) slow down the total training time and degrade performance. System heterogeneity, including heterogeneous computing and network bandwidth, has been addressed to mitigate the impact of stragglers. Previous studies tackle th…

    Submitted 9 October, 2023; v1 submitted 15 August, 2023; originally announced August 2023.

  29. arXiv:2306.12978  [pdf, other]

    cs.IT eess.SP

    Rate-Splitting Multiple Access for 6G Networks: Ten Promising Scenarios and Applications

    Authors: Jeonghun Park, Byungju Lee, Jinseok Choi, Hoon Lee, Namyoon Lee, Seok-Hwan Park, Kyoung-Jae Lee, Junil Choi, Sung Ho Chae, Sang-Woon Jeon, Kyung Sup Kwak, Bruno Clerckx, Wonjae Shin

    Abstract: In the upcoming 6G era, multiple access (MA) will play an essential role in achieving high throughput performances required in a wide range of wireless applications. Since MA and interference management are closely related issues, the conventional MA techniques are limited in that they cannot provide near-optimal performance in universal interference regimes. Recently, rate-splitting multiple acce…

    Submitted 22 June, 2023; originally announced June 2023.

    Comments: 17 pages, 6 figures, submitted to IEEE Network Magazine

  30. arXiv:2306.05101  [pdf, other]

    cs.LG

    Regularizing with Pseudo-Negatives for Continual Self-Supervised Learning

    Authors: Sungmin Cha, Kyunghyun Cho, Taesup Moon

    Abstract: We introduce a novel Pseudo-Negative Regularization (PNR) framework for effective continual self-supervised learning (CSSL). Our PNR leverages pseudo-negatives obtained through model-based augmentation in a way that newly learned representations may not contradict what has been learned in the past. Specifically, for the InfoNCE-based contrastive learning methods, we define symmetric pseudo-negativ…

    Submitted 7 June, 2024; v1 submitted 8 June, 2023; originally announced June 2023.

    Comments: ICML 2024 camera-ready version

  31. arXiv:2305.09011  [pdf, other]

    eess.IV cs.CV

    The Brain Tumor Segmentation (BraTS) Challenge 2023: Brain MR Image Synthesis for Tumor Segmentation (BraSyn)

    Authors: Hongwei Bran Li, Gian Marco Conte, Syed Muhammad Anwar, Florian Kofler, Ivan Ezhov, Koen van Leemput, Marie Piraud, Maria Diaz, Byrone Cole, Evan Calabrese, Jeff Rudie, Felix Meissen, Maruf Adewole, Anastasia Janas, Anahita Fathi Kazerooni, Dominic LaBella, Ahmed W. Moawad, Keyvan Farahani, James Eddy, Timothy Bergquist, Verena Chung, Russell Takeshi Shinohara, Farouk Dako, Walter Wiggins, Zachary Reitman, et al. (43 additional authors not shown)

    Abstract: Automated brain tumor segmentation methods have become well-established and reached performance levels offering clear clinical utility. These methods typically rely on four input magnetic resonance imaging (MRI) modalities: T1-weighted images with and without contrast enhancement, T2-weighted images, and FLAIR images. However, some sequences are often missing in clinical practice due to time const…

    Submitted 28 June, 2023; v1 submitted 15 May, 2023; originally announced May 2023.

    Comments: Technical report of BraSyn

  32. arXiv:2305.08992  [pdf, other]

    eess.IV cs.CV cs.LG

    The Brain Tumor Segmentation (BraTS) Challenge 2023: Local Synthesis of Healthy Brain Tissue via Inpainting

    Authors: Florian Kofler, Felix Meissen, Felix Steinbauer, Robert Graf, Eva Oswald, Ezequiel de la Rosa, Hongwei Bran Li, Ujjwal Baid, Florian Hoelzl, Oezguen Turgut, Izabela Horvath, Diana Waldmannstetter, Christina Bukas, Maruf Adewole, Syed Muhammad Anwar, Anastasia Janas, Anahita Fathi Kazerooni, Dominic LaBella, Ahmed W Moawad, Keyvan Farahani, James Eddy, Timothy Bergquist, Verena Chung, Russell Takeshi Shinohara, Farouk Dako, et al. (43 additional authors not shown)

    Abstract: A myriad of algorithms for the automatic analysis of brain MR images is available to support clinicians in their decision-making. For brain tumor patients, the image acquisition time series typically starts with a scan that is already pathological. This poses problems, as many algorithms are designed to analyze healthy brains and provide no guarantees for images featuring lesions. Examples include…

    Submitted 9 August, 2023; v1 submitted 15 May, 2023; originally announced May 2023.

    Comments: 5 pages, 1 figure

  33. arXiv:2305.01951  [pdf, other]

    cs.CL

    Can LMs Generalize to Future Data? An Empirical Analysis on Text Summarization

    Authors: Chi Seng Cheang, Hou Pong Chan, Derek F. Wong, Xuebo Liu, Zhaocong Li, Yanming Sun, Shudong Liu, Lidia S. Chao

    Abstract: Recent pre-trained language models (PLMs) achieve promising results in existing abstractive summarization datasets. However, existing summarization benchmarks overlap in time with the standard pre-training corpora and finetuning datasets. Hence, the strong performance of PLMs may rely on the parametric knowledge that is memorized during pre-training and fine-tuning. Moreover, the knowledge memoriz…

    Submitted 2 November, 2023; v1 submitted 3 May, 2023; originally announced May 2023.

    Comments: Accepted at EMNLP 2023

  34. arXiv:2305.00936  [pdf, other]

    cs.CV cs.GR

    Generating Texture for 3D Human Avatar from a Single Image using Sampling and Refinement Networks

    Authors: Sihun Cha, Kwanggyoon Seo, Amirsaman Ashtari, Junyong Noh

    Abstract: There has been significant progress in generating an animatable 3D human avatar from a single image. However, recovering texture for the 3D human avatar from a single image has been relatively less addressed. Because the generated 3D human avatar reveals the occluded texture of the given image as it moves, it is critical to synthesize the occluded texture pattern that is unseen from the source ima…

    Submitted 1 May, 2023; originally announced May 2023.

  35. arXiv:2304.01746  [pdf, other]

    cs.CL

    Is ChatGPT a Highly Fluent Grammatical Error Correction System? A Comprehensive Evaluation

    Authors: Tao Fang, Shu Yang, Kaixin Lan, Derek F. Wong, Jinpeng Hu, Lidia S. Chao, Yue Zhang

    Abstract: ChatGPT, a large-scale language model based on the advanced GPT-3.5 architecture, has shown remarkable potential in various Natural Language Processing (NLP) tasks. However, there is currently a dearth of comprehensive study exploring its potential in the area of Grammatical Error Correction (GEC). To showcase its capabilities in GEC, we design zero-shot chain-of-thought (CoT) and few-shot CoT set…

    Submitted 4 April, 2023; originally announced April 2023.

  36. arXiv:2303.11763  [pdf, other]

    cs.IT eess.SP

    Reconfigurable Intelligent Surface Aided Hybrid Beamforming: Optimal Placement and Beamforming Design

    Authors: Najam Us Saqib, Shumei Hou, Sung Ho Chae, Sang-Woon Jeon

    Abstract: We consider reconfigurable intelligent surface (RIS) aided sixth-generation (6G) terahertz (THz) communications for indoor environment in which a base station (BS) wishes to send independent messages to its serving users with the help of multiple RISs. For indoor environment, various obstacles such as pillars, walls, and other objects can result in no line-of-sight signal path between the BS and a…

    Submitted 21 March, 2023; originally announced March 2023.

    Comments: This manuscript contains 18 pages and 9 figures

  37. arXiv:2302.05549  [pdf, other]

    stat.ME cs.DC

    Balancing Approach for Causal Inference at Scale

    Authors: Sicheng Lin, Meng Xu, Xi Zhang, Shih-Kang Chao, Ying-Kai Huang, Xiaolin Shi

    Abstract: With modern software and online platforms collecting massive amounts of data, there is an increasing demand for applying causal inference methods at large scale when randomized experimentation is not viable. Weighting methods that directly incorporate covariate balancing have recently gained popularity for estimating causal effects in observational studies. These methods reduce the manual effort…

    Submitted 3 August, 2023; v1 submitted 10 February, 2023; originally announced February 2023.

    Comments: KDD '23: Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining

  38. arXiv:2301.11578  [pdf, other]

    cs.LG

    Learning to Unlearn: Instance-wise Unlearning for Pre-trained Classifiers

    Authors: Sungmin Cha, Sungjun Cho, Dasol Hwang, Honglak Lee, Taesup Moon, Moontae Lee

    Abstract: Since the recent advent of regulations for data protection (e.g., the General Data Protection Regulation), there has been increasing demand for deleting information learned from sensitive data in pre-trained models without retraining from scratch. The inherent vulnerability of neural networks towards adversarial attacks and unfairness also calls for a robust method to remove or correct information…

    Submitted 15 January, 2024; v1 submitted 27 January, 2023; originally announced January 2023.

    Comments: AAAI 2024 camera ready version

  39. arXiv:2212.08817  [pdf, other]

    cs.AI

    Accurate Open-set Recognition for Memory Workload

    Authors: Jun-Gi Jang, Sooyeon Shim, Vladimir Egay, Jeeyong Lee, Jongmin Park, Suhyun Chae, U Kang

    Abstract: How can we accurately identify new memory workloads while classifying known memory workloads? Verifying DRAM (Dynamic Random Access Memory) using various workloads is an important task to guarantee the quality of DRAM. A crucial component in the process is open-set recognition which aims to detect new workloads not seen in the training phase. Despite its importance, however, existing open-set reco…

    Submitted 17 December, 2022; originally announced December 2022.

    Comments: 15 pages, 5 figures

  40. arXiv:2212.04262  [pdf, other]

    cs.CL cs.AI cs.LG

    ConsistTL: Modeling Consistency in Transfer Learning for Low-Resource Neural Machine Translation

    Authors: Zhaocong Li, Xuebo Liu, Derek F. Wong, Lidia S. Chao, Min Zhang

    Abstract: Transfer learning is a simple and powerful method that can be used to boost model performance of low-resource neural machine translation (NMT). Existing transfer learning methods for NMT are static, which simply transfer knowledge from a parent model to a child model once via parameter initialization. In this paper, we propose a novel transfer learning method for NMT, namely ConsistTL, which can c…

    Submitted 8 December, 2022; originally announced December 2022.

    Comments: Accepted to EMNLP 2022

  41. arXiv:2211.01548  [pdf, other]

    cs.LG cs.AI

    INGREX: An Interactive Explanation Framework for Graph Neural Networks

    Authors: Tien-Cuong Bui, Van-Duc Le, Wen-Syan Li, Sang Kyun Cha

    Abstract: Graph Neural Networks (GNNs) are widely used in many modern applications, necessitating explanations for their decisions. However, the complexity of GNNs makes it difficult to explain predictions. Even though several methods have been proposed lately, they can only provide simple and static explanations, which are difficult for users to understand in many scenarios. Therefore, we introduce INGREX,…

    Submitted 2 November, 2022; originally announced November 2022.

    Comments: 4 pages, 5 figures. This paper is under review for IEEE ICDE 2023

  42. arXiv:2210.11094  [pdf, other]

    cs.LG cs.AI

    Toward Multiple Specialty Learners for Explaining GNNs via Online Knowledge Distillation

    Authors: Tien-Cuong Bui, Van-Duc Le, Wen-syan Li, Sang Kyun Cha

    Abstract: Graph Neural Networks (GNNs) have become increasingly ubiquitous in numerous applications and systems, necessitating explanations of their predictions, especially when making critical decisions. However, explaining GNNs is challenging due to the complexity of graph data and model execution. Despite additional computational costs, post-hoc explanation approaches have been widely adopted due to the…

    Submitted 20 October, 2022; originally announced October 2022.

    Comments: 13 pages, 11 figures, A preliminary paper under review of IEEE ICDE 2023

  43. arXiv:2210.09683  [pdf, other]

    cs.CL

    Alibaba-Translate China's Submission for WMT 2022 Metrics Shared Task

    Authors: Yu Wan, Keqin Bao, Dayiheng Liu, Baosong Yang, Derek F. Wong, Lidia S. Chao, Wenqiang Lei, Jun Xie

    Abstract: In this report, we present our submission to the WMT 2022 Metrics Shared Task. We build our system based on the core idea of UNITE (Unified Translation Evaluation), which unifies source-only, reference-only, and source-reference-combined evaluation scenarios into one single model. Specifically, during the model pre-training phase, we first apply the pseudo-labeled data examples to continuously pre…

    Submitted 17 February, 2023; v1 submitted 18 October, 2022; originally announced October 2022.

    Comments: WMT 2022 Metrics Shared Task

  44. arXiv:2210.01504  [pdf, other]

    cs.CL

    Knowledge Unlearning for Mitigating Privacy Risks in Language Models

    Authors: Joel Jang, Dongkeun Yoon, Sohee Yang, Sungmin Cha, Moontae Lee, Lajanugen Logeswaran, Minjoon Seo

    Abstract: Pretrained Language Models (LMs) memorize a vast amount of knowledge during initial pretraining, including information that may violate the privacy of personal lives and identities. Previous work addressing privacy issues for language models has mostly focused on data preprocessing and differential privacy methods, both requiring re-training the underlying LM. We propose knowledge unlearning as an…

    Submitted 19 December, 2022; v1 submitted 4 October, 2022; originally announced October 2022.

  45. arXiv:2208.03075  [pdf, other]

    cs.LG

    PGX: A Multi-level GNN Explanation Framework Based on Separate Knowledge Distillation Processes

    Authors: Tien-Cuong Bui, Wen-syan Li, Sang-Kyun Cha

    Abstract: Graph Neural Networks (GNNs) are widely adopted in advanced AI systems due to their capability of representation learning on graph data. Even though GNN explanation is crucial to increase user trust in the systems, it is challenging due to the complexity of GNN execution. Lately, many works have been proposed to address some of the issues in GNN explanation. However, they lack generalization capab…

    Submitted 5 August, 2022; originally announced August 2022.

    Comments: 11 pages, 8 figures

  46. arXiv:2206.08101  [pdf, other]

    cs.LG

    Towards Diverse Evaluation of Class Incremental Learning: A Representation Learning Perspective

    Authors: Sungmin Cha, Jihwan Kwak, Dongsub Shim, Hyunwoo Kim, Moontae Lee, Honglak Lee, Taesup Moon

    Abstract: Class incremental learning (CIL) algorithms aim to continually learn new object classes from incrementally arriving data while not forgetting past learned classes. The common evaluation protocol for CIL algorithms is to measure the average test accuracy across all classes learned so far -- however, we argue that solely focusing on maximizing the test accuracy may not necessarily lead to developing…

    Submitted 25 June, 2024; v1 submitted 16 June, 2022; originally announced June 2022.

    Comments: CoLLAs 2024 camera-ready version

  47. Attention Mechanism with Energy-Friendly Operations

    Authors: Yu Wan, Baosong Yang, Dayiheng Liu, Rong Xiao, Derek F. Wong, Haibo Zhang, Boxing Chen, Lidia S. Chao

    Abstract: Attention mechanism has become the dominant module in natural language processing models. It is computationally intensive and depends on massive power-hungry multiplications. In this paper, we rethink variants of attention mechanism from the energy consumption aspects. After reaching the conclusion that the energy costs of several energy-friendly operations are far less than their multiplication c…

    Submitted 28 April, 2022; originally announced April 2022.

    Comments: Findings@ACL2022

  48. arXiv:2204.13352  [pdf, other]

    cs.CL

    RoBLEURT Submission for the WMT2021 Metrics Task

    Authors: Yu Wan, Dayiheng Liu, Baosong Yang, Tianchi Bi, Haibo Zhang, Boxing Chen, Weihua Luo, Derek F. Wong, Lidia S. Chao

    Abstract: In this paper, we present our submission to Shared Metrics Task: RoBLEURT (Robustly Optimizing the training of BLEURT). After investigating the recent advances of trainable metrics, we conclude several aspects of vital importance to obtain a well-performed metric model by: 1) jointly leveraging the advantages of source-included model and reference-only model, 2) continuously pre-training the model…

    Submitted 28 April, 2022; originally announced April 2022.

    Comments: WMT2021 Metrics Shared Task

  49. UniTE: Unified Translation Evaluation

    Authors: Yu Wan, Dayiheng Liu, Baosong Yang, Haibo Zhang, Boxing Chen, Derek F. Wong, Lidia S. Chao

    Abstract: Translation quality evaluation plays a crucial role in machine translation. According to the input format, it is mainly separated into three tasks, i.e., reference-only, source-only and source-reference-combined. Recent methods, despite their promising results, are specifically designed and optimized on one of them. This limits the convenience of these methods, and overlooks the commonalities amon…

    Submitted 28 April, 2022; originally announced April 2022.

    Comments: ACL2022

  50. arXiv:2204.11068  [pdf, ps, other]

    cs.IT

    Integer Forcing Interference Management for the MIMO Interference Channel

    Authors: Sung Ho Chae, Sang-Woon Jeon

    Abstract: A new interference management scheme based on integer forcing (IF) receivers is studied for the two-user multiple-input and multiple-output (MIMO) interference channel. The proposed scheme employs a message splitting method that divides each data stream into common and private sub-streams, in which the private stream is recovered by the dedicated receiver only while the common stream is required t…

    Submitted 23 April, 2022; originally announced April 2022.

    Comments: Submitted to IEEE Trans. Wireless Commun. (in revision), 31 pages, 6 figures