-
CBPF: Filtering Poisoned Data Based on Composite Backdoor Attack
Authors:
Hanfeng Xia,
Haibo Hong,
Ruili Wang
Abstract:
Backdoor attacks involve the injection of a limited quantity of poisoned examples containing triggers into the training dataset. During the inference stage, backdoor attacks can uphold a high level of accuracy for normal examples, yet when presented with trigger-containing instances, the model may erroneously predict them as the targeted class designated by the attacker. This paper explores strategies for mitigating the risks associated with backdoor attacks by examining the filtration of poisoned samples. We primarily leverage two key characteristics of backdoor attacks: the ability for multiple backdoors to exist simultaneously within a single model, and the discovery through Composite Backdoor Attack (CBA) that altering two triggers in a sample to new target labels does not compromise the original functionality of the triggers, yet enables the prediction of the data as a new target class when both triggers are present simultaneously. Therefore, a novel three-stage poisoning data filtering approach, known as Composite Backdoor Poison Filtering (CBPF), is proposed as an effective solution. Firstly, utilizing the identified distinctions in output between poisoned and clean samples, a subset of data is partitioned to include both poisoned and clean instances. Subsequently, benign triggers are incorporated and labels are adjusted to create new target and benign target classes, thereby prompting the poisoned and clean data to be classified as distinct entities during the inference stage. The experimental results indicate that CBPF is successful in filtering out malicious data produced by six advanced attacks on CIFAR10 and ImageNet-12. On average, CBPF attains a notable filtering success rate of 99.91% for the six attacks on CIFAR10. Additionally, the model trained on the uncontaminated samples exhibits sustained high accuracy levels.
Submitted 23 June, 2024;
originally announced June 2024.
-
Full reference point cloud quality assessment using support vector regression
Authors:
Ryosuke Watanabe,
Shashank N. Sridhara,
Haoran Hong,
Eduardo Pavez,
Keisuke Nonaka,
Tatsuya Kobayashi,
Antonio Ortega
Abstract:
Point clouds are a general format for representing realistic 3D objects in diverse 3D applications. Since point clouds have large data sizes, developing efficient point cloud compression methods is crucial. However, excessive compression leads to various distortions, which deteriorates the point cloud quality perceived by end users. Thus, establishing reliable point cloud quality assessment (PCQA) methods is essential as a benchmark to develop efficient compression methods. This paper presents an accurate full-reference point cloud quality assessment (FR-PCQA) method called full-reference quality assessment using support vector regression (FRSVR) for various types of degradations such as compression distortion, Gaussian noise, and down-sampling. The proposed method demonstrates accurate PCQA by integrating five FR-based metrics covering various types of errors (e.g., considering geometric distortion, color distortion, and point count) using support vector regression (SVR). Moreover, the proposed method achieves a superior trade-off between accuracy and calculation speed because it includes only the calculation of these five simple metrics and SVR, which can perform fast prediction. Experimental results with three types of open datasets show that the proposed method is more accurate than conventional FR-PCQA methods. In addition, the proposed method is faster than state-of-the-art methods that utilize complicated features such as curvature and multi-scale features. Thus, the proposed method provides excellent performance in terms of the accuracy of PCQA and processing speed. Our method is available from https://github.com/STAC-USC/FRSVR-PCQA.
Submitted 15 June, 2024;
originally announced June 2024.
-
An LLM-Assisted Easy-to-Trigger Backdoor Attack on Code Completion Models: Injecting Disguised Vulnerabilities against Strong Detection
Authors:
Shenao Yan,
Shen Wang,
Yue Duan,
Hanbin Hong,
Kiho Lee,
Doowon Kim,
Yuan Hong
Abstract:
Large Language Models (LLMs) have transformed code completion tasks, providing context-based suggestions to boost developer productivity in software engineering. As users often fine-tune these models for specific applications, poisoning and backdoor attacks can covertly alter the model outputs. To address this critical security challenge, we introduce CodeBreaker, a pioneering LLM-assisted backdoor attack framework on code completion models. Unlike recent attacks that embed malicious payloads in detectable or irrelevant sections of the code (e.g., comments), CodeBreaker leverages LLMs (e.g., GPT-4) for sophisticated payload transformation (without affecting functionalities), ensuring that both the poisoned data for fine-tuning and generated code can evade strong vulnerability detection. CodeBreaker stands out with its comprehensive coverage of vulnerabilities, making it the first to provide such an extensive set for evaluation. Our extensive experimental evaluations and user studies underline the strong attack performance of CodeBreaker across various settings, validating its superiority over existing approaches. By integrating malicious payloads directly into the source code with minimal transformation, CodeBreaker challenges current security measures, underscoring the critical need for more robust defenses for code completion.
Submitted 10 June, 2024;
originally announced June 2024.
-
Why Only Text: Empowering Vision-and-Language Navigation with Multi-modal Prompts
Authors:
Haodong Hong,
Sen Wang,
Zi Huang,
Qi Wu,
Jiajun Liu
Abstract:
Current Vision-and-Language Navigation (VLN) tasks mainly employ textual instructions to guide agents. However, being inherently abstract, the same textual instruction can be associated with different visual signals, causing severe ambiguity and limiting the transfer of prior knowledge in the vision domain from the user to the agent. To fill this gap, we propose Vision-and-Language Navigation with Multi-modal Prompts (VLN-MP), a novel task augmenting traditional VLN by integrating both natural language and images in instructions. VLN-MP not only maintains backward compatibility by effectively handling text-only prompts but also consistently shows advantages with different quantities and relevance of visual prompts. Possible forms of visual prompts include both exact and similar object images, providing adaptability and versatility in diverse navigation scenarios. To evaluate VLN-MP under a unified framework, we implement a new benchmark that offers: (1) a training-free pipeline to transform textual instructions into multi-modal forms with landmark images; (2) diverse datasets with multi-modal instructions for different downstream tasks; (3) a novel module designed to process various image prompts for seamless integration with state-of-the-art VLN models. Extensive experiments on four VLN benchmarks (R2R, RxR, REVERIE, CVDN) show that incorporating visual prompts significantly boosts navigation performance. While maintaining efficiency with text-only prompts, VLN-MP enables agents to navigate in the pre-explore setting and outperform text-based models, showing its broader applicability.
Submitted 4 June, 2024;
originally announced June 2024.
-
GenMix: Combining Generative and Mixture Data Augmentation for Medical Image Classification
Authors:
Hansang Lee,
Haeil Lee,
Helen Hong
Abstract:
In this paper, we propose a novel data augmentation technique called GenMix, which combines generative and mixture approaches to leverage the strengths of both methods. While generative models excel at creating new data patterns, they face challenges such as mode collapse in GANs and difficulties in training diffusion models, especially with limited medical imaging data. On the other hand, mixture models enhance class boundary regions but tend to favor the major class in scenarios with class imbalance. To address these limitations, GenMix integrates both approaches to complement each other. GenMix operates in two stages: (1) training a generative model to produce synthetic images, and (2) performing mixup between synthetic and real data. This process improves the quality and diversity of synthetic data while simultaneously benefiting from the new pattern learning of generative models and the boundary enhancement of mixture models. We validate the effectiveness of our method on the task of classifying focal liver lesions (FLLs) in CT images. Our results demonstrate that GenMix enhances the performance of various generative models, including DCGAN, StyleGAN, Textual Inversion, and Diffusion Models. Notably, the proposed method with Textual Inversion outperforms other methods without fine-tuning diffusion model on the FLL dataset.
Submitted 31 May, 2024;
originally announced May 2024.
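The second GenMix stage described in the abstract above, mixup between synthetic and real data, can be illustrated with a minimal sketch. This is generic mixup on toy vectors, not the authors' implementation: the function names, the toy data, and the choice of a Beta(alpha, alpha) mixing weight are assumptions made for illustration.

```python
import random

def mixup(x_a, y_a, x_b, y_b, lam):
    """Convexly combine two examples and their one-hot labels with weight lam."""
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    x = [lam * a + (1.0 - lam) * b for a, b in zip(x_a, x_b)]
    y = [lam * a + (1.0 - lam) * b for a, b in zip(y_a, y_b)]
    return x, y

def sample_lambda(alpha, rng):
    """Draw the mixing weight from Beta(alpha, alpha) (an assumed, common choice)."""
    return rng.betavariate(alpha, alpha)

# Hypothetical toy data: one flattened "real" image and one "synthetic" image.
real_x, real_y = [0.0, 0.2, 0.4, 0.6], [1.0, 0.0]
syn_x, syn_y = [1.0, 0.8, 0.6, 0.4], [0.0, 1.0]

rng = random.Random(0)
lam = sample_lambda(alpha=1.0, rng=rng)
mixed_x, mixed_y = mixup(real_x, real_y, syn_x, syn_y, lam)
```

Because the labels are mixed with the same weight as the inputs, the mixed label remains a valid probability vector.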
-
Designing Prompt Analytics Dashboards to Analyze Student-ChatGPT Interactions in EFL Writing
Authors:
Minsun Kim,
SeonGyeom Kim,
Suyoun Lee,
Yoosang Yoon,
Junho Myung,
Haneul Yoo,
Hyungseung Lim,
Jieun Han,
Yoonsu Kim,
So-Yeon Ahn,
Juho Kim,
Alice Oh,
Hwajung Hong,
Tak Yeon Lee
Abstract:
While ChatGPT has significantly impacted education by offering personalized resources for students, its integration into educational settings poses unprecedented risks, such as inaccuracies and biases in AI-generated content, plagiarism and over-reliance on AI, and privacy and security issues. To help teachers address such risks, we conducted a two-phase iterative design process that comprises surveys, interviews, and a prototype demonstration involving six EFL (English as a Foreign Language) teachers, who integrated ChatGPT into semester-long English essay writing classes. Based on the needs identified during the initial survey and interviews, we developed a prototype of Prompt Analytics Dashboard (PAD) that integrates the essay editing history and chat logs between students and ChatGPT. Teachers' feedback on the prototype informs additional features and unmet needs for designing future PAD, which helps them (1) analyze student behaviors in context, (2) design an overall learning loop, and (3) develop their teaching skills.
Submitted 30 May, 2024;
originally announced May 2024.
-
Certifying Adapters: Enabling and Enhancing the Certification of Classifier Adversarial Robustness
Authors:
Jieren Deng,
Hanbin Hong,
Aaron Palmer,
Xin Zhou,
Jinbo Bi,
Kaleel Mahmood,
Yuan Hong,
Derek Aguiar
Abstract:
Randomized smoothing has become a leading method for achieving certified robustness in deep classifiers against l_{p}-norm adversarial perturbations. Current approaches for achieving certified robustness, such as data augmentation with Gaussian noise and adversarial training, require expensive training procedures that tune large models for different Gaussian noise levels and thus cannot leverage high-performance pre-trained neural networks. In this work, we introduce a novel certifying adapters framework (CAF) that enables and enhances the certification of classifier adversarial robustness. Our approach makes few assumptions about the underlying training algorithm or feature extractor and is thus broadly applicable to different feature extractor architectures (e.g., convolutional neural networks or vision transformers) and smoothing algorithms. We show that CAF (a) enables certification in uncertified models pre-trained on clean datasets and (b) substantially improves the performance of certified classifiers via randomized smoothing and SmoothAdv at multiple radii in CIFAR-10 and ImageNet. We demonstrate that CAF achieves improved certified accuracies when compared to methods based on random or denoised smoothing, and that CAF is insensitive to certifying adapter hyperparameters. Finally, we show that an ensemble of adapters enables a single pre-trained feature extractor to defend against a range of noise perturbation scales.
Submitted 24 May, 2024;
originally announced May 2024.
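The abstract above builds on randomized smoothing. As background only, the prediction step of a Gaussian-smoothed classifier can be sketched as a Monte Carlo majority vote over noisy copies of the input. This is not the certifying adapters framework (CAF) and it omits the certification step; `toy_classifier`, the input, and all parameter values are hypothetical.

```python
import random
from collections import Counter

def smoothed_predict(base_classifier, x, sigma, n_samples, rng):
    """Return the majority label of base_classifier under Gaussian input noise,
    together with the fraction of noisy samples that voted for it."""
    counts = Counter()
    for _ in range(n_samples):
        noisy = [xi + rng.gauss(0.0, sigma) for xi in x]
        counts[base_classifier(noisy)] += 1
    label, votes = counts.most_common(1)[0]
    return label, votes / n_samples

def toy_classifier(x):
    """Hypothetical base classifier: thresholds the first coordinate at zero."""
    return 1 if x[0] > 0.0 else 0

label, vote_share = smoothed_predict(
    toy_classifier, [5.0, -1.0], sigma=0.5, n_samples=200, rng=random.Random(0)
)
```

A certification procedure would additionally turn the vote share into a statistical lower bound and a robustness radius; that part is deliberately left out here.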
-
Fast 3D Molecule Generation via Unified Geometric Optimal Transport
Authors:
Haokai Hong,
Wanyu Lin,
Kay Chen Tan
Abstract:
This paper proposes a new 3D molecule generation framework, called GOAT, for fast and effective 3D molecule generation based on the flow-matching optimal transport objective. Specifically, we formulate a geometric transport formula for measuring the cost of mapping multi-modal features (e.g., continuous atom coordinates and categorical atom types) between a base distribution and a target data distribution. Our formula is solved within a unified, equivalent, and smooth representation space. This is achieved by transforming the multi-modal features into a continuous latent space with equivalent networks. In addition, we find that identifying optimal distributional coupling is necessary for fast and effective transport between any two distributions. We further propose a flow refinement and purification mechanism for optimal coupling identification. By doing so, GOAT can turn arbitrary distribution couplings into new deterministic couplings, leading to a unified optimal transport path for fast 3D molecule generation. The purification filters the subpar molecules to ensure the ultimate generation performance. We theoretically prove that the proposed method indeed reduces the transport cost. Finally, extensive experiments show that GOAT enjoys the efficiency of solving geometric optimal transport, leading to a double speedup compared to the sub-optimal method while achieving the best generation quality regarding validity, uniqueness, and novelty.
Submitted 24 May, 2024;
originally announced May 2024.
-
Power Variable Projection for Initialization-Free Large-Scale Bundle Adjustment
Authors:
Simon Weber,
Je Hyeong Hong,
Daniel Cremers
Abstract:
Initialization-free bundle adjustment (BA) remains largely uncharted. While the Levenberg-Marquardt algorithm is the gold-standard method for solving the BA problem, it generally relies on a good initialization. In contrast, the under-explored Variable Projection algorithm (VarPro) exhibits a wide convergence basin even without initialization. Coupled with an object space error formulation, recent works have shown its ability to solve (small-scale) initialization-free bundle adjustment problems. We introduce Power Variable Projection (PoVar), extending a recent inverse expansion method based on power series. Importantly, we link the power series expansion to Riemannian manifold optimization. This projective framework is crucial for solving large-scale bundle adjustment problems without initialization. Using the real-world BAL dataset, we experimentally demonstrate that our solver achieves state-of-the-art results in terms of speed and accuracy. In particular, our work is the first, to our knowledge, that addresses the scalability of BA without initialization and opens new avenues for initialization-free Structure-from-Motion.
Submitted 9 May, 2024; v1 submitted 8 May, 2024;
originally announced May 2024.
-
Diffusion-Driven Domain Adaptation for Generating 3D Molecules
Authors:
Haokai Hong,
Wanyu Lin,
Kay Chen Tan
Abstract:
Can we train a molecule generator that can generate 3D molecules from a new domain, circumventing the need to collect data? This problem can be cast as the problem of domain adaptive molecule generation. This work presents a novel and principled diffusion-based approach, called GADM, that allows shifting a generative model to desired new domains without the need to collect even a single molecule. As the domain shift is typically caused by the structure variations of molecules, e.g., scaffold variations, we leverage a designated equivariant masked autoencoder (MAE) along with various masking strategies to capture the structural-grained representations of the in-domain varieties. In particular, with an asymmetric encoder-decoder module, the MAE can generalize to unseen structure variations from the target domains. These structure variations are encoded with an equivariant encoder and treated as domain supervisors to control denoising. We show that, with these encoded structural-grained domain supervisors, GADM can generate effective molecules within the desired new domains. We conduct extensive experiments across various domain adaptation tasks over benchmarking datasets. We show that our approach can improve up to 65.6% in terms of success rate defined based on molecular validity, uniqueness, and novelty compared to alternative baselines.
Submitted 1 April, 2024;
originally announced April 2024.
-
Fine-tuning Large Language Models for Domain-specific Machine Translation
Authors:
Jiawei Zheng,
Hanghai Hong,
Xiaoli Wang,
Jingsong Su,
Yonggui Liang,
Shikai Wu
Abstract:
Large language models (LLMs) have made significant progress in machine translation (MT). However, their potential in domain-specific MT remains under-explored. Current LLM-based MT systems still face several challenges. First, for LLMs with in-context learning, their effectiveness is highly sensitive to input translation examples, and processing them can increase inference costs. They often require extra post-processing due to over-generation. Second, LLMs with fine-tuning on domain-specific data often require high training costs for domain adaptation, and may weaken the zero-shot MT capabilities of LLMs due to over-specialization. The aforementioned methods can struggle to translate rare words in domain transfer scenarios. To address these challenges, this paper proposes a prompt-oriented fine-tuning method, denoted as LlamaIT, to effectively and efficiently fine-tune a general-purpose LLM for domain-specific MT tasks. First, we construct a task-specific mix-domain dataset, which is then used to fine-tune the LLM with LoRA. This can eliminate the need for input translation examples, post-processing, or over-specialization. By zero-shot prompting with instructions, we adapt the MT tasks to the target domain at inference time. To further elicit the MT capability for rare words, we construct new prompts by incorporating domain-specific bilingual vocabulary. We also conduct extensive experiments on both publicly available and self-constructed datasets. The results show that our LlamaIT can significantly enhance the domain-specific MT capabilities of the LLM, meanwhile preserving its zero-shot MT capabilities.
Submitted 22 February, 2024;
originally announced February 2024.
-
Leveraging Human-Machine Interactions for Computer Vision Dataset Quality Enhancement
Authors:
Esla Timothy Anzaku,
Hyesoo Hong,
Jin-Woo Park,
Wonjun Yang,
Kangmin Kim,
JongBum Won,
Deshika Vinoshani Kumari Herath,
Arnout Van Messem,
Wesley De Neve
Abstract:
Large-scale datasets for single-label multi-class classification, such as ImageNet-1k, have been instrumental in advancing deep learning and computer vision. However, a critical and often understudied aspect is the comprehensive quality assessment of these datasets, especially regarding potential multi-label annotation errors. In this paper, we introduce a lightweight, user-friendly, and scalable framework that synergizes human and machine intelligence for efficient dataset validation and quality enhancement. We term this novel framework Multilabelfy. Central to Multilabelfy is an adaptable web-based platform that systematically guides annotators through the re-evaluation process, effectively leveraging human-machine interactions to enhance dataset quality. By using Multilabelfy on the ImageNetV2 dataset, we found that approximately 47.88% of the images contained at least two labels, underscoring the need for more rigorous assessments of such influential datasets. Furthermore, our analysis showed a negative correlation between the number of potential labels per image and model top-1 accuracy, illuminating a crucial factor in model evaluation and selection. Our open-source framework, Multilabelfy, offers a convenient, lightweight solution for dataset enhancement, emphasizing multi-label proportions. This study tackles major challenges in dataset integrity and provides key insights into model performance evaluation. Moreover, it underscores the advantages of integrating human expertise with machine capabilities to produce more robust models and trustworthy data development. The source code for Multilabelfy will be available at https://github.com/esla/Multilabelfy.
Keywords: Computer Vision, Dataset Quality Enhancement, Dataset Validation, Human-Computer Interaction, Multi-label Annotation.
Submitted 31 January, 2024;
originally announced January 2024.
-
Improving Angular Speed Uniformity by Piecewise Radical Reparameterization
Authors:
Hoon Hong,
Dongming Wang,
Jing Yang
Abstract:
For a rational parameterization of a curve, it is desirable that its angular speed is as uniform as possible. Hence, given a rational parameterization, one wants to find re-parameterization with better uniformity. One natural way is to use piecewise rational reparameterization. However, it turns out that the piecewise rational reparameterization does not help when the angular speed of the given rational parameterization is zero at some points on the curve. In this paper, we show how to overcome the challenge by using piecewise radical reparameterization.
Submitted 22 January, 2024;
originally announced January 2024.
-
Conditions for eigenvalue configurations of two real symmetric matrices: a signature approach
Authors:
Hoon Hong,
Daniel Profili,
J. Rafael Sendra
Abstract:
For two real symmetric matrices, their eigenvalue configuration is the arrangement of their eigenvalues on the real line. In this paper, we provide quantifier-free necessary and sufficient conditions for two symmetric matrices to realize a given eigenvalue configuration. The basic idea is to generate a set of polynomials in the entries of the two matrices whose roots can be counted to uniquely determine the eigenvalue configuration. This result can be seen as a generalization of Descartes' rule of signs to the case of two real univariate polynomials.
Submitted 10 May, 2024; v1 submitted 29 December, 2023;
originally announced January 2024.
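For context on the classical result that the abstract above generalizes, here is a minimal sketch of Descartes' rule of signs for a single polynomial. It is not the paper's two-polynomial condition; the function names and the example polynomial are illustrative assumptions. The rule bounds the number of positive real roots by the number of sign variations in the coefficient sequence, and the bound is exact when all roots are real, as is the case for characteristic polynomials of real symmetric matrices.

```python
def sign_variations(coeffs):
    """Count sign changes in a coefficient list, skipping zero coefficients."""
    signs = [c > 0 for c in coeffs if c != 0]
    return sum(1 for a, b in zip(signs, signs[1:]) if a != b)

def reflect(coeffs):
    """Coefficients of p(-x), given p(x) with the highest-degree term first."""
    degree = len(coeffs) - 1
    return [c if (degree - i) % 2 == 0 else -c for i, c in enumerate(coeffs)]

# p(x) = x^3 - 6x^2 + 11x - 6 = (x - 1)(x - 2)(x - 3), so all roots are real.
p = [1, -6, 11, -6]
positive_root_bound = sign_variations(p)           # variations of p(x)
negative_root_bound = sign_variations(reflect(p))  # variations of p(-x)
```

For this example the two counts are 3 and 0, matching the three positive roots and the absence of negative roots.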
-
Computing greatest common divisor of several parametric univariate polynomials via generalized subresultant polynomials
Authors:
Hoon Hong,
Jing Yang
Abstract:
In this paper, we tackle the following problem: compute the gcd for several univariate polynomials with parametric coefficients. It amounts to partitioning the parameter space into "cells" so that the gcd has a uniform expression over each cell and constructing a uniform expression of gcd in each cell. We tackle the problem as follows. We begin by making a natural and obvious extension of subresultant polynomials of two polynomials to several polynomials. Then we develop the following structural theories about them.
1. We generalize Sylvester's theory to several polynomials, in order to obtain an elegant relationship between generalized subresultant polynomials and the gcd of several polynomials, yielding an elegant algorithm.
2. We generalize Habicht's theory to several polynomials, in order to obtain a systematic relationship between generalized subresultant polynomials and pseudo-remainders, yielding an efficient algorithm.
Using the generalized theories, we present a simple (structurally elegant) algorithm which is significantly more efficient (both in the output size and computing time) than algorithms based on previous approaches.
Submitted 31 December, 2023;
originally announced January 2024.
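To make the problem statement in the abstract above concrete, the following sketch computes the gcd of several univariate polynomials in the non-parametric case, by folding the Euclidean algorithm over the inputs with exact rational arithmetic. It is a baseline illustration only, not the generalized subresultant algorithm of the paper; all names and the example polynomials are assumptions. Each division step divides by a leading coefficient, which is why parametric coefficients force a case split on whether that coefficient vanishes.

```python
from fractions import Fraction

def trim(p):
    """Drop leading zeros; coefficients are listed highest degree first."""
    i = 0
    while i < len(p) and p[i] == 0:
        i += 1
    return p[i:]

def poly_rem(a, b):
    """Remainder of a divided by b over the rationals."""
    a = [Fraction(c) for c in trim(a)]
    b = [Fraction(c) for c in trim(b)]
    if not b:
        raise ZeroDivisionError("division by the zero polynomial")
    while len(a) >= len(b):
        factor = a[0] / b[0]
        for i in range(len(b)):
            a[i] -= factor * b[i]
        a = trim(a)  # the leading term has been cancelled
    return a

def poly_gcd(a, b):
    """Monic gcd of two polynomials; the zero polynomial is the empty list."""
    a, b = trim(list(a)), trim(list(b))
    while b:
        a, b = b, poly_rem(a, b)
    if not a:
        return []
    lead = Fraction(a[0])
    return [Fraction(c) / lead for c in a]

def gcd_of_several(polys):
    """Fold poly_gcd over a list of polynomials."""
    result = []
    for p in polys:
        result = poly_gcd(result, p)
    return result

# (x - 1)(x - 2), (x - 1)(x - 3), (x - 1)(x + 5) share the factor x - 1.
common = gcd_of_several([[1, -3, 2], [1, -4, 3], [1, 4, -5]])
```

Here `common` is the coefficient list of x - 1, returned as `Fraction` values.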
-
Conditions for eigenvalue configurations of two real symmetric matrices: a symmetric function approach
Authors:
Hoon Hong,
Daniel Profili,
J. Rafael Sendra
Abstract:
For two real symmetric matrices, their eigenvalue configuration is the arrangement of their eigenvalues on the real line. We study the problem of determining a quantifier-free necessary and sufficient condition for two real symmetric matrices to realize a given eigenvalue configuration as a generalization of Descartes' rule of signs. We exploit the combinatorial properties of our definition for eigenvalue configuration to reduce a two-polynomial root counting problem into several single-polynomial root counting problems of symmetric polynomials. We then leverage the fundamental theorem of symmetric polynomials to derive a final quantifier-free necessary and sufficient condition for two real symmetric matrices to realize a given eigenvalue configuration.
Submitted 10 May, 2024; v1 submitted 29 December, 2023;
originally announced January 2024.
-
One-Dimensional Adapter to Rule Them All: Concepts, Diffusion Models and Erasing Applications
Authors:
Mengyao Lyu,
Yuhong Yang,
Haiwen Hong,
Hui Chen,
Xuan Jin,
Yuan He,
Hui Xue,
Jungong Han,
Guiguang Ding
Abstract:
The prevalent use of commercial and open-source diffusion models (DMs) for text-to-image generation prompts risk mitigation to prevent undesired behaviors. Existing concept erasing methods in academia are all based on full parameter or specification-based fine-tuning, from which we observe the following issues: 1) Generation alternation towards erosion: Parameter drift during target elimination causes alternations and potential deformations across all generations, even eroding other concepts at varying degrees, which is more evident with multi-concept erased; 2) Transfer inability & deployment inefficiency: Previous model-specific erasure impedes the flexible combination of concepts and the training-free transfer towards other models, resulting in linear cost growth as the deployment scenarios increase. To achieve non-invasive, precise, customizable, and transferable elimination, we ground our erasing framework on one-dimensional adapters to erase multiple concepts from most DMs at once across versatile erasing applications. The concept-SemiPermeable structure is injected as a Membrane (SPM) into any DM to learn targeted erasing, and meantime the alteration and erosion phenomenon is effectively mitigated via a novel Latent Anchoring fine-tuning strategy. Once obtained, SPMs can be flexibly combined and plug-and-play for other DMs without specific re-tuning, enabling timely and efficient adaptation to diverse scenarios. During generation, our Facilitated Transport mechanism dynamically regulates the permeability of each SPM to respond to different input prompts, further minimizing the impact on other concepts. Quantitative and qualitative results across ~40 concepts, 7 DMs and 4 erasing applications have demonstrated the superior erasing of SPM. Our code and pre-tuned SPMs are available on the project page https://lyumengyao.github.io/projects/spm.
Submitted 11 March, 2024; v1 submitted 26 December, 2023;
originally announced December 2023.
-
A Summarized History-based Dialogue System for Amnesia-Free Prompt Updates
Authors:
Hyejin Hong,
Hibiki Kawano,
Takuto Maekawa,
Naoki Yoshimaru,
Takamasa Iio,
Kenji Hatano
Abstract:
In today's society, information overload presents challenges in providing optimal recommendations. Consequently, the importance of dialogue systems that can discern and provide the necessary information through dialogue is increasingly recognized. However, there are concerns that existing dialogue systems rely on pre-trained models and struggle to cope with real-time or insufficient information. To address these concerns, models that allow the addition of missing information to dialogue robots are being proposed. Yet, maintaining the integrity of previous conversation history while integrating new data remains a formidable challenge. This paper presents a novel system for dialogue robots designed to remember user-specific characteristics by retaining past conversation history even as new information is added.
Submitted 21 December, 2023;
originally announced December 2023.
-
Foreseeing Reconstruction Quality of Gradient Inversion: An Optimization Perspective
Authors:
HyeongGwon Hong,
Yooshin Cho,
Hanbyel Cho,
Jaesung Ahn,
Junmo Kim
Abstract:
Gradient inversion attacks can leak data privacy when clients share weight updates with the server in federated learning (FL). Existing studies mainly use L2 or cosine distance as the loss function for gradient matching in the attack. Our empirical investigation shows that the vulnerability ranking varies with the loss function used. Gradient norm, which is commonly used as a vulnerability proxy for gradient inversion attack, cannot explain this as it remains constant regardless of the loss function for gradient matching. In this paper, we propose a loss-aware vulnerability proxy (LAVP) for the first time. LAVP refers to either the maximum or minimum eigenvalue of the Hessian with respect to gradient matching loss at ground truth. This suggestion is based on our theoretical findings regarding the local optimization of the gradient inversion in proximity to the ground truth, which corresponds to the worst case attack scenario. We demonstrate the effectiveness of LAVP on various architectures and datasets, showing its consistent superiority over the gradient norm in capturing sample vulnerabilities. The performance of each proxy is measured in terms of Spearman's rank correlation with respect to several similarity scores. This work will contribute to enhancing FL security against any potential loss functions beyond L2 or cosine distance in the future.
Submitted 19 December, 2023;
originally announced December 2023.
-
Pre-Evolved Model for Complex Multi-objective Optimization Problems
Authors:
Haokai Hong,
Min Jiang
Abstract:
Multi-objective optimization problems (MOPs) necessitate the simultaneous optimization of multiple objectives. Numerous studies have demonstrated that evolutionary computation is a promising paradigm for solving complex MOPs, which involve optimization problems with large-scale decision variables, many objectives, and expensive evaluation functions. However, existing multi-objective evolutionary algorithms (MOEAs) encounter significant challenges in generating high-quality populations when solving diverse complex MOPs. Specifically, the distinct requirements and constraints of the population result in the inefficiency or even incompetence of MOEAs in addressing various complex MOPs. Therefore, this paper proposes the concept of pre-evolving for MOEAs to generate high-quality populations for diverse complex MOPs. Drawing inspiration from the classical transformer architecture, we devise dimension embedding and objective encoding techniques to configure the pre-evolved model (PEM). The PEM is pre-evolved on a substantial number of existing MOPs. Subsequently, when fine-evolving on new complex MOPs, the PEM transforms the population into the next generation to approximate the Pareto-optimal front. Furthermore, it utilizes evaluations on new solutions to iteratively update the PEM for subsequent generations, thereby efficiently solving various complex MOPs. Experimental results demonstrate that the PEM outperforms state-of-the-art MOEAs on a range of complex MOPs.
Submitted 20 February, 2024; v1 submitted 11 December, 2023;
originally announced December 2023.
-
How to Determine the Most Powerful Pre-trained Language Model without Brute Force Fine-tuning? An Empirical Survey
Authors:
Jun Bai,
Xiaofeng Zhang,
Chen Li,
Hanhua Hong,
Xi Xu,
Chenghua Lin,
Wenge Rong
Abstract:
Transferability estimation has attracted great attention in the computer vision field. Researchers try to estimate, at low computational cost, the performance of a model when transferred from a source task to a given target task. Considering the effectiveness of such estimations, the natural language processing community has also begun to study similar problems for the selection of pre-trained language models. However, a comprehensive comparison between these estimation methods is still lacking. Also, the differences between vision and language scenarios make it doubtful whether previous conclusions hold across fields. In this paper, we first conduct a thorough survey of existing transferability estimation methods that can find the most suitable model, then we conduct a detailed empirical study of the surveyed methods based on the GLUE benchmark. From qualitative and quantitative analyses, we demonstrate the strengths and weaknesses of existing methods and show that H-Score generally performs well, with advantages in effectiveness and efficiency. We also outline the difficulties in accounting for training details, applicability to text generation, and consistency with certain metrics, which shed light on future directions.
Submitted 7 December, 2023;
originally announced December 2023.
-
LaughTalk: Expressive 3D Talking Head Generation with Laughter
Authors:
Kim Sung-Bin,
Lee Hyun,
Da Hye Hong,
Suekyeong Nam,
Janghoon Ju,
Tae-Hyun Oh
Abstract:
Laughter is a unique expression, essential to affirmative social interactions of humans. Although current 3D talking head generation methods produce convincing verbal articulations, they often fail to capture the vitality and subtleties of laughter and smiles despite their importance in social context. In this paper, we introduce a novel task to generate 3D talking heads capable of both articulate speech and authentic laughter. Our newly curated dataset comprises 2D laughing videos paired with pseudo-annotated and human-validated 3D FLAME parameters and vertices. Given our proposed dataset, we present a strong baseline with a two-stage training scheme: the model first learns to talk and then acquires the ability to express laughter. Extensive experiments demonstrate that our method performs favorably compared to existing approaches in both talking head generation and expressing laughter signals. We further explore potential applications on top of our proposed method for rigging realistic avatars.
Submitted 2 November, 2023;
originally announced November 2023.
-
RIS-Aided Receive Generalized Spatial Modulation Design with Reflecting Modulation
Authors:
Xinghao Guo,
Yin Xu,
Hanjiang Hong,
De Mi,
Ruiqi Liu,
Dazhi He,
Wenjun Zhang,
Yi-yan Wu
Abstract:
Spatial modulation (SM) transmits additional information bits by the selection of antennas. Generalized spatial modulation (GSM), as an advanced type of SM, can be divided into diversity and multiplexing (MUX) schemes according to whether the symbols carried on the selected antennas are identical or different. Recently, reconfigurable intelligent surface (RIS) assisted SM has exhibited better reception performance compared to conventional SM. To overcome the limitations of SM, this paper combines GSM with RIS and proposes the RIS-aided receive generalized spatial modulation (RIS-RGSM) scheme. The RIS-RGSM diversity scheme is realized via a simple improvement based on the state-of-the-art scheme. To further increase the transmission rate, a novel RIS-RGSM MUX scheme is proposed, where the reflection phase shifts and on/off states of RIS elements are configured to achieve bit mapping. The theoretical bit error rate (BER) of the proposed scheme is derived and agrees well with the simulation results. Numerical simulations show that the RIS-RGSM MUX scheme has better BER performance than the diversity scheme. The proposed scheme can significantly increase the transmission rate and maintain good performance compared to the existing scheme under a limited number of antennas.
Submitted 15 April, 2024; v1 submitted 24 October, 2023;
originally announced October 2023.
-
Capacity-based Spatial Modulation Constellation and Pre-scaling Design
Authors:
Xinghao Guo,
Hanjiang Hong,
Yin Xu,
Yi-yan Wu,
Dazhi He,
Wenjun Zhang
Abstract:
Spatial Modulation (SM) can utilize the index of the transmit antenna (TA) to transmit additional information. In this paper, to improve the performance of SM, a non-uniform constellation (NUC) and pre-scaling coefficients optimization design scheme is proposed. The bit-interleaved coded modulation (BICM) capacity calculation formula of the SM system is first derived. The constellation and pre-scaling coefficients are optimized by maximizing the BICM capacity without channel state information (CSI) feedback. Optimization results are given for the multiple-input-single-output (MISO) system with a Rayleigh channel. Simulation results show that the proposed scheme provides a meaningful performance gain compared to the conventional SM system without CSI feedback. The proposed optimization design scheme can be a promising technology for future 6G to achieve high efficiency.
Submitted 24 October, 2023;
originally announced October 2023.
-
Prediction of MET Overexpression in Non-Small Cell Lung Adenocarcinomas from Hematoxylin and Eosin Images
Authors:
Kshitij Ingale,
Sun Hae Hong,
Josh S. K. Bell,
Abbas Rizvi,
Amy Welch,
Lingdao Sha,
Irvin Ho,
Kunal Nagpal,
Aicha BenTaieb,
Rohan P Joshi,
Martin C Stumpe
Abstract:
MET protein overexpression is a targetable event in non-small cell lung cancer (NSCLC) and is the subject of active drug development. Challenges in identifying patients for these therapies include lack of access to validated testing, such as standardized immunohistochemistry (IHC) assessment, and consumption of valuable tissue for a single gene/protein assay. Development of pre-screening algorithms using routinely available digitized hematoxylin and eosin (H&E)-stained slides to predict MET overexpression could promote testing for those who will benefit most. While assessment of MET expression using IHC is currently not routinely performed in NSCLC, next-generation sequencing is common and in some cases includes RNA expression panel testing. In this work, we leveraged a large database of matched H&E slides and RNA expression data to train a weakly supervised model to predict MET RNA overexpression directly from H&E images. This model was evaluated on an independent holdout test set of 300 over-expressed and 289 normal patients, demonstrating an ROC-AUC of 0.70 (95th percentile interval: 0.66 - 0.74) with stable performance characteristics across different patient clinical variables and robust to synthetic noise on the test set. These results suggest that H&E-based predictive models could be useful to prioritize patients for confirmatory testing of MET protein or MET gene expression status.
Submitted 12 October, 2023; v1 submitted 11 October, 2023;
originally announced October 2023.
-
Robust Unsupervised Domain Adaptation by Retaining Confident Entropy via Edge Concatenation
Authors:
Hye-Seong Hong,
Abhishek Kumar,
Dong-Gyu Lee
Abstract:
The generalization capability of unsupervised domain adaptation can mitigate the need for extensive pixel-level annotations to train semantic segmentation networks by training models on synthetic data as a source with computer-generated annotations. Entropy-based adversarial networks are proposed to improve source domain prediction; however, they disregard significant external information, such as edges, which have the potential to identify and distinguish various objects within an image accurately. To address this issue, we introduce a novel approach to domain adaptation, leveraging the synergy of internal and external information within entropy-based adversarial networks. In this approach, we enrich the discriminator network with edge-predicted probability values within this innovative framework to enhance the clarity of class boundaries. Furthermore, we devised a probability-sharing network that integrates diverse information for more effective segmentation. Incorporating object edges addresses a pivotal aspect of unsupervised domain adaptation that has frequently been neglected in the past -- the precise delineation of object boundaries. Conventional unsupervised domain adaptation methods usually center around aligning feature distributions and may not explicitly model object boundaries. Our approach effectively bridges this gap by offering clear guidance on object boundaries, thereby elevating the quality of domain adaptation. Our approach undergoes rigorous evaluation on the established unsupervised domain adaptation benchmarks, specifically in adapting SYNTHIA $\rightarrow$ Cityscapes and SYNTHIA $\rightarrow$ Mapillary. Experimental results show that the proposed model attains better performance than state-of-the-art methods. The superior performance across different unsupervised domain adaptation scenarios highlights the versatility and robustness of the proposed method.
Submitted 10 October, 2023;
originally announced October 2023.
-
MindfulDiary: Harnessing Large Language Model to Support Psychiatric Patients' Journaling
Authors:
Taewan Kim,
Seolyeong Bae,
Hyun Ah Kim,
Su-woo Lee,
Hwajung Hong,
Chanmo Yang,
Young-Ho Kim
Abstract:
In the mental health domain, Large Language Models (LLMs) offer promising new opportunities, though their inherent complexity and low controllability have raised questions about their suitability in clinical settings. We present MindfulDiary, a mobile journaling app incorporating an LLM to help psychiatric patients document daily experiences through conversation. Designed in collaboration with mental health professionals (MHPs), MindfulDiary takes a state-based approach to safely comply with the experts' guidelines while carrying on free-form conversations. Through a four-week field study involving 28 patients with major depressive disorder and five psychiatrists, we found that MindfulDiary supported patients in consistently enriching their daily records and helped psychiatrists better empathize with their patients through an understanding of their thoughts and daily contexts. Drawing on these findings, we discuss the implications of leveraging LLMs in the mental health domain, bridging the technical feasibility and their integration into clinical settings.
Submitted 22 February, 2024; v1 submitted 8 October, 2023;
originally announced October 2023.
-
FABRIC: Automated Scoring and Feedback Generation for Essays
Authors:
Jieun Han,
Haneul Yoo,
Junho Myung,
Minsun Kim,
Hyunseung Lim,
Yoonsu Kim,
Tak Yeon Lee,
Hwajung Hong,
Juho Kim,
So-Yeon Ahn,
Alice Oh
Abstract:
Automated essay scoring (AES) provides a useful tool for students and instructors in writing classes by generating essay scores in real-time. However, previous AES models do not provide more specific rubric-based scores nor feedback on how to improve the essays, which can be even more important than the overall scores for learning. We present FABRIC, a pipeline to help students and instructors in English writing classes by automatically generating 1) the overall scores, 2) specific rubric-based scores, and 3) detailed feedback on how to improve the essays. Under the guidance of English education experts, we chose the rubrics for the specific scores as content, organization, and language. The first component of the FABRIC pipeline is DREsS, a real-world Dataset for Rubric-based Essay Scoring (DREsS). The second component is CASE, a Corruption-based Augmentation Strategy for Essays, with which we can improve the accuracy of the baseline model by 45.44%. The third component is EssayCoT, the Essay Chain-of-Thought prompting strategy which uses scores predicted from the AES model to generate better feedback. We evaluate the effectiveness of the new dataset DREsS and the augmentation strategy CASE quantitatively and show significant improvements over the models trained with existing datasets. We evaluate the feedback generated by EssayCoT with English education experts to show significant improvements in the helpfulness of the feedback across all rubrics. Lastly, we evaluate the FABRIC pipeline with students in a college English writing class who rated the generated scores and feedback with an average of 6 on the Likert scale from 1 to 7.
Submitted 8 October, 2023;
originally announced October 2023.
-
CCSPNet-Joint: Efficient Joint Training Method for Traffic Sign Detection Under Extreme Conditions
Authors:
Haoqin Hong,
Yue Zhou,
Xiangyu Shu,
Xiaofang Hu
Abstract:
Traffic sign detection is an important research direction in intelligent driving. Unfortunately, existing methods often overlook extreme conditions such as fog, rain, and motion blur. Moreover, the end-to-end training strategy for image denoising and object detection models fails to utilize inter-model information effectively. To address these issues, we propose CCSPNet, an efficient feature extraction module based on Contextual Transformer and CNN, capable of effectively utilizing the static and dynamic features of images, achieving faster inference speed and providing stronger feature enhancement capabilities. Furthermore, we establish the correlation between object detection and image denoising tasks and propose a joint training model, CCSPNet-Joint, to improve data efficiency and generalization. Finally, to validate our approach, we create the CCTSDB-AUG dataset for traffic sign detection in extreme scenarios. Extensive experiments have shown that CCSPNet achieves state-of-the-art performance in traffic sign detection under extreme conditions. Compared to end-to-end methods, CCSPNet-Joint achieves a 5.32% improvement in precision and an 18.09% improvement in mAP@0.5.
Submitted 3 February, 2024; v1 submitted 13 September, 2023;
originally announced September 2023.
-
Disposable Transfer Learning for Selective Source Task Unlearning
Authors:
Seunghee Koh,
Hyounguk Shon,
Janghyeon Lee,
Hyeong Gwon Hong,
Junmo Kim
Abstract:
Transfer learning is widely used for training deep neural networks (DNN) for building a powerful representation. Even after the pre-trained model is adapted for the target task, the representation performance of the feature extractor is retained to some extent. As the performance of the pre-trained model can be considered the private property of the owner, it is natural to seek the exclusive right of the generalized performance of the pre-trained weight. To address this issue, we suggest a new paradigm of transfer learning called disposable transfer learning (DTL), which disposes of only the source task without degrading the performance of the target task. To achieve knowledge disposal, we propose a novel loss named Gradient Collision loss (GC loss). GC loss selectively unlearns the source knowledge by leading the gradient vectors of mini-batches in different directions. Whether the model successfully unlearns the source task is measured by piggyback learning accuracy (PL accuracy). PL accuracy estimates the vulnerability of knowledge leakage by retraining the scrubbed model on a subset of source data or new downstream data. We demonstrate that GC loss is an effective approach to the DTL problem by showing that the model trained with GC loss retains the performance on the target task with a significantly reduced PL accuracy.
Submitted 19 August, 2023;
originally announced August 2023.
-
Varying-coefficients for regional quantile via KNN-based LASSO with applications to health outcome study
Authors:
Seyoung Park,
Eun Ryung Lee,
Hyokyoung G. Hong
Abstract:
Health outcomes, such as body mass index and cholesterol levels, are known to be dependent on age and exhibit varying effects with their associated risk factors. In this paper, we propose a novel framework for dynamic modeling of the associations between health outcomes and risk factors using varying-coefficients (VC) regional quantile regression via K-nearest neighbors (KNN) fused Lasso, which captures the time-varying effects of age. The proposed method has strong theoretical properties, including a tight estimation error bound and the ability to detect exact clustered patterns under certain regularity conditions. To efficiently solve the resulting optimization problem, we develop an alternating direction method of multipliers (ADMM) algorithm. Our empirical results demonstrate the efficacy of the proposed method in capturing the complex age-dependent associations between health outcomes and their risk factors.
Submitted 8 August, 2023;
originally announced August 2023.
-
Text-CRS: A Generalized Certified Robustness Framework against Textual Adversarial Attacks
Authors:
Xinyu Zhang,
Hanbin Hong,
Yuan Hong,
Peng Huang,
Binghui Wang,
Zhongjie Ba,
Kui Ren
Abstract:
Language models, especially basic text classification models, have been shown to be susceptible to textual adversarial attacks such as synonym substitution and word insertion attacks. To defend against such attacks, a growing body of research has been devoted to improving model robustness. However, providing provable robustness guarantees instead of empirical robustness is still largely unexplored. In this paper, we propose Text-CRS, a generalized certified robustness framework for natural language processing (NLP) based on randomized smoothing. To the best of our knowledge, existing certified schemes for NLP can only certify the robustness against $\ell_0$ perturbations in synonym substitution attacks. Representing each word-level adversarial operation (i.e., synonym substitution, word reordering, insertion, and deletion) as a combination of permutation and embedding transformation, we propose novel smoothing theorems to derive robustness bounds in both permutation and embedding space against such adversarial operations. To further improve certified accuracy and radius, we consider the numerical relationships between discrete words and select proper noise distributions for the randomized smoothing. Finally, we conduct substantial experiments on multiple language models and datasets. Text-CRS can address all four different word-level adversarial operations and achieve a significant accuracy improvement. We also provide the first benchmark on certified accuracy and radius of four word-level operations, besides outperforming the state-of-the-art certification against synonym substitution attacks.
Submitted 11 June, 2024; v1 submitted 31 July, 2023;
originally announced July 2023.
-
Statistical Tests for Replacing Human Decision Makers with Algorithms
Authors:
Kai Feng,
Han Hong,
Ke Tang,
Jingyuan Wang
Abstract:
This paper proposes a statistical framework with which artificial intelligence can improve human decision making. The performance of each human decision maker is first benchmarked against machine predictions; we then replace the decisions made by a subset of the decision makers with the recommendation from the proposed artificial intelligence algorithm. Using a large nationwide dataset of pregnancy outcomes and doctor diagnoses from prepregnancy checkups of reproductive age couples, we experimented with both a heuristic frequentist approach and a Bayesian posterior loss function approach with an application to abnormal birth detection. We find that our algorithm on a test dataset results in a higher overall true positive rate and a lower false positive rate than the diagnoses made by doctors only. We also find that the diagnoses of doctors from rural areas are more frequently replaceable, suggesting that artificial intelligence assisted decision making tends to improve precision more in less developed regions.
Submitted 20 June, 2023;
originally announced June 2023.
-
Record Deduplication for Entity Distribution Modeling in ASR Transcripts
Authors:
Tianyu Huang,
Chung Hoon Hong,
Carl Wivagg,
Kanna Shimizu
Abstract:
Voice digital assistants must keep up with trending search queries. We rely on a speech recognition model using contextual biasing with a rapidly updated set of entities, instead of frequent model retraining, to keep up with trends. There are several challenges with this approach: (1) the entity set must be frequently reconstructed, (2) the entity set is of limited size due to latency and accuracy trade-offs, and (3) finding the true entity distribution for biasing is complicated by ASR misrecognition. We address these challenges and define an entity set by modeling customers' true requested entity distribution from ASR output in production using record deduplication, a technique from the field of entity resolution. Record deduplication resolves or deduplicates coreferences, including misrecognitions, of the same latent entity. Our method successfully retrieves 95% of misrecognized entities and, when used for contextual biasing, shows an estimated 5% relative word error rate reduction.
Submitted 9 June, 2023;
originally announced June 2023.
-
Towards Visualization Thumbnail Designs that Entice Reading Data-driven Articles
Authors:
Hwiyeon Kim,
Joohee Kim,
Yunha Han,
Hwajung Hong,
Oh-Sang Kwon,
Young-Woo Park,
Niklas Elmqvist,
Sungahn Ko,
Bum Chul Kwon
Abstract:
As online news increasingly includes data journalism, there is a corresponding increase in the incorporation of visualization in article thumbnail images. However, little research exists on the design rationale for visualization thumbnails, such as resizing, cropping, simplifying, and embellishing charts that appear within the body of the associated article. Therefore, in this paper we aim to understand these design choices and determine what makes a visualization thumbnail inviting and interpretable. To this end, we first survey visualization thumbnails collected online and discuss visualization thumbnail practices with data journalists and news graphics designers. Based on the survey and discussion results, we then define a design space for visualization thumbnails and conduct a user study with four types of visualization thumbnails derived from the design space. The study results indicate that different chart components play different roles in attracting reader attention and enhancing reader understandability of the visualization thumbnails. We also find various thumbnail design strategies for effectively combining the charts' components, such as a data summary with highlights and data labels, and a visual legend with text labels and Human Recognizable Objects (HROs), into thumbnails. Ultimately, we distill our findings into design implications that allow effective visualization thumbnail designs for data-rich news articles. Our work can thus be seen as a first step toward providing structured guidance on how to design compelling thumbnails for data stories.
Submitted 26 May, 2023;
originally announced May 2023.
-
FieldHAR: A Fully Integrated End-to-end RTL Framework for Human Activity Recognition with Neural Networks from Heterogeneous Sensors
Authors:
Mengxi Liu,
Bo Zhou,
Zimin Zhao,
Hyeonseok Hong,
Hyun Kim,
Sungho Suh,
Vitor Fortes Rey,
Paul Lukowicz
Abstract:
In this work, we propose an open-source scalable end-to-end RTL framework, FieldHAR, for complex human activity recognition (HAR) from heterogeneous sensors using artificial neural networks (ANN) optimized for FPGA or ASIC integration. FieldHAR aims to address the lack of apparatus to transform complex HAR methodologies, often limited to offline evaluation, into efficient run-time edge applications. The framework uses parallel sensor interfaces and integer-based multi-branch convolutional neural networks (CNNs) to support flexible modality extensions with synchronous sampling at the maximum rate of each sensor. To validate the framework, we used a sensor-rich kitchen scenario HAR application which was demonstrated in a previous offline study. Through resource-aware optimizations, with FieldHAR the entire RTL solution was created from data acquisition to ANN inference, taking as little as 25% of the logic elements and 2% of the memory bits of a low-end Cyclone IV FPGA, with less than 1% accuracy loss from the original FP32 precision offline study. The RTL implementation also shows advantages over MCU-based solutions, including superior data acquisition performance and virtually eliminating the ANN inference bottleneck.
Submitted 22 May, 2023;
originally announced May 2023.
-
RECIPE: How to Integrate ChatGPT into EFL Writing Education
Authors:
Jieun Han,
Haneul Yoo,
Yoonsu Kim,
Junho Myung,
Minsun Kim,
Hyunseung Lim,
Juho Kim,
Tak Yeon Lee,
Hwajung Hong,
So-Yeon Ahn,
Alice Oh
Abstract:
The integration of generative AI in the field of education is actively being explored. In particular, ChatGPT has garnered significant interest, offering an opportunity to examine its effectiveness in English as a foreign language (EFL) education. To address this need, we present a novel learning platform called RECIPE (Revising an Essay with ChatGPT on an Interactive Platform for EFL learners). Our platform features two types of prompts that facilitate conversations between ChatGPT and students: (1) a hidden prompt for ChatGPT to take an EFL teacher role and (2) an open prompt for students to initiate a dialogue with a self-written summary of what they have learned. We deployed this platform for 213 undergraduate and graduate students enrolled in EFL writing courses and seven instructors. For this study, we collect students' interaction data from RECIPE, including students' perceptions and usage of the platform, and user scenarios are examined with the data. We also conduct a focus group interview with six students and an individual interview with one EFL instructor to explore design opportunities for leveraging generative AI models in the field of EFL education.
Submitted 19 May, 2023;
originally announced May 2023.
-
REMAST: Real-time Emotion-based Music Arrangement with Soft Transition
Authors:
Zihao Wang,
Le Ma,
Chen Zhang,
Bo Han,
Yunfei Xu,
Yikai Wang,
Xinyi Chen,
HaoRong Hong,
Wenbo Liu,
Xinda Wu,
Kejun Zhang
Abstract:
Music as an emotional intervention medium has important applications in scenarios such as music therapy, games, and movies. However, music needs real-time arrangement according to changing emotions, which makes it challenging to balance real-time emotion fit against soft emotion transition, given the fine-grained and mutable nature of the target emotion. Existing studies mainly focus on achieving real-time emotion fit, while the issue of smooth transition remains understudied, affecting the overall emotional coherence of the music. In this paper, we propose REMAST to address this trade-off. Specifically, we recognize the last timestep's music emotion and fuse it with the current timestep's input emotion. The fused emotion then guides REMAST to generate the music based on the input melody. To adjust music similarity and real-time emotion fit flexibly, we downsample the original melody and feed it into the generation model. Furthermore, we design four music theory features by domain knowledge to enhance emotion information and employ semi-supervised learning to mitigate the subjective bias introduced by manual dataset annotation. According to the evaluation results, REMAST surpasses the state-of-the-art methods in objective and subjective metrics. These results demonstrate that REMAST achieves real-time fit and smooth transition simultaneously, enhancing the coherence of the generated music.
Submitted 5 February, 2024; v1 submitted 13 May, 2023;
originally announced May 2023.
-
Localization using Multi-Focal Spatial Attention for Masked Face Recognition
Authors:
Yooshin Cho,
Hanbyel Cho,
Hyeong Gwon Hong,
Jaesung Ahn,
Dongmin Cho,
JungWoo Chang,
Junmo Kim
Abstract:
Since the beginning of the worldwide COVID-19 pandemic, facial masks have been recommended to limit the spread of the disease. However, these masks hide certain facial attributes. Hence, it has become difficult for existing face recognition systems to perform identity verification on masked faces. In this context, it is necessary to develop Masked Face Recognition (MFR) for contactless biometric recognition systems. Thus, in this paper, we propose Complementary Attention Learning and Multi-Focal Spatial Attention, which precisely removes the masked region by training complementary spatial attention to focus on two distinct regions: masked regions and backgrounds. In our method, standard spatial attention and networks focus on unmasked regions and extract mask-invariant features while minimizing the loss of conventional Face Recognition (FR) performance. For conventional FR, we evaluate the performance on the IJB-C, Age-DB, CALFW, and CPLFW datasets. We evaluate the MFR performance on the ICCV2021-MFR/Insightface track, and demonstrate improved performance on both the MFR and FR datasets. Additionally, we empirically verify that the spatial attention of the proposed method is more precisely activated in unmasked regions.
Submitted 7 September, 2023; v1 submitted 3 May, 2023;
originally announced May 2023.
-
Certifiable Black-Box Attack: Ensuring Provably Successful Attack for Adversarial Examples
Authors:
Hanbin Hong,
Yuan Hong
Abstract:
Black-box adversarial attacks have shown strong potential to subvert machine learning models. Existing black-box adversarial attacks craft the adversarial examples by iteratively querying the target model and/or leveraging the transferability of a local surrogate model. Whether such an attack can succeed remains unknown to the adversary when empirically designing the attack. In this paper, to the best of our knowledge, we take the first step to study a new paradigm of adversarial attacks -- a certifiable black-box attack that can guarantee the attack success rate of the crafted adversarial examples. Specifically, we revise randomized smoothing to establish novel theories for ensuring the attack success rate of the adversarial examples. To craft the adversarial examples with the certifiable attack success rate (CASR) guarantee, we design several novel techniques, including a randomized query method to query the target model, an initialization method with smoothed self-supervised perturbation to derive certifiable adversarial examples, and a geometric shifting method to reduce the perturbation size of the certifiable adversarial examples for better imperceptibility. We have comprehensively evaluated the performance of the certifiable black-box attack on the CIFAR10 and ImageNet datasets against different levels of defenses. Both theoretical and experimental results have validated the effectiveness of the proposed certifiable attack.
Submitted 9 April, 2023;
originally announced April 2023.
-
Improving Performance Insensitivity of Large-scale Multiobjective Optimization via Monte Carlo Tree Search
Authors:
Haokai Hong,
Min Jiang,
Gary G. Yen
Abstract:
The large-scale multiobjective optimization problem (LSMOP) is characterized by simultaneously optimizing multiple conflicting objectives and involving hundreds of decision variables. Many real-world applications in engineering fields can be modeled as LSMOPs; simultaneously, engineering applications require insensitivity in performance. This requirement usually means that the results from the algorithm runs should not only be good for every run in terms of performance but also that the performance of multiple runs should not fluctuate too much, i.e., the algorithm shows good insensitivity. Considering that substantial computational resources are requested for each run, it is essential to improve upon the performance of the large-scale multiobjective optimization algorithm, as well as the insensitivity of the algorithm. However, existing large-scale multiobjective optimization algorithms solely focus on improving the performance of the algorithms, leaving the insensitivity characteristics unattended. In this work, we propose an evolutionary algorithm for solving LSMOPs based on Monte Carlo tree search, the so-called LMMOCTS, which aims to improve the performance and insensitivity for large-scale multiobjective optimization problems. The proposed method samples the decision variables to construct new nodes on the Monte Carlo tree for optimization and evaluation. It selects nodes with good evaluation for further search to reduce the performance sensitivity caused by large-scale decision variables. We compare the proposed algorithm with several state-of-the-art designs on different benchmark functions. We also propose two metrics to measure the sensitivity of the algorithm. The experimental results confirm the effectiveness and performance insensitivity of the proposed design for solving large-scale multiobjective optimization problems.
Submitted 14 April, 2023; v1 submitted 8 April, 2023;
originally announced April 2023.
-
Efficiently Tackling Million-Dimensional Multiobjective Problems: A Direction Sampling and Fine-Tuning Approach
Authors:
Haokai Hong,
Min Jiang,
Qiuzhen Lin,
Kay Chen Tan
Abstract:
We define very large-scale multiobjective optimization problems (VLSMOPs) as optimizing multiple objectives with more than 100,000 decision variables. These problems hold substantial significance, given the ubiquity of real-world scenarios necessitating the optimization of hundreds of thousands, if not millions, of variables. However, the larger dimension in VLSMOPs intensifies the curse of dimensionality and poses significant challenges for existing large-scale evolutionary multiobjective algorithms, rendering the problems more difficult to solve within the constraints of practical computing resources. To overcome this issue, we propose a novel approach called the very large-scale multiobjective optimization framework (VMOF). The method efficiently samples general yet suitable evolutionary directions in the very large-scale space and subsequently fine-tunes these directions to locate the Pareto-optimal solutions. To sample the most suitable evolutionary directions for different solutions, Thompson sampling is adopted for its effectiveness in recommending from a very large number of items within limited historical evaluations. Furthermore, a technique is designed for fine-tuning directions specific to tracking Pareto-optimal solutions. To understand the designed framework, we present our analysis of the framework and then evaluate VMOF using widely recognized benchmarks and real-world problems spanning dimensions from 100 to 1,000,000. Experimental results demonstrate that our method exhibits superior performance not only on LSMOPs but also on VLSMOPs when compared to existing algorithms.
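As a rough illustration of the Thompson sampling idea mentioned in this abstract, the sketch below runs a generic Beta-Bernoulli bandit over a small set of candidate "directions". It is not the VMOF procedure: the function names (`thompson_select`, `run_bandit`), the Bernoulli reward model, and the uniform Beta(1, 1) prior are all assumptions made for the example.

```python
import random


def thompson_select(successes, failures, rng):
    """Pick an arm by sampling each arm's Beta posterior and taking the argmax.

    Assumes a Beta(1, 1) prior per arm (hypothetical choice for this sketch).
    """
    samples = [
        rng.betavariate(s + 1, f + 1) for s, f in zip(successes, failures)
    ]
    return max(range(len(samples)), key=samples.__getitem__)


def run_bandit(true_rates, rounds, seed=0):
    """Simulate Thompson sampling against arms with fixed success rates.

    `true_rates` stands in for how often each candidate direction yields an
    improvement; in a real optimizer this would be unknown.
    """
    rng = random.Random(seed)
    n = len(true_rates)
    successes = [0] * n
    failures = [0] * n
    pulls = [0] * n
    for _ in range(rounds):
        arm = thompson_select(successes, failures, rng)
        pulls[arm] += 1
        # Observe a Bernoulli reward and update that arm's posterior counts.
        if rng.random() < true_rates[arm]:
            successes[arm] += 1
        else:
            failures[arm] += 1
    return pulls


if __name__ == "__main__":
    print(run_bandit([0.1, 0.9], rounds=500))
```

Because each arm is chosen in proportion to the posterior probability that it is best, evaluations concentrate on promising arms after only a few trials, which is the property the abstract appeals to when evaluations are limited.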
Submitted 7 April, 2024; v1 submitted 8 April, 2023;
originally announced April 2023.
-
Robust Parameter Estimation for Rational Ordinary Differential Equations
Authors:
Oren Bassik,
Yosef Berman,
Soo Go,
Hoon Hong,
Ilia Ilmer,
Alexey Ovchinnikov,
Chris Rackauckas,
Pedro Soto,
Chee Yap
Abstract:
We present a new approach for estimating parameters in rational ODE models from given (measured) time series data.
In typical existing approaches, an initial guess for the parameter values is made from a given search interval. Then, in a loop, the corresponding outputs are computed by solving the ODE numerically, followed by computing the error from the given time series data. If the error is small, the loop terminates and the parameter values are returned. Otherwise, heuristics/theories are used to possibly improve the guess and continue the loop.
These approaches tend to be non-robust in the sense that their accuracy depends on the search interval and the true parameter values; furthermore, they cannot handle the case where the parameters are locally identifiable.
In this paper, we propose a new approach, which does not suffer from the above non-robustness. In particular, it does not require making good initial guesses for the parameter values or specifying search intervals. Instead, it uses differential algebra, interpolation of the data using rational functions, and multivariate polynomial system solving. We also compare the performance of the resulting software with several other estimation software packages.
Submitted 17 December, 2023; v1 submitted 2 March, 2023;
originally announced March 2023.
-
PDIWS: Thermal Imaging Dataset for Person Detection in Intrusion Warning Systems
Authors:
Nguyen Duc Thuan,
Le Hai Anh,
Hoang Si Hong
Abstract:
In this paper, we present a synthetic thermal imaging dataset for Person Detection in Intrusion Warning Systems (PDIWS). The dataset consists of a training set with 2000 images and a test set with 500 images. Each image is synthesized by compounding a subject (intruder) with a background using the modified Poisson image editing method. There are a total of 50 different backgrounds and nearly 1000 subjects divided into five classes according to five human poses: creeping, crawling, stooping, climbing and other. The presence of the intruder will be confirmed if the first four poses are detected. Advanced object detection algorithms have been implemented with this dataset and give relatively satisfactory results, with the highest mAP values of 95.5% and 90.9% for IoU of 0.5 and 0.75 respectively. The dataset is freely published online for research purposes at https://github.com/thuan-researcher/Intruder-Thermal-Dataset.
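For readers unfamiliar with the IoU thresholds quoted above, the following minimal sketch computes intersection over union for two axis-aligned boxes. The `(x1, y1, x2, y2)` corner format and the function name `box_iou` are assumptions for illustration and are not taken from the dataset's tooling.

```python
def box_iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2).

    Assumes x1 <= x2 and y1 <= y2 for both boxes.
    """
    ax1, ay1, ax2, ay2 = a
    bx1, by1, bx2, by2 = b

    # Width and height of the overlap region, clamped at zero when disjoint.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h

    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


if __name__ == "__main__":
    # Two 2x2 boxes overlapping in a 1x1 square: IoU = 1 / 7.
    print(box_iou((0, 0, 2, 2), (1, 1, 3, 3)))
```

A detection typically counts as correct at a given threshold when its IoU with the ground-truth box meets that threshold, so the 0.75 setting demands tighter localization than 0.5.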
Submitted 2 October, 2023; v1 submitted 26 February, 2023;
originally announced February 2023.
-
HUST bearing: a practical dataset for ball bearing fault diagnosis
Authors:
Nguyen Duc Thuan,
Hoang Si Hong
Abstract:
In this work, we introduce a practical dataset named HUST bearing, which provides a large set of vibration data on different ball bearings. This dataset contains 90 raw vibration recordings covering 6 types of defects (inner crack, outer crack, ball crack, and their 2-combinations) on 5 types of bearing at 3 working conditions, with a sample rate of 51,200 samples per second. We performed envelope analysis and order tracking analysis on the introduced dataset to allow an initial evaluation of the data. A number of classical machine learning classification methods are used to identify bearing faults in the dataset using features in different domains. Typical advanced unsupervised transfer learning algorithms are also applied to observe the transferability of knowledge among parts of the dataset. The experimental results of the examined methods on the dataset show widely varying accuracy, up to 100% on the classification task and 60-80% on the unsupervised transfer learning task.
Submitted 2 October, 2023; v1 submitted 24 February, 2023;
originally announced February 2023.
-
Mining compact high utility sequential patterns
Authors:
Tai Dinh,
Philippe Fournier-Viger,
Huynh Van Hong
Abstract:
High utility sequential pattern mining (HUSPM) aims to mine all patterns that yield a high utility (profit) in a sequence dataset. HUSPM is useful for several applications such as market basket analysis, marketing, and website clickstream analysis. In these applications, users may also consider high utility patterns frequently appearing in the dataset to obtain more fruitful information. However, this task is computationally expensive since algorithms may generate a combinatorially explosive number of candidates that may be redundant or of low importance. To reduce complexity and obtain a compact set of frequent high utility sequential patterns (FHUSPs), this paper proposes an algorithm named CHUSP for mining closed frequent high utility sequential patterns (CHUSPs). Such patterns keep a concise representation while preserving the same expressive power as the complete set of FHUSPs. The proposed algorithm relies on a CHUS data structure to maintain information during mining. It uses three pruning strategies to eliminate low-utility and non-frequent patterns early, thereby reducing the search space. An extensive experimental evaluation was performed on six real-life datasets to evaluate the performance of CHUSP in terms of execution time, memory usage, and the number of generated patterns. Experimental results show that CHUSP can efficiently discover the compact set of CHUSPs under different user-defined thresholds.
Submitted 22 February, 2023;
originally announced February 2023.
-
Parametric "Non-nested" Discriminants for Multiplicities of Univariate Polynomials
Authors:
Hoon Hong,
Jing Yang
Abstract:
We consider the problem of complex root classification, i.e., finding the conditions on the coefficients of a univariate polynomial for all possible multiplicity structures on its complex roots. It is well known that such conditions can be written as conjunctions of several polynomial equations and one inequation in the coefficients. Those polynomials in the coefficients are called discriminants for multiplicities. It is well known that discriminants can be obtained by using repeated parametric gcd's. The resulting discriminants are usually nested determinants, that is, determinants of matrices whose entries are determinants, and so on. In this paper, we give a new type of discriminant that is not based on repeated gcd's. The new discriminants are simpler in that they are non-nested determinants and have smaller maximum degrees.
Submitted 22 April, 2023; v1 submitted 31 December, 2022;
originally announced January 2023.
-
Noisy Label Classification using Label Noise Selection with Test-Time Augmentation Cross-Entropy and NoiseMix Learning
Authors:
Hansang Lee,
Haeil Lee,
Helen Hong,
Junmo Kim
Abstract:
As the size of the dataset used in deep learning tasks increases, the noisy label problem, which is a task of making deep learning robust to the incorrectly labeled data, has become an important task. In this paper, we propose a method of learning noisy label data using the label noise selection with test-time augmentation (TTA) cross-entropy and classifier learning with the NoiseMix method. In the label noise selection, we propose TTA cross-entropy by measuring the cross-entropy to predict the test-time augmented training data. In the classifier learning, we propose the NoiseMix method based on MixUp and BalancedMix methods by mixing the samples from the noisy and the clean label data. In experiments on the ISIC-18 public skin lesion diagnosis dataset, the proposed TTA cross-entropy outperformed the conventional cross-entropy and the TTA uncertainty in detecting label noise data in the label noise selection process. Moreover, the proposed NoiseMix not only outperformed the state-of-the-art methods in the classification performance but also showed the most robustness to the label noise in the classifier learning.
Submitted 1 December, 2022;
originally announced December 2022.
-
Test-Time Mixup Augmentation for Data and Class-Specific Uncertainty Estimation in Deep Learning Image Classification
Authors:
Hansang Lee,
Haeil Lee,
Helen Hong,
Junmo Kim
Abstract:
Uncertainty estimation of trained deep learning networks is valuable for optimizing learning efficiency and evaluating the reliability of network predictions. In this paper, we propose a method for estimating uncertainty in deep learning image classification using test-time mixup augmentation (TTMA). To improve the ability to distinguish correct and incorrect predictions in existing aleatoric uncertainty, we introduce TTMA data uncertainty (TTMA-DU) by applying mixup augmentation to test data and measuring the entropy of the predicted label histogram. In addition to TTMA-DU, we propose TTMA class-specific uncertainty (TTMA-CSU), which captures aleatoric uncertainty specific to individual classes and provides insight into class confusion and class similarity within the trained network. We validate our proposed methods on the ISIC-18 skin lesion diagnosis dataset and the CIFAR-100 real-world image classification dataset. Our experiments show that (1) TTMA-DU more effectively differentiates correct and incorrect predictions compared to existing uncertainty measures due to mixup perturbation, and (2) TTMA-CSU provides information on class confusion and class similarity for both datasets.
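The "entropy of the predicted label histogram" named in this abstract can be sketched as follows. This assumes a classifier has already been run on several mixup-augmented copies of one test image and has returned one predicted label per copy; the function name `label_histogram_entropy` and the use of natural-log entropy are assumptions for illustration, not the authors' implementation.

```python
import math
from collections import Counter


def label_histogram_entropy(predicted_labels):
    """Shannon entropy (in nats) of the empirical distribution of labels.

    `predicted_labels` holds one predicted class per augmented copy of a
    single test sample (hypothetical input format for this sketch).
    """
    total = len(predicted_labels)
    if total == 0:
        raise ValueError("need at least one predicted label")
    counts = Counter(predicted_labels)
    # Sum -p * log(p) over the classes that actually appear in the histogram.
    return sum(
        -(c / total) * math.log(c / total) for c in counts.values()
    )


if __name__ == "__main__":
    print(label_histogram_entropy([3, 3, 3, 3]))  # all copies agree
    print(label_histogram_entropy([3, 5, 3, 5]))  # copies split evenly
```

Under this reading, a sample whose augmented copies all receive the same label gets zero entropy, while one whose predictions scatter across classes gets a higher value and would be treated as more uncertain.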
Submitted 14 February, 2024; v1 submitted 30 November, 2022;
originally announced December 2022.
-
Data Poisoning Attack Aiming the Vulnerability of Continual Learning
Authors:
Gyojin Han,
Jaehyun Choi,
Hyeong Gwon Hong,
Junmo Kim
Abstract:
Generally, regularization-based continual learning models limit access to the previous task data to imitate the real-world constraints related to memory and privacy. However, this introduces a problem: these models cannot track the performance on each task. In essence, current continual learning methods are susceptible to attacks on previous tasks. We demonstrate the vulnerability of regularization-based continual learning methods by presenting a simple task-specific data poisoning attack that can be used in the learning process of a new task. Training data generated by the proposed attack causes performance degradation on a specific task targeted by the attacker. We experiment with the attack on two representative regularization-based continual learning methods, Elastic Weight Consolidation (EWC) and Synaptic Intelligence (SI), trained with variants of the MNIST dataset. The experimental results confirm the vulnerability identified in this paper and demonstrate the importance of developing continual learning models that are robust to adversarial attacks.
Submitted 3 July, 2023; v1 submitted 28 November, 2022;
originally announced November 2022.