Mitigating Backdoor Attacks using Activation-Guided Model Editing

Authors: Felix Hsieh, Huy H. Nguyen, AprilPyone MaungMaung, Dmitrii Usynin, Isao Echizen

Abstract: Backdoor attacks compromise the integrity and reliability of machine learning models by embedding a hidden trigger during the training process, which can later be activated to cause unintended misbehavior. We propose a novel backdoor mitigation approach via machine unlearning to counter such backdoor attacks. The proposed method utilizes model activation of domain-equivalent unseen data to guide the editing of the model's weights. Unlike the previous unlearning-based mitigation methods, ours is computationally inexpensive and achieves state-of-the-art performance while only requiring a handful of unseen samples for unlearning. In addition, we also point out that unlearning the backdoor may cause the whole targeted class to be unlearned, thus introducing an additional repair step to preserve the model's utility after editing the model. Experiment results show that the proposed method is effective in unlearning the backdoor on different datasets and trigger patterns.

Submitted 10 July, 2024; originally announced July 2024.

arXiv:2104.07191 [pdf, other]

Coarse- and fine-scale geometric information content of Multiclass Classification and implied Data-driven Intelligence

Authors: Fushing Hsieh, Xiaodong Wang

Abstract: Under any Multiclass Classification (MCC) setting defined by a collection of labeled point-cloud specified by a feature-set, we extract only stochastic partial orderings from all possible triplets of point-cloud without explicitly measuring the three cloud-to-cloud distances. We demonstrate that such a collective of partial ordering can efficiently compute a label embedding tree geometry on the Label-space. This tree in turn gives rise to a predictive graph, or a network with precisely weighted linkages. Such two multiscale geometries are taken as the coarse scale information content of MCC. They indeed jointly shed lights on explainable knowledge on why and how labeling comes about and facilitates error-free prediction with potential multiple candidate labels supported by data. For revealing within-label heterogeneity, we further undergo labeling naturally found clusters within each point-cloud, and likewise derive multiscale geometry as its fine-scale information content contained in data. This fine-scale endeavor shows that our computational proposal is indeed scalable to a MCC setting having a large label-space. Overall the computed multiscale collective of data-driven patterns and knowledge will serve as a basis for constructing visible and explainable subject matter intelligence regarding the system of interest.

Submitted 14 April, 2021; originally announced April 2021.

Comments: 15 pages, 5 figures

Journal ref: In Proceedings of the 16th International Conference on Machine Learning and Data Mining, MLDM 2020, July 20-21, 2020, Amsterdam, The Netherlands, pp 171-184

arXiv:2103.03431 [pdf, ps, other]

High Altitude Platform Stations (HAPS): Architecture and System Performance

Authors: Yunchou Xing, Frank Hsieh, Amitava Ghosh, Theodore S. Rappaport

Abstract: High Altitude Platform Station (HAPS) has the potential to provide global wireless connectivity and data services such as high-speed wireless backhaul, industrial Internet of things (IoT), and public safety for large areas not served by terrestrial networks. A unified HAPS design is desired to support various use cases and a wide range of requirements. In this paper, we present two architecture designs of the HAPS system: i) repeater based HAPS, and ii) base station based HAPS, which are both viable technical solutions. The energy efficiency is analyzed and compared between the two architectures using consumption factor theory. The system performance of these two architectures is evaluated through Monte Carlo simulations and is characterized in metrics of spectral efficiency using LTE band 1 for both single-cell and multi-cell cases. Both designs can provide good downlink spectral efficiency and coverage, while the uplink coverage is significantly limited by UE transmit power and antenna gain. Using directional antennas at the UEs can improve the system performance for both downlink and uplink.

Submitted 4 March, 2021; originally announced March 2021.

arXiv:2011.09682 [pdf, other]

Categorical exploratory data analysis on goodness-of-fit issues

Authors: Sabrina Enriquez, Fushing Hsieh

Abstract: If the aphorism "All models are wrong"- George Box, continues to be true in data analysis, particularly when analyzing real-world data, then we should annotate this wisdom with visible and explainable data-driven patterns. Such annotations can critically shed invaluable light on validity as well as limitations of statistical modeling as a data analysis approach. In an effort to avoid holding our real data to potentially unattainable or even unrealistic theoretical structures, we propose to utilize the data analysis paradigm called Categorical Exploratory Data Analysis (CEDA). We illustrate the merits of this proposal with two real-world data sets from the perspective of goodness-of-fit. In both data sets, the Normal distribution's bell shape seemingly fits rather well by first glance. We apply CEDA to bring out where and how each data fits or deviates from the model shape via several important distributional aspects. We also demonstrate that CEDA affords a version of tree-based p-value, and compare it with p-values based on traditional statistical approaches. Along our data analysis, we invest computational efforts in making graphic display to illuminate the advantages of using CEDA as one primary way of data analysis in Data Science education.

Submitted 3 December, 2020; v1 submitted 19 November, 2020; originally announced November 2020.

arXiv:2007.14485 [pdf, other]

doi: 10.1371/journal.pone.0251258

Color-complexity enabled exhaustive color-dots identification and spatial patterns testing in images

Authors: Shuting Liao, Li-Yu Liu, Ting-An Chen, Kuang-Yu Chen, Fushing Hsieh

Abstract: Targeted color-dots with varying shapes and sizes in images are first exhaustively identified, and then their multiscale 2D geometric patterns are extracted for testing spatial uniformness in a progressive fashion. Based on color theory in physics, we develop a new color-identification algorithm relying on highly associative relations among the three color-coordinates: RGB or HSV. Such high associations critically imply low color-complexity of a color image, and renders potentials of exhaustive identification of targeted color-dots of all shapes and sizes. Via heterogeneous shaded regions and lighting conditions, our algorithm is shown being robust, practical and efficient comparing with the popular Contour and OpenCV approaches. Upon all identified color-pixels, we form color-dots as individually connected networks with shapes and sizes. We construct minimum spanning trees (MST) as spatial geometries of dot-collectives of various size-scales. Given a size-scale, the distribution of distances between immediate neighbors in the observed MST is extracted, so do many simulated MSTs under the spatial uniformness assumption. We devise a new algorithm for testing 2D spatial uniformness based on a Hierarchical clustering tree upon all involving MSTs. Our developments are illustrated on images obtained by mimicking chemical spraying via drone in Precision Agriculture.

Submitted 28 July, 2020; originally announced July 2020.

Comments: 21 pages, 21 figures

Showing 1–5 of 5 results for author: Hsieh, F