\credit

Conceptualization of this study, Methodology, Software

[1] School of Information Science and Engineering, Chongqing Jiaotong University, Chongqing 400074, China
[2] Institute of Intelligent Software, Guangzhou 511400, China
[3] Saarland University, Saarbrücken 66123, Germany
[4] Department of Civil and Environmental Engineering, University of Auckland, Auckland 1023, New Zealand

\cortext

[cor1]Corresponding author

Enhancing robustness of data-driven SHM models: adversarial training with circle loss

Xiangli Yang, Xijie Deng, Hanwei Zhang, Yang Zou, Jianxi Yang

Abstract

Structural health monitoring (SHM) is critical to safeguarding the safety and reliability of aerospace, civil, and mechanical infrastructure. Machine learning-based data-driven approaches have gained popularity in SHM due to advancements in sensors and computational power. However, machine learning models used in SHM are vulnerable to adversarial examples: even small changes in the input can lead to different model outputs. This paper addresses this problem by discussing adversarial defenses in SHM. We propose an adversarial training method for defense that uses circle loss to optimize the distance between features during training, keeping examples away from the decision boundary. Through this simple yet effective constraint, our method demonstrates substantial improvements in model robustness, surpassing existing defense mechanisms.

keywords:

Structural health monitoring \sep Adversarial examples \sep Adversarial defense \sep Model robustness

1 Introduction

Structural Health Monitoring (SHM) serves to continuously monitor and evaluate the condition of engineering structures in real-time, alerting in advance when anomalies in the structure’s health arise, and offering tailored solutions for various issues. The SHM process involves the installation of sensors in the engineering structure to collect response data within its environment. Subsequently, a set of indirect tools is utilized to analyze this data, aiming to detect, pinpoint, and manage any structural damage present. The existing methods for SHM can be categorized into two main types. The first type, known as the model-driven method, aims to create a precise finite element model by adjusting model parameters using input-output measurements. This method involves comparing these model parameters with actual structural measurement data to identify structural conditions. However, it is highly dependent on the accuracy of the theoretical model and the quality of the monitoring data. Additionally, it requires personnel with specialized knowledge in bridge modeling. Yet, these models are often static and might not accommodate unforeseen changes or new parameters, necessitating a ‘re-modeling’ process when such alterations occur. On the other hand, the second type, known as the data-driven method, constructs a model in the form of statistical representations. This approach detects changes in the structural state by analyzing evolving patterns and probability distributions within the monitoring data itself.

Data-driven methods in SHM are gaining popularity due to recent technological advancements in sensors, high-speed internet, and cloud-based computation. The data-driven method is based on the machine learning (ML) paradigm, which allows for the rapid and inexpensive creation of effective models that are widely used for structural diagnosis and structural damage detection. Some pioneering studies of ML methods have been conducted in data-driven SHM, including the use of Bayesian networks [1, 2, 3], artificial neural networks (ANN) [4, 5, 6], and support vector machines [7, 8, 9, 10]. However, several studies have shown that ML models are vulnerable to adversarial examples [11, 12, 13, 14, 15], that is, samples with perturbations designed to deceive the ML model and cause it to predict an incorrect label with high confidence. Adversarial perturbation is usually imperceptible to humans and difficult to detect from the original sample.

The expanding use of ML models in SHM has raised concerns due to opaque decision-making processes and a lack of interpretability in certain models, especially in critical fields where safety is paramount. Furthermore, the emergence of adversarial examples has heightened these concerns. Max et al. [16] have demonstrated the serious vulnerability of data-driven SHM models to adversarial attacks concisely and efficiently. Consequently, ensuring robustness against adversarial attacks has become a key challenge in the broad implementation of SHM frameworks. Adversarial robustness, which signifies a model’s capability to withstand adversarial examples, remains an area largely unexplored within the SHM field.

In computer vision, there are two approaches for bolstering adversarial robustness: proactive defenses [17, 18, 19], which alter or identify input data during the inference phase, and active defenses [20, 21, 22], which adjust the fundamental framework or learning process during the training phase. Transformations, a typical proactive approach [23, 24, 25], aim to neutralize adversarial effects by applying simple filters. Although cost-effective, this method fares poorly against potent attacks like PGD [26], C&W [27], and DeepFool [28]. To enhance their efficacy, transformation-based defenses introduce randomness [29, 30] and representation [31, 18, 32]. While this augmentation boosts robustness, it often compromises accuracy because it alters the original images in order to discard the adversarial context. Consequently, networks trained on original data may struggle to recognize the distorted information. Adversarial training [12, 26, 33], a widely used active defense strategy, enriches the training process by incorporating adversarial images. This approach enables the network to learn from adversarial instances, enhancing its comprehension of relevant knowledge. Adversarial training employs specific attacks to generate adversarial images. However, because different attacks possess different characteristics, the trained model can remain susceptible to unseen attack methods.

To boost adversarial training, we introduce a novel adversarial training loss based on circle loss [34], which enhances adversarial robustness by constraining the distances between samples within the feature space. By promoting better compactness within classes and larger discrepancies between classes, the circle loss indirectly shifts data points away from the decision boundary. This not only enhances robustness but also prevents the model from overfitting to the specific adversarial examples used during training. Furthermore, we use the circle loss as a regularization term, integrating it into the standard training procedure. The experimental results show that this approach enhances the overall robustness of the trained models.

The main contributions of this paper can be summarized as follows:

1. We delve into the adversarial phenomenon within SHM, conducting a comprehensive analysis of the threats and establishing an adversarial attack threat model specifically tailored for the SHM field.

2. We analyze the adaptability and robustness of applying existing defense methods in SHM and explore potential directions for adversarial machine learning within the SHM domain.

3. We introduce an adversarial training methodology for defense, optimizing the feature distances during training to ensure examples remain distant from the decision boundary. This method significantly improves the adversarial robustness of data-driven SHM models.

The rest of the paper is organized as follows. Section 2 presents the adversarial threats in data-driven SHM. Section 3 introduces our defense method. Section 4 demonstrates the performance of our defense method. Conclusions are drawn in Section 5.

2 Adversarial vulnerability of data-driven SHM

In this section, we first define adversarial vulnerability and the scope of threat models in SHM. Then we introduce the existing adversarial attacks and defense strategies.

2.1 Threat model

Given a trained model $\mathcal{F}$ , an original input data sample $x$ , generating an adversarial example $x^{\mathrm{adv}}$ can generally be described as an optimization problem:

\begin{equation}
\begin{aligned}
\operatorname{minimize}\quad & \mathcal{D}(x,x^{\mathrm{adv}}) \\
\text{such that}\quad & \mathcal{F}(x)\neq\mathcal{F}(x^{\mathrm{adv}})
\end{aligned}
\tag{1}
\end{equation}

where $\mathcal{D}$ is some distance metric. By minimizing the difference between $x^{\mathrm{adv}}$ and $x$ under this constraint, we force the modification to be imperceptible (i.e., $x\approx x^{\mathrm{adv}}$ ) while the model output is different.

Adversary’s knowledge.

According to the adversary’s knowledge we have two different settings, i.e., attacks under white-box settings and black-box settings. White-box attacks assume the adversary possesses complete knowledge about the targeted model, including its structure, parameters, and training data. In contrast, black-box attacks assume that the adversary is only able to observe its output (labels or confidence scores), which is more realistic and aligns better with real-world threat models. White-box attacks are crucial since adversarial examples’ transferability allows white-box attacks to corrupt ML models in black-box settings by transferring the adversarial effects from surrogate models to target models [35, 36].

Adversary’s goal.

Adversarial goals in altering the classifier output are broadly classified as untargeted and targeted attacks. Untargeted attacks aim to change the classification output to any class different from the original, essentially misrepresenting the structural state by minimizing the likelihood of the correct class. Targeted attacks, in contrast, aim to force the output classification into a specific target class, for instance causing a damaged state to be mislabeled as undamaged (a false negative), by maximizing the probability of the desired target label. Targeted attacks require larger perturbations to succeed because less space is available to redirect examples toward a specific label [37].

Adversary’s capability.

In addition to deceiving the model, adversarial examples are supposed to stay stealthy. In computer vision, most work constrains adversarial perturbations by using an $L_{p}$ norm or perceptual metrics [38] as $\mathcal{D}$ . Unlike computer vision models, SHM models receive signals from sensors to evaluate the structure’s state. Such signals are raw measurements that already contain small noise barely recognizable by humans. Thus the distance metric $\mathcal{D}$ for attacking SHM models is supposed to focus on preserving damage-sensitive information in the original samples, which differs in two scenarios: (1) in cases where engineers quickly estimate time-domain features from sensor data visually, $\mathcal{D}$ should preserve these features related to structural damage; (2) in cases where sensor data requires further signal processing and feature extraction, e.g., calculating Frequency Response Functions (FRFs) for intuitive diagnosis, $\mathcal{D}$ should focus on preserving damage-related information, such as FRF peaks and valleys. The choice of the distortion budget should consider the normal noise level in SHM to ensure stealthiness. Inherent noise, stemming from environmental sources and system components, increases engineers’ tolerance for noise and offers opportunities for adversarial perturbations.

2.2 Adversarial attacks

Existing image adversarial example generation methods can be categorized as gradient-based, optimization-based, and generator-based, which provide valuable references for attacking SHM models. Gradient-based methods generate adversarial examples by perturbing along the gradient direction of a differentiable model. FGSM [12], as a first attempt, takes a single step of a given size along the sign of the gradient of the loss function. Techniques like the Basic Iterative Method (BIM) [39] extend FGSM by applying it iteratively. In black-box settings, Zeroth Order Optimization (ZOO) [40] and other gradient-based methods like [41, 42, 43, 44] estimate the gradients directly to generate adversarial examples. Optimization-based methods, like the C&W method [27], frame adversarial example generation as an optimization problem. The C&W method supports different objective functions $f$ , and users can choose the most effective objective function through experiments. Generator-based methods like AdvGAN [45] use parameterized generative adversarial networks (GAN) to create attacks, where GANs’ generators and discriminators compete to generate high-quality adversarial examples.
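To make the gradient-based family concrete, the following is a minimal sketch of FGSM and BIM on a toy linear softmax classifier. The model, its random weights, the input size, and the values of `eps`, `step`, and `n_iter` are all illustrative assumptions, not the SHM classifiers or settings used in this paper.

```python
import numpy as np

def softmax(z):
    z = z - z.max()                 # stabilise the exponentials
    e = np.exp(z)
    return e / e.sum()

def cross_entropy(W, b, x, y):
    """Cross-entropy loss of a linear softmax classifier for one sample."""
    return -np.log(softmax(x @ W + b)[y])

def input_gradient(W, b, x, y):
    """Gradient of the cross-entropy loss with respect to the input x."""
    p = softmax(x @ W + b)          # class probabilities, shape (C,)
    p[y] -= 1.0                     # dL/dlogits = p - onehot(y)
    return W @ p                    # chain rule through logits = x @ W + b

def fgsm(W, b, x, y, eps):
    """One step of size eps along the sign of the input gradient."""
    return x + eps * np.sign(input_gradient(W, b, x, y))

def bim(W, b, x, y, eps, step, n_iter):
    """Iterated signed-gradient steps, clipped to the L-inf ball of radius eps."""
    x_adv = x.copy()
    for _ in range(n_iter):
        x_adv = x_adv + step * np.sign(input_gradient(W, b, x_adv, y))
        x_adv = np.clip(x_adv, x - eps, x + eps)
    return x_adv

# Toy stand-in for a trained model (hypothetical): 20 inputs, 4 classes.
rng = np.random.default_rng(0)
W, b = rng.normal(size=(20, 4)), np.zeros(4)
x = rng.normal(size=20)
y = int(np.argmax(x @ W + b))       # use the model's own prediction as the label
eps = 0.1
x_fgsm = fgsm(W, b, x, y, eps)
x_bim = bim(W, b, x, y, eps, step=eps / 4, n_iter=10)
```

Both perturbed inputs stay within the $L_{\infty}$ budget `eps` of `x`, and for this convex toy loss neither attack can decrease the loss. Whether the predicted label actually flips depends on the model and the budget.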

2.3 Defense strategies

Within computer vision, there are two strategies to enhance adversarial robustness: proactive defenses, which alter or identify input data during inference, and active defense, which modifies the core framework or learning process during training. To further classify existing approaches, proactive defense comprises data preprocessing and adversarial detection, while active defense includes gradient masking and adversarial training. Data preprocessing involves techniques that effectively compress and transform input data to mitigate the impact of adversarial noise [46, 17, 18, 19, 21]. Adversarial detection consists of methods that identify adversarial examples before feeding data into the model [47, 48]. Gradient masking refers to techniques that obscure model gradients during inference to hinder the direct construction of adversarial examples against the model [49, 50]. Adversarial training involves training the network using adversarial examples to fortify it against attacks [51, 52].

Proactive defenses aim to boost adversarial resilience without changing the model, relying on transformations or detection techniques. However, this strategy often sacrifices accuracy to prioritize robustness. Additionally, the defense strategies commonly used in computer vision aren’t easily adaptable to SHM models. Several data preprocessing techniques predominantly adjust image-specific traits like bit depth and pixel values, which are unsuitable for SHM data. Similarly, methods solely altering frequency and vibration amplitude lack effectiveness in fortifying SHM systems against adversarial attacks. Moreover, employing an adversarial detection approach adds computational overhead, negatively impacting real-time performance. Consequently, we prioritize active defenses that involve modifying the model or its learning process to strike a better balance between accuracy and robustness within SHM systems.

In the context of masking gradients, Randomized Smoothing (RS) [53] introduces random noise around input data points, creating a smoothed model that mitigates the impact of small perturbations on predictions. Employing Monte Carlo methods allows for estimating prediction uncertainties and establishing a certified radius around data points, ensuring stable predictions within defined bounds and further strengthening the model’s robustness. In contrast, Distillation [49] involves training a student model by leveraging insights from a more accurate teacher model. In this Distillation Defense, both the student and teacher models share the same structure. By learning from the teacher’s softer outputs (probabilities or logits), the student aims to replicate the teacher’s behavior, enhancing its resilience against adversarial attacks. However, gradient masking techniques diminish model accuracy and prove ineffective against attacks that do not rely on gradients. In this context, adversarial training stands out as the most appropriate defense for SHM systems.

2.3.1 Adversarial training

Adversarial training can be traced back to [12], in which models were strengthened by producing adversarial examples and injecting them into training data. Later, Shaham et al. [54] proposed a formulation of adversarial training, which has been theoretically and empirically justified. Resembling a game between the attacker and the defender, adversarial training can be formulated as a minimax optimization problem as

\begin{equation}
\min_{\theta}\sum_{i=1}^{m}\max_{\tilde{x}_{i}\in\mathcal{U}_{i}}\mathcal{L}(\theta,\tilde{x}_{i},y_{i})
\tag{2}
\end{equation}

in which $\mathcal{U}_{i}$ is the uncertainty set corresponding to sample $x_{i}$ , $y_{i}$ denotes the label of $x_{i}$ , and $\mathcal{L}(\cdot)$ is a classification loss (e.g., cross-entropy). Madry et al. [26] give a reasonable interpretation of this formulation: the inner problem aims at generating adversarial examples by maximizing the training loss, while the outer one guides the network in the direction that minimizes the loss to resist attacks. With this connection, they use the adversarial examples generated by the Projected Gradient Descent (PGD) attack method as a solution for the inner problem. We refer to this form of adversarial training as PGD-based adversarial training (PGD-AT). In extensive experiments, their approach significantly increases the adversarial robustness of deep learning models against a wide range of attacks, which makes it a milestone among adversarial training methods. To improve efficiency, Fast adversarial training (Fast-AT) [22] employs the FGSM method with random initialization to generate adversarial examples. It aims to improve the model’s robustness against adversarial attacks while reducing the computational overhead of generating and incorporating adversarial examples during training.
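As a reference for how the inner maximization in Eq. (2) is commonly approximated, one widely used form of the PGD iteration is written out below. The step size $\alpha$ , the iteration index $t$ , the projection operator $\Pi$ , and the choice of an $L_{\infty}$ ball of radius $\epsilon$ as the uncertainty set are assumptions of this sketch and are not fixed by the text above:

\begin{equation*}
\tilde{x}_{i}^{(t+1)}=\Pi_{\mathcal{U}_{i}}\left(\tilde{x}_{i}^{(t)}+\alpha\,\operatorname{sign}\left(\nabla_{\tilde{x}}\mathcal{L}(\theta,\tilde{x}_{i}^{(t)},y_{i})\right)\right),\qquad\mathcal{U}_{i}=\left\{\tilde{x}:\|\tilde{x}-x_{i}\|_{\infty}\leq\epsilon\right\}
\end{equation*}

Here $\Pi_{\mathcal{U}_{i}}$ maps a point back onto the nearest point of $\mathcal{U}_{i}$ , which for the $L_{\infty}$ ball amounts to clipping each coordinate to $[x_{i}-\epsilon,x_{i}+\epsilon]$ . The iteration is typically started from $x_{i}$ or from a random point inside $\mathcal{U}_{i}$ .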

Figure 1: A conceptual illustration of decision boundaries after different training. Each circle represents a sample and its adversarial space within a perturbation budget $\epsilon$ . In standard training, samples of different classes can be easily separated by a simple decision boundary, but this simple decision boundary cannot separate samples with adversarial perturbations, so some adversarial examples (marked by red stars) are misclassified. Conventional adversarial training learns a more complex decision boundary and separates samples within a certain perturbation budget, but it is powerless for samples with a larger perturbation budget. Our method can withstand larger perturbation budgets.

A common assumption about adversarial phenomena is that data points reside near the model’s decision boundary. Consequently, minor perturbations can push these points across the boundary, weakening the robustness of classification models. Adversarial training widens the gap between data points and the decision boundary, making it difficult for minor perturbations to shift data points to the opposite side, as depicted in Fig. 1(a)(b). This accounts for the model’s increased robustness after adversarial training. However, it also means that the effectiveness of adversarial training is constrained by the distortion levels of the adversarial examples used during training. By raising the distortion budget at inference time, an attacker can compromise the effectiveness of the adversarially trained model.

To further strengthen adversarial training, we propose an enhancement integrating techniques from metric learning. Our aim is to optimize feature distances, providing a more effective defense against adversarial attacks. To augment adversarial training, related studies [55, 56, 57, 58] impose regularization constraints on the discrepancy between the output of adversarial examples and their correct labels. In contrast, our method employs regularization terms to ensure discrimination between classes in the feature space. Adversarial training based on triplet loss [59, 60] applies metric learning methods to optimize feature distances between samples through similarity pairs, i.e., triplets. Our approach implements circle loss for adversarial training, a previously unexplored combination, eliminating the need to construct triplets and to pre-train a model beforehand.

3 Methodology

This section begins by presenting the concept of circle loss. Then we introduce how to apply circle loss with adversarial training.

3.1 Circle loss

Metric learning can be divided into two basic paradigms based on different optimization goals: learning with class-level labels and learning with pair-wise labels. Given class labels, the first paradigm mainly learns to classify each training sample into its target category through the softmax classification loss function. Given pair-wise labels, the second paradigm directly learns the pairwise similarity in the feature space, generally using contrastive loss, triplet loss, etc. Given a single sample $x$ in the feature space, assume that there are $K$ within-class samples and $L$ between-class samples associated with $x$ . Class-level label learning and pair-wise label learning can then be unified with the following loss:

\begin{equation}
\begin{aligned}
\mathcal{L}_{\mathrm{uni}} &= \log\left[1+\sum_{i=1}^{K}\sum_{j=1}^{L}\exp\left(\gamma\left(s_{n}^{j}-s_{p}^{i}+m\right)\right)\right] \\
&= \log\left[1+\sum_{j=1}^{L}\exp\left(\gamma\left(s_{n}^{j}+m\right)\right)\sum_{i=1}^{K}\exp\left(\gamma\left(-s_{p}^{i}\right)\right)\right]
\end{aligned}
\tag{3}
\end{equation}

in which ${\left\{s_{p}^{1},s_{p}^{2},\cdots,s_{p}^{K}\right\}}$ are the intra-class similarity scores between $x$ and its within-class samples, $\left\{s_{n}^{1},s_{n}^{2},\cdots,s_{n}^{L}\right\}$ are the inter-class similarity scores between $x$ and its between-class (heterogeneous) samples, $\gamma$ is the scaling factor, and $m$ is the margin, which represents the gap between the intra-class distance and the inter-class distance. It can be intuitively seen that this formula reduces $(s_{n}^{j}-s_{p}^{i})$ by traversing each similarity pair. In addition, with adjustments, this formula can also degenerate into a classification loss in class-level label learning or a triplet loss in pair-wise label learning.
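The two forms of the unified loss are equal because the exponential of a sum factorizes into a product of exponentials. The short numerical check below illustrates this, with the scale factor $\gamma$ applied inside the exponential in both forms. The similarity scores and the values of `gamma` and `m` are made-up illustrative numbers, not settings from this paper.

```python
import numpy as np

def unified_loss_pairwise(s_p, s_n, gamma, m):
    """Double-sum form: one exponential per (within-class, between-class) pair."""
    diff = s_n[None, :] - s_p[:, None] + m       # shape (K, L)
    return np.log1p(np.exp(gamma * diff).sum())

def unified_loss_factored(s_p, s_n, gamma, m):
    """Factored form: product of a between-class sum and a within-class sum."""
    neg = np.exp(gamma * (s_n + m)).sum()
    pos = np.exp(-gamma * s_p).sum()
    return np.log1p(neg * pos)

# Hypothetical similarity scores for one sample x (K = 3, L = 2).
s_p = np.array([0.9, 0.7, 0.8])
s_n = np.array([0.1, 0.3])
loss_pairwise = unified_loss_pairwise(s_p, s_n, gamma=2.0, m=0.25)
loss_factored = unified_loss_factored(s_p, s_n, gamma=2.0, m=0.25)
```

The factored form needs only $K+L$ exponentials instead of $K\times L$ , which is why it is the more convenient one to implement.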

Circle loss was proposed by Sun et al. [34] in the field of metric learning in 2020. It is based on equation (3) but allows each similarity score to adjust its weight according to its current optimization status, which gives a high degree of optimization flexibility and a more definite convergence state. The formula for circle loss is as follows

\begin{equation}
\begin{aligned}
\mathcal{L}_{\mathrm{circle}} &= \log\left[1+\sum_{i=1}^{K}\sum_{j=1}^{L}\exp\left(\gamma\left(\alpha_{n}^{j}s_{n}^{j}-\alpha_{p}^{i}s_{p}^{i}\right)\right)\right] \\
&= \log\left[1+\sum_{j=1}^{L}\exp\left(\gamma\alpha_{n}^{j}s_{n}^{j}\right)\sum_{i=1}^{K}\exp\left(-\gamma\alpha_{p}^{i}s_{p}^{i}\right)\right]
\end{aligned}
\tag{4}
\end{equation}

in which, $\alpha_{n}^{j}$ and $\alpha_{p}^{i}$ are non-negative weighting factors. It can be seen that compared with equation (3), equation (4) removes the margin $m$ , and uses $\alpha_{n}^{j}s_{n}^{j}-\alpha_{p}^{i}s_{p}^{i}$ instead of $(s_{n}^{j}-s_{p}^{i})$ . During the training process, the gradient for $\alpha_{n}^{j}s_{n}^{j}-\alpha_{p}^{i}s_{p}^{i}$ will be multiplied by the weights $\alpha_{n}^{j}$ and $\alpha_{p}^{i}$ respectively when backpropagating to $s_{n}^{j}$ and $s_{p}^{i}$ . When a similarity score moves away from its optimal value, its weight $\alpha$ will increase, allowing it to be updated at a larger pace. The adjustment methods of $\alpha_{n}^{j}$ and $\alpha_{p}^{i}$ are defined as

\begin{equation}
\begin{aligned}
\alpha_{n}^{j} &= \left[s_{n}^{j}-O_{n}\right]_{+} \\
\alpha_{p}^{i} &= \left[O_{p}-s_{p}^{i}\right]_{+}
\end{aligned}
\tag{5}
\end{equation}

in which $[\cdot]_{+}$ denotes truncation at zero, that is, values less than zero are set to zero to ensure that $\alpha_{n}^{j}$ and $\alpha_{p}^{i}$ are non-negative. $O_{n}$ is the optimum of $s_{n}^{j}$ , and $O_{p}$ is the optimum of $s_{p}^{i}$ . Under the cosine similarity metric, the target of $s_{p}$ is 1 and the target of $s_{n}$ is 0, so we can set $O_{n}=-m$ , $O_{p}=1-m$ , where $m$ is the margin.

Rescaling cosine similarity is a common practice in class-level label learning. In traditional methods, all similarity scores share the same scaling factor $\gamma$ . Unlike traditional methods, circle loss multiplies each similarity score by an independent weighting factor before rescaling. This approach removes the restriction of equal rescaling and provides greater flexibility in the optimization process. In addition to better optimization effects, this reweighting strategy enables circle loss to retain a similarity-pair optimization perspective, making it compatible with both class-level and pair-wise label learning. Circle loss achieves both better within-class compactness and better between-class discrepancy (on the training set), which we believe indicates better optimization.
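The weighting scheme described in this subsection can be sketched for a single anchor sample as follows. The sketch follows the weights and optima as stated in the text ( $O_{n}=-m$ , $O_{p}=1-m$ ) and applies the scale factor $\gamma$ to both sums, which is an assumption of this sketch. The function names, the default values of `gamma` and `m`, and the toy inputs are hypothetical and are not the authors' implementation or settings.

```python
import numpy as np

def logsumexp(a):
    """Numerically stable log(sum(exp(a)))."""
    a_max = a.max()
    return a_max + np.log(np.exp(a - a_max).sum())

def circle_loss(s_p, s_n, gamma=32.0, m=0.25):
    """Circle loss for one anchor.

    s_p: similarities to within-class samples, shape (K,)
    s_n: similarities to between-class samples, shape (L,)
    gamma, m: illustrative values only.
    """
    o_n, o_p = -m, 1.0 - m                       # optima as stated in the text
    alpha_n = np.maximum(s_n - o_n, 0.0)         # [s_n - O_n]_+
    alpha_p = np.maximum(o_p - s_p, 0.0)         # [O_p - s_p]_+
    neg = logsumexp(gamma * alpha_n * s_n)       # log of the between-class sum
    pos = logsumexp(-gamma * alpha_p * s_p)      # log of the within-class sum
    return np.logaddexp(0.0, neg + pos)          # log(1 + exp(neg + pos))

def pair_similarities(features, labels, anchor):
    """Cosine similarities between one anchor and the rest of the batch."""
    f = features / np.linalg.norm(features, axis=1, keepdims=True)
    sims = f @ f[anchor]
    others = np.arange(len(labels)) != anchor
    same = labels == labels[anchor]
    return sims[others & same], sims[others & ~same]

# Well-separated versus poorly separated similarity scores (made-up numbers).
loss_good = circle_loss(np.array([0.9, 0.8]), np.array([0.0, 0.1]))
loss_bad = circle_loss(np.array([0.2, 0.3]), np.array([0.7, 0.8]))

# Similarities for anchor 0 in a tiny hypothetical feature batch.
features = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]])
labels = np.array([0, 0, 1])
s_p, s_n = pair_similarities(features, labels, anchor=0)
```

In this toy comparison the poorly separated scores give the larger loss, which is the behavior the reweighting is meant to produce: scores far from their optimum receive larger weights and therefore larger gradients.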

3.2 Adversarial training with circle loss

To achieve adversarial robustness, we utilize a cross-entropy loss that encourages the model’s predictions for adversarial samples to stay close to the true labels, and we apply the circle loss to enhance the adversarial training effect, i.e., minimizing the within-class distance and maximizing the between-class distance. Thus the loss function of our algorithm is formulated as

\begin{equation}
\mathcal{L}=\mathcal{L}_{\mathrm{CE}}(\mathcal{F}(x^{\mathrm{adv}}),y)+\beta\cdot\mathcal{L}_{\mathrm{circle}}(\phi(x^{\mathrm{adv}}))
\tag{6}
\end{equation}

where $\mathcal{L}_{\mathrm{CE}}(\cdot)$ is the cross-entropy loss, $\mathcal{F}(x^{\mathrm{adv}})$ is the output vector of the learning model (with the softmax operator in the last layer), $\mathcal{L}_{\mathrm{{circle}}}(\cdot)$ is the circle loss, $\phi(x^{\mathrm{adv}})$ is the features of the samples extracted by the learning model, and $\beta>0$ is a scaling parameter that balances the two parts of the final loss.

The first term in Eq. (6) promotes the prediction accuracy of the model on adversarial examples by minimizing the difference between $\mathcal{F}(x^{\mathrm{adv}})$ and $y$ , while the second, regularization term enhances adversarial robustness: it further pushes the decision boundary of the classifier away from the sample instances by minimizing the within-class distance and maximizing the between-class distance. The conceptual illustration is shown in Fig. 1.

Algorithm 1 Adversarial training with circle loss

1: Initialize network $f(\theta)$ ;

2: repeat

3: Read mini-batch $X$ from training set;

4: Construct $X^{\mathrm{adv}}$ against $f(\theta)$ for each instance in $X$ , with the adversarial attack method chosen from FGSM, PGD and C&W;

5: Train $f(\theta)$ with $X^{\mathrm{adv}}$ using Eq. (6);

6: until training has converged

The pseudocode of the adversarial training procedure is displayed in Algorithm 1. In Line 4, we construct adversarial examples for adversarial training. As described in Eq. (2), the cost of adversarial training is dominated by solving the inner maximization problem. From this perspective, the success of learning a robust classifier depends on the quality of the inner maxima $x^{\mathrm{adv}}$ . Unfortunately, direct optimization on Eq. (2) is practically intractable due to the challenges in optimizing (nonconcave) inner maximization over all training data. In practice, we instead approximate the optimal adversary with a local maximum $x^{\mathrm{adv}}$ . The most commonly used inner maximizer is the PGD algorithm, which can quickly generate aggressive adversarial examples. However, even if one method is chosen to generate high-quality local maxima $x^{\mathrm{adv}}$ , the trained model may not be as robust to the adversarial examples generated by other adversarial methods. So, we attempt to combine different attack methods in adversarial training to increase the robustness. We consider three methods for generating adversarial examples, FGSM, PGD, and C&W. These three methods also represent single-step attacks, iterative attacks, and optimization attacks, respectively. This increases the diversity of perturbations during training and can improve the robustness not only under a known type of attack but also under an unknown type of attack. Once we have the adversarial examples, we train the network by minimizing Eq. (6) as Line 5 of Algorithm 1.

4 Experiment

In this section, we first introduce the SHM task, including datasets, and classifiers. Following this, we evaluate the performance of our method in scenarios involving white-box attacks, black-box attacks, and Gaussian noise. Subsequently, we conduct ablation experiments to investigate the impact of circle loss on model robustness.

4.1 SHM datasets and classifiers

We selected two SHM datasets and established two distinct models based on their dataset characteristics: a simple ANN representing conventional machine learning methods and a complex network representing deep learning methods. Below are detailed descriptions of the datasets and classifiers used in this study.

4.1.1 Three-span continuous rigid frame bridge (TCRF bridge) scale model

We use a scale bridge model of the real TCRF bridge where the primary bridge structure, bridge pier, and bridge abutment are constructed following the same scaling ratio of 1:20 [4]. The stiffness degradation of the bridge structure is simulated by applying the concentrated force in the span of the continuous rigid frame bridge to make floor cracks. We use these cracks to represent the structural damages. In our experiment settings, we have 4 kinds of structural damage states, which are shown in Table 1.

Table 1: Damage conditions (DC) of TCRF bridge scale model.

Label	Descriptions
DC0	No damage in the scale bridge
DC1	One crack in the scale bridge
DC2	Two cracks in the scale bridge
DC3	Two larger cracks in the scale bridge

Dynamic characteristics of the structure, such as the mode shape and natural frequency, change when a certain part of the structure is damaged. The vibration response acceleration contains rich structural dynamic information, reflects the overall damage to the structure well, and is relatively easy to collect in actual engineering. Therefore, to monitor the degradation of the structural state, 18 acceleration sensors have been installed on the scale model, including 12 vertical measuring points at the bottom of the beam and 6 horizontal measuring points on the web, as shown in Fig. 3.

We tow a 0.3 kg scale car across the bridge deck in 15 seconds to simulate the dynamic load process under the four structural damage states, and the acceleration is recorded by the sensors at a sampling frequency of 8192 Hz. Fig. 2 shows that the data collected by the acceleration sensors changes significantly when the car passes under different conditions. Professionals do not need to process the data further; they can directly distinguish the state conditions from these acceleration data to assess the health status of the structure.

To obtain the best structural damage identification performance, our network structure borrows the parallel Convolutional Neural Network and Bidirectional Gated Recurrent Unit (PCBG) framework in [5], which is currently a state-of-the-art network framework on this dataset. The network structure and parameters are shown in Fig. 4. The total number of data samples was 270000, each containing 320 data points (20 values for each of the 18 sensors), all of which were fed into this network for damage recognition classification. The network was trained by dividing the dataset into a training set and a validation set in the ratio of 7:3, and the network was able to achieve an accuracy of 99.51% on the entire dataset.

The neural network exhibits robust accuracy on the dataset, maintaining a 94.03% accuracy even when the dataset is augmented with Gaussian noise of mean amplitude 0.003. However, when adversarial noise generated by the BIM adversarial attack with a perturbation magnitude of 0.003 is introduced into the dataset, the model’s accuracy drops drastically to 26.75% (as depicted in a typical case in Fig. 5). This demonstrates a significantly more severe impact of adversarial noise compared to Gaussian noise at an equivalent level, highlighting the heightened threat posed by adversarial perturbations.
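One intuition for this gap can be shown on a toy linear score, sketched below. A worst-case perturbation moves every input coordinate in the direction that changes the score the most, whereas the contributions of random noise largely cancel. The linear score, its random weight vector `w`, and the reading of the noise level as a standard deviation are assumptions made for illustration; this is not the PCBG network or the experiment reported above.

```python
import numpy as np

rng = np.random.default_rng(0)
d, eps = 320, 0.003                 # input size and noise level quoted in the text
w = rng.normal(size=d)              # hypothetical gradient of a score w.r.t. the input

# Worst-case L-inf perturbation: every coordinate moves eps along sign(w),
# so the score changes by eps * ||w||_1.
adv_shift = eps * np.abs(w).sum()

# Gaussian noise with standard deviation eps: coordinates move independently,
# so most contributions cancel out in the dot product.
noise = rng.normal(scale=eps, size=(2000, d))
gauss_shift = np.abs(noise @ w).mean()

ratio = adv_shift / gauss_shift     # grows roughly like sqrt(d)
```

In this toy setting the adversarial shift exceeds the typical Gaussian shift by more than an order of magnitude at the same per-coordinate magnitude, which matches the direction of the effect observed above.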

Table 2: Structural damage states of the three-storey structure.

Label	Description
State #0	Baseline condition
State #1	Mass=1.2kg at the base
State #2	Mass=1.2kg on the 1st floor
State #3	87.5% stiffness reduction in column 1BD
State #4	87.5% stiffness reduction in column 1AD and 1BD
State #5	87.5% stiffness reduction in column 2BD
State #6	87.5% stiffness reduction in column 2AD and 2BD
State #7	87.5% stiffness reduction in column 3BD
State #8	87.5% stiffness reduction in column 3AD and 3BD
State #9	Gap=0.20mm
State #10	Gap=0.15mm
State #11	Gap=0.13mm
State #12	Gap=0.10mm
State #13	Gap=0.05mm
State #14	Gap=0.20mm and mass=1.2kg at the base
State #15	Gap=0.20mm and mass=1.2kg on the 1st floor
State #16	Gap=0.10mm and mass=1.2kg on the 1st floor

4.1.2 Los Alamos National Laboratory (LANL) three-storey structure

The basic dimensions of the three-storey structure [61] are shown in Fig. 6. There are 17 structural state conditions, which are described in Table 2. For example, the state condition labeled "State #3" is described as "87.5% stiffness reduction in column 1BD", which means that the stiffness of the column located between the base and the first floor at the intersection of planes B and D, as defined in Fig. 6, was reduced by 87.5%. The "gap" mentioned in the descriptions of States #9–#16 is the distance between the bumper and the suspended column; it is variable and is used to introduce different levels of nonlinearity for a given level of excitation. The structure is excited with a band-limited (20–150 Hz) Gaussian forcing signal by an electrodynamic shaker attached to the base, and one sensor per layer is used to collect the force and acceleration histories under the various structural state conditions. Each data sample for SHM contains the excitation signal as well as the acceleration responses of the base, the 1st floor, the 2nd floor, and the 3rd floor of the structure.

In Fig. 7, State #0 is the undamaged condition and State #16 is a damaged condition, yet their force and acceleration history plots are too similar to be distinguished by eye. Since State #0 and State #16 are two completely different state conditions, a useful feature should make their difference large enough to be noticed. We therefore use the ratio of acceleration to excitation force to extract the frequency response functions (FRFs) of the data. FRFs are frequently used as damage and condition-sensitive features in SHM as they encode physical dynamic properties and can be efficiently computed. Their peaks and valleys have been revealed to be important features for damage characterization. To a trained engineer, the FRFs of a structure are approximately tractable visually, and structural damage information can be easily derived from the offset of their peaks and valleys [62]. As shown in Fig. 7b, the normalized FRFs of different state conditions show shifts of the resonance frequencies as well as distortions of the FRF shape caused by damage. The FRFs were estimated for each layer of each sample using the non-overlapping Welch method with Hanning windows and five-fold averaging. After FRF extraction, the classification task is much lighter and can be handled by a simple ANN. We set up a three-layer ANN consisting of an input layer, a hidden layer with 17 nodes, and an output layer. The dataset contains 1700 samples with 1640 data points per sample. The dataset was divided into a training set and a validation set in the ratio of 7:3, and the trained network achieved an accuracy of 99.60% on the entire dataset.
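The FRF extraction step can be illustrated with a short, dependency-free sketch. It is not the processing code used in this study: it assumes an H1-type estimator (averaged cross-spectrum divided by averaged force auto-spectrum), uses a naive DFT for brevity, and the names `hann`, `dft`, `frf_h1` and the toy signals are hypothetical.

```python
import cmath
import math
import random


def hann(n):
    """Symmetric Hann (Hanning) window of length n."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]


def dft(x):
    """Naive O(n^2) one-sided DFT; adequate for a small illustration."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n // 2 + 1)
    ]


def frf_h1(force, accel, n_avg=5):
    """H1 FRF estimate from n_avg non-overlapping, Hann-windowed segments."""
    seg = len(force) // n_avg
    window = hann(seg)
    n_bins = seg // 2 + 1
    s_ff = [0.0] * n_bins  # accumulated force auto-spectrum
    s_fa = [0j] * n_bins   # accumulated force/acceleration cross-spectrum
    for i in range(n_avg):
        f = dft([window[t] * force[i * seg + t] for t in range(seg)])
        a = dft([window[t] * accel[i * seg + t] for t in range(seg)])
        for k in range(n_bins):
            s_ff[k] += abs(f[k]) ** 2
            s_fa[k] += f[k].conjugate() * a[k]
    # The 1/n_avg averaging factor cancels in the ratio.
    return [s_fa[k] / s_ff[k] for k in range(n_bins)]


# Toy check (assumption): a "structure" that is a pure gain of 2.
rng = random.Random(0)
force = [rng.gauss(0.0, 1.0) for _ in range(5 * 64)]
accel = [2.0 * f for f in force]
frf = frf_h1(force, accel)
magnitudes = [abs(h) for h in frf]
```

For the toy gain the estimated FRF magnitude is 2 in every frequency bin. For a real structure the magnitude would show resonance peaks whose shifts indicate damage.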

Table 3: Natural accuracy and parameter settings of defense models.

		Natural Accuracy (%)	Parameters
TCRF Bridge	RS	97.94	Failure probability = 0.1 and noise level hyperparameter = 0.003
	Distillation	98.93	Temperature = 2048 and balancing factor = 0.4
	Fast-AT	98.86	Radius = 0.003 and step size = 0.0045
	PGD-AT	94.81	PGD with 20 steps and step size = 0.0003
LANL Structure	RS	99.51	Failure probability = 0.1 and noise level hyperparameter = 0.1
	Distillation	98.29	Temperature = 2 and balancing factor = 0.85
	Fast-AT	97.59	Radius = 0.5 and step size = 0.75
	PGD-AT	98.47	PGD with 20 steps and step size = 0.005

The neural network exhibits robust accuracy on the dataset, maintaining a 98.94% accuracy even when the dataset is augmented with Gaussian noise of mean amplitude 0.1. However, when adversarial noise generated by the BIM adversarial attack with a perturbation magnitude of 0.1 is introduced into the dataset, the model’s accuracy drops drastically to 6.53% (as depicted in a typical case in Fig. 8). This demonstrates a significantly more severe impact of adversarial noise compared to Gaussian noise at an equivalent level, highlighting the heightened threat posed by adversarial perturbations.

4.2 Performance under white-box attacks

In our approach, $\mathcal{L}_{\mathrm{circle}}$ operates on the penultimate layer features. The subsequent transformation of the penultimate layer consists only of a linear layer and a softmax layer, which ensures that small fluctuations in the embedding result only in a monotonic adjustment to the output, controlled by some tractable Lipschitz constant [59, 20]. The penultimate layer also tends to preserve more information than the logit layer. For the TCRF bridge dataset, we set the proportion of adversarial examples in training to PGD:FGSM:C&W = 3:1:1 and used Equation 6 with a scaling parameter $\beta$ of 0.01. The resulting model achieved an accuracy of 96.98% on the clean dataset. For the LANL structure dataset, the proportion was set to PGD:FGSM:C&W = 1:3:1, and Equation 6 was used with a scaling parameter $\beta$ of 0.1. The resulting model attained an accuracy of 95.88% on the clean dataset.
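For readers unfamiliar with circle loss, the following sketch implements the generic pair-similarity form from Sun et al. [34] for a single anchor feature. It is an illustration only: it does not reproduce Equation 6 or the weighting by $\beta$, and the default `margin` and `gamma` values, the function names, and the toy similarities are assumptions.

```python
import math


def logsumexp(values):
    """Numerically stable log(sum(exp(v)))."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def softplus(z):
    """Numerically stable log(1 + exp(z))."""
    return max(z, 0.0) + math.log1p(math.exp(-abs(z)))


def circle_loss(sim_pos, sim_neg, margin=0.25, gamma=64.0):
    """Circle loss for one anchor.

    sim_pos: cosine similarities to features of the same class.
    sim_neg: cosine similarities to features of other classes.
    """
    o_p, o_n = 1.0 + margin, -margin         # optima for s_p and s_n
    delta_p, delta_n = 1.0 - margin, margin  # decision margins
    # Each similarity is re-weighted by its distance from its optimum.
    logit_p = [-gamma * max(o_p - s, 0.0) * (s - delta_p) for s in sim_pos]
    logit_n = [gamma * max(s - o_n, 0.0) * (s - delta_n) for s in sim_neg]
    # log(1 + sum_n exp(logit_n) * sum_p exp(logit_p))
    return softplus(logsumexp(logit_n) + logsumexp(logit_p))


# Toy similarities (assumptions): a well-separated anchor and a confused one.
loss_separated = circle_loss(sim_pos=[0.90, 0.95], sim_neg=[0.05, 0.10])
loss_confused = circle_loss(sim_pos=[0.30], sim_neg=[0.70])
```

The loss is close to zero when same-class features are similar and different-class features are dissimilar, and it grows quickly when the two are confused. This is the pressure that pushes features away from the decision boundary.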

We confirm the efficacy of our defense method by conducting identical experimental procedures on the aforementioned TCRF bridge and LANL structure datasets. Initially, we compare the defense models trained by our approach against the original models mentioned in Section 4.1 under BIM and FGSM attacks, presenting the outcomes in Table 4.

Table 4: Comparison of the accuracy of the model trained using our proposed defense method and the original model under FGSM and BIM attack

		Standard		Our
		FGSM	BIM	FGSM	BIM
TCRF Bridge	$\epsilon=0.001$	65.11	57.42	94.38	90.56
	$\epsilon=0.003$	38.14	26.75	86.98	81.08
	$\epsilon=0.005$	27.46	14.76	65.63	57.15
LANL Structure	$\epsilon=0.05$	61.53	58.59	89.35	85.88
	$\epsilon=0.1$	13.76	6.53	77.18	76.47
	$\epsilon=0.15$	0.29	0.00	57.41	54.43

As the perturbation magnitude $\epsilon$ increases, the attack becomes more potent and model accuracy declines faster. Our defense model is more resilient than the standard model at every perturbation level in Table 4. Particularly noteworthy is that our method retains more than half of its clean accuracy on the LANL structure dataset at $\epsilon=0.15$ (57.41% under FGSM and 54.43% under BIM), while the accuracy of the original model drops to nearly zero (0.29% and 0.00%). The gap is also large at the smallest perturbations: at $\epsilon=0.001$ on the TCRF bridge dataset, our model reaches 94.38% under FGSM and 90.56% under BIM, compared with 65.11% and 57.42% for the standard model. These findings underscore the effectiveness of our method in strengthening adversarial robustness.
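To make the role of $\epsilon$ concrete, the sketch below applies FGSM and BIM to a toy logistic classifier. It is not the attack code used in the experiments: the model, its weights, the input, and the BIM step size $\epsilon$ divided by the number of steps are all assumptions chosen for illustration.

```python
import math


def sign(v):
    """Return -1, 0, or 1 according to the sign of v."""
    return (v > 0) - (v < 0)


def predict(w, b, x):
    """Hard label of a logistic model: 1 if w.x + b > 0, else 0."""
    return int(sum(wi * xi for wi, xi in zip(w, x)) + b > 0)


def input_gradient(w, b, x, y):
    """Gradient of the binary cross-entropy loss with respect to the input x."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    p = 1.0 / (1.0 + math.exp(-z))
    return [(p - y) * wi for wi in w]


def fgsm(w, b, x, y, eps):
    """Single step of size eps along the sign of the input gradient."""
    g = input_gradient(w, b, x, y)
    return [xi + eps * sign(gi) for xi, gi in zip(x, g)]


def bim(w, b, x, y, eps, steps=10):
    """Iterated small steps, each projected back into the eps-ball around x."""
    alpha = eps / steps  # assumed step size
    x_adv = list(x)
    for _ in range(steps):
        g = input_gradient(w, b, x_adv, y)
        x_adv = [xa + alpha * sign(gi) for xa, gi in zip(x_adv, g)]
        x_adv = [min(max(xa, xi - eps), xi + eps) for xa, xi in zip(x_adv, x)]
    return x_adv


# Toy model and input (assumptions).
W, B = [1.0, -2.0, 0.5], 0.0
x_clean, y_true, EPS = [0.02, -0.01, 0.01], 1, 0.03
x_fgsm = fgsm(W, B, x_clean, y_true, EPS)
x_bim = bim(W, B, x_clean, y_true, EPS)
```

In this toy case both attacks flip the prediction from class 1 to class 0 while changing each input value by at most `EPS`, because every coordinate is moved in the direction that increases the loss.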

To rigorously assess the efficacy of our defense approach, we subsequently compare it against four well-established and effective defense methodologies introduced in Section 2.3, i.e., Randomized Smoothing (RS), Distillation, Fast adversarial training (Fast-AT) and PGD-based adversarial training (PGD-AT).

For all methods, we use the same network architectures as specified in Section 4.1 and tune the parameters for the best performance. The natural accuracy and parameter settings of these defense models are shown in Table 3.

In addition to BIM and FGSM, which are gradient-based attack methods, we incorporate C&W and AdvGAN attacks, which are based on optimization and generator techniques, respectively. All attacks have full access to the model parameters and are constrained by the same perturbation limit $\epsilon$. The added adversarial noise maintains a signal-to-noise ratio of 25–50 dB relative to the original sample, in line with typical noise levels in SHM systems [63, 64, 65]. For the C&W attack in the TCRF bridge scenario, the number of iterations is 1000, the misclassification confidence factor is $k=0$, and the weight of the objective function is $c=0.0001$. For the C&W attack in the LANL structure scenario, the number of iterations is 1000, the misclassification confidence factor is $k=0$, and the weight of the objective function is $c=2$. All C&W attacks use the $L_{2}$ version.
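The signal-to-noise ratio of a perturbed sample can be checked with a few lines of code. The sketch below uses the usual power-ratio definition in decibels. The clean signal (a sinusoid of amplitude 0.1) and the worst-case perturbation of $\pm 0.003$ per point are hypothetical values for illustration, not measured data.

```python
import math


def snr_db(clean, perturbed):
    """SNR in dB: signal power over the power of (perturbed - clean)."""
    signal_power = sum(c * c for c in clean)
    noise_power = sum((p - c) ** 2 for c, p in zip(clean, perturbed))
    return 10.0 * math.log10(signal_power / noise_power)


N = 1000
AMPLITUDE = 0.1  # hypothetical acceleration amplitude
EPS = 0.003      # perturbation bound applied to every point

clean = [AMPLITUDE * math.sin(2 * math.pi * 5 * t / N) for t in range(N)]
# Worst case for an L-infinity bound: every point is shifted by +/- EPS.
perturbed = [c + (EPS if t % 2 == 0 else -EPS) for t, c in enumerate(clean)]

snr = snr_db(clean, perturbed)  # about 27.4 dB for these toy values
```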

Table 5: White-box robustness (accuracy (%) on white-box test attacks) on TCRF bridge and LANL structure scenarios.

	Standard	RS	Distillation	Fast-AT	PGD-AT	Our
TCRF Bridge $(\epsilon=0.003)$
FGSM	38.14	43.19	49.36	84.13	86.88	86.98
BIM	26.75	29.23	33.94	79.24	80.30	81.08
C&W	1.19	17.15	1.02	9.76	25.19	25.22
AdvGAN	71.31	73.76	73.83	65.64	91.49	93.85
LANL Structure $(\epsilon=0.1)$
FGSM	13.76	36.25	38.12	66.82	70.76	77.18
BIM	6.53	29.31	33.65	63.00	69.47	76.47
C&W	0.00	0.00	38.94	16.88	25.24	55.47
AdvGAN	41.12	41.31	55.06	80.82	83.76	84.29

The white-box robustness of all defense models is reported in Table 5. Our defense mechanism shows the highest robustness against FGSM, BIM, C&W, and AdvGAN attacks on both the TCRF bridge and LANL structure SHM datasets, although its margin over PGD-AT on the TCRF bridge dataset is small for FGSM and C&W (0.10 and 0.03 percentage points). The improvement is most pronounced in the LANL structure scenario. Compared with PGD-AT, the best baseline under FGSM and BIM, our method improves accuracy by approximately 7 percentage points (6.42 and 7.00). Against C&W attacks, it improves on PGD-AT by approximately 30 percentage points (55.47% versus 25.24%) and on Distillation, the best baseline for this attack, by 16.53 percentage points (55.47% versus 38.94%).
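The margins on the LANL structure can be recomputed directly from Table 5. The short script below, whose helper name `margin_over_best_baseline` is hypothetical, finds the strongest baseline for each attack and the gap to our model in percentage points.

```python
# Accuracy (%) under white-box attacks on the LANL structure, copied from Table 5.
TABLE5_LANL = {
    "FGSM": {"Standard": 13.76, "RS": 36.25, "Distillation": 38.12,
             "Fast-AT": 66.82, "PGD-AT": 70.76, "Our": 77.18},
    "BIM": {"Standard": 6.53, "RS": 29.31, "Distillation": 33.65,
            "Fast-AT": 63.00, "PGD-AT": 69.47, "Our": 76.47},
    "C&W": {"Standard": 0.00, "RS": 0.00, "Distillation": 38.94,
            "Fast-AT": 16.88, "PGD-AT": 25.24, "Our": 55.47},
    "AdvGAN": {"Standard": 41.12, "RS": 41.31, "Distillation": 55.06,
               "Fast-AT": 80.82, "PGD-AT": 83.76, "Our": 84.29},
}


def margin_over_best_baseline(row, ours="Our"):
    """Return (best baseline name, our accuracy minus its accuracy)."""
    baselines = {name: acc for name, acc in row.items() if name != ours}
    best = max(baselines, key=baselines.get)
    return best, round(row[ours] - baselines[best], 2)


margins = {attack: margin_over_best_baseline(row)
           for attack, row in TABLE5_LANL.items()}
```

The strongest baseline is PGD-AT for FGSM, BIM, and AdvGAN (gaps of 6.42, 7.00, and 0.53 points) and Distillation for C&W (a gap of 16.53 points). The gap to PGD-AT under C&W is 30.23 points.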

4.3 Performance under black-box attacks

Table 6: Black-box robustness (accuracy (%) on black-box test attacks) on TCRF bridge and LANL structure scenarios.

	Standard	RS	Distillation	Fast-AT	PGD-AT	Our
TCRF Bridge $(\epsilon=0.003)$
FGSM	76.52	81.28	77.32	86.64	94.63	95.55
BIM²⁰	75.46	81.01	78.34	82.60	92.52	93.98
C&W	93.74	97.14	92.78	98.16	92.64	95.51
AdvGAN	67.78	69.94	71.04	84.02	92.17	92.97
LANL Structure $(\epsilon=0.1)$
FGSM	50.82	51.89	54.59	65.41	77.12	82.00
BIM²⁰	44.82	44.56	52.88	64.00	77.35	81.88
C&W	75.76	79.86	81.94	94.47	92.53	92.65
AdvGAN	66.29	66.72	64.18	81.29	85.41	88.00

Black-box attacks are executed using a substitute model approach, harnessing the transferability of adversarial examples to target the model. The process involves generating synthetic datasets, training substitute models, and conducting transfer attacks. In the TCRF bridge SHM scenario, the chosen substitute model is the HCG model [4], an innovative hierarchical framework that amalgamates Convolutional Neural Networks and Gated Recurrent Units to effectively capture spatial and temporal relations. In the LANL three-storey structure scenario, the substitute model is an ANN with a hidden layer comprising 32 nodes.

In the black-box scenario, the adversaries are still FGSM, BIM²⁰, C&W, and AdvGAN. The black-box robustness of all defense strategies is reported in Table 6. Compared with the white-box results, the defense methods are generally much more robust against black-box attacks, in several cases approaching the natural accuracy. Our defense model achieves the highest accuracy under FGSM, BIM²⁰, and AdvGAN in both scenarios. Under the C&W attack, Fast-AT achieves the highest accuracy in both scenarios (98.16% and 94.47%, versus 95.51% and 92.65% for our model).

4.3.1 Transferability test

We investigate the transferability of attacks on the TCRF bridge dataset and the LANL structure dataset among a standard training model, a distillation model, a Fast-AT model, a PGD-AT model, and our model. Tables 7 to 10 present the accuracy of the target models (columns) when attacked with adversarial samples generated from the source models (rows). The bottom row gives the mean accuracy of each target model over all five source models, including the white-box case on the diagonal.

In comparison to the other models, our model is markedly more resilient to adversarial samples transferred from other models. On the LANL dataset in particular, the mean accuracy of our model is approximately 7 percentage points higher than that of the second-best model (80.58% versus 73.54% under FGSM and 79.82% versus 73.07% under BIM).

Table 7: Transferability test on TCRF bridge dataset: FGSM adversaries are generated with $\epsilon=0.003$ using the source network and then evaluated on the target model.

	Standard	Distillation	Fast-AT	PGD-AT	Our
Standard	38.12	47.82	91.72	92	93.06
Distillation	52.28	49.36	92.61	93.03	94.01
Fast-AT	80.31	79.14	84.13	87.87	89.52
PGD-AT	81.96	82.76	87.53	86.88	90.03
Our	78.55	77.95	87.45	88.35	86.98
Mean	66.24	67.41	88.69	89.63	90.72
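The Mean row of Table 7 can be reproduced as a plain column average. The sketch below, whose helper name `column_means` is hypothetical, shows that the reported values correspond to averaging over all five source models, diagonal included. Leaving out the diagonal (pure transfer) gives a different figure.

```python
MODELS = ["Standard", "Distillation", "Fast-AT", "PGD-AT", "Our"]

# Rows: source model, columns: target model (values copied from Table 7).
TABLE7 = [
    [38.12, 47.82, 91.72, 92.00, 93.06],
    [52.28, 49.36, 92.61, 93.03, 94.01],
    [80.31, 79.14, 84.13, 87.87, 89.52],
    [81.96, 82.76, 87.53, 86.88, 90.03],
    [78.55, 77.95, 87.45, 88.35, 86.98],
]


def column_means(table, skip_diagonal=False):
    """Mean accuracy of each target model over the source models."""
    means = []
    for j in range(len(table[0])):
        column = [row[j] for i, row in enumerate(table)
                  if not (skip_diagonal and i == j)]
        means.append(sum(column) / len(column))
    return means


means_all = [round(m, 2) for m in column_means(TABLE7)]
means_transfer_only = column_means(TABLE7, skip_diagonal=True)
```

Here `means_all` matches the Mean row of the table. For our model, the transfer-only average is higher than the reported mean, because the excluded diagonal entry is the white-box case.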

Table 8: Transferability test on TCRF bridge dataset: BIM adversaries are generated with $\epsilon=0.003$ using the source network and then evaluated on the target model.

	Standard	Distillation	Fast-AT	PGD-AT	Our
Standard	26.74	37.63	92.21	92.42	94.15
Distillation	39.89	33.94	93.39	93.63	94.88
Fast-AT	79.94	79.23	79.24	86.42	87.65
PGD-AT	82.46	84.47	86.43	83.29	88.49
Our	75.00	74.98	86.32	87.28	81.08
Mean	60.81	62.05	87.52	88.61	89.25

Table 9: Transferability test on LANL structure dataset: FGSM adversaries are generated with $\epsilon=0.1$ using the source network and then evaluated on the target model.

	Standard	Distillation	Fast-AT	PGD-AT	Our
Standard	13.76	45.12	72.41	76.12	82.71
Distillation	29.76	38.12	72.71	75.94	81.59
Fast-AT	63.24	68.00	66.82	71.65	80.59
PGD-AT	62.94	69.24	68.24	70.76	80.82
Our	56.94	61.71	69.94	73.24	77.18
Mean	45.33	56.44	70.02	73.54	80.58

Table 10: Transferability test on LANL structure dataset: BIM adversaries are generated with $\epsilon=0.1$ using the source network and then evaluated on the target model.

	Standard	Distillation	Fast-AT	PGD-AT	Our
Standard	6.53	42.35	71.29	75.71	82.00
Distillation	24.65	33.65	72.65	76.59	81.65
Fast-AT	58.94	64.47	63.00	70.53	79.29
PGD-AT	58.47	65.71	65.88	69.47	79.71
Our	56.29	59.76	69.82	73.06	76.47
Mean	40.98	53.19	68.53	73.07	79.82

The research by [66, 67] suggests a trend where iterative attacks often overly adapt to specific network parameters, leading to high success rates in white-box scenarios but diminished performance in black-box scenarios. On the other hand, single-step attacks tend to have lower success rates in white-box settings but yield slightly better transferability of adversarial examples. However, regarding transferability, our experimental findings differed from this trend as we observed that BIM adversarial examples demonstrated stronger transferability compared to FGSM adversarial examples. We believe that the evaluation of transferability might be notably influenced by the dataset used.

4.4 Performance under Gaussian Noise

A robust defense method should not only counter meticulously designed adversarial noise but also demonstrate resilience against the inherent and inevitable random white noise present in SHM systems. Although white noise does not have the same attacking power as adversarial noise of the same magnitude, it can still disrupt machine learning models. Numerous articles investigate the impact of white noise on SHM systems. For example, Campeiro et al. [64] present an experimental analysis of white noise effects on structural damage detection in impedance-based SHM systems and indicate that even low noise causes significant variations in the impedance signatures. Balasubramanian et al. [63] show that the inherent noise in the sensor response poses a substantial hurdle for neural-network-based SHM systems in correctly estimating the external impact.

To emulate environmental loads such as aerodynamic pressure, wind disturbances, and electrical noise, we intentionally injected Gaussian noise following a $\mathcal{N}(0,\sigma^{2})$ distribution into the data. This was done to explore the effectiveness of our method against Gaussian noise. The results are presented in Table 11.
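The noise-injection protocol can be sketched as follows. This is a minimal illustration, not the evaluation code of the study: the stand-in classifier `toy_predict`, the synthetic samples, and the helper names are hypothetical, and a trained SHM model would take the place of `toy_predict`.

```python
import random


def add_gaussian_noise(sample, sigma, rng):
    """Add independent N(0, sigma^2) noise to every point of a sample."""
    return [x + rng.gauss(0.0, sigma) for x in sample]


def accuracy_under_noise(predict, samples, labels, sigma, seed=0):
    """Accuracy (%) of `predict` on noise-contaminated copies of the samples."""
    rng = random.Random(seed)
    hits = sum(
        predict(add_gaussian_noise(x, sigma, rng)) == y
        for x, y in zip(samples, labels)
    )
    return 100.0 * hits / len(samples)


def toy_predict(sample):
    """Stand-in classifier (assumption): class 1 if the sample mean is positive."""
    return int(sum(sample) / len(sample) > 0.0)


# Synthetic two-class data (assumption): constant signals at -0.01 and +0.01.
samples = [[-0.01] * 20 for _ in range(100)] + [[0.01] * 20 for _ in range(100)]
labels = [0] * 100 + [1] * 100

clean_acc = accuracy_under_noise(toy_predict, samples, labels, sigma=0.0)
mild_acc = accuracy_under_noise(toy_predict, samples, labels, sigma=0.001)
heavy_acc = accuracy_under_noise(toy_predict, samples, labels, sigma=1.0)
```

Sweeping `sigma` over a grid and recording the accuracy at each level yields one column of a table such as Table 11.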

Table 11: Comparison of the accuracy of the model trained using our proposed defense method and the original model under Gaussian noise.

	Noise Level	Standard	Our
TCRF Bridge	$\sigma=0.001$	99.42	97.07
	$\sigma=0.003$	94.03	97.19
	$\sigma=0.005$	80.54	97.28
	$\sigma=0.007$	69.53	86.00
LANL Structure	$\sigma=0.4$	96.59	94.29
	$\sigma=0.6$	91.76	94.41
	$\sigma=0.8$	88.47	92.88
	$\sigma=1.0$	84.76	92.82

It is evident from the table that Gaussian noise affects the accuracy of SHM ML models. However, the models trained with our method are markedly more resilient to it. For noise levels up to $\sigma=0.005$ on the TCRF bridge dataset and up to $\sigma=0.6$ on the LANL structure dataset, their accuracy stays within about 1.6 percentage points of the accuracy on clean samples (96.98% and 95.88%, respectively). At the highest noise levels tested, it remains well above that of the standard model (86.00% versus 69.53% at $\sigma=0.007$, and 92.82% versus 84.76% at $\sigma=1.0$). This ability to maintain comparable accuracy in the presence of noise makes our models well suited for deployment in SHM systems, where noise interference is inevitable. Furthermore, on the TCRF bridge dataset, the accuracy of our model on samples contaminated with Gaussian noise (with $\sigma$ values of 0.003 and 0.005) is marginally higher than its accuracy on clean samples. In such cases, the added Gaussian noise may act as a beneficial form of smoothing preprocessing for the input data of our model.

4.5 Ablation studies

4.5.1 $\mathcal{L}_{\mathrm{circle}}$ at different layers

As the TCRF bridge scenario involves the application of a deep neural network model, we delved into the impact of employing circle loss at different depths within this context. Specifically, after each layer identified in Fig. 4, we applied circle loss to regulate the distances between features. Subsequently, we evaluated the defensive efficacy of these models post-training and reported their performance against FGSM and BIM attacks in Table 12.

Table 12: Ablation analysis with $\mathcal{L}_{\mathrm{circle}}$ applied at different layers of the PCBG network (Fig. 4) for the TCRF bridge dataset.

Layer	No Attack	FGSM ($\epsilon=0.003$)	BIM ($\epsilon=0.003$)
None	97.32	84.81	78.29
Layer 1	96.85	86.01	80.49
Layer 2	96.98	86.98	81.08
Layer 3	96.87	85.35	77.78
Layer 1+2	96.90	85.56	80.01
Layer 1+3	96.19	80.55	75.75
Layer 2+3	97.07	86.00	79.77
Layer 1+2+3	96.96	85.98	81.29

Training without the integration of circle loss can be likened to softmax adversarial training using mixed adversarial samples. The model trained in this manner showcases a marginal improvement in accuracy compared to models trained exclusively on PGD adversarial samples.

As can be seen, opting to implement circle loss after the penultimate layer proves to be a favorable choice. The model trained in this manner exhibits relatively high original accuracy and accuracy against FGSM and BIM attacks.

4.5.2 Circle regularization

To further investigate the enhancement of model robustness due to circle loss, we compare the standard training with and without circle loss regularization terms. Additionally, we explore the comparison between PGD adversarial training with and without circle loss regularization terms. As shown in Fig. 9, models trained using the circle regularization term can be resistant to BIM attacks with a larger perturbation budget.

The experimental results indicate that circle loss can also be regarded as a regularization term. Thus, it can be incorporated into the standard training process to reduce the vulnerability of the model to adversarial examples and can also be incorporated into most of the existing defense methods for better robustness.

5 Conclusion

In this work, we explore the vulnerability of data-driven SHM to some well-known existing adversarial attacks. As a result, we emphasize the importance of protecting against such attacks, particularly when ML models are used in sensitive tasks such as structural diagnosis and structural damage detection. Further, we discuss some mechanisms for avoiding these attacks while strengthening the robustness of models to adversarial examples. We propose adversarial training with circle loss, which optimizes feature distances to increase the distance between data points and decision boundaries. The results of our experiments validate the effectiveness of our method in improving adversarial robustness by incorporating a feature distance constraint in the objective function, a constraint that conventional cross-entropy loss fails to impose. Finally, we encourage researchers to consider robustness to adversarial attacks when evaluating data-driven SHM models.

CRediT authorship contribution statement

Xiangli Yang: Conceptualization, Validation, Formal analysis, Writing - Original draft preparation. Xijie Deng: Methodology, Investigation, Validation, Software, Writing - Original draft preparation. Hanwei Zhang: Conceptualization, Methodology, Validation, Writing - Original draft preparation. Yang Zou: Project administration, Review and editing. Jianxi Yang: Supervision, Funding acquisition.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgement

This work was supported in part by the National Natural Science Foundation of China (Grant No. 62101081 and 62205039), the Science and Technology Research Program of Chongqing Municipal Education Commission (Grant No. KJQN202100747 and KJZD-M202300703), and the Graduate Research Innovation Project of Information science and Engineering School of Chongqing Jiaotong University (Grant No. 2023yjkc003).

References

[1] K. V. Yuen, Bayesian methods for structural dynamics and civil engineering, John Wiley & Sons, 2010.
[2] S. Arangio, F. Bontempi, Structural health monitoring of a cable-stayed bridge with Bayesian neural networks, Struct. Infrastruct. Eng. 11 (4) (2015) 575–587, https://doi.org/10.1080/15732479.2014.951867.
[3] T. Yin, Q. H. Jiang, K. V. Yuen, Vibration-based damage detection for structural connections using incomplete modal data by Bayesian approach and model reduction technique, Eng. Struct. 132 (2017) 260–277, https://doi.org/10.1016/j.engstruct.2016.11.035.
[4] J. Yang, L. K. Zhang, C. Chen, Y. F. Li, R. Li, G. Wang, S. Jiang, Z. Zeng, A hierarchical deep convolutional neural network and gated recurrent unit framework for structural damage detection, Inf. Sci. 540 (2020) 117–130, https://doi.org/10.1016/j.ins.2020.05.090.
[5] J. Yang, F. Yang, Y. Zhou, D. Wang, R. Li, G. Wang, W. Chen, A data-driven structural damage detection framework based on parallel convolutional neural network and bidirectional gated recurrent unit, Inf. Sci. 566 (2021) 103–117, https://doi.org/10.1016/j.ins.2021.02.064.
[6] Y. He, L. Zhang, Z. Chen, C. Y. Li, A framework of structural damage detection for civil structures using a combined multi-scale convolutional neural network and echo state network, Eng. Comput. 38 (2022) 1–19, https://doi.org/10.1007/s00366-021-01584-4.
[7] A. Widodo, B. S. Yang, Support vector machine in machine condition monitoring and fault diagnosis, Mech. Syst. Signal. Process. 21 (6) (2007) 2560–2574, https://doi.org/10.1016/j.ymssp.2006.12.007.
[8] P. Prasanna, K. J. Dana, N. Gucunski, B. B. Basily, H. M. La, R. S. Lim, H. Parvardeh, Automated crack detection on concrete bridges, IEEE Trans. Autom. Sci. Eng. 13 (2) (2014) 591–599, https://doi.org/10.2749/nan**g.2022.1506.
[9] H. Kim, E. Ahn, M. Shin, S. H. Sim, Crack and noncrack classification from concrete surface images using machine learning, Struct. Heal. Monit. 18 (3) (2019) 725–738, https://doi.org/10.1177/1475921718768747.
[10] J. Yang, F. Yang, L. Zhang, R. Li, S. Jiang, G. Wang, L. Zhang, Z. Zeng, Bridge health anomaly detection using deep support vector data description, Neurocomputing 444 (2021) 170–178, https://doi.org/10.1016/j.neucom.2020.08.087.
[11] C. Szegedy, W. Zaremba, I. Sutskever, J. Bruna, D. Erhan, I. Goodfellow, R. Fergus, Intriguing properties of neural networks, in: Proc. Int. Conf. Learn. Represent., 2014, pp. 1–10, https://doi.org/10.48550/arXiv.1312.6199.
[12] I. Goodfellow, J. Shlens, C. Szegedy, Explaining and harnessing adversarial examples, in: Proc. Int. Conf. Learn. Represent., 2015, pp. 1–11, https://doi.org/10.48550/arXiv.1412.6572.
[13] B. Biggio, I. Corona, D. Maiorca, N. Nelson, P. Laskov, G. Giacinto, F. Roli, Evasion attacks against machine learning at test time, in: ECML-PKDD, Springer, 2013, pp. 387–402, https://doi.org/10.1007/978-3-642-40994-3_25.
[14] A. Nguyen, J. Yosinski, J. Clune, Deep neural networks are easily fooled: High confidence predictions for unrecognizable images, in: Proc. IEEE Conf. Comput. Vis. Pattern Recognit., 2015, pp. 427–436, https://doi.org/10.1109/cvpr.2015.7298640.
[15] H. Zhang, T. Furon, L. Amsaleg, Y. Avrithis, Deep neural network attacks and defense: The case of image classification, Multimedia Secur. (2022) 41–75, https://doi.org/10.1002/9781119901808.ch2.
[16] M. D. Champneys, A. Green, J. Morales, M. Silva, D. Mascarenas, On the vulnerability of data-driven structural health monitoring models to adversarial attack, Struct. Heal. Monit. 20 (4) (2021) 1476–1493, https://doi.org/10.1177/1475921720920233.
[17] C. Xie, Y. Wu, L. van der Maaten, A. L. Yuille, K. He, Feature denoising for improving adversarial robustness, in: Proc. IEEE Conf. Comput. Vis. Pattern Recognit., 2019, pp. 501–509, https://doi.org/10.1109/cvpr.2019.00059.
[18] J. Buckman, A. Roy, C. Raffel, I. Goodfellow, Thermometer encoding: One hot way to resist adversarial examples, in: Proc. Int. Conf. Learn. Represent., 2018, pp. 1–22, https://openreview.net/forum?id=S18Su--CW.
[19] H. Zhang, Y. Avrithis, T. Furon, L. Amsaleg, Patch replacement: A transformation-based method to improve robustness against adversarial attacks, in: Proc. 1st Int. Trustworthy AI Multimedia Comput. Workshop, 2021, pp. 9–17, https://doi.org/10.1145/3475731.3484955.
[20] A. M. Oberman, J. Calder, Lipschitz regularized deep neural networks converge and generalize, arXiv preprint, https://doi.org/10.48550/arXiv.1808.09540 (2018).
[21] R. Li, H. Zhang, P. Yang, C.-C. Huang, A. Zhou, B. Xue, L. Zhang, Ensemble defense with data diversity: Weak correlation implies strong robustness, arXiv preprint, https://doi.org/10.48550/arXiv.2106.02867 (2021).
[22] E. Wong, L. Rice, J. Z. Kolter, Fast is better than free: Revisiting adversarial training, arXiv preprint, https://doi.org/10.48550/arXiv.2001.03994 (2020).
[23] W. Xu, D. Evans, Y. Qi, Feature squeezing: Detecting adversarial examples in deep neural networks, arXiv preprint arXiv:1704.01155 (2017).
[24] C. Guo, M. Rana, M. Cisse, L. van der Maaten, Countering adversarial images using input transformations, arXiv preprint arXiv:1711.00117 (2017).
[25] S.-A. Rebuffi, S. Gowal, D. A. Calian, F. Stimberg, O. Wiles, T. A. Mann, Data augmentation can improve robustness, Advances in Neural Information Processing Systems (NeurIPS) 34 (2021) 29935–29948.
[26] A. Madry, A. Makelov, L. Schmidt, D. Tsipras, A. Vladu, Towards deep learning models resistant to adversarial attacks, in: Proc. Int. Conf. Learn. Represent., 2018, pp. 1–28, https://doi.org/10.48550/arXiv.1706.06083.
[27] N. Carlini, D. Wagner, Towards evaluating the robustness of neural networks, in: Proc. IEEE Secur. Priv., IEEE, 2017, pp. 39–57, https://doi.org/10.1109/sp.2017.49.
[28] S.-M. Moosavi-Dezfooli, A. Fawzi, P. Frossard, Deepfool: a simple and accurate method to fool deep neural networks, in: Proc. IEEE Conf. Comput. Vis. Pattern Recognit., 2016, pp. 2574–2582, https://doi.org/10.1109/cvpr.2016.28.
[29] E. Raff, J. Sylvester, S. Forsyth, M. McLean, Barrage of random transforms for adversarially robust defense, in: Proc. IEEE Conf. Comput. Vis. Pattern Recognit., 2019, pp. 6528–6537.
[30] A. Prakash, N. Moran, S. Garber, A. DiLillo, J. Storer, Deflecting adversarial attacks with pixel deflection, in: Proc. IEEE Conf. Comput. Vis. Pattern Recognit., 2018, pp. 8571–8580.
[31] S.-M. Moosavi-Dezfooli, A. Shrivastava, O. Tuzel, Divide, denoise, and defend against adversarial attacks, arXiv preprint arXiv:1802.06806 (2018).
[32] Z. Liu, Q. Liu, T. Liu, N. Xu, X. Lin, Y. Wang, W. Wen, Feature distillation: DNN-oriented JPEG compression against adversarial examples, in: Proc. IEEE Conf. Comput. Vis. Pattern Recognit., IEEE, 2019, pp. 860–868.
[33] L. Li, M. W. Spratling, Data augmentation alone can improve adversarial training, in: Proc. Int. Conf. Learn. Represent., 2023.
[34] Y. Sun, C. Cheng, Y. Zhang, C. Zhang, L. Zheng, Z. Wang, Y. Wei, Circle loss: A unified perspective of pair similarity optimization, in: Proc. IEEE Conf. Comput. Vis. Pattern Recognit., 2020, pp. 6398–6407, https://doi.org/10.1109/cvpr42600.2020.00643.
[35] N. Papernot, P. McDaniel, I. Goodfellow, Transferability in machine learning: from phenomena to black-box attacks using adversarial samples, arXiv preprint, https://doi.org/10.48550/arXiv.1605.07277 (2016).
[36] Z. Zhao, H. Zhang, R. Li, R. Sicre, L. Amsaleg, M. Backes, Towards good practices in evaluating transfer adversarial attacks, arXiv preprint, https://doi.org/10.48550/arXiv.2211.09565 (2022).
[37] Y. Liu, X. Chen, C. Liu, D. Song, Delving into transferable adversarial examples and black-box attacks, in: Proc. Int. Conf. Learn. Represent., 2017, pp. 24–26, https://doi.org/10.48550/arXiv.1611.02770.
[38] H. Zhang, Y. Avrithis, T. Furon, L. Amsaleg, Smooth adversarial examples, EURASIP J. Inf. Secur. 2020 (1) (2020) 1–12, https://doi.org/10.1186/s13635-020-00112-z.
[39] A. Kurakin, I. J. Goodfellow, S. Bengio, Adversarial examples in the physical world, in: Proc. Artif. Intell. Saf. Secur., 2018, pp. 99–112, https://doi.org/10.1201/9781351251389-8.
[40] P.-Y. Chen, H. Zhang, Y. Sharma, J. Yi, C.-J. Hsieh, Zoo: Zeroth order optimization based black-box attacks to deep neural networks without training substitute models, in: Proc. ACM Artif. Intell. Secur., 2017, pp. 15–26, https://doi.org/10.1145/3128572.3140448.
[41] A. Ilyas, L. Engstrom, A. Madry, Prior convictions: Black-box adversarial attacks with bandits and priors, in: Proc. Int. Conf. Learn. Represent., 2019, pp. 1–25, https://doi.org/10.48550/arXiv.1807.07978.
[42] A. Ilyas, L. Engstrom, A. Athalye, J. Lin, Black-box adversarial attacks with limited queries and information, in: Proc. Int. Conf. Mach. Learn., 2018, pp. 2137–2146, https://doi.org/10.48550/arXiv.1804.08598.
[43] C. Tu, P. Ting, P. Chen, S. Liu, H. Zhang, J. Yi, C. Hsieh, S. Cheng, Autozoom: Autoencoder-based zeroth order optimization method for attacking black-box neural networks, in: Proc. AAAI Conf. Artif. Intell., Vol. 33, 2019, pp. 742–749, https://doi.org/10.1609/aaai.v33i01.3301742.
[44] A. N. Bhagoji, W. He, B. Li, D. Song, Exploring the space of black-box attacks on deep neural networks, in: Proc. Int. Conf. Learn. Represent., 2018, pp. 1–25, https://doi.org/10.48550/arXiv.1712.09491.
[45] C. Xiao, B. Li, J. Y. Zhu, W. He, M. Liu, D. Song, Generating adversarial examples with adversarial networks, in: Int. Jt. Conf. Artif. Intell., 2018, pp. 3905–3911, https://doi.org/10.24963/ijcai.2018/543.
[46] D. Meng, H. Chen, Magnet: a two-pronged defense against adversarial examples, in: ACM SIGSAC, 2017, pp. 135–147, https://doi.org/10.1145/3133956.3134057.
[47] J. H. Metzen, T. Genewein, V. Fischer, B. Bischoff, On detecting adversarial perturbations, in: Proc. Int. Conf. Learn. Represent., 2017, pp. 1–12, https://doi.org/10.48550/arXiv.1702.04267.
[48] J. Lu, T. Issaranon, D. Forsyth, Safetynet: Detecting and rejecting adversarial examples robustly, in: Proc. IEEE Int. Conf. Comput. Vis., 2017, pp. 446–454, https://doi.org/10.1109/iccv.2017.56.
[49] N. Papernot, P. McDaniel, X. Wu, S. Jha, A. Swami, Distillation as a defense to adversarial perturbations against deep neural networks, in: Proc. IEEE Secur. Priv., IEEE, 2016, pp. 582–597, https://doi.org/10.1109/sp.2016.41.
[50] J. Ba, R. Caruana, Do deep nets really need to be deep?, in: Proc. Adv. Neural Inf. Process. Syst., Vol. 27, 2014, pp. 2654–2662, https://dl.acm.org/doi/10.5555/2969033.2969123.
[51] H. Zhang, Y. Yu, J. Jiao, E. Xing, L. El Ghaoui, M. Jordan, Theoretically principled trade-off between robustness and accuracy, in: Proc. Int. Conf. Mach. Learn., 2019, pp. 7472–7482, https://doi.org/10.48550/arXiv.1901.08573.
[52] A. Robey, L. Chamon, G. J. Pappas, H. Hassani, A. Ribeiro, Adversarial robustness with semi-infinite constrained learning, in: Proc. Adv. Neural Inf. Process. Syst., Vol. 34, 2021, pp. 6198–6215, https://doi.org/10.48550/arXiv.2110.15767.
[53] J. Cohen, E. Rosenfeld, Z. Kolter, Certified adversarial robustness via randomized smoothing, in: Proc. Int. Conf. Mach. Learn., 2019, pp. 1310–1320, https://doi.org/10.48550/arXiv.1902.02918.
[54] U. Shaham, Y. Yamada, S. Negahban, Understanding adversarial training: Increasing local stability of supervised models through robust optimization, Neurocomputing 307 (2018) 195–204, https://doi.org/10.1016/j.neucom.2018.04.027.
[55] A. Ross, F. Doshi Velez, Improving the adversarial robustness and interpretability of deep neural networks by regularizing their input gradients, in: Proc. AAAI Conf. Artif. Intell., Vol. 32, 2018, pp. 1–10, https://doi.org/10.1609/aaai.v32i1.11504.
[56] S. Zheng, Y. Song, T. Leung, I. Goodfellow, Improving the robustness of deep neural networks via stability training, in: Proc. IEEE Conf. Comput. Vis. Pattern Recognit., 2016, pp. 4480–4488, https://doi.org/10.1109/cvpr.2016.485.
[57] S. M. Moosavi Dezfooli, A. Fawzi, J. Uesato, P. Frossard, Robustness via curvature regularization, and vice versa, in: Proc. IEEE Conf. Comput. Vis. Pattern Recognit., 2019, pp. 9078–9086, https://doi.org/10.1109/cvpr.2019.00929.
[58] H. Kannan, A. Kurakin, I. Goodfellow, Adversarial logit pairing, arXiv preprint, https://doi.org/10.48550/arXiv.1803.06373 (2018).
[59] C. Mao, Z. Zhong, J. Yang, C. Vondrick, B. Ray, Metric learning for adversarial robustness, in: Proc. Adv. Neural Inf. Process. Syst., 2019, pp. 480–491, https://doi.org/10.48550/arXiv.1909.00900.
[60] P. Li, J. Yi, B. Zhou, L. Zhang, Improving the robustness of deep neural networks via adversarial training with triplet loss, in: Proc. Int. Jt. Conf. Artif. Intell., 2019, pp. 2909–2915, https://doi.org/10.24963/ijcai.2019/403.
[61] E. Figueiredo, G. Park, J. Figueiras, C. Farrar, K. Worden, Structural health monitoring algorithm comparisons using standard data sets, Tech. rep., Los Alamos National Lab. (LANL), Los Alamos, NM (United States), https://doi.org/10.2172/961604 (2009).
[62] S. Chesné, A. Deraemaeker, Damage localization using transmissibility functions: A critical review, Mech. Syst. Signal Process. 38 (2) (2013) 569–584, https://doi.org/10.1016/j.ymssp.2013.01.020.
[63] P. Balasubramanian, V. Kaushik, S. Y. Altamimi, M. Amabili, M. Alteneiji, Comparison of neural networks based on accuracy and robustness in identifying impact location for structural health monitoring application, Struct. Health Monit. 22 (2023) 417–432, https://doi.org/10.1177/14759217221098569.
[64] L. M. Campeiro, R. Z. da Silveira, F. G. Baptista, Impedance-based damage detection under noise and vibration effects, Struct. Health Monit. 17 (3) (2018) 654–667, https://doi.org/10.1177/1475921717715240.
[65] E. B. Flynn, M. D. Todd, A Bayesian approach to optimal sensor placement for structural health monitoring with application to active sensing, Mech. Syst. Signal Process. 24 (4) (2010) 891–903, https://doi.org/10.1016/j.ymssp.2009.09.003.
[66] A. Kurakin, I. Goodfellow, S. Bengio, Adversarial machine learning at scale, in: Proc. Int. Conf. Learn. Represent., 2017, pp. 1–17, https://doi.org/10.48550/arXiv.1611.01236.
[67] C. Xie, Z. Zhang, Y. Zhou, S. Bai, J. Wang, Z. Ren, A. L. Yuille, Improving transferability of adversarial examples with input diversity, in: Proc. IEEE Conf. Comput. Vis. Pattern Recognit., 2019, pp. 2730–2739, https://doi.org/10.1109/cvpr.2019.00284.

