ESND: An Embedding-based Framework for Signed Network Dismantling

Chenwei Xie, Chuang Liu, Cong Li, Xiu-Xiu Zhan, Xiang Li

Chenwei Xie and Chuang Liu are with the Research Center for Complexity Sciences, Hangzhou Normal University, Hangzhou 311121, China (e-mail: [email protected]; [email protected]). Cong Li is with the Adaptive Networks and Control Laboratory, Electronic Engineering Department, School of Information Science and Engineering, and the Research Center of Smart Networks and Systems, Fudan University, Shanghai 200433, China (e-mail: [email protected], Corresponding author). Xiu-Xiu Zhan is with the Research Center for Complexity Sciences, Hangzhou Normal University, Hangzhou 311121, China, and the College of Media and International Culture, Zhejiang University, Hangzhou 310058, China (e-mail: [email protected], Corresponding author). Xiang Li is with the Institute of Complex Networks and Intelligent Systems, Shanghai Research Institute for Intelligent Autonomous Systems, the Frontiers Science Center for Intelligent Autonomous Systems, and the State Key Laboratory of Intelligent Autonomous Systems, Tongji University, Shanghai 201210, China (e-mail: [email protected]).

Manuscript received xxx; revised xxx
Abstract

Network dismantling aims to maximize the disintegration of a network by removing a specific set of nodes or edges, and it is applied to tasks in many domains, such as cracking down on criminal organizations, delaying the propagation of rumors, and blocking the transmission of viruses. Most current network dismantling methods are tailored for unsigned networks, which only consider the connections between nodes without evaluating the nature of the relationships, such as friendship/hostility, enhancing/repressing, and trust/distrust. Here we propose an embedding-based algorithm, namely ESND, to solve the signed network dismantling problem. The algorithm iterates the following four steps: giant component detection, network embedding, node clustering, and removal node selection. To illustrate the efficacy and stability of ESND, we conduct extensive experiments on six signed network datasets as well as null models, and compare the performance of our method with baselines. Experimental results consistently show that the proposed ESND is superior to the baselines and displays stable performance as the network structure changes. Additionally, we examine the impact of sign proportions on network robustness via ESND, observing that networks with a high ratio of negative edges are generally easier to dismantle than networks with a high ratio of positive edges.

Index Terms:
Network dismantling, signed network, node embedding, node clustering

I Introduction

Network dismantling aims to remove a set of nodes whose removal maximally damages the connectivity of the network [1, 2, 3]. It has become a prominent topic in network science due to its extensive applications in different fields [4, 5, 6]. For instance, it can be used to delay the spread of diseases by immunizing (or isolating) the critical nodes in epidemic-spreading networks [7, 8, 9]. In terms of information dissemination, it can help identify and block key users to control the propagation of rumors and false information on online social platforms [10, 11]. In addition, effective network dismantling can help quickly disrupt criminal and terrorist organization networks [12, 13].

Network dismantling has been proven to be NP-hard [14, 15, 16]; in essence, it is a combinatorial optimization problem. Researchers have proposed various methods to identify critical nodes for network dismantling, such as centrality-based methods (e.g., degree, k-shell, betweenness, and closeness) [17, 18, 19, 20, 21], heuristic algorithms (e.g., acquaintance immunization, collective influence (CI) and generalized network dismantling (GND)) [22, 23, 24, 25], meta-heuristic algorithms (e.g., artificial bee colony algorithm, memetic algorithm) [26, 27], and machine learning algorithms (e.g., finding key players in networks through deep reinforcement learning (FINDER), graph dismantling with machine learning (GDM), neural extraction framework for multiscale essential structures (NEES)) [28, 29, 30]. Although these methods have shown efficacy in rapidly disintegrating networks, most of them are tailored to unsigned networks, i.e., networks without positive or negative signs on the edges. In reality, interactions between individuals may carry specific meanings [31, 32, 33]. For example, users could be friends or enemies in social networks, and a signed network is needed to represent the different relationships between users [34]. Moreover, the dynamics of signed networks are quite different from those of unsigned networks. For instance, we need to consider signs when modeling a spreading process on a signed network, and the signed network structure may result in different dynamic behaviors [35]. With regard to the dismantling problem, few works have considered signed networks, and the main challenge lies in how to utilize the signed network topology to solve this problem.

To address the challenge of signed network dismantling, we propose an algorithm named the Embedding-based framework for signed network dismantling (ESND), which integrates node embedding [36] and node clustering to achieve rapid disintegration of a signed network. The ESND consists of three main parts that iteratively remove nodes from the network (see Figure 1): First, we perform a signed network embedding algorithm (SiNE) to obtain node embedding vectors that could capture the local and global structure of a signed network. Second, we employ the K-means algorithm to classify the nodes into different clusters. Lastly, the node with the highest degree in the largest cluster is removed from the network. We compare ESND with the baselines on different empirical signed networks and their null models. The results show that ESND could better dismantle a signed network than the baseline methods.

The subsequent sections of this paper are organized as follows. Section II details the specifics of our proposed algorithm. Section III offers a clear description of the baseline methods. Section IV introduces the datasets and presents all the experimental results. We summarize our work and discuss future research directions in Section V.

II Methods

In this section, we introduce the iterative dynamic approach to the dismantling of signed networks, as shown in Figure 1. Initially, we identify the giant connected component (GCC) in the network and then use a signed network embedding algorithm, i.e., SiNE [37], to get the embedding vector of each node. Next, we use the K-means algorithm to partition the GCC into several clusters based on the embedding vectors of the nodes. The node with the highest degree in the largest cluster is removed from the network. If several nodes share the highest degree, we randomly choose one of them to remove. Subsequently, we re-identify the GCC within the remaining network, perform signed network embedding on it, and again remove the node with the highest degree in the largest K-means cluster. The process iterates until the fraction $q$ of removed nodes reaches a specified value $q_r$. The essential steps of the ESND algorithm, i.e., giant component detection, signed network embedding (SiNE), node clustering, and node elimination, are described as follows.

Figure 1: Framework of ESND. The solid black lines represent positive edges, the red dashed lines indicate negative edges, $q$ represents the fraction of removed nodes, and $q_r$ is a threshold value indicating when the algorithm stops.

II-A Giant Connected Component Detection

Consider an undirected and unweighted signed network $G=(V,E)$ consisting of $N$ nodes and $M$ edges, where $V=\{v_1,v_2,\cdots,v_N\}$ represents the set of nodes and $E$ is the set of edges. An edge $e_{ij}=(v_i,v_j)\in E$ can take a value of $1$ or $-1$, indicating a positive or negative edge in the network. To effectively dismantle a signed network, we need to detect the GCC of the current network as input for embedding at each iteration. Therefore, we use the breadth-first-search (BFS) algorithm to detect the giant connected component within a signed network. Specifically, we start from each unvisited node to find all nodes connected to it and record the size of its corresponding connected component. The component containing the most nodes is referred to as the GCC.
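The BFS-based detection described above can be sketched as follows. This is a minimal illustration rather than the authors' implementation; the adjacency format (`{node: {neighbor: sign}}`) and the function name `giant_connected_component` are assumptions made for the example.

```python
from collections import deque

def giant_connected_component(adj):
    """Return the node set of the largest connected component.

    `adj` maps each node to a dict {neighbor: sign} (assumed format).
    Edge signs are ignored because connectivity does not depend on them.
    """
    visited = set()
    best = set()
    for start in adj:
        if start in visited:
            continue
        # One BFS from an unvisited node collects exactly one component.
        component = {start}
        visited.add(start)
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in visited:
                    visited.add(v)
                    component.add(v)
                    queue.append(v)
        # Keep the component containing the most nodes.
        if len(component) > len(best):
            best = component
    return best

# Toy signed network: one 4-node component and one 2-node component.
toy = {
    1: {2: 1, 3: -1},
    2: {1: 1, 3: 1},
    3: {1: -1, 2: 1, 4: -1},
    4: {3: -1},
    5: {6: 1},
    6: {5: 1},
}
gcc = giant_connected_component(toy)
```

Each node and edge is visited once, so the cost per iteration is linear in the size of the current network.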

II-B Signed Network Embedding (SiNE)

We use a classic deep-learning-based signed network embedding method, SiNE, to obtain the embedding vector of each node. In the following, we describe the three fundamental components of this method, i.e., the establishment of the objective function, the construction of a deep learning network, and the update of the parameters. The formulation of the objective function in SiNE is based on structural balance theory, positing that individuals are more like their “friends” than their “enemies”. We utilize $\mathcal{T}=\{(v_i,v_j,v_k)\mid e_{ij}=1, e_{ik}=-1, v_i,v_j,v_k\in V\}$ to denote a collection of triplets, where there is a positive connection between $v_i$ and $v_j$, and a negative connection between $v_i$ and $v_k$.
Hence, it is necessary to assign a greater similarity to $v_i$ and $v_j$ than to $v_i$ and $v_k$. Mathematically, we express this requirement as $f(\mathbf{x}_i,\mathbf{x}_j)\geq f(\mathbf{x}_i,\mathbf{x}_k)+\epsilon$, where $f$ denotes the similarity function to be learned, and $\epsilon$ tunes the margin between the two similarities. A higher value of $\epsilon$ places $v_i$ and $v_j$ closer together and $v_i$ and $v_k$ farther apart in the embedding space. Since this constraint cannot handle nodes whose 2-hop networks contain only positive or only negative links, and given that positive connections are more prevalent than negative ones in real-world networks, SiNE introduces a virtual node $v_0$. A negative link is established between $v_0$ and each node that is connected to its 2-hop neighbors only by positive links.
Let $\mathcal{T}_0=\{(v_i,v_j,v_0)\mid e_{ij}=1, e_{i0}=-1\}$ denote the set of these triplets. For each of them we require $f(\mathbf{x}_i,\mathbf{x}_j)\geq f(\mathbf{x}_i,\mathbf{x}_0)+\epsilon_0$, where $\epsilon_0$ plays a similar role to $\epsilon$. Consequently, the objective function for signed network embedding is

$$
\begin{aligned}
\min_{\mathbf{X},\mathbf{x}_0,\epsilon}\ & \frac{1}{T}\Bigg[\sum_{(\mathbf{x}_i,\mathbf{x}_j,\mathbf{x}_k)\in\mathcal{T}} \max\big(0,\, f(\mathbf{x}_i,\mathbf{x}_k)+\epsilon-f(\mathbf{x}_i,\mathbf{x}_j)\big) \\
& + \sum_{(\mathbf{x}_i,\mathbf{x}_j,\mathbf{x}_0)\in\mathcal{T}_0} \max\big(0,\, f(\mathbf{x}_i,\mathbf{x}_0)+\epsilon_0-f(\mathbf{x}_i,\mathbf{x}_j)\big)\Bigg] \\
& + \lambda\left(H(\phi)+\|\mathbf{X}\|_F^2+\|\mathbf{x}_0\|_2^2\right),
\end{aligned}
\tag{1}
$$

where the size of the training data is denoted by $T=|\mathcal{T}|+|\mathcal{T}_0|$, and $\mathbf{X}=\{\mathbf{x}_1,\mathbf{x}_2,\cdots,\mathbf{x}_N\}$ represents the embedding vectors of the $N$ nodes. The similarity function $f$ is determined by the parameter set $\phi$, and $H(\phi)$ serves as a regularizer to prevent overfitting. The parameter $\lambda$ controls the impact of the regularizers. In addition, $\|\cdot\|_F$ is the Frobenius norm, while $\|\cdot\|_2$ represents the $\ell_2$-norm.
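To make the structure of Eq. (1) concrete, the sketch below collects the triplets in $\mathcal{T}$ from a signed adjacency dictionary and evaluates the objective for a given similarity function. It is an illustration under stated assumptions, not the SiNE implementation: the names `build_triplets` and `hinge_objective` are hypothetical, the virtual-node triplets $\mathcal{T}_0$ are passed in as `(i, j)` pairs rather than constructed, and a plain dot product stands in for the learned function $f$.

```python
import numpy as np

def build_triplets(adj):
    """Collect (i, j, k) with e_ij = +1 and e_ik = -1.

    `adj` maps node -> {neighbor: sign} (assumed format).
    """
    triplets = []
    for i, nbrs in adj.items():
        pos = [j for j, s in nbrs.items() if s == 1]
        neg = [k for k, s in nbrs.items() if s == -1]
        triplets.extend((i, j, k) for j in pos for k in neg)
    return triplets

def hinge_objective(f, X, x0, triplets, triplets0, eps, eps0, lam, H_phi=0.0):
    """Evaluate Eq. (1) for a similarity function f(x_a, x_b) -> float.

    `triplets0` holds (i, j) pairs; the third element is always the
    virtual node v_0, whose embedding is `x0`.
    """
    T = len(triplets) + len(triplets0)
    if T == 0:
        raise ValueError("no training triplets")
    data = 0.0
    for i, j, k in triplets:
        data += max(0.0, f(X[i], X[k]) + eps - f(X[i], X[j]))
    for i, j in triplets0:
        data += max(0.0, f(X[i], x0) + eps0 - f(X[i], X[j]))
    # Regularizer: H(phi) + ||X||_F^2 + ||x0||_2^2
    reg = H_phi + np.sum(X ** 2) + np.sum(x0 ** 2)
    return data / T + lam * reg

# Toy example: node 0 has a friend (node 1) and an enemy (node 2).
adj = {0: {1: 1, 2: -1}, 1: {0: 1}, 2: {0: -1}}
X = np.array([[1.0, 0.0], [1.0, 0.0], [-1.0, 0.0]])
x0 = np.zeros(2)
dot = lambda a, b: float(a @ b)  # stand-in for the learned f
loss = hinge_objective(dot, X, x0, build_triplets(adj), [(1, 0)],
                       eps=3.0, eps0=0.5, lam=0.1)
```

In this toy case only the first hinge term is active: the friend is more similar than the enemy, but not by the full margin `eps`.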

The objective function is optimized to obtain nonlinear embedding vectors for the nodes of a signed network. Within the SiNE framework, the function $f$ and the parameter set $\phi$ in the objective function are defined through the construction of a neural network. The network has two hidden layers, where $\mathbf{W}^{11}$ and $\mathbf{W}^{12}$ are the weights of the first hidden layer, and $\mathbf{b}^{1}$ is the bias. The output of the first layer is as follows:

$$
\begin{aligned}
\mathbf{z}^{11} &= \tanh(\mathbf{W}^{11}\mathbf{x}_i+\mathbf{W}^{12}\mathbf{x}_j+\mathbf{b}^{1}), \\
\mathbf{z}^{12} &= \tanh(\mathbf{W}^{11}\mathbf{x}_i+\mathbf{W}^{12}\mathbf{x}_k+\mathbf{b}^{1}).
\end{aligned}
\tag{2}
$$

Similarly, the outputs of the first layer, $\mathbf{z}^{11}$ and $\mathbf{z}^{12}$, serve as inputs of the second layer. The output of the second hidden layer is expressed as $\mathbf{z}^{21}=\tanh(\mathbf{W}^{2}\mathbf{z}^{11}+\mathbf{b}^{2})$ and $\mathbf{z}^{22}=\tanh(\mathbf{W}^{2}\mathbf{z}^{12}+\mathbf{b}^{2})$, where $\mathbf{W}^{2}$ represents the weight of the second-layer network, and $\mathbf{b}^{2}$ denotes the bias. Thus, the final output of the neural network determines the nonlinear function $f$ used to evaluate node similarity in the objective function, which can be expressed as

$$
f(\mathbf{x}_i,\mathbf{x}_j)=\tanh\left(\mathbf{w}^{T}\mathbf{z}^{21}+b\right),
\tag{3}
$$

and

$$
f(\mathbf{x}_i,\mathbf{x}_k)=\tanh\left(\mathbf{w}^{T}\mathbf{z}^{22}+b\right),
\tag{4}
$$

where the elements of the vector $\mathbf{w}$ are the weights and the scalar $b$ denotes the bias. The parameter set $\phi$ in the objective function is given by $\phi=\{\mathbf{W}^{11},\mathbf{W}^{12},\mathbf{W}^{2},\mathbf{w},\mathbf{b}^{1},\mathbf{b}^{2},b\}$, and $H$ is given by $H(\phi)=\|\mathbf{W}^{11}\|_F^2+\|\mathbf{W}^{12}\|_F^2+\|\mathbf{W}^{2}\|_2^2+\|\mathbf{w}\|_2^2+\|\mathbf{b}^{1}\|_2^2+\|\mathbf{b}^{2}\|_2^2+b^2$.
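The forward pass defined by Eqs. (2) to (4) can be sketched in a few lines of NumPy. This is a minimal illustration, not the SiNE reference code: the helper names (`init_params`, `similarity`, `regularizer`), the layer sizes, and the random initialization are assumptions, and the regularizer treats every norm as the sum of squared entries.

```python
import numpy as np

def init_params(d, h1, h2, seed=0):
    """Randomly initialize phi for embedding size d and hidden sizes h1, h2."""
    rng = np.random.default_rng(seed)
    return {
        "W11": rng.normal(scale=0.1, size=(h1, d)),
        "W12": rng.normal(scale=0.1, size=(h1, d)),
        "b1": np.zeros(h1),
        "W2": rng.normal(scale=0.1, size=(h2, h1)),
        "b2": np.zeros(h2),
        "w": rng.normal(scale=0.1, size=h2),
        "b": 0.0,
    }

def similarity(phi, x_i, x_j):
    """f(x_i, x_j): two tanh hidden layers followed by a tanh output unit."""
    z1 = np.tanh(phi["W11"] @ x_i + phi["W12"] @ x_j + phi["b1"])  # Eq. (2)
    z2 = np.tanh(phi["W2"] @ z1 + phi["b2"])                       # second layer
    return float(np.tanh(phi["w"] @ z2 + phi["b"]))                # Eqs. (3), (4)

def regularizer(phi):
    """H(phi), taken here as the sum of squared entries of all parameters."""
    return float(sum(np.sum(np.square(p)) for p in phi.values()))

phi = init_params(d=4, h1=3, h2=2)
s = similarity(phi, np.ones(4), -np.ones(4))
```

Because $\mathbf{W}^{11}$ and $\mathbf{W}^{12}$ are separate matrices, $f$ is not symmetric in its two arguments in general; the first argument plays the role of the anchor node $v_i$.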

In the SiNE framework, backpropagation is employed to optimize the deep learning network. This process entails updating network parameters by backpropagating “errors”, facilitating a more efficient computation of gradients. The key to optimizing the objective function lies in obtaining gradients with respect to the parameters $\mathbf{X}$, $\mathbf{x}_0$, and $\phi$ for $\max(0, f(\mathbf{x}_i,\mathbf{x}_k)+\epsilon-f(\mathbf{x}_i,\mathbf{x}_j))$ and $\max(0, f(\mathbf{x}_i,\mathbf{x}_0)+\epsilon_0-f(\mathbf{x}_i,\mathbf{x}_j))$. Based on the mini-batch stochastic gradient descent algorithm, the training data is divided into small batches during each training iteration. Subsequently, the gradients for the current batch are computed using the backpropagation method. These gradients are then propagated backward from the output layer to the input layer, elucidating the influence of each parameter on the overall network output “errors”.
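As a worked step that the text leaves implicit, the (sub)gradient of a single hinge term with respect to any parameter $\theta\in\{\mathbf{X},\mathbf{x}_0,\phi\}$ follows from the chain rule. Here $\epsilon$ is treated as a fixed constant, and the learning rate $\eta$ and batch $\mathcal{B}$ are symbols introduced only for this illustration:

$$
\frac{\partial}{\partial\theta}\max\big(0,\, f(\mathbf{x}_i,\mathbf{x}_k)+\epsilon-f(\mathbf{x}_i,\mathbf{x}_j)\big)=
\begin{cases}
\dfrac{\partial f(\mathbf{x}_i,\mathbf{x}_k)}{\partial\theta}-\dfrac{\partial f(\mathbf{x}_i,\mathbf{x}_j)}{\partial\theta}, & \text{if } f(\mathbf{x}_i,\mathbf{x}_k)+\epsilon>f(\mathbf{x}_i,\mathbf{x}_j),\\[2mm]
0, & \text{otherwise.}
\end{cases}
$$

A triplet that already satisfies the margin therefore contributes nothing to the update. One mini-batch step then takes the generic form $\theta\leftarrow\theta-\eta\,\mathbf{g}_{\mathcal{B}}$, where $\mathbf{g}_{\mathcal{B}}$ is the average of these subgradients over the triplets in $\mathcal{B}$ plus the gradient of the regularization term. The term for $\mathcal{T}_0$ is handled in the same way with $\mathbf{x}_0$ and $\epsilon_0$ in place of $\mathbf{x}_k$ and $\epsilon$.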

II-C Node clustering

After obtaining the embedding vector of each node using SiNE, we further use the K-means algorithm to partition the nodes in the network into $k$ clusters, where $k$ is a tunable parameter. We illustrate the details of using K-means as follows:

  • Initialization: We randomly select $k$ nodes from the signed network, and each of them serves as the central node for one of the $k$ clusters.

  • Assignment: For every remaining node in the network, we compute the Euclidean distance between its embedding vector and each cluster center. We then assign the node to the cluster whose center is nearest, so that each cluster consists of the nodes most similar to its centroid.

  • Update Centroids: The average of the embedding vectors of the nodes is computed for each cluster, and this average is then designated as the new cluster center.

  • Iteration: The assignment and update centroids steps are iterated until either the cluster centers stabilize or the specified number of iterations is reached.

Because each iteration involves a relatively low computational burden, the K-means algorithm runs quickly. By setting the number of clusters ($k$), it promptly aids in selecting nodes and improves the efficiency of the algorithm proposed in this paper.
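The four K-means steps listed above can be sketched as follows. This is a plain NumPy illustration, assuming the embeddings are stacked row-wise in an array `X`; the function name `kmeans`, the `max_iter` default, and the handling of empty clusters are choices made for the example, not details taken from the paper.

```python
import numpy as np

def kmeans(X, k, max_iter=100, seed=0):
    """Cluster the rows of X (one embedding vector per node) into k groups."""
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    # Initialization: k distinct nodes serve as the initial cluster centers.
    centers = X[rng.choice(len(X), size=k, replace=False)].copy()
    labels = np.zeros(len(X), dtype=int)
    for _ in range(max_iter):
        # Assignment: Euclidean distance from every node to every center.
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Update centroids: each center becomes the mean of its members.
        new_centers = centers.copy()
        for c in range(k):
            members = X[labels == c]
            if len(members) > 0:  # an empty cluster keeps its old center
                new_centers[c] = members.mean(axis=0)
        # Iteration: stop once the centers no longer move.
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers

# Two well-separated groups of toy "embeddings".
X = np.array([[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]])
labels, centers = kmeans(X, k=2)
```

Library implementations typically add multiple restarts and smarter seeding; the version here only mirrors the four steps as described.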

II-D Node Elimination

Empirical evidence indicates that most nodes are affiliated with a single cluster, while only a minority are assigned to various other distinct clusters. Furthermore, previous studies have indicated that the elimination of nodes within a cluster or community can improve the efficiency of network dismantling [38, 39]. Hence, we utilize the largest cluster as the central part for decomposition. More precisely, at each stage of the attack process, we will pinpoint the largest cluster in the network and remove the node with the highest degree in that cluster.
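Putting the steps of this section together, the removal loop might be sketched as below. This is a hedged outline rather than the authors' code: `cluster_fn` is a hypothetical callable standing in for "SiNE embedding followed by K-means" that returns `{node: cluster id}`, and the adjacency format `{node: {neighbor: sign}}` is assumed.

```python
import random
from collections import Counter, deque

def largest_component(adj):
    """BFS over all nodes; return the node set of the largest component."""
    seen, best = set(), set()
    for start in adj:
        if start in seen:
            continue
        comp, queue = {start}, deque([start])
        seen.add(start)
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in seen:
                    seen.add(v)
                    comp.add(v)
                    queue.append(v)
        if len(comp) > len(best):
            best = comp
    return best

def select_node(adj, labels, rng=random):
    """Highest-degree node in the largest cluster; ties broken at random."""
    largest = Counter(labels.values()).most_common(1)[0][0]
    members = [v for v in adj if labels[v] == largest]
    top = max(len(adj[v]) for v in members)  # degree ignores edge signs
    return rng.choice([v for v in members if len(adj[v]) == top])

def dismantle(adj, cluster_fn, q_r, rng=random):
    """Remove nodes one by one until the removed fraction q reaches q_r."""
    adj = {u: dict(nbrs) for u, nbrs in adj.items()}  # work on a copy
    n0, removed = len(adj), []
    while adj and len(removed) / n0 < q_r:
        gcc = largest_component(adj)
        sub = {u: {v: s for v, s in adj[u].items() if v in gcc} for u in gcc}
        labels = cluster_fn(sub)  # hypothetical: embedding + clustering
        v = select_node(sub, labels, rng)
        for u in adj.pop(v):      # delete v and all of its edges
            adj[u].pop(v, None)
        removed.append(v)
    return removed

# Toy run: a hub (node 0) with four leaves, plus a separate pair.
toy = {0: {1: 1, 2: -1, 3: 1, 4: -1}, 1: {0: 1}, 2: {0: -1},
       3: {0: 1}, 4: {0: -1}, 5: {6: 1}, 6: {5: 1}}
one_cluster = lambda sub: {v: 0 for v in sub}  # trivial stand-in clustering
removed = dismantle(toy, one_cluster, q_r=0.1)
```

With the trivial single-cluster stand-in, the rule reduces to removing the highest-degree node of the GCC, which is why the hub is removed first in the toy run.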

III Baselines

To demonstrate the enhanced effectiveness of ESND in network dismantling, we have selected 12 classic centrality metrics as benchmarks. These metrics encompass those that are agnostic to the sign of the network, such as Degree, Betweenness, K-shell, and Closeness, as well as those that take into account the edge signs, such as P-DEG, N-DEG, Net-DEG, Ratio-DEG, PN, TE, and SPR. The basic explanations of these centrality metrics are provided below.

  • Degree: Degree quantifies the number of direct neighbors of a node when we ignore the sign of the edges, and nodes with higher degrees are generally considered more important. The degree centrality of node $v_i$ is $\frac{k_i}{N-1}$, where $N$ represents the number of nodes, $N-1$ is the maximum possible degree of a node, and $k_i$ is the degree of node $v_i$, i.e., the number of its neighbors.

  • Betweenness: It assesses the role of a node in the shortest paths between other nodes. The betweenness centrality of node $v_i$ is $\sum_{i\neq s,\, j\neq t,\, s\neq t}\frac{g_{st}^{i}}{g_{st}}$, where $g_{st}$ represents the total number of shortest paths from node $v_s$ to $v_t$, and $g_{st}^{i}$ denotes the number of these $g_{st}$ shortest paths that pass through $v_i$.

  • K-shell: K-shell centrality categorizes nodes based on their degrees to evaluate their importance in a network. Assuming there are no isolated nodes in the network, we eliminate nodes with one connection until no more such nodes remain and assign them to the 1-shell. Similarly, we recursively eliminate nodes with a degree of 2 to form the 2-shell. This process concludes when all nodes have been allocated to one of the shells.

  • Closeness: This centrality is a global indicator of a node's position in the network, defined as the reciprocal of the average distance between the node and the remaining nodes. The closeness centrality of node $v_i$ is $\frac{N-1}{\sum_{j\neq i}d_{ij}}$, where $d_{ij}$ is the length of the shortest path between node $v_i$ and node $v_j$. A higher closeness value indicates that $v_i$ is closer to the other nodes in the network.

  • Positive degree (P-DEG): P-DEG counts the number of positive edges linked to a node, which is referred to as the positive degree. Thus, the P-DEG centrality value of node $v_i$ is given by its number of positive edges $k_i^{+}$.

  • Negative degree (N-DEG): N-DEG quantifies the number of negative edges associated with each node, denoted as the negative degree. The N-DEG centrality of node $v_i$ is given by its number of negative edges $k_i^{-}$.

  • Net degree (Net-DEG): This metric represents the difference between the number of positive edges and negative edges that a node has. For node $v_i$, the Net-DEG value is $k_i^{+} - k_i^{-}$.

  • Ratio degree (Ratio-DEG): It represents the proportion of positive edges that a node $v_i$ has among its total number of edges in the network, which reads $\frac{k_i^{+}}{k_i^{+} + k_i^{-}}$.

  • PN centrality [40]: Everett and Borgatti argue that nodes with more positive connections are more significant, while nodes with more negative connections are less important. Thus, they propose the PN index to evaluate node importance in signed networks, calculated using the following formula

    $$PN = \left(I - \frac{1}{2N-2}A\right)^{-1}\mathbf{1}, \tag{5}$$

    where $N$ represents the number of nodes in the network, $I$ is the $N$-order identity matrix, $A = A^{+} - 2A^{-}$, and $A^{+}$ (or $A^{-}$) represents the adjacency matrix containing only positive (or negative) edges. $\mathbf{1}$ denotes an $N$-dimensional vector with all elements equal to 1.

  • TE [41]: This index calculates the centrality of a target node by considering the total effect (TE) of all other nodes on it in the network. The higher the value of TE, the more important the node. For an undirected signed network, if there is an edge between $v_i$ and $v_j$, the effect of $v_i$ on $v_j$ is defined as $E_{ij,1^{S}} = S \times \frac{1}{D_j}$, where $S$ is the sign ($+1$ or $-1$) of the edge $e_{ij}$, and $D_j$ is the degree of $v_j$. We construct two matrices $CE_{n^{+}} = \{CE_{ij,n^{+}}\}_{N\times N} = \{\sum_{l=1}^{n} E_{ij,l^{+}}\}_{N\times N}$ and $CE_{n^{-}} = \{CE_{ij,n^{-}}\}_{N\times N} = \{\sum_{l=1}^{n} E_{ij,l^{-}}\}_{N\times N}$ to represent the sums of the positive and negative effects of $v_i$ on $v_j$ up to $n$ steps, respectively. Therefore, $TE_{ij,n} = CE_{ij,n^{+}} + \left|CE_{ij,n^{-}}\right|$ indicates the sum of effects from $v_i$ to $v_j$, and the TE value of $v_i$ is further given by

    $$TE_{i,n} = \sum_{j=1}^{N} TE_{ij,n}. \tag{6}$$

    Here, we set $n=2$, meaning that we only calculate the effect of a node on its neighbors within two hops.

  • Signed-PageRank (SPR) [42]: SPR is a PageRank algorithm adapted for signed networks, which updates the SPR value of each node in each iteration by aggregating the weights and sign information. The formula for updating the SPR value of $v_i$ at iteration $t+1$ is

    $$SPR_{i,t+1} = \sum_{v_j \in D_i^{out}} \left(SPR_{i,t} - SPR_{j,t}\right) y_{i,j} + \frac{1-d}{N}, \tag{7}$$

    where $D_i^{out}$ is the set of out-neighbors of $v_i$. $Y = \{y_{i,j}\}_{N\times N} = dH$ is the Signed-PageRank adjacency matrix with damping coefficient $d$, where $H$ represents the Hadamard product of the normalized weight matrix $W$ and the label matrix $L$. In our work, the weights of all edges are equal to $1$; thus, in matrix $W = \{w_{ij}\}_{N\times N}$, $w_{ij} = \frac{1}{D_i}$ if $v_i$ and $v_j$ have a connection. In matrix $L = \{l_{ij}\}_{N\times N}$, $l_{ij} = 1$ if there is a positive connection between $v_i$ and $v_j$, and $l_{ij} = -1$ signifies a negative connection between them. Unlike the PageRank algorithm, the iteration of the Signed-PageRank algorithm continues until the ranking of nodes based on SPR values remains unchanged. Here, we consider the final rank of the nodes as their importance.

  • Signed Eigenvector (SE) [43]: SE is an extension of eigenvector centrality to signed networks. The main idea is that a node with more positive edges to the important nodes is more important, and vice versa for nodes with more negative edges to the important nodes. Given the label matrix $L_{N\times N}$ of an undirected and unweighted signed network, we can swap the rows and columns of $L$ to obtain a matrix

    $$A = \begin{pmatrix} L^{+} & L^{-} \\ L^{-} & L^{+} \end{pmatrix}, \tag{8}$$

    where $L^{+}_{n_1\times n_1}$ is an adjacency matrix containing only positive edges, $L^{-}_{n_2\times n_2}$ denotes an adjacency matrix with negative edges, and $n_1 + n_2 = N$. Let $B = DAD$, where $D$ is a diagonal matrix whose first $n_1$ diagonal elements are $1$ and whose remaining $n_2$ diagonal elements are equal to $-1$. Because $B$ contains only non-negative elements, it has a positive eigenvalue $\lambda$ with a corresponding eigenvector $x$. Since $Bx = DADx = \lambda x$, we have $ADx = \lambda D^{-1}x = \lambda Dx$. Therefore, the signed eigenvector centrality of each node can be represented by the eigenvector $Dx$ when $Dx$ is in a steady state.
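As a rough illustration of the sign-aware baselines above, the following pure-Python sketch computes P-DEG, N-DEG, Net-DEG, and Ratio-DEG from a signed edge list, and evaluates the PN index of Eq. (5) by solving the linear system $\left(I - \frac{1}{2N-2}A\right)PN = \mathbf{1}$ instead of forming the inverse. The function names, the `(u, v, sign)` edge format, and the dense Gaussian elimination are assumptions made for this example and are not taken from the original implementations; the dense solver is only practical for small networks.

```python
from collections import defaultdict


def signed_degrees(edges):
    """Degree-based baselines for an undirected signed edge list.

    edges: list of (u, v, sign) tuples with sign in {+1, -1} (assumed format).
    """
    pos, neg = defaultdict(int), defaultdict(int)
    for u, v, sign in edges:
        counter = pos if sign > 0 else neg
        counter[u] += 1
        counter[v] += 1
    scores = {}
    for node in set(pos) | set(neg):
        kp, kn = pos[node], neg[node]
        scores[node] = {
            "P-DEG": kp,
            "N-DEG": kn,
            "Net-DEG": kp - kn,
            "Ratio-DEG": kp / (kp + kn),
        }
    return scores


def pn_centrality(nodes, edges):
    """PN index of Eq. (5): solve (I - A / (2N - 2)) x = 1 with A = A+ - 2A-."""
    n = len(nodes)
    index = {node: i for i, node in enumerate(nodes)}
    scale = 1.0 / (2 * n - 2)
    m = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for u, v, sign in edges:
        a = 1.0 if sign > 0 else -2.0
        i, j = index[u], index[v]
        m[i][j] -= scale * a
        m[j][i] -= scale * a
    rhs = [1.0] * n
    # Gaussian elimination with partial pivoting (fails if the system is singular).
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        rhs[col], rhs[pivot] = rhs[pivot], rhs[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n):
                m[r][c] -= factor * m[col][c]
            rhs[r] -= factor * rhs[col]
    # Back substitution.
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (rhs[r] - tail) / m[r][r]
    return {node: x[index[node]] for node in nodes}


toy_edges = [("a", "b", 1), ("b", "c", -1)]
print(signed_degrees(toy_edges)["b"])
print(pn_centrality(["a", "b", "c"], toy_edges))
```

On this toy network, node `c`, whose only edge is negative, receives the lowest PN value, which matches the intuition stated for the PN index.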

IV Experiments

We apply the proposed ESND to dismantle six real signed networks and three signed network null models, and compare the results of ESND with those of the baselines, using the null models to assess the stability of ESND. Additionally, we compute Kendall correlation coefficients between the node removal sequences generated by the different dismantling strategies to analyze differences in node selection. Finally, we use artificial network models to test how the ratio of negative edges affects the robustness of a signed network.

IV-A Datasets

We select six real-world datasets that can be constructed as signed networks to evaluate the performance of our method. Specifically, Bitcoinalpha and Bitcoinotc are sourced from SNAP (https://snap.stanford.edu/data/) and describe the trust networks between users participating in Bitcoin transactions. Because Bitcoin transactions are anonymous, users give each other positive or negative ratings to signify trust (positive) or distrust (negative). WikiVote represents the voting network used to select Wikipedia administrators (https://doi.org/10.6084/m9.figshare.12152628). The eligibility of users for administration is determined through voting, with the edges denoting voting interactions, i.e., positive signs indicate support while negative signs indicate opposition. Slashdot is a notable technology news site where users comment on and share technology-related information (https://www.aminer.cn/data-sna). Positive and negative signs in the dataset denote friendly or adversarial relationships between users. Reddit captures connections between users in diverse sub-communities, reflecting positive or negative sentiment in shared content across online accounts (same source as WikiVote). Epinions is a trust network among users of a product review website, with positive and negative signs indicating trust or distrust between users (same source as Slashdot). We show the topological information of these signed networks in Table I, including the number of nodes ($N$), the number of edges ($M$), the number of positive edges ($|E^{+}|$), the number of negative edges ($|E^{-}|$), the average degree of nodes ($\langle k \rangle$), and the clustering coefficient ($C$). The table shows that all the signed networks have more positive edges than negative ones.

TABLE I: Topological information of the signed networks, in which $N$ denotes the number of nodes; $M$ represents the number of edges; $|E^{+}|$ and $|E^{-}|$ indicate the number of positive and negative edges, respectively. The values in parentheses represent the proportions of positive and negative edges in the network; $\langle k \rangle$ denotes the average degree, and $C$ signifies the average clustering coefficient.

| Network | $N$ | $M$ | $\lvert E^{+}\rvert$ | $\lvert E^{-}\rvert$ | $\langle k \rangle$ | $C$ |
|---|---|---|---|---|---|---|
| Bitcoinalpha | 3783 | 14124 | 12759 (90%) | 1365 (10%) | 7.47 | 0.177 |
| Bitcoinotc | 5881 | 21492 | 18250 (85%) | 3242 (15%) | 7.31 | 0.178 |
| WikiVote | 7118 | 100751 | 78658 (78%) | 22093 (22%) | 28.3 | 0.141 |
| Slashdot | 13182 | 34260 | 28884 (84.3%) | 5376 (15.7%) | 5.19 | 0.149 |
| Reddit | 18282 | 107301 | 99084 (92.3%) | 8217 (7.7%) | 11.74 | 0.374 |
| Epinions | 25148 | 99880 | 69185 (69.2%) | 30695 (30.7%) | 7.94 | 0.073 |

IV-B Performance Evaluation Metric

Network dismantling methods aim to produce a node removal sequence that disrupts the network as much as possible. We use the robustness metric $R$ to assess the performance of ESND as well as the baselines [44, 45]:

$$R = \frac{1}{N}\sum_{Q=1}^{N} s(Q), \tag{9}$$

where $N$ is the size of the network, $s(Q)$ represents the fraction of nodes in the largest connected component after the removal of $Q = qN$ nodes, and $1/N$ is a normalization factor that allows the robustness of networks of different sizes to be compared. Computing $R$ requires a node ranking; the various dismantling methods therefore aim to find the node order that minimizes $R$. A lower value of $R$ indicates that the method is more effective in destroying the network.
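To make Eq. (9) concrete, the sketch below evaluates $R$ for a given removal order by recomputing the largest connected component after each removal. It is a minimal pure-Python illustration: the function names and the input format (a node list, an unsigned edge list, and a full removal order) are assumptions of this example, and recomputing components from scratch at every step is chosen for clarity rather than speed.

```python
def largest_component_size(adjacency, alive):
    """Size of the largest connected component restricted to `alive` nodes."""
    best, seen = 0, set()
    for start in alive:
        if start in seen:
            continue
        seen.add(start)
        stack, size = [start], 0
        while stack:
            node = stack.pop()
            size += 1
            for neighbor in adjacency[node]:
                if neighbor in alive and neighbor not in seen:
                    seen.add(neighbor)
                    stack.append(neighbor)
        best = max(best, size)
    return best


def robustness(nodes, edges, removal_order):
    """R = (1/N) * sum_{Q=1..N} s(Q), with s(Q) measured relative to N.

    removal_order is assumed to list every node exactly once.
    """
    n = len(nodes)
    adjacency = {node: set() for node in nodes}
    for u, v in edges:
        adjacency[u].add(v)
        adjacency[v].add(u)
    alive = set(nodes)
    total = 0.0
    for node in removal_order:
        alive.discard(node)
        total += largest_component_size(adjacency, alive) / n
    return total / n


# Path graph 0 - 1 - 2: removing the middle node first fragments it faster.
path_nodes, path_edges = [0, 1, 2], [(0, 1), (1, 2)]
print(robustness(path_nodes, path_edges, [1, 0, 2]))
print(robustness(path_nodes, path_edges, [0, 1, 2]))
```

Removing the middle node first yields the smaller $R$, in line with the statement that a lower $R$ corresponds to a more effective removal order.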

IV-C Parameter Analysis

Figure 2: Performance of ESND under different parameter settings. The x-axis indicates the number of clusters $k$ and the y-axis shows the performance of network dismantling. Different curves show the use of different values of dimension $d$ in the embedding procedure. We show results for: (a) Bitcoinalpha; (b) Bitcoinotc; (c) WikiVote; (d) Slashdot; (e) Reddit; (f) Epinions.

To optimize the dismantling performance of the proposed method, we perform a thorough analysis of its parameters. Specifically, we focus on two key parameters, i.e., the embedding dimension $d$ and the number of clusters $k$, and keep the other parameters fixed: the number of hidden layers $L=2$, the learning rate $\lambda=0.0001$, and the similarity parameters $\epsilon=1$ and $\epsilon_0=0.5$. These fixed values are also used in all subsequent experiments. We systematically compare the $R$ values for each dataset across different values of $d$ and $k$; the results are given in Figure 2. We observe that when $k$ is unchanged, the smallest $R$ is given by $d=20$ in most networks, except Reddit, where $d=128$ achieves the best performance. Meanwhile, as $k$ increases, the value of $R$ decreases and reaches its minimum at $k=8$ in all networks. Therefore, in the following experiments, we set $k=8$ for the six networks, $d=128$ for Reddit, and $d=20$ for the remaining networks.

IV-D Experimental Results

IV-D1 Results on Real Signed Networks

We compare the performance of ESND with the selected baselines on the six signed networks; the results are given in Figure 3 and Table II. In Figure 3, the horizontal axis ($q$) represents the proportion of nodes removed, while the vertical axis ($S(qN)$) corresponds to the fraction of nodes in the GCC after removing a fraction $q$ of nodes. For a fixed value of $q$, a smaller value of $S(qN)$ indicates that the dismantling method is more effective than the other methods on the corresponding signed network. The values in Table II report the area under each curve (AUC) in Figure 3, with a smaller value indicating better performance of the corresponding dismantling method. The experimental results show that the robustness of these real signed networks differs notably: some of them, such as Slashdot and Epinions, collapse quickly after only a small fraction of nodes is removed, while the remaining ones are more robust. For example, for most of the dismantling methods, WikiVote and Reddit require the removal of approximately 40% of the nodes to be completely dismantled. In addition, ESND outperforms all baseline methods on most signed networks, particularly when a large fraction of nodes is removed. In dismantling an unsigned network, betweenness normally outperforms the other methods in most cases [45]. However, it is only second best in most cases in dismantling signed networks, which reveals that considering the topology induced by the signs is important in dismantling a signed network. Moreover, the performance of several centrality methods, including Closeness, K-shell, PN, TE, SPR, and SE, varies across datasets, since the efficacy of these methods is closely related to the specific structures of the networks. In contrast, ESND achieves the best AUC on five of the six datasets and the second best on WikiVote (Table II), showing its stability and effectiveness.

Figure 3: Comparison of ESND with baselines on signed networks: (a) Bitcoinalpha; (b) Bitcoinotc; (c) WikiVote; (d) Slashdot; (e) Reddit; (f) Epinions. The x-axis shows the fraction of nodes removed and the y-axis shows the fraction of nodes in the giant component after node removal.
TABLE II: Area under the curve (AUC) for each curve in Figure 3. The best performance is highlighted in bold.

| Network | ESND | Degree | P-DEG | N-DEG | Net-DEG | Ratio-DEG | Closeness | Betweenness | K-shell | PN | TE | SPR | SE |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Bitcoinalpha | **0.0596** | 0.0704 | 0.0699 | 0.1603 | 0.0907 | 0.4356 | 0.1305 | 0.0782 | 0.0949 | 0.1327 | 0.0823 | 0.2955 | 0.2972 |
| Bitcoinotc | **0.0499** | 0.0579 | 0.0611 | 0.1599 | 0.1326 | 0.4538 | 0.1133 | 0.0634 | 0.0789 | 0.1889 | 0.0692 | 0.404 | 0.4021 |
| WikiVote | 0.1455 | 0.1521 | 0.1644 | 0.2315 | 0.2686 | 0.4562 | 0.1675 | **0.1363** | 0.1596 | 0.3610 | 0.1649 | 0.4067 | 0.4274 |
| Slashdot | **0.0106** | 0.0128 | 0.0161 | 0.0975 | 0.1357 | 0.4246 | 0.022 | 0.0119 | 0.0289 | 0.2087 | 0.0137 | 0.0958 | 0.2186 |
| Reddit | **0.0802** | 0.0921 | 0.0915 | 0.1778 | 0.0935 | 0.4616 | 0.1774 | 0.1033 | 0.1152 | 0.1097 | 0.1027 | 0.3748 | 0.3173 |
| Epinions | **0.0097** | 0.0123 | 0.0161 | 0.0877 | 0.2774 | 0.4003 | 0.0457 | 0.0108 | 0.0256 | 0.3428 | 0.0147 | 0.0989 | 0.2363 |

Different methods dismantle networks with different efficacy because they select nodes differently in each iteration. To examine how the node removal sequences generated by these methods differ, we conduct a correlation analysis. For each method, we first obtain its node removal sequence. Then we calculate the Kendall correlation coefficient between the node sequences obtained by each pair of dismantling methods; the results are given in Figure 4. In particular, the Kendall correlation coefficients between the proposed ESND and the baselines are generally low, indicating that the node removal strategy of ESND deviates substantially from those of the baseline methods. Additionally, Degree, P-DEG, N-DEG, Net-DEG, and Ratio-DEG, although all derived from node degree, yield node sequences with relatively low correlation.
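The correlation analysis can be sketched as follows. The text does not state which Kendall variant is used or how partial sequences are handled, so this example assumes that both sequences are complete orderings of the same node set (no ties) and computes Kendall's tau-a from the removal positions; `kendall_tau` is a hypothetical helper name.

```python
from itertools import combinations


def kendall_tau(seq_a, seq_b):
    """Kendall tau-a between two removal orders over the same node set."""
    if set(seq_a) != set(seq_b) or len(seq_a) != len(set(seq_a)):
        raise ValueError("sequences must be permutations of the same nodes")
    pos_a = {node: i for i, node in enumerate(seq_a)}
    pos_b = {node: i for i, node in enumerate(seq_b)}
    concordant = discordant = 0
    for u, v in combinations(seq_a, 2):
        # A pair is concordant if both methods remove u and v in the same order.
        if (pos_a[u] - pos_a[v]) * (pos_b[u] - pos_b[v]) > 0:
            concordant += 1
        else:
            discordant += 1
    n_pairs = len(seq_a) * (len(seq_a) - 1) // 2
    return (concordant - discordant) / n_pairs


print(kendall_tau(["a", "b", "c"], ["a", "b", "c"]))  # identical orders
print(kendall_tau(["a", "b", "c"], ["c", "b", "a"]))  # reversed orders
```

A value near 1 means two methods remove nodes in almost the same order, a value near 0 means the orders are largely unrelated, and a value near -1 means they are nearly reversed. The quadratic pair loop is adequate for a sketch; networks of the size listed in Table I would need an $O(n \log n)$ implementation.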

Figure 4: Analysis of differences between removal node sequences generated by different methods. Each square represents the Kendall correlation coefficient between the removed node sequences generated by the corresponding pair of methods. We show the results for the following signed networks: (a) Bitcoinalpha; (b) Bitcoinotc; (c) WikiVote; (d) Slashdot; (e) Reddit; (f) Epinions.

IV-D2 Results on Null Models of the Signed Network

To examine how factors such as network topology and sign distribution influence ESND and the resulting dismantling outcomes, we construct three distinct null models for signed networks [46] for each of the six datasets. We illustrate examples of the null models in Figure 5; each of them preserves certain properties of the original network. Detailed descriptions are given below.

Figure 5: Toy examples of null models of a signed network. Solid lines represent positive edges, and dotted lines represent negative edges.
  • Sign shuffle: In this model, the topological structure of the network is preserved by randomly selecting one positive edge and one negative edge and exchanging their signs, but the positive and negative degrees of each node will change. Taking node $v_1$ in Figure 5a and b as an example, the degree of node $v_1$ is preserved, but the positive degree and negative degree of node $v_1$ change from $\{2,0\}$ to $\{1,1\}$ via the sign shuffle model.

  • Signed rewire: Initially, two subgraphs containing only positive or only negative edges are constructed from the original network. Subsequently, the edges are rewired within each subgraph, which preserves the positive and negative degrees of the nodes. The process ends by merging the two rewired subgraphs to establish the null model. In this model, the positive and negative degrees of the nodes remain the same as in the original network, while the network structure is changed. Figure 5c demonstrates the generation of a signed rewire null model. For example, we disconnect the edges $(v_1,v_2)$, $(v_5,v_6)$ and form new edges $(v_1,v_5)$, $(v_2,v_6)$ but keep the positive and negative degree of each node.

  • Rewire: The model exchanges edges between nodes while keeping the degree of each node unchanged. In this null model, both the topological structure of the network and the positive and negative degrees of each node undergo alterations. In Figure 5d, we show that the positive degree and negative degree of each node are changed through the random rewiring process of the rewire model.
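The sign shuffle model above is simple enough to sketch directly. The example below repeatedly picks one positive and one negative edge at random and exchanges their signs, so the set of edges and the overall numbers of positive and negative edges stay fixed. The function name, the `(u, v, sign)` edge format, and the `n_swaps` parameter are assumptions of this illustration; the number of swaps used in the experiments is not stated in the text.

```python
import random


def sign_shuffle(signed_edges, n_swaps, seed=None):
    """Exchange the signs of randomly chosen positive/negative edge pairs."""
    rng = random.Random(seed)
    signed_edges = list(signed_edges)
    pairs = [(u, v) for u, v, _ in signed_edges]
    pos = [i for i, (_, _, s) in enumerate(signed_edges) if s > 0]
    neg = [i for i, (_, _, s) in enumerate(signed_edges) if s < 0]
    if pos and neg:
        for _ in range(n_swaps):
            a = rng.randrange(len(pos))
            b = rng.randrange(len(neg))
            # The chosen positive edge becomes negative and vice versa.
            pos[a], neg[b] = neg[b], pos[a]
    negative = set(neg)
    return [(u, v, -1 if i in negative else 1) for i, (u, v) in enumerate(pairs)]


original = [(1, 2, 1), (1, 3, 1), (2, 3, -1), (3, 4, 1), (4, 5, -1)]
shuffled = sign_shuffle(original, n_swaps=10, seed=0)
print(shuffled)
```

Because only signs move between existing edges, every node keeps its total degree, while its positive and negative degrees may change, as described for node $v_1$ in Figure 5.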

Figure 6: Network dismantling on different signed network null models. We show the results for signed networks: Bitcoinalpha; Bitcoinotc; WikiVote; Slashdot; Reddit and Epinions. The results are the average of 100 realizations.

We perform network dismantling on the null models generated from each of the six real-world signed networks; the results are illustrated in Figure 6. The horizontal axis denotes the original signed networks and their corresponding null models, while the vertical axis represents the evaluation metric $R$ for the various network dismantling methods applied to these signed networks and null models. The results show that each dismantling method performs generally consistently across the original network and the three null models within the same dataset. This suggests that modifications to the topological properties and sign distribution of the signed network do not significantly affect the efficacy of these methods. More importantly, ESND consistently attains superior dismantling performance compared to the baselines across these varied null models (as shown by the red diamonds in the figures), emphasizing the stability of ESND as an effective method for dismantling networks.

IV-D3 Impact of the signs on network robustness

We further examine the robustness of a signed network under different ratios of positive and negative edges. Specifically, we first generate unsigned synthetic networks, i.e., ER, WS, and BA, and then assign different ratios of positive and negative edges to the networks. Finally, we evaluate the robustness of these networks by using ESND. To be consistent and comparable, all synthetic networks contain 1000 nodes with the same average degree of $10$. In the ER network, the probability of randomly connecting edges is set to $p=0.01$. For the WS network, each node is connected to its $k=10$ nearest neighbors, with a rewiring probability of $p=0.01$. In the BA network, the initial number of nodes is $m_0=6$, and each new node is connected to $5$ existing nodes. Subsequently, random positive and negative signs are assigned to the edges of each synthetic network, controlling the ratio of negative edges $p_{-} = [0.1, \cdots, 0.9]$ to generate signed synthetic networks corresponding to different negative edge ratios. We show the dismantling results in Figure 7, where each point is the average of $100$ realizations.
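A minimal sketch of this construction for the ER case is given below. It generates a $G(n, p)$ edge list and then marks a fixed share of the edges as negative. Whether the experiments fix the exact number of negative edges or sign each edge independently with probability $p_{-}$ is not stated, so the exact-count choice here is an assumption, as are the function names.

```python
import random
from itertools import combinations


def er_edges(n, p, rng):
    """Erdos-Renyi G(n, p): keep each possible edge independently with probability p."""
    return [(u, v) for u, v in combinations(range(n), 2) if rng.random() < p]


def assign_signs(edges, negative_ratio, rng):
    """Mark round(negative_ratio * M) randomly chosen edges as negative."""
    m = len(edges)
    n_negative = round(negative_ratio * m)
    negative = set(rng.sample(range(m), n_negative))
    return [(u, v, -1 if i in negative else 1) for i, (u, v) in enumerate(edges)]


rng = random.Random(42)
edges = er_edges(1000, 0.01, rng)          # average degree close to 10
for ratio in (0.1, 0.5, 0.9):
    signed = assign_signs(edges, ratio, rng)
    n_neg = sum(1 for _, _, s in signed if s < 0)
    print(ratio, len(signed), n_neg)
```

With $n=1000$ and $p=0.01$, the expected average degree is $p(n-1) \approx 10$, consistent with the setting described above. The WS and BA networks would be signed in the same way once their edge lists are available.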

In Figure 7, the x-axis indicates the ratio of negative edges (|E|Msuperscript𝐸𝑀\frac{|E^{-}|}{M}divide start_ARG | italic_E start_POSTSUPERSCRIPT - end_POSTSUPERSCRIPT | end_ARG start_ARG italic_M end_ARG) in each of the networks, and the y-axis shows the R𝑅Ritalic_R values, revealing the robustness of the corresponding networks. Although the WS network has a higher value of R𝑅Ritalic_R (indicating more robustness) for a low value of |E|Msuperscript𝐸𝑀\frac{|E^{-}|}{M}divide start_ARG | italic_E start_POSTSUPERSCRIPT - end_POSTSUPERSCRIPT | end_ARG start_ARG italic_M end_ARG compared to ER and BA, it is easier to disassemble when |E|M>0.4superscript𝐸𝑀0.4\frac{|E^{-}|}{M}>0.4divide start_ARG | italic_E start_POSTSUPERSCRIPT - end_POSTSUPERSCRIPT | end_ARG start_ARG italic_M end_ARG > 0.4. Meanwhile, we observe that as the ratio of negative edges increases for a relatively small value of |E|Msuperscript𝐸𝑀\frac{|E^{-}|}{M}divide start_ARG | italic_E start_POSTSUPERSCRIPT - end_POSTSUPERSCRIPT | end_ARG start_ARG italic_M end_ARG (|E|M<0.4superscript𝐸𝑀0.4\frac{|E^{-}|}{M}<0.4divide start_ARG | italic_E start_POSTSUPERSCRIPT - end_POSTSUPERSCRIPT | end_ARG start_ARG italic_M end_ARG < 0.4 for WS, |E|M<0.7superscript𝐸𝑀0.7\frac{|E^{-}|}{M}<0.7divide start_ARG | italic_E start_POSTSUPERSCRIPT - end_POSTSUPERSCRIPT | end_ARG start_ARG italic_M end_ARG < 0.7 for ER and BA), the robustness of the networks is relatively stable. However, for a large value of |E|Msuperscript𝐸𝑀\frac{|E^{-}|}{M}divide start_ARG | italic_E start_POSTSUPERSCRIPT - end_POSTSUPERSCRIPT | end_ARG start_ARG italic_M end_ARG, the networks can easily be dismantled, with the value of R𝑅Ritalic_R decreasing with increasing |E|Msuperscript𝐸𝑀\frac{|E^{-}|}{M}divide start_ARG | italic_E start_POSTSUPERSCRIPT - end_POSTSUPERSCRIPT | end_ARG start_ARG italic_M end_ARG. This suggests that increasing the proportion of positive edges in the network contributes to enhancing its robustness. 
This observation aligns with real-world scenarios. In a social network where negative edges dominate, signifying antagonistic relationships between individuals, the network is naturally more vulnerable. In general, ER and BA networks are more resilient than the WS network as $\frac{|E^{-}|}{M}$ increases.
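For readers who want to reproduce a curve like this, the sketch below computes a robustness score from a node removal order. It assumes the widely used form $R=\frac{1}{N}\sum_{Q=1}^{N}s(Q)$ from Schneider et al. [44], where $s(Q)$ is the fraction of nodes in the giant component after $Q$ removals; the definition given earlier in the paper should take precedence if it differs. The function names are hypothetical, and connectivity ignores edge signs.

```python
from collections import deque

def gcc_fraction(adj, removed, n_total):
    """Fraction of the original n_total nodes in the largest remaining component."""
    seen, best = set(removed), 0
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        queue, size = deque([start]), 0
        while queue:  # breadth-first search over the nodes that are still present
            u = queue.popleft()
            size += 1
            for v in adj[u]:
                if v not in seen:
                    seen.add(v)
                    queue.append(v)
        best = max(best, size)
    return best / n_total

def robustness(adj, removal_order):
    """Sum of GCC fractions after each removal, divided by N (assumed form of R)."""
    n = len(adj)
    removed, total = set(), 0.0
    for node in removal_order:
        removed.add(node)
        total += gcc_fraction(adj, removed, n)
    return total / n

# Toy example: the path 0-1-2-3-4, removing the middle node first
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
r_value = robustness(path, [2, 1, 3, 0, 4])  # GCC fractions: 0.4, 0.4, 0.2, 0.2, 0.0
```

A smaller value means the removal order fragments the network faster, which is why a lower $R$ indicates a network that is easier to dismantle.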

Figure 7: Robustness of synthetic signed networks as the negative edge ratio changes. The horizontal axis represents the proportion of negative edges, the vertical axis represents the robustness $R$, and different lines correspond to different synthetic networks, i.e., ER, WS, and BA. Each point is averaged over 100 realizations.

V Conclusion

In this study, we propose an embedding-based network dismantling framework, namely ESND, to address the signed network dismantling problem. The algorithm iterates the following four steps: it first detects the giant connected component (GCC) of the network and then uses the signed network embedding algorithm (SiNE) to generate an embedding vector for each node. Then, it partitions the GCC into different groups via the K-means algorithm based on the node embedding vectors. Subsequently, the node with the highest degree in the largest cluster is removed from the network. The above process is repeated until the fraction of removed nodes, denoted by $q$, reaches a predetermined threshold $q_{r}$. Comprehensive experiments on various real signed networks and their corresponding null models demonstrate that ESND surpasses the baseline methods, confirming its efficacy and stability. Additionally, correlation analysis of the removed node sequences reveals why ESND dismantles a signed network better than the baseline methods. Moreover, experiments on how the ratio of negative edges affects the robustness of a signed network show that networks with more negative edges are easier to dismantle.
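The control flow of the four-step loop summarized above can be sketched as follows. This shows only the iteration structure, not the ESND implementation: the embedding and clustering steps are passed in as functions, and `toy_embed` and `toy_cluster` are hypothetical stand-ins for SiNE and K-means so that the example runs with the standard library alone. The network is assumed to be stored as a dictionary mapping each node to a `{neighbor: sign}` dictionary.

```python
from collections import deque

def giant_component(adj, removed):
    """Node set of the largest connected component after removals (signs ignored)."""
    seen, best = set(removed), set()
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        comp, queue = {start}, deque([start])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in seen:
                    seen.add(v)
                    comp.add(v)
                    queue.append(v)
        if len(comp) > len(best):
            best = comp
    return best

def dismantle(adj, embed, cluster, q_r):
    """Repeat the four steps until the removed fraction q reaches q_r (q_r <= 1)."""
    removed = []
    while len(removed) / len(adj) < q_r:
        gcc = giant_component(adj, removed)   # step 1: giant component detection
        vectors = embed(adj, gcc)             # step 2: node -> vector (SiNE in the paper)
        groups = cluster(vectors)             # step 3: list of node sets (K-means in the paper)
        largest = max(groups, key=len)        # step 4: highest degree in the largest cluster
        target = max(largest, key=lambda u: sum(v in gcc for v in adj[u]))
        removed.append(target)
    return removed

def toy_embed(adj, nodes):
    # Hypothetical stand-in for SiNE: (positive degree, negative degree) inside the component
    return {u: (sum(1 for v, s in adj[u].items() if v in nodes and s > 0),
                sum(1 for v, s in adj[u].items() if v in nodes and s < 0))
            for u in nodes}

def toy_cluster(vectors):
    # Hypothetical stand-in for K-means: split by whether positive degree dominates
    groups = [set(), set()]
    for u, (pos, neg) in vectors.items():
        groups[0 if pos >= neg else 1].add(u)
    return [g for g in groups if g]

# Toy signed network: a positive triangle 0-1-2 and a negative chain 0-3-4
toy_adj = {0: {1: 1, 2: 1, 3: -1}, 1: {0: 1, 2: 1}, 2: {0: 1, 1: 1},
           3: {0: -1, 4: -1}, 4: {3: -1}}
order = dismantle(toy_adj, toy_embed, toy_cluster, q_r=0.4)
```

Keeping the embedding and clustering steps as parameters makes it straightforward to swap in an actual SiNE model and a K-means implementation without changing the loop.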

Certain aspects of this study warrant further attention. Future work could explore the following aspects: a) In this work, we treat signed networks as undirected networks. However, most real-world signed networks are directed. Therefore, how to efficiently dismantle a directed signed network remains a topic worth investigating. b) We confine ourselves to using unsigned network connectivity, i.e., the fraction of nodes in the giant component, to evaluate the performance of a dismantling method. Future work could also propose performance measures that account for the signs of the edges in a network. c) Most of the existing research on network dismantling focuses on removing critical nodes. Devising methods that identify critical edges in signed networks to achieve rapid network decomposition is also an interesting direction that deserves further exploration.

Acknowledgment

This work is supported by the National Natural Science Foundation of China (Grant Nos. 62173095, U23A20331), the Natural Science Foundation of Zhejiang Province (Grant No. LQ22F030008), the Scientific Research Foundation for Scholars of HZNU (Grant No. 2021QDL030), and the Natural Science Foundation of Shanghai (Grant No. 21ZR1404700).

References

  • [1] R. Albert, H. Jeong, and A.-L. Barabási, “Error and attack tolerance of complex networks,” Nature, vol. 406, no. 6794, pp. 378–382, Jul. 2000.
  • [2] P. Holme, B. J. Kim, C. N. Yoon, and S. K. Han, “Attack vulnerability of complex networks,” Phys. Rev. E, vol. 65, no. 5, p. 056109, May 2002.
  • [3] A. Braunstein, L. Dall’Asta, G. Semerjian, and L. Zdeborová, “Network dismantling,” Proc. Natl. Acad. Sci. U.S.A., vol. 113, no. 44, pp. 12 368–12 373, Oct. 2016.
  • [4] R. Cohen, S. Havlin, and D. Ben-Avraham, “Efficient immunization strategies for computer networks and populations,” Phys. Rev. Lett., vol. 91, no. 24, p. 247901, Dec. 2003.
  • [5] R. Albert, I. Albert, and G. L. Nakarado, “Structural vulnerability of the North American power grid,” Phys. Rev. E, vol. 69, no. 2, p. 025103, Feb. 2004.
  • [6] T. Verma, N. A. Araújo, and H. J. Herrmann, “Revealing the structure of the world airline network,” Sci. Rep., vol. 4, no. 1, p. 5638, Jul. 2014.
  • [7] C. Li, H. Wang, and P. Van Mieghem, “Bounds for the spectral radius of a graph when nodes are removed,” Linear Algebra Its Appl., vol. 437, no. 1, pp. 319–323, Jul. 2012.
  • [8] X.-X. Zhan, K. Zhang, L. Ge, J. Huang, Z. Zhang, L. Wei, G.-Q. Sun, C. Liu, and Z.-K. Zhang, “Exploring the effect of social media and spatial characteristics during the COVID-19 pandemic in China,” IEEE Trans. Network Sci. Eng., vol. 10, no. 1, pp. 553–564, Mar. 2022.
  • [9] M. U. Akhtar, J. Liu, X. Liu, S. Ahmed, and X. Cui, “NRAND: An efficient and robust dismantling approach for infectious disease network,” Inf. Process. Manage., vol. 60, no. 2, p. 103221, Mar. 2023.
  • [10] X.-X. Zhan, A. Hanjalic, and H. Wang, “Suppressing information diffusion via link blocking in temporal networks,” in International Conference on Complex Networks and Their Applications.   Springer, Nov. 2020, pp. 448–458.
  • [11] F. Gao, Q. He, X. Wang, L. Qiu, and M. Huang, “An efficient rumor suppression approach with knowledge graph convolutional network in social network,” IEEE Trans. Comput. Soc. Syst., Apr. 2024.
  • [12] P. A. Duijn, V. Kashirin, and P. M. Sloot, “The relative ineffectiveness of criminal network disruption,” Sci. Rep., vol. 4, no. 1, p. 4238, Feb. 2014.
  • [13] B. Collins, D. T. Hoang, N. T. Nguyen, and D. Hwang, “A new model for predicting and dismantling a complex terrorist network,” IEEE Access, vol. 10, pp. 126 466–126 478, Nov. 2022.
  • [14] T. N. Bui and C. Jones, “Finding good approximate vertex and edge partitions is NP-hard,” Inf. Process. Lett., vol. 42, no. 3, pp. 153–159, May 1992.
  • [15] S. V. Buldyrev, R. Parshani, G. Paul, H. E. Stanley, and S. Havlin, “Catastrophic cascade of failures in interdependent networks,” Nature, vol. 464, no. 7291, pp. 1025–1028, Apr. 2010.
  • [16] S. Osat, A. Faqeeh, and F. Radicchi, “Optimal percolation on multiplex networks,” Nat. Commun., vol. 8, no. 1, p. 1540, Nov. 2017.
  • [17] S. Sun, Y. Wu, Y. Ma, L. Wang, Z. Gao, and C. Xia, “Impact of degree heterogeneity on attack vulnerability of interdependent networks,” Sci. Rep., vol. 6, no. 1, p. 32983, Sep. 2016.
  • [18] B. Zhou, Y. Lv, Y. Mao, J. Wang, S. Yu, and Q. Xuan, “The robustness of graph k-shell structure under adversarial attacks,” IEEE Trans. Circuits Syst. II: Express Br., vol. 69, no. 3, pp. 1797–1801, Mar. 2021.
  • [19] N. Almeira, J. I. Perotti, A. Chacoma, and O. V. Billoni, “Explosive dismantling of two-dimensional random lattices under betweenness centrality attacks,” Chaos, Solitons Fractals, vol. 153, p. 111529, Dec. 2021.
  • [20] Y. Hao, Y. Wang, L. Jia, and Z. He, “Cascading failures in networks with the harmonic closeness under edge attack strategies,” Chaos, Solitons Fractals, vol. 135, p. 109772, Jun. 2020.
  • [21] S. Iyer, T. Killingback, B. Sundaram, and Z. Wang, “Attack robustness and centrality of complex networks,” PLoS One, vol. 8, no. 4, p. e59613, Apr. 2013.
  • [22] D. Zhao, B. Gao, Y. Wang, L. Wang, and Z. Wang, “Optimal dismantling of interdependent networks based on inverse explosive percolation,” IEEE Trans. Circuits Syst. II: Express Br., vol. 65, no. 7, pp. 953–957, Jan. 2018.
  • [23] F. Morone and H. A. Makse, “Influence maximization in complex networks through optimal percolation,” Nature, vol. 524, no. 7563, pp. 65–68, Jul. 2015.
  • [24] X.-L. Ren, N. Gleinig, D. Helbing, and N. Antulov-Fantulin, “Generalized network dismantling,” Proc. Natl. Acad. Sci. U.S.A., vol. 116, no. 14, pp. 6554–6559, Feb. 2019.
  • [25] Z. Feng, Z. Cao, and X. Qi, “Generalized network dismantling via a novel spectral partition algorithm,” Inf. Sci., vol. 632, pp. 285–298, Jun. 2023.
  • [26] M. Lozano, C. Garcia-Martinez, F. J. Rodriguez, and H. M. Trujillo, “Optimizing network attacks by artificial bee colony,” Inf. Sci., vol. 377, pp. 30–50, Jan. 2017.
  • [27] S. Wang, J. Liu, and Y. Jin, “Finding influential nodes in multiplex networks using a memetic algorithm,” IEEE Trans. Cybern., vol. 51, no. 2, pp. 900–912, Jun. 2019.
  • [28] C. Fan, L. Zeng, Y. Sun, and Y.-Y. Liu, “Finding key players in complex networks through deep reinforcement learning,” Nat. Mach. Intell., vol. 2, no. 6, pp. 317–324, May 2020.
  • [29] M. Grassia, M. De Domenico, and G. Mangioni, “Machine learning dismantling and early-warning signals of disintegration in complex systems,” Nat. Commun., vol. 12, no. 1, p. 5190, Aug. 2021.
  • [30] Q. Liu and B. Wang, “Neural extraction of multiscale essential structure for network dismantling,” Int. J. Neural Netw., vol. 154, pp. 99–108, Jul. 2022.
  • [31] J. Leskovec, D. Huttenlocher, and J. Kleinberg, “Signed networks in social media,” in Proc. SIGCHI Conf. Hum. Factor Comput. Syst., Apr. 2010, pp. 1361–1370.
  • [32] D. Yan, W. Xie, Y. Zhang, Q. He, and Y. Yang, “Hypernetwork dismantling via deep reinforcement learning,” IEEE Trans. Network Sci. Eng., vol. 9, no. 5, pp. 3302–3315, May 2022.
  • [33] M. Qi, P. Chen, J. Wu, Y. Liang, and X. Duan, “Robustness measurement of multiplex networks based on graph spectrum,” Chaos, vol. 33, no. 2, Feb. 2023.
  • [34] J. Tang, Y. Chang, C. Aggarwal, and H. Liu, “A survey of signed network mining in social media,” ACM Comput. Surv., vol. 49, no. 3, pp. 1–37, Aug. 2016.
  • [35] H.-J. Li, W. Xu, S. Song, W.-X. Wang, and M. Perc, “The dynamics of epidemic spreading on signed networks,” Chaos, Solitons Fractals, vol. 151, p. 111294, Oct. 2021.
  • [36] S. Osat, F. Papadopoulos, A. S. Teixeira, and F. Radicchi, “Embedding-aided network dismantling,” Phys. Rev. Res., vol. 5, no. 1, p. 013076, Feb. 2023.
  • [37] S. Wang, J. Tang, C. Aggarwal, Y. Chang, and H. Liu, “Signed network embedding in social media,” in Proceedings of the 2017 SIAM International Conference on Data Mining.   SIAM, 2017, pp. 327–335.
  • [38] S. Wandelt, X. Shi, X. Sun, and M. Zanin, “Community detection boosts network dismantling on real-world networks,” IEEE Access, vol. 8, pp. 111 954–111 965, Jun. 2020.
  • [39] F. Musciotto and S. Miccichè, “Exploring the landscape of dismantling strategies based on the community structure of networks,” Sci. Rep., vol. 13, no. 1, pp. 1–12, Sep. 2023.
  • [40] M. G. Everett and S. P. Borgatti, “Networks containing negative ties,” Soc. Netw., vol. 38, pp. 111–120, Jul. 2014.
  • [41] W.-C. Liu, L.-C. Huang, C. W.-J. Liu, and F. Jordán, “A simple approach for quantifying node centrality in signed and directed social networks,” Appl. Netw. Sci., vol. 5, pp. 1–26, Aug. 2020.
  • [42] X. Yin, X. Hu, Y. Chen, X. Yuan, and B. Li, “Signed-PageRank: An efficient influence maximization framework for signed social networks,” IEEE Trans. Knowl. Data Eng., vol. 33, no. 5, pp. 2208–2222, Oct. 2019.
  • [43] P. Bonacich and P. Lloyd, “Calculating status with negative relations,” Soc. Netw., vol. 26, no. 4, pp. 331–338, 2004.
  • [44] C. M. Schneider, A. A. Moreira, J. S. Andrade Jr, S. Havlin, and H. J. Herrmann, “Mitigation of malicious attacks on networks,” Proc. Natl. Acad. Sci. U.S.A., vol. 108, no. 10, pp. 3838–3841, Mar. 2011.
  • [45] S. Wandelt, X. Sun, D. Feng, M. Zanin, and S. Havlin, “A comparative analysis of approaches to network-dismantling,” Sci. Rep., vol. 8, no. 1, pp. 1–15, Sep. 2018.
  • [46] B. Hao and I. A. Kovács, “Proper network randomization is key to assessing social balance,” Sci. Adv., vol. 10, no. 18, p. eadj0104, May 2024.