ESND: An Embedding-based Framework for Signed Network Dismantling

Chenwei Xie, Chuang Liu, Cong Li, Xiu-Xiu Zhan, Xiang Li

Chenwei Xie and Chuang Liu are with the Research Center for Complexity Sciences, Hangzhou Normal University, Hangzhou 311121, China (e-mail: [email protected]; [email protected]). Cong Li is with the Adaptive Networks and Control Laboratory, Electronic Engineering Department, School of Information Science and Engineering, and the Research Center of Smart Networks and Systems, Fudan University, Shanghai 200433, China (e-mail: [email protected], Corresponding author). Xiu-Xiu Zhan is with the Research Center for Complexity Sciences, Hangzhou Normal University, Hangzhou 311121, China, and the College of Media and International Culture, Zhejiang University, Hangzhou 310058, China (e-mail: [email protected], Corresponding author). Xiang Li is with the Institute of Complex Networks and Intelligent Systems, Shanghai Research Institute for Intelligent Autonomous Systems, the Frontiers Science Center for Intelligent Autonomous Systems, and the State Key Laboratory of Intelligent Autonomous Systems, Tongji University, Shanghai 201210, China (e-mail: [email protected]).

Manuscript received xxx; revised xxx

Abstract

Network dismantling aims to maximize the disintegration of a network by removing a specific set of nodes or edges, and it is applied in many domains, such as cracking down on crime organizations, delaying the propagation of rumors, and blocking the transmission of viruses. Most current network dismantling methods are tailored for unsigned networks, which only consider the connections between nodes without evaluating the nature of the relationships, such as friendship/hostility, enhancing/repressing, and trust/distrust. We here propose an embedding-based algorithm, namely ESND, to solve the signed network dismantling problem. The algorithm iterates the following four steps: giant component detection, network embedding, node clustering, and removal node selection. To illustrate the efficacy and stability of ESND, we conduct extensive experiments on six signed network datasets as well as null models, and compare the performance of our method with baselines. Experimental results show that the proposed ESND is superior to the baselines in most cases and displays stable performance as the network structure changes. Additionally, we examine the impact of sign proportions on network robustness via ESND, observing that networks with a high ratio of negative edges are generally easier to dismantle than networks with a high ratio of positive edges.

Index Terms:

Network dismantling, signed network, node embedding, node clustering

I Introduction

Network dismantling aims to remove a certain number of nodes so as to maximize the damage to the network in terms of connectivity [1, 2, 3]. It has become a prominent topic in network science due to its extensive applications in different fields [4, 5, 6]. For instance, it can be used to delay the spread of diseases by immunizing (or isolating) the critical nodes in epidemic-spreading networks [7, 8, 9]. In terms of information dissemination, it can help identify the key users to block in order to control the propagation of rumors and false information on online social platforms [10, 11]. In addition, effective network dismantling can quickly disrupt criminal and terrorist organization networks [12, 13].

Network dismantling has been proven to be NP-hard [14, 15, 16]; mathematically, it is a combinatorial optimization problem. Researchers have proposed various methods to identify critical nodes for network dismantling, such as centrality-based methods (e.g., degree, k-shell, betweenness, and closeness) [17, 18, 19, 20, 21], heuristic algorithms (e.g., acquaintance immunization, collective influence (CI) and generalized network dismantling (GND)) [22, 23, 24, 25], meta-heuristic algorithms (e.g., artificial bee colony algorithm, memetic algorithm) [26, 27], and machine learning algorithms (e.g., finding key players in networks through deep reinforcement learning (FINDER), graph dismantling with machine learning (GDM), neural extraction framework for multiscale essential structures (NEES)) [28, 29, 30]. Although these methods have shown efficacy in rapidly disintegrating networks, most of them are tailored to unsigned networks, i.e., networks without positive or negative signs on the edges. In the real world, however, interactions between individuals may carry specific meanings [31, 32, 33]. For example, users can be friends or enemies in social networks, and a signed network is needed to represent the different relationships between users [34]. Moreover, the dynamics of signed networks are quite different from those of unsigned networks. For instance, we need to consider signs when modeling a spreading process on a signed network, and the signed network structure may result in different dynamic behaviors [35]. Few works have considered the dismantling problem on signed networks, and the main challenge lies in how to utilize the signed network topology to solve it.

To address the challenge of signed network dismantling, we propose an algorithm named the Embedding-based framework for signed network dismantling (ESND), which integrates node embedding [36] and node clustering to achieve rapid disintegration of a signed network. The ESND consists of three main parts that iteratively remove nodes from the network (see Figure 1): First, we perform a signed network embedding algorithm (SiNE) to obtain node embedding vectors that could capture the local and global structure of a signed network. Second, we employ the K-means algorithm to classify the nodes into different clusters. Lastly, the node with the highest degree in the largest cluster is removed from the network. We compare ESND with the baselines on different empirical signed networks and their null models. The results show that ESND could better dismantle a signed network than the baseline methods.

The subsequent sections of this paper are organized as follows. Section II details the specifics of our proposed algorithm. Section III offers a clear description of the baseline methods. Section IV introduces the datasets and presents all the experimental results. We summarize our work and discuss future research directions in Section V.

II Methods

In this section, we introduce the iterative dynamic approach to the dismantling of signed networks, as shown in Figure 1. Initially, we identify the giant connected component (GCC) in the network and then use a signed network embedding algorithm, i.e., SiNE [37], to get the embedding vector of each node. Later, we use the K-means algorithm to partition the GCC into several clusters based on the embedding vectors of the nodes. The node with the highest degree in the largest cluster is removed from the network. If the network contains several nodes with the same value of the highest degree, we randomly choose one of them to remove. Subsequently, we re-identify the GCC within the remaining network and perform signed network embedding on the GCC. We then eliminate the node with the highest degree in the largest cluster using K-means. The process will iterate until the fraction $q$ of removed nodes reaches a specified value $q_{r}$ . The essential steps, i.e., giant component detection, signed network embedding (SiNE), node clustering, and node elimination, of the ESND algorithm are illustrated as follows.

Figure 1: Framework of ESND. The solid black lines represent positive edges, the red dashed lines indicate negative edges, $q$ represents the fraction of the removed nodes, and $q_{r}$ is a threshold value indicating when we will stop the algorithm.

II-A Giant Connected Component Detection

We consider an undirected and unweighted signed network $G=(V,E)$ consisting of $N$ nodes and $M$ edges, where $V=\left\{v_{1},v_{2},\cdots,v_{N}\right\}$ represents the set of nodes and $E$ is the set of edges. An edge $e_{ij}=(v_{i},v_{j})\in E$ takes a value of $1$ or $-1$, indicating a positive or negative edge, respectively. To effectively dismantle a signed network, we need to detect the GCC of the current network as the input for embedding at each iteration. Therefore, we use the breadth-first-search (BFS) algorithm to detect the giant connected component within a signed network. Specifically, we start a search from each unvisited node to find all nodes connected to it and record the size of the corresponding connected component. The component containing the most nodes is referred to as the GCC.
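
The BFS-based detection described above can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the adjacency format (a dict mapping each node to a dict of neighbor signs), the function name `giant_component`, and the toy edge list are all assumptions made for the example.

```python
from collections import deque

def giant_component(adj):
    """Return the node set of the largest connected component.

    adj maps each node to a dict {neighbor: sign}. Signs are ignored here
    because connectivity does not depend on them.
    """
    visited = set()
    best = set()
    for start in adj:
        if start in visited:
            continue
        # BFS from every node that has not been reached yet.
        component = {start}
        visited.add(start)
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in visited:
                    visited.add(v)
                    component.add(v)
                    queue.append(v)
        if len(component) > len(best):
            best = component
    return best

# Hypothetical toy signed network: each edge is (u, v, sign).
edges = [(1, 2, 1), (2, 3, -1), (3, 1, 1), (4, 5, -1)]
adj = {}
for u, v, s in edges:
    adj.setdefault(u, {})[v] = s
    adj.setdefault(v, {})[u] = s

gcc = giant_component(adj)
```

Each node and edge is visited once, so one detection pass costs time linear in the number of nodes and edges.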

II-B Signed Network Embedding (SiNE)

We use a classic deep-learning-based signed network embedding method, SiNE, to obtain an embedding vector for each node. Below, we describe the three fundamental components of this method, i.e., the establishment of the objective function, the construction of a deep learning network, and the update of the parameters. The formulation of the objective function in SiNE is based on structural balance theory, positing that individuals are more like their “friends” than their “enemies”.

We use $\mathcal{T}=\{(v_{i},v_{j},v_{k})\mid e_{ij}=1,e_{ik}=-1,v_{i},v_{j},v_{k}\in V\}$ to denote a collection of triplets, where there is a positive connection between $v_{i}$ and $v_{j}$, and a negative connection between $v_{i}$ and $v_{k}$. Hence, $v_{i}$ and $v_{j}$ should be assigned a greater similarity than $v_{i}$ and $v_{k}$. Mathematically, we express this requirement as $f(\mathbf{x}_{i},\mathbf{x}_{j})\geq f(\mathbf{x}_{i},\mathbf{x}_{k})+\epsilon$, where $f$ denotes the similarity function to be learned, and $\epsilon$ is the margin that controls the dissimilarity between the nodes. A higher value of $\epsilon$ pulls $v_{i}$ and $v_{j}$ closer and pushes $v_{i}$ and $v_{k}$ farther apart in the embedding space. Since this constraint cannot handle nodes whose 2-hop networks contain only positive or only negative links, and given that positive connections are more prevalent than negative ones in real-world networks, SiNE introduces a virtual node $v_{0}$. A negative link is established between $v_{0}$ and each node that is connected to its 2-hop neighbors only by positive links. Letting $\mathcal{T}_{0}=\{(v_{i},v_{j},v_{0})\mid e_{ij}=1,e_{i0}=-1\}$ denote the set of these triplets, we have $f(\mathbf{x}_{i},\mathbf{x}_{j})\geq f(\mathbf{x}_{i},\mathbf{x}_{0})+\epsilon_{0}$, where $\epsilon_{0}$ plays a similar role as $\epsilon$.
Consequently, the objective function for signed network embedding is

$$
\begin{aligned}
\min_{\mathbf{X},\mathbf{x}_{0},\epsilon}\quad & \frac{1}{T}\left[\sum_{\left(\mathbf{x}_{i},\mathbf{x}_{j},\mathbf{x}_{k}\right)\in\mathcal{T}}\max\left(0,f\left(\mathbf{x}_{i},\mathbf{x}_{k}\right)+\epsilon-f\left(\mathbf{x}_{i},\mathbf{x}_{j}\right)\right)\right. \\
& \left.+\sum_{\left(\mathbf{x}_{i},\mathbf{x}_{j},\mathbf{x}_{0}\right)\in\mathcal{T}_{0}}\max\left(0,f\left(\mathbf{x}_{i},\mathbf{x}_{0}\right)+\epsilon_{0}-f\left(\mathbf{x}_{i},\mathbf{x}_{j}\right)\right)\right] \\
& +\lambda\left(H(\phi)+\|\mathbf{X}\|_{F}^{2}+\left\|\mathbf{x}_{0}\right\|_{2}^{2}\right),
\end{aligned}
\tag{1}
$$

where the size of the training data is denoted by ${T}=\left|\mathcal{T}\right|+\left|\mathcal{T}_{0}\right|$ , and $\mathbf{X}=\left\{\mathbf{x}_{1},\mathbf{x}_{2},\cdots,\mathbf{x}_{N}\right\}$ represents the embedding vectors of the $N$ nodes. The similarity function $f$ is determined by the parameter set $\phi$ , and $H(\phi)$ serves as a regularizer to prevent overfitting. The parameter $\lambda$ is utilized to control the impact of the regularizers. In addition, $\|\cdot\|_{F}$ is the Frobenius norm, while $\|\cdot\|_{2}$ represents the $\ell_{2}$ -norm.

The optimization of the objective function is carried out to acquire nonlinear embedding vectors for nodes within signed networks. Within the SiNE framework, the function $f$ and the parameter set $\phi$ in the objective function are defined through the construction of a neural network. The framework consists of two layers of neural networks, where $\mathbf{W}^{11}$ and $\mathbf{W}^{12}$ are the weights of the first hidden layer, and $\mathbf{b}^{1}$ is the bias. The specific output form of the first layer is as follows:

$$
\begin{aligned}
\mathbf{z}^{11} &= \tanh\left(\mathbf{W}^{11}\mathbf{x}_{i}+\mathbf{W}^{12}\mathbf{x}_{j}+\mathbf{b}^{1}\right), \\
\mathbf{z}^{12} &= \tanh\left(\mathbf{W}^{11}\mathbf{x}_{i}+\mathbf{W}^{12}\mathbf{x}_{k}+\mathbf{b}^{1}\right).
\end{aligned}
\tag{2}
$$

Similarly, the outputs of the first layer, $\mathbf{z}^{11}$ and $\mathbf{z}^{12}$ , serve as inputs of the second layer. The specific structure of the output of the second hidden layer is expressed as $\mathbf{z}^{21}=\tanh(\mathbf{W}^{2}\mathbf{z}^{11}+\mathbf{b}^{2})$ and $\mathbf{z}^{22}=\tanh(\mathbf{W}^{2}\mathbf{z}^{12}+\mathbf{b}^{2})$ , where $\mathbf{W}^{2}$ represents the weight of the second-layer network, and $\mathbf{b}^{2}$ denotes the bias. Thus, the final output of the neural network determines the nonlinear function $f$ used to evaluate node similarity in the objective function, which can be expressed as

$$
f\left(\mathbf{x}_{i},\mathbf{x}_{j}\right)=\tanh\left(\mathbf{w}^{T}\mathbf{z}^{21}+b\right),
\tag{3}
$$

and

$$
f\left(\mathbf{x}_{i},\mathbf{x}_{k}\right)=\tanh\left(\mathbf{w}^{T}\mathbf{z}^{22}+b\right),
\tag{4}
$$

where the elements of the vector $\mathbf{w}$ are the weights and the scalar $b$ denotes the bias. The parameter set $\phi$ in the objective function is given by $\phi=\left\{\mathbf{W}^{11},\mathbf{W}^{12},\mathbf{W}^{2},\mathbf{w},\mathbf{b}^{1},\mathbf{b}^{2},b\right\}$, and $H$ is given by $H(\phi)=\left\|\mathbf{W}^{11}\right\|_{F}^{2}+\left\|\mathbf{W}^{12}\right\|_{F}^{2}+\left\|\mathbf{W}^{2}\right\|_{2}^{2}+\|\mathbf{w}\|_{2}^{2}+\left\|\mathbf{b}^{1}\right\|_{2}^{2}+\left\|\mathbf{b}^{2}\right\|_{2}^{2}+b^{2}$.
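
To make the two-layer similarity function and the hinge term of the objective concrete, the following pure-Python sketch evaluates $f$ as in Eqs. (2)-(4) and one triplet term of Eq. (1). It is an illustration only: the hidden-layer sizes, the random initialization, and the helper names (`matvec`, `similarity`, `hinge`, `rand_matrix`) are assumptions, and the regularizer, the averaging over $T$, and training are omitted.

```python
import math
import random

def matvec(W, x):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def similarity(x_a, x_b, p):
    """f(x_a, x_b): two tanh hidden layers followed by a tanh output unit."""
    z1 = [math.tanh(a + b + c)
          for a, b, c in zip(matvec(p["W11"], x_a), matvec(p["W12"], x_b), p["b1"])]
    z2 = [math.tanh(a + c) for a, c in zip(matvec(p["W2"], z1), p["b2"])]
    return math.tanh(sum(w * z for w, z in zip(p["w"], z2)) + p["b"])

def hinge(x_i, x_j, x_k, p, eps):
    """One triplet term: max(0, f(x_i, x_k) + eps - f(x_i, x_j))."""
    return max(0.0, similarity(x_i, x_k, p) + eps - similarity(x_i, x_j, p))

rng = random.Random(0)

def rand_matrix(rows, cols):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

d, h1, h2 = 4, 3, 2  # embedding size and hidden sizes (illustrative values)
params = {
    "W11": rand_matrix(h1, d), "W12": rand_matrix(h1, d), "b1": [0.0] * h1,
    "W2": rand_matrix(h2, h1), "b2": [0.0] * h2,
    "w": [rng.uniform(-0.5, 0.5) for _ in range(h2)], "b": 0.0,
}
x_i, x_j, x_k = ([rng.uniform(-1, 1) for _ in range(d)] for _ in range(3))
loss = hinge(x_i, x_j, x_k, params, eps=1.0)
```

The term is zero exactly when the positive neighbor $v_{j}$ is already more similar to $v_{i}$ than the negative neighbor $v_{k}$ by at least the margin.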

In the SiNE framework, backpropagation is employed to optimize the deep learning network. This process entails updating network parameters by backpropagating “errors”, facilitating a more efficient computation of gradients. The key to optimizing the objective function lies in obtaining gradients with respect to the parameters $\mathbf{X}$, $\mathbf{x}_{0}$, and $\phi$ for $\max(0,f(\mathbf{x}_{i},\mathbf{x}_{k})+\epsilon-f(\mathbf{x}_{i},\mathbf{x}_{j}))$ and $\max(0,f(\mathbf{x}_{i},\mathbf{x}_{0})+\epsilon_{0}-f(\mathbf{x}_{i},\mathbf{x}_{j}))$. Based on the mini-batch stochastic gradient descent algorithm, the training data is divided into small batches during each training iteration. Subsequently, the gradients for the current batch are computed using the backpropagation method. These gradients are then propagated backward from the output layer to the input layer, elucidating the influence of each parameter on the overall network output “errors”.

II-C Node clustering

After obtaining the embedding vector of each node using SiNE, we further use the K-means algorithm to partition the nodes in the network into $k$ clusters, where $k$ is a tunable parameter. We illustrate the details of using K-means as follows:

• Initialization: We randomly select $k$ nodes from the signed network, and the embedding vector of each of them serves as the initial center of one of the $k$ clusters.

• Assignment: For every remaining node in the network, we compute the Euclidean distance between its embedding vector and each cluster center. We then assign each node to the cluster with the nearest center, so that each cluster consists of the nodes that are most similar to its centroid.

• Update Centroids: For each cluster, the average of the embedding vectors of its nodes is computed and designated as the new cluster center.

• Iteration: The assignment and centroid update steps are repeated until either the cluster centers stabilize or the specified number of iterations is reached.

Because each iteration has a low computational cost, the K-means algorithm runs quickly. Fixing the number of clusters ($k$) in advance therefore lets ESND partition the nodes and select removal candidates efficiently at every iteration.
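
The four K-means steps above can be sketched in plain Python as follows. This is an illustrative version under stated assumptions, not the implementation used in the paper: the function name `kmeans`, the `max_iter` and `seed` parameters, the handling of empty clusters (the old center is kept), and the toy two-dimensional "embeddings" are all hypothetical.

```python
import random

def kmeans(vectors, k, max_iter=100, seed=0):
    """Cluster embedding vectors; return (labels, centers)."""
    rng = random.Random(seed)
    # Initialization: k randomly chosen vectors become the initial centers.
    centers = [list(v) for v in rng.sample(vectors, k)]
    labels = [0] * len(vectors)
    for _ in range(max_iter):
        # Assignment: each vector goes to the nearest center (squared Euclidean).
        labels = []
        for v in vectors:
            dists = [sum((a - b) ** 2 for a, b in zip(v, c)) for c in centers]
            labels.append(dists.index(min(dists)))
        # Update: each center becomes the mean of its assigned vectors.
        new_centers = []
        for idx in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == idx]
            if members:
                dim = len(members[0])
                new_centers.append(
                    [sum(m[t] for m in members) / len(members) for t in range(dim)])
            else:
                new_centers.append(centers[idx])  # assumption: keep an empty cluster's center
        # Iteration: stop once the centers no longer move.
        if new_centers == centers:
            break
        centers = new_centers
    return labels, centers

# Hypothetical 2-D embeddings forming two well-separated groups.
embeddings = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]]
labels, centers = kmeans(embeddings, k=2)
```

In practice a library routine could be substituted; the sketch only mirrors the steps listed in the text.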

II-D Node Elimination

Empirical evidence indicates that most nodes are affiliated with a single cluster, while only a minority are assigned to various other distinct clusters. Furthermore, previous studies have indicated that the elimination of nodes within a cluster or community can improve the efficiency of network dismantling [38, 39]. Hence, we utilize the largest cluster as the central part for decomposition. More precisely, at each stage of the attack process, we will pinpoint the largest cluster in the network and remove the node with the highest degree in that cluster.
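
A minimal sketch of this selection rule is given below, assuming the GCC is stored as a dict of neighbor dicts and the cluster assignment as a dict from node to cluster id. The names `select_node` and `remove_node` and the toy data are hypothetical; ties in degree are broken at random, as described in Section II.

```python
import random
from collections import Counter

def select_node(adj, labels, rng=random):
    """Pick the highest-degree node inside the largest cluster.

    adj maps node -> {neighbor: sign}; labels maps node -> cluster id.
    Degree ignores edge signs.
    """
    largest = Counter(labels.values()).most_common(1)[0][0]
    members = [v for v, c in labels.items() if c == largest]
    top = max(len(adj[v]) for v in members)
    # Break ties among equally high-degree nodes at random.
    return rng.choice([v for v in members if len(adj[v]) == top])

def remove_node(adj, v):
    """Delete node v and all of its incident edges in place."""
    for u in adj.pop(v):
        del adj[u][v]

# Hypothetical toy GCC: node 0 is a hub in cluster "a".
adj = {0: {1: 1, 2: -1, 3: 1}, 1: {0: 1}, 2: {0: -1}, 3: {0: 1, 4: 1}, 4: {3: 1}}
labels = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b"}
target = select_node(adj, labels)
remove_node(adj, target)
```

After the removal, the GCC is detected again and the embedding and clustering steps are repeated on it.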

III Baselines

To demonstrate the effectiveness of ESND in network dismantling, we select $12$ classic centrality metrics as benchmarks. These metrics encompass those that are agnostic to the sign of the edges, i.e., Degree, Betweenness, K-shell, and Closeness, as well as those that take the edge signs into account, i.e., P-DEG, N-DEG, Net-DEG, Ratio-DEG, PN, TE, SPR, and SE. The basic explanations of these centrality metrics are provided below.

• Degree: Degree quantifies the number of direct neighbors of a node when we ignore the sign of the edges, and nodes with higher degrees are generally considered more important. The degree centrality of node $v_{i}$ is $\frac{k_{i}}{N-1}$, where $k_{i}$ is the degree of $v_{i}$, i.e., the number of its neighbors, $N$ is the number of nodes, and $N-1$ is the maximum possible degree of a node.

• Betweenness: It assesses the role of a node in the shortest paths between other nodes. The betweenness centrality of node $v_{i}$ is $\sum_{s\neq i,\,t\neq i,\,s\neq t}\frac{g_{st}^{i}}{g_{st}}$, where $g_{st}$ represents the total number of shortest paths from node $v_{s}$ to $v_{t}$, and $g_{st}^{i}$ denotes the number of these shortest paths that pass through $v_{i}$.

• K-shell: K-shell centrality assigns nodes to shells by iteratively pruning low-degree nodes. Assuming there are no isolated nodes in the network, we repeatedly remove nodes with degree $1$ until no such nodes remain and assign them to the $1$-shell. Similarly, we recursively remove nodes whose remaining degree is at most $2$ to form the $2$-shell. This process concludes when all nodes have been allocated to one of the shells.

• Closeness: This centrality is a global indicator of a node’s position in the network, defined as the reciprocal of the average shortest-path distance between the node and the remaining nodes. The closeness centrality of node $v_{i}$ is $\frac{N-1}{\sum_{j\neq i}d_{ij}}$, where $d_{ij}$ is the length of the shortest path between node $v_{i}$ and node $v_{j}$. A higher closeness value indicates that $v_{i}$ is closer to the other nodes in the network.

• Positive degree (P-DEG): P-DEG counts the number of positive edges linked to a node, which is referred to as the positive degree. Thus, the P-DEG centrality value of node $v_{i}$ is given by its number of positive edges ${k_{i}^{+}}$.

• Negative degree (N-DEG): N-DEG counts the number of negative edges linked to a node, which is referred to as the negative degree. The N-DEG centrality of node $v_{i}$ is given by its number of negative edges ${k_{i}^{-}}$.

• Net degree (Net-DEG): This metric represents the difference between the number of positive edges and negative edges that a node has. For node $v_{i}$, the Net-DEG value is ${k_{i}^{+}}-{k_{i}^{-}}$.

• Ratio degree (Ratio-DEG): It represents the proportion of positive edges among all edges of node $v_{i}$, which reads $\frac{k_{i}^{+}}{{k_{i}^{+}}+{k_{i}^{-}}}$.

• PN centrality [40]: Everett and Borgatti argue that nodes with more positive connections are more significant, while nodes with more negative connections are less important. Thus, they propose the PN index to evaluate node importance in signed networks, calculated using the following formula

$$
PN=\left(I-\frac{1}{2N-2}A\right)^{-1}\mathbf{1},
\tag{5}
$$

where $N$ represents the number of nodes in the network, $I$ is the $N$-order identity matrix, $A=A^{+}-2A^{-}$, and $A^{+}$ (or $A^{-}$) represents the adjacency matrix containing only positive (or negative) edges. $\mathbf{1}$ denotes an $N$-dimensional vector with all elements equal to 1.

• TE [41]: This index calculates the centrality of a target node considering the total effect (TE) of all other nodes on it in the network. The higher the value of TE, the more important the node. For an undirected signed network, if there is an edge between $v_{i}$ and $v_{j}$, the effect of $v_{i}$ on $v_{j}$ is defined as $E_{ij,1^{S}}=S\times\frac{1}{D_{j}}$, where $S$ is the sign ($+1$ or $-1$) of the edge $e_{ij}$, and $D_{j}$ is the degree of $v_{j}$. We construct two matrices $CE_{n^{+}}=\{CE_{ij,n^{+}}\}_{N\times N}=\{\sum_{l=1}^{n}E_{ij,l^{+}}\}_{N\times N}$ and $CE_{n^{-}}=\{CE_{ij,n^{-}}\}_{N\times N}=\{\sum_{l=1}^{n}E_{ij,l^{-}}\}_{N\times N}$ to represent the sum of the positive and negative effects of $v_{i}$ on $v_{j}$ up to $n$ steps, respectively. Therefore, $TE_{ij,n}=CE_{ij,n^{+}}+\left|CE_{ij,n^{-}}\right|$ indicates the sum of effects from $v_{i}$ to $v_{j}$, and the TE value of $v_{i}$ is further given by

$$
TE_{i,n}=\sum_{j=1}^{N}TE_{ij,n}.
\tag{6}
$$

Here, we set $n=2$, meaning that we only calculate the effect of a node on the nodes within two hops.

• Signed-PageRank (SPR) [42]: SPR is a PageRank algorithm adapted for signed networks, which updates the SPR value for each node in each iteration by aggregating the weights and sign information. The formula for updating the SPR value of $v_{i}$ at iteration $t+1$ is

$$
SPR_{i,t+1}=\sum_{v_{j}\in D_{i}^{out}}\left(SPR_{i,t}-SPR_{j,t}\right)y_{i,j}+\frac{1-d}{N},
\tag{7}
$$

where $D_{i}^{out}$ is the set of out-neighbors of $v_{i}$. $Y=\{y_{i,j}\}_{N\times N}=dH$ is the Signed-PageRank adjacency matrix with damping coefficient $d$, where $H$ represents the Hadamard product of the normalized weight matrix $W$ and the label matrix $L$. In our work, the weights of all edges are equal to $1$; thus, in matrix $W=\{w_{ij}\}_{N\times N}$, ${w_{ij}}=\frac{1}{D_{i}}$ if $v_{i}$ and $v_{j}$ are connected. In matrix $L=\{l_{ij}\}_{N\times N}$, $l_{ij}=1$ if there is a positive connection between $v_{i}$ and $v_{j}$, and $l_{ij}=-1$ signifies a negative connection between them. Unlike the PageRank algorithm, the iteration of the Signed-PageRank algorithm continues until the ranking of nodes based on SPR values remains unchanged. Here, we consider the final rank of the nodes as their importance.

• Signed Eigenvector (SE) [43]: SE is an extension of eigenvector centrality to signed networks. The main idea is that a node with more positive edges to important nodes is more important, and vice versa for a node with more negative edges to important nodes. Given the label matrix $L_{N\times N}$ of an undirected and unweighted signed network, we can swap the rows and columns of $L$ to obtain a matrix

$$
A=\begin{pmatrix}L^{+}&L^{-}\\ L^{-}&L^{+}\end{pmatrix},
\tag{8}
$$

where $L^{+}_{n_{1}\times n_{1}}$ is an adjacency matrix containing only positive edges, $L^{-}_{n_{2}\times n_{2}}$ denotes an adjacency matrix containing only negative edges, and $n_{1}+n_{2}=N$. Let $B=DAD$, where $D$ is a diagonal matrix whose first $n_{1}$ diagonal elements are $1$ and whose remaining $n_{2}$ elements are equal to $-1$. In particular, because $B$ contains only non-negative elements, it has a positive eigenvalue $\lambda$ with a corresponding eigenvector $x$. Since $Bx=DADx=\lambda x$, we have $ADx=\lambda D^{-1}x=\lambda Dx$. Therefore, the signed eigenvector centrality of each node can be represented by the eigenvector $Dx$ when $Dx$ is in a steady state.
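
As an illustration of the sign-aware baselines, the sketch below computes the four signed degree metrics and the PN index of Eq. (5) on a toy network. It is a sketch under assumptions rather than the code used in the experiments: the adjacency format and function names are hypothetical, Ratio-DEG is set to 0 for isolated nodes by convention, and the linear system in Eq. (5) is solved by a simple fixed-point iteration $x\leftarrow\mathbf{1}+\frac{1}{2N-2}Ax$ (which assumes the iteration converges) instead of an explicit matrix inverse.

```python
def signed_degree_metrics(adj):
    """Return P-DEG, N-DEG, Net-DEG and Ratio-DEG for every node."""
    out = {}
    for v, nbrs in adj.items():
        pos = sum(1 for s in nbrs.values() if s > 0)
        neg = len(nbrs) - pos
        out[v] = {
            "P-DEG": pos,
            "N-DEG": neg,
            "Net-DEG": pos - neg,
            # Assumption: isolated nodes get ratio 0.
            "Ratio-DEG": pos / (pos + neg) if nbrs else 0.0,
        }
    return out

def pn_centrality(adj, tol=1e-12, max_iter=1000):
    """Approximate PN = (I - A/(2N-2))^{-1} 1 with A = A+ - 2A-."""
    n = len(adj)
    scale = 1.0 / (2 * n - 2)
    x = {v: 1.0 for v in adj}
    for _ in range(max_iter):
        # Positive edges contribute weight +1, negative edges weight -2.
        new = {v: 1.0 + scale * sum((1.0 if s > 0 else -2.0) * x[u]
                                    for u, s in nbrs.items())
               for v, nbrs in adj.items()}
        if max(abs(new[v] - x[v]) for v in adj) < tol:
            return new
        x = new
    return x

# Hypothetical triangle with one negative edge between nodes 2 and 3.
adj = {1: {2: 1, 3: 1}, 2: {1: 1, 3: -1}, 3: {1: 1, 2: -1}}
deg = signed_degree_metrics(adj)
pn = pn_centrality(adj)
```

On this toy triangle, node 1 (two positive edges) receives a higher PN value than nodes 2 and 3, which each carry one negative edge.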

IV Experiments

We apply the proposed ESND to dismantle six distinct real signed networks and three different signed network null models, and compare the results of ESND with those of the baselines on these null models to assess the stability of ESND. Additionally, we compute Kendall correlation coefficients for target attack node sequences generated by various decomposition strategies to analyze differences in node selection. Finally, we test how the ratio of negative edges would affect the robustness of a signed network through artificial network models.

IV-A Datasets

We select six real-world datasets that can be constructed as signed networks to evaluate the performance of our method. Specifically, Bitcoinalpha and Bitcoinotc are sourced from SNAP (https://snap.stanford.edu/data/) and describe the trust networks between users participating in Bitcoin transactions. Because Bitcoin transactions are anonymous, users provide positive and negative ratings to signify trust (positive) or distrust (negative) relationships. WikiVote represents the voting network used to select Wikipedia administrators (https://doi.org/10.6084/m9.figshare.12152628). The eligibility of the users for administration is determined through voting, with the edges denoting voting interactions, i.e., positive signs indicate support while negative signs indicate opposition. Slashdot is a notable technology news site where users comment and share technology-related information (https://www.aminer.cn/data-sna). Positive and negative signs in the dataset denote friendly or adversarial relationships between users. Reddit captures connections between users in diverse sub-communities, reflecting positive or negative sentiment in shared content across online accounts (https://doi.org/10.6084/m9.figshare.12152628). Epinions constitutes a trust network among users on a product review website, with positive and negative signs indicating trust or distrust relationships between users (https://www.aminer.cn/data-sna). We show the topological information of these signed networks in Table I, including the number of nodes ($N$), the number of edges ($M$), the number of positive edges ($\left|E^{+}\right|$), the number of negative edges ($\left|E^{-}\right|$), the average degree of nodes ($\left\langle k\right\rangle$) and the clustering coefficient ($C$). The table shows that all the signed networks have more positive edges than negative ones.

TABLE I: Topological information of the signed networks, in which $N$ denotes the number of nodes; $M$ represents the number of edges; $\left|E^{+}\right|$ and $\left|E^{-}\right|$ indicate the number of positive and negative edges, respectively. The values in parentheses represent the proportions of positive and negative edges in the network; $\left\langle k\right\rangle$ denotes the average degree, and $C$ signifies the average clustering coefficient.

Network	$N$	$M$	$\left|E^{+}\right|$	$\left|E^{-}\right|$	$\left\langle k\right\rangle$	$C$
Bitcoinalpha	3783	14124	12759(90%)	1365(10%)	7.47	0.177
Bitcoinotc	5881	21492	18250(85%)	3242(15%)	7.31	0.178
WikiVote	7118	100751	78658(78%)	22093(22%)	28.3	0.141
Slashdot	13182	34260	28884(84.3%)	5376(15.7%)	5.19	0.149
Reddit	18282	107301	99084(92.3%)	8217(7.7%)	11.74	0.374
Epinions	25148	99880	69185(69.2%)	30695(30.7%)	7.94	0.073

IV-B Performance Evaluation Metric

Network dismantling methods aim to produce an optimal node sequence to remove that could disrupt the network as much as possible. We use the robustness metric $R$ to assess the performance of ESND, as well as the baselines[44, 45]

$$
R=\frac{1}{N}\sum_{Q=1}^{N}s(Q),
\tag{9}
$$

where $N$ is the size of the network, $s(Q)$ represents the fraction of nodes in the largest connected component after the removal of $Q=qN$ nodes, and the factor $1/N$ normalizes $R$ so that the robustness of networks with different sizes can be compared. Computing $R$ requires a node removal order, so a dismantling method can be viewed as searching for the order that minimizes $R$ among all possible node orders. A lower value of $R$ indicates that the method is more effective in destroying the network.
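
For a given removal order, $R$ in Eq. (9) can be evaluated directly by removing the nodes one at a time and recording the relative size of the largest component. The sketch below does this naively, recomputing the components after every removal; the function names and the three-node path used as input are assumptions made for illustration.

```python
def largest_component_size(adj):
    """Size of the largest connected component (0 for an empty graph)."""
    seen, best = set(), 0
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        stack, size = [start], 0
        while stack:
            u = stack.pop()
            size += 1
            for v in adj[u]:
                if v not in seen:
                    seen.add(v)
                    stack.append(v)
        best = max(best, size)
    return best

def robustness(adj, order):
    """R = (1/N) * sum over Q = 1..N of s(Q) for the given removal order."""
    n = len(adj)
    work = {v: set(nbrs) for v, nbrs in adj.items()}  # copy; signs are not needed
    total = 0.0
    for v in order:
        for u in work.pop(v):
            work[u].discard(v)
        total += largest_component_size(work) / n
    return total / n

# Hypothetical path 1 - 2 - 3: removing the middle node first is more damaging.
path = {1: {2: 1}, 2: {1: 1, 3: -1}, 3: {2: -1}}
r_center_first = robustness(path, [2, 1, 3])
r_end_first = robustness(path, [1, 2, 3])
```

Removing the middle node first gives $R=2/9$, whereas starting from an end node gives $R=1/3$, so the first order is the better dismantling sequence for this toy path.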

IV-C Parameter Analysis

To optimize the dismantling effectiveness of the proposed method, we analyze the influence of its parameters. Specifically, we focus on two key parameters, i.e., the embedding dimension $d$ and the number of clusters $k$, and keep the other parameters fixed: the number of hidden layers is $L=2$, the learning rate is $\lambda=0.0001$, and the similarity parameters $\epsilon$ and $\epsilon_{0}$ are set to 1 and 0.5, respectively. These settings are also used in all of the following experiments. We systematically compare the $R$ values for each dataset across different values of $d$ and $k$; the results are given in Figure 2. We observe that when $k$ is fixed, the smallest $R$ is given by $d=20$ in most networks, except Reddit, where $d=128$ achieves the best performance. Meanwhile, as $k$ increases, the value of $R$ decreases and reaches its minimum at $k=8$ in all networks. Therefore, in the following experiments, we set $k=8$ for the six networks, $d=128$ for Reddit, and $d=20$ for the remaining networks.

IV-D Experimental Results

IV-D1 Results on Real Signed Networks

We compare the performance of ESND with the selected baselines on the six signed networks; the results are given in Figure 3 and Table II. In Figure 3, the horizontal axis ($q$) represents the fraction of nodes removed, while the vertical axis ($S(qN)$) corresponds to the fraction of nodes in the GCC after removing a fraction $q$ of the nodes. For a fixed value of $q$, a smaller value of $S(qN)$ indicates a more effective dismantling method. Table II reports the area under each curve (AUC) in Figure 3, with a smaller value indicating better performance of the corresponding dismantling method. The experimental results show that the robustness of these real signed networks differs notably: some of them, such as Slashdot and Epinions, collapse quickly after only a small fraction of nodes is removed, while the remaining ones are more robust. For example, WikiVote and Reddit require the removal of approximately 40% of the nodes to reach complete decomposition under most of the dismantling methods. In addition, ESND outperforms all baseline methods in dismantling most signed networks, particularly when we remove a large fraction of nodes. In dismantling an unsigned network, betweenness normally outperforms the other methods in most cases [45]. On the signed networks considered here, however, it attains a lower AUC than ESND only on WikiVote (Table II), which suggests that considering the topology induced by the signs is important in dismantling a signed network. Moreover, several centrality methods, including Closeness, K-shell, PN, TE, SPR, and SE, perform very differently across datasets; their efficacy is closely related to the specific structures of the networks. In contrast, ESND achieves the lowest AUC on five of the six datasets and the second lowest on WikiVote, showing its stability and effectiveness.

TABLE II: Area under the curve (AUC) of each curve in Figure 3. The best performance is highlighted in bold.

	ESND	Degree	P-DEG	N-DEG	Net-DEG	Ratio-DEG	Closeness	Betweenness	K-shell	PN	TE	SPR	SE
Bitcoinalpha	0.0596	0.0704	0.0699	0.1603	0.0907	0.4356	0.1305	0.0782	0.0949	0.1327	0.0823	0.2955	0.2972
Bitcoinotc	0.0499	0.0579	0.0611	0.1599	0.1326	0.4538	0.1133	0.0634	0.0789	0.1889	0.0692	0.404	0.4021
WikiVote	0.1455	0.1521	0.1644	0.2315	0.2686	0.4562	0.1675	0.1363	0.1596	0.3610	0.1649	0.4067	0.4274
Slashdot	0.0106	0.0128	0.0161	0.0975	0.1357	0.4246	0.022	0.0119	0.0289	0.2087	0.0137	0.0958	0.2186
Reddit	0.0802	0.0921	0.0915	0.1778	0.0935	0.4616	0.1774	0.1033	0.1152	0.1097	0.1027	0.3748	0.3173
Epinions	0.0097	0.0123	0.0161	0.0877	0.2774	0.4003	0.0457	0.0108	0.0256	0.3428	0.0147	0.0989	0.2363
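
The quantities behind Figure 3 and Table II can be illustrated with a minimal pure-Python sketch. It assumes an adjacency-set representation and trapezoidal integration; the helper names (`gcc_fraction`, `dismantling_curve`, `auc`) are hypothetical, and the integration rule actually used for Table II is not stated in the text.

```python
from collections import deque

def gcc_fraction(adj, removed, n_total):
    """Fraction of the n_total original nodes that lie in the largest component."""
    seen = set(removed)          # removed nodes are never visited
    best = 0
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        queue = deque([start])
        size = 0
        while queue:             # breadth-first search over one component
            u = queue.popleft()
            size += 1
            for v in adj[u]:
                if v not in seen:
                    seen.add(v)
                    queue.append(v)
        best = max(best, size)
    return best / n_total

def dismantling_curve(adj, removal_order):
    """Return [(q, S)] after removing 0, 1, 2, ... nodes in the given order."""
    n = len(adj)
    removed = set()
    curve = [(0.0, gcc_fraction(adj, removed, n))]
    for k, node in enumerate(removal_order, start=1):
        removed.add(node)
        curve.append((k / n, gcc_fraction(adj, removed, n)))
    return curve

def auc(curve):
    """Trapezoidal area under a [(q, S)] curve (an assumed integration rule)."""
    return sum((q1 - q0) * (s0 + s1) / 2
               for (q0, s0), (q1, s1) in zip(curve, curve[1:]))

# Toy example: the path 0-1-2-3-4. Edge signs play no role here because
# connectivity is evaluated on the unsigned structure.
adj = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2, 4}, 4: {3}}
curve = dismantling_curve(adj, removal_order=[2, 1, 3])
print(round(auc(curve), 2))  # 0.28
```

A removal order that fragments the network earlier produces a curve that drops faster and therefore a smaller area, which is why smaller entries in Table II correspond to better dismantling.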

The methods differ in dismantling efficacy because they select nodes differently in each iteration. To examine how the node removal sequences generated by these methods differ, we conduct a correlation analysis. For each method, we first obtain its node removal sequence, i.e., the order in which it removes nodes. We then calculate the Kendall correlation coefficient between the node sequences obtained by each pair of dismantling methods; the coefficients are given in Figure 4. In particular, the Kendall correlation coefficients between the proposed ESND and the baselines are generally low, indicating that the node removal strategy of ESND deviates substantially from those of the baselines. Additionally, Degree, P-DEG, N-DEG, Net-DEG, and Ratio-DEG, although all derived from node degree, yield node sequences with relatively low correlation with one another.
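
As an illustration of this comparison, the following sketch computes the Kendall coefficient between two removal orders. It assumes that both orders are complete permutations of the same node set, so there are no ties; how partial sequences are handled in the actual analysis is not specified, and the function name `kendall_tau` is hypothetical.

```python
from itertools import combinations

def kendall_tau(order_a, order_b):
    """Kendall tau between two removal orders over the same nodes (no ties)."""
    rank_b = {node: i for i, node in enumerate(order_b)}
    if set(order_a) != set(rank_b) or len(order_a) != len(order_b):
        raise ValueError("both orders must contain exactly the same nodes")
    concordant = discordant = 0
    for u, v in combinations(order_a, 2):
        # u is removed before v in order_a; check whether order_b agrees.
        if rank_b[u] < rank_b[v]:
            concordant += 1
        else:
            discordant += 1
    return (concordant - discordant) / (concordant + discordant)

same = kendall_tau(["a", "b", "c", "d"], ["a", "b", "c", "d"])      # 1.0
swapped = kendall_tau(["a", "b", "c", "d"], ["a", "c", "b", "d"])   # one discordant pair of six
reverse = kendall_tau(["a", "b", "c", "d"], ["d", "c", "b", "a"])   # -1.0
```

Values near 1 mean two methods remove nodes in almost the same order, while values near 0 mean the orders are essentially unrelated.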

IV-D2 Results on Null Models of the Signed Network

To examine how factors such as network topology and signs influence ESND and the resulting dismantling outcomes, we construct three distinct null models for signed networks[46] for each of the six datasets. Figure 5 illustrates examples of the null models, each of which preserves certain properties of the original network. Detailed descriptions are given below.

• Sign shuffle: In this model, one positive edge and one negative edge are selected at random and their signs are exchanged. The topological structure of the network is preserved, but the positive and negative degrees of the nodes change. Taking node $v_{1}$ in Figure 5a and b as an example, the degree of node $v_{1}$ is preserved, but its positive and negative degrees change from $\{2,0\}$ to $\{1,1\}$ under the sign shuffle model.

• Signed rewire: Initially, two subgraphs containing only positive or only negative edges are constructed from the original network. Subsequently, the edges are rewired within each subgraph, which preserves the positive and negative degrees of the nodes. The process ends by merging the two rewired subgraphs to establish the null model. In this model, the positive and negative degrees of the nodes remain the same as in the original network, while the network structure is changed. Figure 5c demonstrates the generation of a signed rewire null model. For example, we disconnect the edges $(v_{1},v_{2})$ , $(v_{5},v_{6})$ and form new edges $(v_{1},v_{5})$ , $(v_{2},v_{6})$ while keeping the positive and negative degree of each node.

• Rewire: The model exchanges edges between nodes while keeping the degree of each node unchanged. In this null model, both the topological structure of the network and the positive and negative degrees of each node undergo alterations. In Figure 5d, we show that the positive degree and negative degree of each node are changed through the random rewiring process of the rewire model.
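
The three null models described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the procedure of [46]: edges are stored as a dictionary `{(u, v): sign}` with `u < v`, swaps that would create self-loops or duplicate edges are rejected, and in the plain rewire each moved edge carries its own sign. The names `sign_shuffle`, `rewire`, `signed_degrees`, and the toy network are hypothetical.

```python
import random

def sign_shuffle(edges, n_swaps, rng):
    """Exchange the signs of random positive/negative edge pairs; topology is untouched."""
    out = dict(edges)
    for _ in range(n_swaps):
        pos = [e for e, s in out.items() if s > 0]
        neg = [e for e, s in out.items() if s < 0]
        if not pos or not neg:
            break
        out[rng.choice(pos)] = -1
        out[rng.choice(neg)] = +1
    return out

def rewire(edges, n_attempts, rng, same_sign_only):
    """Double-edge swaps: (a,b),(c,d) -> (a,c),(b,d).

    same_sign_only=True  -> "signed rewire": positive and negative degrees are kept.
    same_sign_only=False -> "rewire": only the total degree of each node is kept.
    """
    out = dict(edges)
    for _ in range(n_attempts):
        e1, e2 = rng.sample(list(out), 2)
        if same_sign_only and out[e1] != out[e2]:
            continue
        (a, b), (c, d) = e1, e2
        if rng.random() < 0.5:
            c, d = d, c                       # randomize which endpoints are paired
        if len({a, b, c, d}) < 4:
            continue                          # would create a self-loop
        new1, new2 = tuple(sorted((a, c))), tuple(sorted((b, d)))
        if new1 in out or new2 in out:
            continue                          # would create a duplicate edge
        s1, s2 = out.pop(e1), out.pop(e2)
        out[new1], out[new2] = s1, s2
    return out

def signed_degrees(edges):
    """Map each node to its (positive degree, negative degree)."""
    deg = {}
    for (u, v), s in edges.items():
        for x in (u, v):
            p, n = deg.get(x, (0, 0))
            deg[x] = (p + (s > 0), n + (s < 0))
    return deg

# Hypothetical toy network with four positive and three negative edges.
toy = {(1, 2): +1, (1, 3): +1, (2, 3): -1, (3, 4): +1,
       (4, 5): -1, (5, 6): +1, (4, 6): -1}
shuffled = sign_shuffle(toy, 20, random.Random(0))
signed_rewired = rewire(toy, 200, random.Random(1), same_sign_only=True)
rewired = rewire(toy, 200, random.Random(2), same_sign_only=False)
```

Comparing `signed_degrees` before and after each transformation makes the preserved quantity of each null model explicit: the edge set for the sign shuffle, the signed degrees for the signed rewire, and the total degrees for the rewire.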

We perform network dismantling on the null models generated from each of the six real-world signed networks; the experimental results are illustrated in Figure 6. The horizontal axis denotes the original signed network and its corresponding null models, while the vertical axis represents the evaluation metric $R$ of the various network dismantling methods applied to these signed networks and null models. The results show that each dismantling method performs in a generally consistent way across the original network and the three null models within the same dataset. This suggests that modifications to the topological properties and sign distribution of the signed network do not significantly affect the efficacy of these methods. More importantly, ESND consistently attains superior dismantling performance compared to the baselines across these varied null models (as shown by the red diamonds in Figure 6), emphasizing the stability of ESND as an effective method for dismantling networks.

IV-D3 Impact of the Signs on Network Robustness

We further examine the robustness of a signed network under different ratios of positive and negative edges. Specifically, we first generate unsigned synthetic networks, i.e., ER, WS, and BA networks, and then assign positive and negative signs to their edges in different ratios. Finally, we evaluate the robustness of these networks by using ESND. To be consistent and comparable, all synthetic networks contain 1000 nodes and have the same average degree of $10$ . In the ER network, each pair of nodes is connected with probability $p=0.01$ . In the WS network, each node is connected to its $k=10$ nearest neighbors, with a rewiring probability of $p=0.01$ . In the BA network, the initial number of nodes is $m_{0}=6$ , and each new node is connected to $5$ existing nodes. Subsequently, positive and negative signs are randomly assigned to the edges of each synthetic network, with the ratio of negative edges controlled as $p_{-}\in\{0.1,\ldots,0.9\}$ , to generate signed synthetic networks with different negative edge ratios. We show the dismantling results in Figure 7, where each point is the average of $100$ realizations.
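
The sign assignment can be sketched as follows for the ER case. The sketch assumes that exactly a fraction $p_{-}$ of the edges (rounded to the nearest integer) is made negative and that the ratios are spaced by 0.1; the text does not state whether signs are drawn independently per edge or what the spacing is. The helper names `er_edges` and `assign_signs` are hypothetical, and the WS and BA generators are omitted.

```python
import random
from itertools import combinations

def er_edges(n, p, rng):
    """Edge list of an ER random graph: each node pair is linked with probability p."""
    return [(u, v) for u, v in combinations(range(n), 2) if rng.random() < p]

def assign_signs(edge_list, neg_ratio, rng):
    """Mark round(neg_ratio * M) randomly chosen edges as negative, the rest as positive."""
    m = len(edge_list)
    negative = set(rng.sample(range(m), round(neg_ratio * m)))
    return {e: (-1 if i in negative else +1) for i, e in enumerate(edge_list)}

rng = random.Random(42)
edge_list = er_edges(1000, 0.01, rng)          # N = 1000, p = 0.01 as in the text
avg_degree = 2 * len(edge_list) / 1000         # close to 10 in expectation

# One signed network per negative-edge ratio (assumed grid 0.1, 0.2, ..., 0.9).
signed_networks = {r / 10: assign_signs(edge_list, r / 10, rng) for r in range(1, 10)}
```

Reusing one unsigned topology across all ratios isolates the effect of the signs; averaging over independent realizations, as done for Figure 7, would repeat this with different seeds.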

In Figure 7, the x-axis indicates the ratio of negative edges ( $\frac{|E^{-}|}{M}$ ) in each of the networks, and the y-axis shows the $R$ values, which quantify the robustness of the corresponding networks. Although the WS network has a higher value of $R$ (indicating greater robustness) than ER and BA for a low value of $\frac{|E^{-}|}{M}$ , it is easier to dismantle when $\frac{|E^{-}|}{M}>0.4$ . Meanwhile, we observe that the robustness of the networks is relatively stable as the ratio of negative edges increases within the range of small $\frac{|E^{-}|}{M}$ ( $\frac{|E^{-}|}{M}<0.4$ for WS, $\frac{|E^{-}|}{M}<0.7$ for ER and BA). For a large value of $\frac{|E^{-}|}{M}$ , however, the networks can easily be dismantled, with the value of $R$ decreasing as $\frac{|E^{-}|}{M}$ increases. This suggests that increasing the proportion of positive edges in the network contributes to enhancing its robustness. This observation aligns with real-world scenarios: in a social network where negative edges dominate, signifying antagonistic relationships between individuals, the network is naturally more vulnerable. In general, ER and BA networks are more resilient than the WS network as $\frac{|E^{-}|}{M}$ increases.

V Conclusion

In this study, we propose an embedding-based network dismantling framework, namely ESND, to address the signed network dismantling problem. The algorithm iterates the following four steps. It first detects the giant connected component (GCC) within the network and then utilizes the signed network embedding algorithm (SiNE) to generate an embedding vector for each node. Then, it partitions the GCC into different groups via the K-means algorithm based on the node embedding vectors. Subsequently, the node with the highest degree in the largest cluster is removed from the network. The above process is repeated until the fraction of removed nodes, denoted by $q$ , reaches a predetermined threshold $q_{r}$ . Comprehensive experiments on various real signed networks and their corresponding null models demonstrate that ESND surpasses the baseline methods, thus confirming its efficacy and stability. Additionally, correlation analysis of the node removal sequences indicates that ESND follows a removal strategy markedly different from those of the baselines. Moreover, experiments on how the ratio of negative edges affects the robustness of a signed network show that networks with more negative edges are easier to dismantle.
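
The four-step iteration summarized above can be sketched as a loop with pluggable embedding and clustering steps. This is an illustration only: `toy_embed` and `toy_cluster` are simple stand-ins that are NOT SiNE and NOT K-means, the function names are hypothetical, and counting a node's degree inside the current GCC is an assumption of this sketch.

```python
from collections import deque

def giant_component(adj, removed):
    """Node set of the largest connected component, ignoring removed nodes."""
    seen, best = set(removed), set()
    for start in adj:
        if start in seen:
            continue
        comp, queue = {start}, deque([start])
        seen.add(start)
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in seen:
                    seen.add(v)
                    comp.add(v)
                    queue.append(v)
        if len(comp) > len(best):
            best = comp
    return best

def esnd_like(adj, embed, cluster, q_r):
    """Iterate: GCC detection -> embedding -> clustering -> node removal.

    adj: {node: {neighbor: sign}}; embed(adj, nodes) -> {node: vector};
    cluster(vectors) -> {node: label}. Both callables are placeholders.
    """
    n, removed, order = len(adj), set(), []
    while len(removed) / n < q_r:
        gcc = giant_component(adj, removed)
        if len(gcc) <= 1:
            break
        labels = cluster(embed(adj, gcc))
        groups = {}
        for node, label in labels.items():
            groups.setdefault(label, []).append(node)
        largest = max(groups.values(), key=len)
        # Highest-degree node of the largest cluster (degree counted inside the GCC).
        target = max(largest, key=lambda u: sum(v in gcc for v in adj[u]))
        removed.add(target)
        order.append(target)
    return order

def toy_embed(adj, nodes):
    # Stand-in for SiNE: (positive degree, negative degree) restricted to `nodes`.
    return {u: (sum(1 for v, s in adj[u].items() if v in nodes and s > 0),
                sum(1 for v, s in adj[u].items() if v in nodes and s < 0))
            for u in nodes}

def toy_cluster(vectors):
    # Stand-in for K-means: two groups, split on positive vs. negative dominance.
    return {u: int(p > m) for u, (p, m) in vectors.items()}

# Hypothetical example: two positive triangles joined by one negative edge.
edges = [(0, 1, 1), (1, 2, 1), (0, 2, 1), (3, 4, 1), (4, 5, 1), (3, 5, 1), (2, 3, -1)]
adj = {}
for u, v, s in edges:
    adj.setdefault(u, {})[v] = s
    adj.setdefault(v, {})[u] = s
order = esnd_like(adj, toy_embed, toy_cluster, q_r=0.5)
```

Passing the embedding and clustering steps as callables keeps the loop independent of any particular implementation, so an actual signed embedding and K-means routine could be substituted without changing the control flow.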

Certain aspects of this study warrant further attention, and future work could explore the following directions. a) In this work, we treat signed networks as undirected networks. However, most signed networks in the real world are directed. Therefore, how to efficiently dismantle a directed signed network remains a topic worth investigating. b) We confine ourselves to using unsigned network connectivity, i.e., the fraction of nodes in the giant component, to evaluate the performance of a dismantling method. Future work could propose performance evaluation methods that take the signs of the network into account. c) Most of the existing research on network dismantling focuses on removing critical nodes. Devising methods that identify critical edges in signed networks to achieve rapid network decomposition is also an interesting direction that deserves further exploration.

Acknowledgment

This work is supported by the National Natural Science Foundation of China (Grant Nos. 62173095, U23A20331), the Natural Science Foundation of Zhejiang Province (Grant No. LQ22F030008), the Scientific Research Foundation for Scholars of HZNU (2021QDL030), and the Natural Science Foundation of Shanghai (Grant No. 21ZR1404700).

References

[1] R. Albert, H. Jeong, and A.-L. Barabási, “Error and attack tolerance of complex networks,” Nature, vol. 406, no. 6794, pp. 378–382, Jul. 2000.
[2] P. Holme, B. J. Kim, C. N. Yoon, and S. K. Han, “Attack vulnerability of complex networks,” Phys. Rev. E, vol. 65, no. 5, p. 056109, May. 2002.
[3] A. Braunstein, L. Dall’Asta, G. Semerjian, and L. Zdeborová, “Network dismantling,” Proc. Natl. Acad. Sci. U.S.A., vol. 113, no. 44, pp. 12 368–12 373, Oct. 2016.
[4] R. Cohen, S. Havlin, and D. Ben-Avraham, “Efficient immunization strategies for computer networks and populations,” Phys. Rev. Lett., vol. 91, no. 24, p. 247901, Dec. 2003.
[5] R. Albert, I. Albert, and G. L. Nakarado, “Structural vulnerability of the north american power grid,” Phys. Rev. E, vol. 69, no. 2, p. 025103, Feb. 2004.
[6] T. Verma, N. A. Araújo, and H. J. Herrmann, “Revealing the structure of the world airline network,” Sci. Rep., vol. 4, no. 1, p. 5638, Jul. 2014.
[7] C. Li, H. Wang, and P. Van Mieghem, “Bounds for the spectral radius of a graph when nodes are removed,” Linear Algebra Its Appl., vol. 437, no. 1, pp. 319–323, Jul. 2012.
[8] X.-X. Zhan, K. Zhang, L. Ge, J. Huang, Z. Zhang, L. Wei, G.-Q. Sun, C. Liu, and Z.-K. Zhang, “Exploring the effect of social media and spatial characteristics during the covid-19 pandemic in china,” IEEE Trans. Network Sci. Eng., vol. 10, no. 1, pp. 553–564, Mar. 2022.
[9] M. U. Akhtar, J. Liu, X. Liu, S. Ahmed, and X. Cui, “NRAND: An efficient and robust dismantling approach for infectious disease network,” Inf. Process. Manage., vol. 60, no. 2, p. 103221, Mar. 2023.
[10] X.-X. Zhan, A. Hanjalic, and H. Wang, “Suppressing information diffusion via link blocking in temporal networks,” in International Conference on Complex Networks and Their Applications. Springer, Nov. 2020, pp. 448–458.
[11] F. Gao, Q. He, X. Wang, L. Qiu, and M. Huang, “An efficient rumor suppression approach with knowledge graph convolutional network in social network,” IEEE Trans. Comput. Soc. Syst., Apr. 2024.
[12] P. A. Duijn, V. Kashirin, and P. M. Sloot, “The relative ineffectiveness of criminal network disruption,” Sci. Rep., vol. 4, no. 1, p. 4238, Feb. 2014.
[13] B. Collins, D. T. Hoang, N. T. Nguyen, and D. Hwang, “A new model for predicting and dismantling a complex terrorist network,” IEEE Access, vol. 10, pp. 126 466–126 478, Nov. 2022.
[14] T. N. Bui and C. Jones, “Finding good approximate vertex and edge partitions is np-hard,” Inf. Process. Lett., vol. 42, no. 3, pp. 153–159, May. 1992.
[15] S. V. Buldyrev, R. Parshani, G. Paul, H. E. Stanley, and S. Havlin, “Catastrophic cascade of failures in interdependent networks,” Nature, vol. 464, no. 7291, pp. 1025–1028, Apr. 2010.
[16] S. Osat, A. Faqeeh, and F. Radicchi, “Optimal percolation on multiplex networks,” Nat. Commun., vol. 8, no. 1, p. 1540, Nov. 2017.
[17] S. Sun, Y. Wu, Y. Ma, L. Wang, Z. Gao, and C. Xia, “Impact of degree heterogeneity on attack vulnerability of interdependent networks,” Sci. Rep., vol. 6, no. 1, p. 32983, Sep. 2016.
[18] B. Zhou, Y. Lv, Y. Mao, J. Wang, S. Yu, and Q. Xuan, “The robustness of graph k-shell structure under adversarial attacks,” IEEE Trans. Circuits Syst. II: Express Br., vol. 69, no. 3, pp. 1797–1801, Mar. 2021.
[19] N. Almeira, J. I. Perotti, A. Chacoma, and O. V. Billoni, “Explosive dismantling of two-dimensional random lattices under betweenness centrality attacks,” Chaos, Solitons Fractals, vol. 153, p. 111529, Dec. 2021.
[20] Y. Hao, Y. Wang, L. Jia, and Z. He, “Cascading failures in networks with the harmonic closeness under edge attack strategies,” Chaos, Solitons Fractals, vol. 135, p. 109772, Jun. 2020.
[21] S. Iyer, T. Killingback, B. Sundaram, and Z. Wang, “Attack robustness and centrality of complex networks,” PLoS One, vol. 8, no. 4, p. e59613, Apr. 2013.
[22] D. Zhao, B. Gao, Y. Wang, L. Wang, and Z. Wang, “Optimal dismantling of interdependent networks based on inverse explosive percolation,” IEEE Trans. Circuits Syst. II: Express Br., vol. 65, no. 7, pp. 953–957, Jan. 2018.
[23] F. Morone and H. A. Makse, “Influence maximization in complex networks through optimal percolation,” Nature, vol. 524, no. 7563, pp. 65–68, Jul. 2015.
[24] X.-L. Ren, N. Gleinig, D. Helbing, and N. Antulov-Fantulin, “Generalized network dismantling,” Proc. Natl. Acad. Sci. U.S.A., vol. 116, no. 14, pp. 6554–6559, Feb. 2019.
[25] Z. Feng, Z. Cao, and X. Qi, “Generalized network dismantling via a novel spectral partition algorithm,” Inf. Sci., vol. 632, pp. 285–298, Jun. 2023.
[26] M. Lozano, C. Garcia-Martinez, F. J. Rodriguez, and H. M. Trujillo, “Optimizing network attacks by artificial bee colony,” Inf. Sci., vol. 377, pp. 30–50, Jan. 2017.
[27] S. Wang, J. Liu, and Y. Jin, “Finding influential nodes in multiplex networks using a memetic algorithm,” IEEE Trans. Cybern., vol. 51, no. 2, pp. 900–912, Jun. 2019.
[28] C. Fan, L. Zeng, Y. Sun, and Y.-Y. Liu, “Finding key players in complex networks through deep reinforcement learning,” Nat. Mach. Intell., vol. 2, no. 6, pp. 317–324, May. 2020.
[29] M. Grassia, M. De Domenico, and G. Mangioni, “Machine learning dismantling and early-warning signals of disintegration in complex systems,” Nat. Commun., vol. 12, no. 1, p. 5190, Aug. 2021.
[30] Q. Liu and B. Wang, “Neural extraction of multiscale essential structure for network dismantling,” Int. J. Neural Netw., vol. 154, pp. 99–108, Jul. 2022.
[31] J. Leskovec, D. Huttenlocher, and J. Kleinberg, “Signed networks in social media,” in Proc. SIGCHI Conf. Hum. Factor Comput. Syst., Apr. 2010, pp. 1361–1370.
[32] D. Yan, W. Xie, Y. Zhang, Q. He, and Y. Yang, “Hypernetwork dismantling via deep reinforcement learning,” IEEE Trans. Network Sci. Eng., vol. 9, no. 5, pp. 3302–3315, May. 2022.
[33] M. Qi, P. Chen, J. Wu, Y. Liang, and X. Duan, “Robustness measurement of multiplex networks based on graph spectrum,” Chaos, vol. 33, no. 2, Feb. 2023.
[34] J. Tang, Y. Chang, C. Aggarwal, and H. Liu, “A survey of signed network mining in social media,” ACM Comput. Surv., vol. 49, no. 3, pp. 1–37, Aug. 2016.
[35] H.-J. Li, W. Xu, S. Song, W.-X. Wang, and M. Perc, “The dynamics of epidemic spreading on signed networks,” Chaos, Solitons Fractals, vol. 151, p. 111294, Oct. 2021.
[36] S. Osat, F. Papadopoulos, A. S. Teixeira, and F. Radicchi, “Embedding-aided network dismantling,” Phys. Rev. Res., vol. 5, no. 1, p. 013076, Feb. 2023.
[37] S. Wang, J. Tang, C. Aggarwal, Y. Chang, and H. Liu, “Signed network embedding in social media,” in Proceedings of the 2017 SIAM International Conference on Data Mining. SIAM, 2017, pp. 327–335.
[38] S. Wandelt, X. Shi, X. Sun, and M. Zanin, “Community detection boosts network dismantling on real-world networks,” IEEE Access, vol. 8, pp. 111 954–111 965, Jun. 2020.
[39] F. Musciotto and S. Miccichè, “Exploring the landscape of dismantling strategies based on the community structure of networks,” Sci. Rep., vol. 13, no. 1, pp. 1–12, Sep. 2023.
[40] M. G. Everett and S. P. Borgatti, “Networks containing negative ties,” Soc. Netw., vol. 38, pp. 111–120, Jul. 2014.
[41] W.-C. Liu, L.-C. Huang, C. W.-J. Liu, and F. Jordán, “A simple approach for quantifying node centrality in signed and directed social networks,” Appl. Netw. Sci., vol. 5, pp. 1–26, Aug. 2020.
[42] X. Yin, X. Hu, Y. Chen, X. Yuan, and B. Li, “Signed-pagerank: An efficient influence maximization framework for signed social networks,” IEEE Trans. Knowl. Data Eng., vol. 33, no. 5, pp. 2208–2222, Oct. 2019.
[43] P. Bonacich and P. Lloyd, “Calculating status with negative relations,” Soc. Netw., vol. 26, no. 4, pp. 331–338, 2004.
[44] C. M. Schneider, A. A. Moreira, J. S. Andrade Jr, S. Havlin, and H. J. Herrmann, “Mitigation of malicious attacks on networks,” Proc. Natl. Acad. Sci. U.S.A., vol. 108, no. 10, pp. 3838–3841, Mar. 2011.
[45] S. Wandelt, X. Sun, D. Feng, M. Zanin, and S. Havlin, “A comparative analysis of approaches to network-dismantling,” Sci. Rep., vol. 8, no. 1, pp. 1–15, Sep. 2018.
[46] B. Hao and I. A. Kovács, “Proper network randomization is key to assessing social balance,” Sci. Adv., vol. 10, no. 18, p. eadj0104, May. 2024.