11institutetext: Technion, Haifa, Israel 11email: {shapirayuval@campus,wieselnaor@campus,shabelman@campus,ddana@ee}.technion.ac.il

Boosting Few-Pixel Robustness Verification via Covering Verification Designs

Yuval Shapira    Naor Wiesel    Shahar Shabelman    Dana Drachsler-Cohen
Abstract

Proving local robustness is crucial to increase the reliability of neural networks. While many verifiers prove robustness in Lsubscript𝐿L_{\infty}italic_L start_POSTSUBSCRIPT ∞ end_POSTSUBSCRIPT ϵitalic-ϵ\epsilonitalic_ϵ-balls, very little work deals with robustness verification in L0subscript𝐿0L_{0}italic_L start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT ϵitalic-ϵ\epsilonitalic_ϵ-balls, capturing robustness to few pixel attacks. This verification introduces a combinatorial challenge, because the space of pixels to perturb is discrete and of exponential size. A previous work relies on covering designs to identify sets for defining Lsubscript𝐿L_{\infty}italic_L start_POSTSUBSCRIPT ∞ end_POSTSUBSCRIPT neighborhoods, which if proven robust imply that the L0subscript𝐿0L_{0}italic_L start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT ϵitalic-ϵ\epsilonitalic_ϵ-ball is robust. However, the number of neighborhoods to verify remains very high, leading to a high analysis time. We propose covering verification designs, a combinatorial design that tailors effective but analysis-incompatible coverings to L0subscript𝐿0L_{0}italic_L start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT robustness verification. The challenge is that computing a covering verification design introduces a high time and memory overhead, which is intensified in our setting, where multiple candidate coverings are required to identify how to reduce the overall analysis time. We introduce CoVerD, an L0subscript𝐿0L_{0}italic_L start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT robustness verifier that selects between different candidate coverings without constructing them, but by predicting their block size distribution. This prediction relies on a theorem providing closed-form expressions for the mean and variance of this distribution. CoVerD constructs the chosen covering verification design on-the-fly, while kee** the memory consumption minimal and enabling to parallelize the analysis. The experimental results show that CoVerD reduces the verification time on average by up to 5.1x compared to prior work and that it scales to larger L0subscript𝐿0L_{0}italic_L start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT ϵitalic-ϵ\epsilonitalic_ϵ-balls.

1 Introduction

Neural networks are very successful in various applications, most notably in image recognition tasks [14]. However, neural networks are also vulnerable to adversarial example attacks [32, 17]. In an adversarial example attack, an attacker slightly perturbs the input to mislead the network. Many attack models and different kinds of perturbations have been considered for neural networks implementing image classifiers [32, 15, 26]. The most commonly studied perturbations are Lpsubscript𝐿𝑝L_{p}italic_L start_POSTSUBSCRIPT italic_p end_POSTSUBSCRIPT perturbations, where p𝑝pitalic_p is 00 [9, 39], 1111 [10], 2222 [32, 4] or \infty [15, 4]. For Lpsubscript𝐿𝑝L_{p}italic_L start_POSTSUBSCRIPT italic_p end_POSTSUBSCRIPT perturbations, the attacker is given a small budget ϵitalic-ϵ\epsilonitalic_ϵ and the goal is to find a perturbed input in the Lpsubscript𝐿𝑝L_{p}italic_L start_POSTSUBSCRIPT italic_p end_POSTSUBSCRIPT ϵitalic-ϵ\epsilonitalic_ϵ-ball that causes misclassification.

In response to adversarial attacks, many verifiers have been proposed to reason about the robustness of neural networks in a given neighborhood of inputs. Most deterministic robustness verifiers analyze robustness in Lsubscript𝐿L_{\infty}italic_L start_POSTSUBSCRIPT ∞ end_POSTSUBSCRIPT ϵitalic-ϵ\epsilonitalic_ϵ-balls [33, 25, 31, 13, 21, 2], while some deterministic verifiers analyze L2subscript𝐿2L_{2}italic_L start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT ϵitalic-ϵ\epsilonitalic_ϵ-balls [22, 19] or L1subscript𝐿1L_{1}italic_L start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT ϵitalic-ϵ\epsilonitalic_ϵ-balls [37, 40]. Probabilistic verifiers, often leveraging randomized smoothing [6], have been proposed for analyzing Lpsubscript𝐿𝑝L_{p}italic_L start_POSTSUBSCRIPT italic_p end_POSTSUBSCRIPT ϵitalic-ϵ\epsilonitalic_ϵ-balls for p{0,1,2,}𝑝012p\in\{0,1,2,\infty\}italic_p ∈ { 0 , 1 , 2 , ∞ } [23, 28, 38, 11]. Other verifiers analyze neighborhoods defined by semantic or geometric features (e.g., brightness or rotation) [20, 24, 3]. An existing gap is deterministically verifying robustness in L0subscript𝐿0L_{0}italic_L start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT ϵitalic-ϵ\epsilonitalic_ϵ-balls, for a small ϵitalic-ϵ\epsilonitalic_ϵ, also known as robustness to few pixel attacks. In L0subscript𝐿0L_{0}italic_L start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT ϵitalic-ϵ\epsilonitalic_ϵ-balls, ϵitalic-ϵ\epsilonitalic_ϵ is the number of pixels that can be perturbed. Since ϵitalic-ϵ\epsilonitalic_ϵ is an integer (as opposed to a real number), we denote it as t𝑡titalic_t. L0subscript𝐿0L_{0}italic_L start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT t𝑡titalic_t-balls consist of discrete perturbations, unlike many other attack models whose perturbations are continuous. Thus, their analysis is a challenging combinatorial problem. Theoretically, robustness verification of an L0subscript𝐿0L_{0}italic_L start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT t𝑡titalic_t-ball can be reduced into a set of robustness verification tasks over Lsubscript𝐿L_{\infty}italic_L start_POSTSUBSCRIPT ∞ end_POSTSUBSCRIPT neighborhoods, each allows a specific set of t𝑡titalic_t pixels to be perturbed. However, this approach is infeasible in practice for t>2𝑡2t>2italic_t > 2, since the number of the Lsubscript𝐿L_{\infty}italic_L start_POSTSUBSCRIPT ∞ end_POSTSUBSCRIPT neighborhoods that need to be proven robust is (vt)binomial𝑣𝑡v\choose t( binomial start_ARG italic_v end_ARG start_ARG italic_t end_ARG ), where v𝑣vitalic_v is the number of pixels. To illustrate, for MNIST images, where v=784𝑣784v=784italic_v = 784, the number of neighborhoods is 1.610101.6superscript10101.6\cdot 10^{10}1.6 ⋅ 10 start_POSTSUPERSCRIPT 10 end_POSTSUPERSCRIPT for t=4𝑡4t=4italic_t = 4, 2.410122.4superscript10122.4\cdot 10^{12}2.4 ⋅ 10 start_POSTSUPERSCRIPT 12 end_POSTSUPERSCRIPT for t=5𝑡5t=5italic_t = 5, and 3.210143.2superscript10143.2\cdot 10^{14}3.2 ⋅ 10 start_POSTSUPERSCRIPT 14 end_POSTSUPERSCRIPT for t=6𝑡6t=6italic_t = 6. That is, every minimal increase of t𝑡titalic_t (by one) increases the neighborhood size by two orders of magnitude.

A recent work proposes a deterministic L0subscript𝐿0L_{0}italic_L start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT robustness verifier for few pixel attacks, called Calzone [30]. Calzone builds on two main observations. First, if a network is robust to perturbations of a specific set of k𝑘kitalic_k pixels, then it is also robust to perturbations of any subsumed set of these pixels. Second, often Lsubscript𝐿L_{\infty}italic_L start_POSTSUBSCRIPT ∞ end_POSTSUBSCRIPT robustness verifiers can analyze robustness to arbitrary perturbations of k𝑘kitalic_k specific pixels, for values of k𝑘kitalic_k that are significantly larger than t𝑡titalic_t. They thus reduce the problem of verifying robustness in an L0subscript𝐿0L_{0}italic_L start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT t𝑡titalic_t-ball to proving robustness in a set of Lsubscript𝐿L_{\infty}italic_L start_POSTSUBSCRIPT ∞ end_POSTSUBSCRIPT neighborhoods defined by a set of k𝑘kitalic_k-sized pixel sets, subsuming all possible sets of t𝑡titalic_t pixels. To compute the k𝑘kitalic_k-sized pixel sets, they rely on covering designs [16, 34]. Given parameters (v,k,t)𝑣𝑘𝑡(v,k,t)( italic_v , italic_k , italic_t ), a covering is a set of k𝑘kitalic_k-sized sets that cover all subsets of size t𝑡titalic_t of a set [v]={1,,v}delimited-[]𝑣1𝑣[v]=\{1,\ldots,v\}[ italic_v ] = { 1 , … , italic_v } (e.g., the pixel set). Covering designs is a field in combinatorics providing construction techniques to compute coverings. The challenge is to compute a covering of minimal size. While many covering constructions have been proposed, computing an optimal covering is an open combinatorial problem for most values of v𝑣vitalic_v, k𝑘kitalic_k and t𝑡titalic_t. Further, most best-known coverings for t>3𝑡3t>3italic_t > 3 are far from the best general lower bound, known as the Schönheim bound [29]. This severely impacts the analysis time of Calzone. In practice, Calzone often does not complete within the five hour timeout when analyzing L0subscript𝐿0L_{0}italic_L start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT 5555-balls. To scale, it is crucial to lower the number of analyzed sets. While there are effective covering constructions renowned for the small coverings they compute, they are limited to specific values of v𝑣vitalic_v and k𝑘kitalic_k, which are incompatible for the analysis of L0subscript𝐿0L_{0}italic_L start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT robustness. Since Calzone treats covering constructions as black-box, it is limited to rely on analysis-compatible coverings and cannot benefit from these effective constructions.

To boost the robustness verification of few pixel attacks, we propose a new covering type, called a covering verification design (CVD), tailoring covering designs for L0subscript𝐿0L_{0}italic_L start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT robustness verification. CVD relies on a highly effective construction to obtain an analysis-incompatible covering and partially induces it to an analysis-compatible covering, where sets can have different sizes. Although the exact sets and their sizes depend on a random choice, we prove that the mean and variance of the set sizes are independent of this choice and have closed-form expressions. Partially inducing this effective construction has been proposed before [27], however it has been proposed for another combinatorial design, requiring a bound on the maximal set size in the covering, unlike CVD. We demonstrate that the sizes of CVDs are lower by 8% for t=4𝑡4t=4italic_t = 4 and by 15% for t=5𝑡5t=5italic_t = 5 than the respective Schönheim lower bound. This improvement, enabled by considering a new type of coverings, is remarkable for scaling L0subscript𝐿0L_{0}italic_L start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT robustness analysis. To date, for analysis-compatible values of v𝑣vitalic_v and k𝑘kitalic_k and for t3𝑡3t\geq 3italic_t ≥ 3, it is impossible to obtain an optimal covering design, and even if we obtained it, its size is at least the Schönheim bound. In particular, Calzone’s considered coverings are larger by 4x than the Schönheim lower bound for t=4𝑡4t=4italic_t = 4 and by 8.4x for t=5𝑡5t=5italic_t = 5. While promising, CVDs raise a practical challenge: their construction as well as their final size introduce a high memory overhead. Further, to minimize the analysis time, the verifier chooses between multiple coverings. However, the total memory overhead makes it infeasible to store these coverings in a covering database without limiting their size (like Calzone does).

We introduce CoVerD, an L0subscript𝐿0L_{0}italic_L start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT robustness verifier, boosting Calzone’s performance by leveraging CVDs. CoVerD has two main components, planning and analysis. The planning component predicts the CVD that will allow it to minimize the overall analysis time. To reduce the memory overhead, it predicts the best CVD out of many candidates, without constructing the candidates. This prediction relies on estimating the set size distribution of a candidate covering, using our expressions for the mean and variance. The analysis component constructs the chosen CVD. The challenge is that the original covering that is being induced may be too large to fit the memory. To cope, CoVerD induces the covering while constructing the original covering. Further, it constructs on-the-fly a partitioning of the CVD so that the analysis can be parallelized over multiple GPUs. Another advantage of the on-the-fly construction is that CoVerD does not need to prepare coverings for every image dimension in advance. This both saves memory consumption and makes CoVerD suitable for any image classifier, without requiring to precompute coverings for new image dimensions, as Calzone requires.

We evaluate CoVerD on convolutional and fully-connected networks, trained for MNIST, Fashion-MNIST, and CIFAR-10. CoVerD is faster than Calzone in verifying robust t𝑡titalic_t-balls on average by 2.8x for t=4𝑡4t=4italic_t = 4 and by 5.1x for t=5𝑡5t=5italic_t = 5. Further, CoVerD scales to more challenging t𝑡titalic_t-balls than Calzone. In particular, it verifies some 6666-balls, which Calzone does not consider at all, within 42424242 minutes.

2 Background

In this section, we define the problem of verifying robustness of an image classifier in an L0subscript𝐿0L_{0}italic_L start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT t𝑡titalic_t-ball and provide background on Calzone [30].

L0subscript𝐿0L_{0}italic_L start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT robustness verification

We address the problem of determining the local robustness of an image classifier in an L0subscript𝐿0L_{0}italic_L start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT t𝑡titalic_t-ball of an image x𝑥xitalic_x. An image classifier N𝑁Nitalic_N takes as input an image x𝑥xitalic_x consisting of v𝑣vitalic_v pixels, each ranges over [0,1]01[0,1][ 0 , 1 ] (all definitions extend to colored images, but omitted for simplicity’s sake). It returns a vector consisting of a score for every possible class. The classification the classifier N𝑁Nitalic_N assigns to an input image x𝑥xitalic_x is the class with the maximal score: cx=argmax(N(x))subscript𝑐𝑥argmax𝑁𝑥c_{x}=\texttt{argmax}(N(x))italic_c start_POSTSUBSCRIPT italic_x end_POSTSUBSCRIPT = argmax ( italic_N ( italic_x ) ). We focus on classifiers implemented by neural networks. Specifically, our focus is on fully-connected and convolutional networks, since many Lsubscript𝐿L_{\infty}italic_L start_POSTSUBSCRIPT ∞ end_POSTSUBSCRIPT robustness verifiers can analyze them [31, 13, 21, 2, 33, 25]. However, like Calzone, CoVerD is not coupled to the underlying implementation of the classifier and can reason about any classifier for which there are Lsubscript𝐿L_{\infty}italic_L start_POSTSUBSCRIPT ∞ end_POSTSUBSCRIPT robustness verifiers that it can rely on. The problem we study is determining whether a classifier N𝑁Nitalic_N is locally robust in the L0subscript𝐿0L_{0}italic_L start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT t𝑡titalic_t-ball of an input x𝑥xitalic_x, for t2𝑡2t\geq 2italic_t ≥ 2. That is, whether every input whose L0subscript𝐿0L_{0}italic_L start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT distance from x𝑥xitalic_x is at most t𝑡titalic_t is classified by N𝑁Nitalic_N as x𝑥xitalic_x is classified. Formally, the t𝑡titalic_t-ball of x𝑥xitalic_x is Bt(x)={xxx0t}subscript𝐵𝑡𝑥conditional-setsuperscript𝑥subscriptnormsuperscript𝑥𝑥0𝑡B_{t}(x)=\{x^{\prime}\mid||x^{\prime}-x||_{0}\leq t\}italic_B start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT ( italic_x ) = { italic_x start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT ∣ | | italic_x start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT - italic_x | | start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT ≤ italic_t } and N𝑁Nitalic_N is locally robust in Bt(x)subscript𝐵𝑡𝑥B_{t}(x)italic_B start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT ( italic_x ) if xBt(x).argmax(N(x))=argmax(N(x))formulae-sequencefor-allsuperscript𝑥subscript𝐵𝑡𝑥argmax𝑁superscript𝑥argmax𝑁𝑥\forall x^{\prime}\in B_{t}(x).\ \texttt{argmax}(N(x^{\prime}))=\texttt{argmax% }(N(x))∀ italic_x start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT ∈ italic_B start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT ( italic_x ) . argmax ( italic_N ( italic_x start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT ) ) = argmax ( italic_N ( italic_x ) ). We note that the L0subscript𝐿0L_{0}italic_L start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT distance of two images is the number of pixels that the images differ, that is xx0=|{i[v]xixi}|subscriptnormsuperscript𝑥𝑥0conditional-set𝑖delimited-[]𝑣subscript𝑥𝑖superscriptsubscript𝑥𝑖||x^{\prime}-x||_{0}=|\{i\in[v]\mid x_{i}\neq x_{i}^{\prime}\}|| | italic_x start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT - italic_x | | start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT = | { italic_i ∈ [ italic_v ] ∣ italic_x start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ≠ italic_x start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT } | (where [v]={1,,v}delimited-[]𝑣1𝑣[v]=\{1,\ldots,v\}[ italic_v ] = { 1 , … , italic_v }). In other words, an L0subscript𝐿0L_{0}italic_L start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT perturbation to an image x𝑥xitalic_x can arbitrarily perturb up to t𝑡titalic_t pixels in x𝑥xitalic_x.

Calzone

Calzone, depicted in Figure 1, is an L0subscript𝐿0L_{0}italic_L start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT robustness verifier. It verifies by determining the robustness of a classifier N𝑁Nitalic_N in all neighborhoods in which a specific set of pixels S𝑆Sitalic_S is arbitrarily perturbed, for every S[v]𝑆delimited-[]𝑣S\subseteq[v]italic_S ⊆ [ italic_v ] of size t𝑡titalic_t. Namely, to prove robustness, it has to determine for every such S𝑆Sitalic_S whether N𝑁Nitalic_N classifies the same all inputs in the neighborhood consisting of all images that are identical to x𝑥xitalic_x in all pixels, but the pixels in S𝑆Sitalic_S. We denote this neighborhood by IS(x)={x[0,1]viS.xi=xi}subscript𝐼𝑆𝑥conditional-setsuperscript𝑥superscript01𝑣formulae-sequencefor-all𝑖𝑆subscriptsuperscript𝑥𝑖subscript𝑥𝑖I_{S}(x)=\{x^{\prime}\in[0,1]^{v}\mid\forall i\notin S.\ x^{\prime}_{i}=x_{i}\}italic_I start_POSTSUBSCRIPT italic_S end_POSTSUBSCRIPT ( italic_x ) = { italic_x start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT ∈ [ 0 , 1 ] start_POSTSUPERSCRIPT italic_v end_POSTSUPERSCRIPT ∣ ∀ italic_i ∉ italic_S . italic_x start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT = italic_x start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT }. Such neighborhoods can be specified as a sequence of intervals, one for every pixel, where the ithsuperscript𝑖thi^{\text{th}}italic_i start_POSTSUPERSCRIPT th end_POSTSUPERSCRIPT interval is [0,1]01[0,1][ 0 , 1 ] if iS𝑖𝑆i\in Sitalic_i ∈ italic_S (i.e., it can be perturbed) or [xi,xi]subscript𝑥𝑖subscript𝑥𝑖[x_{i},x_{i}][ italic_x start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT , italic_x start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ] if iS𝑖𝑆i\notin Sitalic_i ∉ italic_S (i.e., it cannot be perturbed). Most existing Lsubscript𝐿L_{\infty}italic_L start_POSTSUBSCRIPT ∞ end_POSTSUBSCRIPT robustness verifiers can determine the robustness of such interval neighborhoods. However, verifying (vt)binomial𝑣𝑡v\choose t( binomial start_ARG italic_v end_ARG start_ARG italic_t end_ARG ) interval neighborhoods, one for every selection of t𝑡titalic_t pixels to perturb, is practically infeasible for t>2𝑡2t>2italic_t > 2. Instead, Calzone builds on the following observation: if N𝑁Nitalic_N is locally robust in a neighborhood IS(x)subscript𝐼superscript𝑆𝑥I_{S^{\prime}}(x)italic_I start_POSTSUBSCRIPT italic_S start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT end_POSTSUBSCRIPT ( italic_x ) for S[v]superscript𝑆delimited-[]𝑣S^{\prime}\subseteq[v]italic_S start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT ⊆ [ italic_v ] of size k>t𝑘𝑡k>titalic_k > italic_t, then N𝑁Nitalic_N is also robust in every IS(x)subscript𝐼𝑆𝑥I_{S}(x)italic_I start_POSTSUBSCRIPT italic_S end_POSTSUBSCRIPT ( italic_x ), for SS𝑆superscript𝑆S\subseteq S^{\prime}italic_S ⊆ italic_S start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT of size t𝑡titalic_t. This observation allows Calzone to leverage covering designs to reduce the number of neighborhoods analyzed by an Lsubscript𝐿L_{\infty}italic_L start_POSTSUBSCRIPT ∞ end_POSTSUBSCRIPT verifier. Given three numbers (v,k,t)𝑣𝑘𝑡(v,k,t)( italic_v , italic_k , italic_t ), for tkv𝑡𝑘𝑣t\leq k\leq vitalic_t ≤ italic_k ≤ italic_v, a covering C(v,k,t)𝐶𝑣𝑘𝑡C(v,k,t)italic_C ( italic_v , italic_k , italic_t ) is a set of blocks, where (1) each block is subset of size k𝑘kitalic_k of [v]delimited-[]𝑣[v][ italic_v ] and (2) the blocks cover all subsets of [v]delimited-[]𝑣[v][ italic_v ] of size t𝑡titalic_t: for every S[v]𝑆delimited-[]𝑣S\subseteq[v]italic_S ⊆ [ italic_v ] of size t𝑡titalic_t, there is a block BC(v,k,t)𝐵𝐶𝑣𝑘𝑡B\in C(v,k,t)italic_B ∈ italic_C ( italic_v , italic_k , italic_t ) such that SB𝑆𝐵S\subseteq Bitalic_S ⊆ italic_B. Coverings are evaluated by their size, |C(v,k,t)|𝐶𝑣𝑘𝑡|C(v,k,t)|| italic_C ( italic_v , italic_k , italic_t ) |, where the smaller the better. We next describe the components of Calzone: analysis, planning and covering database.

Refer to caption
Figure 1: The Calzone L0subscript𝐿0L_{0}italic_L start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT robustness verifier.

Calzone’s analysis

Calzone begins the analysis by obtaining a covering C(v,k1,t)𝐶𝑣subscript𝑘1𝑡C(v,k_{1},t)italic_C ( italic_v , italic_k start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT , italic_t ) from its covering database, where k1subscript𝑘1k_{1}italic_k start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT is determined by the planning component (described shortly). It pushes all blocks in the covering into a stack. It then iteratively pops a block S𝑆Sitalic_S from the stack and verifies the robustness of N𝑁Nitalic_N in IS(x)subscript𝐼𝑆𝑥I_{S}(x)italic_I start_POSTSUBSCRIPT italic_S end_POSTSUBSCRIPT ( italic_x ) by running GPUPoly [25]. GPUPoly is a sound Lsubscript𝐿L_{\infty}italic_L start_POSTSUBSCRIPT ∞ end_POSTSUBSCRIPT robustness verifier which is highly scalable because it performs the analysis on a GPU. However, it relies on a linear relaxation and thus may fail proving robustness due to overapproximation errors. If it determines that IS(x)subscript𝐼𝑆𝑥I_{S}(x)italic_I start_POSTSUBSCRIPT italic_S end_POSTSUBSCRIPT ( italic_x ) is robust, Calzone continues to the next block. Otherwise, Calzone performs an exact analysis or refines the block. If |S|=t𝑆𝑡|S|=t| italic_S | = italic_t, Calzone invokes a sound and complete mixed-integer linear programming (MILP) verifier [33]. If it determines that IS(x)subscript𝐼𝑆𝑥I_{S}(x)italic_I start_POSTSUBSCRIPT italic_S end_POSTSUBSCRIPT ( italic_x ) is not robust, Calzone returns non-robust, otherwise Calzone continues to the next block. If |S|𝑆|S|| italic_S | is greater than t𝑡titalic_t, Calzone refines S𝑆Sitalic_S by pushing to the stack all blocks in a covering for S𝑆Sitalic_S and t𝑡titalic_t. The blocks’ size is ki+1subscript𝑘𝑖1k_{i+1}italic_k start_POSTSUBSCRIPT italic_i + 1 end_POSTSUBSCRIPT, which is the block size following the current block size ki=|S|subscript𝑘𝑖𝑆k_{i}=|S|italic_k start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT = | italic_S |, as determined by the planning component. The covering is obtained by retrieving from the covering database the covering C(|S|,ki+1,t)𝐶𝑆subscript𝑘𝑖1𝑡C(|S|,k_{i+1},t)italic_C ( | italic_S | , italic_k start_POSTSUBSCRIPT italic_i + 1 end_POSTSUBSCRIPT , italic_t ) and renaming the numbers in the blocks to range over the numbers in S𝑆Sitalic_S (instead of [|S|]delimited-[]𝑆[|S|][ | italic_S | ]), denoted as CS(|S|,ki+1,t)subscript𝐶𝑆𝑆subscript𝑘𝑖1𝑡C_{S}(|S|,k_{i+1},t)italic_C start_POSTSUBSCRIPT italic_S end_POSTSUBSCRIPT ( | italic_S | , italic_k start_POSTSUBSCRIPT italic_i + 1 end_POSTSUBSCRIPT , italic_t ). If Calzone observes an empty stack, it returns robust. This analysis is proven sound and complete. To scale, Calzone parallelizes the analysis over multiple GPUs (for GPUPoly) and CPUs (for the MILP verifier). Technically, the first covering is split between the GPUs, each independently analyzes its assigned blocks and refines if needed.

Calzone’s planning

The planning determines the block sizes of the first covering and of the refinements’ coverings. These are given as a K strategy, a decreasing series k1>>kmsubscript𝑘1subscript𝑘𝑚k_{1}>\ldots>k_{m}italic_k start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT > … > italic_k start_POSTSUBSCRIPT italic_m end_POSTSUBSCRIPT, where k1MAX_K=99subscript𝑘1MAX_K99k_{1}\leq\text{MAX\_K}=99italic_k start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT ≤ MAX_K = 99 and km=tsubscript𝑘𝑚𝑡k_{m}=titalic_k start_POSTSUBSCRIPT italic_m end_POSTSUBSCRIPT = italic_t. Calzone predicts the K strategy that minimizes the overall analysis time using dynamic programming, defined over the analysis time of the first covering, the average fraction of blocks that will be refined, and the analysis time of the refined blocks. This computation requires GPUPoly’s success rate and average analysis time for neighborhoods IS(x)subscript𝐼𝑆𝑥I_{S}(x)italic_I start_POSTSUBSCRIPT italic_S end_POSTSUBSCRIPT ( italic_x ), for all |S|MAX_K𝑆MAX_K|S|\leq\text{MAX\_K}| italic_S | ≤ MAX_K. These are estimated by sampling nsamples=400subscript𝑛samples400n_{\text{samples}}=400italic_n start_POSTSUBSCRIPT samples end_POSTSUBSCRIPT = 400 sets S𝑆Sitalic_S for every kMAX_K𝑘MAX_Kk\leq\text{MAX\_K}italic_k ≤ MAX_K and submitting their neighborhood IS(x)subscript𝐼𝑆𝑥I_{S}(x)italic_I start_POSTSUBSCRIPT italic_S end_POSTSUBSCRIPT ( italic_x ) to GPUPoly.

Calzone’s covering database

As described, the analysis obtains coverings from a database. This database has been populated by obtaining well-optimized coverings from online resources and extending them for large values of v𝑣vitalic_v and k𝑘kitalic_k using general covering constructions. Because of these general constructions, the database’s coverings tend to be far from the Schönheim bound [29], the best-known general lower bound, especially for large values of v𝑣vitalic_v (the image dimension). This inefficiency results in longer analysis, since more blocks are analyzed.

3 Our Approach: Covering Verification Designs

To scale Calzone’s analysis, it is crucial to reduce the number of blocks that are analyzed by GPUPoly or the MILP verifier. A dominant contributor to this number is the size of the first covering, for two reasons. First, the first covering is over a large v𝑣vitalic_v (the image dimension), thus its size is significantly larger than the sizes of coverings added upon refinement, which are over significantly smaller v𝑣vitalic_v (typically v80𝑣80v\leq 80italic_v ≤ 80 and at most vMAX_K𝑣MAX_Kv\leq\text{MAX\_K}italic_v ≤ MAX_K). Second, the first covering has an accumulative effect on the number of refinements, and consequently it dominates the analysis time. Reducing this size is theoretically possible by relying on finite geometry covering constructions [27, 1, 16], which are renowned for computing very small coverings. However, finite geometry coverings are limited to (v,k,t)𝑣𝑘𝑡(v,k,t)( italic_v , italic_k , italic_t ) triples in which v𝑣vitalic_v and k𝑘kitalic_k are defined by related mathematical expressions over t𝑡titalic_t. In Calzone’s analysis, the first covering has to be defined over a given v𝑣vitalic_v (the image dimension) and t𝑡titalic_t (the number of perturbed pixels). Thus, for some values of v𝑣vitalic_v and t𝑡titalic_t, there is no finite geometry covering. For the other values, there are very few values for k𝑘kitalic_k, leading to long analysis either because they are large and have a low success rate, triggering many refinements, or small and have very large coverings. We propose to tailor induced coverings for L0subscript𝐿0L_{0}italic_L start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT robustness analysis in order to leverage finite geometry coverings. To this end, we introduce a new type of a covering design, called a covering verification design (CVD). We next provide background on finite geometry coverings and induced coverings. We then define partially-induced coverings and our new covering type. We discuss its properties, its effectiveness, and the practical challenges in integrating it to L0subscript𝐿0L_{0}italic_L start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT verification.

Refer to caption
Figure 2: The Fano Plane.

Finite geometry coverings

Finite geometry covering constructions are widely known for obtaining small (sometimes optimal) coverings [27, 1, 16]. Popular finite geometry constructions rely on projective geometry (PG) or affine geometry (AG). We focus on PG, but our approach extends to AG. A PG construction views the problem of constructing a covering for a given (v𝑣vitalic_v, k𝑘kitalic_k, t𝑡titalic_t) from a finite geometry point of view, where v𝑣vitalic_v is the number of points in the geometry. It constructs coverings by computing flats (linear subspaces) of dimension t1𝑡1t-1italic_t - 1, each containing k𝑘kitalic_k points. Since every t𝑡titalic_t points from [v]delimited-[]𝑣[v][ italic_v ] are contained in at least one flat [27], the flats provide a covering. Figure 2 shows the Fano plane, a well-known example. In this example, there are v=7𝑣7v=7italic_v = 7 points, the flats are of dimension t1=1𝑡11t-1=1italic_t - 1 = 1 (the lines and the circle), each containing k=3𝑘3k=3italic_k = 3 points. The set of flats forms a covering, where each flat is a block: C(7,3,2)={{1,2,3},{1,4,6},{1,5,7},{2,4,7},{2,5,6},{3,4,5},{3,6,7}}𝐶732123146157247256345367C(7,3,2)=\{\{1,2,3\},\{1,4,6\},\{1,5,7\},\{2,4,7\},\{2,5,6\},\{3,4,5\},\{3,6,7\}\}italic_C ( 7 , 3 , 2 ) = { { 1 , 2 , 3 } , { 1 , 4 , 6 } , { 1 , 5 , 7 } , { 2 , 4 , 7 } , { 2 , 5 , 6 } , { 3 , 4 , 5 } , { 3 , 6 , 7 } }. PG coverings exist for triples where v=qm+11q1𝑣superscript𝑞𝑚11𝑞1v=\frac{q^{m+1}-1}{q-1}italic_v = divide start_ARG italic_q start_POSTSUPERSCRIPT italic_m + 1 end_POSTSUPERSCRIPT - 1 end_ARG start_ARG italic_q - 1 end_ARG and k=qt1q1𝑘superscript𝑞𝑡1𝑞1k=\frac{q^{t}-1}{q-1}italic_k = divide start_ARG italic_q start_POSTSUPERSCRIPT italic_t end_POSTSUPERSCRIPT - 1 end_ARG start_ARG italic_q - 1 end_ARG, for a prime power q𝑞qitalic_q and mt2𝑚𝑡2m\geq t\geq 2italic_m ≥ italic_t ≥ 2 (it also exists for m=t1𝑚𝑡1m=t-1italic_m = italic_t - 1, but then v=k𝑣𝑘v=kitalic_v = italic_k, which is unhelpful to our analysis). Because PG is restricted to such triples, Calzone cannot effectively leverage it for the first covering, whose v𝑣vitalic_v and t𝑡titalic_t are given. This is because for common image dimensions (e.g., v=784𝑣784v=784italic_v = 784 for MNIST and v=1024𝑣1024v=1024italic_v = 1024 for CIFAR-10), there are no suitable q𝑞qitalic_q and m𝑚mitalic_m. Even if there are suitable q𝑞qitalic_q and m𝑚mitalic_m, there are very few possible k𝑘kitalic_k values, which are unlikely to include or be close to an optimal value of k𝑘kitalic_k. Thus, either they are smaller than an optimal k𝑘kitalic_k, leading to larger coverings and a longer analysis time, or that they are larger than an optimal k𝑘kitalic_k and have a lower success rate, leading to many refinements, resulting, again, in a longer analysis time. For example, for v=364𝑣364v=364italic_v = 364 and t=5𝑡5t=5italic_t = 5, the only suitable values are q=3𝑞3q=3italic_q = 3 and m=5𝑚5m=5italic_m = 5 (i.e., 364=(35+11)/(31)364superscript351131364=({3^{5+1}-1})/({3-1})364 = ( 3 start_POSTSUPERSCRIPT 5 + 1 end_POSTSUPERSCRIPT - 1 ) / ( 3 - 1 )), namely there is only one triple for these values of v𝑣vitalic_v and t𝑡titalic_t. In this triple, k=(351)/(31)=121𝑘superscript35131121k=({3^{5}-1})/({3-1})=121italic_k = ( 3 start_POSTSUPERSCRIPT 5 end_POSTSUPERSCRIPT - 1 ) / ( 3 - 1 ) = 121. Since kv3𝑘𝑣3k\approx\frac{v}{3}italic_k ≈ divide start_ARG italic_v end_ARG start_ARG 3 end_ARG, neighborhoods IS(x)subscript𝐼𝑆𝑥I_{S}(x)italic_I start_POSTSUBSCRIPT italic_S end_POSTSUBSCRIPT ( italic_x ) for which |S|=121𝑆121|S|=121| italic_S | = 121 are not likely to be robust, thus such k𝑘kitalic_k is likely to have a low success rate. Induced coverings [16] enable to leverage finite geometry coverings for other (v,k,t)𝑣𝑘𝑡(v,k,t)( italic_v , italic_k , italic_t ) triples, as next explained.

Induced coverings

Given vv𝑣superscript𝑣v\leq v^{\prime}italic_v ≤ italic_v start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT and kk𝑘superscript𝑘k\leq k^{\prime}italic_k ≤ italic_k start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT, a covering C(v,k,t)𝐶superscript𝑣superscript𝑘𝑡C(v^{\prime},k^{\prime},t)italic_C ( italic_v start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT , italic_k start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT , italic_t ) can be induced to form a covering C(v,k,t)𝐶𝑣𝑘𝑡C(v,k,t)italic_C ( italic_v , italic_k , italic_t ) [16]. The induced covering is obtained in three steps. First, we select a subset of numbers of size v𝑣vitalic_v, denoted L[v]𝐿delimited-[]superscript𝑣L\subseteq[v^{\prime}]italic_L ⊆ [ italic_v start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT ], and remove every l[v]L𝑙delimited-[]superscript𝑣𝐿l\in[v^{\prime}]\setminus Litalic_l ∈ [ italic_v start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT ] ∖ italic_L from every block in C(v,k,t)𝐶superscript𝑣superscript𝑘superscript𝑡C(v^{\prime},k^{\prime},t^{\prime})italic_C ( italic_v start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT , italic_k start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT , italic_t start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT ). This results in a set of blocks of different sizes that covers all subsets of L𝐿Litalic_L of size t𝑡titalic_t [27, Lemma 1]. This follows since every subset SL𝑆𝐿S\subseteq Litalic_S ⊆ italic_L of size t𝑡titalic_t is a subset of [v]delimited-[]superscript𝑣[v^{\prime}][ italic_v start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT ] and thus there is BC(v,k,t)𝐵𝐶superscript𝑣superscript𝑘𝑡B\in C(v^{\prime},k^{\prime},t)italic_B ∈ italic_C ( italic_v start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT , italic_k start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT , italic_t ) such that SB𝑆𝐵S\subseteq Bitalic_S ⊆ italic_B. The first step removes from B𝐵Bitalic_B only numbers from [v]Ldelimited-[]superscript𝑣𝐿[v^{\prime}]\setminus L[ italic_v start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT ] ∖ italic_L and thus S𝑆Sitalic_S is contained in the respective block to B𝐵Bitalic_B after this step. The next two steps fix blocks whose size is not k𝑘kitalic_k. The second step extends every block whose size is smaller than k𝑘kitalic_k with numbers from L𝐿Litalic_L. The third step refines every block whose size is larger than k𝑘kitalic_k to multiple blocks of size k𝑘kitalic_k that cover all of its subsets of size t𝑡titalic_t. This step significantly increases the number of blocks, unless the number of blocks larger than k𝑘kitalic_k is negligible. We note that these steps provide a covering over the numbers in L𝐿Litalic_L (i.e., CL(|L|,k,t)subscript𝐶𝐿𝐿𝑘𝑡C_{L}(|L|,k,t)italic_C start_POSTSUBSCRIPT italic_L end_POSTSUBSCRIPT ( | italic_L | , italic_k , italic_t )). A covering for (|L|,k,t)𝐿𝑘𝑡(|L|,k,t)( | italic_L | , italic_k , italic_t ) can be obtained by renaming the numbers to range over [|L|]delimited-[]𝐿[|L|][ | italic_L | ].

Partially-induced covering

Our new covering design is an instance of a partially-induced covering. A partially-induced covering is the set of blocks obtained by the first step, where the blocks cover all subsets of L𝐿Litalic_L of size t𝑡titalic_t and are of different sizes. For example, for the Fano plane and L1={4,5,6,7}subscript𝐿14567L_{1}=\{4,5,6,7\}italic_L start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT = { 4 , 5 , 6 , 7 }, the partially-induced covering is: C1={{},{4,6},{5,7},{4,7},{5,6},{4,5},{6,7}}subscript𝐶1465747564567C_{1}=\{\{\},\{4,6\},\{5,7\},\{4,7\},\{5,6\},\{4,5\},\{6,7\}\}italic_C start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT = { { } , { 4 , 6 } , { 5 , 7 } , { 4 , 7 } , { 5 , 6 } , { 4 , 5 } , { 6 , 7 } }, while for L2={1,2,3,4}subscript𝐿21234L_{2}=\{1,2,3,4\}italic_L start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT = { 1 , 2 , 3 , 4 }, it is: C2={{1,2,3},{1,4},{1},{2,4},{2},{3,4},{3}}subscript𝐶2123141242343C_{2}=\{\{1,2,3\},\{1,4\},\{1\},\{2,4\},\{2\},\{3,4\},\{3\}\}italic_C start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT = { { 1 , 2 , 3 } , { 1 , 4 } , { 1 } , { 2 , 4 } , { 2 } , { 3 , 4 } , { 3 } }. Partially-induced coverings have two benefits in our setting: (1) by not extending blocks whose size is smaller than k𝑘kitalic_k, we increase the likelihood that GPUPoly will prove their robustness, and (2) by not refining blocks whose size is larger than k𝑘kitalic_k, we (a) preserve the number of blocks as in the original covering, (b) provide GPUPoly an opportunity to verify these blocks, and (c) rely on the optimal refinement sizes (computed by the dynamic programming) for blocks that GPUPoly fails proving robustness. Our covering design partially induces PG coverings, to obtain additional benefits for L0subscript𝐿0L_{0}italic_L start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT robustness verification.

Covering verification designs

Given the number of pixels v𝑣vitalic_v and the number of pixels to perturb t𝑡titalic_t, a covering verification design (CVD) is the set of blocks obtained by partially inducing a PG covering C(v,k,t)𝐶superscript𝑣superscript𝑘𝑡C(v^{\prime},k^{\prime},t)italic_C ( italic_v start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT , italic_k start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT , italic_t ), where vv𝑣superscript𝑣v\leq v^{\prime}italic_v ≤ italic_v start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT, using a random set of numbers L[v]𝐿delimited-[]superscript𝑣L\subseteq[v^{\prime}]italic_L ⊆ [ italic_v start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT ] of size v𝑣vitalic_v. The numbers in the blocks can later be renamed to range over [v]delimited-[]𝑣[v][ italic_v ]. For example, since the Fano plane is a PG covering, the partially-induced coverings C1subscript𝐶1C_{1}italic_C start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT and C2subscript𝐶2C_{2}italic_C start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT are CVDs. A CVD has two important properties. First, it is a partially-induced covering and thus has all the aforementioned advantages in our setting. In particular, its size is equal to the size of the original covering, which is highly beneficial since CVD induces from PG coverings, known for their small size. Second, although different sets L𝐿Litalic_L lead to different block size distributions, we prove that the mean block size and its variance are the same regardless of the choice of L𝐿Litalic_L. Further, we identify closed-form expressions for them and show that the variance is bounded by the mean. For example, although the block size distributions of C1subscript𝐶1C_{1}italic_C start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT and C2subscript𝐶2C_{2}italic_C start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT are different, they have the same average block size (127127\frac{12}{7}divide start_ARG 12 end_ARG start_ARG 7 end_ARG) and the same variance (24492449\frac{24}{49}divide start_ARG 24 end_ARG start_ARG 49 end_ARG). This property has practical advantages: (1) it allows us to estimate the block size distribution (Section 4.2), and (2) since the variance is bounded by the mean, the smaller the mean block size, the less likely that there are very large blocks, which are less likely to be proven robust by GPUPoly. To prove this property, we rely on the fact that PG coverings (and AG coverings) are also a combinatorial design called a balanced incomplete block design (BIBD) [7, Part VII, Proposition 2.36]. We next describe BIBD and then state our theorem on its mean and variance.

BIBD

Given positive integers (v,b,r,k,λ)𝑣𝑏𝑟𝑘𝜆(v,b,r,k,\lambda)( italic_v , italic_b , italic_r , italic_k , italic_λ ), a BIBD is a set of b𝑏bitalic_b blocks, each is a subset of [v]delimited-[]𝑣[v][ italic_v ] of size k𝑘kitalic_k, such that every i[v]𝑖delimited-[]𝑣i\in[v]italic_i ∈ [ italic_v ] appears in r𝑟ritalic_r blocks and every ij[v]𝑖𝑗delimited-[]𝑣i\neq j\in[v]italic_i ≠ italic_j ∈ [ italic_v ] appear together in λ𝜆\lambdaitalic_λ blocks. For example, the Fano plane is a BIBD with v=7,b=7,r=3,k=3,λ=1formulae-sequence𝑣7formulae-sequence𝑏7formulae-sequence𝑟3formulae-sequence𝑘3𝜆1v=7,b=7,r=3,k=3,\lambda=1italic_v = 7 , italic_b = 7 , italic_r = 3 , italic_k = 3 , italic_λ = 1. This is because it has b=7𝑏7b=7italic_b = 7 blocks, each block is a subset of [v]={1,,7}delimited-[]𝑣17[v]=\{1,\dots,7\}[ italic_v ] = { 1 , … , 7 } of size k=3𝑘3k=3italic_k = 3, every number in {1,,7}17\{1,\dots,7\}{ 1 , … , 7 } appears in r=3𝑟3r=3italic_r = 3 blocks and every two different numbers appear together in λ=1𝜆1\lambda=1italic_λ = 1 block. Given a BIBD with parameters (v,b,r,k,λ)superscript𝑣𝑏𝑟superscript𝑘𝜆(v^{\prime},b,r,k^{\prime},\lambda)( italic_v start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT , italic_b , italic_r , italic_k start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT , italic_λ ), we define a partially-induced BIBD for vv𝑣superscript𝑣v\leq v^{\prime}italic_v ≤ italic_v start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT by selecting a subset of numbers L[v]𝐿delimited-[]superscript𝑣L\subseteq[v^{\prime}]italic_L ⊆ [ italic_v start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT ] of size v𝑣vitalic_v and removing every l[v]L𝑙delimited-[]superscript𝑣𝐿l\in[v^{\prime}]\setminus Litalic_l ∈ [ italic_v start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT ] ∖ italic_L from every block in the BIBD (empty blocks or repetitive blocks are kept). While the distribution of the induced blocks’ sizes depends on L𝐿Litalic_L, the mean block size and its variance depend only on v,v,k𝑣superscript𝑣superscript𝑘v,v^{\prime},k^{\prime}italic_v , italic_v start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT , italic_k start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT.

Theorem 3.1

Given a (v,b,r,k,λ)superscript𝑣𝑏𝑟superscript𝑘𝜆(v^{\prime},b,r,k^{\prime},\lambda)( italic_v start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT , italic_b , italic_r , italic_k start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT , italic_λ )-BIBD, for v>1superscript𝑣1v^{\prime}>1italic_v start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT > 1, and 1vv1𝑣superscript𝑣1\leq v\leq v^{\prime}1 ≤ italic_v ≤ italic_v start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT, for every L[v]𝐿delimited-[]superscript𝑣L\subseteq[v^{\prime}]italic_L ⊆ [ italic_v start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT ] of size v𝑣vitalic_v, the mean μv,k,vsubscript𝜇superscript𝑣superscript𝑘𝑣\mu_{v^{\prime},k^{\prime},v}italic_μ start_POSTSUBSCRIPT italic_v start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT , italic_k start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT , italic_v end_POSTSUBSCRIPT and variance σv,k,v2superscriptsubscript𝜎superscript𝑣superscript𝑘𝑣2\sigma_{v^{\prime},k^{\prime},v}^{2}italic_σ start_POSTSUBSCRIPT italic_v start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT , italic_k start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT , italic_v end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT of the block sizes in the partially-induced BIBD satisfy:

  1. 1.

    μv,k,v=vkvsubscript𝜇superscript𝑣superscript𝑘𝑣𝑣superscript𝑘superscript𝑣\mu_{v^{\prime},k^{\prime},v}=\frac{vk^{\prime}}{v^{\prime}}italic_μ start_POSTSUBSCRIPT italic_v start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT , italic_k start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT , italic_v end_POSTSUBSCRIPT = divide start_ARG italic_v italic_k start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT end_ARG start_ARG italic_v start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT end_ARG

  2. 2.

    σv,k,v2=μv,k,v(1+(v1)(k1)v1μv,k,v)=vkv(1+(v1)(k1)v1vkv)superscriptsubscript𝜎superscript𝑣superscript𝑘𝑣2subscript𝜇superscript𝑣superscript𝑘𝑣1𝑣1superscript𝑘1superscript𝑣1subscript𝜇superscript𝑣superscript𝑘𝑣𝑣superscript𝑘superscript𝑣1𝑣1superscript𝑘1superscript𝑣1𝑣superscript𝑘superscript𝑣\sigma_{v^{\prime},k^{\prime},v}^{2}=\mu_{v^{\prime},k^{\prime},v}\left(1+% \frac{(v-1)(k^{\prime}-1)}{v^{\prime}-1}-\mu_{v^{\prime},k^{\prime},v}\right)=% \frac{vk^{\prime}}{v^{\prime}}\left(1+\frac{(v-1)(k^{\prime}-1)}{v^{\prime}-1}% -\frac{vk^{\prime}}{v^{\prime}}\right)italic_σ start_POSTSUBSCRIPT italic_v start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT , italic_k start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT , italic_v end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT = italic_μ start_POSTSUBSCRIPT italic_v start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT , italic_k start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT , italic_v end_POSTSUBSCRIPT ( 1 + divide start_ARG ( italic_v - 1 ) ( italic_k start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT - 1 ) end_ARG start_ARG italic_v start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT - 1 end_ARG - italic_μ start_POSTSUBSCRIPT italic_v start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT , italic_k start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT , italic_v end_POSTSUBSCRIPT ) = divide start_ARG italic_v italic_k start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT end_ARG start_ARG italic_v start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT end_ARG ( 1 + divide start_ARG ( italic_v - 1 ) ( italic_k start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT - 1 ) end_ARG start_ARG italic_v start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT - 1 end_ARG - divide start_ARG italic_v italic_k start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT end_ARG start_ARG italic_v start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT end_ARG )

  3. 3.

    σv,k,v2μv,k,vsuperscriptsubscript𝜎superscript𝑣superscript𝑘𝑣2subscript𝜇superscript𝑣superscript𝑘𝑣\sigma_{v^{\prime},k^{\prime},v}^{2}\leq\mu_{v^{\prime},k^{\prime},v}italic_σ start_POSTSUBSCRIPT italic_v start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT , italic_k start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT , italic_v end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT ≤ italic_μ start_POSTSUBSCRIPT italic_v start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT , italic_k start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT , italic_v end_POSTSUBSCRIPT

Proof

1. We prove μv,k,v=vkvsubscript𝜇superscript𝑣superscript𝑘𝑣𝑣superscript𝑘superscript𝑣\mu_{v^{\prime},k^{\prime},v}=\frac{vk^{\prime}}{v^{\prime}}italic_μ start_POSTSUBSCRIPT italic_v start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT , italic_k start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT , italic_v end_POSTSUBSCRIPT = divide start_ARG italic_v italic_k start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT end_ARG start_ARG italic_v start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT end_ARG. Since |L|=v𝐿𝑣|L|=v| italic_L | = italic_v and r𝑟ritalic_r is the number of occurrences of every number in all blocks, the sum of the sizes of the induced blocks is vr𝑣𝑟vritalic_v italic_r. By counting arguments, for a BIBD it holds that rv=bk𝑟superscript𝑣𝑏superscript𝑘rv^{\prime}=bk^{\prime}italic_r italic_v start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT = italic_b italic_k start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT [7, Part II, Proposition 1.2], and so r=bkv𝑟𝑏superscript𝑘superscript𝑣r=\frac{bk^{\prime}}{v^{\prime}}italic_r = divide start_ARG italic_b italic_k start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT end_ARG start_ARG italic_v start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT end_ARG. That is, the sum of the induced blocks’ sizes is vbkv𝑣𝑏superscript𝑘superscript𝑣\frac{vbk^{\prime}}{v^{\prime}}divide start_ARG italic_v italic_b italic_k start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT end_ARG start_ARG italic_v start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT end_ARG. The mean is obtained by dividing by the number of blocks b𝑏bitalic_b: μv,k,v=vkvsubscript𝜇superscript𝑣superscript𝑘𝑣𝑣superscript𝑘superscript𝑣\mu_{v^{\prime},k^{\prime},v}=\frac{vk^{\prime}}{v^{\prime}}italic_μ start_POSTSUBSCRIPT italic_v start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT , italic_k start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT , italic_v end_POSTSUBSCRIPT = divide start_ARG italic_v italic_k start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT end_ARG start_ARG italic_v start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT end_ARG.

2. We prove σv,k,v2=μv,k,v(1+(v1)(k1)v1μv,k,v)superscriptsubscript𝜎superscript𝑣superscript𝑘𝑣2subscript𝜇superscript𝑣superscript𝑘𝑣1𝑣1superscript𝑘1superscript𝑣1subscript𝜇superscript𝑣superscript𝑘𝑣\sigma_{v^{\prime},k^{\prime},v}^{2}=\mu_{v^{\prime},k^{\prime},v}\left(1+% \frac{(v-1)(k^{\prime}-1)}{v^{\prime}-1}-\mu_{v^{\prime},k^{\prime},v}\right)italic_σ start_POSTSUBSCRIPT italic_v start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT , italic_k start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT , italic_v end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT = italic_μ start_POSTSUBSCRIPT italic_v start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT , italic_k start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT , italic_v end_POSTSUBSCRIPT ( 1 + divide start_ARG ( italic_v - 1 ) ( italic_k start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT - 1 ) end_ARG start_ARG italic_v start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT - 1 end_ARG - italic_μ start_POSTSUBSCRIPT italic_v start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT , italic_k start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT , italic_v end_POSTSUBSCRIPT ).
Let Z0b𝑍superscriptsubscript0𝑏Z\in{\mathbb{N}_{0}}^{b}italic_Z ∈ blackboard_N start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_b end_POSTSUPERSCRIPT be a vector such that, for every n[b]𝑛delimited-[]𝑏n\in[b]italic_n ∈ [ italic_b ], Znsubscript𝑍𝑛Z_{n}italic_Z start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT is the size of block n𝑛nitalic_n in the partially-induced BIBD. It can be written as Z=ATxL𝑍superscript𝐴𝑇subscript𝑥𝐿Z=A^{T}x_{L}italic_Z = italic_A start_POSTSUPERSCRIPT italic_T end_POSTSUPERSCRIPT italic_x start_POSTSUBSCRIPT italic_L end_POSTSUBSCRIPT, where A𝐴Aitalic_A represents the BIBD and xLsubscript𝑥𝐿x_{L}italic_x start_POSTSUBSCRIPT italic_L end_POSTSUBSCRIPT the set L𝐿Litalic_L, used for partially inducing the BIBD. The matrix A𝐴Aitalic_A is a v×bsuperscript𝑣𝑏v^{\prime}\times bitalic_v start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT × italic_b incidence matrix, where A[m,n]=1𝐴𝑚𝑛1A[m,n]=1italic_A [ italic_m , italic_n ] = 1 if m𝑚mitalic_m is in block n𝑛nitalic_n and A[m,n]=0𝐴𝑚𝑛0A[m,n]=0italic_A [ italic_m , italic_n ] = 0 otherwise. The vector xLsubscript𝑥𝐿x_{L}italic_x start_POSTSUBSCRIPT italic_L end_POSTSUBSCRIPT is a vsuperscript𝑣v^{\prime}italic_v start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT-dimensional vector, where xL[m]=1subscript𝑥𝐿delimited-[]𝑚1x_{L}[m]=1italic_x start_POSTSUBSCRIPT italic_L end_POSTSUBSCRIPT [ italic_m ] = 1 if mL𝑚𝐿m\in Litalic_m ∈ italic_L and xL[m]=0subscript𝑥𝐿delimited-[]𝑚0x_{L}[m]=0italic_x start_POSTSUBSCRIPT italic_L end_POSTSUBSCRIPT [ italic_m ] = 0 otherwise. Thus, the average of the squares of the block sizes, denoted 𝔼[Z2]𝔼delimited-[]superscript𝑍2\mathbb{E}[Z^{2}]blackboard_E [ italic_Z start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT ], is 𝔼[Z2]=1b(n=1b(ATxL)n2)=1bATxL22𝔼delimited-[]superscript𝑍21𝑏superscriptsubscript𝑛1𝑏superscriptsubscriptsuperscript𝐴𝑇subscript𝑥𝐿𝑛21𝑏superscriptsubscriptnormsuperscript𝐴𝑇subscript𝑥𝐿22\mathbb{E}[Z^{2}]=\frac{1}{b}\left(\sum_{n=1}^{b}(A^{T}x_{L})_{n}^{2}\right)=% \frac{1}{b}\|A^{T}x_{L}\|_{2}^{2}blackboard_E [ italic_Z start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT ] = divide start_ARG 1 end_ARG start_ARG italic_b end_ARG ( ∑ start_POSTSUBSCRIPT italic_n = 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_b end_POSTSUPERSCRIPT ( italic_A start_POSTSUPERSCRIPT italic_T end_POSTSUPERSCRIPT italic_x start_POSTSUBSCRIPT italic_L end_POSTSUBSCRIPT ) start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT ) = divide start_ARG 1 end_ARG start_ARG italic_b end_ARG ∥ italic_A start_POSTSUPERSCRIPT italic_T end_POSTSUPERSCRIPT italic_x start_POSTSUBSCRIPT italic_L end_POSTSUBSCRIPT ∥ start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT (3).
By the variance definition, σv,k,v2=𝔼[Z2]μv,k,v2superscriptsubscript𝜎superscript𝑣superscript𝑘𝑣2𝔼delimited-[]superscript𝑍2superscriptsubscript𝜇superscript𝑣superscript𝑘𝑣2\sigma_{v^{\prime},k^{\prime},v}^{2}=\mathbb{E}[Z^{2}]-\mu_{v^{\prime},k^{% \prime},v}^{2}italic_σ start_POSTSUBSCRIPT italic_v start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT , italic_k start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT , italic_v end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT = blackboard_E [ italic_Z start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT ] - italic_μ start_POSTSUBSCRIPT italic_v start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT , italic_k start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT , italic_v end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT. Thus, we need to show: 𝔼[Z2]=μv,k,v(1+(v1)(k1)v1)=vkv(1+(v1)(k1)v1)=kvv+k(k1)v(v1)v(v1)𝔼delimited-[]superscript𝑍2subscript𝜇superscript𝑣superscript𝑘𝑣1𝑣1superscript𝑘1superscript𝑣1𝑣superscript𝑘superscript𝑣1𝑣1superscript𝑘1superscript𝑣1superscript𝑘superscript𝑣𝑣superscript𝑘superscript𝑘1superscript𝑣superscript𝑣1𝑣𝑣1\mathbb{E}[Z^{2}]=\mu_{v^{\prime},k^{\prime},v}(1+\frac{(v-1)(k^{\prime}-1)}{v% ^{\prime}-1})=\frac{vk^{\prime}}{v^{\prime}}(1+\frac{(v-1)(k^{\prime}-1)}{v^{% \prime}-1})=\frac{k^{\prime}}{v^{\prime}}v+\frac{k^{\prime}(k^{\prime}-1)}{v^{% \prime}(v^{\prime}-1)}v(v-1)blackboard_E [ italic_Z start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT ] = italic_μ start_POSTSUBSCRIPT italic_v start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT , italic_k start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT , italic_v end_POSTSUBSCRIPT ( 1 + divide start_ARG ( italic_v - 1 ) ( italic_k start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT - 1 ) end_ARG start_ARG italic_v start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT - 1 end_ARG ) = divide start_ARG italic_v italic_k start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT end_ARG start_ARG italic_v start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT end_ARG ( 1 + divide start_ARG ( italic_v - 1 ) ( italic_k start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT - 1 ) end_ARG start_ARG italic_v start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT - 1 end_ARG ) = divide start_ARG italic_k start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT end_ARG start_ARG italic_v start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT end_ARG italic_v + divide start_ARG italic_k start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT ( italic_k start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT - 1 ) end_ARG start_ARG italic_v start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT ( italic_v start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT - 1 ) end_ARG italic_v ( italic_v - 1 ). By counting arguments [7], we have kv=rbsuperscript𝑘superscript𝑣𝑟𝑏\frac{k^{\prime}}{v^{\prime}}=\frac{r}{b}divide start_ARG italic_k start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT end_ARG start_ARG italic_v start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT end_ARG = divide start_ARG italic_r end_ARG start_ARG italic_b end_ARG and k(k1)v(v1)=λbsuperscript𝑘superscript𝑘1superscript𝑣superscript𝑣1𝜆𝑏\frac{k^{\prime}(k^{\prime}-1)}{v^{\prime}(v^{\prime}-1)}=\frac{\lambda}{b}divide start_ARG italic_k start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT ( italic_k start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT - 1 ) end_ARG start_ARG italic_v start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT ( italic_v start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT - 1 ) end_ARG = divide start_ARG italic_λ end_ARG start_ARG italic_b end_ARG. Namely, it suffices to show: 𝔼[Z2]=1b(rv+λv(v1))𝔼delimited-[]superscript𝑍21𝑏𝑟𝑣𝜆𝑣𝑣1\mathbb{E}[Z^{2}]=\frac{1}{b}\left(rv+\lambda v(v-1)\right)blackboard_E [ italic_Z start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT ] = divide start_ARG 1 end_ARG start_ARG italic_b end_ARG ( italic_r italic_v + italic_λ italic_v ( italic_v - 1 ) ). By (3), we can show ATxL22=rv+λv(v1)superscriptsubscriptnormsuperscript𝐴𝑇subscript𝑥𝐿22𝑟𝑣𝜆𝑣𝑣1\|A^{T}x_{L}\|_{2}^{2}=rv+\lambda v(v-1)∥ italic_A start_POSTSUPERSCRIPT italic_T end_POSTSUPERSCRIPT italic_x start_POSTSUBSCRIPT italic_L end_POSTSUBSCRIPT ∥ start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT = italic_r italic_v + italic_λ italic_v ( italic_v - 1 ). We prove by induction on v=|L|𝑣𝐿v=|L|italic_v = | italic_L | that ATxL22=rv+λv(v1)superscriptsubscriptnormsuperscript𝐴𝑇subscript𝑥𝐿22𝑟𝑣𝜆𝑣𝑣1\|A^{T}x_{L}\|_{2}^{2}=rv+\lambda v(v-1)∥ italic_A start_POSTSUPERSCRIPT italic_T end_POSTSUPERSCRIPT italic_x start_POSTSUBSCRIPT italic_L end_POSTSUBSCRIPT ∥ start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT = italic_r italic_v + italic_λ italic_v ( italic_v - 1 ):
Base For v=1𝑣1v=1italic_v = 1, we show ATxL22=r1+λ10superscriptsubscriptnormsuperscript𝐴𝑇subscript𝑥𝐿22𝑟1𝜆10\|A^{T}x_{L}\|_{2}^{2}=r\cdot 1+\lambda\cdot 1\cdot 0∥ italic_A start_POSTSUPERSCRIPT italic_T end_POSTSUPERSCRIPT italic_x start_POSTSUBSCRIPT italic_L end_POSTSUBSCRIPT ∥ start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT = italic_r ⋅ 1 + italic_λ ⋅ 1 ⋅ 0: Since v=|L|=1𝑣𝐿1v=|L|=1italic_v = | italic_L | = 1, by definition of a BIBD, the vector of the induced blocks’ sizes Z𝑍Zitalic_Z has r𝑟ritalic_r ones and the rest are zeros. Thus, Z22=rsuperscriptsubscriptnorm𝑍22𝑟\|Z\|_{2}^{2}=r∥ italic_Z ∥ start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT = italic_r. Since Z=ATxL𝑍superscript𝐴𝑇subscript𝑥𝐿Z=A^{T}x_{L}italic_Z = italic_A start_POSTSUPERSCRIPT italic_T end_POSTSUPERSCRIPT italic_x start_POSTSUBSCRIPT italic_L end_POSTSUBSCRIPT, the claim follows.
Induction hypothesis Assume that the claim holds for every 1,,v1𝑣1,\ldots,v1 , … , italic_v.
Step Let L[v]𝐿delimited-[]superscript𝑣L\subseteq[v^{\prime}]italic_L ⊆ [ italic_v start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT ] such that |L|=v+1𝐿𝑣1|L|=v+1| italic_L | = italic_v + 1. Pick some iL𝑖𝐿i\in Litalic_i ∈ italic_L and define L=L{i}superscript𝐿𝐿𝑖L^{\prime}=L\setminus\{i\}italic_L start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT = italic_L ∖ { italic_i } of size v𝑣vitalic_v. We get xL=xL+eisubscript𝑥𝐿subscript𝑥superscript𝐿subscript𝑒𝑖x_{L}=x_{L^{\prime}}+e_{i}italic_x start_POSTSUBSCRIPT italic_L end_POSTSUBSCRIPT = italic_x start_POSTSUBSCRIPT italic_L start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT end_POSTSUBSCRIPT + italic_e start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT, where eisubscript𝑒𝑖e_{i}italic_e start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT is the ithsuperscript𝑖thi^{\text{th}}italic_i start_POSTSUPERSCRIPT th end_POSTSUPERSCRIPT standard unit vector. Thus:

ATxL22=AT(xL+ei)22=ATxL22+ATei22+2ATxL,ATeisuperscriptsubscriptnormsuperscript𝐴𝑇subscript𝑥𝐿22superscriptsubscriptnormsuperscript𝐴𝑇subscript𝑥superscript𝐿subscript𝑒𝑖22superscriptsubscriptnormsuperscript𝐴𝑇subscript𝑥superscript𝐿22superscriptsubscriptnormsuperscript𝐴𝑇subscript𝑒𝑖222superscript𝐴𝑇subscript𝑥superscript𝐿superscript𝐴𝑇subscript𝑒𝑖\|A^{T}x_{L}\|_{2}^{2}=\|A^{T}(x_{L^{\prime}}+e_{i})\|_{2}^{2}=\|A^{T}x_{L^{% \prime}}\|_{2}^{2}+\|A^{T}e_{i}\|_{2}^{2}+2\left\langle A^{T}x_{L^{\prime}},A^% {T}e_{i}\right\rangle∥ italic_A start_POSTSUPERSCRIPT italic_T end_POSTSUPERSCRIPT italic_x start_POSTSUBSCRIPT italic_L end_POSTSUBSCRIPT ∥ start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT = ∥ italic_A start_POSTSUPERSCRIPT italic_T end_POSTSUPERSCRIPT ( italic_x start_POSTSUBSCRIPT italic_L start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT end_POSTSUBSCRIPT + italic_e start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ) ∥ start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT = ∥ italic_A start_POSTSUPERSCRIPT italic_T end_POSTSUPERSCRIPT italic_x start_POSTSUBSCRIPT italic_L start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT end_POSTSUBSCRIPT ∥ start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT + ∥ italic_A start_POSTSUPERSCRIPT italic_T end_POSTSUPERSCRIPT italic_e start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ∥ start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT + 2 ⟨ italic_A start_POSTSUPERSCRIPT italic_T end_POSTSUPERSCRIPT italic_x start_POSTSUBSCRIPT italic_L start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT end_POSTSUBSCRIPT , italic_A start_POSTSUPERSCRIPT italic_T end_POSTSUPERSCRIPT italic_e start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ⟩
  • By the induction hypothesis, ATxL22=rv+λv(v1)superscriptsubscriptnormsuperscript𝐴𝑇subscript𝑥superscript𝐿22𝑟𝑣𝜆𝑣𝑣1\|A^{T}x_{L^{\prime}}\|_{2}^{2}=rv+\lambda v(v-1)∥ italic_A start_POSTSUPERSCRIPT italic_T end_POSTSUPERSCRIPT italic_x start_POSTSUBSCRIPT italic_L start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT end_POSTSUBSCRIPT ∥ start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT = italic_r italic_v + italic_λ italic_v ( italic_v - 1 ).

  • Since eisubscript𝑒𝑖e_{i}italic_e start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT can be viewed as xL′′subscript𝑥superscript𝐿′′x_{L^{\prime\prime}}italic_x start_POSTSUBSCRIPT italic_L start_POSTSUPERSCRIPT ′ ′ end_POSTSUPERSCRIPT end_POSTSUBSCRIPT for some L′′superscript𝐿′′L^{\prime\prime}italic_L start_POSTSUPERSCRIPT ′ ′ end_POSTSUPERSCRIPT of size 1, we get ATei22=rsuperscriptsubscriptnormsuperscript𝐴𝑇subscript𝑒𝑖22𝑟\|A^{T}e_{i}\|_{2}^{2}=r∥ italic_A start_POSTSUPERSCRIPT italic_T end_POSTSUPERSCRIPT italic_e start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ∥ start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT = italic_r.

  • We show ATxL,ATei=xLT(AAT)ei=λvsuperscript𝐴𝑇subscript𝑥superscript𝐿superscript𝐴𝑇subscript𝑒𝑖superscriptsubscript𝑥superscript𝐿𝑇𝐴superscript𝐴𝑇subscript𝑒𝑖𝜆𝑣\left\langle A^{T}x_{L^{\prime}},A^{T}e_{i}\right\rangle=x_{L^{\prime}}^{T}% \left(AA^{T}\right)e_{i}=\lambda v⟨ italic_A start_POSTSUPERSCRIPT italic_T end_POSTSUPERSCRIPT italic_x start_POSTSUBSCRIPT italic_L start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT end_POSTSUBSCRIPT , italic_A start_POSTSUPERSCRIPT italic_T end_POSTSUPERSCRIPT italic_e start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ⟩ = italic_x start_POSTSUBSCRIPT italic_L start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_T end_POSTSUPERSCRIPT ( italic_A italic_A start_POSTSUPERSCRIPT italic_T end_POSTSUPERSCRIPT ) italic_e start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT = italic_λ italic_v: Since A𝐴Aitalic_A is an incidence matrix of a BIBD, AAT𝐴superscript𝐴𝑇AA^{T}italic_A italic_A start_POSTSUPERSCRIPT italic_T end_POSTSUPERSCRIPT is the matrix with r𝑟ritalic_r on the diagonal and λ𝜆\lambdaitalic_λ elsewhere [7, Part II, Theorem 1.8]. Therefore, (AAT)ei𝐴superscript𝐴𝑇subscript𝑒𝑖\left(AA^{T}\right)e_{i}( italic_A italic_A start_POSTSUPERSCRIPT italic_T end_POSTSUPERSCRIPT ) italic_e start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT is a vector whose entries are λ𝜆\lambdaitalic_λ except for the ithsuperscript𝑖thi^{\text{th}}italic_i start_POSTSUPERSCRIPT th end_POSTSUPERSCRIPT entry which is r𝑟ritalic_r. The vector xLsubscript𝑥superscript𝐿x_{L^{\prime}}italic_x start_POSTSUBSCRIPT italic_L start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT end_POSTSUBSCRIPT has v𝑣vitalic_v ones and 00 on the ithsuperscript𝑖thi^{\text{th}}italic_i start_POSTSUPERSCRIPT th end_POSTSUPERSCRIPT entry (since iL𝑖superscript𝐿i\notin L^{\prime}italic_i ∉ italic_L start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT). Thus, their dot product is xLT(AAT)ei=λvsuperscriptsubscript𝑥superscript𝐿𝑇𝐴superscript𝐴𝑇subscript𝑒𝑖𝜆𝑣x_{L^{\prime}}^{T}\left(AA^{T}\right)e_{i}=\lambda vitalic_x start_POSTSUBSCRIPT italic_L start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_T end_POSTSUPERSCRIPT ( italic_A italic_A start_POSTSUPERSCRIPT italic_T end_POSTSUPERSCRIPT ) italic_e start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT = italic_λ italic_v.

Putting it all together: ATxL22=rv+λv(v1)+r+2λv=r(v+1)+λ(v+1)vsuperscriptsubscriptnormsuperscript𝐴𝑇subscript𝑥𝐿22𝑟𝑣𝜆𝑣𝑣1𝑟2𝜆𝑣𝑟𝑣1𝜆𝑣1𝑣\|A^{T}x_{L}\|_{2}^{2}=rv+\lambda v(v-1)+r+2\lambda v=r(v+1)+\lambda(v+1)v∥ italic_A start_POSTSUPERSCRIPT italic_T end_POSTSUPERSCRIPT italic_x start_POSTSUBSCRIPT italic_L end_POSTSUBSCRIPT ∥ start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT = italic_r italic_v + italic_λ italic_v ( italic_v - 1 ) + italic_r + 2 italic_λ italic_v = italic_r ( italic_v + 1 ) + italic_λ ( italic_v + 1 ) italic_v.

3. We show σv,k,v2μv,k,vsuperscriptsubscript𝜎superscript𝑣superscript𝑘𝑣2subscript𝜇superscript𝑣superscript𝑘𝑣\sigma_{v^{\prime},k^{\prime},v}^{2}\leq\mu_{v^{\prime},k^{\prime},v}italic_σ start_POSTSUBSCRIPT italic_v start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT , italic_k start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT , italic_v end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT ≤ italic_μ start_POSTSUBSCRIPT italic_v start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT , italic_k start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT , italic_v end_POSTSUBSCRIPT by showing that 1+(v1)(k1)v1μv,k,v11𝑣1superscript𝑘1superscript𝑣1subscript𝜇superscript𝑣superscript𝑘𝑣11+\frac{(v-1)(k^{\prime}-1)}{v^{\prime}-1}-\mu_{v^{\prime},k^{\prime},v}\leq 11 + divide start_ARG ( italic_v - 1 ) ( italic_k start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT - 1 ) end_ARG start_ARG italic_v start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT - 1 end_ARG - italic_μ start_POSTSUBSCRIPT italic_v start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT , italic_k start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT , italic_v end_POSTSUBSCRIPT ≤ 1. Since μv,k,v=vkvsubscript𝜇superscript𝑣superscript𝑘𝑣𝑣superscript𝑘superscript𝑣\mu_{v^{\prime},k^{\prime},v}=\frac{vk^{\prime}}{v^{\prime}}italic_μ start_POSTSUBSCRIPT italic_v start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT , italic_k start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT , italic_v end_POSTSUBSCRIPT = divide start_ARG italic_v italic_k start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT end_ARG start_ARG italic_v start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT end_ARG, we show (v1)(k1)v1vkv𝑣1superscript𝑘1superscript𝑣1𝑣superscript𝑘superscript𝑣\frac{(v-1)(k^{\prime}-1)}{v^{\prime}-1}\leq\frac{vk^{\prime}}{v^{\prime}}divide start_ARG ( italic_v - 1 ) ( italic_k start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT - 1 ) end_ARG start_ARG italic_v start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT - 1 end_ARG ≤ divide start_ARG italic_v italic_k start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT end_ARG start_ARG italic_v start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT end_ARG. We have 1vv1𝑣superscript𝑣1\leq v\leq v^{\prime}1 ≤ italic_v ≤ italic_v start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT and 1<v1superscript𝑣1<v^{\prime}1 < italic_v start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT, thus we get v1v1vv𝑣1superscript𝑣1𝑣superscript𝑣\frac{v-1}{v^{\prime}-1}\leq\frac{v}{v^{\prime}}divide start_ARG italic_v - 1 end_ARG start_ARG italic_v start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT - 1 end_ARG ≤ divide start_ARG italic_v end_ARG start_ARG italic_v start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT end_ARG. Since k10superscript𝑘10k^{\prime}-1\geq 0italic_k start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT - 1 ≥ 0, we get (v1)(k1)v1v(k1)vvkv𝑣1superscript𝑘1superscript𝑣1𝑣superscript𝑘1superscript𝑣𝑣superscript𝑘superscript𝑣\frac{(v-1)(k^{\prime}-1)}{v^{\prime}-1}\leq\frac{v(k^{\prime}-1)}{v^{\prime}}% \leq\frac{vk^{\prime}}{v^{\prime}}divide start_ARG ( italic_v - 1 ) ( italic_k start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT - 1 ) end_ARG start_ARG italic_v start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT - 1 end_ARG ≤ divide start_ARG italic_v ( italic_k start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT - 1 ) end_ARG start_ARG italic_v start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT end_ARG ≤ divide start_ARG italic_v italic_k start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT end_ARG start_ARG italic_v start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT end_ARG. ∎

Refer to caption
(a) The plot for t=4𝑡4t=4italic_t = 4.
Refer to caption
(b) The plot for t=5𝑡5t=5italic_t = 5.
Figure 3: The ratio of CVD sizes and their respective Schönheim bound vs. the ratio of Calzone’s covering sizes and their Schönheim bound. The black line is ratio 1111, i.e., coverings whose sizes are equal to the respective Schönheim bound.

Size of covering verification designs

CVDs enable us to obtain coverings whose sizes are small, often close or better than their respective Schönheim bound. Given a CVD whose mean block size is a real number k𝑘kitalic_k, we define its respective Schönheim bound as the bound for the covering design of (v,k,t)𝑣𝑘𝑡(v,\lceil{k}\rceil,t)( italic_v , ⌈ italic_k ⌉ , italic_t ). Note that this bound is not a lower bound on the size of the CVD, since the CVD can have blocks larger than k𝑘\lceil{k}\rceil⌈ italic_k ⌉ and thereby be smaller than covering designs for (v,k,t)𝑣𝑘𝑡(v,\lceil{k}\rceil,t)( italic_v , ⌈ italic_k ⌉ , italic_t ). Still, comparing to this bound enables understanding how much smaller our coverings are compared to the coverings considered by Calzone, whose sizes are lower bounded by the Schönheim bound. Figure 3 shows the ratio of the sizes of our CVDs and their respective Schönheim bound and the ratio of the sizes of Calzone’s covering designs and their Schönheim bound. The comparison is for v=784𝑣784v=784italic_v = 784 and t=4𝑡4t=4italic_t = 4 (Figure 3(a)) and t=5𝑡5t=5italic_t = 5 (Figure 3(b)). We compute CVDs from different PG coverings and the figure shows CVDs whose mean block size k𝑘kitalic_k is at least 10101010. For Calzone, we show all coverings in its database. The plots demonstrate that typically the size of a CVD is smaller or equal to the Schönheim bound, and on average, the ratio is 0.920.920.920.92 for t=4𝑡4t=4italic_t = 4 and 0.850.850.850.85 for t=5𝑡5t=5italic_t = 5. In contrast, Calzone’s coverings are significantly larger than the Schönheim bound, on average the ratio is 4.044.044.044.04 for t=4𝑡4t=4italic_t = 4 and 8.448.448.448.44 for t=5𝑡5t=5italic_t = 5. The plots also show that Calzone has many more coverings than the number of CVDs. This is because Calzone relies on general techniques to compute coverings and thus it can generate a covering for every kMAX_K=99𝑘MAX_K99k\leq\text{MAX\_K}=99italic_k ≤ MAX_K = 99 (except that it is limited to coverings with at most 107superscript10710^{7}10 start_POSTSUPERSCRIPT 7 end_POSTSUPERSCRIPT blocks). In contrast, our CVDs induce PG coverings and are thus limited to coverings whose mean block size is given by the expression given in Theorem 3.1, over vsuperscript𝑣v^{\prime}italic_v start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT and ksuperscript𝑘k^{\prime}italic_k start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT such that there is a PG covering for (v,k,t)superscript𝑣superscript𝑘𝑡(v^{\prime},k^{\prime},t)( italic_v start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT , italic_k start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT , italic_t ).

Challenge: memory consumption

The main challenge in computing CVDs is that it requires to compute a PG covering for large values of vsuperscript𝑣v^{\prime}italic_v start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT and ksuperscript𝑘k^{\prime}italic_k start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT, which poses a high memory overhead. To illustrate, in our experiments, CoVerD uses a CVD induced from a PG covering for (v=1508598,k=88741,t=5)formulae-sequencesuperscript𝑣1508598formulae-sequencesuperscript𝑘88741𝑡5(v^{\prime}=1508598,k^{\prime}=88741,t=5)( italic_v start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT = 1508598 , italic_k start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT = 88741 , italic_t = 5 ). If CoVerD stored this covering in the memory, it would require 124124124124GB of memory, assuming each number in a block takes a byte. To cope, CoVerD computes the partially-induced covering during the PG covering construction. However, even the partially-induced coverings can consume a lot of memory, since the number of blocks can be large. Calzone faced a similar challenge and coped by restricting the size of the covering designs to at most 107superscript10710^{7}10 start_POSTSUPERSCRIPT 7 end_POSTSUPERSCRIPT, which allowed it to keep all coverings in the covering database. While CoVerD could take a similar approach, this would prevent it from picking CVDs of larger size which overall may lead to a lower analysis time (since they will require fewer refinements). Instead, CoVerD generates a CVD on-the-fly and uses the covering database only for the refinements, which tend to require coverings of significantly smaller size than the first covering. Another advantage of building the CVD on-the-fly is that it enables CoVerD to analyze any classifier over any image dimension, without any special adaptation. This is in contrast to Calzone, which requires to extend its covering database upon every new image dimension v𝑣vitalic_v.

4 CoVerD

In this section, we present CoVerD, our L0subscript𝐿0L_{0}italic_L start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT robustness verifier. We first describe our system and its components and then provide a running example.

Refer to caption
Figure 4: CoVerD: An L0subscript𝐿0L_{0}italic_L start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT robustness verifier.

4.1 Our System

Figure 4 shows CoVerD that, given an image classifier N𝑁Nitalic_N, an image x𝑥xitalic_x with v𝑣vitalic_v pixels, and the maximal number of perturbed pixels t𝑡titalic_t, returns whether N𝑁Nitalic_N is robust in the t𝑡titalic_t-ball of x𝑥xitalic_x. We next describe its planning and analysis components.

Planning

The planning component consists of several steps. First, it samples sets of different sizes k𝑘kitalic_k to estimate the success rate and average analysis time of their respective neighborhoods, like Calzone. Since CoVerD considers CVDs, it can observe larger block sizes than Calzone, thus the maximal sampled set size is MAX_K=200MAX_K200\text{MAX\_K}=200MAX_K = 200, unlike 99999999 in Calzone. Because of the larger bound, CoVerD is likely to observe many more k𝑘kitalic_k values whose success rate is zero. To save execution time while still enabling to determine the success rate and average analysis time of large k𝑘kitalic_k values, CoVerD reduces the number of samples after observing nfailsubscript𝑛failn_{\text{fail}}italic_n start_POSTSUBSCRIPT fail end_POSTSUBSCRIPT times k𝑘kitalic_k values whose success rate is zero. Second, the planning component relies on Calzone’s dynamic programming for computing a K strategy, but uses it differently. Since CoVerD begins the analysis from a CVD consisting of different sized blocks, there is no single K strategy. Instead, it runs Calzone’s dynamic programming for every k{t+1,,MAX_K}𝑘𝑡1MAX_Kk\in\{t+1,\dots,\text{MAX\_K}\}italic_k ∈ { italic_t + 1 , … , MAX_K } to define a function fRsubscript𝑓𝑅f_{R}italic_f start_POSTSUBSCRIPT italic_R end_POSTSUBSCRIPT map** every set size k𝑘kitalic_k to the best set size to use upon a refinement of a set of size k𝑘kitalic_k. Then, the planning component iterates over every candidate CVD and picks the best CVD for the analysis. It picks between the candidates without constructing them, as the construction is time and memory intensive and we wish to execute it only for the chosen candidate. To pick the best candidate, it leverages two observations. First, a CVD candidate is uniquely defined by the parameters of the PG covering, (q,m)𝑞𝑚(q,m)( italic_q , italic_m ) (formally, its parameters are (q,m,t)𝑞𝑚𝑡(q,m,t)( italic_q , italic_m , italic_t ) but t𝑡titalic_t is identical in all our PG coverings), so it suffices to pick a pair (q,m)𝑞𝑚(q,m)( italic_q , italic_m ) which can later be used to construct the CVD. Second, to predict the CVD with the minimal analysis time, only the block sizes are needed. In Section 4.2, we describe how to estimate a CVD’s block size distribution dist(q,m)𝑑𝑖𝑠𝑡𝑞𝑚dist(q,m)italic_d italic_i italic_s italic_t ( italic_q , italic_m ) and estimate its analysis time Tdist(q,m)subscript𝑇𝑑𝑖𝑠𝑡𝑞𝑚T_{dist(q,m)}italic_T start_POSTSUBSCRIPT italic_d italic_i italic_s italic_t ( italic_q , italic_m ) end_POSTSUBSCRIPT, in order to predict the best CVD. Given the best candidate (q,m)superscript𝑞superscript𝑚(q^{*},m^{*})( italic_q start_POSTSUPERSCRIPT ∗ end_POSTSUPERSCRIPT , italic_m start_POSTSUPERSCRIPT ∗ end_POSTSUPERSCRIPT ), it randomly samples an ordered set L𝐿Litalic_L of v𝑣vitalic_v indices from vsuperscript𝑣v^{\prime}italic_v start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT, which is a function of (q,m)superscript𝑞superscript𝑚(q^{*},m^{*})( italic_q start_POSTSUPERSCRIPT ∗ end_POSTSUPERSCRIPT , italic_m start_POSTSUPERSCRIPT ∗ end_POSTSUPERSCRIPT ).

Analysis

After determining the best (q,m)superscript𝑞superscript𝑚(q^{*},m^{*})( italic_q start_POSTSUPERSCRIPT ∗ end_POSTSUPERSCRIPT , italic_m start_POSTSUPERSCRIPT ∗ end_POSTSUPERSCRIPT ), L𝐿Litalic_L, and the refinement map** fRsubscript𝑓𝑅f_{R}italic_f start_POSTSUBSCRIPT italic_R end_POSTSUBSCRIPT, CoVerD continues to analyze the robustness of the classifier N𝑁Nitalic_N in the t𝑡titalic_t-ball of the given image x𝑥xitalic_x. The analysis constructs the CVD on-the-fly block-by-block. Technically, there is a covering generator that constructs the blocks one-by-one. Every block is pushed to the stack of blocks to verify, and then the analysis proceeds as Calzone. That is, the block is popped, submitted to GPUPoly, and if needed, refinement is executed. After the block is verified (directly or by refinement), the next block in the CVD is obtained from the covering generator. We note that although CoVerD could use CVDs for refinements, the coverings for refinements are smaller than the first covering since these coverings are for triples (v~,k~,t)~𝑣~𝑘𝑡(\tilde{v},\tilde{k},t)( over~ start_ARG italic_v end_ARG , over~ start_ARG italic_k end_ARG , italic_t ) where v~~𝑣\tilde{v}over~ start_ARG italic_v end_ARG is typically few dozens and at most MAX_K=200MAX_K200\text{MAX\_K}=200MAX_K = 200, whereas the first covering is for a triple (v~,k~,t)~𝑣~𝑘𝑡(\tilde{v},\tilde{k},t)( over~ start_ARG italic_v end_ARG , over~ start_ARG italic_k end_ARG , italic_t ) where v~~𝑣\tilde{v}over~ start_ARG italic_v end_ARG is the image dimension. Like Calzone, CoVerD parallelizes the analysis on GPUs. Thus, our covering generator generates disjoint parts of the covering, described in Section 4.3.

4.2 Choosing a Covering Verification Design

In this section, we describe how CoVerD predicts the CVD that enables CoVerD to minimize the overall analysis time. We begin with describing the CVD candidates, then describe how CoVerD estimates their block size distributions, and finally explain how CoVerD predicts the CVD leading to the minimal analysis time.

Candidates

A CVD candidate is defined by the PG covering from which it is partially-induced. Recall that a PG covering is defined for triples (v,k,t)superscript𝑣superscript𝑘𝑡(v^{\prime},k^{\prime},t)( italic_v start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT , italic_k start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT , italic_t ), where v=qm+11q1superscript𝑣superscript𝑞𝑚11𝑞1v^{\prime}=\frac{q^{m+1}-1}{q-1}italic_v start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT = divide start_ARG italic_q start_POSTSUPERSCRIPT italic_m + 1 end_POSTSUPERSCRIPT - 1 end_ARG start_ARG italic_q - 1 end_ARG and k=qt1q1superscript𝑘superscript𝑞𝑡1𝑞1k^{\prime}=\frac{q^{t}-1}{q-1}italic_k start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT = divide start_ARG italic_q start_POSTSUPERSCRIPT italic_t end_POSTSUPERSCRIPT - 1 end_ARG start_ARG italic_q - 1 end_ARG for a prime power q𝑞qitalic_q and mt2𝑚𝑡2m\geq t\geq 2italic_m ≥ italic_t ≥ 2. By Theorem 3.1, given a PG covering, the mean block size of the CVD has a closed-form expression μv,k,v=vkv=v(qt1)qm+11subscript𝜇superscript𝑣superscript𝑘𝑣𝑣superscript𝑘superscript𝑣𝑣superscript𝑞𝑡1superscript𝑞𝑚11\mu_{v^{\prime},k^{\prime},v}=\frac{vk^{\prime}}{v^{\prime}}=\frac{v(q^{t}-1)}% {q^{m+1}-1}italic_μ start_POSTSUBSCRIPT italic_v start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT , italic_k start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT , italic_v end_POSTSUBSCRIPT = divide start_ARG italic_v italic_k start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT end_ARG start_ARG italic_v start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT end_ARG = divide start_ARG italic_v ( italic_q start_POSTSUPERSCRIPT italic_t end_POSTSUPERSCRIPT - 1 ) end_ARG start_ARG italic_q start_POSTSUPERSCRIPT italic_m + 1 end_POSTSUPERSCRIPT - 1 end_ARG. By this expression, given q𝑞qitalic_q, as m𝑚mitalic_m increases μv,k,vsubscript𝜇superscript𝑣superscript𝑘𝑣\mu_{v^{\prime},k^{\prime},v}italic_μ start_POSTSUBSCRIPT italic_v start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT , italic_k start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT , italic_v end_POSTSUBSCRIPT decreases, and given m𝑚mitalic_m, as q𝑞qitalic_q increases μv,k,vsubscript𝜇superscript𝑣superscript𝑘𝑣\mu_{v^{\prime},k^{\prime},v}italic_μ start_POSTSUBSCRIPT italic_v start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT , italic_k start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT , italic_v end_POSTSUBSCRIPT decreases. Further, this expression approaches 00 for high values of q𝑞qitalic_q or m𝑚mitalic_m. Thus, to obtain a finite set of candidates, we provide a positive lower bound on μv,k,vsubscript𝜇superscript𝑣superscript𝑘𝑣\mu_{v^{\prime},k^{\prime},v}italic_μ start_POSTSUBSCRIPT italic_v start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT , italic_k start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT , italic_v end_POSTSUBSCRIPT, denoted MIN_K (our implementation sets it to t𝑡titalic_t). That is, the finite set of candidates CoVerD considers is:

{(q,m)2q is a prime power,mt,vv,μv,k,vMIN_K}conditional-set𝑞𝑚superscript2formulae-sequence𝑞 is a prime power𝑚𝑡formulae-sequencesuperscript𝑣𝑣subscript𝜇superscript𝑣superscript𝑘𝑣MIN_K\{(q,m)\in\mathbb{N}^{2}\mid q\text{ is a prime power},\ m\geq t,\ v^{\prime}% \geq v,\ \mu_{v^{\prime},k^{\prime},v}\geq\text{MIN\_K}\}{ ( italic_q , italic_m ) ∈ blackboard_N start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT ∣ italic_q is a prime power , italic_m ≥ italic_t , italic_v start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT ≥ italic_v , italic_μ start_POSTSUBSCRIPT italic_v start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT , italic_k start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT , italic_v end_POSTSUBSCRIPT ≥ MIN_K }

Estimating the block size distribution

For every CVD candidate, defined by (q,m)𝑞𝑚(q,m)( italic_q , italic_m ), CoVerD estimates the distribution of its block sizes. While Theorem 3.1 provides expressions for the mean block size and the variance, it does not define the block size distribution. We empirically observe that our CVDs have the property that the distribution of their block sizes resembles a discrete approximation of a Gaussian distribution with mean μv,k,vsubscript𝜇superscript𝑣superscript𝑘𝑣\mu_{v^{\prime},k^{\prime},v}italic_μ start_POSTSUBSCRIPT italic_v start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT , italic_k start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT , italic_v end_POSTSUBSCRIPT and variance σv,k,v2superscriptsubscript𝜎superscript𝑣superscript𝑘𝑣2\sigma_{v^{\prime},k^{\prime},v}^{2}italic_σ start_POSTSUBSCRIPT italic_v start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT , italic_k start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT , italic_v end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT. The higher the mean and the number of blocks, the higher the resemblance. Figure 5(a) visualizes this resemblance for a CVD, with v=784𝑣784v=784italic_v = 784, induced from a PG with parameters q=17𝑞17q=17italic_q = 17, m=5𝑚5m=5italic_m = 5, and t=5𝑡5t=5italic_t = 5. We believe this resemblance exists because a CVD is partially-induced from a PG covering given a random set of numbers L𝐿Litalic_L. This resemblance may not hold for other choices of L𝐿Litalic_L, for example for the choice of L𝐿Litalic_L proposed by [27], which compute a partially-induced covering whose maximal block size is bounded (unlike our CVD). Because of this resemblance, we model the block size as drawn from the Gaussian distribution with the true mean and variance 𝒢(μv,k,v,σv,k,v2)𝒢subscript𝜇superscript𝑣superscript𝑘𝑣superscriptsubscript𝜎superscript𝑣superscript𝑘𝑣2\mathcal{G}\left(\mu_{v^{\prime},k^{\prime},v},\sigma_{v^{\prime},k^{\prime},v% }^{2}\right)caligraphic_G ( italic_μ start_POSTSUBSCRIPT italic_v start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT , italic_k start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT , italic_v end_POSTSUBSCRIPT , italic_σ start_POSTSUBSCRIPT italic_v start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT , italic_k start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT , italic_v end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT ). Even if this modeling is imprecise, in practice, it is sufficient to allow CoVerD identify the candidate CVD leading to the minimal analysis time. Formally, given a CVD candidate defined by (q,m)𝑞𝑚(q,m)( italic_q , italic_m ), the distribution of the block sizes is dist(q,m)={Nkq,mkMAX_K}𝑑𝑖𝑠𝑡𝑞𝑚conditional-setsubscriptsuperscript𝑁𝑞𝑚𝑘𝑘MAX_Kdist(q,m)=\{N^{q,m}_{k}\mid k\leq\text{MAX\_K}\}italic_d italic_i italic_s italic_t ( italic_q , italic_m ) = { italic_N start_POSTSUPERSCRIPT italic_q , italic_m end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT ∣ italic_k ≤ MAX_K }, where Nkq,msubscriptsuperscript𝑁𝑞𝑚𝑘N^{q,m}_{k}italic_N start_POSTSUPERSCRIPT italic_q , italic_m end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT is our estimation of the number of blocks of size k𝑘kitalic_k in this CVD. We define the probability that a block size in this CVD is of size k𝑘kitalic_k as: (k0.5<Zk+0.5)=Φ((k+0.5)μv,k,vσv,k,v)Φ((k0.5)μv,k,vσv,k,v)𝑘0.5𝑍𝑘0.5Φ𝑘0.5subscript𝜇superscript𝑣superscript𝑘𝑣subscript𝜎superscript𝑣superscript𝑘𝑣Φ𝑘0.5subscript𝜇superscript𝑣superscript𝑘𝑣subscript𝜎superscript𝑣superscript𝑘𝑣\mathbb{P}(k-0.5<Z\leq k+0.5)=\Phi\left(\frac{(k+0.5)-\mu_{v^{\prime},k^{% \prime},v}}{\sigma_{v^{\prime},k^{\prime},v}}\right)-\Phi\left(\frac{(k-0.5)-% \mu_{v^{\prime},k^{\prime},v}}{\sigma_{v^{\prime},k^{\prime},v}}\right)blackboard_P ( italic_k - 0.5 < italic_Z ≤ italic_k + 0.5 ) = roman_Φ ( divide start_ARG ( italic_k + 0.5 ) - italic_μ start_POSTSUBSCRIPT italic_v start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT , italic_k start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT , italic_v end_POSTSUBSCRIPT end_ARG start_ARG italic_σ start_POSTSUBSCRIPT italic_v start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT , italic_k start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT , italic_v end_POSTSUBSCRIPT end_ARG ) - roman_Φ ( divide start_ARG ( italic_k - 0.5 ) - italic_μ start_POSTSUBSCRIPT italic_v start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT , italic_k start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT , italic_v end_POSTSUBSCRIPT end_ARG start_ARG italic_σ start_POSTSUBSCRIPT italic_v start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT , italic_k start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT , italic_v end_POSTSUBSCRIPT end_ARG ), where Z𝒢(μv,k,v,σv,k,v2)similar-to𝑍𝒢subscript𝜇superscript𝑣superscript𝑘𝑣superscriptsubscript𝜎superscript𝑣superscript𝑘𝑣2Z\sim\mathcal{G}\left(\mu_{v^{\prime},k^{\prime},v},\sigma_{v^{\prime},k^{% \prime},v}^{2}\right)italic_Z ∼ caligraphic_G ( italic_μ start_POSTSUBSCRIPT italic_v start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT , italic_k start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT , italic_v end_POSTSUBSCRIPT , italic_σ start_POSTSUBSCRIPT italic_v start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT , italic_k start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT , italic_v end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT ) and ΦΦ\Phiroman_Φ is the cumulative distribution function (CDF) of a Gaussian distribution with mean 00 and variance 1111. The number of blocks bq,msuperscript𝑏𝑞𝑚b^{q,m}italic_b start_POSTSUPERSCRIPT italic_q , italic_m end_POSTSUPERSCRIPT is identical to the number of blocks in the PG covering, which has a closed-form expression [16]. Thus, the estimated number of blocks of size k𝑘kitalic_k is: N~kq,m=bq,m(k0.5<Zk+0.5)subscriptsuperscript~𝑁𝑞𝑚𝑘superscript𝑏𝑞𝑚𝑘0.5𝑍𝑘0.5\tilde{N}^{q,m}_{k}=b^{q,m}\cdot\mathbb{P}(k-0.5<Z\leq k+0.5)over~ start_ARG italic_N end_ARG start_POSTSUPERSCRIPT italic_q , italic_m end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT = italic_b start_POSTSUPERSCRIPT italic_q , italic_m end_POSTSUPERSCRIPT ⋅ blackboard_P ( italic_k - 0.5 < italic_Z ≤ italic_k + 0.5 ). To make the estimated number an integer, we define Nkq,msubscriptsuperscript𝑁𝑞𝑚𝑘N^{q,m}_{k}italic_N start_POSTSUPERSCRIPT italic_q , italic_m end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT as the floor of N~kq,msubscriptsuperscript~𝑁𝑞𝑚𝑘\tilde{N}^{q,m}_{k}over~ start_ARG italic_N end_ARG start_POSTSUPERSCRIPT italic_q , italic_m end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT and add 1111 with probability of the remainder: Nkq,m=N~kq,m+Xsubscriptsuperscript𝑁𝑞𝑚𝑘subscriptsuperscript~𝑁𝑞𝑚𝑘𝑋N^{q,m}_{k}=\left\lfloor\tilde{N}^{q,m}_{k}\right\rfloor+Xitalic_N start_POSTSUPERSCRIPT italic_q , italic_m end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT = ⌊ over~ start_ARG italic_N end_ARG start_POSTSUPERSCRIPT italic_q , italic_m end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT ⌋ + italic_X where XBern(N~kq,mN~kq,m)similar-to𝑋Bernsubscriptsuperscript~𝑁𝑞𝑚𝑘subscriptsuperscript~𝑁𝑞𝑚𝑘X\sim\text{Bern}\left(\tilde{N}^{q,m}_{k}-\left\lfloor\tilde{N}^{q,m}_{k}% \right\rfloor\right)italic_X ∼ Bern ( over~ start_ARG italic_N end_ARG start_POSTSUPERSCRIPT italic_q , italic_m end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT - ⌊ over~ start_ARG italic_N end_ARG start_POSTSUPERSCRIPT italic_q , italic_m end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT ⌋ ). Figure 5(b) visualizes how close our estimation of the block size distribution is to the distribution of the CVD shown in Figure 5(a). We note that CoVerD considers a candidate and estimates its block size distribution only if its estimated number of overly large blocks (larger than MAX_K) is close to zero. Formally, it considers candidates that satisfy bq,m(1Φ(MAX_Kμv,k,vσv,k,v))ϵsuperscript𝑏𝑞𝑚1ΦMAX_Ksubscript𝜇superscript𝑣superscript𝑘𝑣subscript𝜎superscript𝑣superscript𝑘𝑣italic-ϵb^{q,m}\cdot\left(1-\Phi\left(\frac{\text{MAX\_K}-\mu_{v^{\prime},k^{\prime},v% }}{\sigma_{v^{\prime},k^{\prime},v}}\right)\right)\leq\epsilonitalic_b start_POSTSUPERSCRIPT italic_q , italic_m end_POSTSUPERSCRIPT ⋅ ( 1 - roman_Φ ( divide start_ARG MAX_K - italic_μ start_POSTSUBSCRIPT italic_v start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT , italic_k start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT , italic_v end_POSTSUBSCRIPT end_ARG start_ARG italic_σ start_POSTSUBSCRIPT italic_v start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT , italic_k start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT , italic_v end_POSTSUBSCRIPT end_ARG ) ) ≤ italic_ϵ, where ϵitalic-ϵ\epsilonitalic_ϵ is a small number. This is the reason that dist(q,m)={Nkq,mkMAX_K}𝑑𝑖𝑠𝑡𝑞𝑚conditional-setsubscriptsuperscript𝑁𝑞𝑚𝑘𝑘MAX_Kdist(q,m)=\{N^{q,m}_{k}\mid k\leq\text{MAX\_K}\}italic_d italic_i italic_s italic_t ( italic_q , italic_m ) = { italic_N start_POSTSUPERSCRIPT italic_q , italic_m end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT ∣ italic_k ≤ MAX_K } consists of estimations only for blocks whose size is at most MAX_K.

Refer to caption
(a) The block size distribution of a CVD and the respective Gaussian distribution.
Refer to caption
(b) Our estimated block size distribution vs. the distribution of a respective CVD.
Figure 5: Block size distributions.

Predicting the best CVD

Given the candidates and their estimated block size distributions, CoVerD chooses the CVD which will enable CoVerD to minimize the overall analysis time. To this end, it predicts for every candidate CVD its overall analysis time. The prediction relies on: (1) the estimated number of blocks Nkq,msubscriptsuperscript𝑁𝑞𝑚𝑘N^{q,m}_{k}italic_N start_POSTSUPERSCRIPT italic_q , italic_m end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT of size k𝑘kitalic_k, (2) the average analysis time of a block of size k𝑘kitalic_k, denoted k_array[k][time]𝑘_𝑎𝑟𝑟𝑎𝑦delimited-[]𝑘delimited-[]𝑡𝑖𝑚𝑒k\_array[k][time]italic_k _ italic_a italic_r italic_r italic_a italic_y [ italic_k ] [ italic_t italic_i italic_m italic_e ] (given by the initial sampling), (3) the fraction of the non-robust blocks of size k𝑘kitalic_k, which is one minus the success rate of k𝑘kitalic_k, denoted k_array[k][success]𝑘_𝑎𝑟𝑟𝑎𝑦delimited-[]𝑘delimited-[]𝑠𝑢𝑐𝑐𝑒𝑠𝑠k\_array[k][success]italic_k _ italic_a italic_r italic_r italic_a italic_y [ italic_k ] [ italic_s italic_u italic_c italic_c italic_e italic_s italic_s ] (given by the initial sampling) and (4) the analysis time of refining a non-robust block of size k𝑘kitalic_k, denoted T(k)𝑇𝑘T(k)italic_T ( italic_k ) (given by the dynamic programming, as defined in [30]). Similarly to Calzone’s dynamic programming, the analysis time is the sum of the analysis time of verifying all blocks in the CVD and the analysis time of the refinements of the non-robust blocks:

Tdist(q,m)=k=tMAX_KNkq,m(k_array[k][time]+(1k_array[k][success])T(k))subscript𝑇𝑑𝑖𝑠𝑡𝑞𝑚superscriptsubscript𝑘𝑡MAX_Ksubscriptsuperscript𝑁𝑞𝑚𝑘𝑘_𝑎𝑟𝑟𝑎𝑦delimited-[]𝑘delimited-[]𝑡𝑖𝑚𝑒1𝑘_𝑎𝑟𝑟𝑎𝑦delimited-[]𝑘delimited-[]𝑠𝑢𝑐𝑐𝑒𝑠𝑠𝑇𝑘T_{dist(q,m)}=\sum_{k=t}^{\text{MAX\_K}}N^{q,m}_{k}\cdot\left(k\_array[k][time% ]+(1-k\_array[k][success])\cdot T(k)\right)italic_T start_POSTSUBSCRIPT italic_d italic_i italic_s italic_t ( italic_q , italic_m ) end_POSTSUBSCRIPT = ∑ start_POSTSUBSCRIPT italic_k = italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT MAX_K end_POSTSUPERSCRIPT italic_N start_POSTSUPERSCRIPT italic_q , italic_m end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT ⋅ ( italic_k _ italic_a italic_r italic_r italic_a italic_y [ italic_k ] [ italic_t italic_i italic_m italic_e ] + ( 1 - italic_k _ italic_a italic_r italic_r italic_a italic_y [ italic_k ] [ italic_s italic_u italic_c italic_c italic_e italic_s italic_s ] ) ⋅ italic_T ( italic_k ) )

This computation ignores blocks of size less than t𝑡titalic_t since they do not cover any subset of size t𝑡titalic_t and need not be analyzed to prove L0subscript𝐿0L_{0}italic_L start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT robustness. After predicting the analysis time of every candidate, CoVerD picks the candidate with the minimal time.

4.3 Constructing a Covering Verification Design

In this section, we present our covering generator that computes a CVD. The covering generator operates as an independent process, one for every GPU, that outputs blocks a-synchronically. At every iteration, every GPU worker obtains a block from its covering generator, analyzes it with GPUPoly, and refines if needed. If the block is robust or its refinement does not detect an adversarial example, the GPU worker obtains the next block from the covering generator. The covering generator relies on the chosen CVD’s parameters q𝑞qitalic_q and m𝑚mitalic_m and the ordered set L𝐿Litalic_L from the planning component. It computes the PG covering for (q𝑞qitalic_q, m𝑚mitalic_m, t𝑡titalic_t) block-by-block and induces it to obtain a CVD. Generally, its construction follows the meta-algorithm of generating PG coverings described in [16]. The novel parts are our implementation of inducing blocks immediately upon generating them and partitioning them to enable their analysis to proceed in parallel over the available GPUs. We next describe the covering generator.

Input: PG parameters (q𝑞qitalic_q, m𝑚mitalic_m, t𝑡titalic_t), an ordered set of v𝑣vitalic_v indices L=[s1,,sv]𝐿subscript𝑠1subscript𝑠𝑣L=[s_{1},\dots,s_{v}]italic_L = [ italic_s start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT , … , italic_s start_POSTSUBSCRIPT italic_v end_POSTSUBSCRIPT ] which is a subset of [qm+11q1]delimited-[]superscript𝑞𝑚11𝑞1[\frac{q^{m+1}-1}{q-1}][ divide start_ARG italic_q start_POSTSUPERSCRIPT italic_m + 1 end_POSTSUPERSCRIPT - 1 end_ARG start_ARG italic_q - 1 end_ARG ], and an index of a GPU iGPUsubscript𝑖𝐺𝑃𝑈i_{GPU}italic_i start_POSTSUBSCRIPT italic_G italic_P italic_U end_POSTSUBSCRIPT.
Output: A stream of the covering verification design’s blocks.
1
i[v].P[:,i]formulae-sequencefor-all𝑖delimited-[]𝑣𝑃:𝑖\forall i\in[v].\ P[:,i]∀ italic_i ∈ [ italic_v ] . italic_P [ : , italic_i ] = a unique vector in 𝔽qm+1superscriptsubscript𝔽𝑞𝑚1\mathbb{F}_{q}^{m+1}blackboard_F start_POSTSUBSCRIPT italic_q end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_m + 1 end_POSTSUPERSCRIPT computed for sisubscript𝑠𝑖s_{i}italic_s start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT
  // P𝔽q(m+1)×v𝑃superscriptsubscript𝔽𝑞𝑚1𝑣P\in\mathbb{F}_{q}^{(m+1)\times v}italic_P ∈ blackboard_F start_POSTSUBSCRIPT italic_q end_POSTSUBSCRIPT start_POSTSUPERSCRIPT ( italic_m + 1 ) × italic_v end_POSTSUPERSCRIPT
2 =[M𝔽q(mt+1)×(m+1)M is full rank and in reduced row echelon form]delimited-[]𝑀conditionalsuperscriptsubscript𝔽𝑞𝑚𝑡1𝑚1M is full rank and in reduced row echelon form\mathcal{M}=[M\in\mathbb{F}_{q}^{(m-t+1)\times(m+1)}\mid\text{$M$ is full rank% and in reduced row echelon form}]caligraphic_M = [ italic_M ∈ blackboard_F start_POSTSUBSCRIPT italic_q end_POSTSUBSCRIPT start_POSTSUPERSCRIPT ( italic_m - italic_t + 1 ) × ( italic_m + 1 ) end_POSTSUPERSCRIPT ∣ italic_M is full rank and in reduced row echelon form ]
3 for j=0𝑗0j=0italic_j = 0; j<||𝑗j<|\mathcal{M}|italic_j < | caligraphic_M |; j++j++italic_j + + do
4       if j modulo GPUsiGPU𝑗 modulo 𝐺𝑃𝑈𝑠subscript𝑖𝐺𝑃𝑈j\text{ modulo }GPUs\neq i_{GPU}italic_j modulo italic_G italic_P italic_U italic_s ≠ italic_i start_POSTSUBSCRIPT italic_G italic_P italic_U end_POSTSUBSCRIPT then continue
5       R𝑅Ritalic_R = [j]×Pdelimited-[]𝑗𝑃\mathcal{M}[j]\times Pcaligraphic_M [ italic_j ] × italic_P
       block = {i[v]R[:,i]=0}conditional-set𝑖delimited-[]𝑣𝑅:𝑖0\left\{i\in[v]\mid R[:,i]=\vec{0}\right\}{ italic_i ∈ [ italic_v ] ∣ italic_R [ : , italic_i ] = over→ start_ARG 0 end_ARG }
        // generate induced block
6       output block
Algorithm 1 CoveringGenerator(q𝑞qitalic_q, m𝑚mitalic_m, t𝑡titalic_t, L𝐿Litalic_L, iGPUsubscript𝑖𝐺𝑃𝑈i_{GPU}italic_i start_POSTSUBSCRIPT italic_G italic_P italic_U end_POSTSUBSCRIPT)

Algorithm 1 shows the algorithm of our covering generator. It takes as input the PG parameters (q,m,t)𝑞𝑚𝑡(q,m,t)( italic_q , italic_m , italic_t ), an ordered set L𝐿Litalic_L of v𝑣vitalic_v indices from [v]delimited-[]superscript𝑣[v^{\prime}][ italic_v start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT ] (where v=qm+11q1superscript𝑣superscript𝑞𝑚11𝑞1v^{\prime}=\frac{q^{m+1}-1}{q-1}italic_v start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT = divide start_ARG italic_q start_POSTSUPERSCRIPT italic_m + 1 end_POSTSUPERSCRIPT - 1 end_ARG start_ARG italic_q - 1 end_ARG), and the GPU index iGPUsubscript𝑖𝐺𝑃𝑈i_{GPU}italic_i start_POSTSUBSCRIPT italic_G italic_P italic_U end_POSTSUBSCRIPT. As described in Section 3, a PG construction for (vsuperscript𝑣v^{\prime}italic_v start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT, ksuperscript𝑘k^{\prime}italic_k start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT, t𝑡titalic_t) views vsuperscript𝑣v^{\prime}italic_v start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT as the number of points in the geometry. Formally, given the finite field 𝔽qsubscript𝔽𝑞\mathbb{F}_{q}blackboard_F start_POSTSUBSCRIPT italic_q end_POSTSUBSCRIPT of order q𝑞qitalic_q, we identify the points of the geometry as a subset W𝔽qm+1𝑊superscriptsubscript𝔽𝑞𝑚1W\subset\mathbb{F}_{q}^{m+1}italic_W ⊂ blackboard_F start_POSTSUBSCRIPT italic_q end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_m + 1 end_POSTSUPERSCRIPT of size vsuperscript𝑣v^{\prime}italic_v start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT (technically, these points are representatives of equivalence classes over 𝔽qm+1superscriptsubscript𝔽𝑞𝑚1\mathbb{F}_{q}^{m+1}blackboard_F start_POSTSUBSCRIPT italic_q end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_m + 1 end_POSTSUPERSCRIPT, as explained in [16]). To later partially-induce the covering using L𝐿Litalic_L, Algorithm 1 maps every index in L𝐿Litalic_L to a unique point in W𝑊Witalic_W and stores all points (column vectors) in a matrix P𝑃Pitalic_P (Algorithm 1). Then, Algorithm 1 begins to construct the PG covering by computing flats (linear subspaces) of dimension t1𝑡1t-1italic_t - 1, each containing k=qt1q1superscript𝑘superscript𝑞𝑡1𝑞1k^{\prime}=\frac{q^{t}-1}{q-1}italic_k start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT = divide start_ARG italic_q start_POSTSUPERSCRIPT italic_t end_POSTSUPERSCRIPT - 1 end_ARG start_ARG italic_q - 1 end_ARG points. As described in [16], every block in the PG covering is a solution (a set of points in W𝑊Witalic_W) to mt+1𝑚𝑡1m-t+1italic_m - italic_t + 1 independent linear equations over m+1𝑚1m+1italic_m + 1 variables. Such a linear system can be represented as a full rank matrix, where its solutions are vectors in the matrix’s null space. Thus, to compute the blocks in the PG covering, Algorithm 1 defines a set of matrices \mathcal{M}caligraphic_M, each is over 𝔽qsubscript𝔽𝑞\mathbb{F}_{q}blackboard_F start_POSTSUBSCRIPT italic_q end_POSTSUBSCRIPT, of dimension (mt+1)×(m+1)𝑚𝑡1𝑚1(m-t+1)\times(m+1)( italic_m - italic_t + 1 ) × ( italic_m + 1 ), and full rank (equal to mt+1𝑚𝑡1m-t+1italic_m - italic_t + 1). Each matrix has exactly ksuperscript𝑘k^{\prime}italic_k start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT points in W𝑊Witalic_W in its null space. These points form a PG block (a flat of dimension t1𝑡1t-1italic_t - 1). To avoid block duplication, the matrices in \mathcal{M}caligraphic_M need to have different null spaces. Thus, Algorithm 1 considers matrices in reduced row echelon form, i.e., \mathcal{M}caligraphic_M is all full rank (mt+1)×(m+1)𝑚𝑡1𝑚1(m-t+1)\times(m+1)( italic_m - italic_t + 1 ) × ( italic_m + 1 ) matrices over 𝔽qsubscript𝔽𝑞\mathbb{F}_{q}blackboard_F start_POSTSUBSCRIPT italic_q end_POSTSUBSCRIPT in reduced row echelon form (Algorithm 1). The covering generator then iterates these matrices. To avoid a high memory overhead, the matrix [j]delimited-[]𝑗\mathcal{M}[j]caligraphic_M [ italic_j ] is generated only upon reaching its index j𝑗jitalic_j. If j𝑗jitalic_j belongs to the disjoint part of the given GPU, its induced block is generated (Algorithm 1). To construct a PG block, one needs to compute all the points in the null space of [j]delimited-[]𝑗\mathcal{M}[j]caligraphic_M [ italic_j ]. However, the generator requires only the partially-induced blocks. Thus, it immediately induces the block by obtaining all points sisubscript𝑠𝑖s_{i}italic_s start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT, for i[v]𝑖delimited-[]𝑣i\in[v]italic_i ∈ [ italic_v ], whose respective point P[:,i]𝑃:𝑖P[:,i]italic_P [ : , italic_i ] belongs to the null space of [j]delimited-[]𝑗\mathcal{M}[j]caligraphic_M [ italic_j ]. To this end, it defines R𝑅Ritalic_R as the multiplication of [j]delimited-[]𝑗\mathcal{M}[j]caligraphic_M [ italic_j ] and P𝑃Pitalic_P (Algorithm 1), forms the induced block by identifying the points that are in the null space of [j]delimited-[]𝑗\mathcal{M}[j]caligraphic_M [ italic_j ] (i.e., every sisubscript𝑠𝑖s_{i}italic_s start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT satisfying R[:,i]=0𝑅:𝑖0R[:,i]=\vec{0}italic_R [ : , italic_i ] = over→ start_ARG 0 end_ARG), and makes the induced block a subset of [v]delimited-[]𝑣[v][ italic_v ] by map** every sisubscript𝑠𝑖s_{i}italic_s start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT in the induced block to i𝑖iitalic_i (Algorithm 1).

4.4 A Running Example

In this section, we describe a real execution of CoVerD, for an MNIST image, a fully-connected network (6×\times×200_PGD in Section 5), and t=4𝑡4t=4italic_t = 4. CoVerD begins with the planning component. It first estimates the success rate and average analysis time of blocks. For every k{4,5,,200}𝑘45200k\in\{4,5,\dots,200\}italic_k ∈ { 4 , 5 , … , 200 }, it samples blocks S𝑆Sitalic_S (subsets of [784]delimited-[]784[784][ 784 ]) of size k𝑘kitalic_k and submits their neighborhood IS(x)={x[0,1]viS.xi=xi}subscript𝐼𝑆𝑥conditional-setsuperscript𝑥superscript01𝑣formulae-sequencefor-all𝑖𝑆subscriptsuperscript𝑥𝑖subscript𝑥𝑖I_{S}(x)=\{x^{\prime}\in[0,1]^{v}\mid\forall i\notin S.\ x^{\prime}_{i}=x_{i}\}italic_I start_POSTSUBSCRIPT italic_S end_POSTSUBSCRIPT ( italic_x ) = { italic_x start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT ∈ [ 0 , 1 ] start_POSTSUPERSCRIPT italic_v end_POSTSUPERSCRIPT ∣ ∀ italic_i ∉ italic_S . italic_x start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT = italic_x start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT } to GPUPoly. Based on all samples for k𝑘kitalic_k, it estimates the success rate and the average analysis time. For instance, for k=34𝑘34k=34italic_k = 34 the success rate is 94.05%percent94.0594.05\%94.05 % and the average analysis time is 16.1916.1916.1916.19ms, while for k=41𝑘41k=41italic_k = 41 they are 65.85%percent65.8565.85\%65.85 % and 16.9616.9616.9616.96ms. Then, CoVerD runs Calzone’s dynamic programming to map every k{5,6,,200}𝑘56200k\in\{5,6,\dots,200\}italic_k ∈ { 5 , 6 , … , 200 } to the refinement size. For example, k=34𝑘34k=34italic_k = 34 is mapped to 28282828 and k=41𝑘41k=41italic_k = 41 to 33333333. Next, CoVerD determines the CVD for the first covering, out of 50505050 candidates. For each candidate, it predicts the block size distribution and the respective overall analysis time of this candidate. To this end, it computes the mean, variance, and number of blocks using the closed-form expressions. For example, the CVD of the candidate (q=23,m=4)formulae-sequence𝑞23𝑚4(q=23,m=4)( italic_q = 23 , italic_m = 4 ) has mean block size 34.08734.08734.08734.087, variance 32.51832.51832.51832.518 and 292,561292561292,561292 , 561 blocks. The CVD of (q=19,m=4)formulae-sequence𝑞19𝑚4(q=19,m=4)( italic_q = 19 , italic_m = 4 ) has mean block size 41.26341.26341.26341.263, variance 38.86738.86738.86738.867, and 137,561137561137,561137 , 561 blocks. Although the second candidate has less than half the number of blocks of the first candidate, CoVerD predicts that using the first candidate will enable a faster analysis. This is because its success rate is significantly higher and thus it will require fewer refinements (e.g., the success rate of its mean block size is 94.05%percent94.0594.05\%94.05 %, whereas the second candidate’s success rate of the mean block size is 65.85%percent65.8565.85\%65.85 %). The estimated analysis times (in minutes) are Tdist(23,4)=21.20subscript𝑇𝑑𝑖𝑠𝑡23421.20T_{dist(23,4)}=21.20italic_T start_POSTSUBSCRIPT italic_d italic_i italic_s italic_t ( 23 , 4 ) end_POSTSUBSCRIPT = 21.20 and Tdist(19,4)=27.92subscript𝑇𝑑𝑖𝑠𝑡19427.92T_{dist(19,4)}=27.92italic_T start_POSTSUBSCRIPT italic_d italic_i italic_s italic_t ( 19 , 4 ) end_POSTSUBSCRIPT = 27.92. The last step of the planning component samples an ordered set L𝐿Litalic_L of size 784784784784 (the number of pixels in the MNIST image) from [2351231]=[292561]delimited-[]superscript2351231delimited-[]292561[\frac{23^{5}-1}{23-1}]=[292561][ divide start_ARG 23 start_POSTSUPERSCRIPT 5 end_POSTSUPERSCRIPT - 1 end_ARG start_ARG 23 - 1 end_ARG ] = [ 292561 ]. In total, the planning component takes 63.5 seconds.

Then, CoVerD continues to the analysis component. It starts by creating eight instances of the covering generator (Algorithm 1), one for each GPU. A covering generator creates blocks for its GPU one-by-one, given q=23superscript𝑞23q^{*}=23italic_q start_POSTSUPERSCRIPT ∗ end_POSTSUPERSCRIPT = 23, m=4superscript𝑚4m^{*}=4italic_m start_POSTSUPERSCRIPT ∗ end_POSTSUPERSCRIPT = 4, t=4𝑡4t=4italic_t = 4 and L𝐿Litalic_L. For every CVD block S𝑆Sitalic_S, the GPU worker defines its neighborhood IS(x)subscript𝐼𝑆𝑥I_{S}(x)italic_I start_POSTSUBSCRIPT italic_S end_POSTSUBSCRIPT ( italic_x ) and submits to GPUPoly. If GPUPoly verifies successfully, the next CVD block is obtained. If GPUPoly fails proving robustness, S𝑆Sitalic_S is refined. As example, if a block S𝑆Sitalic_S of size 34343434 is refined, the analysis pushes to the stack all blocks in the covering CS(34,28,4)subscript𝐶𝑆34284C_{S}(34,28,4)italic_C start_POSTSUBSCRIPT italic_S end_POSTSUBSCRIPT ( 34 , 28 , 4 ), which is the covering for (34,28,4)34284(34,28,4)( 34 , 28 , 4 ) that is in the covering database, where the numbers are renamed to range over the numbers in S𝑆Sitalic_S. In this example, GPUPoly is invoked 659,326659326659,326659 , 326 times, where 44%percent4444\%44 % of these calls are for blocks in the CVD. The maximal size of block submitted to GPUPoly is 62626262 and the minimal size is 8888. In particular, CoVerD did not submit any block of size t=4𝑡4t=4italic_t = 4 (i.e., there are no calls to the MILP verifier). The analysis takes 23.49 minutes, which is only 10.8%percent10.810.8\%10.8 % higher than the estimated time.

5 Evaluation

In this section, we describe the experimental evaluation of CoVerD on multiple datasets and networks and compare it to Calzone.

Implementation and setup

We implemented CoVerD 111https://github.com/YuvShap/CoVerD as an extension of Calzone222https://github.com/YuvShap/calzone. Experiments ran on a dual AMD EPYC 7713 server, 2TB RAM, eight NVIDIA A100 GPUs and Ubuntu 20.04.1. We evaluate CoVerD on the networks evaluated by Calzone, whose architectures are described in ERAN333https://github.com/eth-sri/eran. We consider networks trained for popular image datasets: MNIST and Fashion-MNIST, consisting of 28×28282828\times 2828 × 28 greyscale images, and CIFAR-10, consisting of 32×32323232\times 3232 × 32 colored images. CoVerD’s hyper-parameters are: the maximal block size is MAX_K=200MAX_K200\text{MAX\_K}=200MAX_K = 200, the number of samples is initially nsamples=400subscript𝑛samples400n_{\text{samples}}=400italic_n start_POSTSUBSCRIPT samples end_POSTSUBSCRIPT = 400 and after nfail=10subscript𝑛fail10n_{\text{fail}}=10italic_n start_POSTSUBSCRIPT fail end_POSTSUBSCRIPT = 10 failures, it is reduced to nsamples=24subscript𝑛samples24n_{\text{samples}}=24italic_n start_POSTSUBSCRIPT samples end_POSTSUBSCRIPT = 24, and the bound on the estimated number of overly large blocks is ϵ=0.01italic-ϵ0.01\epsilon=0.01italic_ϵ = 0.01. Our covering database, used for the refinement steps, contains coverings for v,k200𝑣𝑘200v,k\leq 200italic_v , italic_k ≤ 200, t6𝑡6t\leq 6italic_t ≤ 6. The covering sizes are restricted to at most 500,000500000500,000500 , 000 blocks. This limitation is stricter than Calzone, which limited to 107superscript10710^{7}10 start_POSTSUPERSCRIPT 7 end_POSTSUPERSCRIPT, but in practice this is unnoticeable since CoVerD only uses the coverings for refinements, and even Calzone typically refines to coverings whose size is at most 500,000500000500,000500 , 000. Like Calzone, the database consists of coverings computed by extending coverings from the La Jolla Covering Repository Tables444https://ljcr.dmgordon.org/cover/table.html using construction techniques from [16, Section 6.1]. Additionally, our database includes finite geometry coverings (for v,k200𝑣𝑘200v,k\leq 200italic_v , italic_k ≤ 200, t6𝑡6t\leq 6italic_t ≤ 6) and extends coverings using the dynamic programming of [16, Section 5]. Like Calzone, We ran CoVerD with eight GPUPoly instances and five MILP instances, except for the CIFAR-10 network where it ran 50 MILP instances. For the matrix multiplication over finite fields (Algorithm 1), CoVerD relies on an effective library [18] and considers only prime numbers for q𝑞qitalic_q (since matrix multiplication is too slow for prime powers).

Refer to caption
Refer to caption
Refer to caption
Refer to caption
Refer to caption
Refer to caption
Figure 6: CoVerD vs. Calzone on Calzone’s most challenging benchmarks.
Refer to caption
Refer to caption
Refer to caption
Refer to caption
Figure 7: CoVerD vs. Calzone for t=4𝑡4t=4italic_t = 4.

Comparison to Calzone

We begin by evaluating CoVerD on Calzone’s benchmarks (i.e., the same networks, images and timeouts) for t3𝑡3t\geq 3italic_t ≥ 3. Figure 6 shows the comparison for the most challenging benchmarks of Calzone, and Figure 7 shows comparisons for t=4𝑡4t=4italic_t = 4 (the plots for t=3𝑡3t=3italic_t = 3 are shown in Appendix 0.A). For a given network and t𝑡titalic_t, the plot shows the execution time in minutes of CoVerD and Calzone for every t𝑡titalic_t-ball. The x𝑥xitalic_x-axis orders the t𝑡titalic_t-balls by CoVerD’s output: non-robust (in light red background), robust (in light green background), and timeout (in light blue background, e.g., Figure 7, top). Within each section, the t𝑡titalic_t-balls are sorted by their execution time for clearer visuality. Timeouts, of CoVerD or Calzone, are shown by bars reaching the red horizontal line. The lower part of each bar shows in a lighter color the execution time of the initial sampling (unless it is too short to be visible in the plot). The sampling time is highlighted since Calzone and CoVerD sample slightly differently: Calzone samples 400400400400 sets of size k𝑘kitalic_k, for every k99𝑘99k\leq 99italic_k ≤ 99, while CoVerD samples up to k200𝑘200k\leq 200italic_k ≤ 200 and reduces the number of samples after observing ten k𝑘kitalic_k values whose average success rate is zero. We note that the other computations of the planning component take a few seconds. The plots’ titles include the speedup in the average analysis time of CoVerD over Calzone for non-robust t𝑡titalic_t-balls (NR) and for robust t𝑡titalic_t-balls (R).

The plots show that, on the most challenging benchmarks (Figure 6), CoVerD is always faster than Calzone, except for two non-robust t𝑡titalic_t-balls which CoVerD completes their analysis within 140 seconds. In the plots of Figure 7, CoVerD is always faster than Calzone except for thirteen 4444-balls whose analysis terminates within seven minutes by both verifiers. In the other plots (Figure 9 in Appendix 0.A), where t=3𝑡3t=3italic_t = 3, Calzone is sometimes faster, but in these cases the analysis time is typically short. In other words, the significance of CoVerD is in shortening the analysis time of t𝑡titalic_t-balls with long analysis time. On average, CoVerD is faster than Calzone in verifying robust t𝑡titalic_t-balls by 1.3x for t=3𝑡3t=3italic_t = 3, by 2.8x for t=4𝑡4t=4italic_t = 4, and by 5.1x for t=5𝑡5t=5italic_t = 5.

Refer to caption
Refer to caption
Refer to caption
Figure 8: CoVerD’s new benchmarks.

Challenging benchmarks

Next, we show more challenging benchmarks than Calzone. We evaluate the robustness of three networks for t𝑡titalic_t-balls with larger values of t𝑡titalic_t than Calzone considers, for t=5𝑡5t=5italic_t = 5 and for t=6𝑡6t=6italic_t = 6 (we remind that Calzone is evaluated for t5𝑡5t\leq 5italic_t ≤ 5). Similarly to Calzone’s most challenging benchmarks, these benchmarks evaluate CoVerD for ten images (misclassified images are discarded) and a five hour timeout. Figure 8 shows CoVerD’s analysis time. CoVerD completes the analysis for 73%percent7373\%73 % t𝑡titalic_t-balls. Further, it verifies robustness in some 6666-balls within 42424242 minutes. As before, CoVerD is significantly faster for non-robust t𝑡titalic_t-balls.

We provide additional statistics on CoVerD in Appendix 0.A.

6 Related Work

In this section, we discuss the closest related work.

Robustness verification of neural networks

Many works propose robustness verifiers for neural networks. Most works focus on local robustness in Lsubscript𝐿L_{\infty}italic_L start_POSTSUBSCRIPT ∞ end_POSTSUBSCRIPT neighborhoods, defined by a series of intervals [25, 33, 31, 13, 21, 2, 36, 35, 12]. Some verifiers provide a complete analysis, i.e., they determine whether a network is robust in the given neighborhood [33, 21, 12]. These approaches typically rely on constraint solving (SAT/SMT solvers or MILP solvers) and thus they often do not scale to large networks. Incomplete verifiers scale the analysis by over-approximating the non-linear computations of the network (e.g., the activation functions) by linear approximations or abstract domains [25, 31, 13, 35]. Several local robustness verifiers address L2subscript𝐿2L_{2}italic_L start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT-balls, e.g., by computing a bound on the network’s global or local Lipschitz constant [22, 19], or L1subscript𝐿1L_{1}italic_L start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT-balls [37, 40]. Other approaches analyze robustness in Lpsubscript𝐿𝑝L_{p}italic_L start_POSTSUBSCRIPT italic_p end_POSTSUBSCRIPT-balls for p{0,1,2,}𝑝012p\in\{0,1,2,\infty\}italic_p ∈ { 0 , 1 , 2 , ∞ } using randomized smoothing [6, 38, 23, 28, 11], providing probabilistic guarantees. To the best of our knowledge, Calzone [30] is the first work to deterministically verify local robustness in L0subscript𝐿0L_{0}italic_L start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT-balls. Other works prove robustness in neighborhoods defined by high-level features [20, 24, 3].

Covering and combinatorial designs

CVD is related to several combinatorial designs: the combinatorial design defined by [27], covering designs [16] and balanced incomplete block designs [7]. Covering designs, in particular finite geometry coverings, have been leveraged in various domains, including file information retrieval [27], file organization [1] and coding theory [5]. General combinatorial designs have also been leveraged in various domains in computer science [8].

7 Conclusion

We present CoVerD, an L0subscript𝐿0L_{0}italic_L start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT robustness verifier for neural networks. CoVerD boosts the performance of a previous L0subscript𝐿0L_{0}italic_L start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT robustness verifier by employing several ideas. First, it relies on a covering verification design (CVD), a new combinatorial design partially inducing a projective geometry covering. Second, it chooses between candidate CVDs without constructing them but only predicting their block size distribution. Third, it constructs the chosen CVD on-the-fly to keep the memory overhead minimal. We evaluate CoVerD on fully-connected and convolutional networks. We show that it boosts the performance of proving a network’s robustness to at most t𝑡titalic_t perturbed pixels on average by 2.8x, for t=4𝑡4t=4italic_t = 4, and by 5.1x, for t=5𝑡5t=5italic_t = 5. For t=6𝑡6t=6italic_t = 6, CoVerD sometimes proves robustness within 42424242 minutes.

References

  • [1] Abraham, C., Ghosh, S., Ray-Chaudhuri, D.: File organization schemes based on finite geometries. Information and Control 12(2), 143–163 (1968)
  • [2] Anderson, G., Pailoor, S., Dillig, I., Chaudhuri, S.: Optimization and abstraction: a synergistic approach for analyzing neural network robustness. In: Proceedings of the 40th ACM SIGPLAN Conference on Programming Language Design and Implementation, PLDI (2019)
  • [3] Balunovic, M., Baader, M., Singh, G., Gehr, T., Vechev, M.T.: Certifying geometric robustness of neural networks. In: Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems, NeurIPS (2019)
  • [4] Carlini, N., Wagner, D.A.: Towards evaluating the robustness of neural networks. In: IEEE Symposium on Security and Privacy, SP (2017)
  • [5] Chan, A., Games, R.: (n,k,t))-covering systems and error-trap** decoding (corresp.). IEEE Transactions on Information Theory 27(5), 643–646 (1981)
  • [6] Cohen, J., Rosenfeld, E., Kolter, J.Z.: Certified adversarial robustness via randomized smoothing. In: Proceedings of the 36th International Conference on Machine Learning, ICML. vol. 97. PMLR (2019)
  • [7] Colbourn, C.J., Dinitz, J.H. (eds.): Handbook of Combinatorial Designs. Chapman and Hall/CRC, 2nd edn. (2006). https://doi.org/10.1201/9781420010541
  • [8] Colbourn, C.J., van Oorschot, P.C.: Applications of combinatorial designs in computer science. ACM Comput. Surv. 21(2), 223–250 (jun 1989)
  • [9] Croce, F., Andriushchenko, M., Singh, N.D., Flammarion, N., Hein, M.: Sparse-rs: A versatile framework for query-efficient sparse black-box adversarial attacks. In: Thirty-Sixth AAAI Conference on Artificial Intelligence, AAAI Thirty-Fourth Conference on Innovative Applications of Artificial Intelligence, IAAI , The Twelveth Symposium on Educational Advances in Artificial Intelligence, EAAI. AAAI Press (2022)
  • [10] Croce, F., Hein, M.: Mind the box: l11{}_{\mbox{1}}start_FLOATSUBSCRIPT 1 end_FLOATSUBSCRIPT-apgd for sparse adversarial attacks on image classifiers. In: Proceedings of the 38th International Conference on Machine Learning, ICML. PMLR (2021)
  • [11] Dvijotham, K.D., Hayes, J., Balle, B., Kolter, J.Z., Qin, C., György, A., Xiao, K., Gowal, S., Kohli, P.: A framework for robustness certification of smoothed classifiers using f-divergences. In: 8th International Conference on Learning Representations, ICLR. OpenReview.net (2020)
  • [12] Ehlers, R.: Formal verification of piece-wise linear feed-forward neural networks. In: Automated Technology for Verification and Analysis - 15th International Symposium, ATVA. Lecture Notes in Computer Science, vol. 10482. Springer (2017)
  • [13] Gehr, T., Mirman, M., Drachsler-Cohen, D., Tsankov, P., Chaudhuri, S., Vechev, M.T.: AI2: safety and robustness certification of neural networks with abstract interpretation. In: IEEE Symposium on Security and Privacy, SP (2018)
  • [14] Goodfellow, I., Bengio, Y., Courville, A.: Deep Learning. MIT Press (2016), http://www.deeplearningbook.org
  • [15] Goodfellow, I.J., Shlens, J., Szegedy, C.: Explaining and harnessing adversarial examples. In: 3rd International Conference on Learning Representations, ICLR (2015)
  • [16] Gordon, D.M., Kuperberg, G., Patashnik, O.: New constructions for covering designs. J. COMBIN. DESIGNS 3, 269–284 (1995)
  • [17] Grosse, K., Papernot, N., Manoharan, P., Backes, M., McDaniel, P.D.: Adversarial perturbations against deep neural networks for malware classification. CoRR abs/1606.04435 (2016)
  • [18] Hostetter, M.: Galois: A performant NumPy extension for Galois fields (11 2020), https://github.com/mhostetter/galois
  • [19] Huang, Y., Zhang, H., Shi, Y., Kolter, J.Z., Anandkumar, A.: Training certifiably robust neural networks with efficient local lipschitz bounds. In: Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems NeurIPS (2021)
  • [20] Kabaha, A., Drachsler-Cohen, D.: Boosting robustness verification of semantic feature neighborhoods. In: Static Analysis - 29th International Symposium, SAS. Lecture Notes in Computer Science, vol. 13790. Springer (2022)
  • [21] Katz, G., Barrett, C.W., Dill, D.L., Julian, K., Kochenderfer, M.J.: Reluplex: An efficient SMT solver for verifying deep neural networks. In: Computer Aided Verification - 29th International Conference, CAV (2017)
  • [22] Leino, K., Wang, Z., Fredrikson, M.: Globally-robust neural networks. In: Proceedings of the 38th International Conference on Machine Learning, ICML. Proceedings of Machine Learning Research, vol. 139. PMLR (2021)
  • [23] Li, B., Chen, C., Wang, W., Carin, L.: Certified adversarial robustness with additive noise. In: Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems, NeurIPS (2019)
  • [24] Mohapatra, J., Weng, T., Chen, P., Liu, S., Daniel, L.: Towards verifying robustness of neural networks against A family of semantic perturbations. In: IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR (2020)
  • [25] Müller, C., Serre, F., Singh, G., Püschel, M., Vechev, M.T.: Scaling polyhedral neural network verification on gpus. In: Proceedings of Machine Learning and Systems MLSys (2021)
  • [26] Papernot, N., McDaniel, P.D., Jha, S., Fredrikson, M., Celik, Z.B., Swami, A.: The limitations of deep learning in adversarial settings. In: IEEE European Symposium on Security and Privacy, EuroS&P (2016)
  • [27] Ray-Chaudhuri, D.K.: Combinatorial information retrieval systems for files. SIAM J. Appl. Math. 16(5), 973–992 (sep 1968)
  • [28] Salman, H., Li, J., Razenshteyn, I.P., Zhang, P., Zhang, H., Bubeck, S., Yang, G.: Provably robust deep learning via adversarially trained smoothed classifiers. In: Wallach, H.M., Larochelle, H., Beygelzimer, A., d’Alché-Buc, F., Fox, E.B., Garnett, R. (eds.) Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems, NeurIPS. pp. 11289–11300 (2019)
  • [29] Schönheim, J.: On coverings. Pacific Journal of Mathematics 14(4), 1405 – 1411 (1964)
  • [30] Shapira, Y., Avneri, E., Drachsler-Cohen, D.: Deep learning robustness verification for few-pixel attacks. Proc. ACM Program. Lang. 7(OOPSLA1) (2023)
  • [31] Singh, G., Gehr, T., Püschel, M., Vechev, M.T.: An abstract domain for certifying neural networks. PACMPL 3(POPL) (2019)
  • [32] Szegedy, C., Zaremba, W., Sutskever, I., Bruna, J., Erhan, D., Goodfellow, I.J., Fergus, R.: Intriguing properties of neural networks. In: 2nd International Conference on Learning Representations, ICLR (2014)
  • [33] Tjeng, V., Xiao, K.Y., Tedrake, R.: Evaluating robustness of neural networks with mixed integer programming. In 7th International Conference on Learning Representations, ICLR (2019)
  • [34] Todorov, D.: Combinatorial coverings. Ph.D. thesis, PhD thesis, University of Sofia, 1985 (1985)
  • [35] Wang, S., Zhang, H., Xu, K., Lin, X., Jana, S., Hsieh, C., Kolter, J.Z.: Beta-crown: Efficient bound propagation with per-neuron split constraints for neural network robustness verification. In: Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems (2021)
  • [36] Wu, H., Ozdemir, A., Zeljic, A., Julian, K., Irfan, A., Gopinath, D., Fouladi, S., Katz, G., Pasareanu, C.S., Barrett, C.W.: Parallelization techniques for verifying neural networks. In: Formal Methods in Computer Aided Design, FMCAD. IEEE (2020)
  • [37] Wu, Y., Zhang, M.: Tightening robustness verification of convolutional neural networks with fine-grained linear approximation. In: Thirty-Fifth AAAI Conference on Artificial Intelligence, AAAI. pp. 11674–11681. AAAI Press (2021)
  • [38] Yang, G., Duan, T., Hu, J.E., Salman, H., Razenshteyn, I.P., Li, J.: Randomized smoothing of all shapes and sizes. In: Proceedings of the 37th International Conference on Machine Learning, ICML. Proceedings of Machine Learning Research, vol. 119. PMLR (2020)
  • [39] Yuviler, T., Drachsler-Cohen, D.: One pixel adversarial attacks via sketched programs. Proc. ACM Program. Lang. 7(PLDI) (2023)
  • [40] Zhang, H., Weng, T., Chen, P., Hsieh, C., Daniel, L.: Efficient neural network robustness certification with general activation functions. In: Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems, NeurIPS (2018)

Appendix 0.A Additional Results

In this section, we provide additional experimental results.

Refer to caption
Refer to caption
Refer to caption
Refer to caption
Refer to caption
Refer to caption
Figure 9: CoVerD vs. Calzone for t=3𝑡3t=3italic_t = 3.

Figure 9 compares CoVerD and Calzone for different networks and t=3𝑡3t=3italic_t = 3.

Table 1: The average number of completed analysis tasks of GPUPoly and the MILP verifier, for robust t𝑡titalic_t-balls (R) and non-robust t𝑡titalic_t-balls (NR).
Dataset Network t𝑡titalic_t GPUPoly MILP
R NR R NR
MNIST 6×\times×200_PGD 3 46565.93 12939.12 2.99 34.88
4 3468151.25 637245.50 363.00 3567.00
ConvSmall 3 20693.16 5822.55 0.03 2.00
4 615418.88 221191.00 0.03 8.17
5 3095356.00 36720.00 0.00 18.00
ConvSmallPGD 3 9937.48 7689.20 0.01 3.50
4 171847.55 13430.30 0.00 3.90
5 1730609.17 35571.33 0.00 1.67
6 5856433.00 29218.33 0.00 3.67
ConvMedPGD 3 10617.40 11101.56 0.01 2.67
4 274938.91 59813.17 0.05 4.33
5 3773350.00 13614.00 0.00 5.00
ConvBig 3 217192.22 - 2.33 -
FASHION ConvSmallPGD 3 25196.52 15810.41 1.76 48.65
4 644109.33 66506.73 1.80 67.45
5 2714415.20 976522.50 33.00 8.00
ConvMedPGD 3 69537.46 42235.61 8.25 35.33
4 1560073.43 60455.00 0.00 58.00
CIFAR-10 ConvSmallPGD 3 5768227.40 92157.75 1933.60 1509.75

We next study the number of completed analysis tasks performed by GPUPoly and the MILP verifier. For robust t𝑡titalic_t-balls, this number is equal to the number of tasks CoVerD submits to these verifiers. For non-robust t𝑡titalic_t-balls, this number may be smaller than the number of submitted tasks, since CoVerD terminates when an adversarial example is found. We note that CoVerD, like Calzone, submits to the MILP verifier unique blocks (of size t𝑡titalic_t) to avoid repeating the same analysis tasks in the MILP verifier, which is significantly slower than GPUPoly. Table 1 shows the average number of completed analysis tasks for robust t𝑡titalic_t-balls and non-robust t𝑡titalic_t-balls. As expected, the number of completed tasks of the MILP verifier is significantly smaller than the number of completed tasks of GPUPoly, especially for robust balls. This is because the MILP verifier is executed for unique blocks of size t𝑡titalic_t that GPUPoly failed to verify.

Next, we justify the value chosen for MAX_K, the upper bound on the size of blocks in the CVD candidates. This hyper-parameter has been introduced by Calzone to limit the k𝑘kitalic_k of the considered coverings. Limiting the maximal value of k𝑘kitalic_k is required for the covering database, the sampling, and the dynamic programming. This limitation should not impact Calzone’s performance, because overly large blocks are unlikely to be robust and thus are unhelpful to the analysis. In Calzone, MAX_K=99MAX_K99\text{MAX\_K}=99MAX_K = 99 since for most networks and images, larger blocks are typically not robust. To illustrate, Figure 10 shows the success rate of GPUPoly for the ConvSmallPGD network trained for MNIST and neighborhoods IS(x)subscript𝐼𝑆𝑥I_{S}(x)italic_I start_POSTSUBSCRIPT italic_S end_POSTSUBSCRIPT ( italic_x ) as a function of k=|S|𝑘𝑆k=|S|italic_k = | italic_S |, for eight values of x𝑥xitalic_x (the first eight images in the MNIST test set). We note that ConvSmallPGD has the highest success rates for large values of k𝑘kitalic_k, among our evaluated networks. For every image, we compute the success rate of every k[200]𝑘delimited-[]200k\in[200]italic_k ∈ [ 200 ] by sampling 5,00050005,0005 , 000 sets S𝑆Sitalic_S of size k𝑘kitalic_k and submitting IS(x)subscript𝐼𝑆𝑥I_{S}(x)italic_I start_POSTSUBSCRIPT italic_S end_POSTSUBSCRIPT ( italic_x ) to GPUPoly. The success rate is the fraction of verified sets. The plot shows that the success rates are very low for k>99𝑘99k>99italic_k > 99, justifying Calzone’s choice of MAX_K. Unlike Calzone, CoVerD computes for the first covering a CVD, consisting of blocks of different sizes. A CVD whose mean block size is k𝑘kitalic_k contains blocks larger than k𝑘kitalic_k. To allow CoVerD to consider CVDs with mean block sizes similar to those in Calzone’s coverings, we increase MAX_K to 200200200200. This number enables CoVerD to consider CVDs with mean block size k99𝑘99k\leq 99italic_k ≤ 99. This follows since, by Theorem 3.1, the standard deviation of the block size distribution of a CVD with mean block size k99𝑘99k\leq 99italic_k ≤ 99 is at most 99109910\sqrt{99}\leq 10square-root start_ARG 99 end_ARG ≤ 10. Thus, the probability that a CVD with mean k99𝑘99k\leq 99italic_k ≤ 99 has a block of size larger than 200200200200 is negligible, since such sizes are greater than the mean by at least 10101010x the standard deviation.

Refer to caption
Figure 10: GPUPoly’s success rate for MNIST ConvSmallPGD and IS(x)subscript𝐼𝑆𝑥I_{S}(x)italic_I start_POSTSUBSCRIPT italic_S end_POSTSUBSCRIPT ( italic_x ) as a function of k=|S|𝑘𝑆k=|S|italic_k = | italic_S |, for eight values (images) of x𝑥xitalic_x.

Lastly, we justify the value of nfail=10subscript𝑛fail10n_{\text{fail}}=10italic_n start_POSTSUBSCRIPT fail end_POSTSUBSCRIPT = 10. As described, CoVerD’s sampling is different from Calzone’s sampling. Calzone samples nsamples=400subscript𝑛samples400n_{\text{samples}}=400italic_n start_POSTSUBSCRIPT samples end_POSTSUBSCRIPT = 400 sets of pixels of size k𝑘kitalic_k for every tk99𝑡𝑘99t\leq k\leq 99italic_t ≤ italic_k ≤ 99. CoVerD begins by sampling nsamples=400subscript𝑛samples400n_{\text{samples}}=400italic_n start_POSTSUBSCRIPT samples end_POSTSUBSCRIPT = 400 sets of pixels of size k𝑘kitalic_k, for every tk200𝑡𝑘200t\leq k\leq 200italic_t ≤ italic_k ≤ 200 and decreases nsamplessubscript𝑛samplesn_{\text{samples}}italic_n start_POSTSUBSCRIPT samples end_POSTSUBSCRIPT to 24242424 when detecting nfailsubscript𝑛failn_{\text{fail}}italic_n start_POSTSUBSCRIPT fail end_POSTSUBSCRIPT values of k𝑘kitalic_k whose estimated success rate is zero. Technically, the sampling is distributed among the eight GPUs, each samples independently. Each GPU samples nGPU_samples=50subscript𝑛GPU_samples50n_{\text{GPU\_samples}}=50italic_n start_POSTSUBSCRIPT GPU_samples end_POSTSUBSCRIPT = 50 sets of size k𝑘kitalic_k for every k𝑘kitalic_k from t𝑡titalic_t to 200200200200, in an increasing order. After a GPU observes nfailsubscript𝑛failn_{\text{fail}}italic_n start_POSTSUBSCRIPT fail end_POSTSUBSCRIPT values of k𝑘kitalic_k whose estimated success rate is zero, it sets nGPU_samples=3subscript𝑛GPU_samples3n_{\text{GPU\_samples}}=3italic_n start_POSTSUBSCRIPT GPU_samples end_POSTSUBSCRIPT = 3. By decreasing nGPU_samplessubscript𝑛GPU_samplesn_{\text{GPU\_samples}}italic_n start_POSTSUBSCRIPT GPU_samples end_POSTSUBSCRIPT, the sampling time decreases while still enabling CoVerD to estimate the success rate and average analysis time of large values of k𝑘kitalic_k. We evaluate CoVerD’s sampling over a single image and four networks. Table 2 shows the minimal k𝑘kitalic_k where at least one GPU decreases nGPU_samplessubscript𝑛GPU_samplesn_{\text{GPU\_samples}}italic_n start_POSTSUBSCRIPT GPU_samples end_POSTSUBSCRIPT, denoted kstartsubscript𝑘startk_{\text{start}}italic_k start_POSTSUBSCRIPT start end_POSTSUBSCRIPT, and the minimum k𝑘kitalic_k where all GPUs decrease nGPU_samplessubscript𝑛GPU_samplesn_{\text{GPU\_samples}}italic_n start_POSTSUBSCRIPT GPU_samples end_POSTSUBSCRIPT, denoted kallsubscript𝑘allk_{\text{all}}italic_k start_POSTSUBSCRIPT all end_POSTSUBSCRIPT. We compare CoVerD’s estimated success rates to the estimation of Calzone’s sampling extended to k200𝑘200k\leq 200italic_k ≤ 200, relying on 400400400400 samples for every k𝑘kitalic_k. For all networks, the estimated success rates of kkstart𝑘subscript𝑘startk\geq k_{\text{start}}italic_k ≥ italic_k start_POSTSUBSCRIPT start end_POSTSUBSCRIPT of CoVerD and Calzone are zero, except for MNIST ConvSmall and k=97𝑘97k=97italic_k = 97, where Calzone’s sampling estimated the success rate as 0.25%percent0.250.25\%0.25 %. Figure 11 shows the estimated analysis time of every k𝑘kitalic_k by CoVerD’s sampling and by Calzone’s sampling. The dashed vertical lines in the plots indicate kstartsubscript𝑘startk_{\text{start}}italic_k start_POSTSUBSCRIPT start end_POSTSUBSCRIPT. The plots show that the estimated analysis time of CoVerD is very close to Calzone’s estimated analysis time, despite of sampling significantly fewer sets. As expected, its estimations are noisier for kkstart𝑘subscript𝑘startk\geq k_{\text{start}}italic_k ≥ italic_k start_POSTSUBSCRIPT start end_POSTSUBSCRIPT. The benefit of CoVerD’s sampling is that it significantly reduces the overall sampling time, on average by 2.652.652.652.65x and up to 3.393.393.393.39x compared to Calzone’s sampling (extended to k200𝑘200k\leq 200italic_k ≤ 200).

Table 2: The value of k𝑘kitalic_k when at least one GPU decreases nsamplessubscript𝑛samplesn_{\text{samples}}italic_n start_POSTSUBSCRIPT samples end_POSTSUBSCRIPT and when all GPU decrease nsamplessubscript𝑛samplesn_{\text{samples}}italic_n start_POSTSUBSCRIPT samples end_POSTSUBSCRIPT.
Dataset Network kstartsubscript𝑘startk_{\text{start}}italic_k start_POSTSUBSCRIPT start end_POSTSUBSCRIPT kallsubscript𝑘allk_{\text{all}}italic_k start_POSTSUBSCRIPT all end_POSTSUBSCRIPT
MNIST 6×\times×200_PGD 69 72
ConvSmall 94 99
ConvMedPGD 90 94
FASHION ConvMedPGD 62 65
Refer to caption
Refer to caption
Refer to caption
Refer to caption
Figure 11: Estimated analysis time by CoVerD’s sampling vs. Calzone’s sampling.