License: CC BY 4.0
arXiv:2312.13250v2 [quant-ph] 27 Jan 2024

The role of data embedding in equivariant quantum convolutional neural networks

Sreetama Das$^{1,2,3}$, Stefano Martina$^{2,3}$, Filippo Caruso$^{1,2,3}$
$^{1}$Istituto Nazionale di Ottica del Consiglio Nazionale delle Ricerche (CNR-INO), I-50019 Sesto Fiorentino, Italy
$^{2}$Department of Physics and Astronomy, University of Florence, Via Sansone 1, Sesto Fiorentino, I-50019, Italy
$^{3}$European Laboratory for Non-Linear Spectroscopy (LENS), University of Florence, Via Nello Carrara 1, Sesto Fiorentino, I-50019, Italy
Abstract

Geometric deep learning refers to the scenario in which the symmetries of a dataset are used to constrain the parameter space of a neural network and thus improve its trainability and generalization. Recently, this idea has been incorporated into the field of quantum machine learning, which has given rise to equivariant quantum neural networks (EQNNs). In this work, we investigate the role of classical-to-quantum embedding in the performance of equivariant quantum convolutional neural networks (EQCNNs) for the classification of images. We discuss the connection between the data embedding method and the resulting representation of a symmetry group, and analyze how changing the representation affects the expressibility of an EQCNN. We numerically compare the classification accuracy of EQCNNs with three different basis-permuted amplitude embeddings to the one obtained from a non-equivariant quantum convolutional neural network (QCNN). Our results show a clear dependence of classification accuracy on the underlying embedding, especially for initial training iterations. The improvement in classification accuracy of EQCNN over non-equivariant QCNN may be present or absent depending on the particular embedding and dataset used. We expect these results to help the community better understand the importance of the data embedding choice in the context of geometric quantum machine learning.

I Introduction

Quantum computing holds the promise to surpass classical supercomputers in achieving polynomial and exponential speed-up when performing certain tasks [1, 2]. The recent breakthrough [3] of realizing such speed-up with state-of-the-art noisy intermediate-scale quantum (NISQ) devices [4] has drawn widespread attention and research interest to this field. One of the most active research lines in recent times is to connect quantum computation with classical machine learning (ML). In this context there exist two approaches, the first one is to study how classical ML tools can facilitate quantum information processing tasks [5, 6, 7, 8, 9]. The other is the more frequently studied direction, i.e. to use a quantum system itself to build machine learning models [10, 11] and in particular quantum neural networks (QNNs) [12, 13, 14, 15, 16, 17, 18, 19]. The central component in a prototypical QNN is a quantum circuit with single and multiple qubit gates with trainable parameters. The latter is called parametric quantum circuit (PQC) or variational quantum circuit (VQC) [20, 21, 22, 23, 24]. PQC-based QNNs have been used to design quantum analogs of well-known classical ML networks, e.g., quantum autoencoder [22], quantum convolutional neural networks (QCNNs) [25, 26, 27], quantum generative adversarial networks (QGANs) [28, 29, 30, 31, 32, 33], quantum generative diffusion models [17, 18, 19], etc. It is important to note here that the cost function obtained from the above networks is optimized using a classical optimization routine. In this sense, the quantum machine learning networks are hybrid quantum-classical networks. Despite the potential speed-up and signatures of success of quantum machine learning over classical ML models [34, 35, 36, 37, 38], there exist several limitations for QNNs, the barren plateau problem being the principal among them. 
The latter means that for sufficiently deep quantum neural networks the gradient of the cost function vanishes exponentially with the size of the PQC [39, 40, 41, 42]. On one hand, an arbitrary PQC ideally should have high expressibility [43] to ensure that the solution of the optimization problem is close enough to the actual solution. On the other hand, PQCs with higher expressibility are more prone to exhibiting barren plateau [39, 40]. It is therefore a crucial task to mitigate barren plateau for a practical application of quantum machine learning algorithms.

One way to improve the trainability and generalization of machine learning algorithms is to introduce inductive bias in the network, i.e., to use prior information about the dataset to build a problem-specific model and constrain the optimization space of the network. In particular, geometric machine learning refers to a scheme in which the known symmetries of the dataset are used to construct a network that respects those symmetries [44]. Such a network will contain a sufficiently good solution, yet explores a smaller parameter space while training. For instance, the convolutional neural network (CNN), which is greatly successful for image recognition, is constructed by leveraging the translational symmetry of 2D images. Inspired by this, a number of recent works study the role of symmetry in improving QNN architectures [45, 46, 47, 48, 49, 50, 51]. The results show that a linear map representing the QNN can adapt to a symmetric dataset if its action on the data points commutes with the action of the symmetry group (two operators $A$ and $B$ commute if $AB=BA$, or equivalently if $[A,B]\equiv AB-BA=0$). These QNNs are called equivariant quantum neural networks (EQNNs), in analogy to equivariant neural networks in classical geometric learning. EQNNs have fewer trainable parameters and reduced expressibility, but they are expected to train faster and generalize better than a general QNN with high expressibility. Indeed, studies show that some classes of symmetry-respecting QNNs are devoid of the barren plateau problem [52, 53, 54]. Moreover, the performance of an EQNN shows improvement over a non-equivariant QNN for pattern recognition and image classification [48, 55, 56, 54].

A set of symmetry operations forms an abstract group $\mathcal{G}$. A representation of a group $\mathcal{G}$ is a mapping $R:\mathcal{G}\rightarrow GL$ that preserves the group structure of $\mathcal{G}$, where $GL$ is the space of invertible linear operators. Let us suppose we are given a classical data point with a certain symmetry $\mathcal{G}$. If we encode the data point using an $N$-dimensional quantum state, then the representations of the group elements are $N\times N$ unitary matrices. However, more than one unitary representation of the same group is possible. A change of representation is induced by a change in the quantum embedding of the data. In an EQNN, the action of the network on the input state must commute with the representation $R(g)$ for all group elements $g\in\mathcal{G}$. As different representations have different commutator spaces, the choice of representation decides which PQC and measurements are to be used in the EQNN. This has been discussed in Ref. [57], where the authors conceptualize a network in which the representation of the input state is altered at the intermediate layers by applying a linear transformation on the quantum state. Also, in Refs. [55, 54] the authors encode classical images using an altered-basis amplitude embedding such that the resulting group representation simplifies the construction of the EQNN. However, whether the resulting EQNN performs similarly to the one before the basis change is not clear. In this work, our objective is to attain a better understanding of how a change in the data embedding affects the performance of EQCNNs. We present our theoretical observations about the role of the embedding and the resulting representation in the construction of equivariant convolutional and pooling layers in EQCNNs. For our numerical study, we choose the standard amplitude embedding of images along with other permuted-basis amplitude embeddings, similar to Refs. [55, 56]. We then consider datasets in which the class labels are symmetric with respect to reflection and $180^{\circ}$ rotation. We compare the classification accuracy of a number of EQCNNs with different permuted-basis embeddings to that of a general non-equivariant QCNN for image classification. These results show that the choice of embedding substantially affects the performance of the EQCNN, and can be crucial for obtaining an improvement over non-equivariant QCNNs.

The manuscript is arranged as follows. In Sec. II, we start with an introduction to QCNN and EQCNN, followed by discussing the role of representation in constructing an EQCNN. In Sec. III we describe the reflection symmetry group and its different representations used in this work, alongside presenting the unitary ansatze used in building the EQCNNs and the non-equivariant QCNN. We present our findings in Sec. IV and conclude in Sec. V.

Figure 1: The structure of QCNN for 10 qubits. The orange and the cyan boxes represent respectively the parametrized convolutional and pooling ansatze. The input quantum state is $|\psi\rangle$ and the final measurement is $M$.

II Equivariant quantum convolutional neural network

II.1 Quantum Convolutional Neural Network (QCNN)

Convolutional neural networks (CNNs) are classical ML models extensively used for image classification, speech recognition, etc. [58, 59]. They consist of a sequence of convolutional and pooling layers followed by a fully-connected layer at the end. The convolution operation can be visualized as a $d\times d$ matrix with trainable weights, known as the kernel, traversing along the height and width of the input image. At each position of the kernel, the dot product between it and the $d\times d$ block of the input image over which it is placed is calculated to get a feature map of the input image. The trainable weights of the kernel are the same within a convolutional layer. It is common to apply a nonlinear activation function after each convolutional layer. In the pooling layer, the dimension is reduced by aggregating over areas of the feature map by taking the average or the maximum value. After application of a number of convolution and pooling layers, the feature maps are flattened to 1D and used as the input layer of a fully-connected neural network that performs the prediction. The cost function is calculated using the output nodes and the network is trained. One important aspect of CNNs is that they can recognize a particular feature of an image irrespective of the translation of that feature in the image plane. Thus, CNNs respect the translational symmetry of images.
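The kernel sweep and the pooling step described above can be sketched in a few lines of NumPy. This is only an illustration under simplifying assumptions (stride 1, no padding, a single channel, no activation function); the function names `conv2d_valid` and `max_pool` are hypothetical and not taken from any reference.

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Slide a d x d kernel over the image (stride 1, no padding)."""
    d = kernel.shape[0]
    h, w = image.shape
    out = np.empty((h - d + 1, w - d + 1))
    for i in range(h - d + 1):
        for j in range(w - d + 1):
            # Dot product between the kernel and the block it currently covers
            out[i, j] = np.sum(image[i:i + d, j:j + d] * kernel)
    return out

def max_pool(feature_map, size=2):
    """Non-overlapping max pooling; rows/columns that do not fill a block are dropped."""
    h, w = feature_map.shape
    h2, w2 = h // size, w // size
    blocks = feature_map[:h2 * size, :w2 * size].reshape(h2, size, w2, size)
    return blocks.max(axis=(1, 3))

image = np.arange(25, dtype=float).reshape(5, 5)  # toy 5 x 5 "image"
kernel = np.ones((2, 2))                          # the same weights are reused at every position
fmap = conv2d_valid(image, kernel)                # feature map of shape (4, 4)
pooled = max_pool(fmap)                           # reduced to shape (2, 2)
```

Because the same kernel weights are reused at every position, a shifted feature produces a correspondingly shifted response in the feature map, which is the translational symmetry referred to in the text.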

In Ref. [25], the authors proposed a quantum analogue of CNN. We show this architecture in Fig. 1. The architecture is inspired by the multiscale entanglement renormalization ansatz (MERA) representation of quantum many-body states [60]. Any quantum neural network has three components: an $n$-qubit quantum register encoding the input quantum state $|\Psi\rangle$, a PQC $\mathcal{U}_{\theta}$ acting on $|\Psi\rangle$, and lastly a measurement $M$ on some (or all) of the qubits. In the case of QCNN, $\mathcal{U}_{\theta}$ consists of a series of convolutional and pooling layers. In the $i^{\mathrm{th}}$ convolutional layer, an $m$-qubit ($m<n$) trainable convolutional ansatz $U_{i}(\Theta_{i})$ is applied on all combinations of neighboring $m$ qubits, mimicking the action of a kernel in CNN. Here $\Theta_{i}=\{\theta_{i}^{1},\theta_{i}^{2},\dots,\theta_{i}^{k}\}$ is a set of $k$ trainable parameters which is the same for all $U_{i}(\Theta_{i})$ in the $i^{\mathrm{th}}$ convolutional layer. In the pooling layer, from each pair of neighboring qubits one qubit is measured in a particular basis, and conditioned on the measurement outcome a set of parametrized rotations is applied on the other qubit. These actions constitute the pooling ansatz $V_{i}(\Phi_{i})$. Similar to the convolutional layers, the parameters $\Phi_{i}=\{\phi_{i}^{1},\phi_{i}^{2},\dots,\phi_{i}^{l}\}$ are shared within a pooling layer. Note that one can also relax the translational invariance condition by using convolutional and pooling ansatze with different parameters within a layer [61]. Following the pooling layer, all the measured qubits are traced out, thus reducing the effective dimension of the system. In real quantum devices this is synonymous with ignoring the qubits in the subsequent stages after the measurement, and considering only the dynamics on the remaining qubits. The sequence of a convolutional layer followed by a pooling layer is then repeated until a small fraction of the qubits is left. In analogy to the fully-connected part of a CNN, one can then apply a general PQC on the remaining qubits at the end and finally measure them to obtain the prediction. The network is then trained using a suitable loss function and optimization algorithm. Compared to a generic deep quantum neural network, this structure has a shallower depth of $\mathcal{O}(\log n)$ due to the progressive qubit reduction, which is preferable for training a QNN. In [25] the authors demonstrated the utility of QCNN for topological phase recognition of quantum many-body states as well as optimization of error correcting codes. Later, this QCNN architecture was also studied for classical image classification in [62]. In this case, there is an additional step in which the classical images are encoded as quantum states and used as the input to the network. In this regard, two widely used embedding methods are amplitude embedding and qubit embedding: the former encodes each classical value in one of the quantum state amplitudes, while the latter employs one qubit to encode one value. For large images amplitude embedding is advantageous, since the number of qubits needed grows only logarithmically with the number of pixels. For smaller images, qubit embedding is preferred since the information about each pixel is more easily encoded in the single qubits. In [62], the convolution and pooling layers were applied until only a single qubit remained, on which a Pauli-$Z$ measurement is performed. This structure is suitable for binary classification since the expectation value of the measurement operator on a qubit can be associated with the two classes to be distinguished. Overall, good classification accuracy for simple datasets (e.g., MNIST and Fashion MNIST) can be reached. However, one must note that the state-of-the-art QCNN is still too immature to learn the complex features of general image datasets. Thus, in general CNNs can achieve higher classification accuracies than QCNNs.
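To make the qubit counting concrete, the following sketch amplitude-embeds a hypothetical $16\times 16$ image (256 pixels, hence 8 qubits) and lists how many qubits each convolutional layer would see. It assumes that every pooling layer keeps one qubit out of each neighboring pair; the helper names `amplitude_embed` and `qubits_per_layer` are illustrative only, and the sketch works with a classical state vector rather than a quantum device.

```python
import math
import numpy as np

def amplitude_embed(image):
    """Flatten the image row-wise and normalize it to a unit vector of amplitudes."""
    amps = np.asarray(image, dtype=float).ravel()
    n_qubits = int(math.log2(amps.size))
    if 2 ** n_qubits != amps.size:
        raise ValueError("number of pixels must be a power of two")
    norm = np.linalg.norm(amps)
    if norm == 0:
        raise ValueError("cannot embed an all-zero image")
    return amps / norm, n_qubits

def qubits_per_layer(n_qubits):
    """Qubit count at each stage, assuming pooling keeps one qubit per neighboring pair."""
    counts = [n_qubits]
    while counts[-1] > 1:
        counts.append(math.ceil(counts[-1] / 2))
    return counts

rng = np.random.default_rng(0)
state, n = amplitude_embed(rng.random((16, 16)))  # 256 amplitudes on n = 8 qubits
schedule = qubits_per_layer(n)                    # [8, 4, 2, 1]
```

The length of `schedule` grows as $\log_2 n$, which is the origin of the $\mathcal{O}(\log n)$ depth mentioned above.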

Figure 2: A $2\times 2$ image and its transformations under reflection and rotation by $180^{\circ}$. The table shows group representations of these two symmetries with changing embedding of the image as a quantum state. $\mathcal{N}$ is the quantum state normalization factor.
Figure 3: The relation between quantum-encoded image state $|\Psi\rangle$, basis-permuting matrix $A$, and the representations $R(g)$ and $R^{\prime}(g)$ before and after the basis permutation, respectively.

II.2 Equivariant QCNN with label-symmetry

The architecture of CNNs respects the translational symmetry of 2D images. However, there can exist additional symmetries in a dataset. In particular, for classical images one of the most commonly occurring symmetries is label symmetry, in which the class labels of the images remain unchanged under a set of operations. For example, in the MNIST dataset the labels of digits 1, 8 and 0 are reflection invariant. Let us consider a binary-classification task for a dataset $\mathcal{X}$ with images $x_{i}$ and corresponding class labels $y_{i}\in\mathcal{Y}$, where $\mathcal{Y}=\{0,1\}$. There is a function $f:\mathcal{X}\rightarrow\mathcal{Y}$ that maps the images to their labels. The task of the QCNN is to prepare a quantum circuit $f^{\prime}_{\theta}$ that closely approximates $f$. A set of operations $\mathcal{G}$ forms a label symmetry group for $\mathcal{X}$ if the assigned labels remain unchanged when any $g\in\mathcal{G}$ is applied to the images, i.e.,

$$f(g(x_{i}))=y_{i}=f(x_{i})\quad\forall x_{i}\in\mathcal{X},\quad\forall g\in\mathcal{G}. \tag{1}$$

Note that the data points themselves may not be invariant under the group action, i.e., $g(x_{i})\neq x_{i}$ in general. The QCNN $f^{\prime}_{\theta}$ used to classify $\mathcal{X}$ should not predict different labels for inputs related by the symmetry operations. The QCNNs that satisfy this condition are called EQCNNs. Recent works explore EQNN and its improvements over a general non-equivariant QNN for image classification [55, 56]. In particular, we use the result from [55] which shows that a PQC $\mathcal{U}_{\theta}$ along with a measurement $M$ will construct an EQNN $f^{\prime}_{\theta}$ if the following condition holds,

$$\big[R(g),\mathcal{U}_{\theta}^{\dagger}M\mathcal{U}_{\theta}\big]=0\quad\forall g\in\mathcal{G}, \tag{2}$$

where $R(g)$ is the unitary representation of $g\in\mathcal{G}$. It was also proved in [49, 57] that a map $\mathcal{U}_{\theta}$ and a measurement $M$ will form an EQNN if and only if

$$[R(g),\mathcal{U}_{\theta}]=0\quad\mbox{and}\quad[R(g),M]=0,\quad\forall g. \tag{3}$$

Thus, if we define the commutator space of $R(g)$ to be the space of all operators that commute with $R(g)$, then $\mathcal{U}_{\theta}$ and $M$ must belong to that commutator space. It is easy to see that Eq. (2) is satisfied when Eq. (3) is true.
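The statement that Eq. (3) implies Eq. (2) can be checked numerically on a small example. The sketch below is an illustration only: it assumes a two-qubit system with $R(g)=X\otimes X$, and the particular circuit `U`, the angles, and the measurements `M` and `M_bad` are arbitrary choices made here, not the ansatze used in this work.

```python
import numpy as np

I2 = np.eye(2)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)

def commutator(a, b):
    return a @ b - b @ a

def exp_i(theta, h):
    """exp(-i * theta * h) for a Hermitian matrix h, via its eigendecomposition."""
    w, v = np.linalg.eigh(h)
    return (v * np.exp(-1j * theta * w)) @ v.conj().T

R = np.kron(X, X)  # representation of the non-trivial group element (assumed)

# Both generators, Z(x)Z and X(x)I, commute with X(x)X, so U does as well
U = exp_i(0.37, np.kron(Z, Z)) @ exp_i(1.1, np.kron(X, I2))
M = np.kron(X, I2)      # measurement that commutes with R
M_bad = np.kron(Z, I2)  # measurement that anticommutes with R

eq3_U = np.allclose(commutator(R, U), 0)
eq3_M = np.allclose(commutator(R, M), 0)
eq2 = np.allclose(commutator(R, U.conj().T @ M @ U), 0)
eq2_bad = np.allclose(commutator(R, U.conj().T @ M_bad @ U), 0)
```

Here `eq3_U`, `eq3_M` and `eq2` evaluate to `True`, while `eq2_bad` is `False`: with an equivariant circuit but a measurement outside the commutator space, Eq. (2) is violated.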

In the context of EQCNN, $\mathcal{U}_{\theta}$ can be obtained by ensuring that each convolution and pooling layer is equivariant with respect to the representation in the output of the previous layer. Since in our case these layers are composed of two-qubit local variational ansatze, it is sufficient to make these ansatze equivariant with respect to the two-qubit local symmetry representations $r(g)$. In detail, for the $i^{\mathrm{th}}$ layer,

$$[r(g),U_{i}(\Theta_{i})]=0\quad\mathrm{and}\quad[r(g),V_{i}(\Phi_{i})]=0\quad\forall r(g). \tag{4}$$

In particular, $U_{i}$ and $V_{i}$ are constituted of parameterized single-qubit and two-qubit rotational gates which can be expressed as $e^{-i\theta H}$, where $H$ is a Hermitian operator and is called the generator of that gate. In this case, Eq. (4) is satisfied if

$$[r(g),H]=0 \tag{5}$$

holds for all the generators of the single- and two-qubit gates used in the construction of $U_{i}$ and $V_{i}$. Given a symmetry representation $R(g)$, these generators can be found using a variety of methods as discussed in Refs. [49, 48, 57].
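For a two-qubit local representation, one simple way to obtain candidate generators satisfying Eq. (5) is a brute-force search over the 16 two-qubit Pauli strings. The sketch below is not one of the methods of Refs. [49, 48, 57]; it is a minimal illustration that assumes the search is restricted to Pauli strings and uses $r(g)=X\otimes X$ as an example. The function name `equivariant_pauli_generators` is hypothetical.

```python
import itertools
import numpy as np

PAULIS = {
    "I": np.eye(2, dtype=complex),
    "X": np.array([[0, 1], [1, 0]], dtype=complex),
    "Y": np.array([[0, -1j], [1j, 0]], dtype=complex),
    "Z": np.array([[1, 0], [0, -1]], dtype=complex),
}

def equivariant_pauli_generators(rep_elements):
    """Two-qubit Pauli strings H with [r(g), H] = 0 for every r(g) supplied."""
    found = []
    for a, b in itertools.product("IXYZ", repeat=2):
        h = np.kron(PAULIS[a], PAULIS[b])
        if all(np.allclose(r @ h, h @ r) for r in rep_elements):
            found.append(a + b)
    return found

# Example local representation: the non-trivial element acts as X (x) X
r_elements = [np.kron(PAULIS["X"], PAULIS["X"])]
gens = equivariant_pauli_generators(r_elements)
# gens == ['II', 'IX', 'XI', 'XX', 'YY', 'YZ', 'ZY', 'ZZ']
```

The identity string `II` only contributes a global phase, so in practice the remaining seven strings are the useful generators for gates of the form $e^{-i\theta H}$.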

II.3 Role of data embedding

If an image has symmetries, then the representation of that symmetry group depends on the particular way the image is encoded using qubits. In this work, we will consider amplitude embedding (AE), in which $N$ pixel values are encoded as the amplitudes of the basis states of $n=\log_{2}N$ qubits. Let us consider two simple label symmetries: reflection with respect to the vertical axis and rotation by $180^{\circ}$. In standard AE the pixel values of the classical image are encoded row-wise into the amplitudes of the quantum state. Let us consider a small image for which $N=4$ and $n=2$. For standard AE the representation of the reflection group is $\{\mathbb{I}^{\otimes 2},\mathbb{I}\otimes X\}$ and that of the $180^{\circ}$ rotation group is $\{\mathbb{I}^{\otimes 2},X\otimes X\}$, where $\mathbb{I}$ is the qubit identity operator and $X$ is the Pauli-$X$ operator. However, if one modifies the order in which the pixels are encoded, in the way presented in the table in Fig. 2, then the representation of the reflection symmetry group becomes $\{\mathbb{I}^{\otimes 2},X\otimes\mathbb{I}\}$ and that of the $180^{\circ}$ rotation symmetry group becomes $\{\mathbb{I}^{\otimes 2},\mathbb{I}\otimes X\}$.
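These representations can be verified directly on a $2\times 2$ array. In the sketch below the standard AE follows the row-wise encoding described in the text. The permuted pixel order `PERM`, which encodes the image $[[a,b],[c,d]]$ as the amplitudes $(a,d,b,c)$, is an assumption chosen here because it reproduces the stated representations; the table in Fig. 2 is not reproduced in the text and may use a different but equivalent ordering.

```python
import numpy as np

I2 = np.eye(2)
X = np.array([[0.0, 1.0], [1.0, 0.0]])

def standard_ae(img):
    """Row-wise amplitude embedding with normalization."""
    v = img.ravel().astype(float)
    return v / np.linalg.norm(v)

# Hypothetical permuted order: amplitudes (a, d, b, c) for the image [[a, b], [c, d]]
PERM = [0, 3, 1, 2]

def permuted_ae(img):
    return standard_ae(img)[PERM]

img = np.array([[1.0, 2.0], [3.0, 4.0]])
reflected = img[:, ::-1]   # reflection about the vertical axis
rotated = img[::-1, ::-1]  # rotation by 180 degrees

# Standard AE: reflection acts as I (x) X, rotation as X (x) X
std_reflect_ok = np.allclose(standard_ae(reflected), np.kron(I2, X) @ standard_ae(img))
std_rotate_ok = np.allclose(standard_ae(rotated), np.kron(X, X) @ standard_ae(img))

# Permuted AE: reflection acts as X (x) I, rotation as I (x) X
perm_reflect_ok = np.allclose(permuted_ae(reflected), np.kron(X, I2) @ permuted_ae(img))
perm_rotate_ok = np.allclose(permuted_ae(rotated), np.kron(I2, X) @ permuted_ae(img))
```

All four flags evaluate to `True` for this choice of ordering.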

In a more general scenario, let us suppose $A$ is a $2^{n}\times 2^{n}$ matrix that acts on the standard amplitude-encoded input state $|\Psi\rangle$ and alters the order in which pixel values are encoded. Thus $A$ has to be a permutation matrix which permutes the coefficients of the canonical basis states of $\mathcal{H}^{n}$, or equivalently permutes the basis states. For the input state $|\Psi\rangle$ with standard AE the group representation is $R(g)$, while for the input state $|\Psi^{\prime}\rangle$ with basis-permuted AE, it is $R^{\prime}(g)$. If $|\Phi\rangle$ and $|\Phi^{\prime}\rangle$ are the states obtained after applying the corresponding symmetry transformations, then the relation between them can be summarized as in Fig. 3. From this we can write the following:

$$\begin{aligned} R^{\prime}(g)A|\Psi\rangle &= AR(g)|\Psi\rangle \\ \Rightarrow\; R^{\prime}(g) &= AR(g)A^{-1}. \end{aligned} \tag{6}$$

In other words, $R(g)$ and $R^{\prime}(g)$ are related by a basis permutation and they are two equivalent representations of the same group. In the same way, the unitary operators in the commutator space of $R(g)$ are related by a basis permutation $A$ to the unitary operators in the commutator space of $R^{\prime}(g)$.
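Eq. (6) and the accompanying statement about commutator spaces can be illustrated for $n=2$. The permutation used below (new amplitude $k$ takes old amplitude `perm[k]`, with `perm = [0, 3, 1, 2]`) is a hypothetical choice made for this sketch; it maps the standard-AE representations $\mathbb{I}\otimes X$ and $X\otimes X$ to $X\otimes\mathbb{I}$ and $\mathbb{I}\otimes X$.

```python
import numpy as np

I2 = np.eye(2)
X = np.array([[0.0, 1.0], [1.0, 0.0]])
Z = np.array([[1.0, 0.0], [0.0, -1.0]])

perm = [0, 3, 1, 2]   # hypothetical ordering: (A @ psi)[k] == psi[perm[k]]
A = np.eye(4)[perm]   # permutation matrix built from rows of the identity

R_reflect, R_rotate = np.kron(I2, X), np.kron(X, X)

# Eq. (6): R'(g) = A R(g) A^{-1}, with A^{-1} = A^T for a permutation matrix
Rp_reflect = A @ R_reflect @ A.T
Rp_rotate = A @ R_rotate @ A.T

# An operator in the commutator space of R(g) is carried by A
# into the commutator space of R'(g)
H = np.kron(Z, Z)     # commutes with X (x) X
Hp = A @ H @ A.T      # equals Z (x) I, which commutes with I (x) X
```

In this example `Rp_reflect` equals $X\otimes\mathbb{I}$, `Rp_rotate` equals $\mathbb{I}\otimes X$, and the transformed generator `Hp` commutes with `Rp_rotate`.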

However, the EQCNN architecture is a special case of QCNN and differs from a general QNN in the following aspects.

  1. $\mathcal{U}_{\theta}$ is composed of $m$-qubit local unitary ansatze which are equivariant with respect to the locally acting components of the full symmetry representation. Whether these locally equivariant ansatze can realize the full set of globally equivariant ansatze depends on the particular group and its representations [63].

  2. Due to translational symmetry, in one particular layer of EQCNN all local ansatze must be the same. However, the local symmetry representations may not be the same for all groups of $m$ qubits. Let us suppose that in the $i^{\mathrm{th}}$ layer there are $s$ distinct local representations $r_{i}^{j}(g)$ ($j=1,2,\dots,s$) and the corresponding sets of equivariant generators are $\sigma_{i}^{j}$. Then the local ansatze in this layer must be generated from the elements which are common to all $\sigma_{i}^{j}$. Thus, the translation symmetry hinders the use of the full set of equivariant generators with respect to the local symmetries and reduces the expressibility of the network. In other words, the EQCNN is capable of taking advantage of only some of the local symmetries that exist in groups of $m$ qubits and not the global or more-than-$m$-qubit symmetries.

  3. This dependence on local symmetries becomes more significant after each pooling layer when some of the qubits are traced out. It means the local ansatze are not applied on all the local groups of qubits.
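The restriction described in the second point can be made concrete with a toy layer containing $s=2$ distinct two-qubit local representations. The choice $r^{1}=X\otimes X$ and $r^{2}=\mathbb{I}\otimes X$ below is hypothetical, and the generator sets are restricted to Pauli strings for simplicity; the point is only that the shared generator set is the intersection of the individual sets.

```python
import itertools
import numpy as np

PAULIS = {
    "I": np.eye(2, dtype=complex),
    "X": np.array([[0, 1], [1, 0]], dtype=complex),
    "Y": np.array([[0, -1j], [1j, 0]], dtype=complex),
    "Z": np.array([[1, 0], [0, -1]], dtype=complex),
}

def commuting_paulis(r):
    """Set of two-qubit Pauli strings that commute with the local representation r."""
    result = set()
    for a, b in itertools.product("IXYZ", repeat=2):
        h = np.kron(PAULIS[a], PAULIS[b])
        if np.allclose(r @ h, h @ r):
            result.add(a + b)
    return result

# Hypothetical layer with two distinct local representations
r1 = np.kron(PAULIS["X"], PAULIS["X"])
r2 = np.kron(PAULIS["I"], PAULIS["X"])

sigma_1 = commuting_paulis(r1)  # 8 strings
sigma_2 = commuting_paulis(r2)  # 8 strings
shared = sigma_1 & sigma_2      # generators usable by a translation-invariant ansatz
```

Each local representation alone admits eight Pauli generators, but only `{'II', 'IX', 'XI', 'XX'}` are common to both, which illustrates how sharing one ansatz across the layer reduces expressibility.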

From the discussion above, it is evident that EQCNN ansatze depend on the local symmetries and the reduced symmetries in the subsequent layers. Thus, a change in the representation $R(g)$ is expected to vary the expressibility. We note here again that the changes in representation and expressibility are accompanied by a basis permutation of the input state itself. Though the set of input states is different for different representations, the correlations between images related by a symmetry operation remain unchanged.

There are a few earlier works where a particular basis-permuted embedding was used in order to facilitate construction of the ansatze that satisfy Eq. (3) [55, 54]. In this work, our aim is to compare the non-equivariant QCNN with standard AE to EQCNNs with a number of basis-permuted AEs. In order to do this, we first choose a particular $R^{\prime}(g)$ and then we find the corresponding $A$ to be applied to the standard amplitude-encoded state $|\Psi\rangle$ to get the basis-permuted embedding $|\Psi^{\prime}\rangle$. Observing that matrices related by a similarity transformation have the same eigenvalues, we can write

$$\begin{aligned} M^{-1}R(g)M &= N^{-1}R^{\prime}(g)N \\ \Rightarrow\; R^{\prime}(g) &= NM^{-1}R(g)MN^{-1}, \end{aligned} \tag{7}$$

where M𝑀Mitalic_M and N𝑁Nitalic_N are matrices constituted of eigenvectors of respectively R(g)𝑅𝑔R(g)italic_R ( italic_g ) and R(g)superscript𝑅𝑔R^{\prime}(g)italic_R start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT ( italic_g ). Thus, the desired value of A𝐴Aitalic_A is

A=NM1.𝐴𝑁superscript𝑀1A=NM^{-1}.italic_A = italic_N italic_M start_POSTSUPERSCRIPT - 1 end_POSTSUPERSCRIPT . (8)
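As a small numerical check of Eqs. (7) and (8), the sketch below builds $A=NM^{-1}$ for an illustrative two-qubit pair $R(g)=\mathbb{I}\otimes X$ and $R'(g)=X\otimes X$. The choice of this pair, the hand-picked eigenvectors and the helper names are assumptions made for illustration only. The eigenvectors are listed in the same eigenvalue order in $M$ and $N$, so that both sides of the first equality in Eq. (7) are the same diagonal matrix.

```python
from math import sqrt

def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(a):
    return [list(col) for col in zip(*a)]

def kron(a, b):
    return [[x * y for x in ra for y in rb] for ra in a for rb in b]

def tensor(u, v):
    return [x * y for x in u for y in v]

I2 = [[1, 0], [0, 1]]
X = [[0, 1], [1, 0]]
s = 1 / sqrt(2)
zero, one = [1, 0], [0, 1]
plus, minus = [s, s], [s, -s]

R = kron(I2, X)        # illustrative R(g): X on the second qubit only
R_prime = kron(X, X)   # illustrative R'(g): X on both qubits

# Columns are orthonormal eigenvectors, ordered by eigenvalue (+1, +1, -1, -1).
M = transpose([tensor(zero, plus), tensor(one, plus),
               tensor(zero, minus), tensor(one, minus)])
N = transpose([tensor(plus, plus), tensor(minus, minus),
               tensor(plus, minus), tensor(minus, plus)])

# For real orthonormal eigenvectors, the inverse is the transpose.
A = matmul(N, transpose(M))
A_inv = matmul(M, transpose(N))

lhs = matmul(matmul(A, R), A_inv)
max_err = max(abs(lhs[i][j] - R_prime[i][j]) for i in range(4) for j in range(4))
print("max |A R A^-1 - R'| =", max_err)
```

Because the eigenvalues $\pm 1$ are degenerate, the eigenvectors, and hence $A$, are not unique. The $A$ obtained from this particular eigenvector choice is not a permutation matrix; a different choice of eigenvectors within each eigenspace can yield a basis permutation.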

In the next section we discuss in detail the architecture of our EQCNN and non-equivariant QCNN for different embeddings.

Figure 4: (a) The convolutional ansatze and (b) the pooling ansatze used in this work. The specific use case of each ansatz is discussed in Sec. III.
Table 1: The datasets, their quantum embeddings, the corresponding circuit indices of the convolutional and pooling ansatze from Fig. 4, and the measurements $M$ used in the construction of the EQCNN. Here we use AE, AE 1 and AE 2 to indicate respectively standard AE, basis-permuted AE 1 and basis-permuted AE 2.

| Dataset | # Qubits | Embedding | $U_1$ | $V_1$ | $U_2$ | $V_2$ | $U_3$ | $V_3$ | $U_4$ | $V_4$ | $M$ |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Fashion MNIST | 8 | AE | 1 | 6 | 1 | 6 | 4 | 7 | - | - | $\sigma_Z$ |
| Fashion MNIST | 8 | AE 1 | 1 | 6 | 2 | 6 | 3 | 6 | - | - | $\sigma_X$ |
| Fashion MNIST | 8 | AE 2 | 1 | 7 | 5 | 9 | 5 | 9 | - | - | $\sigma_Z$ |
| Cifar10/Blood MNIST | 10 | AE/AE 1 | 1 | 6 | 1 | 7 | 1 | 9 | 4 | 8 | $\sigma_Z$ |
| Cifar10/Blood MNIST | 10 | AE 1/AE | 1 | 6 | 2 | 6 | 3 | 6 | 3 | 6 | $\sigma_X$ |
| Cifar10/Blood MNIST | 10 | AE 2 | 1 | 7 | 5 | 9 | 5 | 9 | 5 | 9 | $\sigma_Z$ |

III EQCNN with different data embeddings

We consider the classification of images whose labels remain invariant under a reflection about the vertical axis or under a $180^{\circ}$ rotation. In both cases, the underlying group has only two elements: the identity operation, which keeps the image unchanged, and the operation corresponding to the reflection or rotation. They form the abstract group $\mathbb{Z}_2$. In our experiments, for the reflection-symmetric images we use classes 0 (T-shirt/top) and 1 (trouser) of the Fashion MNIST dataset, and classes 1 (car) and 2 (bird) of the Cifar10 dataset. For rotationally symmetric images, we perform classification between classes 1 and 6, as well as between classes 4 and 5, of the Blood MNIST dataset. The Fashion MNIST dataset is a collection of 70000 greyscale images of clothing articles and accessories. Each image has dimension $28\times 28$ pixels and the dataset has 10 classes in total. The Cifar10 dataset has 60000 RGB images of animals and vehicles divided into 10 classes, each image having dimension $32\times 32$ pixels. The Blood MNIST dataset is a collection of $32\times 32$ pixel RGB images of human blood cells divided into 8 classes. Each class denotes a cell type and contains from a few hundred to about a thousand images. We downsize the Fashion MNIST images to $16\times 16$ pixels and encode them using 8 qubits. For the Cifar10 and Blood MNIST images, we keep the dimension unchanged but convert them from RGB to greyscale, and we use 10 qubits for the embedding. In all cases, we use two-qubit convolutional ansatze and assume periodicity in the qubit register, i.e. the first and the last qubits are nearest neighbours.

For the non-equivariant QCNN we use standard amplitude embedding. To construct $\mathcal{U}_{\theta}$, one can choose from a large number of non-equivariant convolutional and pooling ansatze, and this choice may affect the overall performance of the QCNN. In this work, we build the two-qubit convolutional ansatz by applying a parametrized arbitrary rotation on each qubit and a $CNOT$ gate to entangle them, as shown in circuit 3 in Fig. 4, where $\mathrm{Rot}(\theta_1,\theta_2,\theta_3)=R_z(\theta_3)R_y(\theta_2)R_z(\theta_1)$. As the pooling ansatz we use parametrized controlled-$z$ and controlled-$x$ rotations on the target qubit, applied when the control qubit is in the computational basis state $|1\rangle$ and $|0\rangle$, respectively. This is presented as circuit 9 in Fig. 4. In the last step we measure the remaining qubit 1 in the $z$ basis.
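The single-qubit rotation $\mathrm{Rot}(\theta_1,\theta_2,\theta_3)=R_z(\theta_3)R_y(\theta_2)R_z(\theta_1)$ can be written out explicitly. The following is a minimal sketch in plain Python, assuming the usual conventions $R_z(\theta)=\mathrm{diag}(e^{-i\theta/2},e^{i\theta/2})$ and $R_y(\theta)=e^{-i\theta\sigma_Y/2}$; the helper names are illustrative. To our understanding this ordering agrees with PennyLane's `qml.Rot(phi, theta, omega)`.

```python
import cmath
from math import cos, sin

def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def rz(theta):
    return [[cmath.exp(-0.5j * theta), 0], [0, cmath.exp(0.5j * theta)]]

def ry(theta):
    c, s = cos(theta / 2), sin(theta / 2)
    return [[c, -s], [s, c]]

def rot(theta1, theta2, theta3):
    """Rot(theta1, theta2, theta3) = Rz(theta3) Ry(theta2) Rz(theta1)."""
    return matmul(rz(theta3), matmul(ry(theta2), rz(theta1)))

def is_unitary(u, tol=1e-12):
    dagger = [[u[j][i].conjugate() for j in range(2)] for i in range(2)]
    prod = matmul(dagger, u)
    return all(abs(prod[i][j] - (1 if i == j else 0)) < tol
               for i in range(2) for j in range(2))

# Arbitrary angles chosen only for illustration.
U = rot(0.3, 1.1, -0.7)
print(is_unitary(U))
```

Each such rotation carries three trainable angles, so applying one on each of the two qubits gives six parameters per convolutional ansatz.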

For the EQCNN, we use standard AE along with two different basis-permuted embeddings, namely basis-permuted AE 1 and basis-permuted AE 2.

III.0.1 Standard AE

In standard amplitude embedding of 2D square images using $n$ qubits, the representation of the reflection group is

$$\mathcal{G}_{\mathrm{ref}}\equiv\{\mathbb{I}^{n},\ \mathbb{I}_1\otimes\mathbb{I}_2\otimes\dots\otimes\mathbb{I}_{n/2}\otimes X_{(n/2)+1}\otimes\dots\otimes X_n\}, \tag{9}$$

where $\mathbb{I}^{n}$ is the $n$-qubit identity operator, and $\mathbb{I}_i$ and $X_i$ denote respectively the identity operator and the Pauli-$X$ operator applied on the $i^{\mathrm{th}}$ qubit. The second element of $\mathcal{G}_{\mathrm{ref}}$, corresponding to the reflection operation, leaves the first half of the qubits unchanged and flips the second half. This is because, in the standard AE of square images, the amplitudes related to a single row span all the basis states with fixed values for the first half of the qubits and all possible combinations of $0$ and $1$ for the second half. For a rectangular $2^{n_1}\times 2^{n_2}$ image, the group representation becomes

$$\mathcal{G}_{\mathrm{ref}}\equiv\{\mathbb{I}^{n},\ \mathbb{I}_1\otimes\mathbb{I}_2\otimes\dots\otimes\mathbb{I}_{n_1}\otimes X_{n_1+1}\otimes\dots\otimes X_{n_1+n_2}\}. \tag{10}$$

Similarly, one can check that the representation of the $180^{\circ}$ rotation group for standard AE is

$$\mathcal{G}_{\mathrm{rot}}\equiv\{\mathbb{I}^{n},\ X_1\otimes X_2\otimes\dots\otimes X_n\}. \tag{11}$$
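Equations (9) and (11) can be checked directly on a small image. The sketch below assumes row-major flattening and that qubit 1 is the most significant bit of the basis-state index, so that the first $n/2$ qubits label the row and the last $n/2$ label the column; applying Pauli-$X$ on a set of qubits then amounts to an XOR of the basis index with a bit mask. The $4\times 4$ test image and the helper names are illustrative.

```python
def amplitude_vector(image):
    """Row-major flattening; normalization is omitted since it does not
    affect which basis state each pixel lands on."""
    return [pixel for row in image for pixel in row]

def apply_x_mask(amps, mask):
    """Apply Pauli-X on every qubit whose bit is set in mask."""
    return [amps[i ^ mask] for i in range(len(amps))]

side = 4                      # 4x4 image, i.e. n = 4 qubits
image = [[r * side + c for c in range(side)] for r in range(side)]

reflected = [row[::-1] for row in image]          # mirror about the vertical axis
rotated = [row[::-1] for row in image[::-1]]      # 180-degree rotation

psi = amplitude_vector(image)
ref_mask = side - 1            # X on the last n/2 qubits (column bits), Eq. (9)
rot_mask = side * side - 1     # X on all n qubits, Eq. (11)

print(apply_x_mask(psi, ref_mask) == amplitude_vector(reflected))
print(apply_x_mask(psi, rot_mask) == amplitude_vector(rotated))
```

Both comparisons hold because mirroring a row of length $2^{n/2}$ complements the column bits, while a $180^{\circ}$ rotation complements both the row and the column bits.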

III.0.2 Basis-permuted AE 1

Basis-permuted AE 1 swaps the representations of the reflection and rotation groups in Eq. (9) and Eq. (11). We choose a basis-permuting matrix $A$ such that the image is reflected when $X$ is applied on all the qubits. Thus the representation of the reflection group becomes

$$\mathcal{G}_{\mathrm{ref}}\equiv\{\mathbb{I}^{n},\ X_1\otimes X_2\otimes\dots\otimes X_n\}. \tag{12}$$

For images with rotational symmetry, we apply $A^{-1}$ to permute the basis vectors so that the representation of the $180^{\circ}$ rotation group becomes

$$\mathcal{G}_{\mathrm{rot}}\equiv\{\mathbb{I}^{n},\ \mathbb{I}_1\otimes\mathbb{I}_2\otimes\dots\otimes\mathbb{I}_{n/2}\otimes X_{(n/2)+1}\otimes\dots\otimes X_n\}. \tag{13}$$
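One possible basis permutation with this property can be sketched as follows. It is a hypothetical choice made for illustration and not necessarily the $A$ used in our experiments: each of the first $n/2$ bits of the basis index is XORed with the matching bit of the second half, which corresponds to a layer of CNOT gates and is therefore its own inverse. As above, qubit 1 is assumed to be the most significant bit.

```python
def cnot_permutation(index, n):
    """Hypothetical basis permutation: XOR each of the first n/2 bits with the
    matching bit of the second half (a layer of CNOTs, so it is self-inverse)."""
    half = n // 2
    low = index & ((1 << half) - 1)      # second-half qubits (column bits)
    high = index >> half                 # first-half qubits (row bits)
    return ((high ^ low) << half) | low

def conjugated_mask(mask, n, perm):
    """Return m' such that perm(i ^ mask) == perm(i) ^ m' for all i, else None."""
    m_prime = perm(mask, n) ^ perm(0, n)
    ok = all(perm(i ^ mask, n) == perm(i, n) ^ m_prime for i in range(1 << n))
    return m_prime if ok else None

n = 4
ref_mask = (1 << (n // 2)) - 1   # standard AE reflection: X on the second half
rot_mask = (1 << n) - 1          # standard AE rotation: X on all qubits

new_ref = conjugated_mask(ref_mask, n, cnot_permutation)
new_rot = conjugated_mask(rot_mask, n, cnot_permutation)
print(format(new_ref, f"0{n}b"), format(new_rot, f"0{n}b"))
```

Under this permutation the reflection acts as $X$ on all qubits, as in Eq. (12), and the rotation acts as $X$ on the second half only, as in Eq. (13). Since the permutation is self-inverse, the distinction between $A$ and $A^{-1}$ plays no role in this particular sketch.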

III.0.3 Basis-permuted AE 2

We permute the basis vectors in such a way that the representation of both the reflection and the rotation group becomes

$$\mathcal{G}_{\mathrm{ref/rot}}\equiv\{\mathbb{I}^{n},\ \mathbb{I}_1\otimes X_2\otimes\mathbb{I}_3\otimes X_4\otimes\dots\otimes\mathbb{I}_{n-1}\otimes X_n\}, \tag{14}$$

i.e., the identity operator and the Pauli-$X$ operator are applied alternately on the qubits. The matrix $A$, however, is different for the two groups in this case.

Figure 5: Test set accuracy obtained from equivariant and non-equivariant QCNNs when using the maximum possible set of equivariant generators. The vertical axis shows the average accuracy obtained over 10 randomly initialized runs and the standard deviation is indicated by the shaded area. (a) Classification of classes $0$ and $1$ of the Fashion MNIST dataset. The inset shows a magnified part of the same plot with the horizontal axis spanning a few hundred iterations. (b) Classification of classes $1$ and $2$ of the Cifar10 dataset.

Now we describe in detail the construction of the EQCNNs in these three cases. In each case, we try to use the largest possible set of equivariant generators to construct the equivariant convolutional and pooling ansatze; the resulting choices are summarized in Table 1. However, we also investigate a scenario in which only a subset of the equivariant generators is used. All the convolutional and pooling ansatze used in this work are presented in Fig. 4.

Let us consider the standard AE for the reflection group, for which the representation is given by Eq. (9). In the first layer of the EQCNN, equivariant convolutional and pooling layers can be built if $U_1$ and $V_1$ commute with all the possible two-qubit local group representations within $\mathcal{G}_{\mathrm{ref}}$, which are $\{\mathbb{I}\otimes\mathbb{I},\ \mathbb{I}\otimes X,\ X\otimes\mathbb{I},\ X\otimes X\}$. To satisfy the commuting condition, we use circuit 1 in Fig. 4(a) for $U_1$ and circuit 6 in Fig. 4(b) for $V_1$ (note that the control qubit is in the $x$ basis). One can check that these are the only ansatze that commute with all local representations in this layer. In circuit 6, we have used two $R_x$ gates with different parameters to draw an analogy with the other pooling circuits; however, one can merge them into a single parametrized $R_x$ gate that is applied whether the control qubit is in $|0\rangle$ or $|1\rangle$.

Afterwards, we trace out the even-indexed qubits, leaving the reduced representation $\{\mathbb{I}^{5},\ \mathbb{I}_1\otimes\mathbb{I}_3\otimes\mathbb{I}_5\otimes X_7\otimes X_9\}$ for $n=10$ (Cifar10 dataset) and $\{\mathbb{I}^{4},\ \mathbb{I}_1\otimes\mathbb{I}_3\otimes X_5\otimes X_7\}$ for $n=8$ (Fashion MNIST dataset). The two-qubit local representations for the convolutional ansatze remain unchanged, so in the second layer we still use circuit 1 for $U_2$. In the second pooling layer, we trace out qubits 3 and 7 and apply conditional rotations on qubits 1 and 5. The local representations are $\{\mathbb{I}_1\otimes\mathbb{I}_3,\ \mathbb{I}_5\otimes X_7\}$ for $n=10$ and $\{\mathbb{I}_1\otimes\mathbb{I}_3,\ X_5\otimes X_7\}$ for $n=8$. For the former, both circuit 7 and circuit 8 are equivariant; however, since we can use only one ansatz in a pooling layer, we choose circuit 7 without loss of generality. For $n=8$ we use circuit 6 as $V_2$.

In the third layer, the reduced representations are $\{\mathbb{I}^{3},\ \mathbb{I}_1\otimes\mathbb{I}_5\otimes X_9\}$ for $n=10$ and $\{\mathbb{I}^{2},\ \mathbb{I}_1\otimes X_5\}$ for $n=8$. For the former, we use circuit 1 as $U_3$. For pooling, we trace out qubit 5 after applying a conditional rotation on qubit 1, so $V_3$ should commute with $\{\mathbb{I}_1\otimes\mathbb{I}_5\}$. Here we are free to use any pooling ansatz, so we choose the same one as in the non-equivariant QCNN, i.e. circuit 9. For $n=8$, $U_3$ should commute with $\{\mathbb{I}_1\otimes X_5\}$, so both circuit 1 and circuit 4 are equivariant ansatze. Since we have already used circuit 1 in the previous layers, we use circuit 4 for $U_3$ and circuit 7 for $V_3$. At this point, for $n=8$ we measure the remaining qubit 1 in the $z$ basis to obtain the result. For $n=10$, instead, two qubits are still left, with the reduced representation $\{\mathbb{I}^{2},\ \mathbb{I}_1\otimes X_9\}$. We apply a further layer with circuit 4 as $U_4$. For the pooling layer, this time we choose circuit 8 as $V_4$. We measure the remaining qubit 1 in the $z$ basis.
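The layer-by-layer reduced representations for the standard AE reflection group can be traced with a short sketch. It assumes that each pooling layer keeps every other remaining qubit starting from qubit 1, with an unpaired last qubit carried over, and it represents the non-trivial group element as a map from qubit label to Pauli letter; the function names are illustrative.

```python
def restrict(rep, kept):
    """Restrict a Pauli-string representation to the kept qubits (1-indexed labels)."""
    return {q: rep[q] for q in kept}

def reflection_rep(n):
    """Eq. (9): identity on the first n/2 qubits, X on the rest."""
    return {q: ("I" if q <= n // 2 else "X") for q in range(1, n + 1)}

def pooling_schedule(n):
    """Yield the qubit labels that survive each pooling layer, assuming every
    pooling layer keeps every other remaining qubit starting from qubit 1."""
    qubits = list(range(1, n + 1))
    while len(qubits) > 1:
        qubits = qubits[::2]
        yield qubits

for n in (8, 10):
    rep = reflection_rep(n)
    for layer, kept in enumerate(pooling_schedule(n), start=1):
        rep = restrict(rep, kept)
        # Columns: number of qubits n, pooling layer, reduced non-trivial element.
        print(n, layer, " ".join(f"{p}{q}" for q, p in rep.items()))
```

For $n=8$ the surviving qubits are $\{1,3,5,7\}$, $\{1,5\}$ and $\{1\}$; for $n=10$ they are $\{1,3,5,7,9\}$, $\{1,5,9\}$, $\{1,9\}$ and $\{1\}$, which is why the $n=10$ network needs a fourth layer.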

Figure 6: Test set accuracy of Blood MNIST images obtained from equivariant and non-equivariant QCNNs when using the maximum possible set of equivariant generators. The vertical axis shows the average accuracy obtained over 10 randomly initialized runs and the standard deviation is indicated by the shaded area.

For the $180^{\circ}$ rotation group with standard AE, the reduced representation at every layer is a tensor product of Pauli-$X$ matrices. Thus, $U_i$ and $V_i$ must commute with $\{\mathbb{I}^{\otimes 2},\ X\otimes X\}$ in all layers. The convolutional ansatze that satisfy this condition are circuit 1, circuit 2 and circuit 3 in Fig. 4(a). Therefore, we use circuit 1 for $U_1$, circuit 2 for $U_2$, and circuit 3 for $U_3$ and $U_4$. For the pooling ansatz, we are left with the unique choice of circuit 6 for $V_i$ at every layer. At the end, to obtain the prediction, we measure the remaining qubit 1 in the $x$ basis.

For basis-permuted AE 1, we just swap the EQCNN structure for reflection and rotation group discussed above.

For basis-permuted AE 2, in the first layer $U_1$ must commute with $\{\mathbb{I}\otimes X,\ X\otimes\mathbb{I}\}$ and $V_1$ must commute with $\{\mathbb{I}\otimes X\}$; thus we use circuit 1 and circuit 7, respectively. In all the subsequent layers, the reduced group representation is simply $\{\mathbb{I}^{\otimes k}\}$, $k$ being the number of qubits in each layer. Since no equivariance constraint is imposed, we use circuit 5 for $U_i$ and circuit 9 for $V_i$ for $i>1$. In the end, we measure the remaining qubit in the $z$ basis to obtain the result.

Note that one can obtain the same EQCNN for the symmetry representations in Eq. (9) and Eq. (14) by changing the qubits on which the first pooling layer acts. To elaborate, one can choose to trace out qubits $\{6,7,8,9,10\}$ in the first pooling layer when the representation is that in Eq. (9). However, we consider a more realistic scenario in which the two-qubit gate connectivity is constrained by the underlying architecture of the quantum hardware. In our case, we implicitly assume that the gate connectivity is exactly the one shown in Fig. 1. Therefore, our convolutional and pooling layers act on different sets of qubits for these two representations.

Figure 7: Test set accuracy of (a) Fashion MNIST and (b) Cifar10 images obtained from equivariant and non-equivariant QCNNs when using a subset of the equivariant generators. The other details are the same as in the caption of Fig. 5.
Figure 8: Test set accuracy obtained from equivariant and non-equivariant QCNNs for the Blood MNIST dataset when using a subset of the equivariant generators. All other details are the same as in the caption of Fig. 6.

IV Results

We have used the PennyLane [64] quantum simulator to implement the networks. In this work we consider only binary classification. For each dataset we choose a batch size $p$, which is the number of randomly sampled images used for one training iteration. As the loss function, we use the mean squared error (MSE), which is defined as

$$C=\frac{1}{p}\sum_{i=1}^{p}\left(f'_{\theta}(x_i)-y_i\right)^2, \tag{15}$$

where $f'_{\theta}(x_i)$ is the expectation value of the measurement operator for the $i^{\mathrm{th}}$ input quantum state in the batch. Since the expectation values of Pauli operators lie in the range $[-1,1]$, the class labels of the two classes are mapped to $y_i\in\{-1,1\}$ for all the datasets. We use the Nesterov momentum optimizer for training with a learning rate of 0.01. We train the network for 2000 or 3000 iterations depending on the dataset. This scheme is run for 10 instances in parallel, each with randomly initialized parameters. We calculate the average test accuracies and standard deviations over these parallel runs every 10 iterations.
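The loss in Eq. (15) and the accuracy computed from the expectation values can be sketched in a few lines. The decision rule used here, taking the sign of the expectation value as the predicted label, is an assumption made for illustration, as are the function names and the batch values.

```python
def mse_loss(expectations, labels):
    """Eq. (15): mean squared error over a batch of size p."""
    p = len(labels)
    return sum((f - y) ** 2 for f, y in zip(expectations, labels)) / p

def predict(expectation):
    """Assumed decision rule: the sign of the expectation value picks the class."""
    return 1 if expectation >= 0 else -1

def accuracy(expectations, labels):
    hits = sum(predict(f) == y for f, y in zip(expectations, labels))
    return hits / len(labels)

# Hypothetical expectation values for a batch of p = 4 inputs.
f_values = [0.8, -0.2, 0.1, -0.9]
y_values = [1, -1, -1, -1]

print(mse_loss(f_values, y_values))
print(accuracy(f_values, y_values))
```

In an actual training run the expectation values would come from the simulated circuit and the loss would be minimized by the optimizer; this sketch only shows how the batch loss and the accuracy are evaluated.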

First, we discuss the results when the maximum possible set of equivariant generators is used. In the following paragraphs, we use ‘AE 1’ and ‘AE 2’ to denote basis-permuted AE 1 and basis-permuted AE 2, respectively. For the Fashion MNIST dataset, the training is performed for $2000$ iterations with batches of 32 randomly sampled images. We plot the average test set accuracies and standard deviations of the different equivariant and non-equivariant models against the number of iterations in Fig. 5(a). Even though Fashion MNIST is a relatively simple greyscale dataset, the advantage of using an EQCNN over a non-equivariant QCNN is clear from the plots. Standard AE and AE 2 reach a higher accuracy in far fewer iterations. The performance of AE 1 lags behind and is comparable to that of the non-equivariant QCNN. All EQCNNs have a significantly lower standard deviation than the non-equivariant QCNN.

In Fig. 5(b) we present the test set accuracies for Cifar10 images. In this case, the network is trained for $3000$ iterations with batch size 64. We again observe very fast convergence and higher accuracy for standard AE and AE 2 at low numbers of iterations, as well as a very low standard deviation. In comparison, AE 1 shows slower convergence, lower accuracy and a high standard deviation. Overall, the non-equivariant QCNN achieves the highest accuracy for longer training.

The results for the Blood MNIST dataset are presented in Fig. 6. We train the network for 2000 iterations, with batch sizes 32 and 64 for classifying between classes $\{4,5\}$ and classes $\{1,6\}$, respectively. Recall that in this case the representations corresponding to standard AE and AE 1 are swapped. Let us first consider the classification of classes 4 and 5. The test accuracies behave similarly to those of the reflection-symmetric datasets, in that AE 1 and AE 2 perform better than the others at low numbers of iterations. For longer training, all equivariant and non-equivariant QCNNs perform equally well. For the classification of classes 1 and 6, in contrast to all other results, AE 1 has significantly lower accuracy at low numbers of iterations, while all other embeddings perform equally well.

Let us now see how the above trends in accuracy change when using a subset of the equivariant generators. For this, we replace all uses of circuit 2 and circuit 3 by circuit 1. Thus, for reflection-symmetric images, the EQCNN corresponding to AE 1 is now generated only from $R_x$ and $R_{xx}$ gates. The same applies to the rotationally equivariant EQCNN corresponding to standard AE. We also replace all uses of circuit 7 by circuit 8, i.e. all pooling layers are generated from $R_z$ and $R_x$ gates. The resulting behaviour is presented in Fig. 7 and Fig. 8. For reflection-symmetric images, this improves the accuracy obtained from AE 1, which either surpasses or closely matches that obtained from the non-equivariant QCNN. For both cases of the Blood MNIST dataset, the accuracy of AE 1 decreases significantly.

We note that the non-equivariant convolutional ansatz has six trainable parameters, compared to three trainable parameters in each of the equivariant convolutional ansatze. To compare the two QCNNs with an equal number of trainable parameters, we append circuit 2 to circuit 1 to build a new convolutional ansatz with six trainable parameters, and use it in all the layers of the EQCNN with AE 1 for the classification of the Cifar10 dataset. In Fig. 9, we present the resulting behaviour, which shows a higher accuracy for the EQCNN than for the non-equivariant QCNN.

Overall, we observe that the performance of an EQCNN varies depending on the classical-to-quantum embedding. In particular, when the group representation is a tensor product of Pauli-$X$ matrices acting on all the qubits, the EQCNN has a lower accuracy. On the other hand, when the representation is a tensor product of Pauli-$X$ matrices acting on half of the qubits and the identity operator acting on the other half, the EQCNNs converge faster and reach a higher accuracy. For the latter, the pooling choice can make a small difference in the initial training regime, i.e. the EQCNNs show slightly different performance when the pooling layers act on different sets of qubits. We note, however, that this behaviour may not appear for every dataset, as is evident from Fig. 5.

Figure 9: Test set accuracy of Cifar10 for non-equivariant QCNN and EQCNN with 6 trainable parameters in the convolutional ansatze.

V Conclusion

Quantum machine learning, although still in its infancy, has already proved useful in designing novel ML-based algorithms and has shown some advantages over classical ML. However, a number of open problems remain in understanding how to ensure sufficiently good performance from a QNN. Equivariant QNNs are promising candidates for improving the training and generalization of quantum machine learning algorithms. A typical application of quantum machine learning is image classification, a ubiquitous task in many daily-life scenarios. For this task, it is possible to construct an equivariant QCNN that is compatible with the label symmetry of the images while also respecting the general translational symmetry of 2D images. In this work, we have explored the connections between the classical-to-quantum embedding of images, the resulting representation of a symmetry group, and the structure of the EQCNN respecting that symmetry. We considered datasets of images characterized by reflection and rotation symmetry, and different amplitude embeddings of these images obtained by basis permutation. Our theoretical observations indicate that the local representations play a crucial role in determining the equivariant ansatze to be used and hence the expressibility of the EQCNN. Our numerical results support this by showing that the test set classification accuracy varies widely across embeddings. It will be interesting to explore whether, instead of 2-qubit local ansatze, an $m$-qubit ansatz with $m>2$ can reduce the dependence of the EQCNN on local symmetries. It is also possible to compare amplitude embedding with other kinds of embedding, e.g. qubit embedding and dense qubit embedding [62], to investigate their effect on EQCNN performance. Finally, one can also run the QCNN circuits on real quantum hardware, for example by using the corresponding plugins provided by PennyLane. In this case, it will be interesting to see whether EQCNNs have better noise robustness than non-equivariant QCNNs, owing to the reduced number of parametrized gates used in the former.
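The link between embedding, representation, and admissible local gates summarized above can be illustrated with a small NumPy sketch. It is only a toy example under stated assumptions: a random 4x4 image on 4 qubits, a reflection that flips the column order, and a hypothetical "interleaved" basis permutation chosen for illustration (it is not claimed to be one of the embeddings studied in this work). The sketch also assumes the common construction in which equivariant 2-qubit gates are generated by Pauli strings that commute with the local representation; the helper names `amplitude_embed`, `kron_all`, and `commuting_paulis` are introduced here for illustration.

```python
import itertools
import numpy as np

I2 = np.eye(2)
X = np.array([[0.0, 1.0], [1.0, 0.0]])
Y = np.array([[0.0, -1.0j], [1.0j, 0.0]])
Z = np.diag([1.0, -1.0])
PAULIS = {"I": I2, "X": X, "Y": Y, "Z": Z}


def kron_all(*ops):
    """Tensor product of single-qubit operators, first argument = first qubit."""
    out = np.eye(1)
    for op in ops:
        out = np.kron(out, op)
    return out


def amplitude_embed(image, perm=None):
    """Row-major flattening, optional basis permutation, L2 normalisation."""
    vec = image.astype(float).ravel()
    if perm is not None:
        vec = vec[perm]  # new basis state k carries the old amplitude perm[k]
    return vec / np.linalg.norm(vec)


def commuting_paulis(local_rep):
    """Two-qubit Pauli strings that commute with a 4x4 local representation."""
    found = set()
    for a, b in itertools.product("IXYZ", repeat=2):
        g = np.kron(PAULIS[a], PAULIS[b])
        if np.allclose(g @ local_rep, local_rep @ g):
            found.add(a + b)
    return found


rng = np.random.default_rng(0)
image = rng.random((4, 4)) + 0.1   # toy 4x4 "image"
flipped = image[:, ::-1]           # reflection: reverse the column order

# Row-major embedding: qubits are (r1, r0, c1, c0), so the flip acts as I I X X.
psi = amplitude_embed(image)
psi_flipped = amplitude_embed(flipped)
rep_rowmajor = kron_all(I2, I2, X, X)
assert np.allclose(rep_rowmajor @ psi, psi_flipped)

# Hypothetical basis permutation: relabel the qubits as (r1, c1, r0, c0).
perm = np.arange(16).reshape(2, 2, 2, 2).transpose(0, 2, 1, 3).ravel()
P = np.eye(16)[perm]                        # (P v)[k] = v[perm[k]]
rep_interleaved = P @ rep_rowmajor @ P.T    # same symmetry, new representation
assert np.allclose(rep_interleaved, kron_all(I2, X, I2, X))

# Local representation seen by a gate on the last two qubits in each embedding.
gens_rowmajor = commuting_paulis(np.kron(X, X))
gens_interleaved = commuting_paulis(np.kron(I2, X))
print(sorted(gens_rowmajor))
print(sorted(gens_interleaved))
```

In this toy setting the same reflection is represented by different operators in the two embeddings, and the set of Pauli generators allowed on the last qubit pair changes accordingly, which is the mechanism by which the embedding can influence the expressibility of an equivariant ansatz.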

Acknowledgements.
This work was supported by the European Commission’s Horizon Europe Framework Programme under the Research and Innovation Action GA n. 101070546–MUQUABIS, by the European Union’s Horizon 2020 research and innovation programme under FET-OPEN GA n. 828946–PATHOS, by the European Defence Agency under the project Q-LAMPS Contract No B PRJ- RT-989, and by the MUR Progetti di Ricerca di Rilevante Interesse Nazionale (PRIN) Bando 2022 - project n. 20227HSE83 – ThAI-MIA funded by the European Union - Next Generation EU. S.M. acknowledges financial support from PNRR MUR project PE0000023-NQSTI. S.D. and S.M. thank Paolo Braccia for useful discussions regarding some of the initial ideas.

References

  • Shor [1997] P. W. Shor, Polynomial-time algorithms for prime factorization and discrete logarithms on a quantum computer, SIAM Journal on Computing 26, 1484 (1997), https://doi.org/10.1137/S0097539795293172.
  • Grover [1996] L. K. Grover, A fast quantum mechanical algorithm for database search, in Proceedings of the Twenty-Eighth Annual ACM Symposium on Theory of Computing, STOC ’96 (Association for Computing Machinery, New York, NY, USA, 1996) p. 212–219.
  • Arute et al. [2019] F. Arute, K. Arya, R. Babbush, D. Bacon, J. C. Bardin, R. Barends, R. Biswas, S. Boixo, F. G. S. L. Brandao, D. A. Buell, B. Burkett, Y. Chen, Z. Chen, B. Chiaro, R. Collins, W. Courtney, A. Dunsworth, E. Farhi, B. Foxen, A. Fowler, C. Gidney, M. Giustina, R. Graff, K. Guerin, S. Habegger, M. P. Harrigan, M. J. Hartmann, A. Ho, M. Hoffmann, T. Huang, T. S. Humble, S. V. Isakov, E. Jeffrey, Z. Jiang, D. Kafri, K. Kechedzhi, J. Kelly, P. V. Klimov, S. Knysh, A. Korotkov, F. Kostritsa, D. Landhuis, M. Lindmark, E. Lucero, D. Lyakh, S. Mandrà, J. R. McClean, M. McEwen, A. Megrant, X. Mi, K. Michielsen, M. Mohseni, J. Mutus, O. Naaman, M. Neeley, C. Neill, M. Y. Niu, E. Ostby, A. Petukhov, J. C. Platt, C. Quintana, E. G. Rieffel, P. Roushan, N. C. Rubin, D. Sank, K. J. Satzinger, V. Smelyanskiy, K. J. Sung, M. D. Trevithick, A. Vainsencher, B. Villalonga, T. White, Z. J. Yao, P. Yeh, A. Zalcman, H. Neven, and J. M. Martinis, Quantum supremacy using a programmable superconducting processor, Nature 574, 505 (2019).
  • Preskill [2018] J. Preskill, Quantum Computing in the NISQ era and beyond, Quantum 2, 79 (2018).
  • Huang et al. [2022a] H.-Y. Huang, R. Kueng, G. Torlai, V. V. Albert, and J. Preskill, Provably efficient machine learning for quantum many-body problems, Science 377, eabk3333 (2022a), https://www.science.org/doi/pdf/10.1126/science.abk3333.
  • Martina et al. [2022] S. Martina, L. Buffoni, S. Gherardini, and F. Caruso, Learning the noise fingerprint of quantum devices, Quantum Machine Intelligence 4, 8 (2022).
  • Martina et al. [2023a] S. Martina, S. Gherardini, and F. Caruso, Machine learning classification of non-Markovian noise disturbing quantum dynamics, Physica Scripta 98, 035104 (2023a).
  • Martina et al. [2023b] S. Martina, S. Hernández-Gómez, S. Gherardini, F. Caruso, and N. Fabbri, Deep learning enhanced noise spectroscopy of a spin qubit environment, Machine Learning: Science and Technology 4, 02LT01 (2023b).
  • [9] E. Canonici, S. Martina, R. Mengoni, D. Ottaviani, and F. Caruso, Machine learning based noise characterization and correction on neutral atoms NISQ devices, Advanced Quantum Technologies n/a, 2300192, https://onlinelibrary.wiley.com/doi/pdf/10.1002/qute.202300192.
  • Dalla Pozza et al. [2022] N. Dalla Pozza, L. Buffoni, S. Martina, and F. Caruso, Quantum reinforcement learning: the maze problem, Quantum Machine Intelligence 4, 11 (2022).
  • Das et al. [2023] S. Das, J. Zhang, S. Martina, D. Suter, and F. Caruso, Quantum pattern recognition on real quantum processing units, Quantum Machine Intelligence 5, 16 (2023).
  • Schuld et al. [2015] M. Schuld, I. Sinayskiy, and F. Petruccione, An introduction to quantum machine learning, Contemporary Physics 56, 172 (2015), https://doi.org/10.1080/00107514.2014.964942.
  • Biamonte et al. [2017] J. Biamonte, P. Wittek, N. Pancotti, P. Rebentrost, N. Wiebe, and S. Lloyd, Quantum machine learning, Nature 549, 195 (2017).
  • Benedetti et al. [2019] M. Benedetti, E. Lloyd, S. Sack, and M. Fiorentini, Parameterized quantum circuits as machine learning models, Quantum Science and Technology 4, 043001 (2019).
  • Perdomo-Ortiz et al. [2018] A. Perdomo-Ortiz, M. Benedetti, J. Realpe-Gómez, and R. Biswas, Opportunities and challenges for quantum-assisted machine learning in near-term quantum computers, Quantum Science and Technology 3, 030502 (2018).
  • Schuld and Killoran [2022] M. Schuld and N. Killoran, Is quantum advantage the right goal for quantum machine learning?, PRX Quantum 3, 030101 (2022).
  • Parigi et al. [2023] M. Parigi, S. Martina, and F. Caruso, Quantum-noise-driven generative diffusion models (2023), arXiv:2308.12013 [quant-ph] .
  • Zhang et al. [2023] B. Zhang, P. Xu, X. Chen, and Q. Zhuang, Generative quantum machine learning via denoising diffusion probabilistic models (2023), arXiv:2310.05866 [quant-ph] .
  • Cacioppo et al. [2023] A. Cacioppo, L. Colantonio, S. Bordoni, and S. Giagu, Quantum diffusion models (2023), arXiv:2311.15444 [quant-ph] .
  • Peruzzo et al. [2014] A. Peruzzo, J. McClean, P. Shadbolt, M.-H. Yung, X.-Q. Zhou, P. J. Love, A. Aspuru-Guzik, and J. L. O’Brien, A variational eigenvalue solver on a photonic quantum processor, Nature Communications 5, 4213 (2014).
  • McClean et al. [2016] J. R. McClean, J. Romero, R. Babbush, and A. Aspuru-Guzik, The theory of variational hybrid quantum-classical algorithms, New Journal of Physics 18, 023023 (2016).
  • Romero et al. [2017] J. Romero, J. P. Olson, and A. Aspuru-Guzik, Quantum autoencoders for efficient compression of quantum data, Quantum Science and Technology 2, 045001 (2017).
  • Mitarai et al. [2018] K. Mitarai, M. Negoro, M. Kitagawa, and K. Fujii, Quantum circuit learning, Phys. Rev. A 98, 032309 (2018).
  • Schuld et al. [2020] M. Schuld, A. Bocharov, K. M. Svore, and N. Wiebe, Circuit-centric quantum classifiers, Phys. Rev. A 101, 032308 (2020).
  • Cong et al. [2019] I. Cong, S. Choi, and M. D. Lukin, Quantum convolutional neural networks, Nature Physics 15, 1273 (2019).
  • Li et al. [2020] Y. Li, R.-G. Zhou, R. Xu, J. Luo, and W. Hu, A quantum deep convolutional neural network for image recognition, Quantum Science and Technology 5, 044003 (2020).
  • Henderson et al. [2020] M. Henderson, S. Shakya, S. Pradhan, and T. Cook, Quanvolutional neural networks: powering image recognition with quantum circuits, Quantum Machine Intelligence 2, 2 (2020).
  • Braccia et al. [2021] P. Braccia, F. Caruso, and L. Banchi, How to enhance quantum generative adversarial learning of noisy information, New Journal of Physics 23, 053024 (2021).
  • Braccia et al. [2022] P. Braccia, L. Banchi, and F. Caruso, Quantum noise sensing by generating fake noise, Phys. Rev. Appl. 17, 024002 (2022).
  • Rudolph et al. [2022] M. S. Rudolph, N. B. Toussaint, A. Katabarwa, S. Johri, B. Peropadre, and A. Perdomo-Ortiz, Generation of high-resolution handwritten digits with an ion-trap quantum computer (2022), arXiv:2012.03924 [quant-ph] .
  • Boyle and Nikandish [2023] A. O. Boyle and R. Nikandish, A hybrid quantum-classical generative adversarial network for near-term quantum processors (2023), arXiv:2307.03269 [quant-ph] .
  • Tsang et al. [2023] S. L. Tsang, M. T. West, S. M. Erfani, and M. Usman, Hybrid quantum-classical generative adversarial network for high resolution image generation (2023), arXiv:2212.11614 [quant-ph] .
  • Zhou et al. [2023] N.-R. Zhou, T.-F. Zhang, X.-W. Xie, and J.-Y. Wu, Hybrid quantum–classical generative adversarial networks for image generation via learning discrete distribution, Signal Processing: Image Communication 110, 116891 (2023).
  • Rebentrost et al. [2014] P. Rebentrost, M. Mohseni, and S. Lloyd, Quantum support vector machine for big data classification, Phys. Rev. Lett. 113, 130503 (2014).
  • Liu et al. [2021] Y. Liu, S. Arunachalam, and K. Temme, A rigorous and robust quantum speed-up in supervised machine learning, Nature Physics 17, 1013 (2021).
  • Abbas et al. [2021] A. Abbas, D. Sutter, C. Zoufal, A. Lucchi, A. Figalli, and S. Woerner, The power of quantum neural networks, Nature Computational Science 1, 403 (2021).
  • Huang et al. [2022b] H.-Y. Huang, M. Broughton, J. Cotler, S. Chen, J. Li, M. Mohseni, H. Neven, R. Babbush, R. Kueng, J. Preskill, and J. R. McClean, Quantum advantage in learning from experiments, Science 376, 1182 (2022b), https://www.science.org/doi/pdf/10.1126/science.abn7293.
  • Caro et al. [2022] M. C. Caro, H.-Y. Huang, M. Cerezo, K. Sharma, A. Sornborger, L. Cincio, and P. J. Coles, Generalization in quantum machine learning from few training data, Nature Communications 13, 4919 (2022).
  • McClean et al. [2018] J. R. McClean, S. Boixo, V. N. Smelyanskiy, R. Babbush, and H. Neven, Barren plateaus in quantum neural network training landscapes, Nature Communications 9, 4812 (2018).
  • Holmes et al. [2022] Z. Holmes, K. Sharma, M. Cerezo, and P. J. Coles, Connecting ansatz expressibility to gradient magnitudes and barren plateaus, PRX Quantum 3, 010313 (2022).
  • Cerezo et al. [2021] M. Cerezo, A. Sone, T. Volkoff, L. Cincio, and P. J. Coles, Cost function dependent barren plateaus in shallow parametrized quantum circuits, Nature Communications 12, 1791 (2021).
  • Wang et al. [2021] S. Wang, E. Fontana, M. Cerezo, K. Sharma, A. Sone, L. Cincio, and P. J. Coles, Noise-induced barren plateaus in variational quantum algorithms, Nature Communications 12, 6961 (2021).
  • Sim et al. [2019] S. Sim, P. D. Johnson, and A. Aspuru-Guzik, Expressibility and entangling capability of parameterized quantum circuits for hybrid quantum-classical algorithms, Advanced Quantum Technologies 2, 1900070 (2019), https://onlinelibrary.wiley.com/doi/pdf/10.1002/qute.201900070.
  • Bronstein et al. [2021] M. M. Bronstein, J. Bruna, T. Cohen, and P. Veličković, Geometric deep learning: Grids, groups, graphs, geodesics, and gauges (2021), arXiv:2104.13478 [cs.LG] .
  • Larocca et al. [2022] M. Larocca, F. Sauvage, F. M. Sbahi, G. Verdon, P. J. Coles, and M. Cerezo, Group-invariant quantum machine learning, PRX Quantum 3, 030341 (2022).
  • Mernyei et al. [2022] P. Mernyei, K. Meichanetzidis, and İ. İ. Ceylan, Equivariant quantum graph circuits (2022), arXiv:2112.05261 [cs.LG] .
  • Skolik et al. [2023] A. Skolik, M. Cattelan, S. Yarkoni, T. Bäck, and V. Dunjko, Equivariant quantum circuits for learning on weighted graphs (2023), arXiv:2205.06109 [quant-ph] .
  • Meyer et al. [2023] J. J. Meyer, M. Mularski, E. Gil-Fuster, A. A. Mele, F. Arzani, A. Wilms, and J. Eisert, Exploiting symmetry in variational quantum machine learning, PRX Quantum 4, 010328 (2023).
  • Nguyen et al. [2022] Q. T. Nguyen, L. Schatzki, P. Braccia, M. Ragone, P. J. Coles, F. Sauvage, M. Larocca, and M. Cerezo, Theory for equivariant quantum neural networks (2022), arXiv:2210.08566 [quant-ph] .
  • Zheng et al. [2023] H. Zheng, Z. Li, J. Liu, S. Strelchuk, and R. Kondor, Speeding up learning quantum states through group equivariant convolutional quantum ansätze, PRX Quantum 4, 020327 (2023).
  • East et al. [2023] R. D. P. East, G. Alonso-Linaje, and C.-Y. Park, All you need is spin: SU(2) equivariant variational quantum circuits based on spin networks (2023), arXiv:2309.07250 [quant-ph] .
  • Pesah et al. [2021] A. Pesah, M. Cerezo, S. Wang, T. Volkoff, A. T. Sornborger, and P. J. Coles, Absence of barren plateaus in quantum convolutional neural networks, Phys. Rev. X 11, 041011 (2021).
  • Schatzki et al. [2022] L. Schatzki, M. Larocca, Q. T. Nguyen, F. Sauvage, and M. Cerezo, Theoretical guarantees for permutation-equivariant quantum neural networks (2022), arXiv:2210.09974 [quant-ph] .
  • West et al. [2023a] M. T. West, J. Heredge, M. Sevior, and M. Usman, Provably trainable rotationally equivariant quantum machine learning (2023a), arXiv:2311.05873 [quant-ph] .
  • West et al. [2023b] M. T. West, M. Sevior, and M. Usman, Reflection equivariant quantum neural networks for enhanced image classification, Machine Learning: Science and Technology 4, 035027 (2023b).
  • Chang et al. [2023] S. Y. Chang, M. Grossi, B. L. Saux, and S. Vallecorsa, Approximately equivariant quantum neural network for $p4m$ group symmetries in images (2023), arXiv:2310.02323 [quant-ph] .
  • Ragone et al. [2023] M. Ragone, P. Braccia, Q. T. Nguyen, L. Schatzki, P. J. Coles, F. Sauvage, M. Larocca, and M. Cerezo, Representation theory for geometric quantum machine learning (2023), arXiv:2210.07980 [quant-ph] .
  • LeCun et al. [2015] Y. LeCun, Y. Bengio, and G. Hinton, Deep learning, Nature 521, 436 (2015).
  • Schmidhuber [2015] J. Schmidhuber, Deep learning in neural networks: An overview, Neural Networks 61, 85 (2015).
  • Vidal [2008] G. Vidal, Class of quantum many-body states that can be efficiently simulated, Phys. Rev. Lett. 101, 110501 (2008).
  • Liu et al. [2023] Y.-J. Liu, A. Smith, M. Knap, and F. Pollmann, Model-independent learning of quantum phases of matter with quantum convolutional neural networks, Phys. Rev. Lett. 130, 220603 (2023).
  • Hur et al. [2022] T. Hur, L. Kim, and D. K. Park, Quantum convolutional neural network for classical data classification, Quantum Machine Intelligence 4, 3 (2022).
  • Marvian [2022] I. Marvian, Restrictions on realizable unitary operations imposed by symmetry and locality, Nature Physics 18, 283 (2022).
  • Bergholm et al. [2022] V. Bergholm, J. Izaac, M. Schuld, C. Gogolin, S. Ahmed, V. Ajith, M. S. Alam, G. Alonso-Linaje, B. AkashNarayanan, A. Asadi, J. M. Arrazola, U. Azad, S. Banning, C. Blank, T. R. Bromley, B. A. Cordier, J. Ceroni, A. Delgado, O. D. Matteo, A. Dusko, T. Garg, D. Guala, A. Hayes, R. Hill, A. Ijaz, T. Isacsson, D. Ittah, S. Jahangiri, P. Jain, E. Jiang, A. Khandelwal, K. Kottmann, R. A. Lang, C. Lee, T. Loke, A. Lowe, K. McKiernan, J. J. Meyer, J. A. Montañez-Barrera, R. Moyard, Z. Niu, L. J. O’Riordan, S. Oud, A. Panigrahi, C.-Y. Park, D. Polatajko, N. Quesada, C. Roberts, N. Sá, I. Schoch, B. Shi, S. Shu, S. Sim, A. Singh, I. Strandberg, J. Soni, A. Száva, S. Thabet, R. A. Vargas-Hernández, T. Vincent, N. Vitucci, M. Weber, D. Wierichs, R. Wiersema, M. Willmann, V. Wong, S. Zhang, and N. Killoran, Pennylane: Automatic differentiation of hybrid quantum-classical computations (2022), arXiv:1811.04968 [quant-ph] .