Fully tensorial approach to hypercomplex neural networks

Agnieszka Niemczynowicz [1], Radosław Antoni Kycia*

These authors contributed equally to this work.

[1] Faculty of Mathematics and Computer Science, University of Warmia and Mazury in Olsztyn, Słoneczna 54, 10-710 Olsztyn, Warmińsko-Mazurskie, Poland

[2] Faculty of Computer Science and Telecommunications, Cracow University of Technology, Warszawska 24, 31-155 Kraków, Małopolska, Poland

Abstract

A fully tensorial theory of hypercomplex neural networks is given. The key observation is that the algebra multiplication can be represented as a rank-three tensor. This approach is attractive for neural network libraries that support efficient tensorial operations.

Keywords: Hypercomplex neural network, algebra, tensor

MSC Classification: 15A69, 15-04

1 Introduction

The fast progress in applications of Artificial Neural Networks (NN) promotes new directions of research and generalizations. This involves advanced mathematical concepts such as group theory [19], differential geometry [5, 6], or topological methods in data analysis [7].

The core of NN implementations lies in linear algebra usage. In most popular programming libraries (TensorFlow [1], PyTorch [15]), the most popular architecture is feed forward NN, which is based on a stack of layers where the data passes between them unidirectionally. Optimized tensorial operations realize the flow of the data.

There are different algebraic extensions. One of these paths is Algebraic Neural Networks [14], where additional endomorphism operations on data are performed. Another algebra-geometry direction is neural networks based on Geometric Algebra/Clifford algebra [4, 17]. Recently, Parametrized Hypercomplex Neural Networks were invented [10] for convolutional layers. They can learn the optimal hypercomplex algebra adjusted to data, exploiting an optimized Kronecker product. However, in some applications, keeping the parameters of a hypercomplex or even more general algebra as (fixed) hyperparameters is needed. In this way we can optimize the algebra structure at the metalevel.

In this paper we discuss an implementation in which classical real-number computations are replaced by computations in various hypercomplex or even more general algebras. This approach, presented, e.g., in [3], is not new. However, there is a revival of interest in this direction due to better complexity properties in such areas as image processing [18] or time series analysis [11]. In these contributions the open-source code of [18] for specific four-dimensional hypercomplex algebras was used.

The implementation explained in this article significantly extends the ideas of [18] to arbitrary algebras, including hypercomplex ones. The algorithms described here agree with the NN presented in [18] for specific four-dimensional hypercomplex algebras. However, the implementation of [18] was obtained by constructing an additional multiplication structure from the multiplication table of the hypercomplex algebra, which is treated as an additional step in setting up the neural network. Our approach permits us to omit this complexity and to generalize to arbitrary algebras. This is an important contribution from the viewpoint of the theoretical treatment of the general algebraic approach to hypercomplex neural networks.

The main contributions of this paper are the following:

•

summarize the basic concepts of tensorial operations for hypercomplex and more general algebras; in particular, we note that the algebra multiplication can be expressed as a rank-three tensor,
•

provide a general algorithm for computations within a hypercomplex dense layer,
•

provide general algorithms for 1-, 2-, and 3-dimensional hypercomplex convolutional layer computations.

2 Methods

This section provides an overview of the mathematical theory behind the operations used in implementing hypercomplex neural networks. It forms the foundation of the methods used in this paper. These are classical notions explained in detail in standard references, e.g., [2].

2.1 Tensors

The primary object that is used in NN implementations is a tensor. It relies on the tensor product described in the following definition.

Definition 1.

The tensor product of two vector spaces $V_{1}$ and $V_{2}$ over the field $\mathbb{F}$ is the vector space denoted by $V_{1}\otimes V_{2}$ and defined as the quotient space $\mathcal{F}(V_{1}\times V_{2})/L$, where $\mathcal{F}(V_{1}\times V_{2})$ is the free vector space generated by the elements of $V_{1}\times V_{2}$ and $L$ is its subspace spanned by

$$\begin{array}{c}(v+w,x)-(v,x)-(w,x),\\ (v,x+y)-(v,x)-(v,y),\\ (\lambda v,x)-\lambda(v,x),\\ (v,\lambda x)-\lambda(v,x),\end{array} \tag{1}$$

where $v,w\in V_{1}$, $x,y\in V_{2}$, $\lambda\in\mathbb{F}$.

By induction, the tensor product can be defined for $k$ vector spaces $\{V_{i}\}_{i=1}^{k}$; it is denoted by $V_{1}\otimes\ldots\otimes V_{k}$ or, briefly, by $\otimes_{i=1}^{k}V_{i}$.

The tensor product can also be defined for the dual spaces $V_{i}^{*}$ of the vector spaces $V_{i}$ for $i\in\{1,\ldots,k\}$, $0<k<\infty$, and we can define mixed tensor products of vector spaces and their duals.

The tensor product of vector spaces is a vector space, so we can define a basis, e.g., if the basis of $V$ is $\{e_{i}\}_{i=1}^{n_{1}}$ and the basis of $W$ is $\{f_{i}\}_{i=1}^{n_{2}}$, then the basis of $V\otimes W$ is $\{e_{i}\otimes f_{j}\}_{i,j=1}^{n_{1},n_{2}}$.
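As a small numerical illustration (a sketch in NumPy, which is our choice for the examples and not part of the original presentation), the basis elements $e_{i}\otimes f_{j}$ can be realized as outer products of standard basis vectors. The dimensions $n_{1}=2$ and $n_{2}=3$ are arbitrary assumptions.

```python
import numpy as np

n1, n2 = 2, 3                      # illustrative dimensions of V and W
E = np.eye(n1)                     # rows are the basis vectors e_i of V
F = np.eye(n2)                     # rows are the basis vectors f_j of W

# e_i (x) f_j realized as an outer product: an n1 x n2 array with a single 1.
basis = [np.outer(E[i], F[j]) for i in range(n1) for j in range(n2)]

# The n1*n2 elements are linearly independent, so dim(V (x) W) = n1*n2.
rank = np.linalg.matrix_rank(np.stack([b.ravel() for b in basis]))
print(len(basis), rank)            # 6 6
```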

The tensor product can be used to factorize any multilinear mapping. This is expressed by the universal factorization theorem for the tensor product. It states that a bilinear mapping $F:V\times W\rightarrow X$ of vector spaces $V,W,X$ can be uniquely factorized through a linear mapping $\bar{F}:V\otimes W\rightarrow X$ according to Fig. 1.

Figure 1: Unique factorization property of the tensor product.

The map $\bar{F}$ is then called a tensor. Moreover, we can move $X$ to the domain of the map $F$, i.e., instead of $F$ we can define $\hat{F}:V\times W\times X^{*}\rightarrow\mathbb{F}$ by $\hat{F}(v,w,\xi)=\xi(F(v,w))$, where $\mathbb{F}$ is the field common to the vector spaces.

Example 1.

In the bases $\{e_{i}\}$ of $V$ and $\{f_{j}\}$ of $W$, with the dual bases $\{e^{i}\}$ and $\{f^{j}\}$, the bilinear mapping $F:V\times W\rightarrow\mathbb{R}$ has the form

$$F=F_{ij}e^{i}\otimes f^{j}, \tag{2}$$

where $F_{ij}=F(e_{i},f_{j})\in\mathbb{R}$ for all $i,j$ are the coefficients of a numerical matrix that is a representation of the multilinear mapping (tensor) $F$ in the fixed bases of $V$ and $W$. The matrix collects the components of the tensor in a fixed tensor basis.

The matrix $[F_{ij}]$ is implemented in the TensorFlow and PyTorch libraries as an instance of a tensor class. The critical difference is that mathematical tensors have specific transformation properties under a change of basis of the underlying vector spaces, whereas the libraries implement tensors as multidimensional arrays of numbers. Moreover, the libraries do not keep track of the upper (contravariant) or lower (covariant) position of indices.
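The following sketch illustrates this point with NumPy (used here as a stand-in for the tensor classes of the libraries mentioned above). The component values are arbitrary assumptions. The array stores only the numbers $F_{ij}$; the index positions are a convention kept by the user.

```python
import numpy as np

# Components F_ij of a bilinear map in fixed bases (values are arbitrary).
F = np.array([[1.0, 2.0, 0.0],
              [0.0, -1.0, 3.0]])
v = np.array([1.0, 2.0])           # components of v in the basis of V
w = np.array([0.0, 1.0, 1.0])      # components of w in the basis of W

# F(v, w) = F_ij v^i w^j, written as an explicit index contraction.
value = np.einsum("ij,i,j->", F, v, w)
print(value)                       # 6.0
```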

The example can be extended to the tensor product of multiple vector spaces and their duals.

Example 2.

The linear mapping $A:V\rightarrow W$ can be written as a tensor $\bar{A}\in V^{*}\otimes W$, which in the basis $\{e^{i}\}$ of $V^{*}$ and $\{f_{i}\}$ of $W$ takes the form $A=A_{i}^{j}e^{i}\otimes f_{j}$. (Here we use the Einstein summation convention: a repeated bottom and top index indicates summation over the whole range.) The vector space of linear operators is denoted by $L(V,W)$.

For tensors we also often use the abstract index notation, where we provide only the components of the tensor, e.g., $A_{ij}$, understanding them not as numerical values in a fixed basis but as the full tensor $A_{ij}e^{i}\otimes e^{j}$, see [16].

Since the tensor product is a functor in the category of linear spaces [2], for a linear mapping $A:V\rightarrow W$ we can define its extension to the tensor product space, $\otimes_{i=1}^{n}A:\otimes_{i=1}^{n}V\rightarrow\otimes_{i=1}^{n}W$, by acting on products as $\otimes_{i=1}^{n}A(v_{1}\otimes\ldots\otimes v_{n})=Av_{1}\otimes\ldots\otimes Av_{n}$ and extending by linearity to all linear combinations.

This behaviour is similar to that of the Cartesian product of vector spaces. The Cartesian product is also a functor, and therefore all the above operations apply in this case.
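The functorial extension can be checked numerically in coordinates. This is a minimal sketch for $n=2$, assuming that tensor products of coordinate vectors and of matrices are represented by the Kronecker product (`np.kron`); the matrix and vectors are random and purely illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.normal(size=(3, 2))        # a linear map A: V -> W, dim V = 2, dim W = 3
v1, v2 = rng.normal(size=2), rng.normal(size=2)

# In coordinates, v1 (x) v2 is the Kronecker product of the coordinate vectors
# and A (x) A is the Kronecker product of the matrices.
lhs = np.kron(A, A) @ np.kron(v1, v2)      # (A (x) A)(v1 (x) v2)
rhs = np.kron(A @ v1, A @ v2)              # A v1 (x) A v2
print(np.allclose(lhs, rhs))               # True
```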

We can define a few linear algebra operations realized in tensor libraries.

•

Broadcasting: for a linear operator $A:V\rightarrow W$ it is defined as the extension

$$b_{1}^{n}:L(V,W)\rightarrow L(\times_{i=1}^{n}V,\times_{i=1}^{n}W), \tag{3}$$

that is, $b_{1}^{n}(A)(v_{1},\ldots,v_{n})=(Av_{1},\ldots,Av_{n})$. A similar broadcasting is realized for the tensor product. Both operations rely on the functoriality of the Cartesian product and the tensor product.

•

Transposition/Permutation: The transposition of two components relies on the following fact: there is a unique linear mapping $\tau:V\otimes W\rightarrow W\otimes V$ that reverses the order of factors, $\tau(v\otimes w)=w\otimes v$. We can extend it to an arbitrary permutation of $n$ numbers, $p:\{1,\ldots,n\}\rightarrow\{1,\ldots,n\}$, for which we have $\tau_{p}(v_{1}\otimes\ldots\otimes v_{n})=v_{p(1)}\otimes\ldots\otimes v_{p(n)}$. We can define a similar operation for the Cartesian product.

In the abstract index notation, we have $\tau_{p}A_{i_{1}\ldots i_{k}}=A_{i_{p(1)}\ldots i_{p(k)}}$ .
•

Reshaping: For a pair of indices $i,j$ with finite ranges $R(i)$ and $R(j)$, i.e., $0\leq i<R(i)$ and $0\leq j<R(j)$, we can define a reshape operation, which replaces the pair by a single index $Rs(i,j)$ whose value is given by the function $Rs(i,j)=iR(j)+j$. We can extend the operation to a pair $(p,p+1)$ of neighbouring indices of a tensor, where $p+1\leq k$; in the abstract index notation we write $Rs_{p}A_{i_{1}\ldots i_{p}i_{p+1}\ldots i_{k}}=A_{i_{1}\ldots i_{p-1}Rs(i_{p},i_{p+1})i_{p+2}\ldots i_{k}}$. This operation changes only the way of indexing; however, it is useful in applications.
•

Contraction: For two tensors $A$ and $B$, and a fixed basis $\{e_{i}\}$ of $V$ with the dual basis $\{e^{i}\}$ of $V^{*}$, the contraction of the indices $p$ (related to $V$) and $q$ (related to $V^{*}$) is (note the implicit sum) $C_{p,q}(A,B)=A(\ldots\underbrace{e_{i}}_{p}\ldots)\otimes B(\ldots\underbrace{e^{i}}_{q}\ldots)$, where $A\in\ldots\otimes\underbrace{V}_{p}\otimes\ldots$ and $B\in\ldots\otimes\underbrace{V^{*}}_{q}\otimes\ldots$.

The contraction can also be defined for a single tensor in the same way, e.g., in the abstract index notation for a single tensor $T$ one gets $C_{p,q}T_{\ldots i_{p}\ldots}^{\ldots i_{q}\ldots}=T_{\ldots\underbrace{i}_{p}\ldots}^{\ldots\overbrace{i}^{q}\ldots}$, where implicit summation was applied.
•

Concatenation: joins the tensors $\{A^{(i)}_{i_{1}\ldots i_{l}}\}_{i=1}^{n}$ of the same shape along a given dimension $j$, i.e., $K_{j}(\{A^{(i)}\}_{i=1}^{n})=A_{i_{1}\ldots i_{j-1}k_{j}i_{j}\ldots i_{l}}$.
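The operations listed above have direct counterparts in array libraries. The sketch below uses NumPy as an assumed example (TensorFlow and PyTorch offer analogous calls); the shapes and values are arbitrary and only illustrate the index bookkeeping, in particular the rule $Rs(i,j)=iR(j)+j$, which coincides with row-major reshaping.

```python
import numpy as np

T = np.arange(24).reshape(2, 3, 4)           # a tensor T_{ijk} with ranges 2, 3, 4

# Transposition/permutation: swap the last two indices.
T_perm = np.transpose(T, (0, 2, 1))          # shape (2, 4, 3)

# Reshaping: merge the neighbouring indices (j, k) into Rs(j, k) = j * R(k) + k.
T_rs = T.reshape(2, 3 * 4)
j, k = 2, 1
assert T_rs[1, j * 4 + k] == T[1, j, k]

# Contraction of index 2 of T with index 0 of a second tensor B.
B = np.ones((4, 5))
C = np.tensordot(T, B, axes=([2], [0]))      # shape (2, 3, 5)

# Broadcasting: apply the same linear map M to every vector of a batch.
M = np.array([[0.0, 1.0], [1.0, 0.0]])
batch = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
mapped = batch @ M.T                         # row n equals M @ batch[n]

# Concatenation of tensors of the same shape along the last dimension.
cat = np.concatenate([T, T], axis=2)         # shape (2, 3, 8)
print(T_perm.shape, T_rs.shape, C.shape, cat.shape)
```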

2.2 (Hypercomplex) Algebras

In this part we introduce mathematical concepts related to hypercomplex and general algebras, as in the following definition.

Definition 2.

The algebra over a field $\mathbb{F}$ is a vector space $V$ equipped with a product - a binary operation $\_\cdot\_:V\times V\rightarrow V$ with the following properties:

•

$(x+y)\cdot z=x\cdot z+y\cdot z$ ,
•

$z\cdot(x+y)=z\cdot x+z\cdot y$ ,
•

$(\alpha x)\cdot(\beta y)=\alpha\beta(x\cdot y)$ ,

for $x,y,z\in V$ and $\alpha,\beta\in\mathbb{F}$ .

Moreover, the algebra is commutative if $x\cdot y=y\cdot x$ for $x,y\in V$ .

Example 3.

For real numbers $\mathbb{R}$ we have $V=\mathrm{span}\{e_{0}\}$, $\mathbb{F}=\mathbb{R}$ with $e_{0}\cdot e_{0}=e_{0}$.

Complex numbers $\mathbb{C}$ can be obtained as the commutative algebra with $V=\mathrm{span}\{e_{0},e_{1}\}$, $\mathbb{F}=\mathbb{R}$ and $e_{0}\cdot e_{0}=e_{0}$, $e_{0}\cdot e_{1}=e_{1}\cdot e_{0}=e_{1}$ and $e_{1}\cdot e_{1}=-e_{0}$.

By convention we will always assume that the neutral element of the algebra multiplication is $e_{0}$.

Efficient use of algebras in computations based on tensors (TensorFlow, PyTorch) relies on converting the product within the algebra into a tensor operation.

Treating an algebra as a vector space with the additional structure of vector multiplication (this can be defined using a functor that casts the category of algebras into the category of linear spaces), we have the following definition, which is essential to the rest of the paper.

Definition 3.

For an algebra $V$ over $\mathbb{F}$ the product can be defined as a tensor $A\in V^{*}\otimes V^{*}\otimes V$. Selecting the basis $\{e_{i}\}_{i=0}^{n-1}$ of $V$, where $n=\dim V$, and the dual basis $\{e^{i}\}_{i=0}^{n-1}$ of $V^{*}$ with $e^{i}(e_{j})=\delta_{j}^{i}$, the product has the form

$$A=A_{ij}^{k}e^{i}\otimes e^{j}\otimes e_{k}. \tag{4}$$

Then the multiplication table entry is presented in (5).

$$\begin{array}{c|c}&e_{j}\\ \hline e_{i}&A_{ij}^{k}e_{k}\end{array} \tag{5}$$

The tensor coefficients (abstract index notation), $A_{ij}^{k}$, play the same role as the structure constants known from the theory of Lie algebras [8]. These coefficients in a fixed basis can be represented as a multidimensional array (called a tensor in TensorFlow and PyTorch). When the algebra is commutative, then $A(x,y)=A(y,x)$ or, equivalently, $A_{ij}^{k}=A_{ji}^{k}$.
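For the complex numbers of Example 3, the structure tensor can be written down explicitly. This is a minimal NumPy sketch; the helper names `complex_structure_tensor` and `algebra_product` are hypothetical and introduced only for this illustration. It stores $A_{ij}^{k}$ as an array `A[i, j, k]` and evaluates the product $(x\cdot y)^{k}=A_{ij}^{k}x^{i}y^{j}$ by a single contraction.

```python
import numpy as np

def complex_structure_tensor():
    """Structure constants A[i, j, k] of the complex numbers: e_i * e_j = A[i, j, k] e_k."""
    A = np.zeros((2, 2, 2))
    A[0, 0, 0] = 1.0    # e0 * e0 =  e0
    A[0, 1, 1] = 1.0    # e0 * e1 =  e1
    A[1, 0, 1] = 1.0    # e1 * e0 =  e1
    A[1, 1, 0] = -1.0   # e1 * e1 = -e0
    return A

def algebra_product(A, x, y):
    """Product x * y in components: (x * y)^k = A_ij^k x^i y^j."""
    return np.einsum("ijk,i,j->k", A, x, y)

A = complex_structure_tensor()
x = np.array([1.0, 2.0])            # 1 + 2i
y = np.array([3.0, 4.0])            # 3 + 4i
print(algebra_product(A, x, y))     # [-5. 10.]
```

Commutativity of the algebra shows up as the symmetry of `A` in its first two indices, i.e., `A` equals `A.transpose(1, 0, 2)`.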

3 Results

In this section, we provide mathematical details of the implementation of hypercomplex dense and convolutional neural networks.

We do not distinguish co- and contravariant indices in the description and write them at the bottom level. Moreover, the tensor is treated as a multidimensional array, with indices starting from 0. This is a standard convention in the tensor libraries such as TensorFlow and PyTorch.

We assume that the algebra multiplication structure constants are fixed and stored in the tensor (in abstract index notation) $A=A_{i_{a}j_{a}k_{a}}$, see Subsection 2.2.

3.1 Hypercomplex Dense layer

We start with a description of the dense layer. It is a general-purpose layer that operates on data whose dimensionality is a multiple of the algebra dimension. We assume that the input data are of dimension $b\times\underbrace{al\times in}_{n}$, where $b$ is the batch size, $al$ is the algebra dimension, and $in$ is a positive integer multiplier. The last two numbers determine the input data size $n$. The input tensor is $X=X_{i_{b}Rs(i_{al},i_{in})}$, where $i_{b}$ is the batch index, $i_{al}$ is the index of the algebra component, and $i_{in}$ is the multiplicity index of the algebra dimension. Moreover, we use learnable parameters (weights/kernel) $K=K_{i_{al}i_{in}i_{u}}$, where $i_{u}$ is the index over units/neurons. The bias $b=b_{Rs(i_{al},i_{u})}$ is used if needed. Kernel and bias are usually initialized with numbers drawn from specific distributions [9].

We now provide the algorithm of the hypercomplex dense network in Algorithm 1. We give both the tensorial and the abstract index notation (AIN). We need two flags: bias, which indicates whether the bias is included, and activation, which indicates whether the activation function $\sigma$ is used.

Algorithm 1 Hypercomplex dense NN

Input: $X$, $A$, $\sigma$, bias, activation

Parameters: $K$, $b$ - initialized

$W\leftarrow C_{1,0}(A,K)$ [AIN: $W_{i_{a}k_{a}i_{in}i_{u}}\leftarrow\sum_{j}A_{i_{a}jk_{a}}K_{ji_{in}i_{u}}$]

$W\leftarrow\tau_{p=(0\rightarrow 0,1\rightarrow 2,2\rightarrow 1,3\rightarrow 3)}(W)$ [AIN: $W_{i_{a}i_{in}k_{a}i_{u}}\leftarrow W_{p(i_{a}k_{a}i_{in}i_{u})}$]

$W\leftarrow Rs_{1}\circ Rs_{0}(W)$ [AIN: $W_{i_{1}i_{2}}\leftarrow W_{Rs(i_{a},i_{in})Rs(k_{a},i_{u})}$]

$Output\leftarrow C_{1,0}(X,W)$ [AIN: $Output_{i_{b}i_{2}}\leftarrow\sum_{k}X_{i_{b}k}W_{ki_{2}}$, where $k=Rs(i_{al},i_{in})$]

if bias is True then

    $Output\leftarrow Output+b$

end if

if activation is True then

    $Output\leftarrow b(\sigma)(Output)$

end if

Implementations in Keras with TensorFlow and in PyTorch are given in [13].
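The steps of Algorithm 1 can be sketched in NumPy as follows. This is an illustrative sketch under stated assumptions, not the implementation of [13]: the function name `hypercomplex_dense` and the argument layout are hypothetical, the structure tensor is that of the complex numbers from Example 3, and the optional arguments `b` and `sigma` play the role of the bias and activation flags.

```python
import numpy as np

def hypercomplex_dense(X, A, K, b=None, sigma=None):
    """Sketch of Algorithm 1.

    X: (batch, al*in) input, A: (al, al, al) structure constants,
    K: (al, in, units) kernel, b: optional bias of shape (al*units,),
    sigma: optional elementwise activation function.
    """
    al, n_in, units = K.shape
    # W_{i_a k_a i_in i_u} = sum_j A_{i_a j k_a} K_{j i_in i_u}
    W = np.tensordot(A, K, axes=([1], [0]))
    # Swap axes 1 and 2: W_{i_a i_in k_a i_u}
    W = np.transpose(W, (0, 2, 1, 3))
    # Merge (i_a, i_in) and (k_a, i_u) into single indices (row-major order).
    W = W.reshape(al * n_in, al * units)
    out = X @ W
    if b is not None:
        out = out + b
    if sigma is not None:
        out = sigma(out)
    return out

# Structure constants of the complex numbers (e0 = 1, e1 = i).
A = np.zeros((2, 2, 2))
A[0, 0, 0] = A[0, 1, 1] = A[1, 0, 1] = 1.0
A[1, 1, 0] = -1.0

X = np.array([[1.0, 2.0]])                 # one sample: 1 + 2i
K = np.array([3.0, 4.0]).reshape(2, 1, 1)  # one unit with weight 3 + 4i
print(hypercomplex_dense(X, A, K))         # [[-5. 10.]]
```

With $in=1$ and a single unit, the layer reduces to the algebra product of the input with the kernel, here $(1+2i)(3+4i)=-5+10i$.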

3.2 Hypercomplex Convolutional layer

In this part the hypercomplex convolutional neural network will be described. We present general $k$ -dimensional ( $k=1,2,3$ ) layers. They differ by the shape of the input data and kernel size.

The additional ’image channels’ of the data can be packed as an element of algebra. For instance, the two-dimensional image can be decomposed as a matrix of four-dimensional algebra elements, i.e., color channels plus alpha, making a single pixel an element of four-dimensional algebra.

The general idea of the NN action, as in [18], is to use the tensor $A$ of algebra structure constants to separate the coefficients of each algebra basis component and then apply the traditional convolution to each algebra component.

The input data $X=X_{i_{b}i_{1}\ldots i_{k}Rs(i_{al},i_{in})}$ has dimension $b\times n_{1}\times\ldots\times n_{k}\times\underbrace{al\times in}_{n}$, where $b$ is the batch size, $n_{i}$, $i\in\{1,\ldots,k\}$, is the size of the data sample in each dimension, $al$ is the algebra dimension, and $in$ is a positive multiplier. (The sizes of the data sample along each axis stand right after the batch size, which is the TensorFlow convention. For PyTorch, this multiindex is placed at the end. We do not provide an implementation for the PyTorch convention since it differs only by a few permutations.) The kernel size is $al\times L_{1}\times\ldots\times L_{k}\times in\times F$, where $L_{i}$, $i\in\{1,\ldots,k\}$, is the kernel size in each dimension, and $F$ is the number of filters. We therefore have the kernel $K=K_{i_{al}i_{l1}\ldots i_{lk}i_{in}i_{f}}$. The flag $bias$ indicates whether the bias $b$ is used. If it is used, it has dimension $al\times F$. Kernel and bias can be initialized from arbitrary distributions [9]. We use the standard $k$-dimensional convolution $convkD(X,K,strides,padding)$ to convolve the algebra components; standard optimized convolution operations exist for this purpose [12]. The algorithm is presented in Algorithm 2.

Algorithm 2 Hypercomplex $k$-dimensional convolutional NN

Input: $X$, $A$, $\sigma$, bias, activation

Parameters: $K$, $b$ - initialized

$W\leftarrow C_{1,0}(A,K)$ [AIN: $W_{i_{a}k_{a}i_{l1}\ldots i_{lk}i_{in}i_{f}}\leftarrow\sum_{j}A_{i_{a}jk_{a}}K_{ji_{l1}\ldots i_{lk}i_{in}i_{f}}$]

$W\leftarrow\tau_{p=(0\rightarrow k+1)}(W)$ [AIN: $W_{k_{a}i_{l1}\ldots i_{lk}i_{a}i_{in}i_{f}}\leftarrow W_{p(i_{a}k_{a}i_{l1}\ldots i_{lk}i_{in}i_{f})}$]

$W\leftarrow Rs_{k+1}(W)$ [AIN: $W_{k_{a}i_{l1}\ldots i_{lk}ji_{f}}\leftarrow W_{k_{a}i_{l1}\ldots i_{lk}\underbrace{Rs(i_{a},i_{in})}_{=j}i_{f}}$]

temp = []

for $i\in\{0,\ldots,al-1\}$ do

    temp = temp + [$convkD(X,W_{i\ldots})$]

end for

$Output_{i_{b}i_{l1}\ldots i_{lk}h}\leftarrow K_{k+1}(temp)$

if bias is True then

    $Output\leftarrow Output+b$

end if

if activation is True then

    $Output\leftarrow b(\sigma)(Output)$

end if
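For $k=1$, the steps of Algorithm 2 can be sketched in NumPy as follows. This is a minimal sketch under stated assumptions and not the implementation of [13] or [18]: the helper `conv1d_valid` is a hypothetical, naive stand-in for an optimized $conv1D$ (stride 1, no padding, cross-correlation as is customary in deep-learning libraries), the layout is channels-last as in the TensorFlow convention, and with zero-based axes the index $i_{a}$ is moved to position $k+1=2$.

```python
import numpy as np

def conv1d_valid(X, W):
    """Naive stride-1 'valid' cross-correlation.
    X: (batch, n, c_in), W: (L, c_in, c_out) -> (batch, n - L + 1, c_out)."""
    L = W.shape[0]
    n_out = X.shape[1] - L + 1
    # Sum the contribution of each kernel tap over shifted input windows.
    return sum(X[:, l:l + n_out, :] @ W[l] for l in range(L))

def hypercomplex_conv1d(X, A, K, b=None, sigma=None):
    """Sketch of Algorithm 2 for k = 1.
    X: (batch, n, al*in), A: (al, al, al), K: (al, L, in, F)."""
    al, L, n_in, F = K.shape
    W = np.tensordot(A, K, axes=([1], [0]))   # indices (i_a, k_a, l, i_in, f)
    W = np.moveaxis(W, 0, 2)                   # indices (k_a, l, i_a, i_in, f)
    W = W.reshape(al, L, al * n_in, F)         # merge (i_a, i_in)
    # One ordinary convolution per output algebra component k_a.
    temp = [conv1d_valid(X, W[i]) for i in range(al)]
    out = np.concatenate(temp, axis=-1)        # channels indexed by Rs(k_a, f)
    if b is not None:
        out = out + b
    if sigma is not None:
        out = sigma(out)
    return out

# Structure constants of the complex numbers (e0 = 1, e1 = i).
A = np.zeros((2, 2, 2))
A[0, 0, 0] = A[0, 1, 1] = A[1, 0, 1] = 1.0
A[1, 1, 0] = -1.0

X = np.array([[[1.0, 2.0], [0.0, 1.0], [2.0, 0.0]]])   # samples 1+2i, i, 2
K = np.array([3.0, 4.0]).reshape(2, 1, 1, 1)           # length-1 kernel 3+4i
out = hypercomplex_conv1d(X, A, K)
print(out[0])
```

With a kernel of length one, $in=1$ and a single filter, each sample is simply multiplied by $3+4i$ in the algebra, so the three output rows are $-5+10i$, $-4+3i$ and $6+8i$.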

4 Conclusion

In this paper, we introduced the mathematical details of the implementation of dense and convolutional NN based on hypercomplex and general algebras. The critical point in this presentation is to associate the algebra multiplication with a rank-three tensor. Thanks to this observation, all the NN processing steps can be represented as tensorial operations.

The fully tensorial treatment of algebra operations simplifies neural network operations and allows the use of fast tensorial operations in modern packages such as TensorFlow and PyTorch. The implementation of the above generalized hypercomplex NN is described elsewhere [13].


Acknowledgements

This paper has been supported by the Polish National Agency for Academic Exchange Strategic Partnership Programme under Grant No. BPI/PST/2021/1/00031 (nawa.gov.pl).

Declarations

•

Funding - This paper has been supported by the Polish National Agency for Academic Exchange Strategic Partnership Programme under Grant No. BPI/PST/2021/1/00031 (nawa.gov.pl).
•

Conflict of interest/Competing interests - Not applicable
•

Ethics approval and consent to participate - Not applicable
•

Consent for publication - All authors agree on publication
•

Data availability - Not applicable
•

Materials availability - Not applicable
•

Code availability - Not applicable
•

Author contribution - A.N., R.K.: Conceptualization, Methodology, Investigation, Writing - Original Draft, Review, Editing. A.N.: Funding acquisition, Supervision.


References

[1] M. Abadi, P. Barham, J. Chen, Z. Chen et al. TensorFlow: a system for large-scale machine learning, In Proceedings of the 12th USENIX conference on Operating Systems Design and Implementation (OSDI’16). USENIX Association, USA, 265–283, (2016).
[2] P. Aluffi, Algebra: Chapter 0, American Mathematical Society, 2009
[3] P. Arena, L. Fortuna, G. Muscato, M.G. Xibilia, Neural Networks in Multidimensional Domains, Lecture Notes in Control and Information Sciences, Springer London, 1998; doi: 10.1007/BFb0047683
[4] S. Buchholz, G. Sommer, On Clifford neurons and Clifford multi-layer perceptrons, Neural Networks, 21, 7, 925-935 (2008); doi: 10.1016/j.neunet.2008.03.004.
[5] W. Cao, C. Zheng, Z. Yan, et al. Geometric machine learning: research and applications. Multimed Tools Appl 81, 30545–30597 (2022). https://doi.org/10.1007/s11042-022-12683-9
[6] W. Cao, Z. Yan, Z. He and Z. He, A Comprehensive Survey on Geometric Deep Learning, IEEE Access, vol. 8, pp. 35929-35949, 2020, doi: 10.1109/ACCESS.2020.2975067.
[7] G. Carlsson, Topology and Data, Bull. Amer. Math. Soc. 46 (2), 255–308. (2009) doi:10.1090/s0273-0979-09-01249-x
[8] W. Fulton, J. Harris, Representation theory. A first course, Springer-Verlag, 1991
[9] X. Glorot, Y. Bengio, Understanding the difficulty of training deep feedforward neural networks, Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics, PMLR 9:249-256 (2010)
[10] E. Grassucci, A. Zhang, D. Comminiello, PHNNs: Lightweight Neural Networks via Parameterized Hypercomplex Convolutions, IEEE Transactions on Neural Networks and Learning Systems, 1-13 (2022); doi: 10.1109/TNNLS.2022.3226772
[11] R. Kycia, A. Niemczynowicz, Hypercomplex neural network in time series forecasting of stock data, Submitted, arXiv:2401.04632 [cs.NE]
[12] A. Lavin, S. Gray, Fast Algorithms for Convolutional Neural Networks, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 2016, pp. 4013-4021, doi: 10.1109/CVPR.2016.435.
[13] A. Niemczynowicz, R. Kycia, KHNNs: hypercomplex neural networks computations via Keras using TensorFlow and PyTorch, in preparation
[14] A. Parada-Mayorga, A. Ribeiro, Algebraic Neural Networks: Stability to Deformations, IEEE Transactions on Signal Processing, vol. 69, pp. 3351-3366, (2021); doi: 10.1109/TSP.2021.3084537.
[15] A. Paszke, S. Gross, F. Massa, A. Lerer et al., PyTorch: an imperative style, high-performance deep learning library, Proceedings of the 33rd International Conference on Neural Information Processing Systems. Curran Associates Inc., Red Hook, NY, USA, Article 721, 8026–8037 (2019)
[16] R. Penrose, W. Rindler, Spinors and Space-Time, Volume 1: Two-Spinor Calculus and Relativistic Fields, Cambridge University Press, 1984
[17] D. Ruhe, J.K. Gupta, S. De Keninck, M. Welling, J. Brandstetter, Geometric Clifford algebra networks. In Proceedings of the 40th International Conference on Machine Learning (ICML’23), Vol. 202. JMLR.org, Article 1219, 29306–29337, (2023)
[18] G. Vieira, M.E. Valle, W. Lopes, Clifford Convolutional Neural Networks for Lymphoblast Image Classification, Silva, D.W., Hitzer, E., Hildenbrand, D. (eds) Advanced Computational Applications of Geometric Algebra. ICACGA 2022. Lecture Notes in Computer Science, vol 13771. Springer, Cham. (2024); doi: 10.1007/978-3-031-34031-4_7
[19] J. Wood, J. Shawe-Taylor, Representation theory and invariant neural networks, Discrete Applied Mathematics, 69, 1-2, 33–60 (1996)