HOTTBOX: Higher Order Tensor ToolBOX

Authors: Ilya Kisil, Giuseppe G. Calvi, Bruno S. Dees, Danilo P. Mandic

Abstract: HOTTBOX is a Python library for exploratory analysis and visualisation of multi-dimensional arrays of data, also known as tensors. The library includes methods ranging from standard multi-way operations and data manipulation through to multi-linear algebra based tensor decompositions. HOTTBOX also comprises sophisticated algorithms for generalised multi-linear classification and data fusion, such as Support Tensor Machine (STM) and Tensor Ensemble Learning (TEL). For user convenience, HOTTBOX offers a unifying API which establishes a self-sufficient ecosystem for various forms of efficient representation of multi-way data and the corresponding decomposition and association algorithms. Particular emphasis is placed on scalability and interactive visualisation, to support multidisciplinary data analysis communities working on big data and tensors. HOTTBOX also provides means for integration with other popular data science libraries for visualisation and data manipulation. The source code, examples and documentation can be found at https://github.com/hottbox/hottbox.

Submitted 30 November, 2021; originally announced November 2021.

arXiv:2109.00626 [pdf, other]

Reducing Computational Complexity of Tensor Contractions via Tensor-Train Networks

Authors: Ilya Kisil, Giuseppe G. Calvi, Kriton Konstantinidis, Yao Lei Xu, Danilo P. Mandic

Abstract: There is a significant expansion in both volume and range of applications along with the concomitant increase in the variety of data sources. These ever-expanding trends have highlighted the necessity for more versatile analysis tools that offer greater opportunities for algorithmic developments and computationally faster operations than the standard flat-view matrix approach. Tensors, or multi-way arrays, provide such an algebraic framework which is naturally suited to data of such large volume, diversity, and veracity. Indeed, the associated tensor decompositions have demonstrated their potential in breaking the Curse of Dimensionality associated with traditional matrix methods, where a necessary exponential increase in data volume leads to adverse or even intractable consequences on computational complexity. A key tool underpinning multi-linear manipulation of tensors and tensor networks is the standard Tensor Contraction Product (TCP). However, depending on the dimensionality of the underlying tensors, the TCP also comes at the price of high computational complexity in tensor manipulation. In this work, we resort to diagrammatic tensor network manipulation to calculate such products in an efficient and computationally tractable manner, by making use of Tensor Train decomposition (TTD). This has rendered the underlying concepts easy to perceive, thereby enhancing intuition of the associated underlying operations, while preserving mathematical rigour. In addition to bypassing the cumbersome mathematical multi-linear expressions, the proposed Tensor Train Contraction Product model is shown to accelerate significantly the underlying computational operations, as it is independent of tensor order and linear in the tensor dimension, as opposed to performing the full computations through the standard approach (exponential in tensor order).

Submitted 8 September, 2021; v1 submitted 1 September, 2021; originally announced September 2021.
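
To make the idea of contracting tensors directly in Tensor Train form concrete, the following is a minimal sketch of the textbook left-to-right sweep for the inner product of two tensors stored as TT cores. It is not the Tensor Train Contraction Product model proposed in the paper. The helper names (`random_tt`, `tt_to_full`, `tt_inner`), the mode sizes and the TT ranks are assumptions chosen for illustration, and NumPy is assumed to be installed.

```python
import numpy as np


def random_tt(shape, ranks, rng):
    """Random TT cores; core k has shape (ranks[k], shape[k], ranks[k + 1])."""
    return [
        rng.standard_normal((ranks[k], shape[k], ranks[k + 1]))
        for k in range(len(shape))
    ]


def tt_to_full(cores):
    """Rebuild the full tensor by chaining the cores (for checking only)."""
    full = cores[0]
    for core in cores[1:]:
        # Contract the trailing rank index with the next core's leading one.
        full = np.tensordot(full, core, axes=([-1], [0]))
    # Drop the two boundary rank indices, which both have size 1.
    return full.squeeze(axis=(0, -1))


def tt_inner(a_cores, b_cores):
    """Inner product of two TT tensors without forming the full tensors."""
    m = np.ones((1, 1))
    for a, b in zip(a_cores, b_cores):
        # Absorb one physical mode at a time; m stays a small rank-by-rank matrix.
        m = np.einsum("ab,aic,bid->cd", m, a, b)
    return float(m[0, 0])


# Hypothetical sizes: a 3 x 4 x 5 tensor with TT ranks (1, 2, 3, 1).
rng = np.random.default_rng(0)
shape, ranks = (3, 4, 5), (1, 2, 3, 1)
a_cores = random_tt(shape, ranks, rng)
b_cores = random_tt(shape, ranks, rng)

inner_tt = tt_inner(a_cores, b_cores)
inner_full = float(np.sum(tt_to_full(a_cores) * tt_to_full(b_cores)))
```

The two values `inner_tt` and `inner_full` should agree up to floating-point error; the sweep only ever stores the cores and one small matrix, whereas the check materialises every entry of both tensors.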

arXiv:2105.04983 [pdf, other]

Tensor-Train Recurrent Neural Networks for Interpretable Multi-Way Financial Forecasting

Authors: Yao Lei Xu, Giuseppe G. Calvi, Danilo P. Mandic

Abstract: Recurrent Neural Networks (RNNs) represent the de facto standard machine learning tool for sequence modelling, owing to their expressive power and memory. However, when dealing with large dimensional data, the corresponding exponential increase in the number of parameters imposes a computational bottleneck. The necessity to equip RNNs with the ability to deal with the curse of dimensionality, such as through the parameter compression ability inherent to tensors, has led to the development of the Tensor-Train RNN (TT-RNN). Despite achieving promising results in many applications, the full potential of the TT-RNN is yet to be explored in the context of interpretable financial modelling, a notoriously challenging task characterized by multi-modal data with low signal-to-noise ratio. To address this issue, we investigate the potential of TT-RNN in the task of financial forecasting of currencies. We show, through the analysis of TT-factors, that the physical meaning underlying tensor decomposition enables the TT-RNN model to aid the interpretability of results, thus mitigating the notorious "black-box" issue associated with neural networks. Furthermore, simulation results highlight the regularization power of TT decomposition, demonstrating the superior performance of TT-RNN over its uncompressed RNN counterpart and other tensor forecasting methods.

Submitted 11 May, 2021; originally announced May 2021.

Comments: International Joint Conference on Neural Networks (IJCNN) 2021

arXiv:1909.05831 [pdf, other]

Tight Lower Bound on the Tensor Rank based on the Maximally Square Unfolding

Authors: Giuseppe G. Calvi, Bruno Scalzo Dees, Danilo P. Mandic

Abstract: Tensor decompositions are a class of tools for analysing datasets of high dimensionality and variety in a natural manner, with the Canonical Polyadic Decomposition (CPD) being a main pillar. While the notion of CPD is closely intertwined with that of the tensor rank, $R$, unlike the matrix rank, the computation of the tensor rank is an NP-hard problem, owing to the associated computational burden of evaluating the CPD. To address this issue, we investigate tight lower bounds on $R$ with the aim to provide a reduced search space, and hence to lessen the computational costs of the CPD evaluation. This is achieved by establishing a link between the maximum attainable lower bound on $R$ and the dimensions of the matrix unfolding of the tensor with aspect ratio closest to unity (maximally square). Moreover, we demonstrate that, for a generic tensor, such a lower bound can be attained under very mild conditions, whereby the tensor rank becomes detectable. Numerical examples demonstrate the benefits of this result.

Submitted 14 November, 2019; v1 submitted 12 September, 2019; originally announced September 2019.
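
As a rough illustration of what "maximally square" means in the abstract of this entry, the sketch below enumerates every way of splitting the modes of a tensor into row modes and column modes and keeps the split whose unfolding has aspect ratio closest to one. It relies only on the standard fact that the matrix rank of any unfolding cannot exceed the tensor rank, so the smaller side of the chosen unfolding is the largest lower bound an unfolding could certify. The function name `maximally_square_unfolding` and the example shape are assumptions for illustration, and the paper's exact conditions are not reproduced here.

```python
from itertools import combinations
from math import prod


def maximally_square_unfolding(shape):
    """Return (row_modes, rows, cols) for the most square unfolding.

    `shape` needs at least two modes. Since rows * cols is fixed, the
    unfolding with aspect ratio closest to one is the one that maximises
    min(rows, cols).
    """
    total = prod(shape)
    best = None
    for k in range(1, len(shape)):
        for row_modes in combinations(range(len(shape)), k):
            rows = prod(shape[m] for m in row_modes)
            cols = total // rows
            smaller = min(rows, cols)
            # Keep the first split that achieves the largest smaller side.
            if best is None or smaller > best[0]:
                best = (smaller, row_modes, rows, cols)
    _, row_modes, rows, cols = best
    return row_modes, rows, cols


# Hypothetical 4th-order tensor of size 3 x 4 x 5 x 6.
row_modes, rows, cols = maximally_square_unfolding((3, 4, 5, 6))
# Grouping modes 0 and 3 as rows gives an 18 x 20 unfolding.
```

For this shape, no unfolding can have matrix rank above 18, so 18 is the ceiling on any rank lower bound obtained this way.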

arXiv:1903.06133 [pdf, other]

Compression and Interpretability of Deep Neural Networks via Tucker Tensor Layer: From First Principles to Tensor Valued Back-Propagation

Authors: Giuseppe G. Calvi, Ahmad Moniri, Mahmoud Mahfouz, Qibin Zhao, Danilo P. Mandic

Abstract: This work aims to help resolve the two main stumbling blocks in the application of Deep Neural Networks (DNNs), that is, the exceedingly large number of trainable parameters and their physical interpretability. This is achieved through a tensor valued approach, based on the proposed Tucker Tensor Layer (TTL), as an alternative to the dense weight-matrices of DNNs. This allows us to treat the weight-matrices of general DNNs as a matrix unfolding of a higher order weight-tensor. By virtue of the compression properties of tensor decompositions, this enables us to introduce a novel and efficient framework for exploiting the multi-way nature of the weight-tensor in order to dramatically reduce the number of DNN parameters. We also derive the tensor valued back-propagation algorithm within the TTL framework, by extending the notion of matrix derivatives to tensors. In this way, the physical interpretability of the Tucker decomposition is exploited to gain physical insights into the NN training, through the process of computing gradients with respect to each factor matrix. The proposed framework is validated on both synthetic data and the benchmark datasets MNIST, Fashion-MNIST, and CIFAR-10. Overall, through the ability to provide the relative importance of each data feature in training, the TTL back-propagation is shown to help mitigate the "black-box" nature inherent to NNs. Experiments also illustrate that the TTL achieves a 66.63-fold compression on MNIST and Fashion-MNIST, while, by simplifying the VGG-16 network, it achieves a 10% speed up in training time, at a comparable performance.

Submitted 6 January, 2020; v1 submitted 14 March, 2019; originally announced March 2019.
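
The compression argument in the abstract of this entry can be illustrated with simple parameter counting. The sketch below compares the storage of a dense weight tensor with that of a generic Tucker format (one core plus one factor matrix per mode). It is not the Tucker Tensor Layer itself; the weight shape, the Tucker ranks and the function names are hypothetical and are not taken from the paper, so the resulting ratio is unrelated to the 66.63-fold figure reported there.

```python
from math import prod


def dense_params(shape):
    """Number of entries in the full weight tensor."""
    return prod(shape)


def tucker_params(shape, ranks):
    """Entries in a Tucker format: the core plus one factor matrix per mode."""
    if len(shape) != len(ranks):
        raise ValueError("shape and ranks must have the same length")
    core = prod(ranks)
    factors = sum(n * r for n, r in zip(shape, ranks))
    return core + factors


# Hypothetical example: a 784 x 10 weight matrix viewed as a 28 x 28 x 10 tensor.
shape = (28, 28, 10)
ranks = (4, 4, 3)  # assumed multilinear ranks, chosen only for illustration

n_dense = dense_params(shape)          # 7840
n_tucker = tucker_params(shape, ranks)  # 48 + 112 + 112 + 30 = 302
compression = n_dense / n_tucker
```

The ratio `compression` grows as the ranks shrink relative to the mode sizes, which is the trade-off between model size and expressiveness that any Tucker-based layer has to balance.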

arXiv:1711.00701 [pdf, other]

The sum of tensor networks

Authors: Giuseppe G. Calvi, Ilia Kisil, Danilo P. Mandic

Abstract: Tensor networks (TNs) have been gaining interest as multiway data analysis tools owing to their ability to tackle the curse of dimensionality and to represent tensors as smaller-scale interconnections of their intrinsic features. However, despite the obvious advantages, the current treatment of TNs as stand-alone entities does not take full advantage of their underlying structure and the associated feature localization. To this end, embarking upon the analogy with feature fusion, we propose a rigorous framework for the combination of TNs, focusing on their summation as the natural way for their combination. This allows for feature combination for any number of tensors, as long as their TN representation topologies are isomorphic. The benefits of the proposed framework are demonstrated on the classification of several groups of partially related images, where it outperforms standard machine learning algorithms.

Submitted 2 November, 2017; originally announced November 2017.

arXiv:1711.00487 [pdf, other]

Tensor Valued Common and Individual Feature Extraction: Multi-dimensional Perspective

Authors: Ilia Kisil, Giuseppe G. Calvi, Danilo P. Mandic

Abstract: A novel method for common and individual feature analysis from exceedingly large-scale data is proposed, in order to ensure the tractability of both the computation and storage and thus mitigate the curse of dimensionality, a major bottleneck in modern data science. This is achieved by making use of the inherent redundancy in so-called multi-block data structures, which represent multiple observations of the same phenomenon taken at different times, angles or recording conditions. Upon providing an intrinsic link between the properties of the outer vector product and extracted features in tensor decompositions (TDs), the proposed common and individual information extraction from multi-block data is performed through imposing physical meaning on otherwise unconstrained factorisation approaches. This is shown to dramatically reduce the dimensionality of search spaces for subsequent classification procedures and to yield greatly enhanced accuracy. Simulations on a multi-class classification task of large-scale extraction of individual features from a collection of partially related real-world images demonstrate the advantages of the "blessing of dimensionality" associated with TDs.

Submitted 1 November, 2017; originally announced November 2017.

Showing 1–7 of 7 results for author: Calvi, G G