-
Tensor Networks for Explainable Machine Learning in Cybersecurity
Authors:
Borja Aizpurua,
Samuel Palmer,
Roman Orus
Abstract:
In this paper we show how tensor networks help in developing explainability of machine learning algorithms. Specifically, we develop an unsupervised clustering algorithm based on Matrix Product States (MPS) and apply it in the context of a real use-case of adversary-generated threat intelligence. Our investigation proves that MPS rival traditional deep learning models such as autoencoders and GANs in terms of performance, while providing much richer model interpretability. Our approach naturally facilitates the extraction of feature-wise probabilities, Von Neumann Entropy, and mutual information, offering a compelling narrative for classification of anomalies and fostering an unprecedented level of transparency and interpretability, something fundamental to understanding the rationale behind artificial intelligence decisions.
Submitted 5 April, 2024; v1 submitted 29 December, 2023;
originally announced January 2024.
-
Towards understanding neural collapse in supervised contrastive learning with the information bottleneck method
Authors:
Siwei Wang,
Stephanie E Palmer
Abstract:
Neural collapse describes the geometry of activation in the final layer of a deep neural network when it is trained beyond performance plateaus. Open questions include whether neural collapse leads to better generalization and, if so, why and how training beyond the plateau helps. We model neural collapse as an information bottleneck (IB) problem in order to investigate whether such a compact representation exists and discover its connection to generalization. We demonstrate that neural collapse leads to good generalization specifically when it approaches an optimal IB solution of the classification problem. Recent research has shown that two deep neural networks independently trained with the same contrastive loss objective are linearly identifiable, meaning that the resulting representations are equivalent up to a matrix transformation. We leverage linear identifiability to approximate an analytical solution of the IB problem. This approximation demonstrates that when class means exhibit $K$-simplex Equiangular Tight Frame (ETF) behavior (e.g., $K$=10 for CIFAR10 and $K$=100 for CIFAR100), they coincide with the critical phase transitions of the corresponding IB problem. The performance plateau occurs once the optimal solution for the IB problem includes all of these phase transitions. We also show that the resulting $K$-simplex ETF can be packed into a $K$-dimensional Gaussian distribution using supervised contrastive learning with a ResNet50 backbone. This geometry suggests that the $K$-simplex ETF learned by supervised contrastive learning approximates the optimal features for source coding. Hence, there is a direct correspondence between optimal IB solutions and generalization in contrastive learning.
Submitted 26 June, 2024; v1 submitted 19 May, 2023;
originally announced May 2023.
-
Application of Tensor Neural Networks to Pricing Bermudan Swaptions
Authors:
Raj G. Patel,
Tomas Dominguez,
Mohammad Dib,
Samuel Palmer,
Andrea Cadarso,
Fernando De Lope Contreras,
Abdelkader Ratnani,
Francisco Gomez Casanova,
Senaida Hernández-Santana,
Álvaro Díaz-Fernández,
Eva Andrés,
Jorge Luis-Hita,
Escolástico Sánchez-Martínez,
Samuel Mugel,
Roman Orus
Abstract:
The Cheyette model is a quasi-Gaussian volatility interest rate model widely used to price interest rate derivatives such as European and Bermudan Swaptions for which Monte Carlo simulation has become the industry standard. In low dimensions, these approaches provide accurate and robust prices for European Swaptions but, even in this computationally simple setting, they are known to underestimate the value of Bermudan Swaptions when using the state variables as regressors. This is mainly due to the use of a finite number of predetermined basis functions in the regression. Moreover, in high-dimensional settings, these approaches succumb to the Curse of Dimensionality. To address these issues, Deep-learning techniques have been used to solve the backward Stochastic Differential Equation associated with the value process for European and Bermudan Swaptions; however, these methods are constrained by training time and memory. To overcome these limitations, we propose leveraging Tensor Neural Networks as they can provide significant parameter savings while attaining the same accuracy as classical Dense Neural Networks. In this paper we rigorously benchmark the performance of Tensor Neural Networks and Dense Neural Networks for pricing European and Bermudan Swaptions, and we show that Tensor Neural Networks can be trained faster than Dense Neural Networks and provide more accurate and robust prices than their Dense counterparts.
Submitted 10 March, 2024; v1 submitted 18 April, 2023;
originally announced April 2023.
-
Quantum-Inspired Tensor Neural Networks for Option Pricing
Authors:
Raj G. Patel,
Chia-Wei Hsing,
Serkan Sahin,
Samuel Palmer,
Saeed S. Jahromi,
Shivam Sharma,
Tomas Dominguez,
Kris Tziritas,
Christophe Michel,
Vincent Porte,
Mustafa Abid,
Stephane Aubert,
Pierre Castellani,
Samuel Mugel,
Roman Orus
Abstract:
Recent advances in deep learning have enabled us to address the curse of dimensionality (COD) by solving problems in higher dimensions. A subset of these approaches has led to solving high-dimensional PDEs, opening the door to a variety of real-world problems ranging from mathematical finance to stochastic control for industrial applications. Although feasible, these deep learning methods are still constrained by training time and memory. To tackle these shortcomings, we demonstrate that Tensor Neural Networks (TNN) can provide significant parameter savings while attaining the same accuracy as the classical Dense Neural Network (DNN). We also show how TNN can be trained faster than DNN for the same accuracy. Besides TNN, we introduce Tensor Network Initializer (TNN Init), a weight initialization scheme that leads to faster convergence with smaller variance for an equivalent parameter count compared to a DNN. We benchmark TNN and TNN Init by applying them to solve the parabolic PDE associated with the Heston model, which is widely used in financial pricing theory.
Submitted 10 March, 2024; v1 submitted 28 December, 2022;
originally announced December 2022.
-
Financial Index Tracking via Quantum Computing with Cardinality Constraints
Authors:
Samuel Palmer,
Konstantinos Karagiannis,
Adam Florence,
Asier Rodriguez,
Roman Orus,
Harish Naik,
Samuel Mugel
Abstract:
In this work, we demonstrate how to apply non-linear cardinality constraints, important for real-world asset management, to quantum portfolio optimization. This enables us to tackle non-convex portfolio optimization problems using quantum annealing that would otherwise be challenging for classical algorithms. Being able to use cardinality constraints for portfolio optimization opens the doors to new applications for creating innovative portfolios and exchange-traded-funds (ETFs). We apply the methodology to the practical problem of enhanced index tracking and are able to construct smaller portfolios that significantly outperform the risk profile of the target index whilst retaining high degrees of tracking.
Submitted 24 August, 2022;
originally announced August 2022.
-
Quantum-Inspired Tensor Neural Networks for Partial Differential Equations
Authors:
Raj Patel,
Chia-Wei Hsing,
Serkan Sahin,
Saeed S. Jahromi,
Samuel Palmer,
Shivam Sharma,
Christophe Michel,
Vincent Porte,
Mustafa Abid,
Stephane Aubert,
Pierre Castellani,
Chi-Guhn Lee,
Samuel Mugel,
Roman Orus
Abstract:
Partial Differential Equations (PDEs) are used to model a variety of dynamical systems in science and engineering. Recent advances in deep learning have enabled us to solve them in a higher dimension by addressing the curse of dimensionality in new ways. However, deep learning methods are constrained by training time and memory. To tackle these shortcomings, we implement Tensor Neural Networks (TNN), a quantum-inspired neural network architecture that leverages Tensor Network ideas to improve upon deep learning approaches. We demonstrate that TNN provide significant parameter savings while attaining the same accuracy as compared to the classical Dense Neural Network (DNN). In addition, we also show how TNN can be trained faster than DNN for the same accuracy. We benchmark TNN by applying them to solve parabolic PDEs, specifically the Black-Scholes-Barenblatt equation, widely used in financial pricing theory, empirically showing the advantages of TNN over DNN. Further examples, such as the Hamilton-Jacobi-Bellman equation, are also discussed.
Submitted 10 August, 2022; v1 submitted 3 August, 2022;
originally announced August 2022.
-
Iterative Human and Automated Identification of Wildlife Images
Authors:
Zhongqi Miao,
Ziwei Liu,
Kaitlyn M. Gaynor,
Meredith S. Palmer,
Stella X. Yu,
Wayne M. Getz
Abstract:
Camera trapping is increasingly used to monitor wildlife, but this technology typically requires extensive data annotation. Recently, deep learning has significantly advanced automatic wildlife recognition. However, current methods are hampered by a dependence on large static data sets when wildlife data is intrinsically dynamic and involves long-tailed distributions. These two drawbacks can be overcome through a hybrid combination of machine learning and humans in the loop. Our proposed iterative human and automated identification approach is capable of learning from wildlife imagery data with a long-tailed distribution. Additionally, it includes self-updating learning that facilitates capturing the community dynamics of rapidly changing natural systems. Extensive experiments show that our approach can achieve ~90% accuracy employing only ~20% of the human annotations of existing approaches. Our synergistic collaboration of humans and machines transforms deep learning from a relatively inefficient post-annotation tool to a collaborative on-going annotation tool that vastly relieves the burden of human annotation and enables efficient and constant model updates.
Submitted 18 October, 2021; v1 submitted 5 May, 2021;
originally announced May 2021.