The HR-Calculus: Enabling Information Processing with Quaternion Algebra

Authors: Danilo P. Mandic, Sayed Pouria Talebi, Clive Cheong Took, Yili Xia, Dongpo Xu, Min Xiang, Pauline Bourigault

Abstract: From their inception, quaternions and their division algebra have proven to be advantageous in modelling rotation/orientation in three-dimensional spaces and have seen use from the initial formulation of electromagnetic field theory through to forming the basis of quantum field theory. Despite their impressive versatility in modelling real-world phenomena, adaptive information processing techniques specifically designed for quaternion-valued signals have only recently come to the attention of the machine learning, signal processing, and control communities. The most important development in this direction is the introduction of the HR-calculus, which provides the required mathematical foundation for deriving adaptive information processing techniques directly in the quaternion domain. In this article, the foundations of the HR-calculus are revised and the required tools for deriving adaptive learning techniques suitable for dealing with quaternion-valued signals, such as the gradient operator, chain and product derivative rules, and Taylor series expansion are presented. This serves to establish the most important applications of adaptive information processing in the quaternion domain for both single-node and multi-node formulations. The article is supported by Supplementary Material, which will be referred to as SM.

Submitted 28 November, 2023; originally announced November 2023.
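
Illustrative note: the abstract above refers to quaternions as a tool for modelling rotation in three-dimensional space. The sketch below is not taken from the article and does not implement the HR-calculus; it only shows the textbook Hamilton product and the rotation v -> q v q*, written with the standard library. The function names `qmul`, `qconj` and `rotate` are hypothetical, and quaternions are assumed to be stored as (w, x, y, z) tuples.

```python
import math

def qmul(p, q):
    """Hamilton product of two quaternions stored as (w, x, y, z) tuples."""
    pw, px, py, pz = p
    qw, qx, qy, qz = q
    return (
        pw * qw - px * qx - py * qy - pz * qz,
        pw * qx + px * qw + py * qz - pz * qy,
        pw * qy - px * qz + py * qw + pz * qx,
        pw * qz + px * qy - py * qx + pz * qw,
    )

def qconj(q):
    """Quaternion conjugate: negate the three imaginary parts."""
    w, x, y, z = q
    return (w, -x, -y, -z)

def rotate(v, axis, angle):
    """Rotate the 3-D vector v by `angle` radians about a unit-length axis."""
    half = angle / 2.0
    s = math.sin(half)
    q = (math.cos(half), axis[0] * s, axis[1] * s, axis[2] * s)
    # Embed v as a pure quaternion (zero real part) and apply q v q*.
    _, x, y, z = qmul(qmul(q, (0.0, v[0], v[1], v[2])), qconj(q))
    return (x, y, z)

# Example: a quarter turn about the z-axis applied to the x unit vector.
rotated = rotate((1.0, 0.0, 0.0), (0.0, 0.0, 1.0), math.pi / 2)
```

The product is not commutative (`qmul(i, j)` and `qmul(j, i)` differ in sign), which is one reason derivative rules for quaternion-valued functions need more care than in the real or complex case.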

arXiv:2009.08727 [pdf, other]

Recurrent Graph Tensor Networks: A Low-Complexity Framework for Modelling High-Dimensional Multi-Way Sequence

Authors: Yao Lei Xu, Danilo P. Mandic

Abstract: Recurrent Neural Networks (RNNs) are among the most successful machine learning models for sequence modelling, but tend to suffer from an exponential increase in the number of parameters when dealing with large multidimensional data. To this end, we develop a multi-linear graph filter framework for approximating the modelling of hidden states in RNNs, which is embedded in a tensor network architecture to improve modelling power and reduce parameter complexity, resulting in a novel Recurrent Graph Tensor Network (RGTN). The proposed framework is validated through several multi-way sequence modelling tasks and benchmarked against traditional RNNs. By virtue of the domain aware information processing of graph filters and the expressive power of tensor networks, we show that the proposed RGTN is capable of not only out-performing standard RNNs, but also mitigating the Curse of Dimensionality associated with traditional RNNs, demonstrating superior properties in terms of performance and complexity.

Submitted 11 May, 2021; v1 submitted 18 September, 2020; originally announced September 2020.

Comments: 29th European Signal Processing Conference (EUSIPCO) 2021

arXiv:2002.11835 [pdf, ps, other]

Tensor Decompositions in Deep Learning

Authors: Davide Bacciu, Danilo P. Mandic

Abstract: The paper surveys the topic of tensor decompositions in modern machine learning applications. It focuses on three active research topics of significant relevance for the community. After a brief review of consolidated works on multi-way data analysis, we consider the use of tensor decompositions in compressing the parameter space of deep learning models. Lastly, we discuss how tensor methods can be leveraged to yield richer adaptive representations of complex data, including structured information. The paper concludes with a discussion on interesting open research challenges.

Submitted 26 February, 2020; originally announced February 2020.

arXiv:2001.10109 [pdf, other]

Supervised Learning for Non-Sequential Data: A Canonical Polyadic Decomposition Approach

Authors: Alexandros Haliassos, Kriton Konstantinidis, Danilo P. Mandic

Abstract: Efficient modelling of feature interactions underpins supervised learning for non-sequential tasks, characterized by a lack of inherent ordering of features (variables). The brute force approach of learning a parameter for each interaction of every order comes at an exponential computational and memory cost (Curse of Dimensionality). To alleviate this issue, it has been proposed to implicitly represent the model parameters as a tensor, the order of which is equal to the number of features; for efficiency, it can be further factorized into a compact Tensor Train (TT) format. However, both TT and other Tensor Networks (TNs), such as Tensor Ring and Hierarchical Tucker, are sensitive to the ordering of their indices (and hence to the features). To establish the desired invariance to feature ordering, we propose to represent the weight tensor through the Canonical Polyadic (CP) Decomposition (CPD), and introduce the associated inference and learning algorithms, including suitable regularization and initialization schemes. It is demonstrated that the proposed CP-based predictor significantly outperforms other TN-based predictors on sparse data while exhibiting comparable performance on dense non-sequential tasks. Furthermore, for enhanced expressiveness, we generalize the framework to allow feature mapping to arbitrarily high-dimensional feature vectors. In conjunction with feature vector normalization, this is shown to yield dramatic improvements in performance for dense non-sequential tasks, matching models such as fully-connected neural networks.

Submitted 30 March, 2021; v1 submitted 27 January, 2020; originally announced January 2020.

Comments: Accepted at IEEE Transactions on Neural Networks and Learning Systems
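
Illustrative note: the abstract above describes a weight tensor whose order equals the number of features, represented through a CP decomposition. A generic way to evaluate such a model is sketched below. This is not the authors' algorithm: the feature map phi(x) = [1, x], the data layout, and the names `feature_map` and `cp_predict` are assumptions made for illustration. With a rank-R CP weight tensor, the inner product with the outer product of the feature vectors reduces to a sum over R terms, each a product of per-feature inner products, so the full tensor is never formed.

```python
def feature_map(x):
    # Assumed local feature map phi(x) = [1, x]; other maps are possible.
    return [1.0, x]

def cp_predict(x, factors):
    """Evaluate sum_r prod_d <factors[d][r], phi(x[d])>.

    x       : list of D scalar features
    factors : factors[d][r] is the weight vector of feature d in rank-1 term r
    """
    rank = len(factors[0])
    total = 0.0
    for r in range(rank):
        term = 1.0
        for d, xd in enumerate(x):
            phi = feature_map(xd)
            # Inner product between the r-th factor of feature d and phi(x_d).
            term *= sum(w * p for w, p in zip(factors[d][r], phi))
        total += term
    return total

# Example: two features, rank-1 model.
factors = [[[1.0, 1.0]], [[0.0, 1.0]]]
score = cp_predict([2.0, 3.0], factors)
```

Because each rank-1 term is a plain product over features, permuting the features together with their factor lists leaves the score unchanged, which is the ordering invariance the abstract refers to.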

arXiv:1910.11374 [pdf, other]

Robust Principal Component Analysis Based On Maximum Correntropy Power Iterations

Authors: Jean P. Chereau, Bruno Scalzo Dees, Danilo P. Mandic

Abstract: Principal component analysis (PCA) is recognised as a quintessential data analysis technique when it comes to describing linear relationships between the features of a dataset. However, the well-known sensitivity of PCA to non-Gaussian samples and/or outliers often makes it unreliable in practice. To this end, a robust formulation of PCA is derived based on the maximum correntropy criterion (MCC) so as to maximise the expected likelihood of Gaussian distributed reconstruction errors. In this way, the proposed solution reduces to a generalised power iteration, whereby: (i) robust estimates of the principal components are obtained even in the presence of outliers; (ii) the number of principal components need not be specified in advance; and (iii) the entire set of principal components can be obtained, unlike existing approaches. The advantages of the proposed maximum correntropy power iteration (MCPI) are demonstrated through an intuitive numerical example.

Submitted 24 October, 2019; originally announced October 2019.

Comments: 5 pages, 1 figure
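
Illustrative note: the abstract above states that the proposed solution reduces to a generalised power iteration. For orientation only, the sketch below shows the ordinary (non-robust) power iteration for the leading eigenpair of a covariance matrix. It is not the MCPI algorithm and contains no correntropy weighting; the function name `power_iteration`, the fixed iteration count, and the uniform starting vector are assumptions.

```python
import math

def power_iteration(cov, iters=200):
    """Leading eigenvalue and eigenvector of a symmetric matrix (list of lists)."""
    n = len(cov)
    # Start from a uniform unit vector; it must not be orthogonal to the answer.
    v = [1.0 / math.sqrt(n)] * n
    for _ in range(iters):
        # Multiply by the matrix, then renormalise to unit length.
        w = [sum(cov[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(c * c for c in w))
        v = [c / norm for c in w]
    # The Rayleigh quotient of the unit vector v estimates the eigenvalue.
    eigval = sum(v[i] * sum(cov[i][j] * v[j] for j in range(n)) for i in range(n))
    return eigval, v

# Example: a diagonal covariance whose leading direction is the first axis.
eigval, vec = power_iteration([[2.0, 0.0], [0.0, 1.0]])
```

Plain power iteration inherits the outlier sensitivity of the sample covariance it is fed, which is the weakness the robust formulation in the abstract is aimed at.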

arXiv:1711.08171 [pdf, other]

doi 10.1609/aaai.v32i1.11823

Hypergraph $p$-Laplacian: A Differential Geometry View

Authors: Shota Saito, Danilo P Mandic, Hideyuki Suzuki

Abstract: The graph Laplacian plays key roles in information processing of relational data, and has analogies with the Laplacian in differential geometry. In this paper, we generalize the analogy between graph Laplacian and differential geometry to the hypergraph setting, and propose a novel hypergraph $p$-Laplacian. Unlike the existing two-node graph Laplacians, this generalization makes it possible to analyze hypergraphs, where the edges are allowed to connect any number of nodes. Moreover, we propose a semi-supervised learning method based on the proposed hypergraph $p$-Laplacian, and formalize them as the analogue to the Dirichlet problem, which often appears in physics. We further explore theoretical connections to normalized hypergraph cut on a hypergraph, and propose normalized cut corresponding to hypergraph $p$-Laplacian. The proposed $p$-Laplacian is shown to outperform standard hypergraph Laplacians in the experiment on a hypergraph semi-supervised learning and normalized cut setting.

Submitted 22 November, 2017; originally announced November 2017.

Comments: Extended version of our AAAI-18 paper

Journal ref: Proceedings of the AAAI Conference on Artificial Intelligence, 32(1), 3984-3991 (2018)

arXiv:1711.00487 [pdf, other]

Tensor Valued Common and Individual Feature Extraction: Multi-dimensional Perspective

Authors: Ilia Kisil, Giuseppe G. Calvi, Danilo P. Mandic

Abstract: A novel method for common and individual feature analysis from exceedingly large-scale data is proposed, in order to ensure the tractability of both the computation and storage and thus mitigate the curse of dimensionality, a major bottleneck in modern data science. This is achieved by making use of the inherent redundancy in so-called multi-block data structures, which represent multiple observations of the same phenomenon taken at different times, angles or recording conditions. Upon providing an intrinsic link between the properties of the outer vector product and extracted features in tensor decompositions (TDs), the proposed common and individual information extraction from multi-block data is performed through imposing physical meaning to otherwise unconstrained factorisation approaches. This is shown to dramatically reduce the dimensionality of search spaces for subsequent classification procedures and to yield greatly enhanced accuracy. Simulations on a multi-class classification task of large-scale extraction of individual features from a collection of partially related real-world images demonstrate the advantages of the "blessing of dimensionality" associated with TDs.

Submitted 1 November, 2017; originally announced November 2017.

arXiv:1701.04398 [pdf, other]

Automatic sleep monitoring using ear-EEG

Authors: Takashi Nakamura, Valentin Goverdovsky, Mary J. Morrell, Danilo P. Mandic

Abstract: The monitoring of sleep patterns without patient's inconvenience or involvement of a medical specialist is a clinical question of significant importance. To this end, we propose an automatic sleep stage monitoring system based on an affordable, unobtrusive, discreet, and long-term wearable in-ear sensor for recording the Electroencephalogram (ear-EEG). The selected features for sleep pattern classification from a single ear-EEG channel include the spectral edge frequency (SEF) and multi-scale fuzzy entropy (MSFE), a structural complexity feature. In this preliminary study, the manually scored hypnograms from simultaneous scalp-EEG and ear-EEG recordings of four subjects are used as labels for two analysis scenarios: 1) classification of ear-EEG hypnogram labels from ear-EEG recordings and 2) prediction of scalp-EEG hypnogram labels from ear-EEG recordings. We consider both 2-class and 4-class sleep scoring, with the achieved accuracies ranging from 78.5% to 95.2% for ear-EEG labels predicted from ear-EEG, and 76.8% to 91.8% for scalp-EEG labels predicted from ear-EEG. The corresponding kappa coefficients, which range from 0.64 to 0.83 for Scenario 1 and from 0.65 to 0.80 for Scenario 2, indicate a Substantial to Almost Perfect agreement, thus proving the feasibility of in-ear sensing for sleep monitoring in the community.

Submitted 3 January, 2017; originally announced January 2017.

arXiv:1603.07653 [pdf, ps, other]

A Quaternion Frequency and Phasor Estimator for Three-Phase Power Distribution Networks

Authors: Sayed Pouria Talebi, Professor Danilo P. Mandic

Abstract: For the first time quaternions have been used for real-time frequency estimation, where the multi-dimensional nature of quaternions allows for the full characterization of three-phase power systems. This is achieved through the use of quaternions to provide a unified framework for incorporating voltage measurements from all the phases of a three-phase system and then employing the recently introduced HR-calculus to derive a state space estimator based on the quaternion extended Kalman filter (QEKF). The components of the state space vector are designed such that they can be deployed for adaptive estimation of the system phasors. Finally, the proposed algorithm is validated through simulations using both synthetic and real-world data, which indicate that the developed quaternion frequency estimator can outperform its complex-valued counterparts.

Submitted 7 March, 2016; originally announced March 2016.

arXiv:1603.02977 [pdf, ps, other]

Frequency estimation in three-phase power systems with harmonic contamination: A multistage quaternion Kalman filtering approach

Authors: Sayed Pouria Talebi, Danilo P. Mandic

Abstract: Motivated by the need for accurate frequency information, a novel algorithm for estimating the fundamental frequency and its rate of change in three-phase power systems is developed. This is achieved through two stages of Kalman filtering. In the first stage a quaternion extended Kalman filter, which provides a unified framework for joint modeling of voltage measurements from all the phases, is used to estimate the instantaneous phase increment of the three-phase voltages. The phase increment estimates are then used as observations of the extended Kalman filter in the second stage that accounts for the dynamic behavior of the system frequency and simultaneously estimates the fundamental frequency and its rate of change. The framework is then extended to account for the presence of harmonics. Finally, the concept is validated through simulation on both synthetic and real-world data.

Submitted 8 March, 2016; originally announced March 2016.

Showing 1–10 of 10 results for author: Mandic, D P