Search | arXiv e-print repository

Demystifying the Hypercomplex: Inductive Biases in Hypercomplex Deep Learning

Authors: Danilo Comminiello, Eleonora Grassucci, Danilo P. Mandic, Aurelio Uncini

Abstract: Hypercomplex algebras have recently been gaining prominence in the field of deep learning owing to the advantages of their division algebras over real vector spaces and their superior results when dealing with multidimensional signals in real-world 3D and 4D paradigms. This paper provides a foundational framework that serves as a roadmap for understanding why hypercomplex deep learning methods are so successful and how their potential can be exploited. Such a theoretical framework is described in terms of inductive bias, i.e., a collection of assumptions, properties, and constraints that are built into training algorithms to guide their learning process toward more efficient and accurate solutions. We show that it is possible to derive specific inductive biases in the hypercomplex domains, which extend complex numbers to encompass diverse numbers and data structures. These biases prove effective in managing the distinctive properties of these domains, as well as the complex structures of multidimensional and multimodal signals. This novel perspective for hypercomplex deep learning promises to both demystify this class of methods and clarify their potential, under a unifying framework, and in this way promotes hypercomplex models as viable alternatives to traditional real-valued deep learning for multidimensional signal processing.

Submitted 11 May, 2024; originally announced May 2024.

Comments: Accepted for Publication in IEEE Signal Processing Magazine

arXiv:2404.15332 [pdf, other]

Clinical translation of machine learning algorithms for seizure detection in scalp electroencephalography: a systematic review

Authors: Nina Moutonnet, Steven White, Benjamin P Campbell, Danilo Mandic, Gregory Scott

Abstract: Machine learning algorithms for seizure detection have shown great diagnostic potential, with recent reported accuracies reaching 100%. However, few published algorithms have fully addressed the requirements for successful clinical translation. For example, the properties of training data may critically limit the generalisability of algorithms, algorithms may be sensitive to variability across EEG acquisition hardware, and run-time processing costs may render them unfeasible for real-time clinical use cases. Here, we systematically review machine learning seizure detection algorithms with a focus on clinical translatability, assessed by criteria including generalisability, run-time costs, explainability, and clinically-relevant performance metrics. For non-specialists, we provide domain-specific knowledge necessary to contextualise model development and evaluation. Our critical evaluation of machine learning algorithms with respect to their potential real-world effectiveness can help accelerate clinical translation and identify gaps in the current seizure detection literature.

Submitted 8 April, 2024; originally announced April 2024.

arXiv:2403.12285 [pdf, other]

FinLlama: Financial Sentiment Classification for Algorithmic Trading Applications

Authors: Thanos Konstantinidis, Giorgos Iacovides, Mingxue Xu, Tony G. Constantinides, Danilo Mandic

Abstract: There are multiple sources of financial news online which influence market movements and traders' decisions. This highlights the need for accurate sentiment analysis, in addition to having appropriate algorithmic trading techniques, to arrive at better informed trading decisions. Standard lexicon-based sentiment approaches have demonstrated their power in aiding financial decisions. However, they are known to suffer from issues related to context sensitivity and word ordering. Large Language Models (LLMs) can also be used in this context, but they are not finance-specific and tend to require significant computational resources. To facilitate a finance-specific LLM framework, we introduce a novel approach based on the Llama 2 7B foundational model, in order to benefit from its generative nature and comprehensive language manipulation. This is achieved by fine-tuning the Llama 2 7B model on a small portion of supervised financial sentiment analysis data, so as to jointly handle the complexities of financial lexicon and context, and further equipping it with a neural network based decision mechanism. Such a generator-classifier scheme, referred to as FinLlama, is trained not only to classify the sentiment valence but also quantify its strength, thus offering traders a nuanced insight into financial news articles. Complementing this, the implementation of parameter-efficient fine-tuning through LoRA optimises trainable parameters, thus minimising computational and memory requirements, without sacrificing accuracy.
Simulation results demonstrate the ability of the proposed FinLlama to provide a framework for enhanced portfolio management decisions and increased market returns. These results underpin the ability of FinLlama to construct high-return portfolios which exhibit enhanced resilience, even during volatile periods and unpredictable market events.

Submitted 18 March, 2024; originally announced March 2024.

arXiv:2403.10481 [pdf, other]

Tensor Star Decomposition

Authors: Wuyang Zhou, Yu-Bang Zheng, Qibin Zhao, Danilo Mandic

Abstract: A novel tensor decomposition framework, termed Tensor Star (TS) decomposition, is proposed which represents a new type of tensor network decomposition based on tensor contractions. This is achieved by connecting the core tensors in a ring shape, whereby the core tensors act as skip connections between the factor tensors and allow for direct correlation characterisation between any two arbitrary dimensions. Uniquely, this makes it possible to decompose an order-$N$ tensor into $N$ order-$3$ factor tensors $\{\mathcal{G}_{k}\}_{k=1}^{N}$ and $N$ order-$4$ core tensors $\{\mathcal{C}_{k}\}_{k=1}^{N}$, which are arranged in a star shape. Unlike the class of Tensor Train (TT) decompositions, these factor tensors are not directly connected to one another. The so obtained core tensors also enable consecutive factor tensors to have different latent ranks. In this way, the TS decomposition alleviates the "curse of dimensionality" and controls the "curse of ranks", exhibiting a storage complexity which scales linearly with the number of dimensions and as the fourth power of the ranks.

Submitted 15 March, 2024; originally announced March 2024.

arXiv:2402.14227 [pdf, other]

Quaternion recurrent neural network with real-time recurrent learning and maximum correntropy criterion

Authors: Pauline Bourigault, Dongpo Xu, Danilo P. Mandic

Abstract: We develop a robust quaternion recurrent neural network (QRNN) for real-time processing of 3D and 4D data with outliers. This is achieved by combining the real-time recurrent learning (RTRL) algorithm and the maximum correntropy criterion (MCC) as a loss function. While both the mean square error and maximum correntropy criterion are viable cost functions, it is shown that the non-quadratic maximum correntropy loss function is less sensitive to outliers, making it suitable for applications with multidimensional noisy or uncertain data. Both algorithms are derived based on the novel generalised HR (GHR) calculus, which allows for the differentiation of real functions of quaternion variables and offers the product and chain rules, thus enabling elegant and compact derivations. Simulation results in the context of motion prediction of chest internal markers for lung cancer radiotherapy, which includes regular and irregular breathing sequences, support the analysis.

Submitted 3 April, 2024; v1 submitted 21 February, 2024; originally announced February 2024.

Comments: 2024 International Joint Conference on Neural Networks (IJCNN)
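
As a rough illustration of why a correntropy-based loss is less sensitive to outliers than the mean square error, the sketch below compares the two on a small error vector. It is a real-valued toy, not the quaternion RTRL algorithm described in the abstract above; the Gaussian kernel width `sigma`, the function names, and the error values are assumptions made for the example.

```python
import numpy as np


def mse_loss(error):
    """Mean square error: grows quadratically with each error sample."""
    return float(np.mean(error ** 2))


def mcc_loss(error, sigma=1.0):
    """Correntropy-induced loss with a Gaussian kernel (assumed form).

    Maximising the mean kernel value exp(-e^2 / (2 sigma^2)) is equivalent
    to minimising 1 minus that value, so each sample contributes at most 1.
    """
    kernel = np.exp(-error ** 2 / (2.0 * sigma ** 2))
    return float(np.mean(1.0 - kernel))


# Hypothetical prediction errors, without and with a single large outlier.
clean = np.array([0.1, -0.2, 0.05, 0.15])
with_outlier = np.append(clean, 50.0)

print("MSE:", mse_loss(clean), "->", mse_loss(with_outlier))
print("MCC:", mcc_loss(clean), "->", mcc_loss(with_outlier))
```

The outlier dominates the MSE, whereas its contribution to the correntropy loss saturates, since each sample can add at most 1 to the sum before averaging.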

arXiv:2401.17380 [pdf, other]

Detecting gamma-band responses to the speech envelope for the ICASSP 2024 Auditory EEG Decoding Signal Processing Grand Challenge

Authors: Mike Thornton, Jonas Auernheimer, Constantin Jehn, Danilo Mandic, Tobias Reichenbach

Abstract: The 2024 ICASSP Auditory EEG Signal Processing Grand Challenge concerns the decoding of electroencephalography (EEG) measurements taken from participants who listened to speech material. This work details our solution to the match-mismatch sub-task: given a short temporal segment of EEG recordings and several candidate speech segments, the task is to classify which of the speech segments was time-aligned with the EEG signals. We show that high-frequency gamma-band responses to the speech envelope can be detected with high accuracy. By jointly assessing gamma-band responses and low-frequency envelope tracking, we develop a match-mismatch decoder which placed first in this task.

Submitted 30 January, 2024; originally announced January 2024.

Comments: Accepted for ICASSP 2024 (challenge track)

arXiv:2401.16729 [pdf, other]

Widely Linear Matched Filter: A Lynchpin towards the Interpretability of Complex-valued CNNs

Authors: Qingchen Wang, Zhe Li, Zdenka Babic, Wei Deng, Ljubiša Stanković, Danilo P. Mandic

Abstract: A recent study on the interpretability of real-valued convolutional neural networks (CNNs) {Stankovic_Mandic_2023CNN} has revealed a direct and physically meaningful link with the task of finding features in data through matched filters. However, applying this paradigm to illuminate the interpretability of complex-valued CNNs meets a formidable obstacle: the extension of matched filtering to a general class of noncircular complex-valued data, referred to here as the widely linear matched filter (WLMF), has been only implicit in the literature. To this end, to establish the interpretability of the operation of complex-valued CNNs, we introduce a general WLMF paradigm, provide its solution and undertake analysis of its performance. For rigor, our WLMF solution is derived without imposing any assumption on the probability density of noise. The theoretical advantages of the WLMF over its standard strictly linear counterpart (SLMF) are provided in terms of their output signal-to-noise-ratios (SNRs), with WLMF consistently exhibiting enhanced SNR. Moreover, the lower bound on the SNR gain of WLMF is derived, together with the condition for attaining this bound. This serves to revisit the convolution-activation-pooling chain in complex-valued CNNs through the lens of matched filtering, which reveals the potential of WLMFs to provide physical interpretability and enhance explainability of general complex-valued CNNs. Simulations demonstrate the agreement between the theoretical and numerical results.

Submitted 31 January, 2024; v1 submitted 29 January, 2024; originally announced January 2024.

arXiv:2401.05187 [pdf, other]

Decoding of Selective Attention to Speech From Ear-EEG Recordings

Authors: Mike Thornton, Danilo Mandic, Tobias Reichenbach

Abstract: Many people with hearing loss struggle to comprehend speech in crowded auditory scenes, even when they are using hearing aids. Future hearing technologies which can identify the focus of a listener's auditory attention, and selectively amplify that sound alone, could improve the experience that this patient group has with their hearing aids. In this work, we present the results of our experiments with an ultra-wearable in-ear electroencephalography (EEG) monitoring device. Participants listened to two competing speakers in an auditory attention experiment whilst their EEG was recorded. We show that typical neural responses to the speech envelope, as well as its onsets, can be recovered from such a device, and that the morphology of the recorded responses is indeed modulated by selective attention to speech. Features of the attended and ignored speech stream can also be reconstructed from the EEG recordings, with the reconstruction quality serving as a marker of selective auditory attention. Using the stimulus-reconstruction method, we show that with this device auditory attention can be decoded from short segments of EEG recordings which are of just a few seconds in duration. The results provide further evidence that ear-EEG systems offer good prospects for wearable auditory monitoring as well as future cognitively-steered hearing aids.

Submitted 10 January, 2024; originally announced January 2024.

arXiv:2312.09768 [pdf, other]

Decoding Envelope and Frequency-Following EEG Responses to Continuous Speech Using Deep Neural Networks

Authors: Mike Thornton, Danilo Mandic, Tobias Reichenbach

Abstract: The electroencephalogram (EEG) offers a non-invasive means by which a listener's auditory system may be monitored during continuous speech perception. Reliable auditory-EEG decoders could facilitate the objective diagnosis of hearing disorders, or find applications in cognitively-steered hearing aids. Previously, we developed decoders for the ICASSP Auditory EEG Signal Processing Grand Challenge (SPGC). These decoders aimed to solve the match-mismatch task: given a short temporal segment of EEG recordings, and two candidate speech segments, the task is to identify which of the two speech segments is temporally aligned, or matched, with the EEG segment. The decoders made use of cortical responses to the speech envelope, as well as speech-related frequency-following responses, to relate the EEG recordings to the speech stimuli. Here we comprehensively document the methods by which the decoders were developed. We extend our previous analysis by exploring the association between speaker characteristics (pitch and sex) and classification accuracy, and provide a full statistical analysis of the final performance of the decoders as evaluated on a held-out portion of the dataset. Finally, the generalisation capabilities of the decoders are characterised, by evaluating them using an entirely different dataset which contains EEG recorded under a variety of speech-listening conditions.
The results show that the match-mismatch decoders achieve accurate and robust classification, and they can even serve as auditory attention decoders without additional training.

Submitted 15 December, 2023; originally announced December 2023.

arXiv:2311.16771 [pdf, other]

The HR-Calculus: Enabling Information Processing with Quaternion Algebra

Authors: Danilo P. Mandic, Sayed Pouria Talebi, Clive Cheong Took, Yili Xia, Dongpo Xu, Min Xiang, Pauline Bourigault

Abstract: From their inception, quaternions and their division algebra have proven to be advantageous in modelling rotation/orientation in three-dimensional spaces and have seen use from the initial formulation of electromagnetic field theory through to forming the basis of quantum field theory. Despite their impressive versatility in modelling real-world phenomena, adaptive information processing techniques specifically designed for quaternion-valued signals have only recently come to the attention of the machine learning, signal processing, and control communities. The most important development in this direction is the introduction of the HR-calculus, which provides the required mathematical foundation for deriving adaptive information processing techniques directly in the quaternion domain. In this article, the foundations of the HR-calculus are revised and the required tools for deriving adaptive learning techniques suitable for dealing with quaternion-valued signals, such as the gradient operator, chain and product derivative rules, and Taylor series expansion are presented. This serves to establish the most important applications of adaptive information processing in the quaternion domain for both single-node and multi-node formulations. The article is supported by Supplementary Material, which will be referred to as SM.

Submitted 28 November, 2023; originally announced November 2023.
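
For readers unfamiliar with the quaternion division algebra mentioned in the abstract above, the following minimal sketch implements the standard Hamilton product and checks two of its well-known properties: multiplication is noncommutative, and the norm of a product equals the product of the norms. The array layout `[real, i, j, k]` and the function name are assumptions of this example; nothing here reproduces the HR-calculus itself.

```python
import numpy as np


def hamilton_product(p, q):
    """Hamilton product of two quaternions stored as [real, i, j, k]."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return np.array([
        a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,  # real part
        a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,  # i component
        a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,  # j component
        a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2,  # k component
    ])


i_unit = np.array([0.0, 1.0, 0.0, 0.0])
j_unit = np.array([0.0, 0.0, 1.0, 0.0])

ij = hamilton_product(i_unit, j_unit)  # equals +k
ji = hamilton_product(j_unit, i_unit)  # equals -k, so ij != ji

# The norm is multiplicative: |pq| = |p| |q|.
p = np.array([1.0, 2.0, -1.0, 0.5])
q = np.array([0.3, -0.7, 2.0, 1.0])
norm_of_product = np.linalg.norm(hamilton_product(p, q))
product_of_norms = np.linalg.norm(p) * np.linalg.norm(q)
```

Because multiplication does not commute, derivative rules such as the product and chain rules need care in the quaternion domain, which is the gap the abstract says the HR-calculus addresses.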

arXiv:2310.15742 [pdf, other]

Improving Diffusion Models for ECG Imputation with an Augmented Template Prior

Authors: Alexander Jenkins, Zehua Chen, Fu Siong Ng, Danilo Mandic

Abstract: Pulsative signals such as the electrocardiogram (ECG) are extensively collected as part of routine clinical care. However, noisy and poor-quality recordings are a major issue for signals collected using mobile health systems, decreasing the signal quality, leading to missing values, and affecting automated downstream tasks. Recent studies have explored the imputation of missing values in ECG with probabilistic time-series models. Nevertheless, in comparison with the deterministic models, their performance is still limited, as the variations across subjects and heart-beat relationships are not explicitly considered in the training objective. In this work, to improve the imputation and forecasting accuracy for ECG with probabilistic models, we present a template-guided denoising diffusion probabilistic model (DDPM), PulseDiff, which is conditioned on an informative prior for a range of health conditions. Specifically, 1) we first extract a subject-level pulsative template from the observed values to use as an informative prior of the missing values, which personalises the prior; 2) we then add beat-level stochastic shift terms to augment the prior, which considers variations in the position and amplitude of the prior at each beat; 3) we finally design a confidence score to consider the health condition of the subject, which ensures our prior is provided safely.
Experiments with the PTBXL dataset reveal that PulseDiff improves the performance of two strong DDPM baseline models, CSDI and SSSD$^{S4}$, verifying that our method guides the generation of DDPMs while managing the uncertainty. When combined with SSSD$^{S4}$, PulseDiff outperforms the leading deterministic model for short-interval missing data and is comparable for long-interval data loss.

Submitted 14 November, 2023; v1 submitted 24 October, 2023; originally announced October 2023.

arXiv:2309.03557 [pdf, ps, other]

On the dynamics of multi agent nonlinear filtering and learning

Authors: Sayed Pouria Talebi, Danilo Mandic

Abstract: Multiagent systems aim to accomplish highly complex learning tasks through decentralised consensus seeking dynamics and their use has garnered a great deal of attention in the signal processing and computational intelligence societies. This article examines the behaviour of multiagent networked systems with nonlinear filtering/learning dynamics. To this end, a general formulation for the actions of an agent in multiagent networked systems is presented and conditions for achieving a cohesive learning behaviour are given. Importantly, applications of the so-derived framework in distributed and federated learning scenarios are presented.

Submitted 19 September, 2023; v1 submitted 7 September, 2023; originally announced September 2023.

arXiv:2307.00526 [pdf, other]

TensorGPT: Efficient Compression of the Embedding Layer in LLMs based on the Tensor-Train Decomposition

Authors: Mingxue Xu, Yao Lei Xu, Danilo P. Mandic

Abstract: High-dimensional token embeddings underpin Large Language Models (LLMs), as they can capture subtle semantic information and significantly enhance the modelling of complex language patterns. However, the associated high dimensionality also introduces a considerable number of model parameters and prohibitively high model storage. To address this issue, this work proposes an approach based on the Tensor-Train Decomposition (TTD), where each token embedding is treated as a Matrix Product State (MPS) that can be efficiently computed in a distributed manner. The experimental results on GPT-2 demonstrate that, through our approach, the embedding layer can be compressed by a factor of up to 38.40, and that, at a compression factor of 3.31, the compressed model even outperforms the original GPT-2 model.

Submitted 2 July, 2023; originally announced July 2023.
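
To make the idea of treating a single token embedding as a tensor train concrete, here is a minimal TT-SVD sketch in NumPy. It is a generic textbook construction, not the authors' implementation: the embedding length 768, the reshape to `(8, 8, 12)`, the rank cap, and the random data are all assumptions chosen for illustration.

```python
import numpy as np


def tt_svd(tensor, max_rank):
    """Decompose `tensor` into tensor-train cores via sequential truncated SVDs.

    Each core has shape (rank_left, mode_size, rank_right).
    """
    dims = tensor.shape
    cores = []
    rank_prev = 1
    mat = tensor
    for k in range(len(dims) - 1):
        # Unfold so that rows index (previous rank, current mode).
        mat = mat.reshape(rank_prev * dims[k], -1)
        u, s, vt = np.linalg.svd(mat, full_matrices=False)
        rank = min(max_rank, len(s))
        cores.append(u[:, :rank].reshape(rank_prev, dims[k], rank))
        # Carry the remaining factor (singular values times V^T) forward.
        mat = s[:rank, None] * vt[:rank]
        rank_prev = rank
    cores.append(mat.reshape(rank_prev, dims[-1], 1))
    return cores


def tt_reconstruct(cores):
    """Contract the cores back into a full tensor."""
    result = cores[0]
    for core in cores[1:]:
        result = np.tensordot(result, core, axes=([-1], [0]))
    return result.reshape([core.shape[1] for core in cores])


rng = np.random.default_rng(0)
embedding = rng.standard_normal(768)      # hypothetical token embedding
tensor = embedding.reshape(8, 8, 12)      # assumed tensorisation

exact_cores = tt_svd(tensor, max_rank=64)  # rank cap not binding: lossless
small_cores = tt_svd(tensor, max_rank=2)   # aggressive truncation: lossy
small_params = sum(core.size for core in small_cores)
```

With the rank cap of 2, the three cores hold 72 numbers instead of 768. Random vectors compress poorly, so the truncated reconstruction is inaccurate here; any practical benefit depends on the structure present in real embeddings.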

arXiv:2305.19183 [pdf, other]

Graph-based Time Series Clustering for End-to-End Hierarchical Forecasting

Authors: Andrea Cini, Danilo Mandic, Cesare Alippi

Abstract: Existing relationships among time series can be exploited as inductive biases in learning effective forecasting models. In hierarchical time series, relationships among subsets of sequences induce hard constraints (hierarchical inductive biases) on the predicted values. In this paper, we propose a graph-based methodology to unify relational and hierarchical inductive biases in the context of deep learning for time series forecasting. In particular, we model both types of relationships as dependencies in a pyramidal graph structure, with each pyramidal layer corresponding to a level of the hierarchy. By exploiting modern, trainable graph pooling operators, we show that the hierarchical structure, if not available as a prior, can be learned directly from data, thus obtaining cluster assignments aligned with the forecasting objective. A differentiable reconciliation stage is incorporated into the processing architecture, allowing hierarchical constraints to act both as an architectural bias and as a regularization element for predictions. Simulation results on representative datasets show that the proposed method compares favorably against the state of the art.

Submitted 30 May, 2023; originally announced May 2023.

arXiv:2305.14102 [pdf, other]

A Deep Matched Filter For R-Peak Detection in Ear-ECG

Authors: Harry J. Davies, Ghena Hammour, Marek Zylinski, Amir Nassibi, Danilo P. Mandic

Abstract: The Ear-ECG provides a continuous Lead I electrocardiogram (ECG) by measuring the potential difference related to heart activity using electrodes that can be embedded within earphones. The significant increase in wearability and comfort afforded by Ear-ECG is often accompanied by a corresponding degradation in signal quality - a common obstacle that is shared by most wearable technologies. We aim to resolve this issue by introducing a Deep Matched Filter (Deep-MF) for the highly accurate detection of R-peaks in wearable ECG, thus enhancing the utility of Ear-ECG in real-world scenarios. The Deep-MF consists of an encoder stage (trained as part of an encoder-decoder module to reproduce ground truth ECG), and an R-peak classifier stage. Through its operation as a Matched Filter, the encoder searches for matches with an ECG template pattern in the input signal, prior to filtering the matches with the subsequent convolutional layers and selecting peaks corresponding to true ECG matches. The so condensed latent representation of R-peak information is then fed into a simple R-peak classifier, of which the output provides precise R-peak locations. The proposed Deep Matched Filter is evaluated using leave-one-subject-out cross validation over 36 subjects with an age range of 18-75, with the Deep-MF outperforming existing algorithms for R-peak detection in noisy ECG. The Deep-MF achieves a median R-peak recall of 94.9%, a median precision of 91.2% and an AUC value of 0.97.
Furthermore, we demonstrate that the Deep Matched Filter algorithm not only retains the initialised ECG kernel structure during the training process, but also amplifies portions of the ECG which it deems most valuable. Overall, the Deep Matched Filter serves as a valuable step forward for the real-world functionality of Ear-ECG and, through its explainable operation, the acceptance of deep learning models in e-health.

Submitted 23 May, 2023; originally announced May 2023.

Comments: 7 pages, 7 figures

arXiv:2305.14062 [pdf, other]

Amplitude-Independent Machine Learning for PPG through Visibility Graphs and Transfer Learning

Authors: Yuyang Miao, Harry J. Davies, Danilo P. Mandic

Abstract: Photoplethysmography (PPG) refers to the measurement of variations in blood volume using light and is a feature of most wearable devices. The PPG signals provide insight into the body's circulatory system and can be employed to extract various bio-features, such as heart rate and vascular ageing. Although several algorithms have been proposed for this purpose, many exhibit limitations, including heavy reliance on human calibration, high signal quality requirements, and a lack of generalisation. In this paper, we introduce a PPG signal processing framework that integrates graph theory and computer vision algorithms, to provide an analysis framework which is amplitude-independent and invariant to affine transformations. It also requires minimal preprocessing, fuses information through RGB channels and exhibits robust generalisation across tasks and datasets. The proposed VGTL-net achieves state-of-the-art performance in the prediction of vascular ageing and demonstrates robust estimation of continuous blood pressure waveforms.

Submitted 16 January, 2024; v1 submitted 23 May, 2023; originally announced May 2023.
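
The amplitude independence claimed in the abstract above can be illustrated with the classical natural visibility graph, in which two samples are connected when the straight line between them passes above every sample in between. The sketch below is a plain O(n^3) construction with a toy signal; whether VGTL-net uses exactly this variant is an assumption, and the function name and data are hypothetical.

```python
import numpy as np


def visibility_graph(y):
    """Adjacency matrix of the natural visibility graph of a 1-D series.

    Samples a < b are linked if every intermediate sample c lies strictly
    below the line joining (a, y[a]) and (b, y[b]).
    """
    n = len(y)
    adj = np.zeros((n, n), dtype=bool)
    for a in range(n):
        for b in range(a + 1, n):
            visible = all(
                y[c] < y[b] + (y[a] - y[b]) * (b - c) / (b - a)
                for c in range(a + 1, b)
            )
            adj[a, b] = adj[b, a] = visible
    return adj


signal = np.array([1.0, 3.0, 2.0, 4.0, 1.5, 2.5])  # toy waveform
graph = visibility_graph(signal)

# Rescaling by a positive gain and adding an offset leaves the graph unchanged.
rescaled_graph = visibility_graph(3.0 * signal + 7.0)
```

The visibility test compares each sample against a line through two other samples, so multiplying the signal by a positive constant and adding an offset changes both sides of the inequality in the same way, which is why the adjacency matrix is preserved.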

arXiv:2305.06879 [pdf, ps, other]

doi 10.1109/TSP.2023.3328053

Convex Quaternion Optimization for Signal Processing: Theory and Applications

Authors: Shuning Sun, Qiankun Diao, Dongpo Xu, Pauline Bourigault, Danilo P. Mandic

Abstract: Convex optimization methods have been extensively used in the fields of communications and signal processing. However, the theory of quaternion optimization is currently not as fully developed and systematic as that of complex and real optimization. To this end, we establish an essential theory of convex quaternion optimization for signal processing based on the generalized Hamilton-real (GHR) calculus. This is achieved in a way which conforms with traditional complex and real optimization theory. For rigor, we present five discriminant theorems for convex quaternion functions, and four discriminant criteria for strongly convex quaternion functions. Furthermore, we provide a fundamental theorem for the optimality of convex quaternion optimization problems, and demonstrate its utility through three applications in quaternion signal processing. These results provide a solid theoretical foundation for convex quaternion optimization and open avenues for further developments in signal processing applications.

Submitted 9 May, 2023; originally announced May 2023.

Journal ref: IEEE Trans. Signal Process., vol. 71, pp. 4106-4115, Oct. 2023

arXiv:2305.05675 [pdf, ps, other]

UAdam: Unified Adam-Type Algorithmic Framework for Non-Convex Stochastic Optimization

Authors: Yiming Jiang, Jinlan Liu, Dongpo Xu, Danilo P. Mandic

Abstract: Adam-type algorithms have become a preferred choice for optimisation in the deep learning setting; however, despite their success, their convergence is still not well understood. To this end, we introduce a unified framework for Adam-type algorithms (called UAdam). This is equipped with a general form of the second-order moment, which makes it possible to include Adam and its variants as special cases, such as NAdam, AMSGrad, AdaBound, AdaFom, and Adan. This is supported by a rigorous convergence analysis of UAdam in the non-convex stochastic setting, showing that UAdam converges to the neighborhood of stationary points with the rate of $\mathcal{O}(1/T)$. Furthermore, the size of the neighborhood decreases as $β$ increases. Importantly, our analysis only requires the first-order momentum factor to be close enough to 1, without any restrictions on the second-order momentum factor. Theoretical results also show that vanilla Adam can converge by selecting appropriate hyperparameters, which provides a theoretical guarantee for the analysis, applications, and further developments of the whole class of Adam-type algorithms.

Submitted 9 May, 2023; originally announced May 2023.

arXiv:2303.13565 [pdf, other]

Graph Tensor Networks: An Intuitive Framework for Designing Large-Scale Neural Learning Systems on Multiple Domains

Authors: Yao Lei Xu, Kriton Konstantinidis, Danilo P. Mandic

Abstract: Despite the omnipresence of tensors and tensor operations in modern deep learning, the use of tensor mathematics to formally design and describe neural networks is still under-explored within the deep learning community. To this end, we introduce the Graph Tensor Network (GTN) framework, an intuitive yet rigorous graphical framework for systematically designing and implementing large-scale neural learning systems on both regular and irregular domains. The proposed framework is shown to be general enough to include many popular architectures as special cases, and flexible enough to handle data on any and many data domains. The power and flexibility of the proposed framework are demonstrated through real-data experiments, resulting in improved performance at drastically lower complexity costs, by virtue of tensor algebra.

Submitted 23 March, 2023; originally announced March 2023.

arXiv:2303.06435 [pdf, other]

Relating EEG recordings to speech using envelope tracking and the speech-FFR

Authors: Mike Thornton, Danilo Mandic, Tobias Reichenbach

Abstract: During speech perception, a listener's electroencephalogram (EEG) reflects acoustic-level processing as well as higher-level cognitive factors such as speech comprehension and attention. However, decoding speech from EEG recordings is challenging due to the low signal-to-noise ratios of EEG signals. We report on an approach developed for the ICASSP 2023 'Auditory EEG Decoding' Signal Processing Grand Challenge. A simple ensembling method is shown to considerably improve upon the baseline decoder performance. Even higher classification rates are achieved by jointly decoding the speech-evoked frequency-following response and responses to the temporal envelope of speech, as well as by fine-tuning the decoders to individual subjects. Our results could have applications in the diagnosis of hearing disorders or in cognitively steered hearing aids.

Submitted 11 March, 2023; originally announced March 2023.

Comments: 2 pages, 3 figures. Accepted for ICASSP 2023 (challenge track)

arXiv:2301.12503 [pdf, other]

AudioLDM: Text-to-Audio Generation with Latent Diffusion Models

Authors: Haohe Liu, Zehua Chen, Yi Yuan, Xinhao Mei, Xubo Liu, Danilo Mandic, Wenwu Wang, Mark D. Plumbley

Abstract: Text-to-audio (TTA) systems have recently gained attention for their ability to synthesize general audio based on text descriptions. However, previous studies in TTA have limited generation quality with high computational costs. In this study, we propose AudioLDM, a TTA system that is built on a latent space to learn the continuous audio representations from contrastive language-audio pretraining (CLAP) latents. The pretrained CLAP models enable us to train LDMs with audio embedding while providing text embedding as a condition during sampling. By learning the latent representations of audio signals and their compositions without modeling the cross-modal relationship, AudioLDM is advantageous in both generation quality and computational efficiency. Trained on AudioCaps with a single GPU, AudioLDM achieves state-of-the-art TTA performance measured by both objective and subjective metrics (e.g., Fréchet distance). Moreover, AudioLDM is the first TTA system that enables various text-guided audio manipulations (e.g., style transfer) in a zero-shot fashion. Our implementation and demos are available at https://audioldm.github.io.

Submitted 9 September, 2023; v1 submitted 29 January, 2023; originally announced January 2023.

Comments: Accepted by ICML 2023. Demo and implementation at https://audioldm.github.io. Evaluation toolbox at https://github.com/haoheliu/audioldm_eval

arXiv:2301.09984 [pdf, other]

Fair and skill-diverse student group formation via constrained k-way graph partitioning

Authors: Alexander Jenkins, Imad Jaimoukha, Ljubisa Stankovic, Danilo Mandic

Abstract: Forming the right combination of students in a group promises to enable a powerful and effective environment for learning and collaboration. However, defining a group of students is a complex task which has to satisfy multiple constraints. This work introduces an unsupervised algorithm for fair and skill-diverse student group formation. This is achieved by taking account of student course marks and sensitive attributes provided by the education office. The skill sets of students are determined using unsupervised dimensionality reduction of course mark data via the Laplacian eigenmap. The problem is formulated as a constrained graph partitioning problem, whereby the diversity of skill sets in each group is maximised, group sizes are upper and lower bounded according to available resources, and the 'balance' of a sensitive attribute is lower bounded to enforce fairness in group formation. This optimisation problem is solved using integer programming and its effectiveness is demonstrated on a dataset of student course marks from Imperial College London.

Submitted 12 January, 2023; originally announced January 2023.

arXiv:2301.06831 [pdf, other]

Generalizing Impermanent Loss on Decentralized Exchanges with Constant Function Market Makers

Authors: Rohan Tangri, Peter Yatsyshin, Elisabeth A. Duijnstee, Danilo Mandic

Abstract: Liquidity providers are essential for the function of decentralized exchanges to ensure liquidity takers can be guaranteed a counterparty for their trades. However, liquidity providers investing in liquidity pools face many risks, the most prominent of which is impermanent loss. Currently, analysis of this metric is difficult to conduct due to different market maker algorithms, fee structures and concentrated liquidity dynamics across the various exchanges. To this end, we provide a framework to generalize impermanent loss for multiple asset pools obeying any constant function market maker with optional concentrated liquidity. We also discuss how pool fees fit into the framework, and identify the condition for which liquidity provisioning becomes profitable when earnings from trading fees exceed impermanent loss. Finally, we demonstrate the utility and generalizability of this framework with simulations in BalancerV2 and UniswapV3.

Submitted 17 January, 2023; originally announced January 2023.

Comments: 14 pages

arXiv:2301.06406 [pdf, other]

Hearables: Ear EEG Based Driver Fatigue Detection

Authors: Metin C. Yarici, Pierluigi Amadori, Harry Davies, Takashi Nakamura, Nico Lingg, Yiannis Demiris, Danilo P. Mandic

Abstract: Ear EEG based driver fatigue monitoring systems have the potential to provide a seamless, efficient, and feasibly deployable alternative to existing scalp EEG based systems, which are often cumbersome and impractical. However, the feasibility of detecting the relevant delta, theta, alpha, and beta band EEG activity through the ear EEG is yet to be investigated. Through measurements of scalp and ear EEG on ten subjects during a simulated, monotonous driving experiment, this study provides statistical analysis of characteristic ear EEG changes that are associated with the transition from alert to mentally fatigued states, and subsequent testing of a machine learning based automatic fatigue detection model. Novel numerical evidence is provided to support the feasibility of detection of mental fatigue with ear EEG that is in agreement with widely reported scalp EEG findings. This study paves the way for the development of ultra-wearable and readily deployable hearables based driver fatigue monitoring systems.

Submitted 16 January, 2023; originally announced January 2023.

arXiv:2301.02475 [pdf, other]

Hearables: Feasibility of Recording Cardiac Rhythms from Single Ear Locations

Authors: Metin Yarici, Wilhelm Von Rosenberg, Ghena Hammour, Harry Davies, Pierluigi Amadori, Nico Lingg, Yiannis Demiris, Danilo P. Mandic

Abstract: Wearable technologies are envisaged to provide critical support to future healthcare systems. Hearables - devices worn in the ear - are of particular interest due to their ability to provide health monitoring in an efficient, reliable and unobtrusive way. Despite the considerable potential of these devices, the ECG signal that can be acquired through a hearable device worn on a single ear is still relatively unexplored. Biophysics modelling of ECG volume conduction was used to establish principles behind the single ear ECG signal, and measurements of cardiac rhythms from 10 subjects were found to be in good correspondence with simulated equivalents. Additionally, the viability of the single ear ECG in real-world environments was determined through one hour duration measurements during a simulated driving task on 5 subjects. Results demonstrated that the single ear ECG resembles the Lead I signal, the most widely used ECG signal in the identification of heart conditions such as myocardial infarction and atrial fibrillation, and was robust against real-world measurement noise, even after prolonged measurements. This study conclusively demonstrates that hearables can enable continuous monitoring of vital signs in an unobtrusive and seamless way, with the potential for reliable identification and management of heart conditions such as myocardial infarction and atrial fibrillation.

Submitted 6 January, 2023; originally announced January 2023.

arXiv:2212.14518 [pdf, other]

ResGrad: Residual Denoising Diffusion Probabilistic Models for Text to Speech

Authors: Zehua Chen, Yihan Wu, Yichong Leng, Jiawei Chen, Haohe Liu, Xu Tan, Yang Cui, Ke Wang, Lei He, Sheng Zhao, Jiang Bian, Danilo Mandic

Abstract: Denoising Diffusion Probabilistic Models (DDPMs) are emerging in text-to-speech (TTS) synthesis because of their strong capability of generating high-fidelity samples. However, their iterative refinement process in high-dimensional data space results in slow inference speed, which restricts their application in real-time systems. Previous works have explored speeding up by minimizing the number of inference steps but at the cost of sample quality. In this work, to improve the inference speed for DDPM-based TTS models while achieving high sample quality, we propose ResGrad, a lightweight diffusion model which learns to refine the output spectrogram of an existing TTS model (e.g., FastSpeech 2) by predicting the residual between the model output and the corresponding ground-truth speech. ResGrad has several advantages: 1) Compared with other acceleration methods for DDPM which need to synthesize speech from scratch, ResGrad reduces the complexity of the task by changing the generation target from the ground-truth mel-spectrogram to the residual, resulting in a more lightweight model and thus a smaller real-time factor. 2) ResGrad is employed in the inference process of the existing TTS model in a plug-and-play way, without re-training this model. We verify ResGrad on the single-speaker dataset LJSpeech and two more challenging datasets with multiple speakers (LibriTTS) and high sampling rate (VCTK). Experimental results show that in comparison with other speed-up methods of DDPMs: 1) ResGrad achieves better sample quality with the same inference speed measured by real-time factor; 2) with similar speech quality, ResGrad synthesizes speech faster than baseline methods by more than 10 times. Audio samples are available at https://resgrad1.github.io/.

Submitted 29 December, 2022; originally announced December 2022.

Comments: 13 pages, 5 figures

arXiv:2212.12578 [pdf, other]

Rapid Extraction of Respiratory Waveforms from Photoplethysmography: A Deep Encoder Approach

Authors: Harry J. Davies, Danilo P. Mandic

Abstract: Much of the information of breathing is contained within the photoplethysmography (PPG) signal, through changes in venous blood flow, heart rate and stroke volume. We aim to leverage this fact, by employing a novel deep learning framework which is based on a repurposed convolutional autoencoder. Our model aims to encode all of the relevant respiratory information contained within the photoplethysmography waveform, and decode it into a waveform that is similar to a gold standard respiratory reference. The model is employed on two photoplethysmography data sets, namely Capnobase and BIDMC. We show that the model is capable of producing respiratory waveforms that approach the gold standard, while in turn producing state of the art respiratory rate estimates. We also show that when it comes to capturing more advanced respiratory waveform characteristics such as duty cycle, our model is for the most part unsuccessful. A suggested reason for this, in light of a previous study on in-ear PPG, is that the respiratory variations in finger-PPG are far weaker compared with other recording locations. Importantly, our model can perform these waveform estimates in a fraction of a millisecond, giving it the capacity to produce over 6 hours of respiratory waveforms in a single second. Moreover, we attempt to interpret the behaviour of the kernel weights within the model, showing that in part our model intuitively selects different breathing frequencies. The model proposed in this work could help to improve the usefulness of consumer PPG-based wearables for medical applications, where detailed respiratory information is required.

Submitted 22 December, 2022; originally announced December 2022.

arXiv:2212.02281 [pdf, other]

Complexity-based Financial Stress Evaluation

Authors: Hongjian Xiao, Yao Lei Xu, Danilo P. Mandic

Abstract: Financial markets typically exhibit dynamically complex properties as they undergo continuous interactions with economic and environmental factors. The Efficient Market Hypothesis indicates a rich difference in the structural complexity of security prices between normal (stable markets) and abnormal (financial crises) situations. Considering the analogy between market undulation of price time series and physical stress of bio-signals, we investigate whether stress indices in bio-systems can be adopted and modified so as to measure 'standard stress' in financial markets. This is achieved by employing structural complexity analysis, based on variants of univariate and multivariate sample entropy, to estimate the stress level of both financial markets on the whole and the performance of the individual financial indices. Further, we propose a novel graphical framework to establish the sensitivity of individual assets and stock markets to financial crises. This is achieved through Catastrophe Theory and entropy-based stress evaluations indicating the unique performance of each index/individual stock in response to different crises. Four major indices and four individual equities with gold prices are considered over the past 32 years from 1991-2021. Our findings based on nonlinear analyses and the proposed framework support the Efficient Market Hypothesis and reveal the relations among economic indices and within each price time series.

Submitted 5 December, 2022; originally announced December 2022.

arXiv:2211.05581 [pdf, other]

Graph-Regularized Tensor Regression: A Domain-Aware Framework for Interpretable Multi-Way Financial Modelling

Authors: Yao Lei Xu, Kriton Konstantinidis, Danilo P. Mandic

Abstract: Analytics of financial data is inherently a Big Data paradigm, as such data are collected over many assets, asset classes, countries, and time periods. This represents a challenge for modern machine learning models, as the number of model parameters needed to process such data grows exponentially with the data dimensions; an effect known as the Curse-of-Dimensionality. Recently, Tensor Decomposition (TD) techniques have shown promising results in reducing the computational costs associated with large-dimensional financial models while achieving comparable performance. However, tensor models are often unable to incorporate the underlying economic domain knowledge. To this end, we develop a novel Graph-Regularized Tensor Regression (GRTR) framework, whereby knowledge about cross-asset relations is incorporated into the model in the form of a graph Laplacian matrix. This is then used as a regularization tool to promote an economically meaningful structure within the model parameters. By virtue of tensor algebra, the proposed framework is shown to be fully interpretable, both coefficient-wise and dimension-wise. The GRTR model is validated in a multi-way financial forecasting setting and compared against competing models, and is shown to achieve improved performance at reduced computational costs. Detailed visualizations are provided to help the reader gain an intuitive understanding of the employed tensor operations.

Submitted 26 October, 2022; originally announced November 2022.

arXiv:2211.04988 [pdf, other]

Hyper-GST: Predict Metro Passenger Flow Incorporating GraphSAGE, Hypergraph, Social-meaningful Edge Weights and Temporal Exploitation

Authors: Yuyang Miao, Yao Xu, Danilo Mandic

Abstract: Predicting metro passenger flow precisely is of great importance for dynamic traffic planning. Deep learning algorithms have been widely applied due to their robust performance in modelling non-linear systems. However, traditional deep learning algorithms completely discard the inherent graph structure within the metro system. Graph-based deep learning algorithms could utilise the graph structure but raise a few challenges, such as how to determine the weights of the edges and the shallow receptive field caused by the over-smoothing issue. To address these challenges, this study proposes a model based on GraphSAGE with an edge weights learner applied. The edge weights learner utilises socially meaningful features to generate edge weights. Hypergraph and temporal exploitation modules are also constructed as add-ons for better performance. A comparison study is conducted on the proposed algorithm and other state-of-the-art graph neural networks, where the proposed algorithm could improve the performance.

Submitted 9 November, 2022; originally announced November 2022.

arXiv:2210.08521 [pdf, other]

Demystifying CNNs for Images by Matched Filters

Authors: Shengxi Li, Xinyi Zhao, Ljubisa Stankovic, Danilo Mandic

Abstract: The success of convolutional neural networks (CNNs) has been revolutionising the way we approach and use intelligent machines in the Big Data era. Despite success, CNNs have been consistently put under scrutiny owing to their black-box nature, an ad hoc manner of their construction, together with the lack of theoretical support and physical meanings of their operation. This has been prohibitive to both the quantitative and qualitative understanding of CNNs, and their application in more sensitive areas such as AI for health. We set out to address these issues, and in this way demystify the operation of CNNs, by employing the perspective of matched filtering. We first illuminate that the convolution operation, the very core of CNNs, represents a matched filter which aims to identify the presence of features in input data. This then serves as a vehicle to interpret the convolution-activation-pooling chain in CNNs under the theoretical umbrella of matched filtering, a common operation in signal processing. We further provide extensive examples and experiments to illustrate this connection, whereby the learning in CNNs is shown to also perform matched filtering, which further sheds light onto the physical meaning of learnt parameters and layers. It is our hope that this material will provide new insights into the understanding, constructing and analysing of CNNs, as well as paving the way for developing new methods and architectures of CNNs.

Submitted 16 October, 2022; originally announced October 2022.

arXiv:2207.08629 [pdf, other]

Comprehensive Graph Gradual Pruning for Sparse Training in Graph Neural Networks

Authors: Chuang Liu, Xueqi Ma, Yibing Zhan, Liang Ding, Dapeng Tao, Bo Du, Wenbin Hu, Danilo Mandic

Abstract: Graph Neural Networks (GNNs) tend to suffer from high computation costs due to the exponentially increasing scale of graph data and the number of model parameters, which restricts their utility in practical applications. To this end, some recent works focus on sparsifying GNNs with the lottery ticket hypothesis (LTH) to reduce inference costs while maintaining performance levels. However, the LTH-based methods suffer from two major drawbacks: 1) they require exhaustive and iterative training of dense models, resulting in an extremely large training computation cost, and 2) they only trim graph structures and model parameters but ignore the node feature dimension, where significant redundancy exists. To overcome the above limitations, we propose a comprehensive graph gradual pruning framework termed CGP. This is achieved by designing a during-training graph pruning paradigm to dynamically prune GNNs within one training process. Unlike LTH-based methods, the proposed CGP approach requires no re-training, which significantly reduces the computation costs. Furthermore, we design a co-sparsifying strategy to comprehensively trim all three core elements of GNNs: graph structures, node features, and model parameters. Meanwhile, aiming at refining the pruning operation, we introduce a regrowth process into our CGP framework, in order to re-establish the pruned but important connections. The proposed CGP is evaluated by using a node classification task across 6 GNN architectures, including shallow models (GCN and GAT), shallow-but-deep-propagation models (SGC and APPNP), and deep models (GCNII and ResGCN), on a total of 14 real-world graph datasets, including large-scale graph datasets from the challenging Open Graph Benchmark. Experiments reveal that our proposed strategy greatly improves both training and inference efficiency while matching or even exceeding the accuracy of existing methods.

Submitted 18 July, 2022; v1 submitted 18 July, 2022; originally announced July 2022.

Comments: 29 pages, 27 figures, submitting to IEEE TNNLS

arXiv:2207.08497 [pdf, ps, other]

Ear-EEG Sensitivity Modelling for Neural and Artifact Sources

Authors: Metin Yarici, Mike Thornton, Danilo Mandic

Abstract: The ear-EEG has emerged as a promising candidate for wearable brain monitoring in real-world scenarios. While experimental studies have validated ear-EEG in multiple scenarios, the source-sensor relationship for a variety of neural sources has not been established. In addition, a detailed theoretical analysis of the ear-EEG sensitivity to sources of artifacts is still missing. Within the present study, the sensitivity of various configurations of ear-EEG is established in the presence of neural sources from a range of brain surface locations, in addition to ocular sources for the blink, vertical saccade, and horizontal saccade eye movements which produce artifacts in the EEG signal. Results conclusively support the introduction of ear-EEG into conventional EEG paradigms for monitoring neural activity that originates from within the temporal lobes, while also revealing the extent to which ear-EEG can be used for sources further away from these regions. The use of ear-EEG for sources that are located further away from the ears is supported through the analysis of the prominence of ocular artifacts in ear-EEG. The results from this study can be used to support both existing and prospective experimental ear-EEG studies and applications in the context of both neural and ocular artifact sensitivity.

Submitted 18 July, 2022; originally announced July 2022.

arXiv:2205.14811 [pdf, other]

doi 10.1016/j.neucom.2023.01.032

Last-iterate convergence analysis of stochastic momentum methods for neural networks

Authors: Dongpo Xu, Jinlan Liu, Yinghua Lu, Jun Kong, Danilo Mandic

Abstract: The stochastic momentum method is a commonly used acceleration technique for solving large-scale stochastic optimization problems in artificial neural networks. Current convergence results of stochastic momentum methods under non-convex stochastic settings mostly discuss convergence in terms of the random output and minimum output. To this end, we address the convergence of the last iterate output (called last-iterate convergence) of the stochastic momentum methods for non-convex stochastic optimization problems, in a way conformal with traditional optimization theory. We prove the last-iterate convergence of the stochastic momentum methods under a unified framework, covering both stochastic heavy ball momentum and stochastic Nesterov accelerated gradient momentum. The momentum factors can be fixed to be constant, rather than time-varying coefficients in existing analyses. Finally, the last-iterate convergence of the stochastic momentum methods is verified on the benchmark MNIST and CIFAR-10 datasets.

Submitted 29 May, 2022; originally announced May 2022.

Comments: 21 pages, 4 figures

MSC Class: 90C26 ACM Class: G.1.6

Journal ref: Neurocomputing 527 (2023) 27-35
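
The abstract above names two update rules, stochastic heavy ball and stochastic Nesterov accelerated gradient, both with a constant momentum factor. The following is a minimal sketch of those two updates on a toy quadratic, returning the last iterate. It is not the authors' code: the function names, the step size, the momentum factor, and the additive Gaussian gradient noise are all assumptions chosen for illustration.

```python
import numpy as np


def stochastic_momentum(grad, x0, lr=0.05, beta=0.9, steps=500,
                        nesterov=False, noise=0.01, seed=0):
    """Run a momentum method with a constant momentum factor `beta`.

    `grad` is the exact gradient; Gaussian noise is added to mimic a
    stochastic gradient (an assumption made for this toy example).
    Returns the last iterate rather than an average or the best iterate.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float).copy()
    v = np.zeros_like(x)
    for _ in range(steps):
        # Nesterov evaluates the gradient at a look-ahead point,
        # heavy ball evaluates it at the current iterate.
        point = x + beta * v if nesterov else x
        g = grad(point) + noise * rng.standard_normal(x.shape)
        v = beta * v - lr * g
        x = x + v
    return x


def quad_grad(x):
    """Gradient of the toy objective f(x) = 0.5 * ||x||^2."""
    return x


x_heavy_ball = stochastic_momentum(quad_grad, [5.0, -3.0])
x_nesterov = stochastic_momentum(quad_grad, [5.0, -3.0], nesterov=True)
print(np.linalg.norm(x_heavy_ball), np.linalg.norm(x_nesterov))
```

On this convex toy problem both last iterates settle close to the minimiser at the origin; the paper's setting is non-convex, which this sketch does not attempt to reproduce.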

arXiv:2205.14807 [pdf, other]

BinauralGrad: A Two-Stage Conditional Diffusion Probabilistic Model for Binaural Audio Synthesis

Authors: Yichong Leng, Zehua Chen, Junliang Guo, Haohe Liu, Jiawei Chen, Xu Tan, Danilo Mandic, Lei He, Xiang-Yang Li, Tao Qin, Sheng Zhao, Tie-Yan Liu

Abstract: Binaural audio plays a significant role in constructing immersive augmented and virtual realities. As it is expensive to record binaural audio from the real world, synthesizing it from mono audio has attracted increasing attention. This synthesis process involves not only the basic physical warping of the mono audio, but also room reverberations and head/ear related filtrations, which, however, are difficult to accurately simulate in traditional digital signal processing. In this paper, we formulate the synthesis process from a different perspective by decomposing the binaural audio into a common part that is shared by the left and right channels as well as a specific part that differs in each channel. Accordingly, we propose BinauralGrad, a novel two-stage framework equipped with diffusion models to synthesize them respectively. Specifically, in the first stage, the common information of the binaural audio is generated with a single-channel diffusion model conditioned on the mono audio, based on which the binaural audio is generated by a two-channel diffusion model in the second stage. Combining this novel perspective of two-stage synthesis with advanced generative models (i.e., the diffusion models), the proposed BinauralGrad is able to generate accurate and high-fidelity binaural audio samples. Experiment results show that on a benchmark dataset, BinauralGrad outperforms the existing baselines by a large margin in terms of both objective and subjective evaluation metrics (Wave L2: 0.128 vs. 0.157, MOS: 3.80 vs. 3.61). The generated audio samples (https://speechresearch.github.io/binauralgrad) and code (https://github.com/microsoft/NeuralSpeech/tree/master/BinauralGrad) are available online.

Submitted 29 November, 2022; v1 submitted 29 May, 2022; originally announced May 2022.

Comments: NeurIPS 2022 camera version

arXiv:2202.03751 [pdf, other]

InferGrad: Improving Diffusion Models for Vocoder by Considering Inference in Training

Authors: Zehua Chen, Xu Tan, Ke Wang, Shifeng Pan, Danilo Mandic, Lei He, Sheng Zhao

Abstract: Denoising diffusion probabilistic models (diffusion models for short) require a large number of iterations in inference to achieve the generation quality that matches or surpasses the state-of-the-art generative models, which invariably results in slow inference speed. Previous approaches aim to optimize the choice of inference schedule over a few iterations to speed up inference. However, this results in reduced generation quality, mainly because the inference process is optimized separately, without jointly optimizing with the training process. In this paper, we propose InferGrad, a diffusion model for vocoder that incorporates inference process into training, to reduce the inference iterations while maintaining high generation quality. More specifically, during training, we generate data from random noise through a reverse process under inference schedules with a few iterations, and impose a loss to minimize the gap between the generated and ground-truth data samples. Then, unlike existing approaches, the training of InferGrad considers the inference process. The advantages of InferGrad are demonstrated through experiments on the LJSpeech dataset showing that InferGrad achieves better voice quality than the baseline WaveGrad under same conditions while maintaining the same voice quality as the baseline but with 3x speedup (2 iterations for InferGrad vs 6 iterations for WaveGrad).

Submitted 8 February, 2022; originally announced February 2022.

Comments: 5 Pages, 2 figures. Accepted to ICASSP 2022

arXiv:2201.09568 [pdf, other]

Pearl: Parallel Evolutionary and Reinforcement Learning Library

Authors: Rohan Tangri, Danilo P. Mandic, Anthony G. Constantinides

Abstract: Reinforcement learning is increasingly finding success across domains where the problem can be represented as a Markov decision process. Evolutionary computation algorithms have also proven successful in this domain, exhibiting similar performance to the generally more complex reinforcement learning. Whilst there exist many open-source reinforcement learning and evolutionary computation libraries, no publicly available library combines the two approaches for enhanced comparison, cooperation, or visualization. To this end, we have created Pearl (https://github.com/LondonNode/Pearl), an open source Python library designed to allow researchers to rapidly and conveniently perform optimized reinforcement learning, evolutionary computation and combinations of the two. The key features within Pearl include: modular and expandable components, opinionated module settings, Tensorboard integration, custom callbacks and comprehensive visualizations.

Submitted 24 January, 2022; originally announced January 2022.

arXiv:2111.15662 [pdf, other]

HOTTBOX: Higher Order Tensor ToolBOX

Authors: Ilya Kisil, Giuseppe G. Calvi, Bruno S. Dees, Danilo P. Mandic

Abstract: HOTTBOX is a Python library for exploratory analysis and visualisation of multi-dimensional arrays of data, also known as tensors. The library includes methods ranging from standard multi-way operations and data manipulation through to multi-linear algebra based tensor decompositions. HOTTBOX also comprises sophisticated algorithms for generalised multi-linear classification and data fusion, such as Support Tensor Machine (STM) and Tensor Ensemble Learning (TEL). For user convenience, HOTTBOX offers a unifying API which establishes a self-sufficient ecosystem for various forms of efficient representation of multi-way data and the corresponding decomposition and association algorithms. Particular emphasis is placed on scalability and interactive visualisation, to support multidisciplinary data analysis communities working on big data and tensors. HOTTBOX also provides means for integration with other popular data science libraries for visualisation and data manipulation. The source code, examples and documentation can be found at https://github.com/hottbox/hottbox.

Submitted 30 November, 2021; originally announced November 2021.

arXiv:2110.02156 [pdf, other]

Bayesian autoregressive spectral estimation

Authors: Alejandro Cuevas, Sebastián López, Danilo Mandic, Felipe Tobar

Abstract: Autoregressive (AR) time series models are widely used in parametric spectral estimation (SE), where the power spectral density (PSD) of the time series is approximated by that of the best-fit AR model, which is available in closed form. Since AR parameters are usually found via maximum-likelihood, least squares or the method of moments, AR-based SE fails to account for the uncertainty of the approximate PSD, and thus only yields point estimates. We propose to handle the uncertainty related to the AR approximation by finding the full posterior distribution of the AR parameters to then propagate this uncertainty to the PSD approximation by integrating out the AR parameters; we implement this concept by assuming two different priors over the model noise. Through practical experiments, we show that the proposed Bayesian autoregressive spectral estimation (BASE) provides point estimates that follow closely those of standard autoregressive spectral estimation (ASE), while also providing error bars. BASE is validated against ASE and the Periodogram on both synthetic and real-world signals.

Submitted 5 October, 2021; originally announced October 2021.
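
The abstract above notes that the PSD of a fitted AR model is available in closed form, and that standard AR-based spectral estimation yields only a point estimate. As a rough illustration of that standard (non-Bayesian) baseline, the sketch below fits AR coefficients by least squares and evaluates S(f) = sigma^2 / |1 - sum_k a_k exp(-j 2 pi f k)|^2. The function names, the synthetic AR(2) signal, and the frequency grid are assumptions made for this example; the Bayesian posterior integration proposed in the paper is not implemented here.

```python
import numpy as np


def fit_ar_least_squares(x, order):
    """Least-squares fit of x[t] = sum_k a[k-1] * x[t-k] + e[t]."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    # Column k-1 holds the lag-k samples x[t-k] for t = order .. n-1.
    lagged = np.column_stack([x[order - k:n - k] for k in range(1, order + 1)])
    target = x[order:]
    a, *_ = np.linalg.lstsq(lagged, target, rcond=None)
    sigma2 = np.mean((target - lagged @ a) ** 2)  # residual noise variance
    return a, sigma2


def ar_psd(a, sigma2, freqs):
    """Closed-form AR power spectral density; freqs in cycles per sample."""
    k = np.arange(1, len(a) + 1)
    # AR polynomial evaluated on the unit circle at each frequency.
    denom = 1.0 - np.exp(-2j * np.pi * np.outer(freqs, k)) @ a
    return sigma2 / np.abs(denom) ** 2


# Synthetic AR(2) signal with known (assumed) coefficients.
rng = np.random.default_rng(0)
true_a = np.array([1.5, -0.9])
x = np.zeros(5000)
for t in range(2, len(x)):
    x[t] = true_a[0] * x[t - 1] + true_a[1] * x[t - 2] + rng.standard_normal()

a_hat, sigma2_hat = fit_ar_least_squares(x, order=2)
freqs = np.linspace(0.0, 0.5, 256)
psd = ar_psd(a_hat, sigma2_hat, freqs)
print(a_hat, freqs[np.argmax(psd)])
```

The output is a single spectral curve with no uncertainty attached, which is the limitation the abstract says a full posterior over the AR parameters is meant to address.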

arXiv:2110.01325 [pdf, other]

doi 10.1145/3490354.3494386

Learning to Classify and Imitate Trading Agents in Continuous Double Auction Markets

Authors: Mahmoud Mahfouz, Tucker Balch, Manuela Veloso, Danilo Mandic

Abstract: Continuous double auctions such as the limit order book employed by exchanges are widely used in practice to match buyers and sellers of a variety of financial instruments. In this work, we develop an agent-based model for trading in a limit order book and show (1) how opponent modelling techniques can be applied to classify trading agent archetypes and (2) how behavioural cloning can be used to imitate these agents in a simulated setting. We experimentally compare a number of techniques for both tasks and evaluate their applicability and use in real-world scenarios.

Submitted 29 October, 2021; v1 submitted 4 October, 2021; originally announced October 2021.

arXiv:2109.09845 [pdf, other]

doi 10.3390/e24010026

Variational Embedding Multiscale Sample Entropy: complexity-based analysis for multichannel systems

Authors: Hongjian Xiao, Danilo P. Mandic

Abstract: To quantify the complexity of a system, entropy-based methods have received considerable critical attention in real-world data analysis. Among numerous entropy algorithms, amplitude-based formulas, represented by Sample Entropy, suffer from a limitation on data length, especially in practical scenarios. This shortcoming is further highlighted by the coarse-graining procedure involved in the multi-scale process. The imbalance between embedding dimension and data size will undoubtedly result in inaccurate and undefined estimates. To address this, Variational Embedding Multiscale Sample Entropy is proposed in this paper, which assigns distinct embedding dimensions to signals from various channels. The algorithm is tested on both simulated and real signals. Furthermore, the performance of the new entropy is investigated and compared with Multivariate Multiscale Sample Entropy and Variational Embedding Multiscale Diversity Entropy. Two real-world databases, wind data sets with varying regimes and a physiological database recorded from young and elderly people, were utilized. As a result, the proposed algorithm gives an improved separation in both situations.

Submitted 20 September, 2021; originally announced September 2021.

arXiv:2109.06699 [pdf, other]

An Apparatus for the Simulation of Breathing Disorders: Physically Meaningful Generation of Surrogate Data

Authors: Harry J. Davies, Ghena Hammour, Hongjian Xiao, Danilo P. Mandic

Abstract: The rapidly increasing prevalence of debilitating breathing disorders, such as chronic obstructive pulmonary disease (COPD), calls for a meaningful integration of artificial intelligence (AI) into healthcare. While this promises improved detection and monitoring of breathing disorders, AI techniques are almost invariably "data hungry", which highlights the importance of generating physically meaningful surrogate data. Indeed, domain aware surrogates would both enable an improved understanding of respiratory waveform changes with different breathing disorders and enhance the training of machine learning algorithms. To this end, we introduce an apparatus consisting of PVC tubes and 3D printed parts as a simple yet effective method of simulating both obstructive and restrictive respiratory waveforms in healthy subjects. Independent control over both inspiratory and expiratory resistances allows for the simulation of obstructive breathing disorders through the whole spectrum of FEV1/FVC spirometry ratios (used to classify COPD), ranging from healthy values to values seen in severe chronic obstructive pulmonary disease. Moreover, waveform characteristics of breathing disorders, such as a change in inspiratory duty cycle or peak flow, are also observed in the waveforms resulting from use of the artificial breathing disorder simulation apparatus. Overall, the proposed apparatus provides us with a simple, effective and physically meaningful way to generate faithful surrogate breathing disorder waveforms, a prerequisite for the use of artificial intelligence in respiratory health.

Submitted 6 October, 2022; v1 submitted 14 September, 2021; originally announced September 2021.

arXiv:2109.00626 [pdf, other]

Reducing Computational Complexity of Tensor Contractions via Tensor-Train Networks

Authors: Ilya Kisil, Giuseppe G. Calvi, Kriton Konstantinidis, Yao Lei Xu, Danilo P. Mandic

Abstract: There is a significant expansion in both volume and range of applications along with the concomitant increase in the variety of data sources. These ever-expanding trends have highlighted the necessity for more versatile analysis tools that offer greater opportunities for algorithmic developments and computationally faster operations than the standard flat-view matrix approach. Tensors, or multi-way arrays, provide such an algebraic framework which is naturally suited to data of such large volume, diversity, and veracity. Indeed, the associated tensor decompositions have demonstrated their potential in breaking the Curse of Dimensionality associated with traditional matrix methods, where a necessary exponential increase in data volume leads to adverse or even intractable consequences on computational complexity. A key tool underpinning multi-linear manipulation of tensors and tensor networks is the standard Tensor Contraction Product (TCP). However, depending on the dimensionality of the underlying tensors, the TCP also comes at the price of high computational complexity in tensor manipulation. In this work, we resort to diagrammatic tensor network manipulation to calculate such products in an efficient and computationally tractable manner, by making use of Tensor Train decomposition (TTD). This has rendered the underlying concepts easy to perceive, thereby enhancing intuition of the associated underlying operations, while preserving mathematical rigour. In addition to bypassing the cumbersome mathematical multi-linear expressions, the proposed Tensor Train Contraction Product model is shown to accelerate significantly the underlying computational operations, as it is independent of tensor order and linear in the tensor dimension, as opposed to performing the full computations through the standard approach (exponential in tensor order).

Submitted 8 September, 2021; v1 submitted 1 September, 2021; originally announced September 2021.

arXiv:2108.11663 [pdf, other]

Convolutional Neural Networks Demystified: A Matched Filtering Perspective Based Tutorial

Authors: Ljubisa Stankovic, Danilo Mandic

Abstract: Deep Neural Networks (DNN) and especially Convolutional Neural Networks (CNN) are a de-facto standard for the analysis of large volumes of signals and images. Yet, their development and underlying principles have been largely performed in an ad-hoc and black box fashion. To help demystify CNNs, we revisit their operation from first principles and a matched filtering perspective. We establish that the convolution operation within CNNs, their very backbone, represents a matched filter which examines the input signal/image for the presence of pre-defined features. This perspective is shown to be physically meaningful, and serves as a basis for a step-by-step tutorial on the operation of CNNs, including pooling, zero padding, and various ways of dimensionality reduction. Starting from first principles, both the feed-forward pass and the learning stage (via back-propagation) are illuminated in detail, both through a worked-out numerical example and the corresponding visualizations. It is our hope that this tutorial will help shed new light and physical intuition into the understanding and further development of deep neural networks.

Submitted 22 March, 2022; v1 submitted 26 August, 2021; originally announced August 2021.

Comments: 21 pages, 9 figures. arXiv admin note: text overlap with arXiv:2108.10751
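
The abstract above describes the convolution in a CNN as a matched filter that examines the input for the presence of a pre-defined feature. A small one-dimensional sketch of that idea follows: sliding a template over a signal (cross-correlation, which is what CNN "convolution" layers compute) gives its largest response where the signal contains the template. The template, the signal, and the variable names are invented for this illustration and are not taken from the tutorial.

```python
import numpy as np

# Hypothetical feature (filter kernel) to look for.
template = np.array([1.0, 2.0, 1.0])

# Hypothetical input: zeros with the feature embedded at index 5.
signal = np.zeros(12)
signal[5:8] = template

# 'valid' cross-correlation: response[i] = sum_j signal[i + j] * template[j]
response = np.correlate(signal, template, mode="valid")

# ReLU followed by a global max-pool, as in a very small CNN layer.
activated = np.maximum(response, 0.0)
detected_at = int(np.argmax(activated))
print(detected_at, activated.max())
```

The peak response equals the template energy (the sum of its squared entries), which is the standard matched-filter property when the template appears in the signal unscaled and without noise.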

arXiv:2108.10751 [pdf, other]

Understanding the Basis of Graph Convolutional Neural Networks via an Intuitive Matched Filtering Approach

Authors: Ljubisa Stankovic, Danilo Mandic

Abstract: Graph Convolutional Neural Networks (GCNN) are becoming a preferred model for data processing on irregular domains, yet their analysis and principles of operation are rarely examined due to the black box nature of NNs. To this end, we revisit the operation of GCNNs and show that their convolution layers effectively perform matched filtering of input data with the chosen patterns (features). This allows us to provide a unifying account of GCNNs through a matched filter perspective, whereby the nonlinear ReLU and max-pooling layers are also discussed within the matched filtering framework. This is followed by a step-by-step guide on information propagation and learning in GCNNs. It is also shown that standard CNNs and fully connected NNs can be obtained as a special case of GCNNs. A carefully chosen numerical example guides the reader through the various steps of GCNN operation and learning both visually and numerically.

Submitted 23 August, 2021; originally announced August 2021.

Comments: 14 pages, 6 figures, 1 table

arXiv:2106.03417 [pdf, other]

Dynamic Portfolio Cuts: A Spectral Approach to Graph-Theoretic Diversification

Authors: Alvaro Arroyo, Bruno Scalzo, Ljubisa Stankovic, Danilo P. Mandic

Abstract: Stock market returns are typically analyzed using standard regression, yet they reside on irregular domains which is a natural scenario for graph signal processing. To this end, we consider a market graph as an intuitive way to represent the relationships between financial assets. Traditional methods for estimating asset-return covariance operate under the assumption of statistical time-invariance, and are thus unable to appropriately infer the underlying true structure of the market graph. This work introduces a class of graph spectral estimators which cater for the nonstationarity inherent to asset price movements, and serve as a basis to represent the time-varying interactions between assets through a dynamic spectral market graph. Such an account of the time-varying nature of the asset-return covariance allows us to introduce the notion of dynamic spectral portfolio cuts, whereby the graph is partitioned into time-evolving clusters, allowing for online and robust asset allocation. The advantages of the proposed framework over traditional methods are demonstrated through numerical case studies using real-world price data.

Submitted 7 June, 2021; originally announced June 2021.

Comments: 5 pages, 3 Figures, 2 Tables

arXiv:2105.04991 [pdf, other]

Graph Theory for Metro Traffic Modelling

Authors: Bruno Scalzo Dees, Yao Lei Xu, Anthony G. Constantinides, Danilo P. Mandic

Abstract: A unifying graph theoretic framework for the modelling of metro transportation networks is proposed. This is achieved by first introducing a basic graph framework for the modelling of the London underground system from a diffusion law point of view. This forms a basis for the analysis of both station importance and their vulnerability, whereby the concept of graph vertex centrality plays a key role. We next explore k-edge augmentation of a graph topology, and illustrate its usefulness both for improving the network robustness and as a planning tool. Upon establishing the graph theoretic attributes of the underlying graph topology, we proceed to introduce models for processing data on such a metro graph. Commuter movement is shown to obey Fick's law of diffusion, where the graph Laplacian provides an analytical model for the diffusion process of commuter population dynamics. Finally, we also explore the application of modern deep learning models, such as graph neural networks and hyper-graph neural networks, as general purpose models for the modelling and forecasting of underground data, especially in the context of the morning and evening rush hours. Comprehensive simulations including the passenger in- and out-flows during the morning rush hour in London demonstrate the advantages of the graph models in metro planning and traffic management, a formal mathematical approach with wide economic implications.

Submitted 11 May, 2021; originally announced May 2021.

Comments: International Joint Conference on Neural Networks (IJCNN) 2021. arXiv admin note: text overlap with arXiv:1912.05964, arXiv:2001.00426
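
The abstract above states that the graph Laplacian provides an analytical model for the diffusion of commuter populations. A minimal sketch of discrete-time Laplacian diffusion on a toy graph is given below, using the update x <- x - alpha * L x with L = D - A. The four-station line graph, the initial populations, and the step size alpha are hypothetical values chosen for illustration; they are not the London Underground data or the paper's model parameters.

```python
import numpy as np

# Hypothetical line of four stations: 0 - 1 - 2 - 3 (undirected, unweighted).
adjacency = np.array([
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
], dtype=float)

degree = np.diag(adjacency.sum(axis=1))
laplacian = degree - adjacency  # combinatorial graph Laplacian L = D - A


def diffuse(x0, laplacian, alpha=0.2, steps=200):
    """Explicit Euler steps of dx/dt = -L x, i.e. Fick-style diffusion."""
    x = np.asarray(x0, dtype=float).copy()
    for _ in range(steps):
        # Each node exchanges population with its neighbours in proportion
        # to the difference in their values.
        x = x - alpha * (laplacian @ x)
    return x


# All (hypothetical) commuters start at station 0.
initial = np.array([100.0, 0.0, 0.0, 0.0])
final = diffuse(initial, laplacian)
print(final, final.sum())
```

Because the rows of L sum to zero and L is symmetric, the total population is conserved at every step, and on a connected graph the values spread towards the uniform average, provided alpha is small enough for the explicit update to be stable.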

arXiv:2105.04983 [pdf, other]

Tensor-Train Recurrent Neural Networks for Interpretable Multi-Way Financial Forecasting

Authors: Yao Lei Xu, Giuseppe G. Calvi, Danilo P. Mandic

Abstract: Recurrent Neural Networks (RNNs) represent the de facto standard machine learning tool for sequence modelling, owing to their expressive power and memory. However, when dealing with large dimensional data, the corresponding exponential increase in the number of parameters imposes a computational bottleneck. The necessity to equip RNNs with the ability to deal with the curse of dimensionality, such as through the parameter compression ability inherent to tensors, has led to the development of the Tensor-Train RNN (TT-RNN). Despite achieving promising results in many applications, the full potential of the TT-RNN is yet to be explored in the context of interpretable financial modelling, a notoriously challenging task characterized by multi-modal data with low signal-to-noise ratio. To address this issue, we investigate the potential of TT-RNN in the task of financial forecasting of currencies. We show, through the analysis of TT-factors, that the physical meaning underlying tensor decomposition enables the TT-RNN model to aid the interpretability of results, thus mitigating the notorious "black-box" issue associated with neural networks. Furthermore, simulation results highlight the regularization power of TT decomposition, demonstrating the superior performance of TT-RNN over its uncompressed RNN counterpart and other tensor forecasting methods.

Submitted 11 May, 2021; originally announced May 2021.

Comments: International Joint Conference on Neural Networks (IJCNN) 2021

arXiv:2103.14998 [pdf, other]

Tensor Networks for Multi-Modal Non-Euclidean Data

Authors: Yao Lei Xu, Kriton Konstantinidis, Danilo P. Mandic

Abstract: Modern data sources are typically large-scale and multi-modal in nature, and acquired on irregular domains, which poses serious challenges to traditional deep learning models. These issues are partially mitigated by either extending existing deep learning algorithms to irregular domains through graphs, or by employing tensor methods to alleviate the computational bottlenecks imposed by the Curse of Dimensionality. To simultaneously resolve both these issues, we introduce a novel Multi-Graph Tensor Network (MGTN) framework, which leverages the desirable properties of graphs, tensors and neural networks in a physically meaningful and compact manner. This equips MGTNs with the ability to exploit local information in irregular data sources at a drastically reduced parameter complexity, and over a range of learning paradigms such as regression, classification and reinforcement learning. The benefits of the MGTN framework, especially its ability to avoid overfitting through the inherent low-rank regularization properties of tensor networks, are demonstrated through its superior performance against competing models in the individual tensor, graph, and neural network domains.

Submitted 27 March, 2021; originally announced March 2021.

Comments: arXiv admin note: substantial text overlap with arXiv:2010.13209

arXiv:2103.07948 [pdf, other]

Von Mises-Fisher Elliptical Distribution

Authors: Shengxi Li, Danilo Mandic

Abstract: A large class of modern probabilistic learning systems assumes symmetric distributions; however, real-world data tend to obey skewed distributions and are thus not always adequately modelled through symmetric distributions. To address this issue, elliptical distributions are increasingly used to generalise symmetric distributions, and further improvements to skewed elliptical distributions have recently attracted much attention. However, existing approaches are either hard to estimate or have complicated and abstract representations. To this end, we propose to employ the von-Mises-Fisher (vMF) distribution to obtain an explicit and simple probability representation of the skewed elliptical distribution. This is shown not only to allow us to deal with non-symmetric learning systems, but also to provide a physically meaningful way of generalising skewed distributions. For rigour, our extension is proved to share important and desirable properties with its symmetric counterpart. We also demonstrate that the proposed vMF distribution is both easy to generate and stable to estimate, both theoretically and through examples.

Submitted 14 March, 2021; originally announced March 2021.

Showing 1–50 of 113 results for author: Mandic, D