Search | arXiv e-print repository

doi 10.1109/LSP.2021.3084559

Improved Coherence Index-Based Bound in Compressive Sensing

Authors: Ljubisa Stankovic, Milos Brajovic, Danilo Mandic, Isidora Stankovic, Milos Dakovic

Abstract: Within the Compressive Sensing (CS) paradigm, sparse signals can be reconstructed based on a reduced set of measurements. Reliability of the solution is determined by the uniqueness condition. With its mathematically tractable and feasible calculation, coherence index is one of very few CS metrics with a considerable practical importance. In this paper, we propose an improvement of the coherence based uniqueness relation for the matching pursuit algorithms. Starting from a simple and intuitive derivation of the standard uniqueness condition based on the coherence index, we derive a less conservative coherence index-based lower bound for signal sparsity. The results are generalized to the uniqueness condition of the $l_0$-norm minimization for a signal represented in two orthonormal bases.

Submitted 11 March, 2021; originally announced March 2021.

Comments: 5 pages, 1 figure
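
For orientation, the standard coherence-based uniqueness condition that the abstract above takes as its starting point is commonly stated as K < (1 + 1/μ)/2, where μ is the largest absolute normalized inner product between distinct columns of the measurement matrix. The NumPy sketch below computes that standard quantity only. It is not the improved bound proposed in the paper, and the function names are illustrative assumptions.

```python
import numpy as np

def coherence_index(A):
    """Largest absolute normalized inner product between distinct columns of A."""
    A = np.asarray(A, dtype=complex)
    # Normalize every column to unit Euclidean norm (columns assumed nonzero).
    A = A / np.linalg.norm(A, axis=0, keepdims=True)
    G = np.abs(A.conj().T @ A)   # magnitudes of the Gram matrix entries
    np.fill_diagonal(G, 0.0)     # ignore each column's product with itself
    return float(G.max())

def standard_sparsity_bound(mu):
    """Standard condition: the sparsity K must satisfy K < 0.5 * (1 + 1/mu)."""
    return 0.5 * (1.0 + 1.0 / mu)
```

A smaller μ gives a larger admissible sparsity, which is why the bound is regarded as conservative for matrices with even one highly correlated pair of columns.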

arXiv:2102.00477 [pdf, other]

Nonstationary Portfolios: Diversification in the Spectral Domain

Authors: Bruno Scalzo, Alvaro Arroyo, Ljubisa Stankovic, Danilo P. Mandic

Abstract: Classical portfolio optimization methods typically determine an optimal capital allocation through the implicit, yet critical, assumption of statistical time-invariance. Such models are inadequate for real-world markets as they employ standard time-averaging based estimators which suffer significant information loss if the market observables are non-stationary. To this end, we reformulate the portfolio optimization problem in the spectral domain to cater for the nonstationarity inherent to asset price movements and, in this way, allow for optimal capital allocations to be time-varying. Unlike existing spectral portfolio techniques, the proposed framework employs augmented complex statistics in order to exploit the interactions between the real and imaginary parts of the complex spectral variables, which in turn allows for the modelling of both harmonics and cyclostationarity in the time domain. The advantages of the proposed framework over traditional methods are demonstrated through numerical simulations using real-world price data.

Submitted 31 January, 2021; originally announced February 2021.

Comments: 5 pages, 3 figures, 1 table. arXiv admin note: text overlap with arXiv:2007.13855

arXiv:2101.00647 [pdf, other]

doi 10.1109/TCDS.2022.3196841

In-Ear SpO2 for Classification of Cognitive Workload

Authors: Harry J. Davies, Ian Williams, Ghena Hammour, Metin Yarici, Barry M. Seemungal, Danilo P. Mandic

Abstract: Classification of cognitive workload promises immense benefit in diverse areas ranging from driver safety to augmenting human capability through closed loop brain computer interface. The brain is the most metabolically active organ in the body and increases its metabolic activity and thus oxygen consumption with increasing cognitive demand. In this study, we explore the feasibility of in-ear SpO2 cognitive workload tracking. To this end, we perform cognitive workload assessment in 8 subjects, based on an N-back task, whereby the subjects are asked to count and remember the number of odd numbers displayed on a screen in 5 second windows. The 2 and 3-back tasks lead to either the lowest median absolute SpO2 or largest median decrease in SpO2 in all of the subjects, indicating a robust and measurable decrease in blood oxygen in response to increased cognitive workload. Using features derived from in-ear pulse oximetry, including SpO2, pulse rate and respiration rate, we were able to classify the 4 N-back task categories, over 5 second epochs, with a mean accuracy of 94.2%. Moreover, out of 21 total features, the 9 most important features for classification accuracy were all SpO2 related features. The findings suggest that in-ear SpO2 measurements provide valuable information for classification of cognitive workload over short time windows, which together with the small form factor promises a new avenue for real time cognitive workload tracking.

Submitted 3 January, 2021; originally announced January 2021.

Comments: 8 pages, 9 figures

arXiv:2012.06104 [pdf, other]

doi 10.1016/j.inffus.2020.11.008

A Review of Hidden Markov Models and Recurrent Neural Networks for Event Detection and Localization in Biomedical Signals

Authors: Yassin Khalifa, Danilo Mandic, Ervin Sejdić

Abstract: Biomedical signals carry signature rhythms of complex physiological processes that control our daily bodily activity. The properties of these rhythms indicate the nature of interaction dynamics among physiological processes that maintain a homeostasis. Abnormalities associated with diseases or disorders usually appear as disruptions in the structure of the rhythms, which makes isolating these rhythms, and the ability to differentiate between them, indispensable. Computer aided diagnosis systems are ubiquitous nowadays in almost every medical facility and more closely in wearable technology, and rhythm or event detection is the first of many intelligent steps that they perform. How are these rhythms isolated? How can a model be developed that describes the transition between processes in time? Many methods exist in the literature that address these questions and perform the decoding of biomedical signals into separate rhythms. Here, we demystify the most effective methods that are used for detection and isolation of rhythms or events in time series and highlight the way in which they were applied to different biomedical signals and how they contribute to information fusion. The key strengths and limitations of these methods are also discussed as well as the challenges encountered with application in biomedical signals.

Submitted 10 December, 2020; originally announced December 2020.

Journal ref: Yassin Khalifa, Danilo Mandic, and Ervin Sejdić. "A review of Hidden Markov models and Recurrent Neural Networks for event detection and localization in biomedical signals." Information Fusion, Volume 69, 52-72 (2021)

arXiv:2010.13209 [pdf, other]

Multi-Graph Tensor Networks

Authors: Yao Lei Xu, Kriton Konstantinidis, Danilo P. Mandic

Abstract: The irregular and multi-modal nature of numerous modern data sources poses serious challenges for traditional deep learning algorithms. To this end, recent efforts have generalized existing algorithms to irregular domains through graphs, with the aim to gain additional insights from data through the underlying graph topology. At the same time, tensor-based methods have demonstrated promising results in bypassing the bottlenecks imposed by the Curse of Dimensionality. In this paper, we introduce a novel Multi-Graph Tensor Network (MGTN) framework, which exploits both the ability of graphs to handle irregular data sources and the compression properties of tensor networks in a deep learning setting. The potential of the proposed framework is demonstrated through an MGTN based deep Q agent for Foreign Exchange (FOREX) algorithmic trading. By virtue of the MGTN, a FOREX currency graph is leveraged to impose an economically meaningful structure on this demanding task, resulting in a highly superior performance against three competing models and at a drastically lower complexity.

Submitted 21 January, 2021; v1 submitted 25 October, 2020; originally announced October 2020.

Comments: NeurIPS 2020 - First Workshop on Quantum Tensor Networks in Machine Learning

arXiv:2009.08727 [pdf, other]

Recurrent Graph Tensor Networks: A Low-Complexity Framework for Modelling High-Dimensional Multi-Way Sequence

Authors: Yao Lei Xu, Danilo P. Mandic

Abstract: Recurrent Neural Networks (RNNs) are among the most successful machine learning models for sequence modelling, but tend to suffer from an exponential increase in the number of parameters when dealing with large multidimensional data. To this end, we develop a multi-linear graph filter framework for approximating the modelling of hidden states in RNNs, which is embedded in a tensor network architecture to improve modelling power and reduce parameter complexity, resulting in a novel Recurrent Graph Tensor Network (RGTN). The proposed framework is validated through several multi-way sequence modelling tasks and benchmarked against traditional RNNs. By virtue of the domain aware information processing of graph filters and the expressive power of tensor networks, we show that the proposed RGTN is capable of not only out-performing standard RNNs, but also mitigating the Curse of Dimensionality associated with traditional RNNs, demonstrating superior properties in terms of performance and complexity.

Submitted 11 May, 2021; v1 submitted 18 September, 2020; originally announced September 2020.

Comments: 29th European Signal Processing Conference (EUSIPCO) 2021

arXiv:2008.04145 [pdf, ps, other]

doi 10.1109/TSP.2021.3096431

A Full Second-Order Analysis of the Widely Linear MVDR Beamformer for Noncircular Signals

Authors: Zhe Li, Rui Pu, Yili Xia, Wenjiang Pei, Danilo P. Mandic

Abstract: A full performance analysis of the widely linear (WL) minimum variance distortionless response (MVDR) beamformer is introduced. While the WL MVDR is known to outperform its strictly linear counterpart, the Capon beamformer, for noncircular complex signals, the existing approaches provide limited physical insights, since they explicitly or implicitly omit the complementary second-order (SO) statistics of the output interferences and noise (IN). To this end, we exploit the full SO statistics of the output IN to introduce a full SO performance analysis framework for the WL MVDR beamformer. This makes it possible to separate the overall signal-to-interference plus noise ratio (SINR) gain of the WL MVDR beamformer w.r.t. the Capon one into the individual contributions along the in-phase (I) and quadrature (Q) channels. Next, by considering the reception of the unknown signal of interest (SOI) corrupted by an arbitrary number of orthogonal noncircular interferences, we further unveil the distribution of SINR gains in both the I and Q channels, and show that in almost all the spatial cases, these performance advantages are more pronounced when the SO noncircularity rate of the interferences increases. Illustrative numerical simulations are provided to support the theoretical results.

Submitted 29 November, 2021; v1 submitted 10 August, 2020; originally announced August 2020.

arXiv:2007.13855 [pdf, other]

A Probabilistic Spectral Analysis of Multivariate Real-Valued Nonstationary Signals

Authors: Bruno Scalzo, Ljubisa Stankovic, Danilo P. Mandic

Abstract: A class of multivariate spectral representations for real-valued nonstationary random variables is introduced, which is characterised by a general complex Gaussian distribution. In this way, the temporal signal properties -- harmonicity, wide-sense stationarity and cyclostationarity -- are designated respectively by the mean, Hermitian variance and pseudo-variance of the associated time-frequency representation (TFR). For rigour, the estimators of the TFR distribution parameters are derived within a maximum likelihood framework and are shown to be statistically consistent, owing to the statistical identifiability of the proposed distribution parametrization. By virtue of the assumed probabilistic model, a generalised likelihood ratio test (GLRT) for nonstationarity detection is also proposed. Intuitive examples demonstrate the utility of the derived probabilistic framework for spectral analysis in low-SNR environments.

Submitted 27 July, 2020; originally announced July 2020.

arXiv:2006.08413 [pdf, other]

Reciprocal Adversarial Learning via Characteristic Functions

Authors: Shengxi Li, Zeyang Yu, Min Xiang, Danilo Mandic

Abstract: Generative adversarial nets (GANs) have become a preferred tool for tasks involving complicated distributions. To stabilise the training and reduce the mode collapse of GANs, one of their main variants employs the integral probability metric (IPM) as the loss function. This provides extensive IPM-GANs with theoretical support for basically comparing moments in an embedded domain of the \textit{critic}. We generalise this by comparing the distributions rather than their moments via a powerful tool, i.e., the characteristic function (CF), which uniquely and universally comprises all the information about a distribution. For rigour, we first establish the physical meaning of the phase and amplitude in CF, and show that this provides a feasible way of balancing the accuracy and diversity of generation. We then develop an efficient sampling strategy to calculate the CFs. Within this framework, we further prove an equivalence between the embedded and data domains when a reciprocal exists, where we naturally develop the GAN in an auto-encoder structure, in a way of comparing everything in the embedded space (a semantically meaningful manifold). This efficient structure uses only two modules, together with a simple training strategy, to achieve bi-directionally generating clear images, which is referred to as the reciprocal CF GAN (RCF-GAN). Experimental results demonstrate the superior performances of the proposed RCF-GAN in terms of both generation and reconstruction.

Submitted 23 October, 2020; v1 submitted 15 June, 2020; originally announced June 2020.

Comments: This work has been accepted to NeurIPS 2020

arXiv:2006.04231 [pdf, other]

doi 10.3390/s20174879

In-Ear Measurement of Blood Oxygen Saturation: An Ambulatory Tool Needed To Detect The Delayed Life-Threatening Hypoxaemia in COVID-19

Authors: Harry J. Davies, Ian Williams, Nicholas S. Peters, Danilo P. Mandic

Abstract: Non-invasive ambulatory estimation of blood oxygen saturation has emerged as an important clinical requirement to detect hypoxemia in the delayed post-infective phase of COVID-19, where dangerous hypoxia may occur in the absence of subjective breathlessness. This immediate clinical driver, combined with the general quest for more personalised health data, means that pulse oximetry measurement of capillary oxygen saturation (SpO2) will likely expand into both the clinical and consumer market of wearable health technology in the near future. In this study, we set out to establish the feasibility of SpO2 measurement from the ear canal as a convenient site for long term monitoring, and perform a comprehensive comparison with the right index finger - the conventional clinical measurement site. During resting SpO2 estimation, we found a root mean square difference of 1.47% between the two measurement sites, with a mean difference of 0.23% higher SpO2 in the right ear canal. Through the simultaneous recording of pulse oximetry from both the right ear canal and index finger during breath holds, we observe a substantial improvement in response time between the ear and finger that has a mean of 12.4 seconds and a range of 4.2 - 24.2 seconds across all subjects. Factors which influence this response time, termed SpO2 delay, such as the sex of a subject are also explored. Furthermore, we examine the potential downsides of ear canal blood oxygen saturation measurement, namely the lower photoplethysmogram amplitude, and suggest ways to mitigate this disadvantage. These results are presented in conjunction with previously discovered benefits such as robustness to temperature, making the case for measurement of SpO2 from the ear canal being both convenient and superior to conventional finger measurement sites for continuous non-intrusive long term monitoring in both clinical and everyday-life settings.

Submitted 7 June, 2020; originally announced June 2020.

Comments: 7 pages, 9 figures

arXiv:2003.05729 [pdf, ps, other]

Methods of Adaptive Signal Processing on Graphs Using Vertex-Time Autoregressive Models

Authors: Thiernithi Variddhisai, Danilo Mandic

Abstract: The concept of a random process has been recently extended to graph signals, whereby random graph processes are a class of multivariate stochastic processes whose coefficients are matrices with a \textit{graph-topological} structure. The system identification problem of a random graph process therefore revolves around determining its underlying topology, or mathematically, the graph shift operators (GSOs), i.e., an adjacency matrix or a Laplacian matrix. In the same work that introduced random graph processes, a \textit{batch} optimization method to solve for the GSO was also proposed for the random graph process based on a \textit{causal} vertex-time autoregressive model. To this end, the online version of this optimization problem was proposed via the framework of adaptive filtering. The modified stochastic gradient projection method was employed on the regularized least squares objective to create the filter. The recursion is divided into 3 regularized sub-problems to address issues like multi-convexity, sparsity, commutativity and bias. A discussion on convergence analysis is also included. Finally, experiments are conducted to illustrate the performance of the proposed algorithm, from traditional MSE measure to successful recovery rate regardless of correct values, all of which shed light on the potential, the limit and the possible research attempt of this work.

Submitted 10 March, 2020; originally announced March 2020.

arXiv:2002.11835 [pdf, ps, other]

Tensor Decompositions in Deep Learning

Authors: Davide Bacciu, Danilo P. Mandic

Abstract: The paper surveys the topic of tensor decompositions in modern machine learning applications. It focuses on three active research topics of significant relevance for the community. After a brief review of consolidated works on multi-way data analysis, we consider the use of tensor decompositions in compressing the parameter space of deep learning models. Lastly, we discuss how tensor methods can be leveraged to yield richer adaptive representations of complex data, including structured information. The paper concludes with a discussion on interesting open research challenges.

Submitted 26 February, 2020; originally announced February 2020.

arXiv:2001.10109 [pdf, other]

Supervised Learning for Non-Sequential Data: A Canonical Polyadic Decomposition Approach

Authors: Alexandros Haliassos, Kriton Konstantinidis, Danilo P. Mandic

Abstract: Efficient modelling of feature interactions underpins supervised learning for non-sequential tasks, characterized by a lack of inherent ordering of features (variables). The brute force approach of learning a parameter for each interaction of every order comes at an exponential computational and memory cost (Curse of Dimensionality). To alleviate this issue, it has been proposed to implicitly represent the model parameters as a tensor, the order of which is equal to the number of features; for efficiency, it can be further factorized into a compact Tensor Train (TT) format. However, both TT and other Tensor Networks (TNs), such as Tensor Ring and Hierarchical Tucker, are sensitive to the ordering of their indices (and hence to the features). To establish the desired invariance to feature ordering, we propose to represent the weight tensor through the Canonical Polyadic (CP) Decomposition (CPD), and introduce the associated inference and learning algorithms, including suitable regularization and initialization schemes. It is demonstrated that the proposed CP-based predictor significantly outperforms other TN-based predictors on sparse data while exhibiting comparable performance on dense non-sequential tasks. Furthermore, for enhanced expressiveness, we generalize the framework to allow feature mapping to arbitrarily high-dimensional feature vectors. In conjunction with feature vector normalization, this is shown to yield dramatic improvements in performance for dense non-sequential tasks, matching models such as fully-connected neural networks.

Submitted 30 March, 2021; v1 submitted 27 January, 2020; originally announced January 2020.

Comments: Accepted at IEEE Transactions on Neural Networks and Learning Systems

arXiv:2001.00426 [pdf, other]

Graph Signal Processing -- Part III: Machine Learning on Graphs, from Graph Topology to Applications

Authors: Ljubisa Stankovic, Danilo Mandic, Milos Dakovic, Milos Brajovic, Bruno Scalzo, Shengxi Li, Anthony G. Constantinides

Abstract: Many modern data analytics applications on graphs operate on domains where graph topology is not known a priori, and hence its determination becomes part of the problem definition, rather than serving as prior knowledge which aids the problem solution. Part III of this monograph starts by addressing ways to learn graph topology, from the case where the physics of the problem already suggest a possible topology, through to most general cases where the graph topology is learned from the data. A particular emphasis is on graph topology definition based on the correlation and precision matrices of the observed data, combined with additional prior knowledge and structural conditions, such as the smoothness or sparsity of graph connections. For learning sparse graphs (with a small number of edges), the least absolute shrinkage and selection operator, known as LASSO, is employed, along with its graph specific variant, graphical LASSO. For completeness, both variants of LASSO are derived in an intuitive way, and explained. An in-depth elaboration of the graph topology learning paradigm is provided through several examples on physically well defined graphs, such as electric circuits, linear heat transfer, social and computer networks, and spring-mass systems. As many graph neural networks (GNN) and convolutional graph networks (GCN) are emerging, we have also reviewed the main trends in GNNs and GCNs, from the perspective of graph signal filtering. Tensor representation of lattice-structured graphs is next considered, and it is shown that tensors (multidimensional data arrays) are a special class of graph signals, whereby the graph vertices reside on a high-dimensional regular lattice structure. This part of the monograph concludes with two emerging applications in financial data processing and underground transportation networks modeling.

Submitted 2 January, 2020; originally announced January 2020.

Comments: 61 pages, 55 figures, 40 examples

arXiv:1912.05964 [pdf, other]

Graph Theory and Metro Traffic Modelling

Authors: Bruno Scalzo Dees, Anthony G. Constantinides, Danilo P. Mandic

Abstract: In this article we demonstrate how graph theory can be used to identify those stations in the London underground network which have the greatest influence on the functionality of the traffic, and proceed, in an innovative way, to assess the impact of a station closure on service levels across the city. Such underground network vulnerability analysis offers the opportunity to analyse, optimize and enhance the connectivity of the London underground network in a mathematically tractable and physically meaningful manner.

Submitted 12 December, 2019; originally announced December 2019.

Comments: 4 pages, 6 figures, 2 tables

arXiv:1911.12816 [pdf, other]

On the Importance of Opponent Modeling in Auction Markets

Authors: Mahmoud Mahfouz, Angelos Filos, Cyrine Chtourou, Joshua Lockhart, Samuel Assefa, Manuela Veloso, Danilo Mandic, Tucker Balch

Abstract: The dynamics of financial markets are driven by the interactions between participants, as well as the trading mechanisms and regulatory frameworks that govern these interactions. Decision-makers would rather not ignore the impact of other participants on these dynamics and should employ tools and models that take this into account. To this end, we demonstrate the efficacy of applying opponent-modeling in a number of simulated market settings. While our simulations are simplified representations of actual market dynamics, they provide an idealized "playground" in which our techniques can be demonstrated and tested. We present this work with the aim that our techniques could be refined and, with some effort, scaled up to the full complexity of real-world market scenarios. We hope that the results presented encourage practitioners to adopt opponent-modeling methods and apply them to online systems, in order to enable not only reactive but also proactive decisions to be made.

Submitted 28 November, 2019; originally announced November 2019.

arXiv:1911.02915 [pdf, other]

A Statistically Identifiable Model for Tensor-Valued Gaussian Random Variables

Authors: Bruno Scalzo Dees, Anh-Huy Phan, Danilo P. Mandic

Abstract: Real-world signals typically span across multiple dimensions, that is, they naturally reside on multi-way data structures referred to as tensors. In contrast to standard ``flat-view'' multivariate matrix models which are agnostic to data structure and only describe linear pairwise relationships, we introduce the tensor-valued Gaussian distribution which caters for multilinear interactions -- the linear relationship between fibers -- which is reflected by the Kronecker separable structure of the mean and covariance. By virtue of the statistical identifiability of the proposed distribution formulation, whereby different parameter values strictly generate different probability distributions, it is shown that the corresponding likelihood function can be maximised analytically to yield the maximum likelihood estimator. For rigour, the statistical consistency of the estimator is also demonstrated through numerical simulations. The probabilistic framework is then generalised to describe the joint distribution of multiple tensor-valued random variables, whereby the associated mean and covariance exhibit a Khatri-Rao separable structure. The proposed models are shown to serve as a natural basis for gridded atmospheric climate modelling.

Submitted 3 December, 2019; v1 submitted 7 November, 2019; originally announced November 2019.

Comments: 13 pages, 13 figures

arXiv:1910.11374 [pdf, other]

Robust Principal Component Analysis Based On Maximum Correntropy Power Iterations

Authors: Jean P. Chereau, Bruno Scalzo Dees, Danilo P. Mandic

Abstract: Principal component analysis (PCA) is recognised as a quintessential data analysis technique when it comes to describing linear relationships between the features of a dataset. However, the well-known sensitivity of PCA to non-Gaussian samples and/or outliers often makes it unreliable in practice. To this end, a robust formulation of PCA is derived based on the maximum correntropy criterion (MCC) so as to maximise the expected likelihood of Gaussian distributed reconstruction errors. In this way, the proposed solution reduces to a generalised power iteration, whereby: (i) robust estimates of the principal components are obtained even in the presence of outliers; (ii) the number of principal components need not be specified in advance; and (iii) the entire set of principal components can be obtained, unlike existing approaches. The advantages of the proposed maximum correntropy power iteration (MCPI) are demonstrated through an intuitive numerical example.

Submitted 24 October, 2019; originally announced October 2019.

Comments: 5 pages, 1 figure

arXiv:1910.05561 [pdf, other]

Portfolio Cuts: A Graph-Theoretic Framework to Diversification

Authors: Bruno Scalzo Dees, Ljubisa Stankovic, Anthony G. Constantinides, Danilo P. Mandic

Abstract: Investment returns naturally reside on irregular domains; however, standard multivariate portfolio optimization methods are agnostic to data structure. To this end, we investigate ways for domain knowledge to be conveniently incorporated into the analysis, by means of graphs. Next, to relax the assumption of the completeness of graph topology and to equip the graph model with practically relevant physical intuition, we introduce the portfolio cut paradigm. Such a graph-theoretic portfolio partitioning technique is shown to allow the investor to devise robust and tractable asset allocation schemes, by virtue of a rigorous graph framework for considering smaller, computationally feasible, and economically meaningful clusters of assets, based on graph cuts. In turn, this makes it possible to fully utilize the asset returns covariance matrix for constructing the portfolio, even without the requirement for its inversion. The advantages of the proposed framework over traditional methods are demonstrated through numerical simulations based on real-world price data.

Submitted 16 October, 2019; v1 submitted 12 October, 2019; originally announced October 2019.

Comments: 5 pages, 4 figures

arXiv:1909.10325 [pdf, other]

Graph Signal Processing -- Part II: Processing and Analyzing Signals on Graphs

Authors: Ljubisa Stankovic, Danilo Mandic, Milos Dakovic, Milos Brajovic, Bruno Scalzo, Anthony G. Constantinides

Abstract: The focus of Part I of this monograph has been on both the fundamental properties, graph topologies, and spectral representations of graphs. Part II embarks on these concepts to address the algorithmic and practical issues centered around data/signal processing on graphs, that is, the focus is on the analysis and estimation of both deterministic and random data on graphs. The fundamental ideas related to graph signals are introduced through a simple and intuitive, yet illustrative and general enough case study of multisensor temperature field estimation. The concept of systems on graph is defined using graph signal shift operators, which generalize the corresponding principles from traditional learning systems. At the core of the spectral domain representation of graph signals and systems is the Graph Discrete Fourier Transform (GDFT). The spectral domain representations are then used as the basis to introduce graph signal filtering concepts and address their design, including Chebyshev polynomial approximation series. Ideas related to the sampling of graph signals are presented and further linked with compressive sensing. Localized graph signal analysis in the joint vertex-spectral domain is referred to as the vertex-frequency analysis, since it can be considered as an extension of classical time-frequency analysis to the graph domain of a signal. Important topics related to the local graph Fourier transform (LGFT) are covered, together with its various forms including the graph spectral and vertex domain windows and the inversion conditions and relations. A link between the LGFT with spectral varying window and the spectral graph wavelet transform (SGWT) is also established. Realizations of the LGFT and SGWT using polynomial (Chebyshev) approximations of the spectral functions are further considered. Finally, energy versions of the vertex-frequency representations are introduced.

Submitted 23 September, 2019; originally announced September 2019.

Comments: 60 pages, 50 figures

arXiv:1909.05831 [pdf, other]

Tight Lower Bound on the Tensor Rank based on the Maximally Square Unfolding

Authors: Giuseppe G. Calvi, Bruno Scalzo Dees, Danilo P. Mandic

Abstract: Tensor decompositions are a class of tools for analysing datasets of high dimensionality and variety in a natural manner, with the Canonical Polyadic Decomposition (CPD) being a main pillar. While the notion of CPD is closely intertwined with that of the tensor rank, $R$, unlike the matrix rank, the computation of the tensor rank is an NP-hard problem, owing to the associated computational burden of evaluating the CPD. To address this issue, we investigate tight lower bounds on $R$ with the aim to provide a reduced search space, and hence to lessen the computational costs of the CPD evaluation. This is achieved by establishing a link between the maximum attainable lower bound on $R$ and the dimensions of the matrix unfolding of the tensor with aspect ratio closest to unity (maximally square). Moreover, we demonstrate that, for a generic tensor, such a lower bound can be attained under very mild conditions, whereby the tensor rank becomes detectable. Numerical examples demonstrate the benefits of this result.

Submitted 14 November, 2019; v1 submitted 12 September, 2019; originally announced September 2019.

arXiv:1909.05767 [pdf, other]

Unitary Shift Operators on a Graph

Authors: Bruno Scalzo Dees, Ljubisa Stankovic, Milos Dakovic, Anthony G. Constantinides, Danilo P. Mandic

Abstract: A unitary shift operator (GSO) for signals on a graph is introduced, which exhibits the desired property of energy preservation over both backward and forward graph shifts. For rigour, the graph differential operator is also derived in an analytical form. The commutativity relation of the shift operator with the Fourier transform is next explored in conjunction with the proposed GSO to introduce a graph discrete Fourier transform (GDFT) which, unlike existing approaches, ensures the orthogonality of GDFT bases and admits a natural frequency-domain interpretation. The proposed GDFT is shown to allow for a coherent definition of the graph discrete Hilbert transform (GDHT) and the graph analytic signal. The advantages of the proposed GSO are demonstrated through illustrative examples.

Submitted 17 September, 2019; v1 submitted 12 September, 2019; originally announced September 2019.

Comments: 5 pages, 3 figures

arXiv:1908.01596 [pdf, other]

A Class of Doubly Stochastic Shift Operators for Random Graph Signals and their Boundedness

Authors: Bruno Scalzo Dees, Ljubisa Stankovic, Milos Dakovic, Anthony G. Constantinides, Danilo P. Mandic

Abstract: A class of doubly stochastic graph shift operators (GSO) is proposed, which is shown to exhibit: (i) lower and upper $L_{2}$-boundedness for locally stationary random graph signals; (ii) $L_{2}$-isometry for i.i.d. random graph signals with the asymptotic increase in the incoming neighbourhood size of vertices; and (iii) preservation of the mean of any graph signal. These properties are obtained through a statistical consistency analysis of the graph shift, and by exploiting the dual role of the doubly stochastic GSO as a Markov (diffusion) matrix and as an unbiased expectation operator. Practical utility of the class of doubly stochastic GSOs is demonstrated in a real-world multi-sensor signal filtering setting.

Submitted 7 February, 2020; v1 submitted 5 August, 2019; originally announced August 2019.

Comments: 5 pages, 1 figure
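
Property (iii) in the abstract above can be checked numerically for any doubly stochastic matrix: because every column sums to one, summing the shifted signal gives back the sum of the original signal. The sketch below is only an illustration under that assumption. The matrix `S`, the signal `x`, and the helper names are made up for this example and are not the GSO class constructed in the paper.

```python
def is_doubly_stochastic(S, tol=1e-12):
    """True if S is square, non-negative, and all rows and columns sum to 1."""
    n = len(S)
    if any(len(row) != n for row in S):
        return False
    if any(v < 0 for row in S for v in row):
        return False
    rows_ok = all(abs(sum(row) - 1.0) <= tol for row in S)
    cols_ok = all(abs(sum(S[i][j] for i in range(n)) - 1.0) <= tol for j in range(n))
    return rows_ok and cols_ok


def shift(S, x):
    """Apply the shift matrix S to the graph signal x (matrix-vector product)."""
    return [sum(S[i][j] * x[j] for j in range(len(x))) for i in range(len(S))]


def mean(x):
    return sum(x) / len(x)


# Hypothetical 3-vertex example (not taken from the paper).
S = [
    [0.50, 0.50, 0.0],
    [0.25, 0.25, 0.5],
    [0.25, 0.25, 0.5],
]
x = [3.0, -1.0, 4.0]
y = shift(S, x)

print(is_doubly_stochastic(S), mean(x), mean(y))
```

The row-sum condition is what makes the same matrix a Markov (diffusion) matrix, while the column-sum condition is the one responsible for mean preservation.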

arXiv:1907.03471 [pdf, other]

Vertex-Frequency Graph Signal Processing: A review

Authors: Ljubisa Stankovic, Danilo P. Mandic, Milos Dakovic, Bruno Scalzo, Milos Brajovic, Ervin Sejdic, Anthony G. Constantinides

Abstract: Graph signal processing deals with signals which are observed on an irregular graph domain. While many approaches have been developed in classical graph theory to cluster vertices and segment large graphs in a signal independent way, signal localization based approaches to the analysis of data on graphs represent a new research direction which is also a key to big data analytics on graphs. To this end, after an overview of the basic definitions in graphs and graph signals, we present and discuss a localized form of the graph Fourier transform. To establish an analogy with classical signal processing, spectral- and vertex-domain definitions of the localization window are given next. The spectral and vertex localization kernels are then related to the wavelet transform, followed by a study of filtering and inversion of the localized graph Fourier transform. For rigour, the analysis of energy representation and frames in the localized graph Fourier transform is extended to the energy forms of vertex-frequency distributions, which operate even without the need to apply localization windows. Another link with classical signal processing is established through the concept of local smoothness, which is subsequently related to the particular paradigm of signal smoothness on graphs. This all represents a comprehensive account of the relation of general vertex-frequency analysis with classical time-frequency analysis, an important but missing link for more advanced applications of graph signal processing. The theory is supported by illustrative and practically relevant examples.

Submitted 26 December, 2019; v1 submitted 8 July, 2019; originally announced July 2019.

Comments: 16 pages, 12 figures

arXiv:1907.03467 [pdf, other]

Graph Signal Processing -- Part I: Graphs, Graph Spectra, and Spectral Clustering

Authors: Ljubisa Stankovic, Danilo Mandic, Milos Dakovic, Milos Brajovic, Bruno Scalzo, Tony Constantinides

Abstract: The area of Data Analytics on graphs promises a paradigm shift as we approach information processing of classes of data, which are typically acquired on irregular but structured domains (social networks, various ad-hoc sensor networks). Yet, despite its long history, current approaches mostly focus on the optimization of graphs themselves, rather than on directly inferring learning strategies, such as detection, estimation, statistical and probabilistic inference, clustering and separation from signals and data acquired on graphs. To fill this void, we first revisit graph topologies from a Data Analytics point of view, and establish a taxonomy of graph networks through a linear algebraic formalism of graph topology (vertices, connections, directivity). This serves as a basis for spectral analysis of graphs, whereby the eigenvalues and eigenvectors of graph Laplacian and adjacency matrices are shown to convey physical meaning related to both graph topology and higher-order graph properties, such as cuts, walks, paths, and neighborhoods. Next, to illustrate estimation strategies performed on graph signals, spectral analysis of graphs is introduced through eigenanalysis of mathematical descriptors of graphs and in a generic way. Finally, a framework for vertex clustering and graph segmentation is established based on graph spectral representation (eigenanalysis) which illustrates the power of graphs in various data association tasks. The supporting examples demonstrate the promise of Graph Data Analytics in modeling structural and functional/semantic inferences. At the same time, Part I serves as a basis for Part II and Part III which deal with theory, methods and applications of processing Data on Graphs and Graph Topology Learning from data.

Submitted 12 August, 2019; v1 submitted 8 July, 2019; originally announced July 2019.

Comments: 49 pages, 40 figures

arXiv:1906.03700 [pdf, other]

doi 10.1609/aaai.v34i04.5897

Solving general elliptical mixture models through an approximate Wasserstein manifold

Authors: Shengxi Li, Zeyang Yu, Min Xiang, Danilo Mandic

Abstract: We address the estimation problem for general finite mixture models, with a particular focus on the elliptical mixture models (EMMs). Compared to the widely adopted Kullback-Leibler divergence, we show that the Wasserstein distance provides a more desirable optimisation space. We thus provide a stable solution to the EMMs that is both robust to initialisations and reaches a superior optimum by adaptively optimising along a manifold of an approximate Wasserstein distance. To this end, we first provide a unifying account of computable and identifiable EMMs, which serves as a basis to rigorously address the underpinning optimisation problem. Due to a probability constraint, solving this problem is extremely cumbersome and unstable, especially under the Wasserstein distance. To relieve this issue, we introduce an efficient optimisation method on a statistical manifold defined under an approximate Wasserstein distance, which allows for explicit metrics and computable operations, thus significantly stabilising and improving the EMM estimation. We further propose an adaptive method to accelerate the convergence. Experimental results demonstrate the excellent performance of the proposed EMM solver.

Submitted 7 October, 2020; v1 submitted 9 June, 2019; originally announced June 2019.

Comments: This work has been accepted to AAAI2020. Note that this version also corrects a small error on the Equation (16) in proof

arXiv:1904.00403 [pdf, other]

On the Decomposition of Multivariate Nonstationary Multicomponent Signals

Authors: Ljubisa Stankovic, Milos Brajovic, Milos Dakovic, Danilo Mandic

Abstract: With their ability to handle an increased amount of information, multivariate and multichannel signals can be used to solve problems normally not solvable with signals obtained from a single source. One such problem is the decomposition of signals with several components whose domains of support significantly overlap in both the time and the frequency domain, including the joint time-frequency domain. Initially, we proposed a solution to this problem based on the Wigner distribution of multivariate signals, which requires the attenuation of the cross-terms. In this paper, an advanced solution based on an eigenvalue analysis of the multivariate signal autocorrelation matrix, followed by their time-frequency concentration measure minimization, is presented. This analysis provides less restrictive conditions for the signal decomposition than in the case of Wigner distribution. The algorithm for the components separation is based on the concentration measures of the eigenvector time-frequency representation, that are linear combinations of the overlapping signal components. With an increased number of sensors/channels, the robustness of the decomposition process to additive noise is also achieved. The theory is supported by numerical examples. The required channel dissimilarity is statistically investigated as well.

Submitted 31 March, 2019; originally announced April 2019.

Comments: 13 pages, 10 figures

arXiv:1903.11179 [pdf, other]

An Example-Driven Introduction to Data Analytics on Graphs

Authors: Ljubisa Stankovic, Danilo Mandic, Milos Dakovic, Ilya Kisil, Ervin Sejdic, Anthony G. Constantinides

Abstract: Graphs are irregular structures which naturally account for data integrity; however, traditional approaches have been established outside Signal Processing, and largely focus on analyzing the underlying graphs rather than signals on graphs. Given the rapidly increasing availability of multisensor and multinode measurements, likely recorded on irregular or ad-hoc grids, it would be extremely advantageous to analyze such structured data as graph signals and thus benefit from the ability of graphs to incorporate spatial awareness of the sensing locations, sensor importance, and local versus global sensor association. The aim of this lecture note is therefore to establish a common language between graph signals, defined on irregular signal domains, and some of the most fundamental paradigms in DSP, such as spectral analysis of multichannel signals, system transfer function, digital filter design, parameter estimation, and optimal filter design. This is achieved through a physically meaningful and intuitive real-world example of geographically distributed multisensor temperature estimation. A similar spatial multisensor arrangement is already widely used in Signal Processing curricula to introduce minimum variance estimators and Kalman filters \cite{HM}, and by adopting this framework we facilitate a seamless integration of graph theory into the curriculum of existing DSP courses. By bridging the gap between standard approaches and graph signal processing, we also show that standard methods can be thought of as special cases of their graph counterparts, evaluated on linear graphs. It is hoped that our approach would not only help to demystify graph theoretic approaches in education and research but it would also empower practitioners to explore a whole host of otherwise prohibitive modern applications.

Submitted 12 May, 2019; v1 submitted 26 March, 2019; originally announced March 2019.

Comments: 10 pages, 3 figures, 5 boxes

arXiv:1903.11136 [pdf, other]

An Intuitive Derivation of the Coherence Index Relation in Compressive Sensing

Authors: Ljubisa Stankovic, Danilo Mandic, Milos Dakovic, Ilya Kisil

Abstract: The existence and uniqueness conditions are a prerequisite for reliable reconstruction of sparse signals from reduced sets of measurements within the Compressive Sensing (CS) paradigm. However, despite their underpinning role for practical applications, existing uniqueness relations are either computationally prohibitive to implement (Restricted Isometry Property), or involve mathematical tools that are beyond the standard background of engineering graduates (Coherence Index). This can introduce conceptual and computational obstacles in the development of engineering intuition, the design of suboptimal practical solutions, or understanding of limitations. To this end, we introduce a simple but rigorous derivation of the coherence index condition, based on standard linear algebra, with the aim to empower signal processing practitioners with intuition in the design and ease in implementation of CS systems. Given that the coherence index is one of very few CS metrics that admits mathematically tractable and computationally feasible calculation, it is our hope that this work will help bridge the gap between the theory and applications of compressive sensing.

Submitted 26 March, 2019; originally announced March 2019.

Comments: 8 pages, 3 figures, 3 boxes
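
To make the computational feasibility mentioned in the abstract above concrete, the sketch below computes the coherence index of a small measurement matrix and evaluates the commonly quoted coherence-based uniqueness condition $K < \frac{1}{2}(1 + 1/\mu)$ for the sparsity $K$. This is a minimal illustration under stated assumptions: the matrix `A` and the function names are hypothetical, the formula is the textbook form of the condition rather than a transcription of the paper's notation, and at least two non-parallel, non-orthogonal columns are assumed so that $0 < \mu$.

```python
import math


def coherence_index(A):
    """Largest normalized inner-product magnitude between two distinct columns.

    A is given as a list of rows; entries may be real or complex.
    """
    n_rows, n_cols = len(A), len(A[0])
    cols = [[A[r][c] for r in range(n_rows)] for c in range(n_cols)]
    norms = [math.sqrt(sum(abs(v) ** 2 for v in col)) for col in cols]
    mu = 0.0
    for i in range(n_cols):
        for j in range(i + 1, n_cols):
            inner = sum(a * complex(b).conjugate() for a, b in zip(cols[i], cols[j]))
            mu = max(mu, abs(inner) / (norms[i] * norms[j]))
    return mu


def max_unique_sparsity(mu):
    """Largest integer K satisfying the strict inequality K < (1 + 1/mu) / 2."""
    if mu <= 0:
        raise ValueError("mu must be positive for this bound")
    bound = 0.5 * (1.0 + 1.0 / mu)
    return math.ceil(bound) - 1


# Hypothetical 2 x 3 measurement matrix (not taken from the paper).
s = 1.0 / math.sqrt(2.0)
A = [
    [1.0, 0.0, s],
    [0.0, 1.0, s],
]
mu = coherence_index(A)
print(mu, max_unique_sparsity(mu))
```

The double loop visits each pair of columns once, so the cost grows with the square of the number of columns, which is modest compared with checking every column subset as the Restricted Isometry Property would require.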

arXiv:1903.06133 [pdf, other]

Compression and Interpretability of Deep Neural Networks via Tucker Tensor Layer: From First Principles to Tensor Valued Back-Propagation

Authors: Giuseppe G. Calvi, Ahmad Moniri, Mahmoud Mahfouz, Qibin Zhao, Danilo P. Mandic

Abstract: This work aims to help resolve the two main stumbling blocks in the application of Deep Neural Networks (DNNs), that is, the exceedingly large number of trainable parameters and their physical interpretability. This is achieved through a tensor valued approach, based on the proposed Tucker Tensor Layer (TTL), as an alternative to the dense weight-matrices of DNNs. This allows us to treat the weight-matrices of general DNNs as a matrix unfolding of a higher order weight-tensor. By virtue of the compression properties of tensor decompositions, this enables us to introduce a novel and efficient framework for exploiting the multi-way nature of the weight-tensor in order to dramatically reduce the number of DNN parameters. We also derive the tensor valued back-propagation algorithm within the TTL framework, by extending the notion of matrix derivatives to tensors. In this way, the physical interpretability of the Tucker decomposition is exploited to gain physical insights into the NN training, through the process of computing gradients with respect to each factor matrix. The proposed framework is validated on both synthetic data, and the benchmark datasets MNIST, Fashion-MNIST, and CIFAR-10. Overall, through the ability to provide the relative importance of each data feature in training, the TTL back-propagation is shown to help mitigate the "black-box" nature inherent to NNs. Experiments also illustrate that the TTL achieves a 66.63-fold compression on MNIST and Fashion-MNIST, while, by simplifying the VGG-16 network, it achieves a 10% speed up in training time, at a comparable performance.

Submitted 6 January, 2020; v1 submitted 14 March, 2019; originally announced March 2019.

arXiv:1903.02014 [pdf, other]

Widely Linear Complex-valued Autoencoder: Dealing with Noncircularity in Generative-Discriminative Models

Authors: Zeyang Yu, Shengxi Li, Danilo Mandic

Abstract: We propose a new structure for the complex-valued autoencoder by introducing additional degrees of freedom into its design through a widely linear (WL) transform. The corresponding widely linear backpropagation algorithm is also developed using the $\mathbb{CR}$ calculus, to unify the gradient calculation of the cost function and the underlying WL model. More specifically, all the existing complex-valued autoencoders employ the strictly linear transform, which is optimal only when the complex-valued outputs of each network layer are independent of the conjugate of the inputs. In addition, the widely linear model which underpins our work allows us to consider all the second-order statistics of inputs. This provides more freedom in the design and enhanced optimization opportunities, as compared to the state-of-the-art. Furthermore, we show that the most widely adopted cost function, i.e., the mean squared error, is not best suited for the complex domain, as it is a real quantity with a single degree of freedom, while both the phase and the amplitude information need to be optimized. To resolve this issue, we design a new cost function, which is capable of controlling the balance between the phase and the amplitude contribution to the solution. The experimental results verify the superior performance of the proposed autoencoder together with the new cost function, especially for the imaging scenarios where the phase preserves extensive information on edges and shapes.

Submitted 5 March, 2019; originally announced March 2019.

arXiv:1812.06888 [pdf, other]

Tensor Ensemble Learning for Multidimensional Data

Authors: Ilia Kisil, Ahmad Moniri, Danilo P. Mandic

Abstract: In big data applications, classical ensemble learning is typically infeasible on the raw input data and dimensionality reduction techniques are necessary. To this end, a novel framework that generalises classic flat-view ensemble learning to multidimensional tensor-valued data is introduced. This is achieved by virtue of tensor decompositions, whereby the proposed method, referred to as tensor ensemble learning (TEL), decomposes every input data sample into multiple factors which allows for a flexibility in the choice of multiple learning algorithms in order to improve test performance. The TEL framework is shown to naturally compress multidimensional data in order to take advantage of the inherent multi-way data structure and exploit the benefit of ensemble learning. The proposed framework is verified through the application of Higher Order Singular Value Decomposition (HOSVD) to the ETH-80 dataset and is shown to outperform the classical ensemble learning approach of bootstrap aggregating.

Submitted 17 December, 2018; originally announced December 2018.

arXiv:1809.02288 [pdf, other]

Tensor Ring Decomposition with Rank Minimization on Latent Space: An Efficient Approach for Tensor Completion

Authors: Longhao Yuan, Chao Li, Danilo Mandic, Jianting Cao, Qibin Zhao

Abstract: In tensor completion tasks, the traditional low-rank tensor decomposition models suffer from the laborious model selection problem due to their high model sensitivity. In particular, for tensor ring (TR) decomposition, the number of model possibilities grows exponentially with the tensor order, which makes it rather challenging to find the optimal TR decomposition. In this paper, by exploiting the low-rank structure of the TR latent space, we propose a novel tensor completion method which is robust to model selection. In contrast to imposing the low-rank constraint on the data space, we introduce nuclear norm regularization on the latent TR factors, resulting in the optimization step using singular value decomposition (SVD) being performed at a much smaller scale. By leveraging the alternating direction method of multipliers (ADMM) scheme, the latent TR factors with optimal rank and the recovered tensor can be obtained simultaneously. Our proposed algorithm is shown to effectively alleviate the burden of TR-rank selection, thereby greatly reducing the computational cost. The extensive experimental results on both synthetic and real-world data demonstrate the superior performance and efficiency of the proposed approach against the state-of-the-art algorithms.

Submitted 30 November, 2018; v1 submitted 6 September, 2018; originally announced September 2018.

arXiv:1809.00535 [pdf, other]

Tensor Networks for Latent Variable Analysis: Higher Order Canonical Polyadic Decomposition

Authors: Anh-Huy Phan, Andrzej Cichocki, Ivan Oseledets, Salman Ahmadi Asl, Giuseppe Calvi, Danilo Mandic

Abstract: The Canonical Polyadic decomposition (CPD) is a convenient and intuitive tool for tensor factorization; however, for higher-order tensors, it often exhibits high computational cost and permutation of tensor entries, and these undesirable effects grow exponentially with the tensor order. Prior compression of the tensor in hand can reduce the computational cost of CPD, but this is only applicable when the rank $R$ of the decomposition does not exceed the tensor dimensions. To resolve these issues, we present a novel method for CPD of higher-order tensors, which rests upon a simple tensor network of representative inter-connected core tensors of orders not higher than 3. For rigour, we develop an exact conversion scheme from the core tensors to the factor matrices in CPD, and an iterative algorithm with low complexity to estimate these factor matrices for the inexact case. Comprehensive simulations over a variety of scenarios support the approach.

Submitted 3 September, 2018; originally announced September 2018.

Comments: 13 pages

arXiv:1807.08720 [pdf, other]

A Data Analytics Perspective of the Clarke and Related Transforms in Power Grid Analysis

Authors: Danilo P. Mandic, Sithan Kanna, Yili Xia, Ahmad Moniri, Anthony G. Constantinides

Abstract: Affordable and reliable electric power is fundamental to modern society and economy, with the Smart Grid becoming an increasingly important factor in power generation and distribution. In order to fully exploit its advantages, the analysis of modern Smart Grid requires close collaboration and convergence between power engineers and signal processing and machine learning experts. Current analysis techniques are typically derived from a Circuit Theory perspective; such an approach is adequate for only fully balanced systems operating at nominal conditions and non-obvious for data scientists - this is prohibitive for the analysis of dynamically unbalanced smart grids, where Data Analytics is not only well suited but also necessary. A common language that bridges the gap between Circuit Theory and Data Analytics, and the respective community of experts, would be a natural step forward. To this end, we revisit the Clarke and related transforms from a subspace, latent component, and spatial frequency analysis frameworks, to establish fundamental relationships between the standard three-phase transforms and modern Data Analytics. We show that the Clarke transform admits a physical interpretation as a "spatial dimensionality" reduction technique which is equivalent to Principal Component Analysis (PCA) for balanced systems, but is sub-optimal for dynamically unbalanced systems, such as the Smart Grid, while the related Park transform performs further "temporal" dimensionality reduction. Such a perspective opens numerous new avenues for the use of Signal Processing and Machine Learning in power grid research, and paves the way for innovative optimisation, transformation, and analysis techniques that are not accessible to arrive at from the standard Circuit Theory principles, as demonstrated in this work through the possibility of simultaneous frequency estimation and fault detection.

Submitted 23 July, 2018; originally announced July 2018.

Comments: 20 pages, 11 figures

arXiv:1805.08468 [pdf, other]

Rank Minimization on Tensor Ring: A New Paradigm in Scalable Tensor Decomposition and Completion

Authors: Longhao Yuan, Chao Li, Danilo Mandic, Jianting Cao, Qibin Zhao

Abstract: In low-rank tensor completion tasks, due to the underlying multiple large-scale singular value decomposition (SVD) operations and the rank selection problem, the traditional methods suffer from high computational cost and high sensitivity to model complexity. In this paper, taking advantage of the high compressibility of the recently proposed tensor ring (TR) decomposition, we propose a new model for the tensor completion problem. This is achieved through introducing convex surrogates of the tensor low-rank assumption on latent tensor ring factors, which makes it possible for the Schatten norm regularization based models to be solved at a much smaller scale. We propose two algorithms which apply different structured Schatten norms on tensor ring factors respectively. By the alternating direction method of multipliers (ADMM) scheme, the tensor ring factors and the predicted tensor can be optimized simultaneously. The experiments on synthetic data and real-world data show the high performance and efficiency of the proposed approach.

Submitted 22 May, 2018; originally announced May 2018.

arXiv:1805.08045 [pdf, other]

doi 10.1109/TNNLS.2020.3010198

A universal framework for learning the elliptical mixture model

Authors: Shengxi Li, Zeyang Yu, Danilo Mandic

Abstract: Mixture modelling using elliptical distributions promises enhanced robustness, flexibility and stability over the widely employed Gaussian mixture model (GMM). However, existing studies based on the elliptical mixture model (EMM) are restricted to several specific types of elliptical probability density functions, which are not supported by general solutions or systematic analysis frameworks; this significantly limits the rigour and the power of EMMs in applications. To this end, we propose a novel general framework for estimating and analysing the EMMs, achieved through Riemannian manifold optimisation. First, we investigate the relationships between Riemannian manifolds and elliptical distributions, and the so established connection between the original manifold and a reformulated one indicates a mismatch between those manifolds, the major cause of failure of the existing optimisation for solving general EMMs. We next propose a universal solver which is based on the optimisation of a re-designed cost and prove the existence of the same optimum as in the original problem; this is achieved in a simple, fast and stable way. We further calculate the influence functions of the EMM as theoretical bounds to quantify robustness to outliers. Comprehensive numerical results demonstrate the ability of the proposed framework to accommodate EMMs with different properties of individual functions in a stable way and with fast convergence speed. Finally, the enhanced robustness and flexibility of the proposed framework over the standard GMM are demonstrated both analytically and through comprehensive simulations.

Submitted 28 September, 2020; v1 submitted 21 May, 2018; originally announced May 2018.

Comments: This work has been accepted to IEEE Transactions on Neural Networks and Learning Systems with DOI:10.1109/TNNLS.2020.3010198. The abstract link is https://ieeexplore.ieee.org/document/9153118

arXiv:1711.08171 [pdf, other]

doi 10.1609/aaai.v32i1.11823

Hypergraph $p$-Laplacian: A Differential Geometry View

Authors: Shota Saito, Danilo P Mandic, Hideyuki Suzuki

Abstract: The graph Laplacian plays key roles in information processing of relational data, and has analogies with the Laplacian in differential geometry. In this paper, we generalize the analogy between graph Laplacian and differential geometry to the hypergraph setting, and propose a novel hypergraph $p$-Laplacian. Unlike the existing two-node graph Laplacians, this generalization makes it possible to analyze hypergraphs, where the edges are allowed to connect any number of nodes. Moreover, we propose a semi-supervised learning method based on the proposed hypergraph $p$-Laplacian, and formalize them as the analogue to the Dirichlet problem, which often appears in physics. We further explore theoretical connections to normalized hypergraph cut on a hypergraph, and propose normalized cut corresponding to hypergraph $p$-Laplacian. The proposed $p$-Laplacian is shown to outperform standard hypergraph Laplacians in the experiment on a hypergraph semi-supervised learning and normalized cut setting.

Submitted 22 November, 2017; originally announced November 2017.

Comments: Extended version of our AAAI-18 paper

Journal ref: Proceedings of the AAAI Conference on Artificial Intelligence, 32(1), 3984-3991 (2018)

arXiv:1711.04401 [pdf, other]

Quadratic Programming Over Ellipsoids (with Applications to Constrained Linear Regression and Tensor Decomposition)

Authors: Anh-Huy Phan, Masao Yamagishi, Danilo Mandic, Andrzej Cichocki

Abstract: A novel algorithm to solve the quadratic programming problem over ellipsoids is proposed. This is achieved by splitting the problem into two optimisation sub-problems, quadratic programming over a sphere and orthogonal projection. Next, an augmented-Lagrangian algorithm is developed for this multiple constraint optimisation. Benefiting from the fact that the QP over a single sphere can be solved in a closed form by solving a secular equation, we derive a tighter bound of the minimiser of the secular equation. We also propose to generate a new psd matrix with a low condition number from the matrices in the quadratic constraints. This correction method improves convergence of the proposed augmented-Lagrangian algorithm. Finally, applications of the quadratically constrained QP to bounded linear regression and tensor decompositions are presented.

Submitted 12 November, 2017; originally announced November 2017.

arXiv:1711.00701 [pdf, other]

The sum of tensor networks

Authors: Giuseppe G. Calvi, Ilia Kisil, Danilo P. Mandic

Abstract: Tensor networks (TNs) have been gaining interest as multiway data analysis tools owing to their ability to tackle the curse of dimensionality and to represent tensors as smaller-scale interconnections of their intrinsic features. However, despite the obvious advantages, the current treatment of TNs as stand-alone entities does not take full benefit of their underlying structure and the associated feature localization. To this end, embarking upon the analogy with a feature fusion, we propose a rigorous framework for the combination of TNs, focusing on their summation as the natural way for their combination. This allows for feature combination for any number of tensors, as long as their TN representation topologies are isomorphic. The benefits of the proposed framework are demonstrated on the classification of several groups of partially related images, where it outperforms standard machine learning algorithms.

Submitted 2 November, 2017; originally announced November 2017.

arXiv:1711.00487 [pdf, other]

Tensor Valued Common and Individual Feature Extraction: Multi-dimensional Perspective

Authors: Ilia Kisil, Giuseppe G. Calvi, Danilo P. Mandic

Abstract: A novel method for common and individual feature analysis from exceedingly large-scale data is proposed, in order to ensure the tractability of both the computation and storage and thus mitigate the curse of dimensionality, a major bottleneck in modern data science. This is achieved by making use of the inherent redundancy in so-called multi-block data structures, which represent multiple observations of the same phenomenon taken at different times, angles or recording conditions. Upon providing an intrinsic link between the properties of the outer vector product and extracted features in tensor decompositions (TDs), the proposed common and individual information extraction from multi-block data is performed through imposing physical meaning to otherwise unconstrained factorisation approaches. This is shown to dramatically reduce the dimensionality of search spaces for subsequent classification procedures and to yield greatly enhanced accuracy. Simulations on a multi-class classification task of large-scale extraction of individual features from a collection of partially related real-world images demonstrate the advantages of the "blessing of dimensionality" associated with TDs.

Submitted 1 November, 2017; originally announced November 2017.

arXiv:1710.04381 [pdf, other]

doi 10.1109/TSP.2018.2846250

An Augmented Nonlinear LMS for Digital Self-Interference Cancellation in Full-Duplex Direct-Conversion Transceivers

Authors: Zhe Li, Yili Xia, Wenjiang Pei, Kai Wang, Danilo P. Mandic

Abstract: In future full-duplex communications, the cancellation of self-interference (SI) arising from hardware non-idealities will play an important role in the design of mobile-scale devices. To this end, we introduce an optimal digital SI cancellation solution for shared-antenna-based direct-conversion transceivers. To establish that the underlying widely linear signal model is not adequate for strong transmit signals, the impact of various circuit imperfections, including power amplifier (PA) distortion, frequency-dependent I/Q imbalance, quantization noise and thermal noise, on the performance of the conventional augmented least mean square (LMS) based SI canceller, is analyzed. In order to achieve a sufficient signal-to-interference-plus-noise ratio (SINR) when the nonlinear SI components are not negligible, we propose an augmented nonlinear LMS based SI canceller for a joint cancellation of both the linear and nonlinear SI components by virtue of a widely nonlinear model fit. A rigorous mean and mean square performance evaluation is conducted to justify the performance advantages of the proposed scheme over the conventional augmented LMS solution. Simulations on orthogonal frequency division multiplexing (OFDM)-based wireless local area network (WLAN) standard compliant waveforms support the analysis.

Submitted 17 July, 2018; v1 submitted 12 October, 2017; originally announced October 2017.

Journal ref: IEEE Transactions on Signal Processing, vol. 66, no. 15, pp. 4065-4078, Aug. 1, 2018

arXiv:1708.09165 [pdf, other]

doi 10.1561/2200000067

Tensor Networks for Dimensionality Reduction and Large-Scale Optimizations. Part 2 Applications and Future Perspectives

Authors: A. Cichocki, A-H. Phan, Q. Zhao, N. Lee, I. V. Oseledets, M. Sugiyama, D. Mandic

Abstract: Part 2 of this monograph builds on the introduction to tensor networks and their operations presented in Part 1. It focuses on tensor network models for super-compressed higher-order representation of data/parameters and related cost functions, while providing an outline of their applications in machine learning and data analytics. A particular emphasis is on the tensor train (TT) and Hierarchical Tucker (HT) decompositions, and their physically meaningful interpretations which reflect the scalability of the tensor network approach. Through a graphical approach, we also elucidate how, by virtue of the underlying low-rank tensor approximations and sophisticated contractions of core tensors, tensor networks have the ability to perform distributed computations on otherwise prohibitively large volumes of data/parameters, thereby alleviating or even eliminating the curse of dimensionality. The usefulness of this concept is illustrated over a number of applied areas, including generalized regression and classification (support tensor machines, canonical correlation analysis, higher order partial least squares), generalized eigenvalue decomposition, Riemannian optimization, and in the optimization of deep neural networks. Part 1 and Part 2 of this work can be used either as stand-alone separate texts, or indeed as a conjoint comprehensive review of the exciting field of low-rank tensor networks and tensor decompositions.

Submitted 30 August, 2017; originally announced August 2017.

Comments: 232 pages

Journal ref: Foundations and Trends in Machine Learning: Vol. 9: No. 6, pp 431-673, 2017

arXiv:1705.03742 [pdf, other]

In-ear EEG biometrics for feasible and readily collectable real-world person authentication

Authors: Takashi Nakamura, Valentin Goverdovsky, Danilo P. Mandic

Abstract: The use of EEG as a biometrics modality has been investigated for about a decade; however, its feasibility in real-world applications is not yet conclusively established, mainly due to the issues with collectability and reproducibility. To this end, we propose a readily deployable EEG biometrics system based on a 'one-fits-all' viscoelastic generic in-ear EEG sensor (collectability), which does not require skilled assistance or cumbersome preparation. Unlike most existing studies, we consider data recorded over multiple recording days and for multiple subjects (reproducibility) while, for rigour, the training and test segments are not taken from the same recording days. A robust approach is considered based on the resting state with eyes closed paradigm, the use of both parametric (autoregressive model) and non-parametric (spectral) features, and supported by simple and fast cosine distance, linear discriminant analysis and support vector machine classifiers. Both the verification and identification forensics scenarios are considered and the achieved results are on par with the studies based on impractical on-scalp recordings. Comprehensive analysis over a number of subjects, setups, and analysis features demonstrates the feasibility of the proposed ear-EEG biometrics, and its potential in resolving the critical collectability, robustness, and reproducibility issues associated with current EEG biometrics.

Submitted 13 September, 2017; v1 submitted 10 May, 2017; originally announced May 2017.

arXiv:1705.00058 [pdf, ps, other]

Simultaneous diagonalisation of the covariance and complementary covariance matrices in quaternion widely linear signal processing

Authors: Min Xiang, Shirin Enshaeifar, Alexander E. Stott, Clive Cheong Took, Yili Xia, Sithan Kanna, Danilo P. Mandic

Abstract: Recent developments in quaternion-valued widely linear processing have established that the exploitation of complete second-order statistics requires consideration of both the standard covariance and the three complementary covariance matrices. Although such matrices have a tremendous amount of structure and their decomposition is a powerful tool in a variety of applications, the non-commutative nature of the quaternion product has been prohibitive to the development of quaternion uncorrelating transforms. To this end, we introduce novel techniques for a simultaneous decomposition of the covariance and complementary covariance matrices in the quaternion domain, whereby the quaternion version of the Takagi factorisation is explored to diagonalise symmetric quaternion-valued matrices. This gives new insights into the quaternion uncorrelating transform (QUT) and forms a basis for the proposed quaternion approximate uncorrelating transform (QAUT) which simultaneously diagonalises all four covariance matrices associated with improper quaternion signals. The effectiveness of the proposed uncorrelating transforms is validated by simulations on both synthetic and real-world quaternion-valued signals.

Submitted 29 January, 2018; v1 submitted 28 April, 2017; originally announced May 2017.

Comments: 41 pages, single column, 10 figures

arXiv:1703.02492 [pdf, ps, other]

Online Multilinear Dictionary Learning

Authors: Thiernithi Variddhisai, Danilo Mandic

Abstract: A method for online tensor dictionary learning is proposed. With the assumption of separable dictionaries, tensor contraction is used to diminish an $N$-way model of $\mathcal{O}\left(L^N\right)$ into a simple matrix equation of $\mathcal{O}\left(NL^2\right)$ with a real-time capability. To avoid numerical instability due to inversion of a sparse matrix, a class of stochastic gradient with memory is formulated via a least-square solution to guarantee convergence and robustness. Both gradient descent with exact line search and Newton's method are discussed and realized. Extensions onto how to deal with bad initialization and outliers are also explained in detail. Experiments on two synthetic signals confirm an impressive performance of our proposed method.

Submitted 10 March, 2020; v1 submitted 7 March, 2017; originally announced March 2017.

arXiv:1701.04398 [pdf, other]

Automatic sleep monitoring using ear-EEG

Authors: Takashi Nakamura, Valentin Goverdovsky, Mary J. Morrell, Danilo P. Mandic

Abstract: The monitoring of sleep patterns without patient's inconvenience or involvement of a medical specialist is a clinical question of significant importance. To this end, we propose an automatic sleep stage monitoring system based on an affordable, unobtrusive, discreet, and long-term wearable in-ear sensor for recording the Electroencephalogram (ear-EEG). The selected features for sleep pattern classification from a single ear-EEG channel include the spectral edge frequency (SEF) and multi-scale fuzzy entropy (MSFE), a structural complexity feature. In this preliminary study, the manually scored hypnograms from simultaneous scalp-EEG and ear-EEG recordings of four subjects are used as labels for two analysis scenarios: 1) classification of ear-EEG hypnogram labels from ear-EEG recordings and 2) prediction of scalp-EEG hypnogram labels from ear-EEG recordings. We consider both 2-class and 4-class sleep scoring, with the achieved accuracies ranging from 78.5 % to 95.2 % for ear-EEG labels predicted from ear-EEG, and 76.8 % to 91.8 % for scalp-EEG labels predicted from ear-EEG. The corresponding kappa coefficients, which range from 0.64 to 0.83 for Scenario 1 and from 0.65 to 0.80 for Scenario 2, indicate a Substantial to Almost Perfect agreement, thus proving the feasibility of in-ear sensing for sleep monitoring in the community.

Submitted 3 January, 2017; originally announced January 2017.

arXiv:1609.09230 [pdf, ps, other]

Tensor Networks for Latent Variable Analysis. Part I: Algorithms for Tensor Train Decomposition

Authors: Anh-Huy Phan, Andrzej Cichocki, Andre Uschmajew, Petr Tichavsky, George Luta, Danilo Mandic

Abstract: Decompositions of tensors into factor matrices, which interact through a core tensor, have found numerous applications in signal processing and machine learning. A more general tensor model which represents data as an ordered network of sub-tensors of order-2 or order-3 has, so far, not been widely considered in these fields, although this so-called tensor network decomposition has been long studied in quantum physics and scientific computing. In this study, we present novel algorithms and applications of tensor network decompositions, with a particular focus on the tensor train decomposition and its variants. The novel algorithms developed for the tensor train decomposition update, in an alternating way, one or several core tensors at each iteration, and exhibit enhanced mathematical tractability and scalability to exceedingly large-scale data tensors. The proposed algorithms are tested in classic paradigms of blind source separation from a single mixture, denoising, and feature extraction, and achieve superior performance over the widely used truncated algorithms for tensor train decomposition.

Submitted 29 September, 2016; originally announced September 2016.

arXiv:1609.03330 [pdf]

Hearables: Multimodal physiological in-ear sensing

Authors: Valentin Goverdovsky, Wilhelm von Rosenberg, Takashi Nakamura, David Looney, David J Sharp, Christos Papavassiliou, Mary J Morrell, Danilo P Mandic

Abstract: Future health systems require the means to assess and track the neural and physiological function of a user over long periods of time and in the community. Human body responses are manifested through multiple modalities, such as the mechanical, electrical and chemical; yet current physiological monitors (actigraphy, heart rate) largely lack in both the desired cross-modal and non-stigmatizing aspects. We address these challenges through an inconspicuous and comfortable earpiece, equipped with miniature multimodal sensors, which benefits from the relatively stable position of the ear canal with respect to vital organs to robustly measure the brain, cardiac and respiratory functions. Comprehensive experiments validate each modality within the proposed earpiece, while its potential in health monitoring is illustrated through case studies. We further demonstrate how combining data from multiple sensors within such an integrated wearable device improves both the accuracy of measurements and the ability to deal with artifacts in real-life scenarios.

Submitted 13 September, 2016; v1 submitted 12 September, 2016; originally announced September 2016.

arXiv:1609.00893 [pdf, other]

doi 10.1561/2200000059

Low-Rank Tensor Networks for Dimensionality Reduction and Large-Scale Optimization Problems: Perspectives and Challenges PART 1

Authors: A. Cichocki, N. Lee, I. V. Oseledets, A. -H. Phan, Q. Zhao, D. Mandic

Abstract: Machine learning and data mining algorithms are becoming increasingly important in analyzing large volume, multi-relational and multi-modal datasets, which are often conveniently represented as multiway arrays or tensors. It is therefore timely and valuable for the multidisciplinary research community to review tensor decompositions and tensor networks as emerging tools for large-scale data analysis and data mining. We provide the mathematical and graphical representations and interpretation of tensor networks, with the main focus on the Tucker and Tensor Train (TT) decompositions and their extensions or generalizations. Keywords: Tensor networks, Function-related tensors, CP decomposition, Tucker models, tensor train (TT) decompositions, matrix product states (MPS), matrix product operators (MPO), basic tensor operations, multiway component analysis, multilinear blind source separation, tensor completion, linear/multilinear dimensionality reduction, large-scale optimization problems, symmetric eigenvalue decomposition (EVD), PCA/SVD, huge systems of linear equations, pseudo-inverse of very large matrices, Lasso and Canonical Correlation Analysis (CCA) (This is Part 1)

Submitted 11 September, 2017; v1 submitted 4 September, 2016; originally announced September 2016.

Comments: 176 pages

Journal ref: Foundations and Trends in Machine Learning, vol. 9, no. 4-5, pp. 249-429, 2016

Showing 51–100 of 113 results for author: Mandic, D