PROMISSING: Pruning Missing Values in Neural Networks

Authors: Seyed Mostafa Kia, Nastaran Mohammadian Rad, Daniel van Opstal, Bart van Schie, Andre F. Marquand, Josien Pluim, Wiepke Cahn, Hugo G. Schnack

Abstract: While data are the primary fuel for machine learning models, they often suffer from missing values, especially when collected in real-world scenarios. However, many off-the-shelf machine learning models, including artificial neural network models, are unable to handle these missing values directly. Therefore, extra data preprocessing and curation steps, such as data imputation, are inevitable before learning and prediction processes. In this study, we propose a simple and intuitive yet effective method for pruning missing values (PROMISSING) during learning and inference steps in neural networks. In this method, there is no need to remove or impute the missing values; instead, the missing values are treated as a new source of information (representing what we do not know). Our experiments on simulated data, several classification and regression benchmarks, and a multi-modal clinical dataset show that PROMISSING results in similar prediction performance compared to various imputation techniques. In addition, our experiments show models trained using PROMISSING techniques are becoming less decisive in their predictions when facing incomplete samples with many unknowns. This finding hopefully advances machine learning models from being pure predicting machines to more realistic thinkers that can also say "I do not know" when facing incomplete sources of information.

Submitted 3 June, 2022; originally announced June 2022.

arXiv:2006.13386 [pdf, other]

Gender and Emotion Recognition from Implicit User Behavior Signals

Authors: Maneesh Bilalpur, Seyed Mostafa Kia, Mohan Kankanhalli, Ramanathan Subramanian

Abstract: This work explores the utility of implicit behavioral cues, namely, Electroencephalogram (EEG) signals and eye movements for gender recognition (GR) and emotion recognition (ER) from psychophysical behavior. Specifically, the examined cues are acquired via low-cost, off-the-shelf sensors. 28 users (14 male) recognized emotions from unoccluded (no mask) and partially occluded (eye or mouth masked) emotive faces; their EEG responses contained gender-specific differences, while their eye movements were characteristic of the perceived facial emotions. Experimental results reveal that (a) reliable GR and ER is achievable with EEG and eye features, (b) differential cognitive processing of negative emotions is observed for females and (c) eye gaze-based gender differences manifest under partial face occlusion, as typified by the eye and mouth mask conditions.

Submitted 23 June, 2020; originally announced June 2020.

Comments: Under consideration for publication in IEEE Trans. Affective Computing

arXiv:2005.12055 [pdf, other]

Hierarchical Bayesian Regression for Multi-Site Normative Modeling of Neuroimaging Data

Authors: Seyed Mostafa Kia, Hester Huijsdens, Richard Dinga, Thomas Wolfers, Maarten Mennes, Ole A. Andreassen, Lars T. Westlye, Christian F. Beckmann, Andre F. Marquand

Abstract: Clinical neuroimaging has recently witnessed explosive growth in data availability which brings studying heterogeneity in clinical cohorts to the spotlight. Normative modeling is an emerging statistical tool for achieving this objective. However, its application remains technically challenging due to difficulties in properly dealing with nuisance variation, for example due to variability in image acquisition devices. Here, in a fully probabilistic framework, we propose an application of hierarchical Bayesian regression (HBR) for multi-site normative modeling. Our experimental results confirm the superiority of HBR in deriving more accurate normative ranges on large multi-site neuroimaging data compared to widely used methods. This provides the possibility i) to learn the normative range of structural and functional brain measures on large multi-site data; ii) to recalibrate and reuse the learned model on local small data; therefore, HBR closes the technical loop for applying normative modeling as a medical tool for the diagnosis and prognosis of mental disorders.

Submitted 25 May, 2020; originally announced May 2020.

Comments: To be published in MICCAI 2020 proceedings

arXiv:1812.04998 [pdf, other]

Neural Processes Mixed-Effect Models for Deep Normative Modeling of Clinical Neuroimaging Data

Authors: Seyed Mostafa Kia, Andre F. Marquand

Abstract: Normative modeling has recently been introduced as a promising approach for modeling variation of neuroimaging measures across individuals in order to derive biomarkers of psychiatric disorders. Current implementations rely on Gaussian process regression, which provides coherent estimates of uncertainty needed for the method but also suffers from drawbacks including poor scaling to large datasets and a reliance on fixed parametric kernels. In this paper, we propose a deep normative modeling framework based on neural processes (NPs) to solve these problems. To achieve this, we define a stochastic process formulation for mixed-effect models and show how NPs can be adopted for spatially structured mixed-effect modeling of neuroimaging data. This enables us to learn optimal feature representations and covariance structure for the random-effect and noise via global latent variables. In this scheme, predictive uncertainty can be approximated by sampling from the distribution of these global latent variables. On a publicly available clinical fMRI dataset, we compare the novelty detection performance of multivariate normative models estimated by the proposed NP approach to a baseline multi-task Gaussian process regression approach and show substantial improvements for certain diagnostic problems.

Submitted 15 April, 2019; v1 submitted 12 December, 2018; originally announced December 2018.

Comments: Medical Imaging with Deep Learning (MIDL 2019)

arXiv:1808.00036 [pdf, other]

Scalable Multi-Task Gaussian Process Tensor Regression for Normative Modeling of Structured Variation in Neuroimaging Data

Authors: Seyed Mostafa Kia, Christian F. Beckmann, Andre F. Marquand

Abstract: Most brain disorders are very heterogeneous in terms of their underlying biology and developing analysis methods to model such heterogeneity is a major challenge. A promising approach is to use probabilistic regression methods to estimate normative models of brain function using (f)MRI data then use these to map variation across individuals in clinical populations (e.g., via anomaly detection). To fully capture individual differences, it is crucial to statistically model the patterns of correlation across different brain regions and individuals. However, this is very challenging for neuroimaging data because of high-dimensionality and highly structured patterns of correlation across multiple axes. Here, we propose a general and flexible multi-task learning framework to address this problem. Our model uses a tensor-variate Gaussian process in a Bayesian mixed-effects model and makes use of Kronecker algebra and a low-rank approximation to scale efficiently to multi-way neuroimaging data at the whole brain level. On a publicly available clinical fMRI dataset, we show that our computationally affordable approach substantially improves detection sensitivity over both a mass-univariate normative model and a classifier that --unlike our approach-- has full access to the clinical labels.

Submitted 30 November, 2018; v1 submitted 31 July, 2018; originally announced August 2018.

arXiv:1806.01047 [pdf, other]

Normative Modeling of Neuroimaging Data using Scalable Multi-Task Gaussian Processes

Authors: Seyed Mostafa Kia, Andre Marquand

Abstract: Normative modeling has recently been proposed as an alternative for the case-control approach in modeling heterogeneity within clinical cohorts. Normative modeling is based on single-output Gaussian process regression that provides coherent estimates of uncertainty required by the method but does not consider spatial covariance structure. Here, we introduce a scalable multi-task Gaussian process regression (S-MTGPR) approach to address this problem. To this end, we exploit a combination of a low-rank approximation of the spatial covariance matrix with algebraic properties of Kronecker product in order to reduce the computational complexity of Gaussian process regression in high-dimensional output spaces. On a public fMRI dataset, we show that S-MTGPR: 1) leads to substantial computational improvements that allow us to estimate normative models for high-dimensional fMRI data whilst accounting for spatial structure in data; 2) by modeling both spatial and across-sample variances, it provides higher sensitivity in novelty detection scenarios.

Submitted 5 June, 2018; v1 submitted 4 June, 2018; originally announced June 2018.

arXiv:1709.05956 [pdf, other]

Deep Learning for Automatic Stereotypical Motor Movement Detection using Wearable Sensors in Autism Spectrum Disorders

Authors: Nastaran Mohammadian Rad, Seyed Mostafa Kia, Calogero Zarbo, Twan van Laarhoven, Giuseppe Jurman, Paola Venuti, Elena Marchiori, Cesare Furlanello

Abstract: Autism Spectrum Disorders are associated with atypical movements, of which stereotypical motor movements (SMMs) interfere with learning and social interaction. The automatic SMM detection using inertial measurement units (IMU) remains complex due to the strong intra and inter-subject variability, especially when handcrafted features are extracted from the signal. We propose a new application of the deep learning to facilitate automatic SMM detection using multi-axis IMUs. We use a convolutional neural network (CNN) to learn a discriminative feature space from raw data. We show how the CNN can be used for parameter transfer learning to enhance the detection rate on longitudinal data. We also combine the long short-term memory (LSTM) with CNN to model the temporal patterns in a sequence of multi-axis signals. Further, we employ ensemble learning to combine multiple LSTM learners into a more robust SMM detector. Our results show that: 1) feature learning outperforms handcrafted features; 2) parameter transfer learning is beneficial in longitudinal settings; 3) using LSTM to learn the temporal dynamic of signals enhances the detection rate especially for skewed training data; 4) an ensemble of LSTMs provides more accurate and stable detectors. These findings provide a significant step toward accurate SMM detection in real-time scenarios.

Submitted 14 September, 2017; originally announced September 2017.

arXiv:1708.08735 [pdf, other]

Gender and Emotion Recognition with Implicit User Signals

Authors: Maneesh Bilalpur, Seyed Mostafa Kia, Manisha Chawla, Tat-Seng Chua, Ramanathan Subramanian

Abstract: We examine the utility of implicit user behavioral signals captured using low-cost, off-the-shelf devices for anonymous gender and emotion recognition. A user study designed to examine male and female sensitivity to facial emotions confirms that females recognize (especially negative) emotions quicker and more accurately than men, mirroring prior findings. Implicit viewer responses in the form of EEG brain signals and eye movements are then examined for existence of (a) emotion and gender-specific patterns from event-related potentials (ERPs) and fixation distributions and (b) emotion and gender discriminability. Experiments reveal that (i) Gender and emotion-specific differences are observable from ERPs, (ii) multiple similarities exist between explicit responses gathered from users and their implicit behavioral signals, and (iii) Significantly above-chance ($\approx$70%) gender recognition is achievable on comparing emotion-specific EEG responses-- gender differences are encoded best for anger and disgust. Also, fairly modest valence (positive vs negative emotion) recognition is achieved with EEG and eye-based features.

Submitted 29 August, 2017; originally announced August 2017.

Comments: To be published in the Proceedings of 19th International Conference on Multimodal Interaction.2017

ACM Class: H.5.2; I.3.6; H.1.2

arXiv:1708.08729 [pdf, other]

Discovering Gender Differences in Facial Emotion Recognition via Implicit Behavioral Cues

Authors: Maneesh Bilalpur, Seyed Mostafa Kia, Tat-Seng Chua, Ramanathan Subramanian

Abstract: We examine the utility of implicit behavioral cues in the form of EEG brain signals and eye movements for gender recognition (GR) and emotion recognition (ER). Specifically, the examined cues are acquired via low-cost, off-the-shelf sensors. We asked 28 viewers (14 female) to recognize emotions from unoccluded (no mask) as well as partially occluded (eye and mouth masked) emotive faces. Obtained experimental results reveal that (a) reliable GR and ER is achievable with EEG and eye features, (b) differential cognitive processing especially for negative emotions is observed for males and females and (c) some of these cognitive differences manifest under partial face occlusion, as typified by the eye and mouth mask conditions.

Submitted 29 August, 2017; originally announced August 2017.

Comments: To be published in the Proceedings of Seventh International Conference on Affective Computing and Intelligent Interaction.2017

ACM Class: H.5.2; I.3.6; H.1.2

arXiv:1606.05672 [pdf, other]

Interpretability in Linear Brain Decoding

Authors: Seyed Mostafa Kia, Andrea Passerini

Abstract: Improving the interpretability of brain decoding approaches is of primary interest in many neuroimaging studies. Despite extensive studies of this type, at present, there is no formal definition for interpretability of brain decoding models. As a consequence, there is no quantitative measure for evaluating the interpretability of different brain decoding methods. In this paper, we present a simple definition for interpretability of linear brain decoding models. Then, we propose to combine the interpretability and the performance of the brain decoding into a new multi-objective criterion for model selection. Our preliminary results on the toy data show that optimizing the hyper-parameters of the regularized linear classifier based on the proposed criterion results in more informative linear models. The presented definition provides the theoretical background for quantitative evaluation of interpretability in linear brain decoding.

Submitted 17 June, 2016; originally announced June 2016.

Comments: presented at 2016 ICML Workshop on Human Interpretability in Machine Learning (WHI 2016), New York, NY

arXiv:1603.08704 [pdf, other]

Interpretability of Multivariate Brain Maps in Brain Decoding: Definition and Quantification

Authors: Seyed Mostafa Kia

Abstract: Brain decoding is a popular multivariate approach for hypothesis testing in neuroimaging. It is well known that the brain maps derived from weights of linear classifiers are hard to interpret because of high correlations between predictors, low signal to noise ratios, and the high dimensionality of neuroimaging data. Therefore, improving the interpretability of brain decoding approaches is of primary interest in many neuroimaging studies. Despite extensive studies of this type, at present, there is no formal definition for interpretability of multivariate brain maps. As a consequence, there is no quantitative measure for evaluating the interpretability of different brain decoding methods. In this paper, first, we present a theoretical definition of interpretability in brain decoding; we show that the interpretability of multivariate brain maps can be decomposed into their reproducibility and representativeness. Second, as an application of the proposed theoretical definition, we formalize a heuristic method for approximating the interpretability of multivariate brain maps in a binary magnetoencephalography (MEG) decoding scenario. Third, we propose to combine the approximated interpretability and the performance of the brain decoding model into a new multi-objective criterion for model selection. Our results for the MEG data show that optimizing the hyper-parameters of the regularized linear classifier based on the proposed criterion results in more informative multivariate brain maps. More importantly, the presented definition provides the theoretical background for quantitative evaluation of interpretability, and hence, facilitates the development of more effective brain decoding algorithms in the future.

Submitted 29 March, 2016; originally announced March 2016.

arXiv:1511.01865 [pdf, other]

Convolutional Neural Network for Stereotypical Motor Movement Detection in Autism

Authors: Nastaran Mohammadian Rad, Andrea Bizzego, Seyed Mostafa Kia, Giuseppe Jurman, Paola Venuti, Cesare Furlanello

Abstract: Autism Spectrum Disorders (ASDs) are often associated with specific atypical postural or motor behaviors, of which Stereotypical Motor Movements (SMMs) have a specific visibility. While the identification and the quantification of SMM patterns remain complex, its automation would provide support to accurate tuning of the intervention in the therapy of autism. Therefore, it is essential to develop automatic SMM detection systems in a real world setting, taking care of strong inter-subject and intra-subject variability. Wireless accelerometer sensing technology can provide a valid infrastructure for real-time SMM detection, however such variability remains a problem also for machine learning methods, in particular whenever handcrafted features extracted from accelerometer signal are considered. Here, we propose to employ the deep learning paradigm in order to learn discriminating features from multi-sensor accelerometer signals. Our results provide preliminary evidence that feature learning and transfer learning embedded in the deep architecture achieve higher accurate SMM detectors in longitudinal scenarios.

Submitted 7 June, 2016; v1 submitted 5 November, 2015; originally announced November 2015.

Comments: Presented at 5th NIPS Workshop on Machine Learning and Interpretation in Neuroimaging (MLINI), 2015, (http://arxiv.longhoe.net/html/1605.04435)

Report number: MLINI/2015/13

arXiv:1406.6720 [pdf]

Mass-Univariate Hypothesis Testing on MEEG Data using Cross-Validation

Authors: Seyed Mostafa Kia

Abstract: Recent advances in statistical theory, together with advances in the computational power of computers, provide alternative methods to do mass-univariate hypothesis testing in which a large number of univariate tests, can be properly used to compare MEEG data at a large number of time-frequency points and scalp locations. One of the major problematic aspects of this kind of mass-univariate analysis is due to high number of accomplished hypothesis tests. Hence procedures that remove or alleviate the increased probability of false discoveries are crucial for this type of analysis. Here, I propose a new method for mass-univariate analysis of MEEG data based on cross-validation scheme. In this method, I suggest a hierarchical classification procedure under k-fold cross-validation to detect which sensors at which time-bin and which frequency-bin contributes in discriminating between two different stimuli or tasks. To achieve this goal, a new feature extraction method based on the discrete cosine transform (DCT) employed to get maximum advantage of all three data dimensions. Employing cross-validation and hierarchy architecture alongside the DCT feature space makes this method more reliable and at the same time enough sensitive to detect the narrow effects in brain activities.

Submitted 25 June, 2014; originally announced June 2014.

Comments: Master thesis, July 2013

arXiv:1404.4175 [pdf, ps, other]

MEG Decoding Across Subjects

Authors: Emanuele Olivetti, Seyed Mostafa Kia, Paolo Avesani

Abstract: Brain decoding is a data analysis paradigm for neuroimaging experiments that is based on predicting the stimulus presented to the subject from the concurrent brain activity. In order to make inference at the group level, a straightforward but sometimes unsuccessful approach is to train a classifier on the trials of a group of subjects and then to test it on unseen trials from new subjects. The extreme difficulty is related to the structural and functional variability across the subjects. We call this approach "decoding across subjects". In this work, we address the problem of decoding across subjects for magnetoencephalographic (MEG) experiments and we provide the following contributions: first, we formally describe the problem and show that it belongs to a machine learning sub-field called transductive transfer learning (TTL). Second, we propose to use a simple TTL technique that accounts for the differences between train data and test data. Third, we propose the use of ensemble learning, and specifically of stacked generalization, to address the variability across subjects within train data, with the aim of producing more stable classifiers. On a face vs. scramble task MEG dataset of 16 subjects, we compare the standard approach of not modelling the differences across subjects, to the proposed one of combining TTL and ensemble learning. We show that the proposed approach is consistently more accurate than the standard one.

Submitted 16 April, 2014; originally announced April 2014.

arXiv:1402.5792 [pdf]

A Novel Scheme for Intelligent Recognition of Pornographic Images

Authors: Seyed Mostafa Kia, Hossein Rahmani, Reza Mortezaei, Mohsen Ebrahimi Moghaddam, Amer Namazi

Abstract: Harmful contents are rising in internet day by day and this motivates the essence of more research in fast and reliable obscene and immoral material filtering. Pornographic image recognition is an important component in each filtering system. In this paper, a new approach for detecting pornographic images is introduced. In this approach, two new features are suggested. These two features in combination with other simple traditional features provide decent difference between porn and non-porn images. In addition, we applied fuzzy integral based information fusion to combine MLP (Multi-Layer Perceptron) and NF (Neuro-Fuzzy) outputs. To test the proposed method, performance of system was evaluated over 18354 download images from internet. The attained precision was 93% in TP and 8% in FP on training dataset, and 87% and 5.5% on test dataset. Achieved results verify the performance of proposed system versus other related works.

Submitted 29 September, 2014; v1 submitted 24 February, 2014; originally announced February 2014.

Showing 1–15 of 15 results for author: Kia, S M