Showing 1–50 of 63 results for author: Thirion, B

  1. arXiv:2312.10858  [pdf, other]

    cs.LG

    Variable Importance in High-Dimensional Settings Requires Grouping

    Authors: Ahmad Chamma, Bertrand Thirion, Denis A. Engemann

    Abstract: Explaining the decision process of machine learning algorithms is nowadays crucial for both model's performance enhancement and human comprehension. This can be achieved by assessing the variable importance of single variables, even for high-capacity non-linear methods, e.g. Deep Neural Networks (DNNs). While only removal-based approaches, such as Permutation Importance (PI), can bring statistical…

    Submitted 17 December, 2023; originally announced December 2023.

  2. arXiv:2310.10373  [pdf, other]

    stat.ME

    False Discovery Proportion control for aggregated Knockoffs

    Authors: Alexandre Blain, Bertrand Thirion, Olivier Grisel, Pierre Neuvial

    Abstract: Controlled variable selection is an important analytical step in various scientific fields, such as brain imaging or genomics. In these high-dimensional data settings, considering too many variables leads to poor models and high costs, hence the need for statistical guarantees on false positives. Knockoffs are a popular statistical tool for conditional variable selection in high dimension. However…

    Submitted 16 October, 2023; originally announced October 2023.

    Comments: NeurIPS 2023

  3. arXiv:2309.07593  [pdf, other]

    cs.LG cs.AI stat.ML

    Statistically Valid Variable Importance Assessment through Conditional Permutations

    Authors: Ahmad Chamma, Denis A. Engemann, Bertrand Thirion

    Abstract: Variable importance assessment has become a crucial step in machine-learning applications when using complex learners, such as deep neural networks, on large-scale data. Removal-based importance assessment is currently the reference approach, particularly when statistical guarantees are sought to justify variable inclusion. It is often implemented with variable permutation schemes. On the flip sid…

    Submitted 25 October, 2023; v1 submitted 14 September, 2023; originally announced September 2023.

  4. arXiv:2309.05768  [pdf]

    q-bio.OT

    The Past, Present, and Future of the Brain Imaging Data Structure (BIDS)

    Authors: Russell A. Poldrack, Christopher J. Markiewicz, Stefan Appelhoff, Yoni K. Ashar, Tibor Auer, Sylvain Baillet, Shashank Bansal, Leandro Beltrachini, Christian G. Benar, Giacomo Bertazzoli, Suyash Bhogawar, Ross W. Blair, Marta Bortoletto, Mathieu Boudreau, Teon L. Brooks, Vince D. Calhoun, Filippo Maria Castelli, Patricia Clement, Alexander L Cohen, Julien Cohen-Adad, Sasha D'Ambrosio, Gilles de Hollander, María de la Iglesia-Vayá, Alejandro de la Vega, Arnaud Delorme, et al. (89 additional authors not shown)

    Abstract: The Brain Imaging Data Structure (BIDS) is a community-driven standard for the organization of data and metadata from a growing range of neuroscience modalities. This paper is meant as a history of how the standard has developed and grown over time. We outline the principles behind the project, the mechanisms by which it has been extended, and some of the challenges being addressed as it evolves.…

    Submitted 8 January, 2024; v1 submitted 11 September, 2023; originally announced September 2023.

  5. arXiv:2305.13863  [pdf, other]

    cs.CL

    Probing Brain Context-Sensitivity with Masked-Attention Generation

    Authors: Alexandre Pasquiou, Yair Lakretz, Bertrand Thirion, Christophe Pallier

    Abstract: Two fundamental questions in neurolinguistics concern the brain regions that integrate information beyond the lexical level, and the size of their window of integration. To address these questions we introduce a new approach named masked-attention generation. It uses GPT-2 transformers to generate word embeddings that capture a fixed amount of contextual information. We then tested whether these…

    Submitted 23 May, 2023; originally announced May 2023.

    Comments: 2 pages, 2 figures, CCN 2023

    Journal ref: CCN 2023

  6. arXiv:2302.14389  [pdf, other]

    cs.CL

    Information-Restricted Neural Language Models Reveal Different Brain Regions' Sensitivity to Semantics, Syntax and Context

    Authors: Alexandre Pasquiou, Yair Lakretz, Bertrand Thirion, Christophe Pallier

    Abstract: A fundamental question in neurolinguistics concerns the brain regions involved in syntactic and semantic processing during speech comprehension, both at the lexical (word processing) and supra-lexical levels (sentence and discourse processing). To what extent are these regions separated or intertwined? To address this question, we trained a lexical language model, Glove, and a supra-lexical langua…

    Submitted 28 February, 2023; originally announced February 2023.

    Comments: 19 pages, 8 figures, 10 pages of Appendix, 5 appendix figures

  7. arXiv:2208.13724  [pdf, other]

    stat.ME math.ST

    FDP control in multivariate linear models using the bootstrap

    Authors: Samuel Davenport, Bertrand Thirion, Pierre Neuvial

    Abstract: In this article we develop a method for performing post hoc inference of the False Discovery Proportion (FDP) over multiple contrasts of interest in the multivariate linear model. To do so we use the bootstrap to simulate from the distribution of the null contrasts. We combine the bootstrap with the post hoc inference bounds of Blanchard (2020) and prove that doing so provides simultaneous asympto…

    Submitted 20 September, 2022; v1 submitted 29 August, 2022; originally announced August 2022.

  8. arXiv:2207.03380  [pdf, other]

    cs.AI cs.CL

    Neural Language Models are not Born Equal to Fit Brain Data, but Training Helps

    Authors: Alexandre Pasquiou, Yair Lakretz, John Hale, Bertrand Thirion, Christophe Pallier

    Abstract: Neural Language Models (NLMs) have made tremendous advances during the last years, achieving impressive performance on various linguistic tasks. Capitalizing on this, studies in neuroscience have started to use NLMs to study neural activity in the human brain during language processing. However, many questions remain unanswered regarding which factors determine the ability of a neural language mod…

    Submitted 7 July, 2022; originally announced July 2022.

    Journal ref: ICML 2022 - 39th International Conference on Machine Learning, Jul 2022, Baltimore, United States. pp.18

  9. arXiv:2206.09398  [pdf, other]

    q-bio.NC stat.ML

    Aligning individual brains with Fused Unbalanced Gromov-Wasserstein

    Authors: Alexis Thual, Huy Tran, Tatiana Zemskova, Nicolas Courty, Rémi Flamary, Stanislas Dehaene, Bertrand Thirion

    Abstract: Individual brains vary in both anatomy and functional organization, even within a given species. Inter-individual variability is a major impediment when trying to draw generalizable conclusions from neuroimaging data collected on groups of subjects. Current co-registration procedures rely on limited data, and thus lead to very coarse inter-subject alignments. In this work, we present a novel metho…

    Submitted 22 August, 2023; v1 submitted 19 June, 2022; originally announced June 2022.

    Journal ref: Advances in Neural Information Processing Systems, 35 (2022) 21792-21804

  10. arXiv:2205.14613  [pdf, other]

    stat.ML cs.LG math.ST

    A Conditional Randomization Test for Sparse Logistic Regression in High-Dimension

    Authors: Binh T. Nguyen, Bertrand Thirion, Sylvain Arlot

    Abstract: Identifying the relevant variables for a classification model with correct confidence levels is a central but difficult task in high-dimension. Despite the core role of sparse logistic regression in statistics and machine learning, it still lacks a good solution for accurate inference in the regime where the number of features $p$ is as large as or larger than the number of samples $n$. Here, we t…

    Submitted 29 May, 2022; originally announced May 2022.

  11. Notip: Non-parametric True Discovery Proportion control for brain imaging

    Authors: Alexandre Blain, Bertrand Thirion, Pierre Neuvial

    Abstract: Cluster-level inference procedures are widely used for brain mapping. These methods compare the size of clusters obtained by thresholding brain maps to an upper bound under the global null hypothesis, computed using Random Field Theory or permutations. However, the guarantees obtained by this type of inference - i.e. at least one voxel is truly activated in the cluster - are not informative with r…

    Submitted 21 July, 2022; v1 submitted 22 April, 2022; originally announced April 2022.

    Comments: NeuroImage (2022)

    Journal ref: NeuroImage (2022), 119492

  12. arXiv:2110.13502  [pdf, other]

    cs.LG

    Shared Independent Component Analysis for Multi-Subject Neuroimaging

    Authors: Hugo Richard, Pierre Ablin, Bertrand Thirion, Alexandre Gramfort, Aapo Hyvärinen

    Abstract: We consider shared response modeling, a multi-view learning problem where one wants to identify common components from multiple datasets or views. We introduce Shared Independent Component Analysis (ShICA) that models each view as a linear transform of shared independent components contaminated by additive Gaussian noise. We show that this model is identifiable if the components are either non-Gau…

    Submitted 26 October, 2021; originally announced October 2021.

    Comments: Accepted at NeurIPS 2021

  13. arXiv:2110.06135  [pdf, other]

    cs.LG q-bio.NC

    Label scarcity in biomedicine: Data-rich latent factor discovery enhances phenotype prediction

    Authors: Marc-Andre Schulz, Bertrand Thirion, Alexandre Gramfort, Gaël Varoquaux, Danilo Bzdok

    Abstract: High-quality data accumulation is now becoming ubiquitous in the health domain. There is increasing opportunity to exploit rich data from normal subjects to improve supervised estimators in specific diseases with notorious data scarcity. We demonstrate that low-dimensional embedding spaces can be derived from the UK Biobank population dataset and used to enhance data-scarce prediction of health in…

    Submitted 12 October, 2021; originally announced October 2021.

    Comments: Accepted at NIPS 2017 Workshop on Machine Learning for Health

  14. arXiv:2107.06104  [pdf, other]

    eess.IV cs.LG

    Functional Magnetic Resonance Imaging data augmentation through conditional ICA

    Authors: Badr Tajini, Hugo Richard, Bertrand Thirion

    Abstract: Advances in computational cognitive neuroimaging research are related to the availability of large amounts of labeled brain imaging data, but such data are scarce and expensive to generate. While powerful data generation mechanisms, such as Generative Adversarial Networks (GANs), have been designed in the last decade for computer vision, such improvements have not yet carried over to brain imaging…

    Submitted 14 July, 2021; v1 submitted 11 July, 2021; originally announced July 2021.

    Comments: 14 pages, 5 figures, 7 tables

  15. arXiv:2106.02590  [pdf, other]

    stat.ME math.ST stat.ML

    Spatially relaxed inference on high-dimensional linear models

    Authors: Jérôme-Alexis Chevalier, Tuan-Binh Nguyen, Bertrand Thirion, Joseph Salmon

    Abstract: We consider the inference problem for high-dimensional linear models, when covariates have an underlying spatial organization reflected in their correlation. A typical example of such a setting is high-resolution imaging, in which neighboring pixels are usually very similar. Accurate point and confidence intervals estimation is not possible in this context with many more covariates than samples, f…

    Submitted 4 June, 2021; originally announced June 2021.

  16. arXiv:2102.10964  [pdf, other]

    stat.ML cs.LG

    Adaptive Multi-View ICA: Estimation of noise levels for optimal inference

    Authors: Hugo Richard, Pierre Ablin, Aapo Hyvärinen, Alexandre Gramfort, Bertrand Thirion

    Abstract: We consider a multi-view learning problem known as group independent component analysis (group ICA), where the goal is to recover shared independent sources from many views. The statistical modeling of this problem requires taking noise into account. When the model includes additive noise on the observations, the likelihood is intractable. By contrast, we propose Adaptive multiView ICA (AVICA), a…

    Submitted 22 February, 2021; originally announced February 2021.

  17. arXiv:2009.14310  [pdf, other]

    stat.ML cs.LG stat.AP

    Statistical control for spatio-temporal MEG/EEG source imaging with desparsified multi-task Lasso

    Authors: Jérôme-Alexis Chevalier, Alexandre Gramfort, Joseph Salmon, Bertrand Thirion

    Abstract: Detecting where and when brain regions activate in a cognitive task or in a given clinical condition is the promise of non-invasive techniques like magnetoencephalography (MEG) or electroencephalography (EEG). This problem, referred to as source localization, or source imaging, poses however a high-dimensional statistical inference challenge. While sparsity promoting regularizations have been prop…

    Submitted 25 November, 2020; v1 submitted 29 September, 2020; originally announced September 2020.

    Comments: 21 pages

  18. arXiv:2006.06635  [pdf, other]

    stat.ML cs.LG

    Modeling Shared Responses in Neuroimaging Studies through MultiView ICA

    Authors: Hugo Richard, Luigi Gresele, Aapo Hyvärinen, Bertrand Thirion, Alexandre Gramfort, Pierre Ablin

    Abstract: Group studies involving large cohorts of subjects are important to draw general conclusions about brain functional organization. However, the aggregation of data coming from multiple subjects is challenging, since it requires accounting for large variability in anatomy, functional topography and stimulus response across individuals. Data modeling is especially hard for ecologically relevant condit…

    Submitted 24 December, 2020; v1 submitted 11 June, 2020; originally announced June 2020.

    Comments: Accepted to NeurIPS 2020

  19. arXiv:2003.05405  [pdf, other]

    q-bio.NC eess.SP stat.ML

    Fine-grain atlases of functional modes for fMRI analysis

    Authors: Kamalaker Dadi, Gaël Varoquaux, Antonia Machlouzarides-Shalit, Krzysztof J. Gorgolewski, Demian Wassermann, Bertrand Thirion, Arthur Mensch

    Abstract: Population imaging markedly increased the size of functional-imaging datasets, shedding new light on the neural basis of inter-individual differences. Analyzing these large data entails new scalability challenges, computational and statistical. For this reason, brain images are typically summarized in a few signals, for instance reducing voxel-level measures with brain atlases or functional modes.…

    Submitted 5 March, 2020; originally announced March 2020.

  20. arXiv:2002.09269  [pdf, other]

    math.ST stat.AP stat.ME stat.ML

    Aggregation of Multiple Knockoffs

    Authors: Tuan-Binh Nguyen, Jérôme-Alexis Chevalier, Bertrand Thirion, Sylvain Arlot

    Abstract: We develop an extension of the Knockoff Inference procedure, introduced by Barber and Candes (2015). This new method, called Aggregation of Multiple Knockoffs (AKO), addresses the instability inherent to the random nature of Knockoff-based inference. Specifically, AKO improves both the stability and power compared with the original Knockoff algorithm while still maintaining guarantees for False Di…

    Submitted 25 June, 2020; v1 submitted 21 February, 2020; originally announced February 2020.

    Comments: Accepted to ICML 2020 (Thirty-seventh International Conference on Machine Learning). This version includes both the main text of the conference paper and supplementary materials (as appendices). 35 pages, 7 figures

  21. arXiv:2002.09261  [pdf, other]

    q-bio.QM stat.ML

    NeuroQuery: comprehensive meta-analysis of human brain mapping

    Authors: Jérôme Dockès, Russell Poldrack, Romain Primet, Hande Gözükan, Tal Yarkoni, Fabian Suchanek, Bertrand Thirion, Gaël Varoquaux

    Abstract: Reaching a global view of brain organization requires assembling evidence on widely different mental processes and mechanisms. The variety of human neuroscience concepts and terminology poses a fundamental challenge to relating brain imaging results across the scientific literature. Existing meta-analysis methods perform statistical tests on sets of publications associated with a particular concep…

    Submitted 21 February, 2020; originally announced February 2020.

  22. arXiv:1910.01914  [pdf, other]

    stat.ML cs.LG q-bio.NC

    Multi-subject MEG/EEG source imaging with sparse multi-task regression

    Authors: Hicham Janati, Thomas Bazeille, Bertrand Thirion, Marco Cuturi, Alexandre Gramfort

    Abstract: Magnetoencephalography and electroencephalography (M/EEG) are non-invasive modalities that measure the weak electromagnetic fields generated by neural activity. Estimating the location and magnitude of the current sources that generated these electromagnetic fields is a challenging ill-posed regression problem known as source imaging. When considering a group study, a common approach consis…

    Submitted 14 October, 2019; v1 submitted 3 October, 2019; originally announced October 2019.

    Comments: version 2. arXiv admin note: text overlap with arXiv:1902.04812

  23. arXiv:1909.12537  [pdf, other]

    cs.CV cs.LG eess.IV q-bio.NC

    Fast shared response model for fMRI data

    Authors: Hugo Richard, Lucas Martin, Ana Luísa Pinho, Jonathan Pillow, Bertrand Thirion

    Abstract: The shared response model provides a simple but effective framework to analyse fMRI data of subjects exposed to naturalistic stimuli. However when the number of subjects or runs is large, fitting the model requires a large amount of memory and computational power, which limits its use in practice. In this work, we introduce the FastSRM algorithm that relies on an intermediate atlas-based represent…

    Submitted 3 December, 2019; v1 submitted 27 September, 2019; originally announced September 2019.

  24. arXiv:1903.04955  [pdf, other]

    math.ST stat.AP

    ECKO: Ensemble of Clustered Knockoffs for multivariate inference on fMRI data

    Authors: Tuan-Binh Nguyen, Jérôme-Alexis Chevalier, Bertrand Thirion

    Abstract: Continuous improvement in medical imaging techniques allows the acquisition of higher-resolution images. When these are used in a predictive setting, a greater number of explanatory variables are potentially related to the dependent variable (the response). Meanwhile, the number of acquisitions per experiment remains limited. In such high dimension/small sample size setting, it is desirable to fin…

    Submitted 12 March, 2019; originally announced March 2019.

    Comments: Accepted to 26th International Conference on Information Processing in Medical Imaging (IPMI)

  25. arXiv:1902.04812  [pdf, other]

    stat.ML cs.LG

    Group level MEG/EEG source imaging via optimal transport: minimum Wasserstein estimates

    Authors: Hicham Janati, Thomas Bazeille, Bertrand Thirion, Marco Cuturi, Alexandre Gramfort

    Abstract: Magnetoencephalography (MEG) and electroencephalography (EEG) are non-invasive modalities that measure the weak electromagnetic fields generated by neural activity. Inferring the location of the current sources that generated these magnetic fields is an ill-posed inverse problem known as source imaging. When considering a group study, a baseline approach consists in carrying out the estimation of…

    Submitted 13 February, 2019; originally announced February 2019.

  26. arXiv:1809.06304  [pdf, other]

    stat.ML cs.IT cs.LG

    Approximate message-passing for convex optimization with non-separable penalties

    Authors: Andre Manoel, Florent Krzakala, Gaël Varoquaux, Bertrand Thirion, Lenka Zdeborová

    Abstract: We introduce an iterative optimization scheme for convex objectives consisting of a linear loss and a non-separable penalty, based on the expectation-consistent approximation and the vector approximate message-passing (VAMP) algorithm. Specifically, the penalties we approach are convex on a linear transformation of the variable to be determined, a notable example being total variation (TV). We des…

    Submitted 17 September, 2018; originally announced September 2018.

    Comments: 18 pages, 6 figures

  27. arXiv:1809.06035  [pdf, other]

    stat.ML cs.CV cs.LG q-bio.QM

    Extracting representations of cognition across neuroimaging studies improves brain decoding

    Authors: Arthur Mensch, Julien Mairal, Bertrand Thirion, Gaël Varoquaux

    Abstract: Cognitive brain imaging is accumulating datasets about the neural substrate of many different mental processes. Yet, most studies are based on few subjects and have low statistical power. Analyzing data across studies could bring more statistical power; yet the current brain-imaging analytic framework cannot be used at scale as it requires casting all cognitive tasks in a unified theoretical frame…

    Submitted 19 May, 2021; v1 submitted 17 September, 2018; originally announced September 2018.

    Journal ref: PLoS Computational Biology, Public Library of Science, 2021

  28. arXiv:1809.02440  [pdf, other]

    cs.NE cs.CV cs.LG q-bio.NC

    Optimizing deep video representation to match brain activity

    Authors: Hugo Richard, Ana Pinho, Bertrand Thirion, Guillaume Charpiat

    Abstract: The comparison of observed brain activity with the statistics generated by artificial intelligence systems is useful to probe brain functional organization under ecological conditions. Here we study fMRI activity in ten subjects watching color natural movies and compute deep representations of these movies with an architecture that relies on optical flow and image content. The association of activ…

    Submitted 7 September, 2018; originally announced September 2018.

    Journal ref: 2018 Conference on Cognitive Computational Neuroscience, Sep 2018, Philadelphia, United States

  29. arXiv:1807.11718  [pdf, other]

    cs.LG stat.ML

    Feature Grouping as a Stochastic Regularizer for High-Dimensional Structured Data

    Authors: Sergul Aydore, Bertrand Thirion, Gael Varoquaux

    Abstract: In many applications where collecting data is expensive, for example neuroscience or medical imaging, the sample size is typically small compared to the feature dimension. It is challenging in this setting to train expressive, non-linear models without overfitting. These datasets call for intelligent regularization that exploits known structure, such as correlations between the features arising fr…

    Submitted 22 April, 2019; v1 submitted 31 July, 2018; originally announced July 2018.

    Comments: 12 pages, 14 figures

    Journal ref: ICML2019

  30. arXiv:1806.05829  [pdf, other]

    stat.AP

    Statistical Inference with Ensemble of Clustered Desparsified Lasso

    Authors: Jérôme-Alexis Chevalier, Joseph Salmon, Bertrand Thirion

    Abstract: Medical imaging involves high-dimensional data, yet their acquisition is obtained for limited samples. Multivariate predictive models have become popular in the last decades to fit some external variables from imaging data, and standard algorithms yield point estimates of the model parameters. It is however challenging to attribute confidence to these parameter estimates, which makes solutions har…

    Submitted 15 June, 2018; originally announced June 2018.

  31. arXiv:1806.01139  [pdf, other]

    stat.ME cs.IR cs.LG

    Text to brain: predicting the spatial distribution of neuroimaging observations from text reports

    Authors: Jérôme Dockès, Demian Wassermann, Russell Poldrack, Fabian Suchanek, Bertrand Thirion, Gaël Varoquaux

    Abstract: Despite the digital nature of magnetic resonance imaging, the resulting observations are most frequently reported and stored in text documents. There is a trove of information untapped in medical health records, case reports, and medical publications. In this paper, we propose to mine brain medical publications to learn the spatial distribution associated with anatomical terms. The problem is form…

    Submitted 28 June, 2018; v1 submitted 4 June, 2018; originally announced June 2018.

    Journal ref: MICCAI 2018 - 21st International Conference on Medical Image Computing and Computer Assisted Intervention, Sep 2018, Granada, Spain. pp.1-18, 2018

  32. arXiv:1710.11438  [pdf, other]

    stat.ML cs.LG q-bio.NC

    Learning Neural Representations of Human Cognition across Many fMRI Studies

    Authors: Arthur Mensch, Julien Mairal, Danilo Bzdok, Bertrand Thirion, Gaël Varoquaux

    Abstract: Cognitive neuroscience is enjoying rapid increase in extensive public brain-imaging datasets. It opens the door to large-scale statistical models. Finding a unified perspective for all available data calls for scalable and automated solutions to an old challenge: how to aggregate heterogeneous information on brain function into a universal cognitive system that relates mental operations/cognitive…

    Submitted 10 November, 2017; v1 submitted 31 October, 2017; originally announced October 2017.

    Comments: Advances in Neural Information Processing Systems, Dec 2017, Long Beach, United States. 2017

    Journal ref: Advances in Neural Information Processing Systems, 2017

  33. arXiv:1701.05363  [pdf, other]

    stat.ML cs.LG math.OC q-bio.NC

    Stochastic Subsampling for Factorizing Huge Matrices

    Authors: Arthur Mensch, Julien Mairal, Bertrand Thirion, Gael Varoquaux

    Abstract: We present a matrix-factorization algorithm that scales to input matrices with both huge number of rows and columns. Learned factors may be sparse or dense and/or non-negative, which makes our algorithm suitable for dictionary learning, sparse component analysis, and non-negative matrix factorization. Our algorithm streams matrix columns while subsampling them to iteratively learn the matrix facto…

    Submitted 30 October, 2017; v1 submitted 19 January, 2017; originally announced January 2017.

    Comments: IEEE Transactions on Signal Processing, Institute of Electrical and Electronics Engineers, to appear

    Journal ref: IEEE Transactions on Signal Processing, 2018, 66 (1), pp 113-128

  34. arXiv:1611.10041  [pdf, other]

    math.OC cs.LG stat.ML

    Subsampled online matrix factorization with convergence guarantees

    Authors: Arthur Mensch, Julien Mairal, Gaël Varoquaux, Bertrand Thirion

    Abstract: We present a matrix factorization algorithm that scales to input matrices that are large in both dimensions (i.e., that contain more than 1TB of data). The algorithm streams the matrix columns while subsampling them, resulting in low complexity per iteration and reasonable memory footprint. In contrast to previous online matrix factorization methods, our approach relies on low-dimensional statistic…

    Submitted 30 November, 2016; originally announced November 2016.

    Journal ref: 9th NIPS Workshop on Optimization for Machine Learning, Dec 2016, Barcelona, Spain

  35. Deriving reproducible biomarkers from multi-site resting-state data: An Autism-based example

    Authors: Alexandre Abraham, Michael Milham, Adriana Di Martino, R. Cameron Craddock, Dimitris Samaras, Bertrand Thirion, Gaël Varoquaux

    Abstract: Resting-state functional Magnetic Resonance Imaging (R-fMRI) holds the promise to reveal functional biomarkers of neuropsychiatric disorders. However, extracting such biomarkers is challenging for complex multi-faceted neuropathologies, such as autism spectrum disorders. Large multi-site datasets increase sample sizes to compensate for this complexity, at the cost of uncontrolled heterogeneity. T…

    Submitted 18 November, 2016; originally announced November 2016.

    Comments: in NeuroImage, Elsevier, 2016

  36. Recursive nearest agglomeration (ReNA): fast clustering for approximation of structured signals

    Authors: Andrés Hoyos-Idrobo, Gaël Varoquaux, Jonas Kahn, Bertrand Thirion

    Abstract: In this work, we revisit fast dimension reduction approaches, as with random projections and random sampling. Our goal is to summarize the data to decrease computational costs and memory footprint of subsequent analysis. Such dimension reduction can be very efficient when the signals of interest have a strong structure, such as with images. We focus on this setting and investigate feature clusteri…

    Submitted 19 March, 2018; v1 submitted 15 September, 2016; originally announced September 2016.

    Comments: IEEE Transactions on Pattern Analysis and Machine Intelligence, Institute of Electrical and Electronics Engineers, In press

  37. arXiv:1606.06439  [pdf, other]

    stat.ML cs.CV q-bio.NC

    Social-sparsity brain decoders: faster spatial sparsity

    Authors: Gaël Varoquaux, Matthieu Kowalski, Bertrand Thirion

    Abstract: Spatially-sparse predictors are good models for brain decoding: they give accurate predictions and their weight maps are interpretable as they focus on a small number of regions. However, the state of the art, based on total variation or graph-net, is computationally costly. Here we introduce sparsity in the local neighborhood of each voxel with social-sparsity, a structured shrinkage operator. We…

    Submitted 21 June, 2016; originally announced June 2016.

    Comments: in Pattern Recognition in NeuroImaging, Jun 2016, Trento, Italy. 2016

  38. Assessing and tuning brain decoders: cross-validation, caveats, and guidelines

    Authors: Gaël Varoquaux, Pradeep Reddy Raamana, Denis Engemann, Andrés Hoyos-Idrobo, Yannick Schwartz, Bertrand Thirion

    Abstract: Decoding, i.e., prediction from brain images or signals, calls for empirical evaluation of its predictive power. Such evaluation is achieved via cross-validation, a method also used to tune decoders' hyper-parameters. This paper is a review on cross-validation procedures for decoding in neuroimaging. It includes a didactic overview of the relevant theoretical considerations. Practical aspects are hig…

    Submitted 7 November, 2016; v1 submitted 16 June, 2016; originally announced June 2016.

    Comments: NeuroImage, Elsevier, 2016

  39. arXiv:1605.00937  [pdf, other]

    stat.ML cs.LG q-bio.QM

    Dictionary Learning for Massive Matrix Factorization

    Authors: Arthur Mensch, Julien Mairal, Bertrand Thirion, Gaël Varoquaux

    Abstract: Sparse matrix factorization is a popular tool to obtain interpretable data decompositions, which are also effective to perform data completion or denoising. Its applicability to large datasets has been addressed with online and randomized methods, that reduce the complexity in one of the matrix dimensions, but not in both of them. In this paper, we tackle very large matrices in both dimensions. We…

    Submitted 26 May, 2016; v1 submitted 3 May, 2016; originally announced May 2016.

    Journal ref: Proceedings of the International Conference on Machine Learning, 2016, pp 1737-1746

  40. Compressed Online Dictionary Learning for Fast fMRI Decomposition

    Authors: Arthur Mensch, Gaël Varoquaux, Bertrand Thirion

    Abstract: We present a method for fast resting-state fMRI spatial decompositions of very large datasets, based on the reduction of the temporal dimension before applying dictionary learning on concatenated individual records from groups of subjects. Introducing a measure of correspondence between spatial decompositions of rest fMRI, we demonstrate that time-reduced dictionary learning produces result as r…

    Submitted 8 February, 2016; originally announced February 2016.

    Journal ref: IEEE International Symposium on Biomedical Imaging, 2016

  41. arXiv:1512.06999  [pdf, ps, other

    q-bio.NC cs.LG stat.CO stat.ML

    FAASTA: A fast solver for total-variation regularization of ill-conditioned problems with application to brain imaging

    Authors: Gaël Varoquaux, Michael Eickenberg, Elvis Dohmatob, Bertrand Thirion

    Abstract: The total variation (TV) penalty, as many other analysis-sparsity problems, does not lead to separable factors or a proximal operator with a closed-form expression, such as soft thresholding for the $\ell_1$ penalty. As a result, in a variational formulation of an inverse problem or statistical learning estimation, it leads to challenging non-smooth optimization problems that are often solved with e… ▽ More

    Submitted 22 December, 2015; originally announced December 2015.

    Journal ref: Colloque GRETSI, Sep 2015, Lyon, France. Gretsi, 2015, http://www.gretsi.fr/colloque2015/myGretsi/programme.php

  42. arXiv:1511.04898  [pdf, other

    stat.ML cs.CV

    Fast clustering for scalable statistical analysis on structured images

    Authors: Bertrand Thirion, Andrés Hoyos-Idrobo, Jonas Kahn, Gaël Varoquaux

    Abstract: The use of brain images as markers for diseases or behavioral differences is challenged by the small effect sizes and the ensuing lack of power, an issue that has incited researchers to rely more systematically on large cohorts. Coupled with resolution increases, this leads to very large datasets. A striking example in the case of brain imaging is that of the Human Connectome Project: 20 Terabytes… ▽ More

    Submitted 16 November, 2015; originally announced November 2015.

    Comments: ICML Workshop on Statistics, Machine Learning and Neuroscience (Stamlins 2015), Jul 2015, Lille, France

  43. arXiv:1412.3925  [pdf, other

    q-bio.NC cs.CV

    Region segmentation for sparse decompositions: better brain parcellations from rest fMRI

    Authors: Alexandre Abraham, Elvis Dohmatob, Bertrand Thirion, Dimitris Samaras, Gaël Varoquaux

    Abstract: Functional Magnetic Resonance Images acquired during resting-state provide information about the functional organization of the brain through measuring correlations between brain areas. Independent components analysis is the reference approach to estimate spatial components from weakly structured data such as brain signal time courses; each of these components may be referred to as a brain network… ▽ More

    Submitted 12 December, 2014; originally announced December 2014.

    Journal ref: Sparsity Techniques in Medical Imaging, Sep 2014, Boston, United States. pp.8

  44. arXiv:1412.3919  [pdf, other

    cs.LG cs.CV stat.ML

    Machine Learning for Neuroimaging with Scikit-Learn

    Authors: Alexandre Abraham, Fabian Pedregosa, Michael Eickenberg, Philippe Gervais, Andreas Muller, Jean Kossaifi, Alexandre Gramfort, Bertrand Thirion, Gaël Varoquaux

    Abstract: Statistical machine learning methods are increasingly used for neuroimaging data analysis. Their main virtue is their ability to model high-dimensional datasets, e.g. multivariate analysis of activation images or resting-state time series. Supervised learning is typically used in decoding or encoding settings to relate brain images to behavioral or clinical observations, while unsupervised learnin… ▽ More

    Submitted 12 December, 2014; originally announced December 2014.

    Comments: Frontiers in neuroscience, Frontiers Research Foundation, 2013, pp.15

  45. Data-driven HRF estimation for encoding and decoding models

    Authors: Fabian Pedregosa, Michael Eickenberg, Philippe Ciuciu, Bertrand Thirion, Alexandre Gramfort

    Abstract: Despite the common usage of a canonical, data-independent, hemodynamic response function (HRF), it is known that the shape of the HRF varies across brain regions and subjects. This suggests that a data-driven estimation of this function could lead to more statistical power when modeling BOLD fMRI data. However, unconstrained estimation of the HRF can yield highly unstable results when the number o… ▽ More

    Submitted 7 November, 2014; v1 submitted 27 February, 2014; originally announced February 2014.

    Comments: appears in NeuroImage (2015)

  46. arXiv:1311.3859  [pdf, other

    stat.ML cs.LG q-bio.NC

    Mapping cognitive ontologies to and from the brain

    Authors: Yannick Schwartz, Bertrand Thirion, Gaël Varoquaux

    Abstract: Imaging neuroscience links brain activation maps to behavior and cognition via correlational studies. Due to the nature of the individual experiments, based on eliciting neural response from a small number of stimuli, this link is incomplete, and unidirectional from the causal point of view. To come to conclusions on the function implied by the activation of brain regions, it is necessary to combi… ▽ More

    Submitted 20 November, 2013; v1 submitted 15 November, 2013; originally announced November 2013.

    Comments: NIPS (Neural Information Processing Systems), United States (2013)

  47. arXiv:1310.1257  [pdf, other

    cs.CV

    Second order scattering descriptors predict fMRI activity due to visual textures

    Authors: Michael Eickenberg, Fabian Pedregosa, Senoussi Mehdi, Alexandre Gramfort, Bertrand Thirion

    Abstract: Second layer scattering descriptors are known to provide good classification performance on natural quasi-stationary processes such as visual textures due to their sensitivity to higher order moments and continuity with respect to small deformations. In a functional Magnetic Resonance Imaging (fMRI) experiment we present visual textures to subjects and evaluate the predictive power of these descri… ▽ More

    Submitted 10 August, 2013; originally announced October 2013.

    Comments: 3rd International Workshop on Pattern Recognition in NeuroImaging (2013)

  48. arXiv:1305.2788  [pdf, other

    cs.LG stat.AP

    HRF estimation improves sensitivity of fMRI encoding and decoding models

    Authors: Fabian Pedregosa, Michael Eickenberg, Bertrand Thirion, Alexandre Gramfort

    Abstract: Extracting activation patterns from functional Magnetic Resonance Images (fMRI) datasets remains challenging in rapid-event designs due to the inherent delay of blood oxygen level-dependent (BOLD) signal. The general linear model (GLM) allows estimating the activation from a design matrix and a fixed hemodynamic response function (HRF). However, the HRF is known to vary substantially between subj… ▽ More

    Submitted 13 May, 2013; originally announced May 2013.

    Comments: 3rd International Workshop on Pattern Recognition in NeuroImaging (2013)

  49. arXiv:1209.5375  [pdf, other

    stat.ML

    Improving accuracy and power with transfer learning using a meta-analytic database

    Authors: Yannick Schwartz, Gaël Varoquaux, Christophe Pallier, Philippe Pinel, Jean-Baptiste Poline, Bertrand Thirion

    Abstract: Typical cohorts in brain imaging studies are not large enough for systematic testing of all the information contained in the images. To build testable working hypotheses, investigators thus rely on analysis of previous work, sometimes formalized in a so-called meta-analysis. In brain imaging, this approach underlies the specification of regions of interest (ROIs) that are usually selected on the b… ▽ More

    Submitted 28 September, 2012; v1 submitted 24 September, 2012; originally announced September 2012.

    Comments: MICCAI, Nice, France (2012)

  50. arXiv:1209.1450  [pdf, other

    stat.ML cs.LG

    On spatial selectivity and prediction across conditions with fMRI

    Authors: Yannick Schwartz, Gaël Varoquaux, Bertrand Thirion

    Abstract: Researchers in functional neuroimaging mostly use activation coordinates to formulate their hypotheses. Instead, we propose to use the full statistical images to define regions of interest (ROIs). This paper presents two machine learning approaches, transfer learning and selection transfer, that are compared on their ability to identify the common patterns between brain activation maps related t… ▽ More

    Submitted 7 September, 2012; originally announced September 2012.

    Comments: PRNI 2012: 2nd International Workshop on Pattern Recognition in NeuroImaging, London, United Kingdom (2012)