
Semi-Supervised Learning Enabled by Multiscale Deep Neural Network Inversion

Authors: Randall Balestriero, Herve Glotin, Richard Baraniuk

Abstract: Deep Neural Networks (DNNs) provide state-of-the-art solutions in several difficult machine perceptual tasks. However, their performance relies on the availability of a large set of labeled training data, which limits the breadth of their applicability. Hence, there is a need for new semi-supervised learning methods for DNNs that can leverage both (a small amount of) labeled and unlabeled training data. In this paper, we develop a general loss function enabling DNNs of any topology to be trained in a semi-supervised manner without extra hyper-parameters. As opposed to current semi-supervised techniques based on topology-specific or unstable approaches, ours is both robust and general. We demonstrate that our approach reaches state-of-the-art performance on the SVHN ($9.82\%$ test error, with $500$ labels and wide Resnet) and CIFAR10 ($16.38\%$ test error, with $8000$ labels and sigmoid convolutional neural network) data sets.

Submitted 27 February, 2018; originally announced February 2018.

arXiv:1711.04313 [pdf, other]

Semi-Supervised Learning via New Deep Network Inversion

Authors: Randall Balestriero, Vincent Roger, Herve G. Glotin, Richard G. Baraniuk

Abstract: We exploit a recently derived inversion scheme for arbitrary deep neural networks to develop a new semi-supervised learning framework that applies to a wide range of systems and problems. The approach outperforms current state-of-the-art methods on MNIST reaching $99.14\%$ of test set accuracy while using $5$ labeled examples per class. Experiments with one-dimensional signals highlight the generality of the method. Importantly, our approach is simple, efficient, and requires no change in the deep network architecture.

Submitted 12 November, 2017; originally announced November 2017.

Comments: arXiv admin note: substantial text overlap with arXiv:1710.09302

arXiv:1707.05841 [pdf, other]

Linear Time Complexity Deep Fourier Scattering Network and Extension to Nonlinear Invariants

Authors: Randall Balestriero, Herve Glotin

Abstract: In this paper we propose a scalable version of a state-of-the-art deterministic time-invariant feature extraction approach based on consecutive changes of basis and nonlinearities, namely, the scattering network. The first focus of the paper is to extend the scattering network to allow the use of higher order nonlinearities as well as extracting nonlinear and Fourier based statistics leading to the required invariants of any inherently structured input. In order to reach fast convolutions and to leverage the intrinsic structure of wavelets, we derive our complete model in the Fourier domain. In addition to providing fast computations, we are now able to exploit sparse matrices due to extremely high sparsity well localized in the Fourier domain. As a result, we are able to reach a true linear time complexity with inputs in the Fourier domain, allowing fast and energy efficient solutions to machine learning tasks. Validation of the features and computational results will be presented through the use of these invariant coefficients to perform classification on audio recordings of bird songs captured in multiple different soundscapes. In the end, the applicability of the presented solutions to deep artificial neural networks is discussed.

Submitted 18 July, 2017; originally announced July 2017.

arXiv:1501.03347 [pdf, other]

Dirichlet Process Parsimonious Mixtures for clustering

Authors: Faicel Chamroukhi, Marius Bartcus, Hervé Glotin

Abstract: The parsimonious Gaussian mixture models, which exploit an eigenvalue decomposition of the group covariance matrices of the Gaussian mixture, have shown their success in particular in cluster analysis. Their estimation is in general performed by maximum likelihood estimation and has also been considered from a parametric Bayesian perspective. We propose new Dirichlet Process Parsimonious mixtures (DPPM) which represent a Bayesian nonparametric formulation of these parsimonious Gaussian mixture models. The proposed DPPM models are Bayesian nonparametric parsimonious mixture models that allow the model parameters, the optimal number of mixture components and the optimal parsimonious mixture structure to be inferred simultaneously from the data. We develop a Gibbs sampling technique for maximum a posteriori (MAP) estimation of the developed DPPM models and provide a Bayesian model selection framework by using Bayes factors. We apply them to cluster simulated data and real data sets, and compare them to the standard parsimonious mixture models. The obtained results highlight the effectiveness of the proposed nonparametric parsimonious mixture models as a good nonparametric alternative for the parametric parsimonious models.

Submitted 17 October, 2018; v1 submitted 14 January, 2015; originally announced January 2015.

arXiv:1312.7018 [pdf, ps, other]

doi 10.1109/IJCNN.2012.6252818

Mixture model-based functional discriminant analysis for curve classification

Authors: Faicel Chamroukhi, Hervé Glotin

Abstract: Statistical approaches for Functional Data Analysis concern the paradigm for which the individuals are functions or curves rather than finite dimensional vectors. In this paper, we particularly focus on the modeling and the classification of functional data which are temporal curves presenting regime changes over time. More specifically, we propose a new mixture model-based discriminant analysis approach for functional data using a specific hidden process regression model. Our approach is particularly adapted both to handle the problem of complex-shaped classes of curves, where each class is composed of several sub-classes, and to deal with the regime changes within each homogeneous sub-class. The model explicitly integrates the heterogeneity of each class of curves via a mixture model formulation, and the regime changes within each sub-class through a hidden logistic process. The approach therefore allows for fitting flexible curve-models to each class of complex-shaped curves presenting regime changes through an unsupervised learning scheme, to automatically summarize it into a finite number of homogeneous clusters, each of which is decomposed into several regimes. The model parameters are learned by maximizing the observed-data log-likelihood for each class by using a dedicated expectation-maximization (EM) algorithm. Comparisons on simulated data and real data with alternative approaches, including functional linear discriminant analysis and functional mixture discriminant analysis with polynomial regression mixtures and spline regression mixtures, show that the proposed approach provides better discrimination results and significantly improves the curves approximation.

Submitted 25 December, 2013; originally announced December 2013.

Comments: In Proceedings of the 2012 International Joint Conference on Neural Networks (IJCNN), 2012, Pages: 1-8, Brisbane, Australia

arXiv:1312.7007 [pdf, ps, other]

Functional Mixture Discriminant Analysis with hidden process regression for curve classification

Authors: Faicel Chamroukhi, Hervé Glotin, Céline Rabouy

Abstract: We present a new mixture model-based discriminant analysis approach for functional data using a specific hidden process regression model. The approach allows for fitting flexible curve-models to each class of complex-shaped curves presenting regime changes. The model parameters are learned by maximizing the observed-data log-likelihood for each class by using a dedicated expectation-maximization (EM) algorithm. Comparisons on simulated data with alternative approaches show that the proposed approach provides better results.

Submitted 25 December, 2013; originally announced December 2013.

Comments: In Proceedings of the XXth European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning (ESANN), Pages 281-286, 2012, Bruges, Belgium

arXiv:1312.6966 [pdf, ps, other]

doi 10.1016/j.neucom.2012.10.030

Model-based functional mixture discriminant analysis with hidden process regression for curve classification

Authors: Faicel Chamroukhi, Hervé Glotin, Allou Samé

Abstract: In this paper, we study the modeling and the classification of functional data presenting regime changes over time. We propose a new model-based functional mixture discriminant analysis approach based on a specific hidden process regression model that governs the regime changes over time. Our approach is particularly adapted to handle the problem of complex-shaped classes of curves, where each class is potentially composed of several sub-classes, and to deal with the regime changes within each homogeneous sub-class. The proposed model explicitly integrates the heterogeneity of each class of curves via a mixture model formulation, and the regime changes within each sub-class through a hidden logistic process. Each class of complex-shaped curves is modeled by a finite number of homogeneous clusters, each of them being decomposed into several regimes. The model parameters of each class are learned by maximizing the observed-data log-likelihood by using a dedicated expectation-maximization (EM) algorithm. Comparisons are performed with alternative curve classification approaches, including functional linear discriminant analysis and functional mixture discriminant analysis with polynomial regression mixtures and spline regression mixtures. Results obtained on simulated data and real data show that the proposed approach outperforms the alternative approaches in terms of discrimination, and significantly improves the curves approximation.

Submitted 25 December, 2013; originally announced December 2013.

Journal ref: Neurocomputing, Volume 112, Pages 153-163, July 2013

arXiv:1306.3058 [pdf, other]

Physeter catodon localization by sparse coding

Authors: Sébastien Paris, Yann Doh, Hervé Glotin, Xanadu Halkias, Joseph Razik

Abstract: This paper presents a sperm whale localization architecture jointly using a bag-of-features (BoF) approach and a machine learning framework. BoF methods are known, especially in computer vision, to produce from a collection of local features a global representation invariant to principal signal transformations. Our idea is to regress, in a supervised manner, two rough estimates of the distance and azimuth from these local features, thanks to some datasets where both acoustic events and ground-truth positions are now available. Furthermore, these estimates can feed a particle filter system in order to obtain a precise sperm whale position even in a mono-hydrophone configuration. Anti-collision systems and whale watching are considered applications of this work.

Submitted 13 June, 2013; originally announced June 2013.

Comments: 6 pages, 6 figures, workshop ICML4B in ICML 2013 conference

arXiv:1301.3533 [pdf, other]

Sparse Penalty in Deep Belief Networks: Using the Mixed Norm Constraint

Authors: Xanadu Halkias, Sebastien Paris, Herve Glotin

Abstract: Deep Belief Networks (DBN) have been successfully applied to popular machine learning tasks. Specifically, when applied to hand-written digit recognition, DBNs have achieved approximate accuracy rates of 98.8%. In an effort to optimize the data representation achieved by the DBN and maximize their descriptive power, recent advances have focused on inducing sparse constraints at each layer of the DBN. In this paper we present a theoretical approach for sparse constraints in the DBN using the mixed norm for both non-overlapping and overlapping groups. We explore how these constraints affect the classification accuracy for digit recognition in three different datasets (MNIST, USPS, RIMES) and provide initial estimations of their usefulness by altering different parameters such as the group size and overlap percentage.

Submitted 22 February, 2013; v1 submitted 15 January, 2013; originally announced January 2013.

Comments: 8 pages, 7 figures (including subfigures), ICleaR conference

Showing 1–9 of 9 results for author: Glotin, H