Showing 101–150 of 156 results for author: LeCun, Y

  1. arXiv:1606.01535  [pdf, other]

    cs.CV

    What is the Best Feature Learning Procedure in Hierarchical Recognition Architectures?

    Authors: Kevin Jarrett, Koray Kavukcuoglu, Karol Gregor, Yann LeCun

    Abstract: (This paper was written in November 2011 and never published. It is posted on arXiv.org in its original form in June 2016). Many recent object recognition systems have proposed using a two phase training procedure to learn sparse convolutional feature hierarchies: unsupervised pre-training followed by supervised fine-tuning. Recent results suggest that these methods provide little improvement over…

    Submitted 5 June, 2016; originally announced June 2016.

    Comments: 17 pages, 3 figures

  2. arXiv:1605.00983  [pdf]

    cs.DC

    Phase 3: DCL System Using Deep Learning Approaches for Land-based or Ship-based Real-Time Recognition and Localization of Marine Mammals - Bioacoustic Applications

    Authors: Peter J. Dugan, Christopher W. Clark, Yann André LeCun, Sofie M. Van Parijs

    Abstract: The goal of this research phase is to investigate advanced detection and classification paradigms useful for data-mining large passive acoustic archives. Technical objectives are to develop and refine a High Performance Computing, Acoustic Data Accelerator (HPC-ADA) along with MATLAB based software based on time series acoustic signal Detection cLassification using Machine learning Algorithms,…

    Submitted 5 May, 2016; v1 submitted 3 May, 2016; originally announced May 2016.

    Comments: National Oceanic Partnership Program (NOPP) sponsored by ONR and NFWF

    Report number: N000141210585

  3. arXiv:1605.00982  [pdf]

    cs.DC

    Phase 4: DCL System Using Deep Learning Approaches for Land-Based or Ship-Based Real-Time Recognition and Localization of Marine Mammals - Distributed Processing and Big Data Applications

    Authors: Peter J. Dugan, Christopher W. Clark, Yann André LeCun, Sofie M. Van Parijs

    Abstract: While the animal bioacoustics community at large is collecting huge amounts of acoustic data at an unprecedented pace, processing these data is problematic. Currently in bioacoustics, there is no effective way to achieve high performance computing using commercial off the shelf (COTS) or government off the shelf (GOTS) tools. Although several advances have been made in the open source and commerc…

    Submitted 5 May, 2016; v1 submitted 3 May, 2016; originally announced May 2016.

    Comments: National Oceanic Partnership Program (NOPP) sponsored by ONR and NFWF

    Report number: N000141210585

  4. arXiv:1605.00972  [pdf]

    cs.CV

    Phase 2: DCL System Using Deep Learning Approaches for Land-based or Ship-based Real-Time Recognition and Localization of Marine Mammals - Machine Learning Detection Algorithms

    Authors: Peter J. Dugan, Christopher W. Clark, Yann André LeCun, Sofie M. Van Parijs

    Abstract: The overarching goal of this work is to advance the state of the art for detection, classification and localization (DCL) in the field of bioacoustics. This goal is primarily achieved by building a generic framework for detection-classification (DC) using a fast, efficient and scalable architecture, demonstrating the capabilities of this system on a variety of low-frequency and mid-frequency ceta…

    Submitted 5 May, 2016; v1 submitted 3 May, 2016; originally announced May 2016.

    Comments: National Oceanic Partnership Program (NOPP) sponsored by ONR and NFWF: N000141210585

    Report number: N000141210585

  5. arXiv:1605.00971  [pdf]

    cs.DC

    Phase 1: DCL System Research Using Advanced Approaches for Land-based or Ship-based Real-Time Recognition and Localization of Marine Mammals - HPC System Implementation

    Authors: Peter J. Dugan, Christopher W. Clark, Yann André LeCun, Sofie M. Van Parijs

    Abstract: We aim to investigate advancing the state of the art of detection, classification and localization (DCL) in the field of bioacoustics. The two primary goals are to develop transferable technologies for detection and classification in: (1) the area of advanced algorithms, such as deep learning and other methods; and (2) advanced systems, capable of real-time and archival processing. This projec…

    Submitted 5 May, 2016; v1 submitted 3 May, 2016; originally announced May 2016.

    Comments: Year 1 National Oceanic Partnership Program Report, sponsored ONR, NFWF. N000141210585

    Report number: N000141210585

  6. arXiv:1602.06662  [pdf, other]

    cs.NE cs.AI cs.LG stat.ML

    Recurrent Orthogonal Networks and Long-Memory Tasks

    Authors: Mikael Henaff, Arthur Szlam, Yann LeCun

    Abstract: Although RNNs have been shown to be powerful tools for processing sequential data, finding architectures or optimization strategies that allow them to model very long term dependencies is still an active area of research. In this work, we carefully analyze two synthetic datasets originally outlined in (Hochreiter and Schmidhuber, 1997) which are used to evaluate the ability of RNNs to store inform…

    Submitted 15 March, 2017; v1 submitted 22 February, 2016; originally announced February 2016.

  7. arXiv:1511.06444  [pdf, other]

    cs.LG math.NA math.PR

    Universal halting times in optimization and machine learning

    Authors: Levent Sagun, Thomas Trogdon, Yann LeCun

    Abstract: The authors present empirical distributions for the halting time (measured by the number of iterations to reach a given accuracy) of optimization algorithms applied to two random systems: spin glasses and deep learning. Given an algorithm, which we take to be both the optimization routine and the form of the random landscape, the fluctuations of the halting time follow a distribution that, after c…

    Submitted 20 February, 2017; v1 submitted 19 November, 2015; originally announced November 2015.

    MSC Class: 65K10; 82D30; 37E20

    Journal ref: Quart. Appl. Math. 76 (2018), 289-301

  8. arXiv:1511.05666  [pdf, other]

    cs.CV

    Super-Resolution with Deep Convolutional Sufficient Statistics

    Authors: Joan Bruna, Pablo Sprechmann, Yann LeCun

    Abstract: Inverse problems in image and audio, and super-resolution in particular, can be seen as high-dimensional structured prediction problems, where the goal is to characterize the conditional distribution of a high-resolution output given its low-resolution corrupted observation. When the scaling ratio is small, point estimates achieve impressive performance, but soon they suffer from the regression-to…

    Submitted 1 March, 2016; v1 submitted 18 November, 2015; originally announced November 2015.

  9. arXiv:1511.05440  [pdf, other]

    cs.LG cs.CV stat.ML

    Deep multi-scale video prediction beyond mean square error

    Authors: Michael Mathieu, Camille Couprie, Yann LeCun

    Abstract: Learning to predict future images from a video sequence involves the construction of an internal representation that models the image evolution accurately, and therefore, to some degree, its content and dynamics. This is why pixel-space video prediction may be viewed as a promising avenue for unsupervised feature learning. In addition, while optical flow has been a very studied problem in computer…

    Submitted 26 February, 2016; v1 submitted 17 November, 2015; originally announced November 2015.

  10. arXiv:1511.05212  [pdf, other]

    cs.LG

    Binary embeddings with structured hashed projections

    Authors: Anna Choromanska, Krzysztof Choromanski, Mariusz Bojarski, Tony Jebara, Sanjiv Kumar, Yann LeCun

    Abstract: We consider the hashing mechanism for constructing binary embeddings, that involves pseudo-random projections followed by nonlinear (sign function) mappings. The pseudo-random projection is described by a matrix, where not all entries are independent random variables but instead a fixed "budget of randomness" is distributed across the matrix. Such matrices can be efficiently stored in sub-quadrati…

    Submitted 1 July, 2016; v1 submitted 16 November, 2015; originally announced November 2015.

    Comments: arXiv admin note: text overlap with arXiv:1505.03190

  11. arXiv:1511.03719  [pdf, other]

    cs.LG

    Universum Prescription: Regularization using Unlabeled Data

    Authors: Xiang Zhang, Yann LeCun

    Abstract: This paper shows that simply prescribing "none of the above" labels to unlabeled data has a beneficial regularization effect to supervised learning. We call it universum prescription by the fact that the prescribed labels cannot be one of the supervised labels. In spite of its simplicity, universum prescription obtained competitive results in training deep convolutional networks for CIFAR-10, CIFA…

    Submitted 17 November, 2016; v1 submitted 11 November, 2015; originally announced November 2015.

    Comments: 7 pages for article, 3 pages for supplemental material. To appear in AAAI-17

  12. arXiv:1510.05970  [pdf, other]

    cs.CV cs.LG cs.NE

    Stereo Matching by Training a Convolutional Neural Network to Compare Image Patches

    Authors: Jure Žbontar, Yann LeCun

    Abstract: We present a method for extracting depth information from a rectified image pair. Our approach focuses on the first stage of many stereo algorithms: the matching cost computation. We approach the problem by learning a similarity measure on small image patches using a convolutional neural network. Training is carried out in a supervised manner by constructing a binary classification data set with e…

    Submitted 18 May, 2016; v1 submitted 20 October, 2015; originally announced October 2015.

    Journal ref: JMLR 17(65):1-32, 2016

  13. arXiv:1509.08967  [pdf, other]

    cs.CL cs.NE

    Very Deep Multilingual Convolutional Neural Networks for LVCSR

    Authors: Tom Sercu, Christian Puhrsch, Brian Kingsbury, Yann LeCun

    Abstract: Convolutional neural networks (CNNs) are a standard component of many current state-of-the-art Large Vocabulary Continuous Speech Recognition (LVCSR) systems. However, CNNs in LVCSR have not kept pace with recent advances in other domains where deeper neural networks provide superior performance. In this paper we propose a number of architectural advances in CNNs for LVCSR. First, we introduce a v…

    Submitted 23 January, 2016; v1 submitted 29 September, 2015; originally announced September 2015.

    Comments: Accepted for publication at ICASSP 2016

  14. arXiv:1509.03591  [pdf]

    cs.DC

    High Performance Computer Acoustic Data Accelerator: A New System for Exploring Marine Mammal Acoustics for Big Data Applications

    Authors: Peter Dugan, John Zollweg, Marian Popescu, Denise Risch, Herve Glotin, Yann LeCun, Christopher Clark

    Abstract: This paper presents a new software model designed for distributed sonic signal detection runtime using machine learning algorithms called DeLMA. A new algorithm--Acoustic Data-mining Accelerator (ADA)--is also presented. ADA is a robust yet scalable solution for efficiently processing big sound archives using distributed computing technologies. Together, DeLMA and the ADA algorithm provide a powe…

    Submitted 11 September, 2015; originally announced September 2015.

    Comments: Seven pages, submitted at International Conference on Machine Learning 2014, Workshop uLearnBio, unsupervised learning for bioacoustic applications

    MSC Class: 68-04

  15. arXiv:1509.01626  [pdf, other]

    cs.LG cs.CL

    Character-level Convolutional Networks for Text Classification

    Authors: Xiang Zhang, Junbo Zhao, Yann LeCun

    Abstract: This article offers an empirical exploration on the use of character-level convolutional networks (ConvNets) for text classification. We constructed several large-scale datasets to show that character-level convolutional networks could achieve state-of-the-art or competitive results. Comparisons are offered against traditional models such as bag of words, n-grams and their TFIDF variants, and deep…

    Submitted 3 April, 2016; v1 submitted 4 September, 2015; originally announced September 2015.

    Comments: An early version of this work entitled "Text Understanding from Scratch" was posted in Feb 2015 as arXiv:1502.01710. The present paper has considerably more experimental results and a rewritten introduction, Advances in Neural Information Processing Systems 28 (NIPS 2015)

  16. arXiv:1506.05163  [pdf, other]

    cs.LG cs.CV cs.NE

    Deep Convolutional Networks on Graph-Structured Data

    Authors: Mikael Henaff, Joan Bruna, Yann LeCun

    Abstract: Deep Learning's recent successes have mostly relied on Convolutional Networks, which exploit fundamental statistical properties of images, sounds and video data: the local stationarity and multi-scale compositional structure, that allows expressing long range interactions in terms of shorter, localized interactions. However, there exist other important examples, such as text documents or bioinform…

    Submitted 16 June, 2015; originally announced June 2015.

  17. arXiv:1506.03011  [pdf, other]

    cs.CV

    Learning to Linearize Under Uncertainty

    Authors: Ross Goroshin, Michael Mathieu, Yann LeCun

    Abstract: Training deep feature hierarchies to solve supervised learning tasks has achieved state of the art performance on many problems in computer vision. However, a principled way in which to train such hierarchies in the unsupervised setting has remained elusive. In this work we suggest a new architecture and loss for training deep feature hierarchies that linearize the transformations observed in unla…

    Submitted 10 September, 2015; v1 submitted 9 June, 2015; originally announced June 2015.

    Comments: To appear at NIPS 2015

  18. arXiv:1506.02351  [pdf, other]

    stat.ML cs.LG cs.NE

    Stacked What-Where Auto-encoders

    Authors: Junbo Zhao, Michael Mathieu, Ross Goroshin, Yann LeCun

    Abstract: We present a novel architecture, the "stacked what-where auto-encoders" (SWWAE), which integrates discriminative and generative pathways and provides a unified approach to supervised, semi-supervised and unsupervised learning without relying on sampling during training. An instantiation of SWWAE uses a convolutional net (Convnet) (LeCun et al. (1998)) to encode the input, and employs a deconvoluti…

    Submitted 14 February, 2016; v1 submitted 8 June, 2015; originally announced June 2015.

    Comments: Workshop track - ICLR 2016

  19. arXiv:1504.02518  [pdf, other]

    cs.CV cs.LG

    Unsupervised Feature Learning from Temporal Data

    Authors: Ross Goroshin, Joan Bruna, Jonathan Tompson, David Eigen, Yann LeCun

    Abstract: Current state-of-the-art classification and detection algorithms rely on supervised training. In this work we study unsupervised feature learning in the context of temporally coherent video data. We focus on feature learning from unlabeled video data, using the assumption that adjacent video frames contain semantically similar information. This assumption is exploited to train a convolutional pool…

    Submitted 15 April, 2015; v1 submitted 9 April, 2015; originally announced April 2015.

    Comments: arXiv admin note: substantial text overlap with arXiv:1412.6056

  20. arXiv:1503.03438  [pdf, ps, other]

    cs.LG cs.NE stat.ML

    A mathematical motivation for complex-valued convolutional networks

    Authors: Joan Bruna, Soumith Chintala, Yann LeCun, Serkan Piantino, Arthur Szlam, Mark Tygert

    Abstract: A complex-valued convolutional network (convnet) implements the repeated application of the following composition of three operations, recursively applying the composition to an input vector of nonnegative real numbers: (1) convolution with complex-valued vectors followed by (2) taking the absolute value of every entry of the resulting vectors followed by (3) local averaging. For processing real-v…

    Submitted 12 December, 2015; v1 submitted 11 March, 2015; originally announced March 2015.

    Comments: 11 pages, 3 figures; this is the retitled version submitted to the journal, "Neural Computation"

    Journal ref: Neural Computation, 28 (5): 815-825, May 2016

  21. arXiv:1502.01710  [pdf, other]

    cs.LG cs.CL

    Text Understanding from Scratch

    Authors: Xiang Zhang, Yann LeCun

    Abstract: This article demonstrates that we can apply deep learning to text understanding from character-level inputs all the way up to abstract text concepts, using temporal convolutional networks (ConvNets). We apply ConvNets to various large-scale datasets, including ontology classification, sentiment analysis, and text categorization. We show that temporal ConvNets can achieve astonishing performance wit…

    Submitted 3 April, 2016; v1 submitted 5 February, 2015; originally announced February 2015.

    Comments: This technical report is superseded by a paper entitled "Character-level Convolutional Networks for Text Classification", arXiv:1509.01626. It has considerably more experimental results and a rewritten introduction

  22. arXiv:1412.7580  [pdf, ps, other]

    cs.LG cs.DC cs.NE

    Fast Convolutional Nets With fbfft: A GPU Performance Evaluation

    Authors: Nicolas Vasilache, Jeff Johnson, Michael Mathieu, Soumith Chintala, Serkan Piantino, Yann LeCun

    Abstract: We examine the performance profile of Convolutional Neural Network training on the current generation of NVIDIA Graphics Processing Units. We introduce two new Fast Fourier Transform convolution implementations: one based on NVIDIA's cuFFT library, and another based on a Facebook authored FFT implementation, fbfft, that provides significant speedups over cuFFT (over 1.5x) for whole CNNs. Both of t…

    Submitted 10 April, 2015; v1 submitted 23 December, 2014; originally announced December 2014.

    Comments: Camera ready for ICLR2015

  23. arXiv:1412.7022  [pdf, ps, other]

    cs.SD cs.LG

    Audio Source Separation with Discriminative Scattering Networks

    Authors: Pablo Sprechmann, Joan Bruna, Yann LeCun

    Abstract: In this report we describe an ongoing line of research for solving single-channel source separation problems. Many monaural signal decomposition techniques proposed in the literature operate on a feature space consisting of a time-frequency representation of the input data. A challenge faced by these approaches is to effectively exploit the temporal dependencies of the signals at scales larger tha…

    Submitted 27 April, 2015; v1 submitted 22 December, 2014; originally announced December 2014.

  24. arXiv:1412.6651  [pdf, other]

    cs.LG stat.ML

    Deep learning with Elastic Averaging SGD

    Authors: Sixin Zhang, Anna Choromanska, Yann LeCun

    Abstract: We study the problem of stochastic optimization for deep learning in the parallel computing environment under communication constraints. A new algorithm is proposed in this setting where the communication and coordination of work among concurrent processes (local workers), is based on an elastic force which links the parameters they compute with a center variable stored by the parameter server (ma…

    Submitted 25 October, 2015; v1 submitted 20 December, 2014; originally announced December 2014.

    Comments: NIPS2015 camera-ready version

  25. arXiv:1412.6615  [pdf, other]

    stat.ML cs.LG

    Explorations on high dimensional landscapes

    Authors: Levent Sagun, V. Ugur Guney, Gerard Ben Arous, Yann LeCun

    Abstract: Finding minima of a real valued non-convex function over a high dimensional space is a major challenge in science. We provide evidence that some such functions that are defined on high dimensional domains have a narrow band of values whose pre-image contains the bulk of its critical points. This is in contrast with the low dimensional picture in which this band is wide. Our simulations agree with…

    Submitted 6 April, 2015; v1 submitted 20 December, 2014; originally announced December 2014.

    Comments: 11 pages, 8 figures, workshop contribution at ICLR 2015

  26. arXiv:1412.6056  [pdf, other]

    cs.CV

    Unsupervised Learning of Spatiotemporally Coherent Metrics

    Authors: Ross Goroshin, Joan Bruna, Jonathan Tompson, David Eigen, Yann LeCun

    Abstract: Current state-of-the-art classification and detection algorithms rely on supervised training. In this work we study unsupervised feature learning in the context of temporally coherent video data. We focus on feature learning from unlabeled video data, using the assumption that adjacent video frames contain semantically similar information. This assumption is exploited to train a convolutional pool…

    Submitted 8 September, 2015; v1 submitted 18 December, 2014; originally announced December 2014.

    Comments: To appear at ICCV2015

  27. arXiv:1412.0233  [pdf, other]

    cs.LG

    The Loss Surfaces of Multilayer Networks

    Authors: Anna Choromanska, Mikael Henaff, Michael Mathieu, Gérard Ben Arous, Yann LeCun

    Abstract: We study the connection between the highly non-convex loss function of a simple model of the fully-connected feed-forward neural network and the Hamiltonian of the spherical spin-glass model under the assumptions of: i) variable independence, ii) redundancy in network parametrization, and iii) uniformity. These assumptions enable us to explain the complexity of the fully decoupled neural network t…

    Submitted 21 January, 2015; v1 submitted 30 November, 2014; originally announced December 2014.

  28. arXiv:1411.4280  [pdf, other]

    cs.CV

    Efficient Object Localization Using Convolutional Networks

    Authors: Jonathan Tompson, Ross Goroshin, Arjun Jain, Yann LeCun, Christopher Bregler

    Abstract: Recent state-of-the-art performance on human-body pose estimation has been achieved with Deep Convolutional Networks (ConvNets). Traditional ConvNet architectures include pooling and sub-sampling layers which reduce computational requirements, introduce invariance and prevent over-training. These benefits of pooling come at the cost of reduced localization accuracy. We introduce a novel architectu…

    Submitted 9 June, 2015; v1 submitted 16 November, 2014; originally announced November 2014.

    Comments: 8 pages with 1 page of citations

  29. arXiv:1410.6973  [pdf, other]

    cs.LG

    Differentially- and non-differentially-private random decision trees

    Authors: Mariusz Bojarski, Anna Choromanska, Krzysztof Choromanski, Yann LeCun

    Abstract: We consider supervised learning with random decision trees, where the tree construction is completely random. The method is popularly used and works well in practice despite the simplicity of the setting, but its statistical mechanism is not yet well-understood. In this paper we provide strong theoretical guarantees regarding learning with random decision trees. We analyze and compare three differ…

    Submitted 5 February, 2015; v1 submitted 25 October, 2014; originally announced October 2014.

  30. arXiv:1409.7963  [pdf, other]

    cs.CV cs.LG cs.NE

    MoDeep: A Deep Learning Framework Using Motion Features for Human Pose Estimation

    Authors: Arjun Jain, Jonathan Tompson, Yann LeCun, Christoph Bregler

    Abstract: In this work, we propose a novel and efficient method for articulated human pose estimation in videos using a convolutional network architecture, which incorporates both color and motion features. We propose a new human body pose dataset, FLIC-motion, that extends the FLIC dataset with additional motion features. We apply our architecture to this dataset and report significantly better performance…

    Submitted 28 September, 2014; originally announced September 2014.

  31. arXiv:1409.4326  [pdf, other]

    cs.CV cs.LG cs.NE

    Computing the Stereo Matching Cost with a Convolutional Neural Network

    Authors: Jure Žbontar, Yann LeCun

    Abstract: We present a method for extracting depth information from a rectified image pair. We train a convolutional neural network to predict how well two image patches match and use it to compute the stereo matching cost. The cost is refined by cross-based cost aggregation and semiglobal matching, followed by a left-right consistency check to eliminate errors in the occluded regions. Our stereo method ach…

    Submitted 20 October, 2015; v1 submitted 15 September, 2014; originally announced September 2014.

    Comments: Conference on Computer Vision and Pattern Recognition (CVPR), June 2015

  32. arXiv:1406.2984  [pdf, other]

    cs.CV

    Joint Training of a Convolutional Network and a Graphical Model for Human Pose Estimation

    Authors: Jonathan Tompson, Arjun Jain, Yann LeCun, Christoph Bregler

    Abstract: This paper proposes a new hybrid architecture that consists of a deep Convolutional Network and a Markov Random Field. We show how this architecture is successfully applied to the challenging problem of articulated human pose estimation in monocular images. The architecture can exploit structural domain constraints such as geometric relationships between body joint locations. We show that joint tr…

    Submitted 17 September, 2014; v1 submitted 11 June, 2014; originally announced June 2014.

  33. arXiv:1404.7195  [pdf, other]

    cs.LG

    Fast Approximation of Rotations and Hessians matrices

    Authors: Michael Mathieu, Yann LeCun

    Abstract: A new method to represent and approximate rotation matrices is introduced. The method represents approximations of a rotation matrix $Q$ with linearithmic complexity, i.e. with $\frac{1}{2}n\lg(n)$ rotations over pairs of coordinates, arranged in an FFT-like fashion. The approximation is "learned" using gradient descent. It allows to represent symmetric matrices $H$ as $QDQ^T$ where $D$ is a diago…

    Submitted 28 April, 2014; originally announced April 2014.

  34. arXiv:1404.0736  [pdf, other]

    cs.CV cs.LG

    Exploiting Linear Structure Within Convolutional Networks for Efficient Evaluation

    Authors: Remi Denton, Wojciech Zaremba, Joan Bruna, Yann LeCun, Rob Fergus

    Abstract: We present techniques for speeding up the test-time evaluation of large convolutional networks, designed for object recognition tasks. These models deliver impressive accuracy but each image evaluation requires millions of floating point operations, making their deployment on smartphones and Internet-scale clusters problematic. The computation is dominated by the convolution operations in the lo…

    Submitted 9 June, 2014; v1 submitted 2 April, 2014; originally announced April 2014.

  35. arXiv:1312.6229  [pdf, ps, other]

    cs.CV

    OverFeat: Integrated Recognition, Localization and Detection using Convolutional Networks

    Authors: Pierre Sermanet, David Eigen, Xiang Zhang, Michael Mathieu, Rob Fergus, Yann LeCun

    Abstract: We present an integrated framework for using Convolutional Networks for classification, localization and detection. We show how a multiscale and sliding window approach can be efficiently implemented within a ConvNet. We also introduce a novel deep learning approach to localization by learning to predict object boundaries. Bounding boxes are then accumulated rather than suppressed in order to incr…

    Submitted 23 February, 2014; v1 submitted 21 December, 2013; originally announced December 2013.

  36. arXiv:1312.6203  [pdf, other]

    cs.LG cs.CV cs.NE

    Spectral Networks and Locally Connected Networks on Graphs

    Authors: Joan Bruna, Wojciech Zaremba, Arthur Szlam, Yann LeCun

    Abstract: Convolutional Neural Networks are extremely efficient architectures in image and audio recognition tasks, thanks to their ability to exploit the local translational invariance of signal classes over their domain. In this paper we consider possible generalizations of CNNs to signals defined on more general domains without the action of a translation group. In particular, we propose two construction…

    Submitted 21 May, 2014; v1 submitted 20 December, 2013; originally announced December 2013.

    Comments: 14 pages

  37. arXiv:1312.5851  [pdf, other]

    cs.CV cs.LG cs.NE

    Fast Training of Convolutional Networks through FFTs

    Authors: Michael Mathieu, Mikael Henaff, Yann LeCun

    Abstract: Convolutional networks are one of the most widely employed architectures in computer vision and machine learning. In order to leverage their ability to learn complex functions, large amounts of data are required for training. Training a large convolutional network to produce state-of-the-art results can take weeks, even when using modern GPUs. Producing labels using a trained network can also be c…

    Submitted 6 March, 2014; v1 submitted 20 December, 2013; originally announced December 2013.

  38. arXiv:1312.1847  [pdf, other]

    cs.LG

    Understanding Deep Architectures using a Recursive Convolutional Network

    Authors: David Eigen, Jason Rolfe, Rob Fergus, Yann LeCun

    Abstract: A key challenge in designing convolutional network models is sizing them appropriately. Many factors are involved in these decisions, including number of layers, feature maps, kernel sizes, etc. Complicating this further is the fact that each of these influence not only the numbers and dimensions of the activation units, but also the total number of parameters. In this paper we focus on assessing…

    Submitted 19 February, 2014; v1 submitted 6 December, 2013; originally announced December 2013.

  39. arXiv:1311.4025  [pdf, ps, other]

    stat.ML

    Signal Recovery from Pooling Representations

    Authors: Joan Bruna, Arthur Szlam, Yann LeCun

    Abstract: In this work we compute lower Lipschitz bounds of $\ell_p$ pooling operators for $p=1, 2, \infty$ as well as $\ell_p$ pooling operators preceded by half-rectification layers. These give sufficient conditions for the design of invertible neural network layers. Numerical experiments on MNIST and image patches confirm that pooling layers can be inverted with phase recovery algorithms. Moreover, the r…

    Submitted 27 February, 2014; v1 submitted 16 November, 2013; originally announced November 2013.

    Comments: 17 pages, 3 figures

  40. arXiv:1301.3775  [pdf, other]

    cs.LG cs.CV

    Discriminative Recurrent Sparse Auto-Encoders

    Authors: Jason Tyler Rolfe, Yann LeCun

    Abstract: We present the discriminative recurrent sparse auto-encoder model, comprising a recurrent encoder of rectified linear units, unrolled for a fixed number of iterations, and connected to two linear decoders that reconstruct the input and predict its supervised classification. Training via backpropagation-through-time initially minimizes an unsupervised sparse reconstruction error; the loss function…

    Submitted 19 March, 2013; v1 submitted 16 January, 2013; originally announced January 2013.

    Comments: Added clarifications suggested by reviewers. 15 pages, 10 figures

  41. arXiv:1301.3764  [pdf, other]

    cs.LG cs.AI stat.ML

    Adaptive learning rates and parallelization for stochastic, sparse, non-smooth gradients

    Authors: Tom Schaul, Yann LeCun

    Abstract: Recent work has established an empirically successful framework for adapting learning rates for stochastic gradient descent (SGD). This effectively removes all needs for tuning, while automatically reducing learning rates over time on stationary problems, and permitting learning rates to grow appropriately in non-stationary tasks. Here, we extend the idea in three directions, addressing proper min…

    Submitted 27 March, 2013; v1 submitted 16 January, 2013; originally announced January 2013.

    Comments: Published at the First International Conference on Learning Representations (ICLR-2013). Public reviews are available at http://openreview.net/document/c14f2204-fd66-4d91-bed4-153523694041#c14f2204-fd66-4d91-bed4-153523694041

  42. arXiv:1301.3577  [pdf, other]

    cs.LG

    Saturating Auto-Encoders

    Authors: Rostislav Goroshin, Yann LeCun

    Abstract: We introduce a simple new regularizer for auto-encoders whose hidden-unit activation functions contain at least one zero-gradient (saturated) region. This regularizer explicitly encourages activations in the saturated region(s) of the corresponding activation function. We call these Saturating Auto-Encoders (SATAE). We show that the saturation regularizer explicitly limits the SATAE's ability to r…

    Submitted 20 March, 2013; v1 submitted 15 January, 2013; originally announced January 2013.

  43. arXiv:1301.3572  [pdf, other]

    cs.CV

    Indoor Semantic Segmentation using depth information

    Authors: Camille Couprie, Clément Farabet, Laurent Najman, Yann LeCun

    Abstract: This work addresses multi-class segmentation of indoor scenes with RGB-D inputs. While this area of research has gained much attention recently, most works still rely on hand-crafted features. In contrast, we apply a multiscale convolutional network to learn features directly from the images and the depth information. We obtain state-of-the-art on the NYU-v2 depth dataset with an accuracy of 64.5%…

    Submitted 14 March, 2013; v1 submitted 15 January, 2013; originally announced January 2013.

    Comments: 8 pages, 3 figures

  44. arXiv:1301.3537  [pdf, ps, other]

    cs.AI math.NA

    Learning Stable Group Invariant Representations with Convolutional Networks

    Authors: Joan Bruna, Arthur Szlam, Yann LeCun

    Abstract: Transformation groups, such as translations or rotations, effectively express part of the variability observed in many recognition problems. The group structure enables the construction of invariant signal representations with appealing mathematical properties, where convolutions, together with pooling operators, bring stability to additive and geometric perturbations of the input. Whereas physica…

    Submitted 15 January, 2013; originally announced January 2013.

    Comments: 4 pages

  45. arXiv:1301.3476  [pdf, other]

    cs.LG cs.CV stat.ML

    Pushing Stochastic Gradient towards Second-Order Methods -- Backpropagation Learning with Transformations in Nonlinearities

    Authors: Tommi Vatanen, Tapani Raiko, Harri Valpola, Yann LeCun

    Abstract: Recently, we proposed to transform the outputs of each hidden neuron in a multi-layer perceptron network to have zero output and zero slope on average, and use separate shortcut connections to model the linear dependencies instead. We continue the work by firstly introducing a third transformation to normalize the scale of the outputs of each hidden neuron, and secondly by analyzing the connection…

    Submitted 11 March, 2013; v1 submitted 15 January, 2013; originally announced January 2013.

    Comments: 10 pages, 5 figures, ICLR2013

  46. arXiv:1301.1671  [pdf, other]

    cs.CV

    Causal graph-based video segmentation

    Authors: Camille Couprie, Clément Farabet, Yann LeCun

    Abstract: Numerous approaches in image processing and computer vision are making use of super-pixels as a pre-processing step. Among the different methods producing such over-segmentation of an image, the graph-based approach of Felzenszwalb and Huttenlocher is broadly employed. One of its interesting properties is that the regions are computed in a greedy manner in quasi-linear time. The algorithm may be t…

    Submitted 8 January, 2013; originally announced January 2013.

    Comments: 6 pages, 5 figures

  47. arXiv:1212.0142  [pdf, ps, other]

    cs.CV cs.LG

    Pedestrian Detection with Unsupervised Multi-Stage Feature Learning

    Authors: Pierre Sermanet, Koray Kavukcuoglu, Soumith Chintala, Yann LeCun

    Abstract: Pedestrian detection is a problem of considerable practical interest. Adding to the list of successful applications of deep learning methods to vision, we report state-of-the-art and competitive results on all major pedestrian datasets with a convolutional network model. The model uses a few new twists, such as multi-stage features, connections that skip layers to integrate global shape informatio…

    Submitted 2 April, 2013; v1 submitted 1 December, 2012; originally announced December 2012.

    Comments: 12 pages

  48. arXiv:1206.1106  [pdf, other]

    stat.ML cs.LG

    No More Pesky Learning Rates

    Authors: Tom Schaul, Sixin Zhang, Yann LeCun

    Abstract: The performance of stochastic gradient descent (SGD) depends critically on how learning rates are tuned and decreased over time. We propose a method to automatically adjust multiple learning rates so as to minimize the expected error at any one time. The method relies on local gradient variations across samples. In our approach, learning rates can increase as well as decrease, making it suitable f…

    Submitted 18 February, 2013; v1 submitted 5 June, 2012; originally announced June 2012.

  49. arXiv:1204.3968  [pdf, ps, other]

    cs.CV cs.LG cs.NE

    Convolutional Neural Networks Applied to House Numbers Digit Classification

    Authors: Pierre Sermanet, Soumith Chintala, Yann LeCun

    Abstract: We classify digits of real-world house numbers using convolutional neural networks (ConvNets). ConvNets are hierarchical feature learning neural networks whose structure is biologically inspired. Unlike many popular vision approaches that are hand-designed, ConvNets can automatically learn a unique set of features optimized for a given task. We augmented the traditional ConvNet architecture by lea…

    Submitted 17 April, 2012; originally announced April 2012.

    Comments: 4 pages, 6 figures, 2 tables

  50. arXiv:1202.6384  [pdf, ps, other]

    cs.CV

    Fast approximations to structured sparse coding and applications to object classification

    Authors: Arthur Szlam, Karol Gregor, Yann LeCun

    Abstract: We describe a method for fast approximation of sparse coding. The input space is subdivided by a binary decision tree, and we simultaneously learn a dictionary and assignment of allowed dictionary elements for each leaf of the tree. We store a lookup table with the assignments and the pseudoinverses for each node, allowing for very fast inference. We give an algorithm for learning the tree, the di…

    Submitted 28 February, 2012; originally announced February 2012.