Showing 1–33 of 33 results for author: Vert, J

  1. arXiv:2211.05641 [pdf, other]

    cs.LG cs.AI stat.ML

    Regression as Classification: Influence of Task Formulation on Neural Network Features

    Authors: Lawrence Stewart, Francis Bach, Quentin Berthet, Jean-Philippe Vert

    Abstract: Neural networks can be trained to solve regression problems by using gradient-based methods to minimize the square loss. However, practitioners often prefer to reformulate regression as a classification problem, observing that training on the cross entropy loss results in better performance. By focusing on two-layer ReLU networks, which can be fully characterized by measures over their feature spa…

    Submitted 1 March, 2023; v1 submitted 10 November, 2022; originally announced November 2022.

  2. arXiv:2206.06929 [pdf, other]

    cs.LG stat.ML

    Scaling ResNets in the Large-depth Regime

    Authors: Pierre Marion, Adeline Fermanian, Gérard Biau, Jean-Philippe Vert

    Abstract: Deep ResNets are recognized for achieving state-of-the-art results in complex machine learning tasks. However, the remarkable performance of these architectures relies on a training procedure that needs to be carefully crafted to avoid vanishing or exploding gradients, particularly as the depth $L$ increases. No consensus has been reached on how to mitigate this issue, although a widely discussed…

    Submitted 10 June, 2024; v1 submitted 14 June, 2022; originally announced June 2022.

    Comments: 44 pages, 9 figures. Updated with clarifications and additional references

  3. arXiv:2106.01202 [pdf, other]

    stat.ML cs.LG

    Framing RNN as a kernel method: A neural ODE approach

    Authors: Adeline Fermanian, Pierre Marion, Jean-Philippe Vert, Gérard Biau

    Abstract: Building on the interpretation of a recurrent neural network (RNN) as a continuous-time neural differential equation, we show, under appropriate conditions, that the solution of a RNN can be viewed as a linear function of a specific feature set of the input sequence, known as the signature. This connection allows us to frame a RNN as a kernel method in a suitable reproducing kernel Hilbert space.…

    Submitted 29 October, 2021; v1 submitted 2 June, 2021; originally announced June 2021.

    Comments: 33 pages, 7 figures, accepted for an oral presentation at NeurIPS 2021

  4. arXiv:2105.15183 [pdf, other]

    cs.LG math.NA stat.ML

    Efficient and Modular Implicit Differentiation

    Authors: Mathieu Blondel, Quentin Berthet, Marco Cuturi, Roy Frostig, Stephan Hoyer, Felipe Llinares-López, Fabian Pedregosa, Jean-Philippe Vert

    Abstract: Automatic differentiation (autodiff) has revolutionized machine learning. It allows to express complex computations by composing elementary ones in creative ways and removes the burden of computing their derivatives by hand. More recently, differentiation of optimization problem solutions has attracted widespread attention with applications such as optimization layers, and in bi-level problems suc…

    Submitted 12 October, 2022; v1 submitted 31 May, 2021; originally announced May 2021.

    Comments: V3: added more related work and Jacobian precision figure

  5. arXiv:2010.08354 [pdf, other]

    cs.LG stat.ML

    Differentiable Divergences Between Time Series

    Authors: Mathieu Blondel, Arthur Mensch, Jean-Philippe Vert

    Abstract: Computing the discrepancy between time series of variable sizes is notoriously challenging. While dynamic time warping (DTW) is popularly used for this purpose, it is not differentiable everywhere and is known to lead to bad local optima when used as a "loss". Soft-DTW addresses these issues, but it is not a positive definite divergence: due to the bias introduced by entropic regularization, it ca…

    Submitted 25 February, 2021; v1 submitted 16 October, 2020; originally announced October 2020.

    Comments: V3: AISTATS 2021 camera-ready

  6. arXiv:2006.06049 [pdf, other]

    cs.LG stat.ML

    On Mixup Regularization

    Authors: Luigi Carratino, Moustapha Cissé, Rodolphe Jenatton, Jean-Philippe Vert

    Abstract: Mixup is a data augmentation technique that creates new examples as convex combinations of training points and labels. This simple technique has empirically shown to improve the accuracy of many state-of-the-art models in different settings and applications, but the reasons behind this empirical success remain poorly understood. In this paper we take a substantial step in explaining the theoretica…

    Submitted 17 October, 2022; v1 submitted 10 June, 2020; originally announced June 2020.

  7. arXiv:2004.12508 [pdf, other]

    stat.ME cs.LG stat.AP

    Noisy Adaptive Group Testing using Bayesian Sequential Experimental Design

    Authors: Marco Cuturi, Olivier Teboul, Quentin Berthet, Arnaud Doucet, Jean-Philippe Vert

    Abstract: When the infection prevalence of a disease is low, Dorfman showed 80 years ago that testing groups of people can prove more efficient than testing people individually. Our goal in this paper is to propose new group testing algorithms that can operate in a noisy setting (tests can be mistaken) to decide adaptively (looking at past results) which groups to test next, with the goal to converge to a g…

    Submitted 22 July, 2020; v1 submitted 26 April, 2020; originally announced April 2020.

    Comments: Latest version, with updated experiments, new conclusions on LBP vs SMC decoding and new approach

  8. arXiv:2002.10837 [pdf, ps, other]

    stat.ME cs.LG stat.ML

    MissDeepCausal: Causal Inference from Incomplete Data Using Deep Latent Variable Models

    Authors: Imke Mayer, Julie Josse, Félix Raimundo, Jean-Philippe Vert

    Abstract: Inferring causal effects of a treatment, intervention or policy from observational data is central to many applications. However, state-of-the-art methods for causal inference seldom consider the possibility that covariates have missing values, which is ubiquitous in many real-world analyses. Missing data greatly complicate causal inference procedures as they require an adapted unconfoundedness hy…

    Submitted 25 February, 2020; originally announced February 2020.

  9. arXiv:2002.08676 [pdf, other]

    cs.LG math.OC stat.ML

    Learning with Differentiable Perturbed Optimizers

    Authors: Quentin Berthet, Mathieu Blondel, Olivier Teboul, Marco Cuturi, Jean-Philippe Vert, Francis Bach

    Abstract: Machine learning pipelines often rely on optimization procedures to make discrete decisions (e.g., sorting, picking closest neighbors, or shortest paths). Although these discrete decisions are easily computed, they break the back-propagation of computational graphs. In order to expand the scope of learning problems that can be solved in an end-to-end fashion, we propose a systematic method to tran…

    Submitted 9 June, 2020; v1 submitted 20 February, 2020; originally announced February 2020.

  10. arXiv:2002.03229 [pdf, other]

    cs.LG stat.ML

    Supervised Quantile Normalization for Low-rank Matrix Approximation

    Authors: Marco Cuturi, Olivier Teboul, Jonathan Niles-Weed, Jean-Philippe Vert

    Abstract: Low rank matrix factorization is a fundamental building block in machine learning, used for instance to summarize gene expression profile data or word-document counts. To be robust to outliers and differences in scale across features, a matrix factorization step is usually preceded by ad-hoc feature normalization steps, such as \texttt{tf-idf} scaling or data whitening. We propose in this work to…

    Submitted 3 July, 2020; v1 submitted 8 February, 2020; originally announced February 2020.

    Comments: new version with genomics experiments

    Journal ref: ICML 2020

  11. arXiv:1910.09036 [pdf, other]

    cs.LG stat.ML

    Differentiable Deep Clustering with Cluster Size Constraints

    Authors: Aude Genevay, Gabriel Dulac-Arnold, Jean-Philippe Vert

    Abstract: Clustering is a fundamental unsupervised learning approach. Many clustering algorithms -- such as $k$-means -- rely on the euclidean distance as a similarity measure, which is often not the most relevant metric for high dimensional data such as images. Learning a lower-dimensional embedding that can better reflect the geometry of the dataset is therefore instrumental for performance. We propose a…

    Submitted 20 October, 2019; originally announced October 2019.

  12. arXiv:1909.09819 [pdf, other]

    stat.ML cs.LG

    ASNI: Adaptive Structured Noise Injection for shallow and deep neural networks

    Authors: Beyrem Khalfaoui, Joseph Boyd, Jean-Philippe Vert

    Abstract: Dropout is a regularisation technique in neural network training where unit activations are randomly set to zero with a given probability \emph{independently}. In this work, we propose a generalisation of dropout and other multiplicative noise injection schemes for shallow and deep neural networks, where the random noise applied to different units is not independent but follows a joint distributio…

    Submitted 21 September, 2019; originally announced September 2019.

    Comments: All code concerning the real data experiments is available at https://github.com/BeyremKh/ASNI

  13. arXiv:1905.12909 [pdf, other]

    cs.LG stat.ML

    Deep multi-class learning from label proportions

    Authors: Gabriel Dulac-Arnold, Neil Zeghidour, Marco Cuturi, Lucas Beyer, Jean-Philippe Vert

    Abstract: We propose a learning algorithm capable of learning from label proportions instead of direct data labels. In this scenario, our data are arranged into various bags of a certain size, and only the proportions of each label within a given bag are known. This is a common situation in cases where per-data labeling is lengthy, but a more general label is easily accessible. Several approaches have been…

    Submitted 26 June, 2019; v1 submitted 30 May, 2019; originally announced May 2019.

  14. arXiv:1905.11885 [pdf, other]

    cs.LG stat.ML

    Differentiable Ranks and Sorting using Optimal Transport

    Authors: Marco Cuturi, Olivier Teboul, Jean-Philippe Vert

    Abstract: Sorting an array is a fundamental routine in machine learning, one that is used to compute rank-based statistics, cumulative distribution functions (CDFs), quantiles, or to select closest neighbors and labels. The sorting function is however piece-wise constant (the sorting permutation of a vector does not change if the entries of that vector are infinitesimally perturbed) and therefore has no gra…

    Submitted 2 November, 2019; v1 submitted 28 May, 2019; originally announced May 2019.

  15. arXiv:1805.07943 [pdf, other]

    cs.LG stat.ML

    Relating Leverage Scores and Density using Regularized Christoffel Functions

    Authors: Edouard Pauwels, Francis Bach, Jean-Philippe Vert

    Abstract: Statistical leverage scores emerged as a fundamental tool for matrix sketching and column sampling with applications to low rank approximation, regression, random feature learning and quadrature. Yet, the very nature of this quantity is barely understood. Borrowing ideas from the orthogonal polynomial literature, we introduce the regularized Christoffel function associated to a positive definite k…

    Submitted 21 November, 2018; v1 submitted 21 May, 2018; originally announced May 2018.

  16. arXiv:1802.09381 [pdf, other]

    q-bio.QM cs.CV q-bio.GN stat.ML

    DropLasso: A robust variant of Lasso for single cell RNA-seq data

    Authors: Beyrem Khalfaoui, Jean-Philippe Vert

    Abstract: Single-cell RNA sequencing (scRNA-seq) is a fast growing approach to measure the genome-wide transcriptome of many individual cells in parallel, but results in noisy data with many dropout events. Existing methods to learn molecular signatures from bulk transcriptomic data may therefore not be adapted to scRNA-seq data, in order to automatically classify individual cells into predefined classes. W…

    Submitted 26 February, 2018; originally announced February 2018.

  17. arXiv:1802.08526 [pdf, other]

    stat.ML cs.LG

    The Weighted Kendall and High-order Kernels for Permutations

    Authors: Yunlong Jiao, Jean-Philippe Vert

    Abstract: We propose new positive definite kernels for permutations. First we introduce a weighted version of the Kendall kernel, which allows to weight unequally the contributions of different item pairs in the permutations depending on their ranks. Like the Kendall kernel, we show that the weighted version is invariant to relabeling of items and can be computed efficiently in $O(n \ln(n))$ operations, whe…

    Submitted 12 June, 2018; v1 submitted 23 February, 2018; originally announced February 2018.

    Comments: Published in ICML 2018

  18. arXiv:1802.05980 [pdf, other]

    q-bio.QM cs.LG stat.ML

    WHInter: A Working set algorithm for High-dimensional sparse second order Interaction models

    Authors: Marine Le Morvan, Jean-Philippe Vert

    Abstract: Learning sparse linear models with two-way interactions is desirable in many application domains such as genomics. l1-regularised linear models are popular to estimate sparse models, yet standard implementations fail to address specifically the quadratic explosion of candidate two-way interactions in high dimensions, and typically do not scale to genetic data with hundreds of thousands of features…

    Submitted 16 February, 2018; originally announced February 2018.

  19. arXiv:1706.00244 [pdf, other]

    stat.ML cs.LG q-bio.QM

    Supervised Quantile Normalisation

    Authors: Marine Le Morvan, Jean-Philippe Vert

    Abstract: Quantile normalisation is a popular normalisation method for data subject to unwanted variations such as images, speech, or genomic data. It applies a monotonic transformation to the feature values of each sample to ensure that after normalisation, they follow the same target distribution for each sample. Choosing a "good" target distribution remains however largely empirical and heuristic, and is…

    Submitted 1 June, 2017; originally announced June 2017.

  20. arXiv:1506.07251 [pdf, other]

    stat.ML cs.LG q-bio.QM

    Benchmark of structured machine learning methods for microbial identification from mass-spectrometry data

    Authors: Kévin Vervier, Pierre Mahé, Jean-Baptiste Veyrieras, Jean-Philippe Vert

    Abstract: Microbial identification is a central issue in microbiology, in particular in the fields of infectious diseases diagnosis and industrial quality control. The concept of species is tightly linked to the concept of biological and clinical classification where the proximity between species is generally measured in terms of evolutionary distances and/or clinical phenotypes. Surprisingly, the informati…

    Submitted 24 June, 2015; originally announced June 2015.

  21. arXiv:1505.06915 [pdf, other]

    q-bio.QM cs.CE cs.LG q-bio.GN stat.ML

    Large-scale Machine Learning for Metagenomics Sequence Classification

    Authors: Kévin Vervier, Pierre Mahé, Maud Tournoud, Jean-Baptiste Veyrieras, Jean-Philippe Vert

    Abstract: Metagenomics characterizes the taxonomic diversity of microbial communities by sequencing DNA directly from an environmental sample. One of the main challenges in metagenomics data analysis is the binning step, where each sequenced read is assigned to a taxonomic clade. Due to the large volume of metagenomics datasets, binning methods need fast and accurate algorithms that can operate with reasona…

    Submitted 26 May, 2015; originally announced May 2015.

  22. arXiv:1407.5158 [pdf, ps, other]

    stat.ML cs.LG math.ST

    Tight convex relaxations for sparse matrix factorization

    Authors: Emile Richard, Guillaume Obozinski, Jean-Philippe Vert

    Abstract: Based on a new atomic norm, we propose a new convex formulation for sparse matrix factorization problems in which the number of nonzero elements of the factors is assumed fixed and known. The formulation counts sparse PCA with multiple factors, subspace clustering and low-rank sparse bilinear regression as potential applications. We compute slow rates and an upper bound on the statistical dimensio…

    Submitted 4 December, 2014; v1 submitted 19 July, 2014; originally announced July 2014.

  23. arXiv:1110.0413 [pdf, other]

    stat.ML cs.LG

    Group Lasso with Overlaps: the Latent Group Lasso approach

    Authors: Guillaume Obozinski, Laurent Jacob, Jean-Philippe Vert

    Abstract: We study a norm for structured sparsity which leads to sparse linear predictors whose supports are unions of predefined overlapping groups of variables. We call the obtained formulation latent group Lasso, since it is based on applying the usual group Lasso penalty on a set of latent variables. A detailed analysis of the norm and its properties is presented and we characterize conditions under whic…

    Submitted 3 October, 2011; originally announced October 2011.

  24. arXiv:1004.4965 [pdf, ps, other]

    stat.ML cs.CV

    Many-to-Many Graph Matching: a Continuous Relaxation Approach

    Authors: Mikhail Zaslavskiy, Francis Bach, Jean-Philippe Vert

    Abstract: Graphs provide an efficient tool for object representation in various computer vision applications. Once graph-based representations are constructed, an important question is how to compare graphs. This problem is often formulated as a graph matching problem where one seeks a mapping between vertices of two graphs which optimally aligns their structure. In the classical formulation of graph matchi…

    Submitted 28 April, 2010; originally announced April 2010.

    Comments: 19

  25. arXiv:0809.2085 [pdf, ps, other]

    cs.LG

    Clustered Multi-Task Learning: A Convex Formulation

    Authors: Laurent Jacob, Francis Bach, Jean-Philippe Vert

    Abstract: In multi-task learning several related tasks are considered simultaneously, with the hope that by an appropriate sharing of information across tasks, each task may benefit from the others. In the context of learning linear functions for supervised classification or regression, this can be achieved by including a priori information about the weight vectors associated with the tasks, and how they…

    Submitted 11 September, 2008; originally announced September 2008.

  26. arXiv:0802.1430 [pdf, ps, other]

    cs.LG

    A New Approach to Collaborative Filtering: Operator Estimation with Spectral Regularization

    Authors: Jacob Abernethy, Francis Bach, Theodoros Evgeniou, Jean-Philippe Vert

    Abstract: We present a general approach for collaborative filtering (CF) using spectral regularization to learn linear operators from "users" to the "objects" they rate. Recent low-rank type matrix completion approaches to CF are shown to be special cases. However, unlike existing regularization based CF methods, our approach can be used to also incorporate information such as attributes of the users or t…

    Submitted 19 December, 2008; v1 submitted 11 February, 2008; originally announced February 2008.

  27. arXiv:0801.4061 [pdf, ps, other]

    cs.LG

    The optimal assignment kernel is not positive definite

    Authors: Jean-Philippe Vert

    Abstract: We prove that the optimal assignment kernel, proposed recently as an attempt to embed labeled graphs and more generally tuples of basic data to a Hilbert space, is in fact not always positive definite.

    Submitted 26 January, 2008; originally announced January 2008.

  28. arXiv:0801.3654 [pdf, ps, other]

    cs.CV cs.DM

    A path following algorithm for the graph matching problem

    Authors: Mikhail Zaslavskiy, Francis Bach, Jean-Philippe Vert

    Abstract: We propose a convex-concave programming approach for the labeled weighted graph matching problem. The convex-concave programming formulation is obtained by rewriting the weighted graph matching problem as a least-square problem on the set of permutation matrices and relaxing it to two different optimization problems: a quadratic convex and a quadratic concave optimization problem on the set of d…

    Submitted 27 October, 2008; v1 submitted 23 January, 2008; originally announced January 2008.

    Comments: 23 pages, 13 figures, typo correction, new results in sections 4, 5, 6

  29. arXiv:0708.0171 [pdf, ps, other]

    q-bio.QM cs.LG

    Virtual screening with support vector machines and structure kernels

    Authors: Pierre Mahé, Jean-Philippe Vert

    Abstract: Support vector machines and kernel methods have recently gained considerable attention in chemoinformatics. They offer generally good performance for problems of supervised classification or regression, and provide a flexible and computationally efficient framework to include relevant information and prior knowledge about the data and problems to be handled. In particular, with kernel methods mo…

    Submitted 1 August, 2007; originally announced August 2007.

  30. arXiv:cs/0611124 [pdf, ps, other]

    cs.LG cs.AI cs.IR

    Low-rank matrix factorization with attributes

    Authors: Jacob Abernethy, Francis Bach, Theodoros Evgeniou, Jean-Philippe Vert

    Abstract: We develop a new collaborative filtering (CF) method that combines both previously known users' preferences, i.e. standard CF, as well as product/user attributes, i.e. classical function approximation, to predict a given user's interest in a particular product. Our method is a generalized low rank matrix completion problem, where we learn a function whose inputs are pairs of vectors -- the stand…

    Submitted 24 November, 2006; originally announced November 2006.

    Comments: 12 pages, 2 figures

    Report number: N-24/06/MM

  31. arXiv:q-bio/0610040 [pdf, ps, other]

    q-bio.QM cs.LG

    Metric learning pairwise kernel for graph inference

    Authors: Jean-Philippe Vert, Jian Qiu, William Stafford Noble

    Abstract: Much recent work in bioinformatics has focused on the inference of various types of biological networks, representing gene regulation, metabolic processes, protein-protein interactions, etc. A common setting involves inferring network edges in a supervised fashion from a set of high-confidence edges, possibly characterized by multiple, heterogeneous data sets (protein sequence, gene expression,…

    Submitted 21 October, 2006; originally announced October 2006.

  32. A kernel for time series based on global alignments

    Authors: Marco Cuturi, Jean-Philippe Vert, Oystein Birkenes, Tomoko Matsui

    Abstract: We propose in this paper a new family of kernels to handle time series, notably speech data, within the framework of kernel methods which includes popular algorithms such as the Support Vector Machine. These kernels elaborate on the well known Dynamic Time Warping (DTW) family of distances by considering the same set of elementary operations, namely substitutions and repetitions of tokens, to m…

    Submitted 6 October, 2006; originally announced October 2006.

  33. arXiv:cs/9809006 [pdf, ps]

    cs.OS cs.DC

    The Design and Architecture of the Microsoft Cluster Service -- A Practical Approach to High-Availability and Scalability

    Authors: Werner Vogels, Dan Dumitriu, Ken Birman, Rod Gamache, Mike Massa, Rob Short, John Vert, Joe Barrera

    Abstract: Microsoft Cluster Service (MSCS) extends the Windows NT operating system to support high-availability services. The goal is to offer an execution environment where off-the-shelf server applications can continue to operate, even in the presence of node failures. Later versions of MSCS will provide scalability via a node and application management system that allows applications to scale to hund…

    Submitted 2 September, 1998; originally announced September 1998.

    Comments: Original document at: http://research.microsoft.com/~gray/MSCS_FTCS98.doc

    Report number: Microsoft Research MSR-TR-98-16

    ACM Class: C.4; C.5; D.4.5

    Journal ref: Proceedings of FTCS'98, June 23-25, 1998 in Munich, Germany