Showing 1–50 of 80 results for author: Pontil, M

  1. arXiv:2407.01171  [pdf, other]

    cs.LG math.ST stat.ME stat.ML

    Neural Conditional Probability for Inference

    Authors: Vladimir R. Kostic, Karim Lounici, Gregoire Pacreau, Pietro Novelli, Giacomo Turri, Massimiliano Pontil

    Abstract: We introduce NCP (Neural Conditional Probability), a novel operator-theoretic approach for learning conditional distributions with a particular focus on inference tasks. NCP can be used to build conditional confidence regions and extract important statistics like conditional quantiles, mean, and covariance. It offers streamlined learning through a single unconditional training phase, facilitating…

    Submitted 1 July, 2024; originally announced July 2024.

  2. arXiv:2406.19861  [pdf, other]

    cs.LG math.OC stat.ML

    Operator World Models for Reinforcement Learning

    Authors: Pietro Novelli, Marco Pratticò, Massimiliano Pontil, Carlo Ciliberto

    Abstract: Policy Mirror Descent (PMD) is a powerful and theoretically sound methodology for sequential decision-making. However, it is not directly applicable to Reinforcement Learning (RL) due to the inaccessibility of explicit action-value functions. We address this challenge by introducing a novel approach based on learning a world model of the environment using conditional mean embeddings. We then lever…

    Submitted 28 June, 2024; originally announced June 2024.

  3. arXiv:2406.09028  [pdf, other]

    cs.LG physics.chem-ph

    From Biased to Unbiased Dynamics: An Infinitesimal Generator Approach

    Authors: Timothée Devergne, Vladimir Kostic, Michele Parrinello, Massimiliano Pontil

    Abstract: We investigate learning the eigenfunctions of evolution operators for time-reversal invariant stochastic processes, a prime example being the Langevin equation used in molecular dynamics. Many physical or chemical processes described by this equation involve transitions between metastable states separated by high potential barriers that can hardly be crossed during a simulation. To overcome this b…

    Submitted 13 June, 2024; originally announced June 2024.

  4. arXiv:2406.05714  [pdf, ps, other]

    stat.ML cs.LG math.ST

    Contextual Continuum Bandits: Static Versus Dynamic Regret

    Authors: Arya Akhavan, Karim Lounici, Massimiliano Pontil, Alexandre B. Tsybakov

    Abstract: We study the contextual continuum bandits problem, where the learner sequentially receives a side information vector and has to choose an action in a convex set, minimizing a function associated to the context. The goal is to minimize all the underlying functions for the received contexts, leading to a dynamic (contextual) notion of regret, which is stronger than the standard static regret. Assumi…

    Submitted 20 June, 2024; v1 submitted 9 June, 2024; originally announced June 2024.

  5. arXiv:2405.12940  [pdf, other]

    stat.ML cs.LG math.PR

    Learning the Infinitesimal Generator of Stochastic Diffusion Processes

    Authors: Vladimir R. Kostic, Karim Lounici, Helene Halconruy, Timothee Devergne, Massimiliano Pontil

    Abstract: We address data-driven learning of the infinitesimal generator of stochastic diffusion processes, essential for understanding numerical simulations of natural and physical systems. The unbounded nature of the generator poses significant challenges, rendering conventional analysis techniques for Hilbert-Schmidt operators ineffective. To overcome this, we introduce a novel framework based on the ene…

    Submitted 21 May, 2024; originally announced May 2024.

    Comments: 38 pages, 3 figures

    MSC Class: 62M15

  6. arXiv:2403.17320  [pdf, other]

    cs.RO

    Leveraging Symmetry in RL-based Legged Locomotion Control

    Authors: Zhi Su, Xiaoyu Huang, Daniel Ordoñez-Apraez, Yunfei Li, Zhongyu Li, Qiayuan Liao, Giulio Turrisi, Massimiliano Pontil, Claudio Semini, Yi Wu, Koushil Sreenath

    Abstract: Model-free reinforcement learning is a promising approach for autonomously solving challenging robotics control problems, but faces exploration difficulty without information of the robot's kinematics and dynamics morphology. The under-exploration of multiple modalities with symmetric states leads to behaviors that are often unnatural and sub-optimal. This issue becomes particularly pronounced in…

    Submitted 26 March, 2024; v1 submitted 25 March, 2024; originally announced March 2024.

  7. arXiv:2403.11687  [pdf, other]

    stat.ML cs.LG math.OC

    Nonsmooth Implicit Differentiation: Deterministic and Stochastic Convergence Rates

    Authors: Riccardo Grazzi, Massimiliano Pontil, Saverio Salzo

    Abstract: We study the problem of efficiently computing the derivative of the fixed-point of a parametric nondifferentiable contraction map. This problem has wide applications in machine learning, including hyperparameter optimization, meta-learning and data poisoning attacks. We analyze two popular approaches: iterative differentiation (ITD) and approximate implicit differentiation (AID). A key challenge b…

    Submitted 4 June, 2024; v1 submitted 18 March, 2024; originally announced March 2024.

    Comments: ICML 2024. Code at github.com/prolearner/nonsmooth_implicit_diff

  8. arXiv:2402.15552  [pdf, other]

    cs.RO cs.AI eess.SY

    Morphological Symmetries in Robotics

    Authors: Daniel Ordoñez-Apraez, Giulio Turrisi, Vladimir Kostic, Mario Martin, Antonio Agudo, Francesc Moreno-Noguer, Massimiliano Pontil, Claudio Semini, Carlos Mastalli

    Abstract: We present a comprehensive framework for studying and leveraging morphological symmetries in robotic systems. These are intrinsic properties of the robot's morphology, frequently observed in animal biology and robotics, which stem from the replication of kinematic structures and the symmetrical distribution of mass. We illustrate how these symmetries extend to the robot's state space and both prop…

    Submitted 4 June, 2024; v1 submitted 23 February, 2024; originally announced February 2024.

    Comments: 18 pages, 11 figures

  9. arXiv:2312.17348  [pdf, other]

    cs.LG math.NA stat.ML

    A randomized algorithm to solve reduced rank operator regression

    Authors: Giacomo Turri, Vladimir Kostic, Pietro Novelli, Massimiliano Pontil

    Abstract: We present and analyze an algorithm designed for addressing vector-valued regression problems involving possibly infinite-dimensional input and output spaces. The algorithm is a randomized adaptation of reduced rank regression, a technique to optimally learn a low-rank vector-valued function (i.e. an operator) between sampled data via regularized empirical risk minimization with rank constraints.…

    Submitted 28 December, 2023; originally announced December 2023.

    Comments: 19 pages, 3 figures, 1 table

  10. arXiv:2312.13426  [pdf, other]

    stat.ML cs.LG math.DS

    Consistent Long-Term Forecasting of Ergodic Dynamical Systems

    Authors: Prune Inzerilli, Vladimir Kostic, Karim Lounici, Pietro Novelli, Massimiliano Pontil

    Abstract: We study the evolution of distributions under the action of an ergodic dynamical system, which may be stochastic in nature. By employing tools from Koopman and transfer operator theory one can evolve any initial distribution of the state forward in time, and we investigate how estimators of these operators perform on long-term forecasting. Motivated by the observation that standard estimators may…

    Submitted 20 December, 2023; originally announced December 2023.

  11. arXiv:2312.07457  [pdf, other]

    cs.RO cs.AI cs.LG eess.SY

    Dynamics Harmonic Analysis of Robotic Systems: Application in Data-Driven Koopman Modelling

    Authors: Daniel Ordoñez-Apraez, Vladimir Kostic, Giulio Turrisi, Pietro Novelli, Carlos Mastalli, Claudio Semini, Massimiliano Pontil

    Abstract: We introduce the use of harmonic analysis to decompose the state space of symmetric robotic systems into orthogonal isotypic subspaces. These are lower-dimensional spaces that capture distinct, symmetric, and synergistic motions. For linear dynamics, we characterize how this decomposition leads to a subdivision of the dynamics into independent linear systems on each subspace, a property we term dy…

    Submitted 4 June, 2024; v1 submitted 12 December, 2023; originally announced December 2023.

    MSC Class: 43-08

  12. arXiv:2307.09912  [pdf, other]

    cs.LG

    Learning invariant representations of time-homogeneous stochastic dynamical systems

    Authors: Vladimir R. Kostic, Pietro Novelli, Riccardo Grazzi, Karim Lounici, Massimiliano Pontil

    Abstract: We consider the general class of time-homogeneous stochastic dynamical systems, both discrete and continuous, and study the problem of learning a representation of the state that faithfully captures its dynamics. This is instrumental to learning the transfer operator or the generator of the system, which in turn can be used for numerous tasks, such as forecasting and interpreting the system dynami…

    Submitted 14 March, 2024; v1 submitted 19 July, 2023; originally announced July 2023.

  13. arXiv:2306.04520  [pdf, other]

    stat.ML cs.LG math.DS

    Estimating Koopman operators with sketching to provably learn large scale dynamical systems

    Authors: Giacomo Meanti, Antoine Chatalic, Vladimir R. Kostic, Pietro Novelli, Massimiliano Pontil, Lorenzo Rosasco

    Abstract: The theory of Koopman operators allows to deploy non-parametric machine learning algorithms to predict and analyze complex dynamical systems. Estimators such as principal component regression (PCR) or reduced rank regression (RRR) in kernel spaces can be shown to provably learn Koopman operators from finite empirical observations of the system's time evolution. Scaling these approaches to very lon…

    Submitted 30 July, 2023; v1 submitted 7 June, 2023; originally announced June 2023.

    Comments: 9 pages, 4 figures, code at https://github.com/Giodiro/NystromKoopman

  14. arXiv:2306.01589  [pdf, other]

    cs.LG physics.chem-ph

    Transfer learning for atomistic simulations using GNNs and kernel mean embeddings

    Authors: John Falk, Luigi Bonati, Pietro Novelli, Michele Parrinello, Massimiliano Pontil

    Abstract: Interatomic potentials learned using machine learning methods have been successfully applied to atomistic simulations. However, accurate models require large training datasets, while generating reference calculations is computationally demanding. To bypass this difficulty, we propose a transfer learning algorithm that leverages the ability of graph neural networks (GNNs) to represent chemical envi…

    Submitted 20 January, 2024; v1 submitted 2 June, 2023; originally announced June 2023.

    Comments: 20 pages, 4 figures, 7 tables, published in NeurIPS 2023

  15. arXiv:2302.02004  [pdf, other]

    cs.LG math.DS

    Sharp Spectral Rates for Koopman Operator Learning

    Authors: Vladimir Kostic, Karim Lounici, Pietro Novelli, Massimiliano Pontil

    Abstract: Nonlinear dynamical systems can be handily described by the associated Koopman operator, whose action evolves every observable of the system forward in time. Learning the Koopman operator and its spectral decomposition from data is enabled by a number of algorithms. In this work we present for the first time non-asymptotic learning bounds for the Koopman eigenvalues and eigenfunctions. We focus on…

    Submitted 8 November, 2023; v1 submitted 3 February, 2023; originally announced February 2023.

    Comments: Accepted to the thirty-seventh Conference on Neural Information Processing Systems (NeurIPS 2023)

  16. arXiv:2212.11702  [pdf, other]

    cs.LG stat.ML

    Robust Meta-Representation Learning via Global Label Inference and Classification

    Authors: Ruohan Wang, Isak Falk, Massimiliano Pontil, Carlo Ciliberto

    Abstract: Few-shot learning (FSL) is a central problem in meta-learning, where learners must efficiently learn from few labeled examples. Within FSL, feature pre-training has recently become an increasingly popular strategy to significantly improve generalization performance. However, the contribution of pre-training is often overlooked and understudied, with limited theoretical understanding of its impact…

    Submitted 5 November, 2023; v1 submitted 22 December, 2022; originally announced December 2022.

    Comments: 23 pages, 4 figures

  17. arXiv:2210.05561  [pdf, other]

    cs.LG

    Schedule-Robust Online Continual Learning

    Authors: Ruohan Wang, Marco Ciccone, Giulia Luise, Andrew Yapp, Massimiliano Pontil, Carlo Ciliberto

    Abstract: A continual learning (CL) algorithm learns from a non-stationary data stream. The non-stationarity is modeled by some schedule that determines how data is presented over time. Most current methods make strong assumptions on the schedule and have unpredictable performance when such requirements are not met. A key challenge in CL is thus to design methods robust against arbitrary schedules over the…

    Submitted 14 October, 2022; v1 submitted 11 October, 2022; originally announced October 2022.

  18. arXiv:2206.03150  [pdf, other]

    stat.ML cs.LG

    Group Meritocratic Fairness in Linear Contextual Bandits

    Authors: Riccardo Grazzi, Arya Akhavan, John Isak Texas Falk, Leonardo Cella, Massimiliano Pontil

    Abstract: We study the linear contextual bandit problem where an agent has to select one candidate from a pool and each candidate belongs to a sensitive group. In this setting, candidates' rewards may not be directly comparable between groups, for example when the agent is an employer hiring candidates from different ethnic groups and some groups have a lower reward due to discriminatory bias and/or social…

    Submitted 20 December, 2022; v1 submitted 7 June, 2022; originally announced June 2022.

    Comments: NeurIPS 2022. Code for the experiments at https://github.com/CSML-IIT-UCL/GMFbandits

  19. arXiv:2205.15100  [pdf, other]

    cs.LG stat.ML

    Meta Representation Learning with Contextual Linear Bandits

    Authors: Leonardo Cella, Karim Lounici, Massimiliano Pontil

    Abstract: Meta-learning seeks to build algorithms that rapidly learn how to solve new learning problems based on previous experience. In this paper we investigate meta-learning in the setting of stochastic linear bandit tasks. We assume that the tasks share a low dimensional representation, which has been partially acquired from previous learning tasks. We aim to leverage this information in order to learn…

    Submitted 30 May, 2022; originally announced May 2022.

  20. arXiv:2205.14027  [pdf, other]

    cs.LG math.DS

    Learning Dynamical Systems via Koopman Operator Regression in Reproducing Kernel Hilbert Spaces

    Authors: Vladimir Kostic, Pietro Novelli, Andreas Maurer, Carlo Ciliberto, Lorenzo Rosasco, Massimiliano Pontil

    Abstract: We study a class of dynamical systems modelled as Markov chains that admit an invariant distribution via the corresponding transfer, or Koopman, operator. While data-driven algorithms to reconstruct such operators are well known, their relationship with statistical learning is largely unexplored. We formalize a framework to learn the Koopman operator from finite data trajectories of the dynamical…

    Submitted 13 December, 2022; v1 submitted 27 May, 2022; originally announced May 2022.

    Comments: Main text: 10 pages, 2 figures, 1 table. Supplementary informations: 18 pages, 5 figures, 2 tables

  21. arXiv:2204.07391  [pdf, other]

    physics.comp-ph cs.LG physics.chem-ph

    Characterizing metastable states with the help of machine learning

    Authors: Pietro Novelli, Luigi Bonati, Massimiliano Pontil, Michele Parrinello

    Abstract: Present-day atomistic simulations generate long trajectories of ever more complex systems. Analyzing these data, discovering metastable states, and uncovering their nature is becoming increasingly challenging. In this paper, we first use the variational approach to conformation dynamics to discover the slowest dynamical modes of the simulations. This allows the different metastable states of the s…

    Submitted 15 April, 2022; originally announced April 2022.

    Comments: Main text: 10 pages, 4 figures. Supplementary Info: 4 pages, 5 figures

  22. arXiv:2202.10066  [pdf, other]

    stat.ML cs.LG

    Multi-task Representation Learning with Stochastic Linear Bandits

    Authors: Leonardo Cella, Karim Lounici, Grégoire Pacreau, Massimiliano Pontil

    Abstract: We study the problem of transfer-learning in the setting of stochastic linear bandit tasks. We consider that a low dimensional linear representation is shared across the tasks, and study the benefit of learning this representation in the multi-task learning setting. Following recent results to design stochastic bandit policies, we propose an efficient greedy policy based on trace norm regularizati…

    Submitted 15 August, 2023; v1 submitted 21 February, 2022; originally announced February 2022.

  23. arXiv:2202.03926  [pdf, other]

    stat.ML cs.LG

    Distribution Regression with Sliced Wasserstein Kernels

    Authors: Dimitri Meunier, Massimiliano Pontil, Carlo Ciliberto

    Abstract: The problem of learning functions over spaces of probabilities - or distribution regression - is gaining significant interest in the machine learning community. A key challenge behind this problem is to identify a suitable representation capturing all relevant properties of the underlying functional mapping. A principled approach to distribution regression is provided by kernel mean embeddings, wh…

    Submitted 17 June, 2022; v1 submitted 8 February, 2022; originally announced February 2022.

  24. arXiv:2202.03397  [pdf, other]

    stat.ML cs.LG math.OC

    Bilevel Optimization with a Lower-level Contraction: Optimal Sample Complexity without Warm-start

    Authors: Riccardo Grazzi, Massimiliano Pontil, Saverio Salzo

    Abstract: We analyse a general class of bilevel problems, in which the upper-level problem consists in the minimization of a smooth objective function and the lower-level problem is to find the fixed point of a smooth contraction map. This type of problems include instances of meta-learning, equilibrium models, hyperparameter optimization and data poisoning adversarial attacks. Several recent works have pro…

    Submitted 16 November, 2023; v1 submitted 7 February, 2022; originally announced February 2022.

    Comments: Corrected Remark 18 + other small edits. Code at https://github.com/CSML-IIT-UCL/bioptexps

    Journal ref: Journal of Machine Learning Research, volume 24, number 167, pages 1-37, year 2023

  25. arXiv:2112.00838  [pdf, ps, other]

    stat.ML cs.LG math.OC

    Convergence of Batch Greenkhorn for Regularized Multimarginal Optimal Transport

    Authors: Vladimir Kostic, Saverio Salzo, Massimiliano Pontil

    Abstract: In this work we propose a batch version of the Greenkhorn algorithm for multimarginal regularized optimal transport problems. Our framework is general enough to cover, as particular cases, some existing algorithms like Sinkhorn and Greenkhorn algorithm for the bi-marginal setting, and (greedy) MultiSinkhorn for multimarginal optimal transport. We provide a complete convergence analysis, which is b…

    Submitted 3 December, 2021; v1 submitted 1 December, 2021; originally announced December 2021.

    Comments: 30 pages

  26. arXiv:2108.04055  [pdf, other]

    cs.LG stat.ML

    The Role of Global Labels in Few-Shot Classification and How to Infer Them

    Authors: Ruohan Wang, Massimiliano Pontil, Carlo Ciliberto

    Abstract: Few-shot learning is a central problem in meta-learning, where learners must quickly adapt to new tasks given limited training data. Recently, feature pre-training has become a ubiquitous component in state-of-the-art meta-learning methods and is shown to provide significant performance improvement. However, there is limited theoretical understanding of the connection between pre-training and meta…

    Submitted 27 October, 2021; v1 submitted 9 August, 2021; originally announced August 2021.

    Comments: Conference on Neural Information Processing Systems 2021

  27. arXiv:2106.02393  [pdf, other]

    cs.LG

    Multitask Online Mirror Descent

    Authors: Nicolò Cesa-Bianchi, Pierre Laforgue, Andrea Paudice, Massimiliano Pontil

    Abstract: We introduce and analyze MT-OMD, a multitask generalization of Online Mirror Descent (OMD) which operates by sharing updates between tasks. We prove that the regret of MT-OMD is of order $\sqrt{1 + σ^2(N-1)}\sqrt{T}$, where $σ^2$ is the task variance according to the geometry induced by the regularizer, $N$ is the number of tasks, and $T$ is the time horizon. Whenever tasks are similar, that is…

    Submitted 1 November, 2022; v1 submitted 4 June, 2021; originally announced June 2021.

  28. arXiv:2103.16277  [pdf, other]

    cs.LG

    Conditional Meta-Learning of Linear Representations

    Authors: Giulia Denevi, Massimiliano Pontil, Carlo Ciliberto

    Abstract: Standard meta-learning for representation learning aims to find a common representation to be shared across multiple tasks. The effectiveness of these methods is often limited when the nuances of the tasks' distribution cannot be captured by a single representation. In this work we overcome this issue by inferring a conditioning function, mapping the tasks' side information (such as the tasks' tra…

    Submitted 30 March, 2021; originally announced March 2021.

  29. arXiv:2102.06304  [pdf, ps, other]

    math.PR cs.LG stat.ML

    Some Hoeffding- and Bernstein-type Concentration Inequalities

    Authors: Andreas Maurer, Massimiliano Pontil

    Abstract: We prove concentration inequalities for functions of independent random variables under sub-gaussian and sub-exponential conditions. The utility of the inequalities is demonstrated by an extension of the now classical method of Rademacher complexities to Lipschitz function classes and unbounded sub-exponential distribution.

    Submitted 23 June, 2021; v1 submitted 11 February, 2021; originally announced February 2021.

  30. arXiv:2012.07399  [pdf, other]

    cs.LG

    Robust Unsupervised Learning via L-Statistic Minimization

    Authors: Andreas Maurer, Daniela A. Parletta, Andrea Paudice, Massimiliano Pontil

    Abstract: Designing learning algorithms that are resistant to perturbations of the underlying data distribution is a problem of wide practical and theoretical importance. We present a general approach to this problem focusing on unsupervised learning. The key assumption is that the perturbing distribution is characterized by larger losses relative to a given class of admissible models. This is exploited by…

    Submitted 18 February, 2021; v1 submitted 14 December, 2020; originally announced December 2020.

    Comments: We have just uploaded a new version of the paper with a more relevant title "Robust Unsupervised Learning via L-statistic Minimization"

  31. arXiv:2012.03522  [pdf, ps, other]

    stat.ML cs.LG

    Online Model Selection: a Rested Bandit Formulation

    Authors: Leonardo Cella, Claudio Gentile, Massimiliano Pontil

    Abstract: Motivated by a natural problem in online model selection with bandit information, we introduce and analyze a best arm identification problem in the rested bandit setting, wherein arm expected losses decrease with the number of times the arm has been played. The shape of the expected loss functions is similar across arms, and is assumed to be available up to unknown parameters that have to be learn…

    Submitted 7 December, 2020; originally announced December 2020.

  32. arXiv:2011.07122  [pdf, other]

    stat.ML cs.LG

    Convergence Properties of Stochastic Hypergradients

    Authors: Riccardo Grazzi, Massimiliano Pontil, Saverio Salzo

    Abstract: Bilevel optimization problems are receiving increasing attention in machine learning as they provide a natural framework for hyperparameter optimization and meta-learning. A key step to tackle these problems is the efficient computation of the gradient of the upper-level objective (hypergradient). In this work, we study stochastic approximation schemes for the hypergradient, which are important wh…

    Submitted 12 April, 2021; v1 submitted 13 November, 2020; originally announced November 2020.

    Comments: added experiments, a table of notation and some comments. 22 pages

    Journal ref: Proceedings of The 24th International Conference on Artificial Intelligence and Statistics (AISTATS 2021), PMLR 130:3826-3834

  33. arXiv:2008.10857  [pdf, other]

    cs.LG stat.ML

    The Advantage of Conditional Meta-Learning for Biased Regularization and Fine-Tuning

    Authors: Giulia Denevi, Massimiliano Pontil, Carlo Ciliberto

    Abstract: Biased regularization and fine-tuning are two recent meta-learning approaches. They have been shown to be effective to tackle distributions of tasks, in which the tasks' target vectors are all close to a common meta-parameter vector. However, these methods may perform poorly on heterogeneous environments of tasks, where the complexity of the tasks' distribution cannot be captured by a single meta-…

    Submitted 25 August, 2020; originally announced August 2020.

    Comments: 34 pages; 2 figures

  34. arXiv:2007.14641  [pdf, other]

    stat.ML cs.LG

    Generalization Properties of Optimal Transport GANs with Latent Distribution Learning

    Authors: Giulia Luise, Massimiliano Pontil, Carlo Ciliberto

    Abstract: The Generative Adversarial Networks (GAN) framework is a well-established paradigm for probability matching and realistic sample generation. While recent attention has been devoted to studying the theoretical properties of such models, a full theoretical understanding of the main building blocks is still missing. Focusing on generative models with Optimal Transport metrics as discriminators, in th…

    Submitted 29 July, 2020; originally announced July 2020.

    Comments: 34 pages, 6 figures

  35. arXiv:2007.05732  [pdf, other]

    cs.LG stat.ML

    Online Parameter-Free Learning of Multiple Low Variance Tasks

    Authors: Giulia Denevi, Dimitris Stamos, Massimiliano Pontil

    Abstract: We propose a method to learn a common bias vector for a growing sequence of low-variance tasks. Unlike state-of-the-art approaches, our method does not require tuning any hyper-parameter. Our approach is presented in the non-statistical setting and can be of two variants. The "aggressive" one updates the bias after each datapoint, the "lazy" one updates the bias only at the end of each task. We de…

    Submitted 11 July, 2020; originally announced July 2020.

    Journal ref: Conference on Uncertainty in Artificial Intelligence (UAI) 2020

  36. arXiv:2006.16218  [pdf, other]

    stat.ML cs.LG

    On the Iteration Complexity of Hypergradient Computation

    Authors: Riccardo Grazzi, Luca Franceschi, Massimiliano Pontil, Saverio Salzo

    Abstract: We study a general class of bilevel problems, consisting in the minimization of an upper-level objective which depends on the solution to a parametric fixed-point equation. Important instances arising in machine learning include hyperparameter optimization, meta-learning, and certain graph and recurrent neural networks. Typically the gradient of the upper-level objective (hypergradient) is hard or…

    Submitted 10 July, 2020; v1 submitted 29 June, 2020; originally announced June 2020.

    Comments: accepted at ICML 2020; 19 pages, 4 figures; code at https://github.com/prolearner/hypertorch (corrected typos and one reference)

  37. arXiv:2006.12938  [pdf, other]

    cs.LG stat.ML

    Multi-source Domain Adaptation via Weighted Joint Distributions Optimal Transport

    Authors: Rosanna Turrisi, Rémi Flamary, Alain Rakotomamonjy, Massimiliano Pontil

    Abstract: The problem of domain adaptation on an unlabeled target dataset using knowledge from multiple labelled source datasets is becoming increasingly important. A key challenge is to design an approach that overcomes the covariate and target shift both among the sources, and between the source and target domains. In this paper, we address this problem from a new perspective: instead of looking for a lat…

    Submitted 2 June, 2022; v1 submitted 23 June, 2020; originally announced June 2020.

    Comments: Accepted at UAI 2022

  38. arXiv:2006.07862  [pdf, ps, other]

    cs.LG math.OC stat.ML

    Exploiting Higher Order Smoothness in Derivative-free Optimization and Continuous Bandits

    Authors: Arya Akhavan, Massimiliano Pontil, Alexandre B. Tsybakov

    Abstract: We study the problem of zero-order optimization of a strongly convex function. The goal is to find the minimizer of the function by a sequential exploration of its values, under measurement noise. We study the impact of higher order smoothness properties of the function on the optimization error and on the cumulative regret. To solve this problem we consider a randomized approximation of the proje…

    Submitted 24 November, 2022; v1 submitted 14 June, 2020; originally announced June 2020.

  39. arXiv:2006.07286  [pdf, other]

    stat.ML cs.LG math.ST

    Fair Regression with Wasserstein Barycenters

    Authors: Evgenii Chzhen, Christophe Denis, Mohamed Hebiri, Luca Oneto, Massimiliano Pontil

    Abstract: We study the problem of learning a real-valued function that satisfies the Demographic Parity constraint. It demands the distribution of the predicted output to be independent of the sensitive attribute. We consider the case that the sensitive attribute is available for prediction. We establish a connection between fair regression and optimal transport theory, based on which we derive a close form…

    Submitted 23 June, 2020; v1 submitted 12 June, 2020; originally announced June 2020.

  40. arXiv:2005.08531  [pdf, other]

    stat.ML cs.LG

    Meta-learning with Stochastic Linear Bandits

    Authors: Leonardo Cella, Alessandro Lazaric, Massimiliano Pontil

    Abstract: We investigate meta-learning procedures in the setting of stochastic linear bandits tasks. The goal is to select a learning algorithm which works well on average over a class of bandits tasks, that are sampled from a task-distribution. Inspired by recent work on learning-to-learn linear regression, we consider a class of bandit algorithms that implement a regularized version of the well-known OFUL…

    Submitted 18 May, 2020; originally announced May 2020.

  41. arXiv:2003.10482  [pdf, other]

    cs.LG cs.PF stat.ML

    Efficient Tensor Kernel methods for sparse regression

    Authors: Feliks Hibraj, Marcello Pelillo, Saverio Salzo, Massimiliano Pontil

    Abstract: Recently, classical kernel methods have been extended by the introduction of suitable tensor kernels so to promote sparsity in the solution of the underlying regression problem. Indeed, they solve an lp-norm regularization problem, with p=m/(m-1) and m even integer, which happens to be close to a lasso problem. However, a major drawback of the method is that storing tensors requires a considerable…

    Submitted 23 March, 2020; originally announced March 2020.

    Comments: M.Sc. Thesis introducing a novel layout to efficiently store symmetric tensor data

  42. arXiv:2002.08253  [pdf, ps, other]

    stat.ML cs.LG

    Distance-Based Regularisation of Deep Networks for Fine-Tuning

    Authors: Henry Gouk, Timothy M. Hospedales, Massimiliano Pontil

    Abstract: We investigate approaches to regularisation during fine-tuning of deep neural networks. First we provide a neural network generalisation bound based on Rademacher complexity that uses the distance the weights have moved from their initial values. This bound has no direct dependence on the number of weights and compares favourably to other bounds when applied to convolutional networks. Our bound is…

    Submitted 15 January, 2021; v1 submitted 19 February, 2020; originally announced February 2020.

  43. arXiv:1910.08525  [pdf, other]

    cs.LG stat.ML

    MARTHE: Scheduling the Learning Rate Via Online Hypergradients

    Authors: Michele Donini, Luca Franceschi, Massimiliano Pontil, Orchid Majumder, Paolo Frasconi

    Abstract: We study the problem of fitting task-specific learning rate schedules from the perspective of hyperparameter optimization, aiming at good generalization. We describe the structure of the gradient of a validation error w.r.t. the learning rate schedule -- the hypergradient. Based on this, we introduce MARTHE, a novel online algorithm guided by cheap approximations of the hypergradient that uses pas…

    Submitted 17 May, 2020; v1 submitted 18 October, 2019; originally announced October 2019.

    Comments: IJCAI 2020. Larger images. Code available at https://github.com/awslabs/adatune

  44. arXiv:1906.10673  [pdf, ps, other]

    stat.ML cs.LG

    Learning Fair and Transferable Representations

    Authors: Luca Oneto, Michele Donini, Andreas Maurer, Massimiliano Pontil

    Abstract: Developing learning methods which do not discriminate subgroups in the population is a central goal of algorithmic fairness. One way to reach this goal is by modifying the data representation in order to meet certain fairness constraints. In this work we measure fairness according to demographic parity. This requires the probability of the possible model decisions to be independent of the sensitiv…

    Submitted 31 January, 2020; v1 submitted 25 June, 2019; originally announced June 2019.

  45. arXiv:1905.13194  [pdf, other]

    stat.ML cs.LG math.ST

    Sinkhorn Barycenters with Free Support via Frank-Wolfe Algorithm

    Authors: Giulia Luise, Saverio Salzo, Massimiliano Pontil, Carlo Ciliberto

    Abstract: We present a novel algorithm to estimate the barycenter of arbitrary probability distributions with respect to the Sinkhorn divergence. Based on a Frank-Wolfe optimization strategy, our approach proceeds by populating the support of the barycenter incrementally, without requiring any pre-allocation. We consider discrete as well as continuous distributions, proving convergence rates of the proposed…

    Submitted 30 May, 2019; originally announced May 2019.

    Comments: 46 pages, 8 figures

  46. arXiv:1903.11960  [pdf, other]

    cs.LG stat.ML

    Learning Discrete Structures for Graph Neural Networks

    Authors: Luca Franceschi, Mathias Niepert, Massimiliano Pontil, Xiao He

    Abstract: Graph neural networks (GNNs) are a popular class of machine learning models whose major advantage is their ability to incorporate a sparse and discrete dependency structure between data points. Unfortunately, GNNs can only be used when such a graph-structure is available. In practice, however, real-world graphs are often noisy and incomplete or might not be available at all. With this work, we pro…

    Submitted 19 June, 2020; v1 submitted 28 March, 2019; originally announced March 2019.

    Comments: ICML 2019, code at https://github.com/lucfra/LDS - Revision of Sec. 3

  47. arXiv:1903.10399  [pdf, other]

    cs.LG stat.ML

    Learning-to-Learn Stochastic Gradient Descent with Biased Regularization

    Authors: Giulia Denevi, Carlo Ciliberto, Riccardo Grazzi, Massimiliano Pontil

    Abstract: We study the problem of learning-to-learn: inferring a learning algorithm that works well on tasks sampled from an unknown distribution. As the class of algorithms we consider Stochastic Gradient Descent on the true risk regularized by the squared Euclidean distance to a bias vector. We present an average excess risk bound for such a learning algorithm. This result quantifies the potential benefit of u…

    Submitted 25 March, 2019; originally announced March 2019.

    Comments: 37 pages, 8 figures

  48. arXiv:1903.00667  [pdf, ps, other]

    cs.LG stat.ML

    Leveraging Low-Rank Relations Between Surrogate Tasks in Structured Prediction

    Authors: Giulia Luise, Dimitris Stamos, Massimiliano Pontil, Carlo Ciliberto

    Abstract: We study the interplay between surrogate methods for structured prediction and techniques from multitask learning designed to leverage relationships between surrogate outputs. We propose an efficient algorithm based on trace norm regularization which, differently from previous methods, does not require explicit knowledge of the coding/decoding functions of the surrogate framework. As a result, our…

    Submitted 2 March, 2019; originally announced March 2019.

    Comments: 42 pages, 1 table

  49. arXiv:1902.01911  [pdf, ps, other]

    math.ST cs.LG stat.ML

    Uniform concentration and symmetrization for weak interactions

    Authors: Andreas Maurer, Massimiliano Pontil

    Abstract: The method to derive uniform bounds with Gaussian and Rademacher complexities is extended to the case where the sample average is replaced by a nonlinear statistic. Tight bounds are obtained for U-statistics, smoothened L-statistics and error functionals of l2-regularized algorithms.

    Submitted 10 May, 2019; v1 submitted 5 February, 2019; originally announced February 2019.

  50. arXiv:1901.10080  [pdf, ps, other]

    stat.ML cs.LG

    General Fair Empirical Risk Minimization

    Authors: Luca Oneto, Michele Donini, Massimiliano Pontil

    Abstract: We tackle the problem of algorithmic fairness, where the goal is to avoid the unfair influence of sensitive information, in the general context of regression with possibly continuous sensitive attributes. We extend the framework of fair empirical risk minimization to this general scenario, covering in this way the whole standard supervised learning setting. Our generalized fairness measure reduc…

    Submitted 27 December, 2019; v1 submitted 28 January, 2019; originally announced January 2019.