Showing 1–24 of 24 results for author: Zappella, G

  1. arXiv:2406.03216  [pdf, other]

    cs.LG cs.AI

    Choice of PEFT Technique in Continual Learning: Prompt Tuning is Not All You Need

    Authors: Martin Wistuba, Prabhu Teja Sivaprasad, Lukas Balles, Giovanni Zappella

    Abstract: Recent Continual Learning (CL) methods have combined pretrained Transformers with prompt tuning, a parameter-efficient fine-tuning (PEFT) technique. We argue that the choice of prompt tuning in prior works was an undefended and unablated decision, which has been uncritically adopted by subsequent research, but warrants further research to understand its implications. In this paper, we conduct this…

    Submitted 5 June, 2024; originally announced June 2024.

  2. arXiv:2312.05021  [pdf, other]

    cs.LG cs.AI math.OC

    A Negative Result on Gradient Matching for Selective Backprop

    Authors: Lukas Balles, Cedric Archambeau, Giovanni Zappella

    Abstract: With increasing scale in model and dataset size, the training of deep neural networks becomes a massive computational burden. One approach to speed up the training process is Selective Backprop. For this approach, we perform a forward pass to obtain a loss value for each data point in a minibatch. The backward pass is then restricted to a subset of that minibatch, prioritizing high-loss examples.…

    Submitted 8 December, 2023; originally announced December 2023.

    Comments: Paper accepted at the ICBINB Workshop at NeurIPS 2023

  3. arXiv:2311.17601  [pdf, ps, other]

    cs.LG cs.AI

    Continual Learning with Low Rank Adaptation

    Authors: Martin Wistuba, Prabhu Teja Sivaprasad, Lukas Balles, Giovanni Zappella

    Abstract: Recent work using pretrained transformers has shown impressive performance when fine-tuned with data from the downstream problem of interest. However, they struggle to retain that performance when the data characteristics change. In this paper, we focus on continual learning, where a pre-trained transformer is updated to perform well on new data, while retaining its performance on data it was pre…

    Submitted 29 November, 2023; originally announced November 2023.

    Comments: Accepted at Workshop on Distribution Shifts (DistShift), NeurIPS 2023

  4. arXiv:2304.12067  [pdf, other]

    cs.LG cs.AI cs.CV

    Renate: A Library for Real-World Continual Learning

    Authors: Martin Wistuba, Martin Ferianc, Lukas Balles, Cedric Archambeau, Giovanni Zappella

    Abstract: Continual learning enables the incremental training of machine learning models on non-stationary data streams. While academic interest in the topic is high, there is little indication of the use of state-of-the-art continual learning algorithms in practical machine learning deployment. This paper presents Renate, a continual learning library designed to build real-world updating pipelines for PyTor…

    Submitted 24 April, 2023; originally announced April 2023.

    Comments: Paper accepted at the CLVision workshop at CVPR 2023

  5. arXiv:2207.06940  [pdf, other]

    cs.LG stat.ML

    PASHA: Efficient HPO and NAS with Progressive Resource Allocation

    Authors: Ondrej Bohdal, Lukas Balles, Martin Wistuba, Beyza Ermis, Cédric Archambeau, Giovanni Zappella

    Abstract: Hyperparameter optimization (HPO) and neural architecture search (NAS) are methods of choice to obtain the best-in-class machine learning models, but in practice they can be costly to run. When models are trained on large datasets, tuning them with HPO or NAS rapidly becomes prohibitively expensive for practitioners, even when efficient multi-fidelity methods are employed. We propose an approach t…

    Submitted 8 March, 2023; v1 submitted 14 July, 2022; originally announced July 2022.

    Comments: Accepted at ICLR 2023

  6. arXiv:2206.14085  [pdf, other]

    cs.LG cs.CV

    Continual Learning with Transformers for Image Classification

    Authors: Beyza Ermis, Giovanni Zappella, Martin Wistuba, Aditya Rawal, Cedric Archambeau

    Abstract: In many real-world scenarios, data to train machine learning models become available over time. However, neural network models struggle to continually learn new concepts without forgetting what has been learnt in the past. This phenomenon is known as catastrophic forgetting and it is often difficult to prevent due to practical constraints, such as the amount of data that can be stored or the limit…

    Submitted 28 June, 2022; originally announced June 2022.

    Comments: Appeared in CVPR CLVision workshop. arXiv admin note: substantial text overlap with arXiv:2203.04640

  7. arXiv:2203.14544  [pdf, other]

    cs.LG

    Gradient-Matching Coresets for Rehearsal-Based Continual Learning

    Authors: Lukas Balles, Giovanni Zappella, Cédric Archambeau

    Abstract: The goal of continual learning (CL) is to efficiently update a machine learning model with new data without forgetting previously-learned knowledge. Most widely-used CL methods rely on a rehearsal memory of data points to be reused while training on new data. Curating such a rehearsal memory to maintain a small, informative subset of all the data seen so far is crucial to the success of these meth…

    Submitted 28 March, 2022; originally announced March 2022.

    Comments: A short version of this paper has been presented at the NeurIPS '21 Workshop on Distribution Shifts

  8. arXiv:2203.04640  [pdf, other]

    cs.CL cs.AI stat.ML

    Memory Efficient Continual Learning with Transformers

    Authors: Beyza Ermis, Giovanni Zappella, Martin Wistuba, Aditya Rawal, Cedric Archambeau

    Abstract: In many real-world scenarios, data to train machine learning models becomes available over time. Unfortunately, these models struggle to continually learn new concepts without forgetting what has been learnt in the past. This phenomenon is known as catastrophic forgetting and it is difficult to prevent due to practical constraints. For instance, the amount of data that can be stored or the computa…

    Submitted 13 January, 2023; v1 submitted 9 March, 2022; originally announced March 2022.

    Comments: This paper was published at NeurIPS 2022

  9. arXiv:2112.05025  [pdf, other]

    cs.LG

    Gradient-matching coresets for continual learning

    Authors: Lukas Balles, Giovanni Zappella, Cédric Archambeau

    Abstract: We devise a coreset selection method based on the idea of gradient matching: The gradients induced by the coreset should match, as closely as possible, those induced by the original training dataset. We evaluate the method in the context of continual learning, where it can be used to curate a rehearsal memory. Our method outperforms strong competitors such as reservoir sampling across a range of memo…

    Submitted 9 December, 2021; originally announced December 2021.

    Comments: Accepted at the NeurIPS '21 Workshop on Distribution Shifts

  10. arXiv:2103.16111  [pdf, other]

    cs.LG cs.AI

    A resource-efficient method for repeated HPO and NAS problems

    Authors: Giovanni Zappella, David Salinas, Cédric Archambeau

    Abstract: In this work we consider the problem of repeated hyperparameter and neural architecture search (HNAS). We propose an extension of Successive Halving that is able to leverage information gained in previous HNAS problems with the goal of saving computational resources. We empirically demonstrate that our solution is able to drastically decrease costs while maintaining accuracy and being robust to ne…

    Submitted 13 July, 2021; v1 submitted 30 March, 2021; originally announced March 2021.

    Comments: Accepted at AutoML workshop @ ICML 2021

  11. arXiv:2012.08483  [pdf, other]

    cs.LG

    Amazon SageMaker Autopilot: a white box AutoML solution at scale

    Authors: Piali Das, Valerio Perrone, Nikita Ivkin, Tanya Bansal, Zohar Karnin, Huibin Shen, Iaroslav Shcherbatyi, Yotam Elor, Wilton Wu, Aida Zolic, Thibaut Lienart, Alex Tang, Amr Ahmed, Jean Baptiste Faddoul, Rodolphe Jenatton, Fela Winkelmolen, Philip Gautier, Leo Dirac, Andre Perunicic, Miroslav Miladinovic, Giovanni Zappella, Cédric Archambeau, Matthias Seeger, Bhaskar Dutt, Laurence Rouesnel

    Abstract: AutoML systems provide a black-box solution to machine learning problems by selecting the right way of processing features, choosing an algorithm and tuning the hyperparameters of the entire pipeline. Although these systems perform well on many datasets, there is still a non-negligible number of datasets for which the one-shot solution produced by each particular system would provide sub-par perfo…

    Submitted 16 December, 2020; v1 submitted 15 December, 2020; originally announced December 2020.

  12. arXiv:2004.13576  [pdf, other]

    stat.ML cs.LG

    A Linear Bandit for Seasonal Environments

    Authors: Giuseppe Di Benedetto, Vito Bellini, Giovanni Zappella

    Abstract: Contextual bandit algorithms are extremely popular and widely used in recommendation systems to provide online personalised recommendations. A recurrent assumption is the stationarity of the reward function, which is rather unrealistic in most of the real-world applications. In the music recommendation scenario for instance, people's music taste can abruptly change during certain events, such as H…

    Submitted 28 April, 2020; originally announced April 2020.

  13. arXiv:2004.13106  [pdf, other]

    cs.LG stat.ML

    Learning to Rank in the Position Based Model with Bandit Feedback

    Authors: Beyza Ermis, Patrick Ernst, Yannik Stein, Giovanni Zappella

    Abstract: Personalization is a crucial aspect of many online experiences. In particular, content ranking is often a key component in delivering sophisticated personalization results. Commonly, supervised learning-to-rank methods are applied, which suffer from bias introduced during data collection by production systems in charge of producing the ranking. To compensate for this problem, we leverage contextua…

    Submitted 27 April, 2020; originally announced April 2020.

  14. arXiv:1807.02089  [pdf, other]

    stat.ML cs.LG

    Linear Bandits with Stochastic Delayed Feedback

    Authors: Claire Vernade, Alexandra Carpentier, Tor Lattimore, Giovanni Zappella, Beyza Ermis, Michael Brueckner

    Abstract: Stochastic linear bandits are a natural and well-studied model for structured exploration/exploitation problems and are widely used in applications such as online marketing and recommendation. One of the main challenges faced by practitioners hoping to apply existing algorithms is that usually the feedback is randomly delayed and delays are only partially observable. For example, while a purchase…

    Submitted 2 March, 2020; v1 submitted 5 July, 2018; originally announced July 2018.

  15. arXiv:1608.03544  [pdf, other]

    cs.LG cs.AI cs.IR stat.ML

    On Context-Dependent Clustering of Bandits

    Authors: Claudio Gentile, Shuai Li, Purushottam Kar, Alexandros Karatzoglou, Evans Etrue, Giovanni Zappella

    Abstract: We investigate a novel cluster-of-bandit algorithm CAB for collaborative recommendation tasks that implements the underlying feedback sharing mechanism by estimating the neighborhood of users in a context-dependent manner. CAB makes sharp departures from the state of the art by incorporating collaborative effects into inference as well as learning processes in a manner that seamlessly interleaving…

    Submitted 27 February, 2017; v1 submitted 6 August, 2016; originally announced August 2016.

  16. arXiv:1401.8257  [pdf, other]

    cs.LG stat.ML

    Online Clustering of Bandits

    Authors: Claudio Gentile, Shuai Li, Giovanni Zappella

    Abstract: We introduce a novel algorithmic approach to content recommendation based on adaptive clustering of exploration-exploitation ("bandit") strategies. We provide a sharp regret analysis of this algorithm in a standard stochastic noise setting, demonstrate its scalability properties, and prove its effectiveness on a number of artificial and real-world datasets. Our experiments show a significant incre…

    Submitted 6 June, 2014; v1 submitted 31 January, 2014; originally announced January 2014.

    Comments: In E. Xing and T. Jebara (Eds.), Proceedings of 31st International Conference on Machine Learning, Journal of Machine Learning Research Workshop and Conference Proceedings, Vol.32 (JMLR W&CP-32), Beijing, China, Jun. 21-26, 2014 (ICML 2014), Submitted by Shuai Li (https://sites.google.com/site/shuailidotsli)

  17. arXiv:1306.0811  [pdf, other]

    cs.LG cs.SI stat.ML

    A Gang of Bandits

    Authors: Nicolò Cesa-Bianchi, Claudio Gentile, Giovanni Zappella

    Abstract: Multi-armed bandit problems are receiving a great deal of attention because they adequately formalize the exploration-exploitation trade-offs arising in several industrially relevant applications, such as online advertisement and, more generally, recommendation systems. In many cases, however, these applications have a strong social component, whose integration in the bandit algorithm could lead t…

    Submitted 4 November, 2013; v1 submitted 4 June, 2013; originally announced June 2013.

    Comments: NIPS 2013

  18. arXiv:1301.6630  [pdf, other]

    cs.SI cs.LG physics.soc-ph

    Political Disaffection: a case study on the Italian Twitter community

    Authors: Corrado Monti, Alessandro Rozza, Giovanni Zappella, Matteo Zignani, Adam Arvidsson, Monica Poletti

    Abstract: In our work we analyse the political disaffection or "the subjective feeling of powerlessness, cynicism, and lack of confidence in the political process, politicians, and democratic institutions, but with no questioning of the political regime" by exploiting Twitter data through machine learning techniques. In order to validate the quality of the time-series generated by the Twitter data, we highl…

    Submitted 8 February, 2013; v1 submitted 28 January, 2013; originally announced January 2013.

  19. arXiv:1301.5160  [pdf, other]

    cs.LG

    See the Tree Through the Lines: The Shazoo Algorithm -- Full Version --

    Authors: Fabio Vitale, Nicolo Cesa-Bianchi, Claudio Gentile, Giovanni Zappella

    Abstract: Predicting the nodes of a given graph is a fascinating theoretical problem with applications in several domains. Since graph sparsification via spanning trees retains enough information while making the task much easier, trees are an important special case of this problem. Although it is known how to predict the nodes of an unweighted tree in a nearly optimal way, in the weighted case a fully sati…

    Submitted 28 February, 2013; v1 submitted 22 January, 2013; originally announced January 2013.

  20. arXiv:1301.5112  [pdf, ps, other]

    cs.LG stat.ML

    Active Learning on Trees and Graphs

    Authors: Nicolo Cesa-Bianchi, Claudio Gentile, Fabio Vitale, Giovanni Zappella

    Abstract: We investigate the problem of active learning on a given tree whose nodes are assigned binary labels in an adversarial way. Inspired by recent results by Guillory and Bilmes, we characterize (up to constant factors) the optimal placement of queries so as to minimize the mistakes made on the non-queried nodes. Our query selection algorithm is extremely efficient, and the optimal number of mistakes on…

    Submitted 22 January, 2013; originally announced January 2013.

  21. arXiv:1301.4769  [pdf, other]

    cs.LG cs.DS stat.ML

    A Correlation Clustering Approach to Link Classification in Signed Networks -- Full Version --

    Authors: Nicolo Cesa-Bianchi, Claudio Gentile, Fabio Vitale, Giovanni Zappella

    Abstract: Motivated by social balance theory, we develop a theory of link classification in signed networks using the correlation clustering index as measure of label regularity. We derive learning bounds in terms of correlation clustering within three fundamental transductive learning settings: online, batch and active. Our main algorithmic contribution is in the active setting, where we introduce a new fa…

    Submitted 28 February, 2013; v1 submitted 21 January, 2013; originally announced January 2013.

  22. arXiv:1301.4767  [pdf, other]

    cs.LG cs.SI stat.ML

    A Linear Time Active Learning Algorithm for Link Classification -- Full Version --

    Authors: Nicolo Cesa-Bianchi, Claudio Gentile, Fabio Vitale, Giovanni Zappella

    Abstract: We present very efficient active learning algorithms for link classification in signed networks. Our algorithms are motivated by a stochastic model in which edge labels are obtained through perturbations of an initial sign assignment consistent with a two-clustering of the nodes. We provide a theoretical analysis within this model, showing that we can achieve an optimal (to within a constant facto…

    Submitted 28 February, 2013; v1 submitted 21 January, 2013; originally announced January 2013.

  23. arXiv:1212.5637  [pdf, other]

    cs.LG stat.ML

    Random Spanning Trees and the Prediction of Weighted Graphs

    Authors: Nicolo' Cesa-Bianchi, Claudio Gentile, Fabio Vitale, Giovanni Zappella

    Abstract: We investigate the problem of sequentially predicting the binary labels on the nodes of an arbitrary weighted graph. We show that, under a suitable parametrization of the problem, the optimal number of prediction mistakes can be characterized (up to logarithmic factors) by the cutsize of a random spanning tree of the graph. The cutsize is induced by the unknown adversarial labeling of the graph no…

    Submitted 21 December, 2012; originally announced December 2012.

    Comments: Appeared in ICML 2010

  24. arXiv:1112.4344  [pdf, other]

    cs.LG cs.GT

    A Scalable Multiclass Algorithm for Node Classification

    Authors: Giovanni Zappella

    Abstract: We introduce a scalable algorithm, MUCCA, for multiclass node classification in weighted graphs. Unlike previously proposed methods for the same task, MUCCA works in time linear in the number of nodes. Our approach is based on a game-theoretic formulation of the problem in which the test labels are expressed as a Nash Equilibrium of a certain game. However, in order to achieve scalability, we find…

    Submitted 19 December, 2011; originally announced December 2011.