Showing 1–11 of 11 results for author: Laradji, I

Searching in archive stat.
  1. arXiv:2312.13876  [pdf, other]

    cs.LG cs.CL stat.ML

    Capture the Flag: Uncovering Data Insights with Large Language Models

    Authors: Issam Laradji, Perouz Taslakian, Sai Rajeswar, Valentina Zantedeschi, Alexandre Lacoste, Nicolas Chapados, David Vazquez, Christopher Pal, Alexandre Drouin

    Abstract: The extraction of a small number of relevant insights from vast amounts of data is a crucial component of data-driven decision-making. However, accomplishing this task requires considerable technical skills, domain expertise, and human labor. This study explores the potential of using Large Language Models (LLMs) to automate the discovery of insights in data, leveraging recent advances in reasonin…

    Submitted 21 December, 2023; originally announced December 2023.

    Comments: 14 pages, 1 figure, Foundation Models for Decision Making Workshop at NeurIPS 2023

  2. arXiv:2203.16662  [pdf, other]

    stat.ML cs.LG

    Overcoming challenges in leveraging GANs for few-shot data augmentation

    Authors: Christopher Beckham, Issam Laradji, Pau Rodriguez, David Vazquez, Derek Nowrouzezahrai, Christopher Pal

    Abstract: In this paper, we explore the use of GAN-based few-shot data augmentation as a method to improve few-shot classification performance. We perform an exploration into how a GAN can be fine-tuned for such a task (one of which is in a class-incremental manner), as well as a rigorous empirical investigation into how well these models can perform to improve few-shot classification. We identify issues re…

    Submitted 8 August, 2022; v1 submitted 30 March, 2022; originally announced March 2022.

    Comments: v3 of the paper, various changes including better figures, CIFAR-100 results, and precision-recall metrics

  3. arXiv:2009.09929  [pdf, other]

    cs.CV cs.AI cs.LG stat.ML

    CVPR 2020 Continual Learning in Computer Vision Competition: Approaches, Results, Current Challenges and Future Directions

    Authors: Vincenzo Lomonaco, Lorenzo Pellegrini, Pau Rodriguez, Massimo Caccia, Qi She, Yu Chen, Quentin Jodelet, Ruiping Wang, Zheda Mai, David Vazquez, German I. Parisi, Nikhil Churamani, Marc Pickett, Issam Laradji, Davide Maltoni

    Abstract: In the last few years, we have witnessed a renewed and fast-growing interest in continual learning with deep neural networks with the shared objective of making current AI systems more adaptive, efficient and autonomous. However, despite the significant and undoubted progress of the field in addressing the issue of catastrophic forgetting, benchmarking different continual learning approaches is a…

    Submitted 14 September, 2020; originally announced September 2020.

    Comments: Pre-print v1: 12 pages, 3 figures, 8 tables

  4. arXiv:2006.14699  [pdf, other]

    cs.CV stat.ML

    Learning Data Augmentation with Online Bilevel Optimization for Image Classification

    Authors: Saypraseuth Mounsaveng, Issam Laradji, Ismail Ben Ayed, David Vazquez, Marco Pedersoli

    Abstract: Data augmentation is a key practice in machine learning for improving generalization performance. However, finding the best data augmentation hyperparameters requires domain knowledge or a computationally demanding search. We address this issue by proposing an efficient approach to automatically train a network that learns an effective distribution of transformations to improve its generalization.…

    Submitted 10 November, 2020; v1 submitted 25 June, 2020; originally announced June 2020.

  5. arXiv:2006.06835  [pdf, other]

    cs.LG math.OC stat.ML

    Adaptive Gradient Methods Converge Faster with Over-Parameterization (but you should do a line-search)

    Authors: Sharan Vaswani, Issam Laradji, Frederik Kunstner, Si Yi Meng, Mark Schmidt, Simon Lacoste-Julien

    Abstract: Adaptive gradient methods are typically used for training over-parameterized models. To better understand their behaviour, we study a simplistic setting -- smooth, convex losses with models over-parameterized enough to interpolate the data. In this setting, we prove that AMSGrad with constant step-size and momentum converges to the minimizer at a faster $O(1/T)$ rate. When interpolation is only ap…

    Submitted 18 February, 2021; v1 submitted 11 June, 2020; originally announced June 2020.

  6. arXiv:2002.10542  [pdf, other]

    math.OC cs.LG stat.ML

    Stochastic Polyak Step-size for SGD: An Adaptive Learning Rate for Fast Convergence

    Authors: Nicolas Loizou, Sharan Vaswani, Issam Laradji, Simon Lacoste-Julien

    Abstract: We propose a stochastic variant of the classical Polyak step-size (Polyak, 1987) commonly used in the subgradient method. Although computing the Polyak step-size requires knowledge of the optimal function values, this information is readily available for typical modern machine learning applications. Consequently, the proposed stochastic Polyak step-size (SPS) is an attractive choice for setting th…

    Submitted 22 March, 2021; v1 submitted 24 February, 2020; originally announced February 2020.

    Comments: Proceedings of the 24th International Conference on Artificial Intelligence and Statistics (AISTATS) 2021

  7. arXiv:1910.04920  [pdf, other]

    cs.LG math.OC stat.ML

    Fast and Furious Convergence: Stochastic Second Order Methods under Interpolation

    Authors: Si Yi Meng, Sharan Vaswani, Issam Laradji, Mark Schmidt, Simon Lacoste-Julien

    Abstract: We consider stochastic second-order methods for minimizing smooth and strongly-convex functions under an interpolation condition satisfied by over-parameterized models. Under this condition, we show that the regularized subsampled Newton method (R-SSN) achieves global linear convergence with an adaptive step-size and a constant batch-size. By growing the batch size for both the subsampled gradient…

    Submitted 22 March, 2020; v1 submitted 10 October, 2019; originally announced October 2019.

    Comments: AISTATS, 2020

  8. arXiv:1905.09997  [pdf, other]

    cs.LG math.OC stat.ML

    Painless Stochastic Gradient: Interpolation, Line-Search, and Convergence Rates

    Authors: Sharan Vaswani, Aaron Mishkin, Issam Laradji, Mark Schmidt, Gauthier Gidel, Simon Lacoste-Julien

    Abstract: Recent works have shown that stochastic gradient descent (SGD) achieves the fast convergence rates of full-batch gradient descent for over-parameterized models satisfying certain interpolation conditions. However, the step-size used in these works depends on unknown quantities and SGD's practical performance heavily relies on the choice of this step-size. We propose to use line-search techniques t…

    Submitted 4 June, 2021; v1 submitted 23 May, 2019; originally announced May 2019.

    Comments: Added a citation to the related work of Paul Tseng, and citations to methods that had previously explored line-searches for deep learning empirically

  9. arXiv:1905.06982  [pdf, other]

    cs.LG stat.ML

    Efficient Deep Gaussian Process Models for Variable-Sized Input

    Authors: Issam H. Laradji, Mark Schmidt, Vladimir Pavlovic, Minyoung Kim

    Abstract: Deep Gaussian processes (DGP) have appealing Bayesian properties, can handle variable-sized data, and learn deep features. Their limitation is that they do not scale well with the size of the data. Existing approaches address this using a deep random feature (DRF) expansion model, which makes inference tractable by approximating DGPs. However, DRF is not suitable for variable-sized input data such…

    Submitted 16 May, 2019; originally announced May 2019.

    Comments: Accepted in IJCNN 2019

  10. arXiv:1807.02552  [pdf, other]

    cs.LG stat.ML

    M-ADDA: Unsupervised Domain Adaptation with Deep Metric Learning

    Authors: Issam Laradji, Reza Babanezhad

    Abstract: Unsupervised domain adaptation techniques have been successful for a wide range of problems where supervised labels are limited. The task is to classify an unlabeled `target' dataset by leveraging a labeled `source' dataset that comes from a slightly similar distribution. We propose metric-based adversarial discriminative domain adaptation (M-ADDA) which performs two main steps. First, it uses a m…

    Submitted 6 July, 2018; originally announced July 2018.

  11. arXiv:1506.00552  [pdf, other]

    math.OC cs.LG stat.CO stat.ML

    Coordinate Descent Converges Faster with the Gauss-Southwell Rule Than Random Selection

    Authors: Julie Nutini, Mark Schmidt, Issam H. Laradji, Michael Friedlander, Hoyt Koepke

    Abstract: There has been significant recent work on the theory and application of randomized coordinate descent algorithms, beginning with the work of Nesterov [SIAM J. Optim., 22(2), 2012], who showed that a random-coordinate selection rule achieves the same convergence rate as the Gauss-Southwell selection rule. This result suggests that we should never use the Gauss-Southwell rule, as it is typically muc…

    Submitted 28 October, 2018; v1 submitted 1 June, 2015; originally announced June 2015.

    Comments: ICML 2015. v2: Updated the Gauss-Southwell-q result in Section 8 and Appendix H, to remove the part depending on mu_1 (the proof had an error). Added Section 8.1, which discusses conditions under which a rate depending on mu_1 does hold