Showing 1–45 of 45 results for author: Charlin, L

Searching in archive cs.
  1. arXiv:2405.11157  [pdf, other]

    cs.LG cs.CL

    Towards Modular LLMs by Building and Reusing a Library of LoRAs

    Authors: Oleksiy Ostapenko, Zhan Su, Edoardo Maria Ponti, Laurent Charlin, Nicolas Le Roux, Matheus Pereira, Lucas Caccia, Alessandro Sordoni

    Abstract: The growing number of parameter-efficient adaptations of a base large language model (LLM) calls for studying whether we can reuse such trained adapters to improve performance for new tasks. We study how to best build a library of adapters given multi-task data and devise techniques for both zero-shot and supervised task generalization through routing in such library. We benchmark existing approac…

    Submitted 17 May, 2024; originally announced May 2024.

  2. arXiv:2404.19132  [pdf, other]

    cs.LG cs.CV

    Integrating Present and Past in Unsupervised Continual Learning

    Authors: Yipeng Zhang, Laurent Charlin, Richard Zemel, Mengye Ren

    Abstract: We formulate a unifying framework for unsupervised continual learning (UCL), which disentangles learning objectives that are specific to the present and the past data, encompassing stability, plasticity, and cross-task consolidation. The framework reveals that many existing UCL approaches overlook cross-task consolidation and try to balance plasticity and stability in a shared embedding space. Thi…

    Submitted 29 April, 2024; originally announced April 2024.

    Comments: CoLLAs 2024

  3. arXiv:2402.01788  [pdf, other]

    cs.CL cs.AI cs.IR

    LitLLM: A Toolkit for Scientific Literature Review

    Authors: Shubham Agarwal, Issam H. Laradji, Laurent Charlin, Christopher Pal

    Abstract: Conducting literature reviews for scientific papers is essential for understanding research, its limitations, and building on existing work. It is a tedious task which makes an automatic literature review generator appealing. Unfortunately, many existing works that generate such reviews using Large Language Models (LLMs) have significant limitations. They tend to hallucinate-generate non-actual in…

    Submitted 1 February, 2024; originally announced February 2024.

  4. arXiv:2306.01925  [pdf, other]

    cs.LG

    Improving the generalizability and robustness of large-scale traffic signal control

    Authors: Tianyu Shi, Francois-Xavier Devailly, Denis Larocque, Laurent Charlin

    Abstract: A number of deep reinforcement-learning (RL) approaches propose to control traffic signals. In this work, we study the robustness of such methods along two axes. First, sensor failures and GPS occlusions create missing-data challenges and we show that recent methods remain brittle in the face of these missing data. Second, we provide a more systematic study of the generalization ability of RL meth…

    Submitted 7 June, 2023; v1 submitted 2 June, 2023; originally announced June 2023.

  5. arXiv:2305.19366  [pdf, other]

    cs.LG stat.ML

    Joint Bayesian Inference of Graphical Structure and Parameters with a Single Generative Flow Network

    Authors: Tristan Deleu, Mizu Nishikawa-Toomey, Jithendaraa Subramanian, Nikolay Malkin, Laurent Charlin, Yoshua Bengio

    Abstract: Generative Flow Networks (GFlowNets), a class of generative models over discrete and structured sample spaces, have been previously applied to the problem of inferring the marginal posterior distribution over the directed acyclic graph (DAG) of a Bayesian Network, given a dataset of observations. Based on recent advances extending this framework to non-discrete sample spaces, we propose in this pa…

    Submitted 30 October, 2023; v1 submitted 30 May, 2023; originally announced May 2023.

  6. arXiv:2304.13164  [pdf, other]

    cs.LG cs.AI

    Towards Compute-Optimal Transfer Learning

    Authors: Massimo Caccia, Alexandre Galashov, Arthur Douillard, Amal Rannen-Triki, Dushyant Rao, Michela Paganini, Laurent Charlin, Marc'Aurelio Ranzato, Razvan Pascanu

    Abstract: The field of transfer learning is undergoing a significant shift with the introduction of large pretrained models which have demonstrated strong adaptability to a variety of downstream tasks. However, the high computational and memory requirements to finetune or use these models can be a hindrance to their widespread use. In this study, we present a solution to this issue by proposing a simple yet…

    Submitted 25 April, 2023; originally announced April 2023.

  7. arXiv:2211.02763  [pdf, other]

    cs.LG stat.ML

    Bayesian learning of Causal Structure and Mechanisms with GFlowNets and Variational Bayes

    Authors: Mizu Nishikawa-Toomey, Tristan Deleu, Jithendaraa Subramanian, Yoshua Bengio, Laurent Charlin

    Abstract: Bayesian causal structure learning aims to learn a posterior distribution over directed acyclic graphs (DAGs), and the mechanisms that define the relationship between parent and child variables. By taking a Bayesian approach, it is possible to reason about the uncertainty of the causal model. The notion of modelling the uncertainty over models is particularly crucial for causal structure learning…

    Submitted 3 June, 2024; v1 submitted 4 November, 2022; originally announced November 2022.

  8. arXiv:2208.00659  [pdf, other]

    cs.LG stat.ML

    Model-based graph reinforcement learning for inductive traffic signal control

    Authors: François-Xavier Devailly, Denis Larocque, Laurent Charlin

    Abstract: Most reinforcement learning methods for adaptive-traffic-signal-control require training from scratch to be applied on any new intersection or after any modification to the road network, traffic distribution, or behavioral constraints experienced during training. Considering 1) the massive amount of experience required to train such methods, and 2) that experience must be gathered by interacting i…

    Submitted 1 August, 2022; originally announced August 2022.

    Comments: 11 pages, 3 tables, 4 figures

  9. arXiv:2207.04543  [pdf, other]

    cs.LG cs.AI

    Challenging Common Assumptions about Catastrophic Forgetting

    Authors: Timothée Lesort, Oleksiy Ostapenko, Diganta Misra, Md Rifat Arefin, Pau Rodríguez, Laurent Charlin, Irina Rish

    Abstract: Building learning agents that can progressively learn and accumulate knowledge is the core goal of the continual learning (CL) research field. Unfortunately, training a model on new data usually compromises the performance on past data. In the CL literature, this effect is referred to as catastrophic forgetting (CF). CF has been largely studied, and a plethora of methods have been proposed to addr…

    Submitted 15 May, 2023; v1 submitted 10 July, 2022; originally announced July 2022.

  10. arXiv:2206.13414  [pdf, other]

    cs.LG math.OC stat.ML

    Learning To Cut By Looking Ahead: Cutting Plane Selection via Imitation Learning

    Authors: Max B. Paulus, Giulia Zarpellon, Andreas Krause, Laurent Charlin, Chris J. Maddison

    Abstract: Cutting planes are essential for solving mixed-integer linear problems (MILPs), because they facilitate bound improvements on the optimal solution value. For selecting cuts, modern solvers rely on manually designed heuristics that are tuned to gauge the potential effectiveness of cuts. We show that a greedy selection rule explicitly looking ahead to select cuts that yield the best bound improvemen…

    Submitted 27 June, 2022; originally announced June 2022.

    Comments: ICML 2022

  11. arXiv:2205.14495  [pdf, other]

    cs.LG

    Task-Agnostic Continual Reinforcement Learning: Gaining Insights and Overcoming Challenges

    Authors: Massimo Caccia, Jonas Mueller, Taesup Kim, Laurent Charlin, Rasool Fakoor

    Abstract: Continual learning (CL) enables the development of models and agents that learn from a sequence of tasks while addressing the limitations of standard deep learning approaches, such as catastrophic forgetting. In this work, we investigate the factors that contribute to the performance differences between task-agnostic CL and multi-task (MTL) agents. We pose two hypotheses: (1) task-agnostic methods…

    Submitted 17 May, 2023; v1 submitted 28 May, 2022; originally announced May 2022.

    Journal ref: CoLLAs 2023

  12. arXiv:2205.00329  [pdf, other]

    cs.LG cs.AI

    Continual Learning with Foundation Models: An Empirical Study of Latent Replay

    Authors: Oleksiy Ostapenko, Timothee Lesort, Pau Rodríguez, Md Rifat Arefin, Arthur Douillard, Irina Rish, Laurent Charlin

    Abstract: Rapid development of large-scale pre-training has resulted in foundation models that can act as effective feature extractors on a variety of downstream tasks and domains. Motivated by this, we study the efficacy of pre-trained vision models as a foundation for downstream continual learning (CL) scenarios. Our goal is twofold. First, we want to understand the compute-accuracy trade-off between CL i…

    Submitted 2 July, 2022; v1 submitted 30 April, 2022; originally announced May 2022.

  13. arXiv:2203.03724  [pdf, other]

    cs.CY cs.AI cs.HC cs.LG

    A New Era: Intelligent Tutoring Systems Will Transform Online Learning for Millions

    Authors: Francois St-Hilaire, Dung Do Vu, Antoine Frau, Nathan Burns, Farid Faraji, Joseph Potochny, Stephane Robert, Arnaud Roussel, Selene Zheng, Taylor Glazier, Junfel Vincent Romano, Robert Belfer, Muhammad Shayan, Ariella Smofsky, Tommy Delarosbil, Seulmin Ahn, Simon Eden-Walker, Kritika Sony, Ansona Onyi Ching, Sabina Elkins, Anush Stepanyan, Adela Matajova, Victor Chen, Hossein Sahraei, Robert Larson, et al. (6 additional authors not shown)

    Abstract: Despite artificial intelligence (AI) having transformed major aspects of our society, less than a fraction of its potential has been explored, let alone deployed, for education. AI-powered learning can provide millions of learners with a highly personalized, active and practical learning experience, which is key to successful learning. This is especially relevant in the context of online learning…

    Submitted 3 March, 2022; originally announced March 2022.

    Comments: 9 pages, 6 figures

    ACM Class: I.2.0; K.3.1; K.4.0

  14. arXiv:2203.02433  [pdf, ps, other]

    cs.LG cs.NE math.OC stat.ML

    The Machine Learning for Combinatorial Optimization Competition (ML4CO): Results and Insights

    Authors: Maxime Gasse, Quentin Cappart, Jonas Charfreitag, Laurent Charlin, Didier Chételat, Antonia Chmiela, Justin Dumouchelle, Ambros Gleixner, Aleksandr M. Kazachkov, Elias Khalil, Pawel Lichocki, Andrea Lodi, Miles Lubin, Chris J. Maddison, Christopher Morris, Dimitri J. Papageorgiou, Augustin Parjadis, Sebastian Pokutta, Antoine Prouvost, Lara Scavuzzo, Giulia Zarpellon, Linxin Yang, Sha Lai, Akang Wang, Xiaodong Luo, et al. (16 additional authors not shown)

    Abstract: Combinatorial optimization is a well-established area in operations research and computer science. Until recently, its methods have focused on solving problem instances in isolation, ignoring that they often stem from related data distributions in practice. However, recent years have seen a surge of interest in using machine learning as a new approach for solving combinatorial problems, either dir…

    Submitted 17 March, 2022; v1 submitted 4 March, 2022; originally announced March 2022.

    Comments: NeurIPS 2021 competition. arXiv admin note: text overlap with arXiv:2112.12251 by other authors

  15. arXiv:2111.07736  [pdf, other]

    cs.LG cs.AI

    Continual Learning via Local Module Composition

    Authors: Oleksiy Ostapenko, Pau Rodriguez, Massimo Caccia, Laurent Charlin

    Abstract: Modularity is a compelling solution to continual learning (CL), the problem of modeling sequences of related tasks. Learning and then composing modules to solve different tasks provides an abstraction to address the principal challenges of CL including catastrophic forgetting, backward and forward transfer across tasks, and sub-linear model growth. We introduce local module composition (LMC), an a…

    Submitted 15 November, 2021; originally announced November 2021.

    Journal ref: NeurIPS 2021

  16. arXiv:2108.01005  [pdf, other]

    cs.LG

    Sequoia: A Software Framework to Unify Continual Learning Research

    Authors: Fabrice Normandin, Florian Golemo, Oleksiy Ostapenko, Pau Rodriguez, Matthew D Riemer, Julio Hurtado, Khimya Khetarpal, Ryan Lindeborg, Lucas Cecchi, Timothée Lesort, Laurent Charlin, Irina Rish, Massimo Caccia

    Abstract: The field of Continual Learning (CL) seeks to develop algorithms that accumulate knowledge and skills over time through interaction with non-stationary environments. In practice, a plethora of evaluation procedures (settings) and algorithmic solutions (methods) exist, each with their own potentially disjoint set of assumptions. This variety makes measuring progress in CL difficult. We propose a ta…

    Submitted 5 June, 2023; v1 submitted 2 August, 2021; originally announced August 2021.

  17. arXiv:2106.04799  [pdf, other]

    cs.LG

    Pretraining Representations for Data-Efficient Reinforcement Learning

    Authors: Max Schwarzer, Nitarshan Rajkumar, Michael Noukhovitch, Ankesh Anand, Laurent Charlin, Devon Hjelm, Philip Bachman, Aaron Courville

    Abstract: Data efficiency is a key challenge for deep reinforcement learning. We address this problem by using unlabeled data to pretrain an encoder which is then finetuned on a small amount of task-specific data. To encourage learning representations which capture diverse aspects of the underlying MDP, we employ a combination of latent dynamics modelling and unsupervised goal-conditioned RL. When limited t…

    Submitted 9 June, 2021; originally announced June 2021.

  18. arXiv:2104.07763  [pdf, other]

    cs.CY cs.AI cs.CL cs.HC

    Comparative Study of Learning Outcomes for Online Learning Platforms

    Authors: Francois St-Hilaire, Nathan Burns, Robert Belfer, Muhammad Shayan, Ariella Smofsky, Dung Do Vu, Antoine Frau, Joseph Potochny, Farid Faraji, Vincent Pavero, Neroli Ko, Ansona Onyi Ching, Sabina Elkins, Anush Stepanyan, Adela Matajova, Laurent Charlin, Yoshua Bengio, Iulian Vlad Serban, Ekaterina Kochmar

    Abstract: Personalization and active learning are key aspects to successful learning. These aspects are important to address in intelligent educational applications, as they help systems to adapt and close the gap between students with varying abilities, which becomes increasingly important in the context of online and distance learning. We run a comparative head-to-head study of learning outcomes for two p…

    Submitted 15 April, 2021; originally announced April 2021.

    Comments: 14 pages, 3 figures, 2 tables, accepted at AIED 2021 (2021 Conference on Artificial Intelligence in Education)

    ACM Class: I.2.0; I.2.1; I.2.7; K.3.1; G.4

  19. arXiv:2103.10226  [pdf, other]

    cs.LG cs.CV

    Beyond Trivial Counterfactual Explanations with Diverse Valuable Explanations

    Authors: Pau Rodriguez, Massimo Caccia, Alexandre Lacoste, Lee Zamparo, Issam Laradji, Laurent Charlin, David Vazquez

    Abstract: Explainability for machine learning models has gained considerable attention within the research community given the importance of deploying more reliable machine-learning systems. In computer vision applications, generative counterfactual methods indicate how to perturb a model's input to change its prediction, providing details about the model's decision-making. Current methods tend to generate…

    Submitted 11 November, 2021; v1 submitted 18 March, 2021; originally announced March 2021.

    Comments: ICCV 2021

  20. arXiv:2010.14235  [pdf, ps, other]

    cs.CL cs.AI

    Multi-XScience: A Large-scale Dataset for Extreme Multi-document Summarization of Scientific Articles

    Authors: Yao Lu, Yue Dong, Laurent Charlin

    Abstract: Multi-document summarization is a challenging task for which there exists little large-scale datasets. We propose Multi-XScience, a large-scale multi-document summarization dataset created from scientific articles. Multi-XScience introduces a challenging multi-document summarization task: writing the related-work section of a paper based on its abstract and the articles it references. Our work is…

    Submitted 27 October, 2020; originally announced October 2020.

    Comments: EMNLP 2020

  21. arXiv:2009.06415  [pdf, other]

    cs.CV cs.AI

    Synbols: Probing Learning Algorithms with Synthetic Datasets

    Authors: Alexandre Lacoste, Pau Rodríguez, Frédéric Branchaud-Charron, Parmida Atighehchian, Massimo Caccia, Issam Laradji, Alexandre Drouin, Matt Craddock, Laurent Charlin, David Vázquez

    Abstract: Progress in the field of machine learning has been fueled by the introduction of benchmark datasets pushing the limits of existing algorithms. Enabling the design of datasets to test specific properties and failure modes of learning algorithms is thus a problem of high interest, as it has a direct impact on innovation in the field. In this sense, we introduce Synbols -- Synthetic Symbols -- a tool…

    Submitted 4 November, 2020; v1 submitted 14 September, 2020; originally announced September 2020.

  22. arXiv:2005.06616  [pdf, other]

    cs.CY cs.AI cs.CL cs.HC cs.LG

    A Large-Scale, Open-Domain, Mixed-Interface Dialogue-Based ITS for STEM

    Authors: Iulian Vlad Serban, Varun Gupta, Ekaterina Kochmar, Dung D. Vu, Robert Belfer, Joelle Pineau, Aaron Courville, Laurent Charlin, Yoshua Bengio

    Abstract: We present Korbit, a large-scale, open-domain, mixed-interface, dialogue-based intelligent tutoring system (ITS). Korbit uses machine learning, natural language processing and reinforcement learning to provide interactive, personalized learning online. Korbit has been designed to easily scale to thousands of subjects, by automating, standardizing and simplifying the content creation process. Unlik…

    Submitted 5 May, 2020; originally announced May 2020.

    Comments: 6 pages, 1 figure, 1 table, accepted for publication in the 21st International Conference on Artificial Intelligence in Education (AIED 2020)

    ACM Class: I.2.0; I.2.1; I.2.7; K.3.1; G.4

  23. arXiv:2003.05856  [pdf, other]

    cs.AI cs.LG

    Online Fast Adaptation and Knowledge Accumulation: a New Approach to Continual Learning

    Authors: Massimo Caccia, Pau Rodriguez, Oleksiy Ostapenko, Fabrice Normandin, Min Lin, Lucas Caccia, Issam Laradji, Irina Rish, Alexandre Lacoste, David Vazquez, Laurent Charlin

    Abstract: Continual learning studies agents that learn from streams of tasks without forgetting previous ones while adapting to new ones. Two recent continual-learning scenarios have opened new avenues of research. In meta-continual learning, the model is pre-trained to minimize catastrophic forgetting of previous tasks. In continual-meta learning, the aim is to train agents for faster remembering of previo…

    Submitted 20 January, 2021; v1 submitted 12 March, 2020; originally announced March 2020.

    Journal ref: NeurIPS 2020

  24. IG-RL: Inductive Graph Reinforcement Learning for Massive-Scale Traffic Signal Control

    Authors: François-Xavier Devailly, Denis Larocque, Laurent Charlin

    Abstract: Scaling adaptive traffic-signal control involves dealing with combinatorial state and action spaces. Multi-agent reinforcement learning attempts to address this challenge by distributing control to specialized agents. However, specialization hinders generalization and transferability, and the computational graphs underlying neural-networks architectures -- dominating in the multi-agent setting --…

    Submitted 20 September, 2021; v1 submitted 6 March, 2020; originally announced March 2020.

    Comments: 11 pages, 10 figures, 1 table. IEEE Transactions on Intelligent Transportation Systems (2021)

  25. arXiv:1908.04742  [pdf, other]

    cs.LG stat.ML

    Online Continual Learning with Maximally Interfered Retrieval

    Authors: Rahaf Aljundi, Lucas Caccia, Eugene Belilovsky, Massimo Caccia, Min Lin, Laurent Charlin, Tinne Tuytelaars

    Abstract: Continual learning, the setting where a learning agent is faced with a never ending stream of data, continues to be a great challenge for modern machine learning systems. In particular the online or "single-pass through the data" setting has gained attention recently as a natural setting that is difficult to tackle. Methods based on replay, either generative or from a stored memory, have been show…

    Submitted 29 October, 2019; v1 submitted 11 August, 2019; originally announced August 2019.

    Journal ref: NeurIPS 2019

  26. arXiv:1906.01629  [pdf, other]

    cs.LG math.OC stat.ML

    Exact Combinatorial Optimization with Graph Convolutional Neural Networks

    Authors: Maxime Gasse, Didier Chételat, Nicola Ferroni, Laurent Charlin, Andrea Lodi

    Abstract: Combinatorial optimization problems are typically tackled by the branch-and-bound paradigm. We propose a new graph convolutional neural network model for learning branch-and-bound variable selection policies, which leverages the natural variable-constraint bipartite graph representation of mixed-integer linear programs. We train our model via imitation learning from the strong branching expert rul…

    Submitted 30 October, 2019; v1 submitted 4 June, 2019; originally announced June 2019.

    Comments: Accepted paper at the NeurIPS 2019 conference

  27. arXiv:1906.00654  [pdf, other]

    cs.LG cs.SD eess.AS stat.ML

    Continual Learning of New Sound Classes using Generative Replay

    Authors: Zhepei Wang, Cem Subakan, Efthymios Tzinis, Paris Smaragdis, Laurent Charlin

    Abstract: Continual learning consists in incrementally training a model on a sequence of datasets and testing on the union of all datasets. In this paper, we examine continual learning for the problem of sound classification, in which we wish to refine already trained models to learn new sound classes. In practice one does not want to maintain all past training data and retrain from scratch, but naively upd…

    Submitted 3 June, 2019; originally announced June 2019.

  28. Session-based Social Recommendation via Dynamic Graph Attention Networks

    Authors: Weiping Song, Yifan Wang, Laurent Charlin, Ming Zhang, Jian Tang

    Abstract: Online communities such as Facebook and Twitter are enormously popular and have become an essential part of the daily life of many of their users. Through these platforms, users can discover and create information that others will then consume. In that context, recommending relevant information to users becomes critical for viability. However, recommendation in online communities is a challenging…

    Submitted 15 April, 2019; v1 submitted 25 February, 2019; originally announced February 2019.

    Comments: Published as a conference paper at WSDM2019. Source code and data are available online

  29. arXiv:1812.07617  [pdf, other]

    cs.LG cs.CL cs.IR stat.ML

    Towards Deep Conversational Recommendations

    Authors: Raymond Li, Samira Kahou, Hannes Schulz, Vincent Michalski, Laurent Charlin, Chris Pal

    Abstract: There has been growing interest in using neural networks and deep learning techniques to create dialogue systems. Conversational recommendation is an interesting setting for the scientific exploration of dialogue with natural language as the associated discourse involves goal-driven dialogue that often transforms naturally into more free-form chat. This paper provides two contributions. First, unt…

    Submitted 4 March, 2019; v1 submitted 18 December, 2018; originally announced December 2018.

    Comments: 17 pages, 5 figures, Accepted at 32nd Conference on Neural Information Processing Systems (NeurIPS 2018), Montréal, Canada

  30. arXiv:1811.02549  [pdf, other]

    cs.CL cs.LG

    Language GANs Falling Short

    Authors: Massimo Caccia, Lucas Caccia, William Fedus, Hugo Larochelle, Joelle Pineau, Laurent Charlin

    Abstract: Generating high-quality text with sufficient diversity is essential for a wide range of Natural Language Generation (NLG) tasks. Maximum-Likelihood (MLE) models trained with teacher forcing have consistently been reported as weak baselines, where poor performance is attributed to exposure bias (Bengio et al., 2015; Ranzato et al., 2015); at inference time, the model is fed its own prediction inste…

    Submitted 19 February, 2020; v1 submitted 6 November, 2018; originally announced November 2018.

    Journal ref: ICLR 2020 - Proceedings of the Seventh International Conference on Learning Representation

  31. arXiv:1808.06581  [pdf, other]

    cs.IR cs.LG stat.ML

    The Deconfounded Recommender: A Causal Inference Approach to Recommendation

    Authors: Yixin Wang, Dawen Liang, Laurent Charlin, David M. Blei

    Abstract: The goal of recommendation is to show users items that they will like. Though usually framed as a prediction, the spirit of recommendation is to answer an interventional question---for each user and movie, what would the rating be if we "forced" the user to watch the movie? To this end, we develop a causal approach to recommendation, one where watching a movie is a "treatment" and a user's rating…

    Submitted 27 May, 2019; v1 submitted 20 August, 2018; originally announced August 2018.

    Comments: 15 pages

  32. arXiv:1806.04342  [pdf, other]

    stat.ML cs.LG

    Focused Hierarchical RNNs for Conditional Sequence Processing

    Authors: Nan Rosemary Ke, Konrad Zolna, Alessandro Sordoni, Zhouhan Lin, Adam Trischler, Yoshua Bengio, Joelle Pineau, Laurent Charlin, Chris Pal

    Abstract: Recurrent Neural Networks (RNNs) with attention mechanisms have obtained state-of-the-art results for many sequence processing tasks. Most of these models use a simple form of encoder with attention that looks over the entire sequence and assigns a weight to each token independently. We present a mechanism for focusing RNN encoders for sequence modelling tasks which allows them to attend to key pa…

    Submitted 12 June, 2018; originally announced June 2018.

    Comments: To appear at ICML 2018

  33. arXiv:1711.02326  [pdf, other]

    cs.AI cs.LG cs.NE stat.ML

    Sparse Attentive Backtracking: Long-Range Credit Assignment in Recurrent Networks

    Authors: Nan Rosemary Ke, Anirudh Goyal, Olexa Bilaniuk, Jonathan Binas, Laurent Charlin, Chris Pal, Yoshua Bengio

    Abstract: A major drawback of backpropagation through time (BPTT) is the difficulty of learning long-term dependencies, coming from having to propagate credit information backwards through every single step of the forward computation. This makes BPTT both computationally impractical and biologically implausible. For this reason, full backpropagation through time is rarely used on long sequences, and truncat…

    Submitted 7 November, 2017; originally announced November 2017.

  34. arXiv:1710.02248  [pdf, other]

    cs.LG cs.AI stat.ML

    Learnable Explicit Density for Continuous Latent Space and Variational Inference

    Authors: Chin-Wei Huang, Ahmed Touati, Laurent Dinh, Michal Drozdzal, Mohammad Havaei, Laurent Charlin, Aaron Courville

    Abstract: In this paper, we study two aspects of the variational autoencoder (VAE): the prior distribution over the latent variables and its corresponding posterior. First, we decompose the learning of VAEs into layerwise density estimation, and argue that having a flexible prior is beneficial to both sample generation and inference. Second, we analyze the family of inverse autoregressive flows (inverse AF)…

    Submitted 5 October, 2017; originally announced October 2017.

    Comments: 2 figures, 5 pages, submitted to ICML Principled Approaches to Deep Learning workshop

  35. arXiv:1611.06216  [pdf, other]

    cs.CL cs.AI cs.NE

    Generative Deep Neural Networks for Dialogue: A Short Review

    Authors: Iulian Vlad Serban, Ryan Lowe, Laurent Charlin, Joelle Pineau

    Abstract: Researchers have recently started investigating deep neural networks for dialogue applications. In particular, generative sequence-to-sequence (Seq2Seq) models have shown promising results for unstructured tasks, such as word-level dialogue response generation. The hope is that such models will be able to leverage massive amounts of data to learn meaningful natural language representations and res…

    Submitted 18 November, 2016; originally announced November 2016.

    Comments: 6 pages, 1 figure, 3 tables; NIPS 2016 workshop on Learning Methods for Dialogue

    ACM Class: I.5.1; I.2.7

  36. arXiv:1605.06069  [pdf, other]

    cs.CL cs.AI cs.LG cs.NE

    A Hierarchical Latent Variable Encoder-Decoder Model for Generating Dialogues

    Authors: Iulian Vlad Serban, Alessandro Sordoni, Ryan Lowe, Laurent Charlin, Joelle Pineau, Aaron Courville, Yoshua Bengio

    Abstract: Sequential data often possesses a hierarchical structure with complex dependencies between subsequences, such as found between the utterances in a dialogue. In an effort to model this kind of generative process, we propose a neural network-based generative architecture, with latent stochastic variables that span a variable number of time steps. We apply the proposed model to the task of dialogue r…

    Submitted 13 June, 2016; v1 submitted 19 May, 2016; originally announced May 2016.

    Comments: 15 pages, 5 tables, 4 figures

    ACM Class: I.5.1; I.2.7

  37. arXiv:1605.05414  [pdf, other]

    cs.CL cs.LG

    On the Evaluation of Dialogue Systems with Next Utterance Classification

    Authors: Ryan Lowe, Iulian V. Serban, Mike Noseworthy, Laurent Charlin, Joelle Pineau

    Abstract: An open challenge in constructing dialogue systems is developing methods for automatically learning dialogue strategies from large amounts of unlabelled data. Recent work has proposed Next-Utterance-Classification (NUC) as a surrogate task for building dialogue systems from text data. In this paper we investigate the performance of humans on this task to validate the relevance of NUC as a method o…

    Submitted 22 July, 2016; v1 submitted 17 May, 2016; originally announced May 2016.

    Comments: Accepted to SIGDIAL 2016 (short paper). 5 pages

  38. arXiv:1603.08023  [pdf, other]

    cs.CL cs.AI cs.LG cs.NE

    How NOT To Evaluate Your Dialogue System: An Empirical Study of Unsupervised Evaluation Metrics for Dialogue Response Generation

    Authors: Chia-Wei Liu, Ryan Lowe, Iulian V. Serban, Michael Noseworthy, Laurent Charlin, Joelle Pineau

    Abstract: We investigate evaluation metrics for dialogue response generation systems where supervised labels, such as task completion, are not available. Recent works in response generation have adopted metrics from machine translation to compare a model's generated response to a single target response. We show that these metrics correlate very weakly with human judgements in the non-technical Twitter domai…

    Submitted 3 January, 2017; v1 submitted 25 March, 2016; originally announced March 2016.

    Comments: First 4 authors had equal contribution. 13 pages, 5 tables, 6 figures. EMNLP 2016

  39. arXiv:1512.05742  [pdf, other]

    cs.CL cs.AI cs.HC cs.LG stat.ML

    A Survey of Available Corpora for Building Data-Driven Dialogue Systems

    Authors: Iulian Vlad Serban, Ryan Lowe, Peter Henderson, Laurent Charlin, Joelle Pineau

    Abstract: During the past decade, several areas of speech and language understanding have witnessed substantial breakthroughs from the use of data-driven models. In the area of dialogue systems, the trend is less obvious, and most practical systems are still built through significant engineering and expert knowledge. Nevertheless, several recent results suggest that data-driven approaches are feasible and q…

    Submitted 20 March, 2017; v1 submitted 17 December, 2015; originally announced December 2015.

    Comments: 56 pages including references and appendix, 5 tables and 1 figure; Under review for the Dialogue & Discourse journal. Update: paper has been rewritten and now includes several new datasets

    MSC Class: 68T01; 68T05; 68T35; 68T50

    ACM Class: I.2.6; I.2.7; I.2.1

  40. arXiv:1510.07025  [pdf, other

    stat.ML cs.IR cs.LG

    Modeling User Exposure in Recommendation

    Authors: Dawen Liang, Laurent Charlin, James McInerney, David M. Blei

    Abstract: Collaborative filtering analyzes user preferences for items (e.g., books, movies, restaurants, academic papers) by exploiting the similarity patterns across users. In implicit feedback settings, all the items, including the ones that a user did not consume, are taken into consideration. But this assumption does not accord with the common sense understanding that users have a limited scope and awar…

    Submitted 4 February, 2016; v1 submitted 23 October, 2015; originally announced October 2015.

    Comments: 11 pages, 4 figures. WWW'16

  41. arXiv:1509.04640  [pdf, other

    cs.LG cs.IR stat.ML

    Dynamic Poisson Factorization

    Authors: Laurent Charlin, Rajesh Ranganath, James McInerney, David M. Blei

    Abstract: Models for recommender systems use latent factors to explain the preferences and behaviors of users with respect to a set of items (e.g., movies, books, academic papers). Typically, the latent factors are assumed to be static and, given these factors, the observed preferences and behaviors of users are assumed to be generated without order. These assumptions limit the explorative and predictive ca…

    Submitted 15 September, 2015; originally announced September 2015.

    Comments: RecSys 2015

  42. arXiv:1411.2581  [pdf, other

    stat.ML cs.LG

    Deep Exponential Families

    Authors: Rajesh Ranganath, Linpeng Tang, Laurent Charlin, David M. Blei

    Abstract: We describe deep exponential families (DEFs), a class of latent variable models that are inspired by the hidden structures used in deep neural networks. DEFs capture a hierarchy of dependencies between latent variables, and are easily generalized to many settings through exponential families. We perform inference using recent "black box" variational inference techniques. We then evaluate…

    Submitted 10 November, 2014; originally announced November 2014.

  43. arXiv:1206.4647  [pdf

    cs.LG cs.AI cs.IR

    Active Learning for Matching Problems

    Authors: Laurent Charlin, Rich Zemel, Craig Boutilier

    Abstract: Effective learning of user preferences is critical to easing user burden in various types of matching problems. Equally important is active query selection to further reduce the amount of preference information users must provide. We address the problem of active learning of user preferences for matching problems, introducing a novel method for determining probabilistic matchings, and developing s…

    Submitted 18 June, 2012; originally announced June 2012.

    Comments: ICML2012

  44. arXiv:1206.3291  [pdf

    cs.AI

    Hierarchical POMDP Controller Optimization by Likelihood Maximization

    Authors: Marc Toussaint, Laurent Charlin, Pascal Poupart

    Abstract: Planning can often be simplified by decomposing the task into smaller tasks arranged hierarchically. Charlin et al. [4] recently showed that the hierarchy discovery problem can be framed as a non-convex optimization problem. However, the inherent computational difficulty of solving such an optimization problem makes it hard to scale to real-world problems. In another line of research, Toussaint et al.…

    Submitted 13 June, 2012; originally announced June 2012.

    Comments: Appears in Proceedings of the Twenty-Fourth Conference on Uncertainty in Artificial Intelligence (UAI2008)

    Report number: UAI-P-2008-PG-562-570

  45. arXiv:1202.3706  [pdf

    cs.IR cs.AI

    A Framework for Optimizing Paper Matching

    Authors: Laurent Charlin, Richard S. Zemel, Craig Boutilier

    Abstract: At the heart of many scientific conferences is the problem of matching submitted papers to suitable reviewers. Arriving at a good assignment is a major and important challenge for any conference organizer. In this paper we propose a framework to optimize paper-to-reviewer assignments. Our framework uses suitability scores to measure pairwise affinity between papers and reviewers. We show how learn…

    Submitted 14 February, 2012; originally announced February 2012.

    Report number: UAI-P-2011-PG-86-95