Search | arXiv e-print repository

Policy Diversity for Cooperative Agents

Authors: Mingxi Tan, Andong Tian, Ludovic Denoyer

Abstract: Standard cooperative multi-agent reinforcement learning (MARL) methods aim to find the optimal team cooperative policy to complete a task. However there may exist multiple different ways of cooperating, which usually are very needed by domain experts. Therefore, identifying a set of significantly different policies can alleviate the task complexity for them. Unfortunately, there is a general lack… ▽ More Standard cooperative multi-agent reinforcement learning (MARL) methods aim to find the optimal team cooperative policy to complete a task. However there may exist multiple different ways of cooperating, which usually are very needed by domain experts. Therefore, identifying a set of significantly different policies can alleviate the task complexity for them. Unfortunately, there is a general lack of effective policy diversity approaches specifically designed for the multi-agent domain. In this work, we propose a method called Moment-Matching Policy Diversity to alleviate this problem. This method can generate different team policies to varying degrees by formalizing the difference between team policies as the difference in actions of selected agents in different policies. Theoretically, we show that our method is a simple way to implement a constrained optimization problem that regularizes the difference between two trajectory distributions by using the maximum mean discrepancy. The effectiveness of our approach is demonstrated on a challenging team-based shooter. △ Less

Submitted 28 August, 2023; originally announced August 2023.

arXiv:2308.09629 [pdf, other]

Learning Computational Efficient Bots with Costly Features

Authors: Anthony Kobanda, Valliappan C. A., Joshua Romoff, Ludovic Denoyer

Abstract: Deep reinforcement learning (DRL) techniques have become increasingly used in various fields for decision-making processes. However, a challenge that often arises is the trade-off between both the computational efficiency of the decision-making process and the ability of the learned agent to solve a particular task. This is particularly critical in real-time settings such as video games where the… ▽ More Deep reinforcement learning (DRL) techniques have become increasingly used in various fields for decision-making processes. However, a challenge that often arises is the trade-off between both the computational efficiency of the decision-making process and the ability of the learned agent to solve a particular task. This is particularly critical in real-time settings such as video games where the agent needs to take relevant decisions at a very high frequency, with a very limited inference time. In this work, we propose a generic offline learning approach where the computation cost of the input features is taken into account. We derive the Budgeted Decision Transformer as an extension of the Decision Transformer that incorporates cost constraints to limit its cost at inference. As a result, the model can dynamically choose the best input features at each timestep. We demonstrate the effectiveness of our method on several tasks, including D4RL benchmarks and complex 3D environments similar to those found in video games, and show that it can achieve similar performance while using significantly fewer computational resources compared to classical approaches. △ Less

Submitted 18 August, 2023; originally announced August 2023.

arXiv:2211.10445 [pdf, other]

Building a Subspace of Policies for Scalable Continual Learning

Authors: Jean-Baptiste Gaya, Thang Doan, Lucas Caccia, Laure Soulier, Ludovic Denoyer, Roberta Raileanu

Abstract: The ability to continuously acquire new knowledge and skills is crucial for autonomous agents. Existing methods are typically based on either fixed-size models that struggle to learn a large number of diverse behaviors, or growing-size models that scale poorly with the number of tasks. In this work, we aim to strike a better balance between an agent's size and performance by designing a method tha… ▽ More The ability to continuously acquire new knowledge and skills is crucial for autonomous agents. Existing methods are typically based on either fixed-size models that struggle to learn a large number of diverse behaviors, or growing-size models that scale poorly with the number of tasks. In this work, we aim to strike a better balance between an agent's size and performance by designing a method that grows adaptively depending on the task sequence. We introduce Continual Subspace of Policies (CSP), a new approach that incrementally builds a subspace of policies for training a reinforcement learning agent on a sequence of tasks. The subspace's high expressivity allows CSP to perform well for many different tasks while growing sublinearly with the number of tasks. Our method does not suffer from forgetting and displays positive transfer to new tasks. CSP outperforms a number of popular baselines on a wide range of scenarios from two challenging domains, Brax (locomotion) and Continual World (manipulation). △ Less

Submitted 2 March, 2023; v1 submitted 18 November, 2022; originally announced November 2022.

Comments: Accepted at ICLR2023 (notable-top-25%). website: https://continual-subspace-policies-streamlit-app-gofujp.streamlit.app/ code: https://github.com/facebookresearch/salina/tree/main/salina_cl

arXiv:2209.13224 [pdf, other]

Regularized Soft Actor-Critic for Behavior Transfer Learning

Authors: Mingxi Tan, Andong Tian, Ludovic Denoyer

Abstract: Existing imitation learning methods mainly focus on making an agent effectively mimic a demonstrated behavior, but do not address the potential contradiction between the behavior style and the objective of a task. There is a general lack of efficient methods that allow an agent to partially imitate a demonstrated behavior to varying degrees, while completing the main objective of a task. In this p… ▽ More Existing imitation learning methods mainly focus on making an agent effectively mimic a demonstrated behavior, but do not address the potential contradiction between the behavior style and the objective of a task. There is a general lack of efficient methods that allow an agent to partially imitate a demonstrated behavior to varying degrees, while completing the main objective of a task. In this paper we propose a method called Regularized Soft Actor-Critic which formulates the main task and the imitation task under the Constrained Markov Decision Process framework (CMDP). The main task is defined as the maximum entropy objective used in Soft Actor-Critic (SAC) and the imitation task is defined as a constraint. We evaluate our method on continuous control tasks relevant to video games applications. △ Less

Submitted 27 September, 2022; originally announced September 2022.

Comments: 13 pages, 5 figures, paper accepted by IEEE CoG2022

arXiv:2205.15918 [pdf, other]

Interactive Query Clarification and Refinement via User Simulation

Authors: Pierre Erbacher, Ludovic Denoyer, Laure Soulier

Abstract: When users initiate search sessions, their queries are often unclear or might lack of context; this resulting in inefficient document ranking. Multiple approaches have been proposed by the Information Retrieval community to add context and retrieve documents aligned with users' intents. While some work focus on query disambiguation using users' browsing history, a recent line of work proposes to i… ▽ More When users initiate search sessions, their queries are often unclear or might lack of context; this resulting in inefficient document ranking. Multiple approaches have been proposed by the Information Retrieval community to add context and retrieve documents aligned with users' intents. While some work focus on query disambiguation using users' browsing history, a recent line of work proposes to interact with users by asking clarification questions or/and proposing clarification panels. However, these approaches count either a limited number (i.e., 1) of interactions with user or log-based interactions. In this paper, we propose and evaluate a fully simulated query clarification framework allowing multi-turn interactions between IR systems and user agents. △ Less

Submitted 31 May, 2022; originally announced May 2022.

arXiv:2203.11369 [pdf, other]

Temporal Abstractions-Augmented Temporally Contrastive Learning: An Alternative to the Laplacian in RL

Authors: Akram Erraqabi, Marlos C. Machado, Mingde Zhao, Sainbayar Sukhbaatar, Alessandro Lazaric, Ludovic Denoyer, Yoshua Bengio

Abstract: In reinforcement learning, the graph Laplacian has proved to be a valuable tool in the task-agnostic setting, with applications ranging from skill discovery to reward sha**. Recently, learning the Laplacian representation has been framed as the optimization of a temporally-contrastive objective to overcome its computational limitations in large (or continuous) state spaces. However, this approac… ▽ More In reinforcement learning, the graph Laplacian has proved to be a valuable tool in the task-agnostic setting, with applications ranging from skill discovery to reward sha**. Recently, learning the Laplacian representation has been framed as the optimization of a temporally-contrastive objective to overcome its computational limitations in large (or continuous) state spaces. However, this approach requires uniform access to all states in the state space, overlooking the exploration problem that emerges during the representation learning process. In this work, we propose an alternative method that is able to recover, in a non-uniform-prior setting, the expressiveness and the desired properties of the Laplacian representation. We do so by combining the representation learning with a skill-based covering policy, which provides a better training distribution to extend and refine the representation. We also show that a simple augmentation of the representation objective with the learned temporal abstractions improves dynamics-awareness and helps exploration. We find that our method succeeds as an alternative to the Laplacian in the non-uniform setting and scales to challenging continuous control environments. Finally, even if our method is not optimized for skill discovery, the learned skills can successfully solve difficult continuous navigation tasks with sparse rewards, where standard skill discovery approaches are no so effective. △ Less

Submitted 21 March, 2022; originally announced March 2022.

arXiv:2203.06215 [pdf, other]

Can I see an Example? Active Learning the Long Tail of Attributes and Relations

Authors: Tyler L. Hayes, Maximilian Nickel, Christopher Kanan, Ludovic Denoyer, Arthur Szlam

Abstract: There has been significant progress in creating machine learning models that identify objects in scenes along with their associated attributes and relationships; however, there is a large gap between the best models and human capabilities. One of the major reasons for this gap is the difficulty in collecting sufficient amounts of annotated relations and attributes for training these systems. While… ▽ More There has been significant progress in creating machine learning models that identify objects in scenes along with their associated attributes and relationships; however, there is a large gap between the best models and human capabilities. One of the major reasons for this gap is the difficulty in collecting sufficient amounts of annotated relations and attributes for training these systems. While some attributes and relations are abundant, the distribution in the natural world and existing datasets is long tailed. In this paper, we address this problem by introducing a novel incremental active learning framework that asks for attributes and relations in visual scenes. While conventional active learning methods ask for labels of specific examples, we flip this framing to allow agents to ask for examples from specific categories. Using this framing, we introduce an active sampling method that asks for examples from the tail of the data distribution and show that it outperforms classical active learning methods on Visual Genome. △ Less

Submitted 7 October, 2022; v1 submitted 11 March, 2022; originally announced March 2022.

Comments: To appear in the British Machine Vision Conference (BMVC-2022)

arXiv:2201.03435 [pdf, other]

State of the Art of User Simulation approaches for conversational information retrieval

Authors: Pierre Erbacher, Laure Soulier, Ludovic Denoyer

Abstract: Conversational Information Retrieval (CIR) is an emerging field of Information Retrieval (IR) at the intersection of interactive IR and dialogue systems for open domain information needs. In order to optimize these interactions and enhance the user experience, it is necessary to improve IR models by taking into account sequential heterogeneous user-system interactions. Reinforcement learning has e… ▽ More Conversational Information Retrieval (CIR) is an emerging field of Information Retrieval (IR) at the intersection of interactive IR and dialogue systems for open domain information needs. In order to optimize these interactions and enhance the user experience, it is necessary to improve IR models by taking into account sequential heterogeneous user-system interactions. Reinforcement learning has emerged as a paradigm particularly suited to optimize sequential decision making in many domains and has recently appeared in IR. However, training these systems by reinforcement learning on users is not feasible. One solution is to train IR systems on user simulations that model the behavior of real users. Our contribution is twofold: 1)reviewing the literature on user modeling and user simulation for information access, and 2) discussing the different research perspectives for user simulations in the context of CIR △ Less

Submitted 10 January, 2022; originally announced January 2022.

Comments: SIM4IR - Sigir Workshop 2021

arXiv:2110.14457 [pdf, other]

Direct then Diffuse: Incremental Unsupervised Skill Discovery for State Covering and Goal Reaching

Authors: Pierre-Alexandre Kamienny, Jean Tarbouriech, Sylvain Lamprier, Alessandro Lazaric, Ludovic Denoyer

Abstract: Learning meaningful behaviors in the absence of reward is a difficult problem in reinforcement learning. A desirable and challenging unsupervised objective is to learn a set of diverse skills that provide a thorough coverage of the state space while being directed, i.e., reliably reaching distinct regions of the environment. In this paper, we build on the mutual information framework for skill dis… ▽ More Learning meaningful behaviors in the absence of reward is a difficult problem in reinforcement learning. A desirable and challenging unsupervised objective is to learn a set of diverse skills that provide a thorough coverage of the state space while being directed, i.e., reliably reaching distinct regions of the environment. In this paper, we build on the mutual information framework for skill discovery and introduce UPSIDE, which addresses the coverage-directedness trade-off in the following ways: 1) We design policies with a decoupled structure of a directed skill, trained to reach a specific region, followed by a diffusing part that induces a local coverage. 2) We optimize policies by maximizing their number under the constraint that each of them reaches distinct regions of the environment (i.e., they are sufficiently discriminable) and prove that this serves as a lower bound to the original mutual information objective. 3) Finally, we compose the learned directed skills into a growing tree that adaptively covers the environment. We illustrate in several navigation and control environments how the skills learned by UPSIDE solve sparse-reward downstream tasks better than existing baselines. △ Less

Submitted 30 April, 2022; v1 submitted 27 October, 2021; originally announced October 2021.

Comments: ICLR 2022

arXiv:2110.07910 [pdf, other]

SaLinA: Sequential Learning of Agents

Authors: Ludovic Denoyer, Alfredo de la Fuente, Song Duong, Jean-Baptiste Gaya, Pierre-Alexandre Kamienny, Daniel H. Thompson

Abstract: SaLinA is a simple library that makes implementing complex sequential learning models easy, including reinforcement learning algorithms. It is built as an extension of PyTorch: algorithms coded with \SALINA{} can be understood in few minutes by PyTorch users and modified easily. Moreover, SaLinA naturally works with multiple CPUs and GPUs at train and test time, thus being a good fit for the large… ▽ More SaLinA is a simple library that makes implementing complex sequential learning models easy, including reinforcement learning algorithms. It is built as an extension of PyTorch: algorithms coded with \SALINA{} can be understood in few minutes by PyTorch users and modified easily. Moreover, SaLinA naturally works with multiple CPUs and GPUs at train and test time, thus being a good fit for the large-scale training use cases. In comparison to existing RL libraries, SaLinA has a very low adoption cost and capture a large variety of settings (model-based RL, batch RL, hierarchical RL, multi-agent RL, etc.). But SaLinA does not only target RL practitioners, it aims at providing sequential learning capabilities to any deep learning programmer. △ Less

Submitted 15 October, 2021; originally announced October 2021.

arXiv:2110.05169 [pdf, other]

Learning a subspace of policies for online adaptation in Reinforcement Learning

Authors: Jean-Baptiste Gaya, Laure Soulier, Ludovic Denoyer

Abstract: Deep Reinforcement Learning (RL) is mainly studied in a setting where the training and the testing environments are similar. But in many practical applications, these environments may differ. For instance, in control systems, the robot(s) on which a policy is learned might differ from the robot(s) on which a policy will run. It can be caused by different internal factors (e.g., calibration issues,… ▽ More Deep Reinforcement Learning (RL) is mainly studied in a setting where the training and the testing environments are similar. But in many practical applications, these environments may differ. For instance, in control systems, the robot(s) on which a policy is learned might differ from the robot(s) on which a policy will run. It can be caused by different internal factors (e.g., calibration issues, system attrition, defective modules) or also by external changes (e.g., weather conditions). There is a need to develop RL methods that generalize well to variations of the training conditions. In this article, we consider the simplest yet hard to tackle generalization setting where the test environment is unknown at train time, forcing the agent to adapt to the system's new dynamics. This online adaptation process can be computationally expensive (e.g., fine-tuning) and cannot rely on meta-RL techniques since there is just a single train environment. To do so, we propose an approach where we learn a subspace of policies within the parameter space. This subspace contains an infinite number of policies that are trained to solve the training environment while having different parameter values. As a consequence, two policies in that subspace process information differently and exhibit different behaviors when facing variations of the train environment. Our experiments carried out over a large variety of benchmarks compare our approach with baselines, including diversity-based methods. In comparison, our approach is simple to tune, does not need any extra component (e.g., discriminator) and learns policies able to gather a high reward on unseen environments. △ Less

Submitted 24 October, 2022; v1 submitted 11 October, 2021; originally announced October 2021.

arXiv:2106.09563 [pdf, other]

On Anytime Learning at Macroscale

Authors: Lucas Caccia, **g Xu, Myle Ott, Marc'Aurelio Ranzato, Ludovic Denoyer

Abstract: In many practical applications of machine learning data arrives sequentially over time in large chunks. Practitioners have then to decide how to allocate their computational budget in order to obtain the best performance at any point in time. Online learning theory for convex optimization suggests that the best strategy is to use data as soon as it arrives. However, this might not be the best stra… ▽ More In many practical applications of machine learning data arrives sequentially over time in large chunks. Practitioners have then to decide how to allocate their computational budget in order to obtain the best performance at any point in time. Online learning theory for convex optimization suggests that the best strategy is to use data as soon as it arrives. However, this might not be the best strategy when using deep non-linear networks, particularly when these perform multiple passes over each chunk of data rendering the overall distribution non i.i.d.. In this paper, we formalize this learning setting in the simplest scenario in which each data chunk is drawn from the same underlying distribution, and make a first attempt at empirically answering the following questions: How long should the learner wait before training on the newly arrived chunks? What architecture should the learner adopt? Should the learner increase capacity over time as more data is observed? We probe this learning setting using convolutional neural networks trained on classic computer vision benchmarks as well as a large transformer model trained on a large-scale language modeling task. Code is available at \url{www.github.com/facebookresearch/ALMA}. △ Less

Submitted 2 August, 2022; v1 submitted 17 June, 2021; originally announced June 2021.

Comments: Accepted at the Conference on Lifelong Learning Agents (CoLLAs) 2022

arXiv:2012.12631 [pdf, other]

Efficient Continual Learning with Modular Networks and Task-Driven Priors

Authors: Tom Veniat, Ludovic Denoyer, Marc'Aurelio Ranzato

Abstract: Existing literature in Continual Learning (CL) has focused on overcoming catastrophic forgetting, the inability of the learner to recall how to perform tasks observed in the past. There are however other desirable properties of a CL system, such as the ability to transfer knowledge from previous tasks and to scale memory and compute sub-linearly with the number of tasks. Since most current benchma… ▽ More Existing literature in Continual Learning (CL) has focused on overcoming catastrophic forgetting, the inability of the learner to recall how to perform tasks observed in the past. There are however other desirable properties of a CL system, such as the ability to transfer knowledge from previous tasks and to scale memory and compute sub-linearly with the number of tasks. Since most current benchmarks focus only on forgetting using short streams of tasks, we first propose a new suite of benchmarks to probe CL algorithms across these new axes. Finally, we introduce a new modular architecture, whose modules represent atomic skills that can be composed to perform a certain task. Learning a task reduces to figuring out which past modules to re-use, and which new modules to instantiate to solve the current task. Our learning algorithm leverages a task-driven prior over the exponential search space of all possible ways to combine modules, enabling efficient learning on long streams of tasks. Our experiments show that this modular architecture and learning algorithm perform competitively on widely used CL benchmarks while yielding superior performance on the more challenging benchmarks we introduce in this work. △ Less

Submitted 12 February, 2021; v1 submitted 23 December, 2020; originally announced December 2020.

Comments: Accepted as a conference paper at ICLR 2021

arXiv:2006.00937 [pdf, ps, other]

Concept Matching for Low-Resource Classification

Authors: Federico Errica, Ludovic Denoyer, Bora Edizel, Fabio Petroni, Vassilis Plachouras, Fabrizio Silvestri, Sebastian Riedel

Abstract: We propose a model to tackle classification tasks in the presence of very little training data. To this aim, we approximate the notion of exact match with a theoretically sound mechanism that computes a probability of matching in the input space. Importantly, the model learns to focus on elements of the input that are relevant for the task at hand; by leveraging highlighted portions of the trainin… ▽ More We propose a model to tackle classification tasks in the presence of very little training data. To this aim, we approximate the notion of exact match with a theoretically sound mechanism that computes a probability of matching in the input space. Importantly, the model learns to focus on elements of the input that are relevant for the task at hand; by leveraging highlighted portions of the training data, an error boosting technique guides the learning process. In practice, it increases the error associated with relevant parts of the input by a given factor. Remarkable results on text classification tasks confirm the benefits of the proposed approach in both balanced and unbalanced cases, thus being of practical use when labeling new examples is expensive. In addition, by inspecting its weights, it is often possible to gather insights on what the model has learned. △ Less

Submitted 1 June, 2020; originally announced June 2020.

arXiv:2005.02934 [pdf, other]

Learning Adaptive Exploration Strategies in Dynamic Environments Through Informed Policy Regularization

Authors: Pierre-Alexandre Kamienny, Matteo Pirotta, Alessandro Lazaric, Thibault Lavril, Nicolas Usunier, Ludovic Denoyer

Abstract: We study the problem of learning exploration-exploitation strategies that effectively adapt to dynamic environments, where the task may change over time. While RNN-based policies could in principle represent such strategies, in practice their training time is prohibitive and the learning process often converges to poor solutions. In this paper, we consider the case where the agent has access to a… ▽ More We study the problem of learning exploration-exploitation strategies that effectively adapt to dynamic environments, where the task may change over time. While RNN-based policies could in principle represent such strategies, in practice their training time is prohibitive and the learning process often converges to poor solutions. In this paper, we consider the case where the agent has access to a description of the task (e.g., a task id or task parameters) at training time, but not at test time. We propose a novel algorithm that regularizes the training of an RNN-based policy using informed policies trained to maximize the reward in each task. This dramatically reduces the sample complexity of training RNN-based policies, without losing their representational power. As a result, our method learns exploration strategies that efficiently balance between gathering information about the unknown and changing task and maximizing the reward over time. We test the performance of our algorithm in a variety of environments where tasks may vary within each episode. △ Less

Submitted 6 May, 2020; originally announced May 2020.

Comments: 18 pages

MSC Class: 68T99

arXiv:1909.04985 [pdf, other]

doi 10.1109/ICDM.2019.00022

Learning Dynamic Author Representations with Temporal Language Models

Authors: Edouard Delasalles, Sylvain Lamprier, Ludovic Denoyer

Abstract: Language models are at the heart of numerous works, notably in the text mining and information retrieval communities. These statistical models aim at extracting word distributions, from simple unigram models to recurrent approaches with latent variables that capture subtle dependencies in texts. However, those models are learned from word sequences only, and authors' identities, as well as publica… ▽ More Language models are at the heart of numerous works, notably in the text mining and information retrieval communities. These statistical models aim at extracting word distributions, from simple unigram models to recurrent approaches with latent variables that capture subtle dependencies in texts. However, those models are learned from word sequences only, and authors' identities, as well as publication dates, are seldom considered. We propose a neural model, based on recurrent language modeling, which aims at capturing language diffusion tendencies in author communities through time. By conditioning language models with author and temporal vector states, we are able to leverage the latent dependencies between the text contexts. This allows us to beat several temporal and non-temporal language baselines on two real-world corpora, and to learn meaningful author representations that vary through time. △ Less

Submitted 11 September, 2019; originally announced September 2019.

Comments: International Conference on Data Mining, ICDM 2019

arXiv:1907.05242 [pdf, other]

Large Memory Layers with Product Keys

Authors: Guillaume Lample, Alexandre Sablayrolles, Marc'Aurelio Ranzato, Ludovic Denoyer, Hervé Jégou

Abstract: This paper introduces a structured memory which can be easily integrated into a neural network. The memory is very large by design and significantly increases the capacity of the architecture, by up to a billion parameters with a negligible computational overhead. Its design and access pattern is based on product keys, which enable fast and exact nearest neighbor search. The ability to increase th… ▽ More This paper introduces a structured memory which can be easily integrated into a neural network. The memory is very large by design and significantly increases the capacity of the architecture, by up to a billion parameters with a negligible computational overhead. Its design and access pattern is based on product keys, which enable fast and exact nearest neighbor search. The ability to increase the number of parameters while kee** the same computational budget lets the overall system strike a better trade-off between prediction accuracy and computation efficiency both at training and test time. This memory layer allows us to tackle very large scale language modeling tasks. In our experiments we consider a dataset with up to 30 billion words, and we plug our memory layer in a state-of-the-art transformer-based architecture. In particular, we found that a memory augmented model with only 12 layers outperforms a baseline transformer model with 24 layers, while being twice faster at inference time. We release our code for reproducibility purposes. △ Less

Submitted 15 December, 2019; v1 submitted 10 July, 2019; originally announced July 2019.

Comments: Advances in Neural Information Processing Systems, 2019

arXiv:1906.09838 [pdf, other]

Binary Stochastic Representations for Large Multi-class Classification

Authors: Thomas Gerald, Aurélia Léon, Nicolas Baskiotis, Ludovic Denoyer

Abstract: Classification with a large number of classes is a key problem in machine learning and corresponds to many real-world applications like tagging of images or textual documents in social networks. If one-vs-all methods usually reach top performance in this context, these approaches suffer from a high inference complexity, linear w.r.t the number of categories. Different models based on the notion of… ▽ More Classification with a large number of classes is a key problem in machine learning and corresponds to many real-world applications like tagging of images or textual documents in social networks. If one-vs-all methods usually reach top performance in this context, these approaches suffer from a high inference complexity, linear w.r.t the number of categories. Different models based on the notion of binary codes have been proposed to overcome this limitation, achieving in a sublinear inference complexity. But they a priori need to decide which binary code to associate to which category before learning using more or less complex heuristics. We propose a new end-to-end model which aims at simultaneously learning to associate binary codes with categories, but also learning to map inputs to binary codes. This approach called Deep Stochastic Neural Codes (DSNC) keeps the sublinear inference complexity but do not need any a priori tuning. Experimental results on different datasets show the effectiveness of the approach w.r.t baseline methods. △ Less

Submitted 24 June, 2019; originally announced June 2019.

arXiv:1906.04980 [pdf, other]

doi 10.18653/v1/P19-1484

Unsupervised Question Answering by Cloze Translation

Authors: Patrick Lewis, Ludovic Denoyer, Sebastian Riedel

Abstract: Obtaining training data for Question Answering (QA) is time-consuming and resource-intensive, and existing QA datasets are only available for limited domains and languages. In this work, we explore to what extent high quality training data is actually required for Extractive QA, and investigate the possibility of unsupervised Extractive QA. We approach this problem by first learning to generate co… ▽ More Obtaining training data for Question Answering (QA) is time-consuming and resource-intensive, and existing QA datasets are only available for limited domains and languages. In this work, we explore to what extent high quality training data is actually required for Extractive QA, and investigate the possibility of unsupervised Extractive QA. We approach this problem by first learning to generate context, question and answer triples in an unsupervised manner, which we then use to synthesize Extractive QA training data automatically. To generate such triples, we first sample random context paragraphs from a large corpus of documents and then random noun phrases or named entity mentions from these paragraphs as answers. Next we convert answers in context to "fill-in-the-blank" cloze questions and finally translate them into natural questions. We propose and compare various unsupervised ways to perform cloze-to-natural question translation, including training an unsupervised NMT model using non-aligned corpora of natural questions and cloze questions as well as a rule-based approach. We find that modern QA models can learn to answer human questions surprisingly well using only synthetic training data. We demonstrate that, without using the SQuAD training data at all, our approach achieves 56.4 F1 on SQuAD v1 (64.5 F1 when the answer is a Named entity mention), outperforming early supervised models. △ Less

Submitted 27 June, 2019; v1 submitted 12 June, 2019; originally announced June 2019.

Comments: To appear in ACL 2019

Journal ref: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, 2019

arXiv:1905.13539 [pdf, other]

Unsupervised Object Segmentation by Redrawing

Authors: Mickaël Chen, Thierry Artières, Ludovic Denoyer

Abstract: Object segmentation is a crucial problem that is usually solved by using supervised learning approaches over very large datasets composed of both images and corresponding object masks. Since the masks have to be provided at pixel level, building such a dataset for any new domain can be very time-consuming. We present ReDO, a new model able to extract objects from images without any annotation in a… ▽ More Object segmentation is a crucial problem that is usually solved by using supervised learning approaches over very large datasets composed of both images and corresponding object masks. Since the masks have to be provided at pixel level, building such a dataset for any new domain can be very time-consuming. We present ReDO, a new model able to extract objects from images without any annotation in an unsupervised way. It relies on the idea that it should be possible to change the textures or colors of the objects without changing the overall distribution of the dataset. Following this assumption, our approach is based on an adversarial architecture where the generator is guided by an input sample: given an image, it extracts the object mask, then redraws a new object at the same location. The generator is controlled by a discriminator that ensures that the distribution of generated images is aligned to the original one. We experiment with this method on different datasets and demonstrate the good quality of extracted masks. △ Less

Submitted 29 November, 2019; v1 submitted 27 May, 2019; originally announced May 2019.

Comments: Presented at NeurIPS 2019

arXiv:1905.11852 [pdf, other]

EDUCE: Explaining model Decisions through Unsupervised Concepts Extraction

Authors: Diane Bouchacourt, Ludovic Denoyer

Abstract: Providing explanations along with predictions is crucial in some text processing tasks. Therefore, we propose a new self-interpretable model that performs output prediction and simultaneously provides an explanation in terms of the presence of particular concepts in the input. To do so, our model's prediction relies solely on a low-dimensional binary representation of the input, where each feature… ▽ More Providing explanations along with predictions is crucial in some text processing tasks. Therefore, we propose a new self-interpretable model that performs output prediction and simultaneously provides an explanation in terms of the presence of particular concepts in the input. To do so, our model's prediction relies solely on a low-dimensional binary representation of the input, where each feature denotes the presence or absence of concepts. The presence of a concept is decided from an excerpt i.e. a small sequence of consecutive words in the text. Relevant concepts for the prediction task at hand are automatically defined by our model, avoiding the need for concept-level annotations. To ease interpretability, we enforce that for each concept, the corresponding excerpts share similar semantics and are differentiable from each others. We experimentally demonstrate the relevance of our approach on text classification and multi-sentiment analysis tasks. △ Less

Submitted 27 September, 2019; v1 submitted 28 May, 2019; originally announced May 2019.

arXiv:1811.06753 [pdf, other]

Stochastic Adaptive Neural Architecture Search for Keyword Spotting

Authors: Tom Véniat, Olivier Schwander, Ludovic Denoyer

Abstract: The problem of keyword spotting i.e. identifying keywords in a real-time audio stream is mainly solved by applying a neural network over successive sliding windows. Due to the difficulty of the task, baseline models are usually large, resulting in a high computational cost and energy consumption level. We propose a new method called SANAS (Stochastic Adaptive Neural Architecture Search) which is a… ▽ More The problem of keyword spotting i.e. identifying keywords in a real-time audio stream is mainly solved by applying a neural network over successive sliding windows. Due to the difficulty of the task, baseline models are usually large, resulting in a high computational cost and energy consumption level. We propose a new method called SANAS (Stochastic Adaptive Neural Architecture Search) which is able to adapt the architecture of the neural network on-the-fly at inference time such that small architectures will be used when the stream is easy to process (silence, low noise, ...) and bigger networks will be used when the task becomes more difficult. We show that this adaptive model can be learned end-to-end by optimizing a trade-off between the prediction performance and the average computational cost per unit of time. Experiments on the Speech Commands dataset show that this approach leads to a high recognition level while being much faster (and/or energy saving) than classical approaches where the network architecture is static. △ Less

Submitted 16 November, 2018; originally announced November 2018.

arXiv:1811.00552 [pdf, other]

Multiple-Attribute Text Style Transfer

Authors: Sandeep Subramanian, Guillaume Lample, Eric Michael Smith, Ludovic Denoyer, Marc'Aurelio Ranzato, Y-Lan Boureau

Abstract: The dominant approach to unsupervised "style transfer" in text is based on the idea of learning a latent representation, which is independent of the attributes specifying its "style". In this paper, we show that this condition is not necessary and is not always met in practice, even with domain adversarial training that explicitly aims at learning such disentangled representations. We thus propose… ▽ More The dominant approach to unsupervised "style transfer" in text is based on the idea of learning a latent representation, which is independent of the attributes specifying its "style". In this paper, we show that this condition is not necessary and is not always met in practice, even with domain adversarial training that explicitly aims at learning such disentangled representations. We thus propose a new model that controls several factors of variation in textual data where this condition on disentanglement is replaced with a simpler mechanism based on back-translation. Our method allows control over multiple attributes, like gender, sentiment, product type, etc., and a more fine-grained control on the trade-off between content preservation and change of style with a pooling operator in the latent space. Our experiments demonstrate that the fully entangled model produces better generations, even when tested on new and more challenging benchmarks comprising reviews with multiple sentences and multiple attributes. △ Less

Submitted 20 September, 2019; v1 submitted 1 November, 2018; originally announced November 2018.

arXiv:1809.01495 [pdf, other]

A Reinforcement Learning-driven Translation Model for Search-Oriented Conversational Systems

Authors: Wafa Aissa, Laure Soulier, Ludovic Denoyer

Abstract: Search-oriented conversational systems rely on information needs expressed in natural language (NL). We focus here on the understanding of NL expressions for building keyword-based queries. We propose a reinforcement-learning-driven translation model framework able to 1) learn the translation from NL expressions to queries in a supervised way, and, 2) to overcome the lack of large-scale dataset by… ▽ More Search-oriented conversational systems rely on information needs expressed in natural language (NL). We focus here on the understanding of NL expressions for building keyword-based queries. We propose a reinforcement-learning-driven translation model framework able to 1) learn the translation from NL expressions to queries in a supervised way, and, 2) to overcome the lack of large-scale dataset by framing the translation model as a word selection approach and injecting relevance feedback in the learning process. Experiments are carried out on two TREC datasets and outline the effectiveness of our approach. △ Less

Submitted 29 August, 2018; originally announced September 2018.

Comments: This is the author's pre-print version of the work. It is posted here for your personal use, not for redistribution. Please cite the definitive version which will be published in Proceedings of the 2018 EMNLP Workshop SCAI: The 2nd International Workshop on Search-Oriented Conversational AI - ISBN: 978-1-948087-75-9

arXiv:1804.08562 [pdf, other]

doi 10.1109/ICDM.2017.80

Spatio-Temporal Neural Networks for Space-Time Series Forecasting and Relations Discovery

Authors: Ali Ziat, Edouard Delasalles, Ludovic Denoyer, Patrick Gallinari

Abstract: We introduce a dynamical spatio-temporal model formalized as a recurrent neural network for forecasting time series of spatial processes, i.e. series of observations sharing temporal and spatial dependencies. The model learns these dependencies through a structured latent dynamical component, while a decoder predicts the observations from the latent representations. We consider several variants of… ▽ More We introduce a dynamical spatio-temporal model formalized as a recurrent neural network for forecasting time series of spatial processes, i.e. series of observations sharing temporal and spatial dependencies. The model learns these dependencies through a structured latent dynamical component, while a decoder predicts the observations from the latent representations. We consider several variants of this model, corresponding to different prior hypothesis about the spatial relations between the series. The model is evaluated and compared to state-of-the-art baselines, on a variety of forecasting problems representative of different application areas: epidemiology, geo-spatial statistics and car-traffic prediction. Besides these evaluations, we also describe experiments showing the ability of this approach to extract relevant spatial relations. △ Less

Submitted 23 April, 2018; originally announced April 2018.

Comments: accepted by: ICDM 2018 - IEEE International Conference on Data Mining series (ICDM)

Journal ref: 2017 IEEE International Conference on Data Mining (ICDM), New Orleans, LA, 2017, pp. 705-714

arXiv:1804.07755 [pdf, other]

Phrase-Based & Neural Unsupervised Machine Translation

Authors: Guillaume Lample, Myle Ott, Alexis Conneau, Ludovic Denoyer, Marc'Aurelio Ranzato

Abstract: Machine translation systems achieve near human-level performance on some languages, yet their effectiveness strongly relies on the availability of large amounts of parallel sentences, which hinders their applicability to the majority of language pairs. This work investigates how to learn to translate when having access to only large monolingual corpora in each language. We propose two model varian… ▽ More Machine translation systems achieve near human-level performance on some languages, yet their effectiveness strongly relies on the availability of large amounts of parallel sentences, which hinders their applicability to the majority of language pairs. This work investigates how to learn to translate when having access to only large monolingual corpora in each language. We propose two model variants, a neural and a phrase-based model. Both versions leverage a careful initialization of the parameters, the denoising effect of language models and automatic generation of parallel data by iterative back-translation. These models are significantly better than methods from the literature, while being simpler and having fewer hyper-parameters. On the widely used WMT'14 English-French and WMT'16 German-English benchmarks, our models respectively obtain 28.1 and 25.2 BLEU points without using a single parallel sentence, outperforming the state of the art by more than 11 BLEU points. On low-resource languages like English-Urdu and English-Romanian, our methods achieve even better results than semi-supervised and supervised approaches leveraging the paucity of available bitexts. Our code for NMT and PBSMT is publicly available. △ Less

Submitted 13 August, 2018; v1 submitted 20 April, 2018; originally announced April 2018.

Comments: EMNLP 2018

arXiv:1711.00305 [pdf, other]

Multi-View Data Generation Without View Supervision

Authors: Mickaël Chen, Ludovic Denoyer, Thierry Artières

Abstract: The development of high-dimensional generative models has recently gained a great surge of interest with the introduction of variational auto-encoders and generative adversarial neural networks. Different variants have been proposed where the underlying latent space is structured, for example, based on attributes describing the data to generate. We focus on a particular problem where one aims at g… ▽ More The development of high-dimensional generative models has recently gained a great surge of interest with the introduction of variational auto-encoders and generative adversarial neural networks. Different variants have been proposed where the underlying latent space is structured, for example, based on attributes describing the data to generate. We focus on a particular problem where one aims at generating samples corresponding to a number of objects under various views. We assume that the distribution of the data is driven by two independent latent factors: the content, which represents the intrinsic features of an object, and the view, which stands for the settings of a particular observation of that object. Therefore, we propose a generative model and a conditional variant built on such a disentangled latent space. This approach allows us to generate realistic samples corresponding to various objects in a high variety of views. Unlike many multi-view approaches, our model doesn't need any supervision on the views but only on the content. Compared to other conditional generation approaches that are mostly based on binary or categorical attributes, we make no such assumption about the factors of variations. Our model can be used on problems with a huge, potentially infinite, number of categories. We experiment it on four image datasets on which we demonstrate the effectiveness of the model and its ability to generalize. △ Less

Submitted 17 April, 2019; v1 submitted 1 November, 2017; originally announced November 2017.

Comments: Published as a conference paper at ICLR 2018

arXiv:1711.00043 [pdf, other]

Unsupervised Machine Translation Using Monolingual Corpora Only

Authors: Guillaume Lample, Alexis Conneau, Ludovic Denoyer, Marc'Aurelio Ranzato

Abstract: Machine translation has recently achieved impressive performance thanks to recent advances in deep learning and the availability of large-scale parallel corpora. There have been numerous attempts to extend these successes to low-resource language pairs, yet requiring tens of thousands of parallel sentences. In this work, we take this research direction to the extreme and investigate whether it is… ▽ More Machine translation has recently achieved impressive performance thanks to recent advances in deep learning and the availability of large-scale parallel corpora. There have been numerous attempts to extend these successes to low-resource language pairs, yet requiring tens of thousands of parallel sentences. In this work, we take this research direction to the extreme and investigate whether it is possible to learn to translate even without any parallel data. We propose a model that takes sentences from monolingual corpora in two different languages and maps them into the same latent space. By learning to reconstruct in both languages from this shared feature space, the model effectively learns to translate without using any labeled data. We demonstrate our model on two widely used datasets and two language pairs, reporting BLEU scores of 32.8 and 15.1 on the Multi30k and WMT English-French datasets, without using even a single parallel sentence at training time. △ Less

Submitted 13 April, 2018; v1 submitted 31 October, 2017; originally announced November 2017.

Comments: ICLR 2018

arXiv:1710.04087 [pdf, other]

Word Translation Without Parallel Data

Authors: Alexis Conneau, Guillaume Lample, Marc'Aurelio Ranzato, Ludovic Denoyer, Hervé Jégou

Abstract: State-of-the-art methods for learning cross-lingual word embeddings have relied on bilingual dictionaries or parallel corpora. Recent studies showed that the need for parallel data supervision can be alleviated with character-level information. While these methods showed encouraging results, they are not on par with their supervised counterparts and are limited to pairs of languages sharing a comm… ▽ More State-of-the-art methods for learning cross-lingual word embeddings have relied on bilingual dictionaries or parallel corpora. Recent studies showed that the need for parallel data supervision can be alleviated with character-level information. While these methods showed encouraging results, they are not on par with their supervised counterparts and are limited to pairs of languages sharing a common alphabet. In this work, we show that we can build a bilingual dictionary between two languages without using any parallel corpora, by aligning monolingual word embedding spaces in an unsupervised way. Without using any character information, our model even outperforms existing supervised methods on cross-lingual tasks for some language pairs. Our experiments demonstrate that our method works very well also for distant language pairs, like English-Russian or English-Chinese. We finally describe experiments on the English-Esperanto low-resource language pair, on which there only exists a limited amount of parallel data, to show the potential impact of our method in fully unsupervised machine translation. Our code, embeddings and dictionaries are publicly available. △ Less

Submitted 30 January, 2018; v1 submitted 11 October, 2017; originally announced October 2017.

Comments: ICLR 2018

arXiv:1706.08334 [pdf, other]

A Meta-Learning Approach to One-Step Active Learning

Authors: Gabriella Contardo, Ludovic Denoyer, Thierry Artieres

Abstract: We consider the problem of learning when obtaining the training labels is costly, which is usually tackled in the literature using active-learning techniques. These approaches provide strategies to choose the examples to label before or during training. These strategies are usually based on heuristics or even theoretical measures, but are not learned as they are directly used during training. We d… ▽ More We consider the problem of learning when obtaining the training labels is costly, which is usually tackled in the literature using active-learning techniques. These approaches provide strategies to choose the examples to label before or during training. These strategies are usually based on heuristics or even theoretical measures, but are not learned as they are directly used during training. We design a model which aims at \textit{learning active-learning strategies} using a meta-learning setting. More specifically, we consider a pool-based setting, where the system observes all the examples of the dataset of a problem and has to choose the subset of examples to label in a single shot. Experiments show encouraging results. △ Less

Submitted 17 July, 2017; v1 submitted 26 June, 2017; originally announced June 2017.

arXiv:1706.00409 [pdf, other]

Fader Networks: Manipulating Images by Sliding Attributes

Authors: Guillaume Lample, Neil Zeghidour, Nicolas Usunier, Antoine Bordes, Ludovic Denoyer, Marc'Aurelio Ranzato

Abstract: This paper introduces a new encoder-decoder architecture that is trained to reconstruct images by disentangling the salient information of the image and the values of attributes directly in the latent space. As a result, after training, our model can generate different realistic versions of an input image by varying the attribute values. By using continuous attribute values, we can choose how much… ▽ More This paper introduces a new encoder-decoder architecture that is trained to reconstruct images by disentangling the salient information of the image and the values of attributes directly in the latent space. As a result, after training, our model can generate different realistic versions of an input image by varying the attribute values. By using continuous attribute values, we can choose how much a specific attribute is perceivable in the generated image. This property could allow for applications where users can modify an image using sliding knobs, like faders on a mixing console, to change the facial expression of a portrait, or to update the color of some objects. Compared to the state-of-the-art which mostly relies on training adversarial networks in pixel space by altering attribute values at train time, our approach results in much simpler training schemes and nicely scales to multiple attributes. We present evidence that our model can significantly change the perceived value of the attributes while preserving the naturalness of images. △ Less

Submitted 28 January, 2018; v1 submitted 1 June, 2017; originally announced June 2017.

Comments: NIPS 2017

arXiv:1706.00046 [pdf, other]

Learning Time/Memory-Efficient Deep Architectures with Budgeted Super Networks

Authors: Tom Veniat, Ludovic Denoyer

Abstract: We propose to focus on the problem of discovering neural network architectures efficient in terms of both prediction quality and cost. For instance, our approach is able to solve the following tasks: learn a neural network able to predict well in less than 100 milliseconds or learn an efficient model that fits in a 50 Mb memory. Our contribution is a novel family of models called Budgeted Super Ne… ▽ More We propose to focus on the problem of discovering neural network architectures efficient in terms of both prediction quality and cost. For instance, our approach is able to solve the following tasks: learn a neural network able to predict well in less than 100 milliseconds or learn an efficient model that fits in a 50 Mb memory. Our contribution is a novel family of models called Budgeted Super Networks (BSN). They are learned using gradient descent techniques applied on a budgeted learning objective function which integrates a maximum authorized cost, while making no assumption on the nature of this cost. We present a set of experiments on computer vision problems and analyze the ability of our technique to deal with three different costs: the computation cost, the memory consumption cost and a distributed computation cost. We particularly show that our model can discover neural network architectures that have a better accuracy than the ResNet and Convolutional Neural Fabrics architectures on CIFAR-10 and CIFAR-100, at a lower cost. △ Less

Submitted 22 May, 2018; v1 submitted 31 May, 2017; originally announced June 2017.

Comments: 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

arXiv:1611.06824 [pdf, other]

Options Discovery with Budgeted Reinforcement Learning

Authors: Aurélia Léon, Ludovic Denoyer

Abstract: We consider the problem of learning hierarchical policies for Reinforcement Learning able to discover options, an option corresponding to a sub-policy over a set of primitive actions. Different models have been proposed during the last decade that usually rely on a predefined set of options. We specifically address the problem of automatically discovering options in decision processes. We describe… ▽ More We consider the problem of learning hierarchical policies for Reinforcement Learning able to discover options, an option corresponding to a sub-policy over a set of primitive actions. Different models have been proposed during the last decade that usually rely on a predefined set of options. We specifically address the problem of automatically discovering options in decision processes. We describe a new learning model called Budgeted Option Neural Network (BONN) able to discover options based on a budgeted learning objective. The BONN model is evaluated on different classical RL problems, demonstrating both quantitative and qualitative interesting results. △ Less

Submitted 22 February, 2017; v1 submitted 21 November, 2016; originally announced November 2016.

Comments: Under review as a conference paper at IJCAI 2017

arXiv:1611.02019 [pdf, other]

doi 10.1007/978-3-319-71246-8_11

Multi-view Generative Adversarial Networks

Authors: Mickaël Chen, Ludovic Denoyer

Abstract: Learning over multi-view data is a challenging problem with strong practical applications. Most related studies focus on the classification point of view and assume that all the views are available at any time. We consider an extension of this framework in two directions. First, based on the BiGAN model, the Multi-view BiGAN (MV-BiGAN) is able to perform density estimation from multi-view inputs.… ▽ More Learning over multi-view data is a challenging problem with strong practical applications. Most related studies focus on the classification point of view and assume that all the views are available at any time. We consider an extension of this framework in two directions. First, based on the BiGAN model, the Multi-view BiGAN (MV-BiGAN) is able to perform density estimation from multi-view inputs. Second, it can deal with missing views and is able to update its prediction when additional views are provided. We illustrate these properties on a set of experiments over different datasets. △ Less

Submitted 17 April, 2019; v1 submitted 7 November, 2016; originally announced November 2016.

arXiv:1607.03691 [pdf, other]

Sequential Cost-Sensitive Feature Acquisition

Authors: Gabriella Contardo, Ludovic Denoyer, Thierry Artières

Abstract: We propose a reinforcement learning based approach to tackle the cost-sensitive learning problem where each input feature has a specific cost. The acquisition process is handled through a stochastic policy which allows features to be acquired in an adaptive way. The general architecture of our approach relies on representation learning to enable performing prediction on any partially observed samp… ▽ More We propose a reinforcement learning based approach to tackle the cost-sensitive learning problem where each input feature has a specific cost. The acquisition process is handled through a stochastic policy which allows features to be acquired in an adaptive way. The general architecture of our approach relies on representation learning to enable performing prediction on any partially observed sample, whatever the set of its observed features are. The resulting model is an original mix of representation learning and of reinforcement learning ideas. It is learned with policy gradient techniques to minimize a budgeted inference cost. We demonstrate the effectiveness of our proposed method with several experiments on a variety of datasets for the sparse prediction problem where all features have the same cost, but also for some cost-sensitive settings. △ Less

Submitted 13 July, 2016; originally announced July 2016.

Comments: 12 pages, conference : accepted at IDA 2016

arXiv:1505.00908 [pdf, other]

Reinforced Decision Trees

Authors: Aurélia Léon, Ludovic Denoyer

Abstract: In order to speed-up classification models when facing a large number of categories, one usual approach consists in organizing the categories in a particular structure, this structure being then used as a way to speed-up the prediction computation. This is for example the case when using error-correcting codes or even hierarchies of categories. But in the majority of approaches, this structure is… ▽ More In order to speed-up classification models when facing a large number of categories, one usual approach consists in organizing the categories in a particular structure, this structure being then used as a way to speed-up the prediction computation. This is for example the case when using error-correcting codes or even hierarchies of categories. But in the majority of approaches, this structure is chosen \textit{by hand}, or during a preliminary step, and not integrated in the learning process. We propose a new model called Reinforced Decision Tree which simultaneously learns how to organize categories in a tree structure and how to classify any input based on this structure. This approach keeps the advantages of existing techniques (low inference complexity) but allows one to build efficient classifiers in one learning step. The learning algorithm is inspired by reinforcement learning and policy-gradient techniques which allows us to integrate the two steps (building the tree, and learning the classifier) in one single algorithm. △ Less

Submitted 5 May, 2015; originally announced May 2015.

Report number: Accepted as a poster at EWRL 2015

arXiv:1412.7156 [pdf, other]

Representation Learning for cold-start recommendation

Authors: Gabriella Contardo, Ludovic Denoyer, Thierry Artieres

Abstract: A standard approach to Collaborative Filtering (CF), i.e. prediction of user ratings on items, relies on Matrix Factorization techniques. Representations for both users and items are computed from the observed ratings and used for prediction. Unfortunatly, these transductive approaches cannot handle the case of new users arriving in the system, with no known rating, a problem known as user cold-st… ▽ More A standard approach to Collaborative Filtering (CF), i.e. prediction of user ratings on items, relies on Matrix Factorization techniques. Representations for both users and items are computed from the observed ratings and used for prediction. Unfortunatly, these transductive approaches cannot handle the case of new users arriving in the system, with no known rating, a problem known as user cold-start. A common approach in this context is to ask these incoming users for a few initialization ratings. This paper presents a model to tackle this twofold problem of (i) finding good questions to ask, (ii) building efficient representations from this small amount of information. The model can also be used in a more standard (warm) context. Our approach is evaluated on the classical CF problem and on the cold-start problem on four different datasets showing its ability to improve baseline performance in both cases. △ Less

Submitted 22 June, 2015; v1 submitted 22 December, 2014; originally announced December 2014.

Comments: Accepted as workshop contribution at ICLR 2015

arXiv:1410.0510 [pdf, other]

Deep Sequential Neural Network

Authors: Ludovic Denoyer, Patrick Gallinari

Abstract: Neural Networks sequentially build high-level features through their successive layers. We propose here a new neural network model where each layer is associated with a set of candidate map**s. When an input is processed, at each layer, one map** among these candidates is selected according to a sequential decision process. The resulting model is structured according to a DAG like architecture… ▽ More Neural Networks sequentially build high-level features through their successive layers. We propose here a new neural network model where each layer is associated with a set of candidate map**s. When an input is processed, at each layer, one map** among these candidates is selected according to a sequential decision process. The resulting model is structured according to a DAG like architecture, so that a path from the root to a leaf node defines a sequence of transformations. Instead of considering global transformations, like in classical multilayer networks, this model allows us for learning a set of local transformations. It is thus able to process data with different characteristics through specific sequences of such local transformations, increasing the expression power of this model w.r.t a classical multilayered network. The learning algorithm is inspired from policy gradient techniques coming from the reinforcement learning domain and is used here instead of the classical back-propagation based gradient descent techniques. Experiments on different datasets show the relevance of this approach. △ Less

Submitted 2 October, 2014; originally announced October 2014.

arXiv:1312.6594 [pdf, other]

Sequentially Generated Instance-Dependent Image Representations for Classification

Authors: Gabriel Dulac-Arnold, Ludovic Denoyer, Nicolas Thome, Matthieu Cord, Patrick Gallinari

Abstract: In this paper, we investigate a new framework for image classification that adaptively generates spatial representations. Our strategy is based on a sequential process that learns to explore the different regions of any image in order to infer its category. In particular, the choice of regions is specific to each image, directed by the actual content of previously selected regions.The capacity of… ▽ More In this paper, we investigate a new framework for image classification that adaptively generates spatial representations. Our strategy is based on a sequential process that learns to explore the different regions of any image in order to infer its category. In particular, the choice of regions is specific to each image, directed by the actual content of previously selected regions.The capacity of the system to handle incomplete image information as well as its adaptive region selection allow the system to perform well in budgeted classification tasks by exploiting a dynamicly generated representation of each image. We demonstrate the system's abilities in a series of image-based exploration and classification tasks that highlight its learned exploration and inference abilities. △ Less

Submitted 11 February, 2014; v1 submitted 20 December, 2013; originally announced December 2013.

arXiv:1312.6169 [pdf, other]

Learning Information Spread in Content Networks

Authors: Cédric Lagnier, Simon Bourigault, Sylvain Lamprier, Ludovic Denoyer, Patrick Gallinari

Abstract: We introduce a model for predicting the diffusion of content information on social media. When propagation is usually modeled on discrete graph structures, we introduce here a continuous diffusion model, where nodes in a diffusion cascade are projected onto a latent space with the property that their proximity in this space reflects the temporal diffusion process. We focus on the task of predictin… ▽ More We introduce a model for predicting the diffusion of content information on social media. When propagation is usually modeled on discrete graph structures, we introduce here a continuous diffusion model, where nodes in a diffusion cascade are projected onto a latent space with the property that their proximity in this space reflects the temporal diffusion process. We focus on the task of predicting contaminated users for an initial initial information source and provide preliminary results on differents datasets. △ Less

Submitted 2 February, 2014; v1 submitted 20 December, 2013; originally announced December 2013.

Comments: 4 pages

arXiv:1312.6042 [pdf, other]

Learning States Representations in POMDP

Authors: Gabriella Contardo, Ludovic Denoyer, Thierry Artieres, Patrick Gallinari

Abstract: We propose to deal with sequential processes where only partial observations are available by learning a latent representation space on which policies may be accurately learned. We propose to deal with sequential processes where only partial observations are available by learning a latent representation space on which policies may be accurately learned. △ Less

Submitted 17 June, 2014; v1 submitted 20 December, 2013; originally announced December 2013.

Comments: 4 pages

arXiv:1204.2588 [pdf, other]

Probabilistic Latent Tensor Factorization Model for Link Pattern Prediction in Multi-relational Networks

Authors: Sheng Gao, Ludovic Denoyer, Patrick Gallinari

Abstract: This paper aims at the problem of link pattern prediction in collections of objects connected by multiple relation types, where each type may play a distinct role. While common link analysis models are limited to single-type link prediction, we attempt here to capture the correlations among different relation types and reveal the impact of various relation types on performance quality. For that, w… ▽ More This paper aims at the problem of link pattern prediction in collections of objects connected by multiple relation types, where each type may play a distinct role. While common link analysis models are limited to single-type link prediction, we attempt here to capture the correlations among different relation types and reveal the impact of various relation types on performance quality. For that, we define the overall relations between object pairs as a \textit{link pattern} which consists in interaction pattern and connection structure in the network, and then use tensor formalization to jointly model and predict the link patterns, which we refer to as \textit{Link Pattern Prediction} (LPP) problem. To address the issue, we propose a Probabilistic Latent Tensor Factorization (PLTF) model by introducing another latent factor for multiple relation types and furnish the Hierarchical Bayesian treatment of the proposed probabilistic model to avoid overfitting for solving the LPP problem. To learn the proposed model we develop an efficient Markov Chain Monte Carlo sampling method. Extensive experiments are conducted on several real world datasets and demonstrate significant improvements over several existing state-of-the-art methods. △ Less

Submitted 11 April, 2012; originally announced April 2012.

Comments: 19pages, 5 figures

MSC Class: 15A69 ACM Class: H.2.8; J.4

arXiv:1204.2581 [pdf, other]

Modeling Relational Data via Latent Factor Blockmodel

Authors: Sheng Gao, Ludovic Denoyer, Patrick Gallinari

Abstract: In this paper we address the problem of modeling relational data, which appear in many applications such as social network analysis, recommender systems and bioinformatics. Previous studies either consider latent feature based models but disregarding local structure in the network, or focus exclusively on capturing local structure of objects based on latent blockmodels without coupling with latent… ▽ More In this paper we address the problem of modeling relational data, which appear in many applications such as social network analysis, recommender systems and bioinformatics. Previous studies either consider latent feature based models but disregarding local structure in the network, or focus exclusively on capturing local structure of objects based on latent blockmodels without coupling with latent characteristics of objects. To combine the benefits of the previous work, we propose a novel model that can simultaneously incorporate the effect of latent features and covariates if any, as well as the effect of latent structure that may exist in the data. To achieve this, we model the relation graph as a function of both latent feature factors and latent cluster memberships of objects to collectively discover globally predictive intrinsic properties of objects and capture latent block structure in the network to improve prediction performance. We also develop an optimization transfer algorithm based on the generalized EM-style strategy to learn the latent factors. We prove the efficacy of our proposed model through the link prediction task and cluster analysis task, and extensive experiments on the synthetic data and several real world datasets suggest that our proposed LFBM model outperforms the other state of the art approaches in the evaluated tasks. △ Less

Submitted 11 April, 2012; originally announced April 2012.

Comments: 10 pages, 12 figures

MSC Class: 15A83 ACM Class: H.2.8; J.4

arXiv:1203.0203 [pdf, other]

Fast Reinforcement Learning with Large Action Sets using Error-Correcting Output Codes for MDP Factorization

Authors: Gabriel Dulac-Arnold, Ludovic Denoyer, Philippe Preux, Patrick Gallinari

Abstract: The use of Reinforcement Learning in real-world scenarios is strongly limited by issues of scale. Most RL learning algorithms are unable to deal with problems composed of hundreds or sometimes even dozens of possible actions, and therefore cannot be applied to many real-world problems. We consider the RL problem in the supervised classification framework where the optimal policy is obtained throug… ▽ More The use of Reinforcement Learning in real-world scenarios is strongly limited by issues of scale. Most RL learning algorithms are unable to deal with problems composed of hundreds or sometimes even dozens of possible actions, and therefore cannot be applied to many real-world problems. We consider the RL problem in the supervised classification framework where the optimal policy is obtained through a multiclass classifier, the set of classes being the set of actions of the problem. We introduce error-correcting output codes (ECOCs) in this setting and propose two new methods for reducing complexity when using rollouts-based approaches. The first method consists in using an ECOC-based classifier as the multiclass classifier, reducing the learning complexity from O(A2) to O(Alog(A)). We then propose a novel method that profits from the ECOC's coding dictionary to split the initial MDP into O(log(A)) seperate two-action MDPs. This second method reduces learning complexity even further, from O(A2) to O(log(A)), thus rendering problems with large action sets tractable. We finish by experimentally demonstrating the advantages of our approach on a set of benchmark problems, both in speed and performance. △ Less

Submitted 29 February, 2012; originally announced March 2012.

MSC Class: 68T05

arXiv:1108.5668 [pdf, other]

doi 10.1007/978-3-642-23780-5_34

Datum-Wise Classification: A Sequential Approach to Sparsity

Authors: Gabriel Dulac-Arnold, Ludovic Denoyer, Philippe Preux, Patrick Gallinari

Abstract: We propose a novel classification technique whose aim is to select an appropriate representation for each datapoint, in contrast to the usual approach of selecting a representation encompassing the whole dataset. This datum-wise representation is found by using a sparsity inducing empirical risk, which is a relaxation of the standard L 0 regularized risk. The classification problem is modeled as a… ▽ More We propose a novel classification technique whose aim is to select an appropriate representation for each datapoint, in contrast to the usual approach of selecting a representation encompassing the whole dataset. This datum-wise representation is found by using a sparsity inducing empirical risk, which is a relaxation of the standard L 0 regularized risk. The classification problem is modeled as a sequential decision process that sequentially chooses, for each datapoint, which features to use before classifying. Datum-Wise Classification extends naturally to multi-class tasks, and we describe a specific case where our inference has equivalent complexity to a traditional linear classifier, while still using a variable number of features. We compare our classifier to classical L 1 regularized linear models (L 1-SVM and LARS) on a set of common binary and multi-class datasets and show that for an equal average number of features used we can get improved performance using our method. △ Less

Submitted 29 August, 2011; originally announced August 2011.

Comments: ECML2011

Journal ref: Lecture Notes in Computer Science, 2011, Volume 6911/2011, 375-390

arXiv:1107.1322 [pdf, other]

doi 10.1007/978-3-642-20161-5_41

Text Classification: A Sequential Reading Approach

Authors: Gabriel Dulac-Arnold, Ludovic Denoyer, Patrick Gallinari

Abstract: We propose to model the text classification process as a sequential decision process. In this process, an agent learns to classify documents into topics while reading the document sentences sequentially and learns to stop as soon as enough information was read for deciding. The proposed algorithm is based on a modelisation of Text Classification as a Markov Decision Process and learns by using Rei… ▽ More We propose to model the text classification process as a sequential decision process. In this process, an agent learns to classify documents into topics while reading the document sentences sequentially and learns to stop as soon as enough information was read for deciding. The proposed algorithm is based on a modelisation of Text Classification as a Markov Decision Process and learns by using Reinforcement Learning. Experiments on four different classical mono-label corpora show that the proposed approach performs comparably to classical SVM approaches for large training sets, and better for small training sets. In addition, the model automatically adapts its reading process to the quantity of training information provided. △ Less

Submitted 29 August, 2011; v1 submitted 7 July, 2011; originally announced July 2011.

Comments: ECIR2011

Journal ref: Lecture Notes in Computer Science, 2011, Volume 6611/2011, 411-423

Showing 1–46 of 46 results for author: Denoyer, L