Showing 1–50 of 68 results for author: Poupart, P

Searching in archive cs.

  1. arXiv:2406.16782  [pdf, other]

    cs.LG

    Confidence Aware Inverse Constrained Reinforcement Learning

    Authors: Sriram Ganapathi Subramanian, Guiliang Liu, Mohammed Elmahgiubi, Kasra Rezaee, Pascal Poupart

    Abstract: In coming up with solutions to real-world problems, humans implicitly adhere to constraints that are too numerous and complex to be specified completely. However, reinforcement learning (RL) agents need these constraints to learn the correct optimal policy in these settings. The field of Inverse Constraint Reinforcement Learning (ICRL) deals with this problem and provides algorithms that aim to es…

    Submitted 24 June, 2024; originally announced June 2024.

    Comments: Paper to appear in ICML 2024

  2. arXiv:2406.07780  [pdf, other]

    cs.LG cs.CL

    A Critical Look At Tokenwise Reward-Guided Text Generation

    Authors: Ahmad Rashid, Ruotian Wu, Julia Grosse, Agustinus Kristiadi, Pascal Poupart

    Abstract: Large language models (LLMs) can significantly be improved by aligning to human preferences -- the so-called reinforcement learning from human feedback (RLHF). However, the cost of fine-tuning an LLM is prohibitive for many users. Due to their ability to bypass LLM finetuning, tokenwise reward-guided text generation (RGTG) methods have recently been proposed. They use a reward model trained on ful…

    Submitted 11 June, 2024; originally announced June 2024.

  3. arXiv:2406.06459  [pdf, other]

    cs.LG

    How Useful is Intermittent, Asynchronous Expert Feedback for Bayesian Optimization?

    Authors: Agustinus Kristiadi, Felix Strieth-Kalthoff, Sriram Ganapathi Subramanian, Vincent Fortuin, Pascal Poupart, Geoff Pleiss

    Abstract: Bayesian optimization (BO) is an integral part of automated scientific discovery -- the so-called self-driving lab -- where human inputs are ideally minimal or at least non-blocking. However, scientists often have strong intuition, and thus human feedback is still useful. Nevertheless, prior works in enhancing BO with expert feedback, such as by incorporating it in an offline or online but blockin…

    Submitted 10 June, 2024; originally announced June 2024.

    Comments: AABI 2024. Code: https://github.com/wiseodd/bo-async-feedback

  4. arXiv:2403.11062  [pdf, other]

    cs.LG math.OC

    A Simple Mixture Policy Parameterization for Improving Sample Efficiency of CVaR Optimization

    Authors: Yudong Luo, Yangchen Pan, Han Wang, Philip Torr, Pascal Poupart

    Abstract: Reinforcement learning algorithms utilizing policy gradients (PG) to optimize Conditional Value at Risk (CVaR) face significant challenges with sample inefficiency, hindering their practical applications. This inefficiency stems from two main facts: a focus on tail-end performance that overlooks many sampled trajectories, and the potential of gradient vanishing when the lower tail of the return di…

    Submitted 28 June, 2024; v1 submitted 16 March, 2024; originally announced March 2024.

    Comments: RLC 2024

  5. arXiv:2403.04221  [pdf, other]

    cs.LG cs.AI

    Why Online Reinforcement Learning is Causal

    Authors: Oliver Schulte, Pascal Poupart

    Abstract: Reinforcement learning (RL) and causal modelling naturally complement each other. The goal of causal modelling is to predict the effects of interventions in an environment, while the goal of reinforcement learning is to select interventions that maximize the rewards the agent receives from the environment. Reinforcement learning includes the two most powerful sources of information for estimating…

    Submitted 6 March, 2024; originally announced March 2024.

    Comments: 27 pages

    ACM Class: I.2.6

  6. arXiv:2402.05015  [pdf, other]

    cs.LG

    A Sober Look at LLMs for Material Discovery: Are They Actually Good for Bayesian Optimization Over Molecules?

    Authors: Agustinus Kristiadi, Felix Strieth-Kalthoff, Marta Skreta, Pascal Poupart, Alán Aspuru-Guzik, Geoff Pleiss

    Abstract: Automation is one of the cornerstones of contemporary material discovery. Bayesian optimization (BO) is an essential part of such workflows, enabling scientists to leverage prior domain knowledge into efficient exploration of a large molecular space. While such prior knowledge can take many forms, there has been significant fanfare around the ancillary scientific knowledge encapsulated in large la…

    Submitted 28 May, 2024; v1 submitted 7 February, 2024; originally announced February 2024.

    Comments: ICML 2024. Code: https://github.com/wiseodd/lapeft-bayesopt

  7. arXiv:2312.09817  [pdf, other]

    cs.LG stat.ML

    Calibrated One Round Federated Learning with Bayesian Inference in the Predictive Space

    Authors: Mohsin Hasan, Guojun Zhang, Kaiyang Guo, Xi Chen, Pascal Poupart

    Abstract: Federated Learning (FL) involves training a model over a dataset distributed among clients, with the constraint that each client's dataset is localized and possibly heterogeneous. In FL, small and noisy datasets are common, highlighting the need for well-calibrated models that represent the uncertainty of predictions. The closest FL techniques to achieving such goals are the Bayesian FL methods wh…

    Submitted 9 January, 2024; v1 submitted 15 December, 2023; originally announced December 2023.

    Comments: 7 pages, 2 figures. To appear at AAAI 2024

  8. arXiv:2311.03683  [pdf, other]

    cs.LG

    Preventing Arbitrarily High Confidence on Far-Away Data in Point-Estimated Discriminative Neural Networks

    Authors: Ahmad Rashid, Serena Hacker, Guojun Zhang, Agustinus Kristiadi, Pascal Poupart

    Abstract: Discriminatively trained, deterministic neural networks are the de facto choice for classification problems. However, even though they achieve state-of-the-art results on in-domain test sets, they tend to be overconfident on out-of-distribution (OOD) data. For instance, ReLU networks - a popular class of neural network architectures - have been shown to almost always yield high confidence predicti…

    Submitted 27 March, 2024; v1 submitted 6 November, 2023; originally announced November 2023.

    Comments: Accepted at AISTATS 2024

  9. arXiv:2307.08873  [pdf, other]

    cs.LG cs.AI

    An Alternative to Variance: Gini Deviation for Risk-averse Policy Gradient

    Authors: Yudong Luo, Guiliang Liu, Pascal Poupart, Yangchen Pan

    Abstract: Restricting the variance of a policy's return is a popular choice in risk-averse Reinforcement Learning (RL) due to its clear mathematical definition and easy interpretability. Traditional methods directly restrict the total return variance. Recent methods restrict the per-step reward variance as a proxy. We thoroughly examine the limitations of these variance-based methods, such as sensitivity to…

    Submitted 2 November, 2023; v1 submitted 17 July, 2023; originally announced July 2023.

    Comments: NeurIPS 2023

  10. arXiv:2307.05228  [pdf, other]

    cs.CL cs.LG

    Attribute Controlled Dialogue Prompting

    Authors: Runcheng Liu, Ahmad Rashid, Ivan Kobyzev, Mehdi Rezagholizadeh, Pascal Poupart

    Abstract: Prompt-tuning has become an increasingly popular parameter-efficient method for adapting large pretrained language models to downstream tasks. However, both discrete prompting and continuous prompting assume fixed prompts for all data samples within a task, neglecting the fact that inputs vary greatly in some tasks such as open-domain dialogue generation. In this paper, we present a novel, instanc…

    Submitted 11 July, 2023; originally announced July 2023.

    Comments: Accepted at ACL 2023 in Findings

  11. arXiv:2212.05998  [pdf, other]

    cs.LG cs.CL

    Continuation KD: Improved Knowledge Distillation through the Lens of Continuation Optimization

    Authors: Aref Jafari, Ivan Kobyzev, Mehdi Rezagholizadeh, Pascal Poupart, Ali Ghodsi

    Abstract: Knowledge Distillation (KD) has been extensively used for natural language understanding (NLU) tasks to improve a small model's (a student) generalization by transferring the knowledge from a larger model (a teacher). Although KD methods achieve state-of-the-art performance in numerous settings, they suffer from several problems limiting their performance. It is shown in the literature that the ca…

    Submitted 12 December, 2022; originally announced December 2022.

    Comments: Published at EMNLP 2022 (Findings)

  12. arXiv:2211.14960  [pdf, other]

    cs.LG stat.ML

    Label Alignment Regularization for Distribution Shift

    Authors: Ehsan Imani, Guojun Zhang, Runjia Li, Jun Luo, Pascal Poupart, Philip H. S. Torr, Yangchen Pan

    Abstract: Recent work has highlighted the label alignment property (LAP) in supervised learning, where the vector of all labels in the dataset is mostly in the span of the top few singular vectors of the data matrix. Drawing inspiration from this observation, we propose a regularization method for unsupervised domain adaptation that encourages alignment between the predictions in the target domain and its t…

    Submitted 11 June, 2024; v1 submitted 27 November, 2022; originally announced November 2022.

  13. arXiv:2206.15444  [pdf, other]

    cs.LG

    Learning Functions on Multiple Sets using Multi-Set Transformers

    Authors: Kira Selby, Ahmad Rashid, Ivan Kobyzev, Mehdi Rezagholizadeh, Pascal Poupart

    Abstract: We propose a general deep architecture for learning functions on multiple permutation-invariant sets. We also show how to generalize this architecture to sets of elements of any dimension by dimension equivariance. We demonstrate that our architecture is a universal approximator of these functions, and show superior results to existing methods on a variety of tasks including counting tasks, alignm…

    Submitted 30 June, 2022; originally announced June 2022.

  14. arXiv:2206.09670  [pdf, other]

    cs.LG

    Benchmarking Constraint Inference in Inverse Reinforcement Learning

    Authors: Guiliang Liu, Yudong Luo, Ashish Gaurav, Kasra Rezaee, Pascal Poupart

    Abstract: When deploying Reinforcement Learning (RL) agents into a physical system, we must ensure that these agents are well aware of the underlying constraints. In many real-world problems, however, the constraints are often hard to specify mathematically and unknown to the RL agents. To tackle these issues, Inverse Constrained Reinforcement Learning (ICRL) empirically estimates constraints from expert de…

    Submitted 2 March, 2023; v1 submitted 20 June, 2022; originally announced June 2022.

  15. arXiv:2206.09526  [pdf, other]

    cs.LG stat.ML

    Robust One Round Federated Learning with Predictive Space Bayesian Inference

    Authors: Mohsin Hasan, Zehao Zhang, Kaiyang Guo, Mahdi Karami, Guojun Zhang, Xi Chen, Pascal Poupart

    Abstract: Making predictions robust is an important challenge. A separate challenge in federated learning (FL) is to reduce the number of communication rounds, particularly since doing so reduces performance in heterogeneous data settings. To tackle both issues, we take a Bayesian perspective on the problem of learning a global model. We show how the global predictive posterior can be approximated using cli…

    Submitted 19 June, 2022; originally announced June 2022.

    Comments: 7 pages, 1 figure. Code is publicly available at https://github.com/hasanmohsin/FedPredSpace_1Round

  16. arXiv:2206.06357  [pdf, other]

    cs.LG

    Federated Bayesian Neural Regression: A Scalable Global Federated Gaussian Process

    Authors: Haolin Yu, Kaiyang Guo, Mahdi Karami, Xi Chen, Guojun Zhang, Pascal Poupart

    Abstract: In typical scenarios where the Federated Learning (FL) framework applies, it is common for clients to have insufficient training data to produce an accurate model. Thus, models that provide not only point estimations, but also some notion of confidence are beneficial. Gaussian Process (GP) is a powerful Bayesian model that comes with naturally well-calibrated variance estimations. However, it is c…

    Submitted 13 June, 2022; originally announced June 2022.

    Comments: 10 pages main text, 5 pages appendix, 5 figures

  17. arXiv:2206.01311  [pdf, other]

    cs.LG

    Learning Soft Constraints From Constrained Expert Demonstrations

    Authors: Ashish Gaurav, Kasra Rezaee, Guiliang Liu, Pascal Poupart

    Abstract: Inverse reinforcement learning (IRL) methods assume that the expert data is generated by an agent optimizing some reward function. However, in many settings, the agent may optimize a reward function subject to some constraints, where the constraints induce behaviors that may be otherwise difficult to express with just a reward function. We consider the setting where the reward function is given, a…

    Submitted 27 April, 2023; v1 submitted 2 June, 2022; originally announced June 2022.

    Comments: ICLR 2023 camera ready version (incl. supplementary material)

  18. arXiv:2205.13697  [pdf, other]

    cs.LG cs.AI cs.MA

    FedFormer: Contextual Federation with Attention in Reinforcement Learning

    Authors: Liam Hebert, Lukasz Golab, Pascal Poupart, Robin Cohen

    Abstract: A core issue in multi-agent federated reinforcement learning is defining how to aggregate insights from multiple agents. This is commonly done by taking the average of each participating agent's model weights into one common model (FedAvg). We instead propose FedFormer, a novel federation strategy that utilizes Transformer Attention to contextually aggregate embeddings from models originating from…

    Submitted 2 March, 2023; v1 submitted 26 May, 2022; originally announced May 2022.

    Comments: Our source code can be found at https://github.com/liamhebert/FedFormer. Accepted at AAMAS 2023

  19. arXiv:2205.12428  [pdf, other]

    cs.LG cs.CL

    Do we need Label Regularization to Fine-tune Pre-trained Language Models?

    Authors: Ivan Kobyzev, Aref Jafari, Mehdi Rezagholizadeh, Tianda Li, Alan Do-Omri, Peng Lu, Pascal Poupart, Ali Ghodsi

    Abstract: Knowledge Distillation (KD) is a prominent neural model compression technique that heavily relies on teacher network predictions to guide the training of a student model. Considering the ever-growing size of pre-trained language models (PLMs), KD is often adopted in many NLP tasks involving PLMs. However, it is evident that in KD, deploying the teacher network during training adds to the memory an…

    Submitted 12 April, 2023; v1 submitted 24 May, 2022; originally announced May 2022.

    Comments: Published at EACL 2023

  20. arXiv:2204.07674  [pdf, other]

    cs.CL

    CILDA: Contrastive Data Augmentation using Intermediate Layer Knowledge Distillation

    Authors: Md Akmal Haidar, Mehdi Rezagholizadeh, Abbas Ghaddar, Khalil Bibi, Philippe Langlais, Pascal Poupart

    Abstract: Knowledge distillation (KD) is an efficient framework for compressing large-scale pre-trained language models. Recent years have seen a surge of research aiming to improve KD by leveraging Contrastive Learning, Intermediate Layer Distillation, Data Augmentation, and Adversarial Training. In this work, we propose a learning based data augmentation technique tailored for knowledge distillation, call…

    Submitted 15 April, 2022; originally announced April 2022.

  21. arXiv:2112.12321  [pdf, other]

    cs.LG cs.NI

    Physics Constrained Flow Neural Network for Short-Timescale Predictions in Data Communications Networks

    Authors: Xiangle Cheng, James He, Shihan Xiao, Yingxue Zhang, Zhitang Chen, Pascal Poupart, Fenglin Li

    Abstract: Machine learning is gaining growing momentum in various recent models for the dynamic analysis of information flows in data communications networks. These preliminary models often rely on off-the-shelf learning models to predict from historical statistics while disregarding the physics governing the generating behaviors of these flows. This paper instead introduces Flow Neural Network (FlowNN) to…

    Submitted 2 April, 2023; v1 submitted 22 December, 2021; originally announced December 2021.

  22. arXiv:2112.09099  [pdf, other]

    cs.MA

    Decentralized Mean Field Games

    Authors: Sriram Ganapathi Subramanian, Matthew E. Taylor, Mark Crowley, Pascal Poupart

    Abstract: Multiagent reinforcement learning algorithms have not been widely adopted in large scale environments with many agents as they often scale poorly with the number of agents. Using mean field theory to aggregate agents has been proposed as a solution to this problem. However, almost all previous methods in this area make a strong assumption of a centralized system where all the agents in the environ…

    Submitted 13 April, 2022; v1 submitted 16 December, 2021; originally announced December 2021.

    Comments: This work is to appear in AAAI-22. Recent version has minor formatting changes and some typos corrected

  23. arXiv:2109.10164  [pdf, other]

    cs.CL

    RAIL-KD: RAndom Intermediate Layer Mapping for Knowledge Distillation

    Authors: Md Akmal Haidar, Nithin Anchuri, Mehdi Rezagholizadeh, Abbas Ghaddar, Philippe Langlais, Pascal Poupart

    Abstract: Intermediate layer knowledge distillation (KD) can improve the standard KD technique (which only targets the output of teacher and student models) especially over large pre-trained language models. However, intermediate layer distillation suffers from excessive computational burdens and engineering efforts required for setting up a proper layer mapping. To address these problems, we propose a RAnd…

    Submitted 1 October, 2021; v1 submitted 21 September, 2021; originally announced September 2021.

  24. arXiv:2109.04286  [pdf, other]

    cs.LG cs.AI stat.ML

    NTS-NOTEARS: Learning Nonparametric DBNs With Prior Knowledge

    Authors: Xiangyu Sun, Oliver Schulte, Guiliang Liu, Pascal Poupart

    Abstract: We describe NTS-NOTEARS, a score-based structure learning method for time-series data to learn dynamic Bayesian networks (DBNs) that captures nonlinear, lagged (inter-slice) and instantaneous (intra-slice) relations among variables. NTS-NOTEARS utilizes 1D convolutional neural networks (CNNs) to model the dependence of child variables on their parents; 1D CNN is a neural function approximation mod…

    Submitted 1 March, 2023; v1 submitted 9 September, 2021; originally announced September 2021.

    Comments: AISTATS 2023

  25. arXiv:2106.03632  [pdf, other]

    cs.LG cs.AI stat.ML

    Quantifying and Improving Transferability in Domain Generalization

    Authors: Guojun Zhang, Han Zhao, Yaoliang Yu, Pascal Poupart

    Abstract: Out-of-distribution generalization is one of the key challenges when transferring a model from the lab to the real world. Existing efforts mostly focus on building invariant features among source and target domains. Based on invariant features, a high-performing classifier on source domains could hopefully behave equally well on a target domain. In other words, the invariant features are \emph{tra…

    Submitted 1 November, 2021; v1 submitted 7 June, 2021; originally announced June 2021.

    Comments: NeurIPS 2021

  26. arXiv:2104.08420  [pdf, other]

    cs.CL cs.LG

    Robust Embeddings Via Distributions

    Authors: Kira A. Selby, Yinong Wang, Ruizhe Wang, Peyman Passban, Ahmad Rashid, Mehdi Rezagholizadeh, Pascal Poupart

    Abstract: Despite recent monumental advances in the field, many Natural Language Processing (NLP) models still struggle to perform adequately on noisy domains. We propose a novel probabilistic embedding-level method to improve the robustness of NLP models. Our method, Robust Embeddings via Distributions (RED), incorporates information from both noisy tokens and surrounding context to obtain distributions ov…

    Submitted 16 April, 2021; originally announced April 2021.

  27. arXiv:2103.01039  [pdf, other]

    cs.CV

    Self-Supervised Simultaneous Multi-Step Prediction of Road Dynamics and Cost Map

    Authors: Elmira Amirloo, Mohsen Rohani, Ershad Banijamali, Jun Luo, Pascal Poupart

    Abstract: While supervised learning is widely used for perception modules in conventional autonomous driving solutions, scalability is hindered by the huge amount of data labeling needed. In contrast, while end-to-end architectures do not require labeled data and are potentially more scalable, interpretability is sacrificed. We introduce a novel architecture that is trained in a fully self-supervised fashio…

    Submitted 29 March, 2021; v1 submitted 1 March, 2021; originally announced March 2021.

    Journal ref: CVPR 2021

  28. arXiv:2012.15791  [pdf, other]

    cs.MA

    Partially Observable Mean Field Reinforcement Learning

    Authors: Sriram Ganapathi Subramanian, Matthew E. Taylor, Mark Crowley, Pascal Poupart

    Abstract: Traditional multi-agent reinforcement learning algorithms are not scalable to environments with more than a few agents, since these algorithms are exponential in the number of agents. Recent research has introduced successful methods to scale multi-agent reinforcement learning algorithms to many agent scenarios using mean field theory. Previous work in this field assumes that an agent has access t…

    Submitted 24 January, 2021; v1 submitted 31 December, 2020; originally announced December 2020.

    Comments: Paper to be published in International Conference on Autonomous Agents and Multiagent Systems (AAMAS) - 2021. New version has some typos corrected

  29. arXiv:2012.13478  [pdf, other]

    cs.LG cs.CV

    Prediction by Anticipation: An Action-Conditional Prediction Method based on Interaction Learning

    Authors: Ershad Banijamali, Mohsen Rohani, Elmira Amirloo, Jun Luo, Pascal Poupart

    Abstract: In autonomous driving (AD), accurately predicting changes in the environment can effectively improve safety and comfort. Due to complex interactions among traffic participants, however, it is very hard to achieve accurate prediction for a long horizon. To address this challenge, we propose prediction by anticipation, which views interaction in terms of a latent probabilistic generative process whe…

    Submitted 24 December, 2020; originally announced December 2020.

  30. arXiv:2006.14592  [pdf, other]

    cs.LG math.OC stat.ML

    Newton-type Methods for Minimax Optimization

    Authors: Guojun Zhang, Kaiwen Wu, Pascal Poupart, Yaoliang Yu

    Abstract: Differential games, in particular two-player sequential zero-sum games (a.k.a. minimax optimization), have been an important modeling tool in applied science and received renewed interest in machine learning due to many recent applications, such as adversarial training, generative models and reinforcement learning. However, existing theory mostly focuses on convex-concave functions with few except…

    Submitted 18 February, 2023; v1 submitted 25 June, 2020; originally announced June 2020.

    Comments: code update

  31. arXiv:2003.03731  [pdf, ps, other]

    math.OC cs.LG math.AG

    A Positivstellensatz for Conditional SAGE Signomials

    Authors: Allen Houze Wang, Priyank Jaini, Yaoliang Yu, Pascal Poupart

    Abstract: Recently, the conditional SAGE certificate has been proposed as a sufficient condition for signomial positivity over a convex set. In this article, we show that the conditional SAGE certificate is $\textit{complete}$. That is, for any signomial $f(\mathbf{x}) = \sum_{j=1}^{\ell}c_j \exp(\mathbf{A}_j\mathbf{x})$ defined by rational exponents that is positive over a compact convex set $\mathcal{X}$,…

    Submitted 24 October, 2020; v1 submitted 8 March, 2020; originally announced March 2020.

    Comments: 19 pages, preprint

  32. arXiv:2003.03645  [pdf, other]

    cs.CL cs.AI cs.HC

    Generating Emotionally Aligned Responses in Dialogues using Affect Control Theory

    Authors: Nabiha Asghar, Ivan Kobyzev, Jesse Hoey, Pascal Poupart, Muhammad Bilal Sheikh

    Abstract: State-of-the-art neural dialogue systems excel at syntactic and semantic modelling of language, but often have a hard time establishing emotional alignment with the human interactant during a conversation. In this work, we bring Affect Control Theory (ACT), a socio-mathematical model of emotions for human-human interactions, to the neural dialogue generation setting. ACT makes predictions about ho…

    Submitted 16 April, 2020; v1 submitted 7 March, 2020; originally announced March 2020.

  33. arXiv:2002.11875  [pdf, other]

    cs.LG math.OC stat.ML

    Optimality and Stability in Non-Convex Smooth Games

    Authors: Guojun Zhang, Pascal Poupart, Yaoliang Yu

    Abstract: Convergence to a saddle point for convex-concave functions has been studied for decades, while recent years has seen a surge of interest in non-convex (zero-sum) smooth games, motivated by their recent wide applications. It remains an intriguing research challenge how local optimal points are defined and which algorithm can converge to such points. An interesting concept is known as the local mini…

    Submitted 3 February, 2022; v1 submitted 26 February, 2020; originally announced February 2020.

    Comments: accepted by JMLR 2022

  34. arXiv:2002.10631  [pdf, other]

    cs.LG cs.CV stat.ML

    Batch norm with entropic regularization turns deterministic autoencoders into generative models

    Authors: Amur Ghose, Abdullah Rashwan, Pascal Poupart

    Abstract: The variational autoencoder is a well defined deep generative model that utilizes an encoder-decoder framework where an encoding neural network outputs a non-deterministic code for reconstructing an input. The encoder achieves this by sampling from a distribution for every input, instead of outputting a deterministic code per input. The great advantage of this process is that it allows the use of…

    Submitted 21 September, 2021; v1 submitted 24 February, 2020; originally announced February 2020.

    Journal ref: Published in the Proceedings of the International Conference on Uncertainty in Artificial Intelligence (UAI), 2020

  35. arXiv:2002.09127  [pdf, other]

    cs.CL cs.LG

    Learning Dynamic Belief Graphs to Generalize on Text-Based Games

    Authors: Ashutosh Adhikari, Xingdi Yuan, Marc-Alexandre Côté, Mikuláš Zelinka, Marc-Antoine Rondeau, Romain Laroche, Pascal Poupart, Jian Tang, Adam Trischler, William L. Hamilton

    Abstract: Playing text-based games requires skills in processing natural language and sequential decision making. Achieving human-level performance on text-based games remains an open challenge, and prior research has largely relied on hand-crafted structured representations and heuristics. In this work, we investigate how an agent can plan and generalize in text-based games using graph-structured represent…

    Submitted 11 May, 2021; v1 submitted 20 February, 2020; originally announced February 2020.

    Comments: Bug fixed in Table 1

  36. arXiv:2002.02513  [pdf, other]

    cs.MA cs.AI cs.LG

    Multi Type Mean Field Reinforcement Learning

    Authors: Sriram Ganapathi Subramanian, Pascal Poupart, Matthew E. Taylor, Nidhi Hegde

    Abstract: Mean field theory provides an effective way of scaling multiagent reinforcement learning algorithms to environments with many agents that can be abstracted by a virtual mean agent. In this paper, we extend mean field multiagent algorithms to multiple types. The types enable the relaxation of a core assumption in mean field reinforcement learning, which is that all agents in the environment are pla…

    Submitted 21 June, 2022; v1 submitted 6 February, 2020; originally announced February 2020.

    Comments: The paper appears in the proceedings of International Conference on Autonomous Agents and Multi-Agent Systems (AAMAS) 2020. Revised version has some typos corrected

  37. arXiv:2002.00743  [pdf, other]

    cs.CL cs.LG stat.ML

    Unsupervised Multilingual Alignment using Wasserstein Barycenter

    Authors: Xin Lian, Kshitij Jain, Jakub Truszkowski, Pascal Poupart, Yaoliang Yu

    Abstract: We study unsupervised multilingual alignment, the problem of finding word-to-word translations between multiple languages without using any parallel data. One popular strategy is to reduce multilingual alignment to the much simplified bilingual setting, by picking one of the input languages as the pivot language that we transit through. However, it is well-known that transiting through a poorly ch…

    Submitted 28 July, 2020; v1 submitted 28 January, 2020; originally announced February 2020.

    Comments: Code is available at https://github.com/alixxxin/multi-lang

    ACM Class: I.2.7

    Journal ref: Proceedings of International Joint Conference on Artificial Intelligence (IJCAI), 2020

  38. arXiv:2001.03194  [pdf, other]

    cs.CV

    MatrixNets: A New Scale and Aspect Ratio Aware Architecture for Object Detection

    Authors: Abdullah Rashwan, Rishav Agarwal, Agastya Kalra, Pascal Poupart

    Abstract: We present MatrixNets (xNets), a new deep architecture for object detection. xNets map objects with similar sizes and aspect ratios into many specialized layers, allowing xNets to provide a scale and aspect ratio aware architecture. We leverage xNets to enhance single-stage object detection frameworks. First, we apply xNets on anchor-based object detection, for which we predict object centers and…

    Submitted 9 January, 2020; originally announced January 2020.

    Comments: This is the full paper for arXiv:1908.04646 with more applications, experiments, and ablation study

  39. arXiv:1908.04646  [pdf, other]

    cs.CV

    Matrix Nets: A New Deep Architecture for Object Detection

    Authors: Abdullah Rashwan, Agastya Kalra, Pascal Poupart

    Abstract: We present Matrix Nets (xNets), a new deep architecture for object detection. xNets map objects with different sizes and aspect ratios into layers where the sizes and the aspect ratios of the objects within their layers are nearly uniform. Hence, xNets provide a scale and aspect ratio aware architecture. We leverage xNets to enhance key-points based object detection. Our architecture achieves mAP…

    Submitted 14 August, 2019; v1 submitted 13 August, 2019; originally announced August 2019.

    Comments: Short paper, stay tuned for the full paper!

  40. arXiv:1907.05321  [pdf, other]

    cs.LG

    Time2Vec: Learning a Vector Representation of Time

    Authors: Seyed Mehran Kazemi, Rishab Goel, Sepehr Eghbali, Janahan Ramanan, Jaspreet Sahota, Sanjay Thakur, Stella Wu, Cathal Smyth, Pascal Poupart, Marcus Brubaker

    Abstract: Time is an important feature in many applications involving events that occur synchronously and/or asynchronously. To effectively consume time information, recent studies have focused on designing new architectures. In this paper, we take an orthogonal but complementary approach by providing a model-agnostic vector representation for time, called Time2Vec, that can be easily imported into many exi…

    Submitted 11 July, 2019; originally announced July 2019.

  41. arXiv:1907.03783  [pdf, other]

    cs.LG math.ST stat.ML

    Comparing EM with GD in Mixture Models of Two Components

    Authors: Guojun Zhang, Pascal Poupart, George Trimponias

    Abstract: The expectation-maximization (EM) algorithm has been widely used in minimizing the negative log likelihood (also known as cross entropy) of mixture models. However, little is understood about the goodness of the fixed points it converges to. In this paper, we study the regions where one component is missing in two-component mixture models, which we call one-cluster regions. We analyze the propensi…

    Submitted 29 October, 2019; v1 submitted 8 July, 2019; originally announced July 2019.

    Comments: UAI 2019

  42. arXiv:1907.03143  [pdf, other]

    cs.LG cs.AI stat.ML

    Diachronic Embedding for Temporal Knowledge Graph Completion

    Authors: Rishab Goel, Seyed Mehran Kazemi, Marcus Brubaker, Pascal Poupart

    Abstract: Knowledge graphs (KGs) typically contain temporal facts indicating relationships among entities at different times. Due to their incompleteness, several approaches have been proposed to infer new facts for a KG based on the existing ones -- a problem known as KG completion. KG embedding approaches have proved effective for KG completion; however, they have been developed mostly for static KGs. Develo…

    Submitted 6 July, 2019; originally announced July 2019.

  43. arXiv:1905.11485  [pdf, other]

    cs.LG stat.ML

    Representation Learning for Dynamic Graphs: A Survey

    Authors: Seyed Mehran Kazemi, Rishab Goel, Kshitij Jain, Ivan Kobyzev, Akshay Sethi, Peter Forsyth, Pascal Poupart

    Abstract: Graphs arise naturally in many real-world applications including social networks, recommender systems, ontologies, biology, and computational finance. Traditionally, machine learning models for graphs have been mostly designed for static graphs. However, many applications involve evolving graphs. This introduces important challenges for learning and inference since nodes, attributes, and edges cha…

    Submitted 27 April, 2020; v1 submitted 27 May, 2019; originally announced May 2019.

    Comments: Accepted at JMLR, 73 pages, 2 figures

    Journal ref: JMLR, Vol 21, Pages 1-73, 2020

  44. arXiv:1901.03704  [pdf, other]

    cs.LG stat.ML

    SPFlow: An Easy and Extensible Library for Deep Probabilistic Learning using Sum-Product Networks

    Authors: Alejandro Molina, Antonio Vergari, Karl Stelzner, Robert Peharz, Pranav Subramani, Nicola Di Mauro, Pascal Poupart, Kristian Kersting

    Abstract: We introduce SPFlow, an open-source Python library providing a simple interface to inference, learning and manipulation routines for deep and tractable probabilistic models called Sum-Product Networks (SPNs). The library allows one to quickly create SPNs both from data and through a domain specific language (DSL). It efficiently implements several probabilistic inference routines like computing ma…

    Submitted 11 January, 2019; originally announced January 2019.

    Comments: 4 pages, 1 figure, code

  45. arXiv:1811.00239  [pdf, other]

    cs.CL cs.AI cs.LG

    Progressive Memory Banks for Incremental Domain Adaptation

    Authors: Nabiha Asghar, Lili Mou, Kira A. Selby, Kevin D. Pantasdo, Pascal Poupart, Xin Jiang

    Abstract: This paper addresses the problem of incremental domain adaptation (IDA) in natural language processing (NLP). We assume each domain comes one after another, and that we could only access data in the current domain. The goal of IDA is to build a unified model performing well on all the domains that we have encountered. We adopt the recurrent neural network (RNN) widely used in NLP, but augment it w…

    Submitted 13 February, 2020; v1 submitted 1 November, 2018; originally announced November 2018.

    Comments: ICLR 2020

  46. arXiv:1805.07780  [pdf, other]

    cs.CV cs.AI cs.LG

    Unsupervised Video Object Segmentation for Deep Reinforcement Learning

    Authors: Vik Goel, Jameson Weng, Pascal Poupart

    Abstract: We present a new technique for deep reinforcement learning that automatically detects moving objects and uses the relevant information for action selection. The detection of moving objects is done in an unsupervised way by exploiting structure from motion. Instead of directly learning a policy from raw images, the agent first learns to detect and segment moving objects by exploiting flow informati…

    Submitted 20 May, 2018; originally announced May 2018.

  47. arXiv:1804.06309   

    cs.LG stat.ML

    On Improving Deep Reinforcement Learning for POMDPs

    Authors: Pengfei Zhu, Xin Li, Pascal Poupart, Guanghui Miao

    Abstract: Deep Reinforcement Learning (RL) recently emerged as one of the most competitive approaches for learning in sequential decision making problems with fully observable environments, e.g., computer Go. However, very little work has been done in deep RL to handle partially observable environments. We propose a new architecture called Action-specific Deep Recurrent Q-Network (ADRQN) to enhance learning…

    Submitted 8 May, 2018; v1 submitted 17 April, 2018; originally announced April 2018.

    Comments: We are the authors of "On Improving Deep Reinforcement Learning for POMDPs", identifier of which is arXiv:1704.07978. Last week, I wanted to update the article with new version but created a new submission which identifier is 1804.06309 by mistake

  48. arXiv:1712.08207  [pdf, other]

    cs.CL

    Variational Attention for Sequence-to-Sequence Models

    Authors: Hareesh Bahuleyan, Lili Mou, Olga Vechtomova, Pascal Poupart

    Abstract: The variational encoder-decoder (VED) encodes source information as a set of random variables using a neural network, which in turn is decoded into target data using another neural network. In natural language processing, sequence-to-sequence (Seq2Seq) models typically serve as encoder-decoder networks. When combined with a traditional (deterministic) attention mechanism, the variational latent sp…

    Submitted 21 June, 2018; v1 submitted 21 December, 2017; originally announced December 2017.

    Comments: In Proceedings of COLING 2018. Also accepted by TADGM Workshop@ICML 2018 for presentation

  49. arXiv:1712.02250  [pdf, other]

    cs.CL cs.LG

    Why Do Neural Dialog Systems Generate Short and Meaningless Replies? A Comparison between Dialog and Translation

    Authors: Bolin Wei, Shuai Lu, Lili Mou, Hao Zhou, Pascal Poupart, Ge Li, Zhi Jin

    Abstract: This paper addresses the question: Why do neural dialog systems generate short and meaningless replies? We conjecture that, in a dialog system, an utterance may have multiple equally plausible replies, causing the deficiency of neural networks in the dialog application. We propose a systematic way to mimic the dialog scenario in a machine translation system, and manage to reproduce the phenomenon…

    Submitted 6 December, 2017; originally announced December 2017.

  50. arXiv:1709.03968  [pdf, other]

    cs.CL cs.AI cs.CY cs.HC cs.IR

    Affective Neural Response Generation

    Authors: Nabiha Asghar, Pascal Poupart, Jesse Hoey, Xin Jiang, Lili Mou

    Abstract: Existing neural conversational models process natural language primarily on a lexico-syntactic level, thereby ignoring one of the most crucial components of human-to-human dialogue: its affective content. We take a step in this direction by proposing three novel ways to incorporate affective/emotional aspects into long short term memory (LSTM) encoder-decoder neural conversation models: (1) affect…

    Submitted 12 September, 2017; originally announced September 2017.

    Comments: 8 pages

    MSC Class: 68T50; ACM Class: I.2.7