Showing 1–50 of 129 results for author: Bacon, P

  1. arXiv:2406.05953  [pdf, other]

    cs.LG

    Decoupling regularization from the action space

    Authors: Sobhan Mohammadpour, Emma Frejinger, Pierre-Luc Bacon

    Abstract: Regularized reinforcement learning (RL), particularly the entropy-regularized kind, has gained traction in optimal control and inverse RL. While standard unregularized RL methods remain unaffected by changes in the number of actions, we show that it can severely impact their regularized counterparts. This paper demonstrates the importance of decoupling the regularizer from the action space: that i…

    Submitted 9 June, 2024; originally announced June 2024.

  2. arXiv:2405.01616  [pdf, other]

    q-bio.BM cs.AI cs.LG

    Generative Active Learning for the Search of Small-molecule Protein Binders

    Authors: Maksym Korablyov, Cheng-Hao Liu, Moksh Jain, Almer M. van der Sloot, Eric Jolicoeur, Edward Ruediger, Andrei Cristian Nica, Emmanuel Bengio, Kostiantyn Lapchevskyi, Daniel St-Cyr, Doris Alexandra Schuetz, Victor Ion Butoi, Jarrid Rector-Brooks, Simon Blackburn, Leo Feng, Hadi Nekoei, SaiKrishna Gottipati, Priyesh Vijayan, Prateek Gupta, Ladislav Rampášek, Sasikanth Avancha, Pierre-Luc Bacon, William L. Hamilton, Brooks Paige, Sanchit Misra , et al. (9 additional authors not shown)

    Abstract: Despite substantial progress in machine learning for scientific discovery in recent years, truly de novo design of small molecules which exhibit a property of interest remains a significant challenge. We introduce LambdaZero, a generative active learning approach to search for synthesizable molecules. Powered by deep reinforcement learning, LambdaZero learns to search over the vast space of molecu…

    Submitted 2 May, 2024; originally announced May 2024.

  3. arXiv:2403.07688  [pdf, other]

    cs.LG cs.AI

    Maxwell's Demon at Work: Efficient Pruning by Leveraging Saturation of Neurons

    Authors: Simon Dufort-Labbé, Pierluca D'Oro, Evgenii Nikishin, Razvan Pascanu, Pierre-Luc Bacon, Aristide Baratin

    Abstract: When training deep neural networks, the phenomenon of dying neurons – units that become inactive or saturated, output zero during training – has traditionally been viewed as undesirable, linked with optimization challenges, and contributing to plasticity loss in continual learning scenarios. In this paper, we reassess this phenomenon, focusing on sparsity a…

    Submitted 12 March, 2024; originally announced March 2024.

  4. arXiv:2402.05290  [pdf, other]

    cs.LG cs.AI

    Do Transformer World Models Give Better Policy Gradients?

    Authors: Michel Ma, Tianwei Ni, Clement Gehring, Pierluca D'Oro, Pierre-Luc Bacon

    Abstract: A natural approach for reinforcement learning is to predict future rewards by unrolling a neural network world model, and to backpropagate through the resulting computational graph to learn a policy. However, this method often becomes impractical for long horizons since typical world models induce hard-to-optimize loss landscapes. Transformers are known to efficiently propagate gradients over long…

    Submitted 10 February, 2024; v1 submitted 7 February, 2024; originally announced February 2024.

    Comments: Michel Ma and Pierluca D'Oro contributed equally

  5. arXiv:2401.08898  [pdf, other]

    cs.LG cs.AI

    Bridging State and History Representations: Understanding Self-Predictive RL

    Authors: Tianwei Ni, Benjamin Eysenbach, Erfan Seyedsalehi, Michel Ma, Clement Gehring, Aditya Mahajan, Pierre-Luc Bacon

    Abstract: Representations are at the core of all deep reinforcement learning (RL) methods for both Markov decision processes (MDPs) and partially observable Markov decision processes (POMDPs). Many representation learning methods and theoretical frameworks have been developed to understand what constitutes an effective representation. However, the relationships between these methods and the shared propertie…

    Submitted 21 April, 2024; v1 submitted 16 January, 2024; originally announced January 2024.

    Comments: ICLR 2024 (Poster). Code is available at https://github.com/twni2016/self-predictive-rl

  6. arXiv:2312.14331  [pdf, other]

    cs.LG

    Maximum entropy GFlowNets with soft Q-learning

    Authors: Sobhan Mohammadpour, Emmanuel Bengio, Emma Frejinger, Pierre-Luc Bacon

    Abstract: Generative Flow Networks (GFNs) have emerged as a powerful tool for sampling discrete objects from unnormalized distributions, offering a scalable alternative to Markov Chain Monte Carlo (MCMC) methods. While GFNs draw inspiration from maximum entropy reinforcement learning (RL), the connection between the two has largely been unclear and seemingly applicable only in specific cases. This paper add…

    Submitted 2 May, 2024; v1 submitted 21 December, 2023; originally announced December 2023.

    Journal ref: 2024 Proceedings of The 27th International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 238:2593-2601

  7. arXiv:2310.15386  [pdf, other]

    cs.LG cs.AI cs.RO eess.SY

    Course Correcting Koopman Representations

    Authors: Mahan Fathi, Clement Gehring, Jonathan Pilault, David Kanaa, Pierre-Luc Bacon, Ross Goroshin

    Abstract: Koopman representations aim to learn features of nonlinear dynamical systems (NLDS) which lead to linear dynamics in the latent space. Theoretically, such features can be used to simplify many problems in modeling and control of NLDS. In this work we study autoencoder formulations of this problem, and different ways they can be used to model dynamics, specifically for future state prediction over…

    Submitted 23 November, 2023; v1 submitted 23 October, 2023; originally announced October 2023.

  8. arXiv:2310.00166  [pdf, other]

    cs.AI cs.LG

    Motif: Intrinsic Motivation from Artificial Intelligence Feedback

    Authors: Martin Klissarov, Pierluca D'Oro, Shagun Sodhani, Roberta Raileanu, Pierre-Luc Bacon, Pascal Vincent, Amy Zhang, Mikael Henaff

    Abstract: Exploring rich environments and evaluating one's actions without prior knowledge is immensely challenging. In this paper, we propose Motif, a general method to interface such prior knowledge from a Large Language Model (LLM) with an agent. Motif is based on the idea of grounding LLMs for decision-making without requiring them to interact with the environment: it elicits preferences from an LLM ove…

    Submitted 29 September, 2023; originally announced October 2023.

    Comments: The first two authors equally contributed - order decided by coin flip

  9. arXiv:2309.14597  [pdf, other]

    cs.LG

    Policy Optimization in a Noisy Neighborhood: On Return Landscapes in Continuous Control

    Authors: Nate Rahn, Pierluca D'Oro, Harley Wiltzer, Pierre-Luc Bacon, Marc G. Bellemare

    Abstract: Deep reinforcement learning agents for continuous control are known to exhibit significant instability in their performance over time. In this work, we provide a fresh perspective on these behaviors by studying the return landscape: the mapping between a policy and a return. We find that popular algorithms traverse noisy neighborhoods of this landscape, in which a single update to the policy param…

    Submitted 10 April, 2024; v1 submitted 25 September, 2023; originally announced September 2023.

    Comments: NeurIPS 2023 Accepted Paper. The first two authors contributed equally

  10. arXiv:2307.03864  [pdf, other]

    cs.LG

    When Do Transformers Shine in RL? Decoupling Memory from Credit Assignment

    Authors: Tianwei Ni, Michel Ma, Benjamin Eysenbach, Pierre-Luc Bacon

    Abstract: Reinforcement learning (RL) algorithms face two distinct challenges: learning effective representations of past and present observations, and determining how actions influence future returns. Both challenges involve modeling long-term dependencies. The Transformer architecture has been very successful at solving problems that involve long-term dependencies, including in the RL domain. However, the u…

    Submitted 3 November, 2023; v1 submitted 7 July, 2023; originally announced July 2023.

    Comments: NeurIPS 2023 (Oral)

  11. arXiv:2306.09539  [pdf, other]

    cs.CL cs.LG

    Block-State Transformers

    Authors: Mahan Fathi, Jonathan Pilault, Orhan Firat, Christopher Pal, Pierre-Luc Bacon, Ross Goroshin

    Abstract: State space models (SSMs) have shown impressive results on tasks that require modeling long-range dependencies and efficiently scale to long sequences owing to their subquadratic runtime complexity. Originally designed for continuous signals, SSMs have shown superior performance on a plethora of tasks, in vision and audio; however, SSMs still lag Transformer performance in Language Modeling tasks.…

    Submitted 30 October, 2023; v1 submitted 15 June, 2023; originally announced June 2023.

    Comments: NeurIPS'23 - Thirty-seventh Conference on Neural Information Processing Systems

  12. arXiv:2306.04620  [pdf, other]

    cs.LG q-bio.BM

    Goal-conditioned GFlowNets for Controllable Multi-Objective Molecular Design

    Authors: Julien Roy, Pierre-Luc Bacon, Christopher Pal, Emmanuel Bengio

    Abstract: In recent years, in-silico molecular design has received much attention from the machine learning community. When designing a new compound for pharmaceutical applications, there are usually multiple properties of such molecules that need to be optimised: binding energy to the target, synthesizability, toxicity, EC50, and so on. While previous approaches have employed a scalarization scheme to turn…

    Submitted 29 June, 2023; v1 submitted 7 June, 2023; originally announced June 2023.

    Comments: 14 pages

  13. arXiv:2209.06259  [pdf, other]

    cs.LG cs.AI

    Designing Biological Sequences via Meta-Reinforcement Learning and Bayesian Optimization

    Authors: Leo Feng, Padideh Nouri, Aneri Muni, Yoshua Bengio, Pierre-Luc Bacon

    Abstract: The ability to accelerate the design of biological sequences can have a substantial impact on the progress of the medical field. The problem can be framed as a global optimization problem where the objective is an expensive black-box function such that we can query large batches restricted with a limitation of a low number of rounds. Bayesian Optimization is a principled method for tackling this p…

    Submitted 13 September, 2022; originally announced September 2022.

  14. arXiv:2205.13513  [pdf, other]

    gr-qc astro-ph.IM

    Denoising gravitational-wave signals from binary black holes with dilated convolutional autoencoder

    Authors: P. Bacon, A. Trovato, M. Bejger

    Abstract: Broadband frequency output of gravitational-wave detectors is a non-stationary and non-Gaussian time series data stream dominated by noise populated by local disturbances and transient artifacts, which evolve on the same timescale as the gravitational-wave signals and may corrupt the astrophysical information. We study a denoising algorithm dedicated to expose the astrophysical signals by employin…

    Submitted 26 May, 2022; originally announced May 2022.

    Comments: 27 pages, 5 figures in the text and 7 in the appendix

  15. arXiv:2205.07802  [pdf, other]

    cs.LG cs.AI stat.ML

    The Primacy Bias in Deep Reinforcement Learning

    Authors: Evgenii Nikishin, Max Schwarzer, Pierluca D'Oro, Pierre-Luc Bacon, Aaron Courville

    Abstract: This work identifies a common flaw of deep reinforcement learning (RL) algorithms: a tendency to rely on early interactions and ignore useful evidence encountered later. Because of training on progressively growing datasets, deep RL agents incur a risk of overfitting to earlier experiences, negatively affecting the rest of the learning process. Inspired by cognitive science, we refer to this effec…

    Submitted 16 May, 2022; originally announced May 2022.

    Comments: ICML 2022; code at https://github.com/evgenii-nikishin/rl_with_resets

  16. arXiv:2203.01443  [pdf, other]

    cs.LG

    Continuous-Time Meta-Learning with Forward Mode Differentiation

    Authors: Tristan Deleu, David Kanaa, Leo Feng, Giancarlo Kerg, Yoshua Bengio, Guillaume Lajoie, Pierre-Luc Bacon

    Abstract: Drawing inspiration from gradient-based meta-learning methods with infinitely small gradient steps, we introduce Continuous-Time Meta-Learning (COMLN), a meta-learning algorithm where adaptation follows the dynamics of a gradient vector field. Specifically, representations of the inputs are meta-learned such that a task-specific linear classifier is obtained as a solution of an ordinary differenti…

    Submitted 2 March, 2022; originally announced March 2022.

  17. arXiv:2202.10600  [pdf, other]

    cs.LG cs.AI eess.SY stat.ML

    Myriad: a real-world testbed to bridge trajectory optimization and deep learning

    Authors: Nikolaus H. R. Howe, Simon Dufort-Labbé, Nitarshan Rajkumar, Pierre-Luc Bacon

    Abstract: We present Myriad, a testbed written in JAX for learning and planning in real-world continuous environments. The primary contributions of Myriad are threefold. First, Myriad provides machine learning practitioners access to trajectory optimization techniques for application within a typical automatic differentiation workflow. Second, Myriad presents many real-world optimal control problems, rangin…

    Submitted 26 January, 2023; v1 submitted 21 February, 2022; originally announced February 2022.

    Comments: Updated to match version accepted at NeurIPS 2022

  18. arXiv:2112.12228  [pdf, other]

    cs.LG

    Direct Behavior Specification via Constrained Reinforcement Learning

    Authors: Julien Roy, Roger Girgis, Joshua Romoff, Pierre-Luc Bacon, Christopher Pal

    Abstract: The standard formulation of Reinforcement Learning lacks a practical way of specifying what are admissible and forbidden behaviors. Most often, practitioners go about the task of behavior specification by manually engineering the reward function, a counter-intuitive process that requires several iterations and is prone to reward hacking by the agent. In this work, we argue that constrained RL, whi…

    Submitted 18 June, 2022; v1 submitted 22 December, 2021; originally announced December 2021.

  19. arXiv:2110.05442  [pdf, other]

    cs.LG cs.AI stat.ML

    Neural Algorithmic Reasoners are Implicit Planners

    Authors: Andreea Deac, Petar Veličković, Ognjen Milinković, Pierre-Luc Bacon, Jian Tang, Mladen Nikolić

    Abstract: Implicit planning has emerged as an elegant technique for combining learned models of the world with end-to-end model-free reinforcement learning. We study the class of implicit planners inspired by value iteration, an algorithm that is guaranteed to yield perfect policies in fully-specified tabular environments. We find that prior approaches either assume that the environment is provided in such…

    Submitted 11 October, 2021; originally announced October 2021.

    Comments: To appear at NeurIPS 2021 (Spotlight talk). 20 pages, 10 figures. arXiv admin note: text overlap with arXiv:2010.13146

  20. arXiv:2106.03273  [pdf, other]

    cs.LG cs.AI stat.ML

    Control-Oriented Model-Based Reinforcement Learning with Implicit Differentiation

    Authors: Evgenii Nikishin, Romina Abachi, Rishabh Agarwal, Pierre-Luc Bacon

    Abstract: The shortcomings of maximum likelihood estimation in the context of model-based reinforcement learning have been highlighted by an increasing number of papers. When the model class is misspecified or has a limited representational capacity, model parameters with high likelihood might not necessarily result in high performance of the agent on a downstream control task. To alleviate this problem, we…

    Submitted 6 June, 2021; originally announced June 2021.

    Comments: Code at https://github.com/evgenii-nikishin/omd

  21. arXiv:2103.06224  [pdf, ps, other]

    cs.LG cs.IT

    An Information-Theoretic Perspective on Credit Assignment in Reinforcement Learning

    Authors: Dilip Arumugam, Peter Henderson, Pierre-Luc Bacon

    Abstract: How do we formalize the challenge of credit assignment in reinforcement learning? Common intuition would draw attention to reward sparsity as a key contributor to difficult credit assignment and traditional heuristics would look to temporal recency for the solution, calling upon the classic eligibility trace. We posit that it is not the sparsity of the reward itself that causes difficulty in credi…

    Submitted 10 March, 2021; originally announced March 2021.

    Comments: Workshop on Biological and Artificial Reinforcement Learning (NeurIPS 2020)

  22. arXiv:2010.14550  [pdf, other]

    astro-ph.HE gr-qc

    Search for Gravitational Waves Associated with Gamma-Ray Bursts Detected by Fermi and Swift During the LIGO-Virgo Run O3a

    Authors: The LIGO Scientific Collaboration, the Virgo Collaboration, R. Abbott, T. D. Abbott, S. Abraham, F. Acernese, K. Ackley, C. Adams, R. X. Adhikari, V. B. Adya, C. Affeldt, M. Agathos, K. Agatsuma, N. Aggarwal, O. D. Aguiar, A. Aich, L. Aiello, A. Ain, P. Ajith, G. Allen, A. Allocca, P. A. Altin, A. Amato, S. Anand, A. Ananyeva , et al. (1228 additional authors not shown)

    Abstract: We search for gravitational-wave transients associated with gamma-ray bursts detected by the Fermi and Swift satellites during the first part of the third observing run of Advanced LIGO and Advanced Virgo (1 April 2019 15:00 UTC - 1 October 2019 15:00 UTC). 105 gamma-ray bursts were analyzed using a search for generic gravitational-wave transients; 32 gamma-ray bursts were analyzed with a search t…

    Submitted 20 August, 2021; v1 submitted 27 October, 2020; originally announced October 2020.

    Comments: 17 pages, 5 figures, 2 tables

    Report number: LIGO-P2000040

    Journal ref: Astrophys. J. 915, 86 (2021)

  23. arXiv:2010.13146  [pdf, other]

    cs.LG cs.AI stat.ML

    XLVIN: eXecuted Latent Value Iteration Nets

    Authors: Andreea Deac, Petar Veličković, Ognjen Milinković, Pierre-Luc Bacon, Jian Tang, Mladen Nikolić

    Abstract: Value Iteration Networks (VINs) have emerged as a popular method to incorporate planning algorithms within deep reinforcement learning, enabling performance improvements on tasks requiring long-range reasoning and understanding of environment dynamics. This came with several limitations, however: the model is not incentivised in any way to perform meaningful planning computations, the underlying s…

    Submitted 6 December, 2020; v1 submitted 25 October, 2020; originally announced October 2020.

    Comments: NeurIPS 2020 Deep Reinforcement Learning Workshop

  24. arXiv:2009.12604  [pdf, other]

    cs.LG cs.AI stat.ML

    Graph neural induction of value iteration

    Authors: Andreea Deac, Pierre-Luc Bacon, Jian Tang

    Abstract: Many reinforcement learning tasks can benefit from explicit planning based on an internal model of the environment. Previously, such planning components have been incorporated through a neural network that partially aligns with the computational graph of value iteration. Such networks have so far been focused on restrictive environments (e.g. grid-worlds), and modelled the planning procedure only i…

    Submitted 26 September, 2020; originally announced September 2020.

    Comments: ICML GRL+ 2020

  25. arXiv:2009.01190  [pdf, other]

    astro-ph.HE gr-qc

    Properties and astrophysical implications of the 150 Msun binary black hole merger GW190521

    Authors: The LIGO Scientific Collaboration, the Virgo Collaboration, R. Abbott, T. D. Abbott, S. Abraham, F. Acernese, K. Ackley, C. Adams, R. X. Adhikari, V. B. Adya, C. Affeldt, M. Agathos, K. Agatsuma, N. Aggarwal, O. D. Aguiar, A. Aich, L. Aiello, A. Ain, P. Ajith, S. Akcay, G. Allen, A. Allocca, P. A. Altin, A. Amato, S. Anand , et al. (1233 additional authors not shown)

    Abstract: The gravitational-wave signal GW190521 is consistent with a binary black hole merger source at redshift 0.8 with unusually high component masses, $85^{+21}_{-14}\,M_{\odot}$ and $66^{+17}_{-18}\,M_{\odot}$, compared to previously reported events, and shows mild evidence for spin-induced orbital precession. The primary falls in the mass gap predicted by (pulsational) pair-instability supernova theo…

    Submitted 2 September, 2020; originally announced September 2020.

    Comments: 39 pages, 13 figures; data available at https://dcc.ligo.org/P2000158-v4/public

    Report number: LIGO-P2000021

    Journal ref: Astrophys. J. Lett. 900, L13 (2020)

  26. GW190521: A Binary Black Hole Merger with a Total Mass of $150 ~ M_{\odot}$

    Authors: The LIGO Scientific Collaboration, the Virgo Collaboration, R. Abbott, T. D. Abbott, S. Abraham, F. Acernese, K. Ackley, C. Adams, R. X. Adhikari, V. B. Adya, C. Affeldt, M. Agathos, K. Agatsuma, N. Aggarwal, O. D. Aguiar, A. Aich, L. Aiello, A. Ain, P. Ajith, S. Akcay, G. Allen, A. Allocca, P. A. Altin, A. Amato, S. Anand , et al. (1232 additional authors not shown)

    Abstract: On May 21, 2019 at 03:02:29 UTC Advanced LIGO and Advanced Virgo observed a short duration gravitational-wave signal, GW190521, with a three-detector network signal-to-noise ratio of 14.7, and an estimated false-alarm rate of 1 in 4900 yr using a search sensitive to generic transients. If GW190521 is from a quasicircular binary inspiral, then the detected signal is consistent with the merger of tw…

    Submitted 2 September, 2020; originally announced September 2020.

    Comments: Supplementary Material at https://dcc.ligo.org/LIGO-P2000020/Public

    Journal ref: Phys. Rev. Lett. 125, 101102 (2020)

  27. arXiv:2007.02786  [pdf, other]

    cs.LG stat.ML

    TDprop: Does Jacobi Preconditioning Help Temporal Difference Learning?

    Authors: Joshua Romoff, Peter Henderson, David Kanaa, Emmanuel Bengio, Ahmed Touati, Pierre-Luc Bacon, Joelle Pineau

    Abstract: We investigate whether Jacobi preconditioning, accounting for the bootstrap term in temporal difference (TD) learning, can help boost performance of adaptive optimizers. Our method, TDprop, computes a per parameter learning rate based on the diagonal preconditioning of the TD update rule. We show how this can be used in both $n$-step returns and TD($λ$). Our theoretical findings demonstrate that i…

    Submitted 6 July, 2020; originally announced July 2020.

    Comments: Presented at the Theoretical Foundations of Reinforcement Learning workshop at ICML 2020

  28. arXiv:2006.12611  [pdf, other]

    astro-ph.HE gr-qc

    GW190814: Gravitational Waves from the Coalescence of a 23 M$_\odot$ Black Hole with a 2.6 M$_\odot$ Compact Object

    Authors: The LIGO Scientific Collaboration, the Virgo Collaboration, R. Abbott, T. D. Abbott, S. Abraham, F. Acernese, K. Ackley, C. Adams, R. X. Adhikari, V. B. Adya, C. Affeldt, M. Agathos, K. Agatsuma, N. Aggarwal, O. D. Aguiar, A. Aich, L. Aiello, A. Ain, P. Ajith, S. Akcay, G. Allen, A. Allocca, P. A. Altin, A. Amato, S. Anand , et al. (1232 additional authors not shown)

    Abstract: We report the observation of a compact binary coalescence involving a 22.2 - 24.3 $M_{\odot}$ black hole and a compact object with a mass of 2.50 - 2.67 $M_{\odot}$ (all measurements quoted at the 90$\%$ credible level). The gravitational-wave signal, GW190814, was observed during LIGO's and Virgo's third observing run on August 14, 2019 at 21:10:39 UTC and has a signal-to-noise ratio of 25 in the…

    Submitted 22 June, 2020; originally announced June 2020.

    Comments: 23 pages, 8 figures, accepted by ApJ Letters

    Report number: LIGO-P190814

  29. GW190412: Observation of a Binary-Black-Hole Coalescence with Asymmetric Masses

    Authors: The LIGO Scientific Collaboration, the Virgo Collaboration, R. Abbott, T. D. Abbott, S. Abraham, F. Acernese, K. Ackley, C. Adams, R. X. Adhikari, V. B. Adya, C. Affeldt, M. Agathos, K. Agatsuma, N. Aggarwal, O. D. Aguiar, A. Aich, L. Aiello, A. Ain, P. Ajith, S. Akcay, G. Allen, A. Allocca, P. A. Altin, A. Amato, S. Anand , et al. (1232 additional authors not shown)

    Abstract: We report the observation of gravitational waves from a binary-black-hole coalescence during the first two weeks of LIGO's and Virgo's third observing run. The signal was recorded on April 12, 2019 at 05:30:44 UTC with a network signal-to-noise ratio of 19. The binary is different from observations during the first two observing runs most notably due to its asymmetric masses: a ~30 solar mass blac…

    Submitted 24 August, 2020; v1 submitted 17 April, 2020; originally announced April 2020.

    Comments: 29 pages, 12 figures; data available under https://doi.org/10.7935/20yv-ka61 posterior samples available under https://dcc.ligo.org/P190412/public

    Report number: LIGO-P190412

    Journal ref: Phys. Rev. D 102, 043015 (2020)

  30. arXiv:2002.11833  [pdf, other]

    cs.LG cs.AI stat.ML

    Policy Evaluation Networks

    Authors: Jean Harb, Tom Schaul, Doina Precup, Pierre-Luc Bacon

    Abstract: Many reinforcement learning algorithms use value functions to guide the search for better policies. These methods estimate the value of a single policy while generalizing across many states. The core idea of this paper is to flip this convention and estimate the value of many policies, for a single set of states. This approach opens up the possibility of performing direct gradient ascent in policy…

    Submitted 26 February, 2020; originally announced February 2020.

    Comments: 12 pages, 11 figures

  31. arXiv:2001.01761  [pdf]

    astro-ph.HE gr-qc

    GW190425: Observation of a Compact Binary Coalescence with Total Mass $\sim 3.4 M_{\odot}$

    Authors: The LIGO Scientific Collaboration, the Virgo Collaboration, B. P. Abbott, R. Abbott, T. D. Abbott, S. Abraham, F. Acernese, K. Ackley, C. Adams, R. X. Adhikari, V. B. Adya, C. Affeldt, M. Agathos, K. Agatsuma, N. Aggarwal, O. D. Aguiar, L. Aiello, A. Ain, P. Ajith, G. Allen, A. Allocca, M. A. Aloy, P. A. Altin, A. Amato, S. Anand , et al. (1177 additional authors not shown)

    Abstract: On 2019 April 25, the LIGO Livingston detector observed a compact binary coalescence with signal-to-noise ratio 12.9. The Virgo detector was also taking data that did not contribute to detection due to a low signal-to-noise ratio, but were used for subsequent parameter estimation. The 90% credible intervals for the component masses range from 1.12 to 2.52 $M_{\odot}$ (1.45 to 1.88 $M_{\odot}$ if w…

    Submitted 7 April, 2020; v1 submitted 6 January, 2020; originally announced January 2020.

    Comments: 24 pages, 19 figures, published in ApJL

    Report number: LIGO-P190425

    Journal ref: Astrophysical Journal Letters 892 (2020) L3

  32. A Joint Fermi-GBM and LIGO/Virgo Analysis of Compact Binary Mergers From the First and Second Gravitational-wave Observing Runs

    Authors: The Fermi Gamma-ray Burst Monitor Team, the LIGO Scientific Collaboration, the Virgo Collaboration, R. Hamburg, C. Fletcher, E. Burns, A. Goldstein, E. Bissaldi, M. S. Briggs, W. H. Cleveland, M. M. Giles, C. M. Hui, D. Kocevski, S. Lesage, B. Mailyan, C. Malacaria, S. Poolakkil, R. Preece, O. J. Roberts, P. Veres, A. von Kienlin, C. A. Wilson-Hodge, J. Wood, R. Abbott , et al. (1241 additional authors not shown)

    Abstract: We present results from offline searches of Fermi Gamma-ray Burst Monitor (GBM) data for gamma-ray transients coincident with the compact binary coalescences observed by the gravitational-wave (GW) detectors Advanced LIGO and Advanced Virgo during their first and second observing runs. In particular, we perform follow-up for both confirmed events and low significance candidates reported in the LIG…

    Submitted 24 February, 2020; v1 submitted 3 January, 2020; originally announced January 2020.

    Comments: Accepted for publication in ApJ. 18 pages, 4 figures, 1 table

    Journal ref: The Astrophysical Journal, 893:100 (14pp), 2020 April 20

  33. arXiv:2001.00271  [pdf, other]

    cs.LG cs.AI stat.ML

    Options of Interest: Temporal Abstraction with Interest Functions

    Authors: Khimya Khetarpal, Martin Klissarov, Maxime Chevalier-Boisvert, Pierre-Luc Bacon, Doina Precup

    Abstract: Temporal abstraction refers to the ability of an agent to use behaviours of controllers which act for a limited, variable amount of time. The options framework describes such behaviours as consisting of a subset of states in which they can initiate, an internal policy and a stochastic termination condition. However, much of the subsequent work on option discovery has ignored the initiation set, be…

    Submitted 1 January, 2020; originally announced January 2020.

    Comments: To appear in Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence (AAAI-20)

  34. Open data from the first and second observing runs of Advanced LIGO and Advanced Virgo

    Authors: The LIGO Scientific Collaboration, the Virgo Collaboration, R. Abbott, T. D. Abbott, S. Abraham, F. Acernese, K. Ackley, C. Adams, R. X. Adhikari, V. B. Adya, C. Affeldt, M. Agathos, K. Agatsuma, N. Aggarwal, O. D. Aguiar, A. Aich, L. Aiello, A. Ain, P. Ajith, G. Allen, A. Allocca, P. A. Altin, A. Amato, S. Anand, A. Ananyeva , et al. (1223 additional authors not shown)

    Abstract: Advanced LIGO and Advanced Virgo are actively monitoring the sky and collecting gravitational-wave strain data with sufficient sensitivity to detect signals routinely. In this paper we describe the data recorded by these instruments during their first and second observing runs. The main data products are the gravitational-wave strain arrays, released as time series sampled at 16384 Hz. The dataset…

    Submitted 25 January, 2021; v1 submitted 25 December, 2019; originally announced December 2019.

    Comments: 42 pages, 5 figures

    Report number: LIGO-P1900206

    Journal ref: SoftwareX 13 (2021) 100658

  35. arXiv:1912.05104  [pdf, other]

    cs.LG cs.AI stat.ML

    Entropy Regularization with Discounted Future State Distribution in Policy Gradient Methods

    Authors: Riashat Islam, Raihan Seraj, Pierre-Luc Bacon, Doina Precup

    Abstract: The policy gradient theorem is defined based on an objective with respect to the initial distribution over states. In the discounted case, this results in policies that are optimal for one distribution over initial states, but may not be uniformly optimal for others, no matter where the agent starts from. Furthermore, to obtain unbiased gradient estimates, the starting point of the policy gradient…

    Submitted 10 December, 2019; originally announced December 2019.

    Comments: In Submission; Appeared at NeurIPS 2019 Optimization Foundations of Reinforcement Learning Workshop

  36. arXiv:1910.09093  [pdf, ps, other]

    cs.LG cs.AI stat.ML

    All-Action Policy Gradient Methods: A Numerical Integration Approach

    Authors: Benjamin Petit, Loren Amdahl-Culleton, Yao Liu, Jimmy Smith, Pierre-Luc Bacon

    Abstract: While often stated as an instance of the likelihood ratio trick [Rubinstein, 1989], the original policy gradient theorem [Sutton, 1999] involves an integral over the action space. When this integral can be computed, the resulting "all-action" estimator [Sutton, 2001] provides a conditioning effect [Bratley, 1987] reducing the variance significantly compared to the REINFORCE estimator [Williams, 19…

    Submitted 20 October, 2019; originally announced October 2019.

    Comments: 9 pages, 2 figures. NeurIPS 2019 Optimization Foundations of Reinforcement Learning Workshop

  37. arXiv:1910.06508  [pdf, other]

    cs.LG stat.ML

    Understanding the Curse of Horizon in Off-Policy Evaluation via Conditional Importance Sampling

    Authors: Yao Liu, Pierre-Luc Bacon, Emma Brunskill

    Abstract: Off-policy policy estimators that use importance sampling (IS) can suffer from high variance in long-horizon domains, and there has been particular excitement over new IS methods that leverage the structure of Markov decision processes. We analyze the variance of the most popular approaches through the viewpoint of conditional Monte Carlo. Surprisingly, we find that in finite horizon MDPs there is…

    Submitted 5 June, 2020; v1 submitted 14 October, 2019; originally announced October 2019.

    Comments: Accepted by ICML 2020, 21 pages, 1 figure

  38. arXiv:1908.11170  [pdf, other]

    gr-qc astro-ph.IM

    A guide to LIGO-Virgo detector noise and extraction of transient gravitational-wave signals

    Authors: The LIGO Scientific Collaboration, the Virgo Collaboration, B. P. Abbott, R. Abbott, T. D. Abbott, S. Abraham, F. Acernese, K. Ackley, C. Adams, V. B. Adya, C. Affeldt, M. Agathos, K. Agatsuma, N. Aggarwal, O. D. Aguiar, L. Aiello, A. Ain, P. Ajith, T. Alford, G. Allen, A. Allocca, M. A. Aloy, P. A. Altin, A. Amato, A. Ananyeva , et al. (1113 additional authors not shown)

    Abstract: The LIGO Scientific Collaboration and the Virgo Collaboration have cataloged eleven confidently detected gravitational-wave events during the first two observing runs of the advanced detector era. All eleven events were consistent with being from well-modeled mergers between compact stellar-mass objects: black holes or neutron stars. The data around the time of each of these events have been made…

    Submitted 10 February, 2020; v1 submitted 29 August, 2019; originally announced August 2019.

    Journal ref: B P Abbott et al 2020 Class. Quantum Grav. 37 055002

  39. arXiv:1908.06060  [pdf, other]

    astro-ph.CO gr-qc

    A gravitational-wave measurement of the Hubble constant following the second observing run of Advanced LIGO and Virgo

    Authors: The LIGO Scientific Collaboration, the Virgo Collaboration, B. P. Abbott, R. Abbott, T. D. Abbott, S. Abraham, F. Acernese, K. Ackley, C. Adams, R. X. Adhikari, V. B. Adya, C. Affeldt, M. Agathos, K. Agatsuma, N. Aggarwal, O. D. Aguiar, L. Aiello, A. Ain, P. Ajith, G. Allen, A. Allocca, M. A. Aloy, P. A. Altin, A. Amato, S. Anand , et al. (1164 additional authors not shown)

    Abstract: This paper presents the gravitational-wave measurement of the Hubble constant ($H_0$) using the detections from the first and second observing runs of the Advanced LIGO and Virgo detector network. The presence of the transient electromagnetic counterpart of the binary neutron star GW170817 led to the first standard-siren measurement of $H_0$. Here we additionally use binary black hole detections i…

    Submitted 8 November, 2021; v1 submitted 16 August, 2019; originally announced August 2019.

    Comments: 21 pages, 8 figures; this version corrects Fig 2; there are minor changes to also Figs 3 & 4 and the final results

    Report number: LIGO-P1900015

    Journal ref: Astrophys J 909 Number 2 218 (2021)

  40. An Optically Targeted Search for Gravitational Waves emitted by Core-Collapse Supernovae during the First and Second Observing Runs of Advanced LIGO and Advanced Virgo

    Authors: The LIGO Scientific Collaboration, the Virgo Collaboration, B. P. Abbott, R. Abbott, T. D. Abbott, S. Abraham, F. Acernese, K. Ackley, C. Adams, V. B. Adya, C. Affeldt, M. Agathos, K. Agatsuma, N. Aggarwal, O. D. Aguiar, L. Aiello, A. Ain, P. Ajith, G. Allen, A. Allocca, M. A. Aloy, P. A. Altin, A. Amato, S. Anand, A. Ananyeva , et al. (1173 additional authors not shown)

    Abstract: We present the results from a search for gravitational-wave transients associated with core-collapse supernovae observed within a source distance of approximately 20 Mpc during the first and second observing runs of Advanced LIGO and Advanced Virgo. No significant gravitational-wave candidate was detected. We report the detection efficiencies as a function of the distance for waveforms derived fro…

    Submitted 20 August, 2019; v1 submitted 9 August, 2019; originally announced August 2019.

    Comments: 13 pages, 5 figures

    Report number: LIGO-P1700177

    Journal ref: Phys. Rev. D 101, 084002 (2020)

  41. arXiv:1908.01012  [pdf]

    gr-qc astro-ph.HE

    Model comparison from LIGO-Virgo data on GW170817's binary components and consequences for the merger remnant

    Authors: The LIGO Scientific Collaboration, the Virgo Collaboration, B. P. Abbott, R. Abbott, T. D. Abbott, S. Abraham, F. Acernese, K. Ackley, C. Adams, V. B. Adya, C. Affeldt, M. Agathos, K. Agatsuma, N. Aggarwal, O. D. Aguiar, L. Aiello, A. Ain, P. Ajith, G. Allen, A. Allocca, M. A. Aloy, P. A. Altin, A. Amato, S. Anand, A. Ananyeva , et al. (1169 additional authors not shown)

    Abstract: GW170817 is the very first observation of gravitational waves originating from the coalescence of two compact objects in the mass range of neutron stars, accompanied by electromagnetic counterparts, and offers an opportunity to directly probe the internal structure of neutron stars. We perform Bayesian model selection on a wide range of theoretical predictions for the neutron star equation of stat…

    Submitted 6 March, 2020; v1 submitted 2 August, 2019; originally announced August 2019.

    Comments: 35 pages, 4 figures

    Report number: LIGO-P1800379

    Journal ref: Classical and Quantum Gravity, Vol. 37, No 4, p 045006 (2020)

  42. Astrophysical signal consistency test adapted for gravitational-wave transient searches

    Authors: V. Gayathri, P. Bacon, A. Pai, E. Chassande-Mottin, F. Salemi, G. Vedovato

    Abstract: Gravitational wave astronomy is established with direct observation of gravitational wave from merging binary black holes and binary neutron stars during the first and second observing run of LIGO and Virgo detectors. The gravitational-wave transient searches mainly categories into two families: modeled and modeled-independent searches. The modeled searches are based on matched filtering technique…

    Submitted 23 December, 2019; v1 submitted 25 July, 2019; originally announced July 2019.

    Comments: main paper: 8 page and 13 figures, total with appendices: 10 pages and 13 figures

    Report number: LIGO-P1900221

    Journal ref: Phys. Rev. D 100, 124022 (2019)

  43. Search for Eccentric Binary Black Hole Mergers with Advanced LIGO and Advanced Virgo during their First and Second Observing Runs

    Authors: The LIGO Scientific Collaboration, the Virgo Collaboration, B. P. Abbott, R. Abbott, T. D. Abbott, S. Abraham, F. Acernese, K. Ackley, C. Adams, R. X. Adhikari, V. B. Adya, C. Affeldt, M. Agathos, K. Agatsuma, N. Aggarwal, O. D. Aguiar, L. Aiello, A. Ain, P. Ajith, G. Allen, A. Allocca, M. A. Aloy, P. A. Altin, A. Amato, S. Anand , et al. (1161 additional authors not shown)

    Abstract: When formed through dynamical interactions, stellar-mass binary black holes may retain eccentric orbits ($e>0.1$ at 10 Hz) detectable by ground-based gravitational-wave detectors. Eccentricity can therefore be used to differentiate dynamically-formed binaries from isolated binary black hole mergers. Current template-based gravitational-wave searches do not use waveform models associated to eccentr…

    Submitted 24 January, 2020; v1 submitted 22 July, 2019; originally announced July 2019.

    Comments: 7 pages, 2 figures

    Report number: LIGO Document P1900110

  44. arXiv:1907.01443  [pdf]

    astro-ph.HE gr-qc

    Search for gravitational-wave signals associated with gamma-ray bursts during the second observing run of Advanced LIGO and Advanced Virgo

    Authors: B. P. Abbott, R. Abbott, T. D. Abbott, S. Abraham, F. Acernese, K. Ackley, C. Adams, R. X. Adhikari, V. B. Adya, C. Affeldt, M. Agathos, K. Agatsuma, N. Aggarwal, O. D. Aguiar, L. Aiello, A. Ain, P. Ajith, G. Allen, A. Allocca, M. A. Aloy, P. A. Altin, A. Amato, S. Anand, A. Ananyeva, S. B. Anderson , et al. (1174 additional authors not shown)

    Abstract: We present the results of targeted searches for gravitational-wave transients associated with gamma-ray bursts during the second observing run of Advanced LIGO and Advanced Virgo, which took place from 2016 November to 2017 August. We have analyzed 98 gamma-ray bursts using an unmodeled search method that searches for generic transient gravitational waves and 42 with a modeled search method that t…

    Submitted 22 November, 2019; v1 submitted 2 July, 2019; originally announced July 2019.

    Report number: LIGO-P1900034

    Journal ref: Astrophys. J. 886, 75 (2019)

  45. Search for gravitational waves from Scorpius X-1 in the second Advanced LIGO observing run with an improved hidden Markov model

    Authors: The LIGO Scientific Collaboration, the Virgo Collaboration, B. P. Abbott, R. Abbott, T. D. Abbott, S. Abraham, F. Acernese, K. Ackley, C. Adams, R. X. Adhikari, V. B. Adya, C. Affeldt, M. Agathos, K. Agatsuma, N. Aggarwal, O. D. Aguiar, L. Aiello, A. Ain, P. Ajith, G. Allen, A. Allocca, M. A. Aloy, P. A. Altin, A. Amato, A. Ananyeva , et al. (1112 additional authors not shown)

    Abstract: We present results from a semicoherent search for continuous gravitational waves from the low-mass X-ray binary Scorpius X-1, using a hidden Markov model (HMM) to track spin wandering. This search improves on previous HMM-based searches of LIGO data by using an improved frequency domain matched filter, the $\mathcal{J}$-statistic, and by analysing data from Advanced LIGO's second observing run. In…

    Submitted 27 November, 2019; v1 submitted 28 June, 2019; originally announced June 2019.

    Comments: 21 pages, 5 figures; accepted for publication in Physical Review D

    Report number: LIGO-P1800208; erratum LIGO-P2100373

    Journal ref: Phys. Rev. D 100, 122002 (2019); erratum Phys. Rev. D 104, 109903 (2021)

  46. Search for intermediate mass black hole binaries in the first and second observing runs of the Advanced LIGO and Virgo network

    Authors: The LIGO Scientific Collaboration, the Virgo Collaboration, B. P. Abbott, R. Abbott, T. D. Abbott, S. Abraham, F. Acernese, K. Ackley, A. Adams, C. Adams, R. X. Adhikari, V. B. Adya, C. Affeldt, M. Agathos, K. Agatsuma, N. Aggarwal, O. D. Aguiar, L. Aiello, A. Ain, P. Ajith, G. Allen, A. Allocca, M. A. Aloy, P. A. Altin, A. Amato , et al. (1174 additional authors not shown)

    Abstract: Gravitational wave astronomy has been firmly established with the detection of gravitational waves from the merger of ten stellar mass binary black holes and a neutron star binary. This paper reports on the all-sky search for gravitational waves from intermediate mass black hole binaries in the first and second observing runs of the Advanced LIGO and Virgo network. The search uses three independen…

    Submitted 24 January, 2020; v1 submitted 19 June, 2019; originally announced June 2019.

    Comments: main paper: 14 pages, 2 figures and 1 table : total with appendices 19 pages, 2 figures and 2 tables

    Report number: LIGO-P1900045

    Journal ref: Phys. Rev. D 100, 064064 (2019)

  47. All-sky search for short gravitational-wave bursts in the second Advanced LIGO and Advanced Virgo run

    Authors: The LIGO Scientific Collaboration, the Virgo Collaboration, B. P. Abbott, R. Abbott, T. D. Abbott, S. Abraham, F. Acernese, K. Ackley, C. Adams, R. X. Adhikari, V. B. Adya, C. Affeldt, M. Agathos, K. Agatsuma, N. Aggarwal, O. D. Aguiar, L. Aiello, A. Ain, P. Ajith, G. Allen, A. Allocca, M. A. Aloy, P. A. Altin, A. Amato, S. Anand , et al. (1164 additional authors not shown)

    Abstract: We present the results of a search for short-duration gravitational-wave transients in the data from the second observing run of Advanced LIGO and Advanced Virgo. We search for gravitational-wave transients with a duration of milliseconds to approximately one second in the 32-4096 Hz frequency band with minimal assumptions about the signal properties, thus targeting a wide variety of sources. We a…

    Submitted 9 May, 2019; originally announced May 2019.

    Comments: 10 pages, 6 figures

    Report number: LIGO-P1800308

    Journal ref: Phys. Rev. D 100, 024017 (2019)

  48. arXiv:1904.08976  [pdf, other]

    astro-ph.CO astro-ph.HE gr-qc

    Search for sub-solar mass ultracompact binaries in Advanced LIGO's second observing run

    Authors: The LIGO Scientific Collaboration, the Virgo Collaboration, B. P. Abbott, R. Abbott, T. D. Abbott, S. Abraham, F. Acernese, K. Ackley, C. Adams, R. X. Adhikari, V. B. Adya, C. Affeldt, M. Agathos, K. Agatsuma, N. Aggarwal, O. D. Aguiar, L. Aiello, A. Ain, P. Ajith, G. Allen, A. Allocca, M. A. Aloy, P. A. Altin, A. Amato, S. Anand , et al. (1165 additional authors not shown)

    Abstract: We present an Advanced LIGO and Advanced Virgo search for sub-solar mass ultracompact objects in data obtained during Advanced LIGO's second observing run. In contrast to a previous search of Advanced LIGO data from the first observing run, this search includes the effects of component spin on the gravitational waveform. We identify no viable gravitational wave candidates consistent with sub-solar…

    Submitted 25 May, 2019; v1 submitted 18 April, 2019; originally announced April 2019.

    Report number: LIGO-P1900037

    Journal ref: Phys. Rev. Lett. 123, 161102 (2019)

  49. All-sky search for long-duration gravitational-wave transients in the second Advanced LIGO observing run

    Authors: The LIGO Scientific Collaboration, the Virgo Collaboration, B. P. Abbott, R. Abbott, T. D. Abbott, S. Abraham, F. Acernese, K. Ackley, C. Adams, R. X. Adhikari, V. B. Adya, C. Affeldt, M. Agathos, K. Agatsuma, N. Aggarwal, O. D. Aguiar, L. Aiello, A. Ain, P. Ajith, G. Allen, A. Allocca, M. A. Aloy, P. A. Altin, A. Amato, S. Anand , et al. (1161 additional authors not shown)

    Abstract: We present the results of a search for long-duration gravitational-wave transients in the data from the Advanced LIGO second observation run; we search for gravitational-wave transients of $2~\text{--}~ 500$~s duration in the $24 - 2048$\,Hz frequency band with minimal assumptions about signal properties such as waveform morphologies, polarization, sky location or time of occurrence. Targeted sign…

    Submitted 27 September, 2019; v1 submitted 28 March, 2019; originally announced March 2019.

    Journal ref: Phys. Rev. D 99, 104033 (2019)

  50. Directional limits on persistent gravitational waves using data from Advanced LIGO's first two observing runs

    Authors: The LIGO Scientific Collaboration, the Virgo Collaboration, B. P. Abbott, R. Abbott, T. D. Abbott, S. Abraham, F. Acernese, K. Ackley, C. Adams, R. X. Adhikari, V. B. Adya, C. Affeldt, M. Agathos, K. Agatsuma, N. Aggarwal, O. D. Aguiar, L. Aiello, A. Ain, P. Ajith, G. Allen, A. Allocca, M. A. Aloy, P. A. Altin, A. Amato, A. Ananyeva , et al. (1110 additional authors not shown)

    Abstract: We perform an unmodeled search for persistent, directional gravitational wave (GW) sources using data from the first and second observing runs of Advanced LIGO. We do not find evidence for any GW signals. We place limits on the broadband GW flux emitted at 25~Hz from point sources with a power law spectrum at $F_{α,Θ} <(0.05-25)\times 10^{-8} ~{\rm erg\,cm^{-2}\,s^{-1}\,Hz^{-1}}$ and the (normaliz…

    Submitted 9 September, 2019; v1 submitted 21 March, 2019; originally announced March 2019.

    Comments: 15 pages, 5 figures

    Report number: LIGO-P1900053

    Journal ref: Phys. Rev. D 100, 062001 (2019)