-
IDs for AI Systems
Authors:
Alan Chan,
Noam Kolt,
Peter Wills,
Usman Anwar,
Christian Schroeder de Witt,
Nitarshan Rajkumar,
Lewis Hammond,
David Krueger,
Lennart Heim,
Markus Anderljung
Abstract:
AI systems are increasingly pervasive, yet information needed to decide whether and how to engage with them may not exist or be accessible. A user may not be able to verify whether a system satisfies certain safety standards. An investigator may not know whom to investigate when a system causes an incident. A platform may find it difficult to penalize repeated negative interactions with the same system. Across a number of domains, IDs address analogous problems by identifying \textit{particular} entities (e.g., a particular Boeing 747) and providing information about other entities of the same class (e.g., some or all Boeing 747s). We propose a framework in which IDs are ascribed to \textbf{instances} of AI systems (e.g., a particular chat session with Claude 3), and associated information is accessible to parties seeking to interact with that system. We characterize IDs for AI systems, argue that there could be significant demand for IDs from key actors, analyze how those actors could incentivize ID adoption, explore potential implementations of our framework, and highlight limitations and risks. IDs seem most warranted in high-stakes settings, where certain actors (e.g., those that enable AI systems to make financial transactions) could experiment with incentives for ID use. Deployers of AI systems could experiment with developing ID implementations. With further study, IDs could help to manage a world where AI systems pervade society.
Submitted 17 June, 2024;
originally announced June 2024.
-
Sliding Window 3-Objective Pareto Optimization for Problems with Chance Constraints
Authors:
Frank Neumann,
Carsten Witt
Abstract:
Constrained single-objective problems have been frequently tackled by evolutionary multi-objective algorithms where the constraint is relaxed into an additional objective. Recently, it has been shown that Pareto optimization approaches using bi-objective models can be significantly sped up using sliding windows (Neumann and Witt, ECAI 2023). In this paper, we extend the sliding window approach to $3$-objective formulations for tackling chance constrained problems. On the theoretical side, we show that our new sliding window approach improves previous runtime bounds obtained in (Neumann and Witt, GECCO 2023) while maintaining the same approximation guarantees. Our experimental investigations for the chance constrained dominating set problem show that our new sliding window approach allows one to solve much larger instances in a much more efficient way than the 3-objective approach presented in (Neumann and Witt, GECCO 2023).
Submitted 7 June, 2024;
originally announced June 2024.
-
Unelicitable Backdoors in Language Models via Cryptographic Transformer Circuits
Authors:
Andis Draguns,
Andrew Gritsevskiy,
Sumeet Ramesh Motwani,
Charlie Rogers-Smith,
Jeffrey Ladish,
Christian Schroeder de Witt
Abstract:
The rapid proliferation of open-source language models significantly increases the risks of downstream backdoor attacks. These backdoors can introduce dangerous behaviours during model deployment and can evade detection by conventional cybersecurity monitoring systems. In this paper, we introduce a novel class of backdoors in autoregressive transformer models, that, in contrast to prior art, are unelicitable in nature. Unelicitability prevents the defender from triggering the backdoor, making it impossible to evaluate or detect ahead of deployment even if given full white-box access and using automated techniques, such as red-teaming or certain formal verification methods. We show that our novel construction is not only unelicitable thanks to using cryptographic techniques, but also has favourable robustness properties. We confirm these properties in empirical investigations, and provide evidence that our backdoors can withstand state-of-the-art mitigation strategies. Additionally, we expand on previous work by showing that our universal backdoors, while not completely undetectable in white-box settings, can be harder to detect than some existing designs. By demonstrating the feasibility of seamlessly integrating backdoors into transformer models, this paper fundamentally questions the efficacy of pre-deployment detection strategies. This offers new insights into the offence-defence balance in AI safety and security.
Submitted 3 June, 2024;
originally announced June 2024.
-
Computing Low-Entropy Couplings for Large-Support Distributions
Authors:
Samuel Sokota,
Dylan Sam,
Christian Schroeder de Witt,
Spencer Compton,
Jakob Foerster,
J. Zico Kolter
Abstract:
Minimum-entropy coupling (MEC) -- the process of finding a joint distribution with minimum entropy for given marginals -- has applications in areas such as causality and steganography. However, existing algorithms are either computationally intractable for large-support distributions or limited to specific distribution types and sensitive to hyperparameter choices. This work addresses these limitations by unifying a prior family of iterative MEC (IMEC) approaches into a generalized partition-based formalism. From this framework, we derive a novel IMEC algorithm called ARIMEC, capable of handling arbitrary discrete distributions, and introduce a method to make IMEC robust to suboptimal hyperparameter settings. These innovations facilitate the application of IMEC to high-throughput steganography with language models, among other settings. Our codebase is available at https://github.com/ssokota/mec .
Submitted 29 May, 2024;
originally announced May 2024.
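To make the notion of a coupling in the entry above concrete, here is a minimal sketch of a simple greedy heuristic for two discrete marginals. It is not ARIMEC or any IMEC variant from the paper; the function names `greedy_coupling` and `entropy` are hypothetical, and the heuristic only approximates the minimum-entropy coupling. It repeatedly pairs the largest remaining mass of each marginal and moves as much probability as both allow.

```python
import math


def greedy_coupling(p, q, tol=1e-12):
    """Greedy low-entropy coupling of two discrete marginals p and q.

    Illustrative heuristic only (an assumption of this sketch, not the
    paper's algorithm). Returns a dict mapping (i, j) to joint mass.
    """
    p, q = list(p), list(q)
    joint = {}
    while True:
        # Pick the largest remaining mass in each marginal.
        i = max(range(len(p)), key=p.__getitem__)
        j = max(range(len(q)), key=q.__getitem__)
        mass = min(p[i], q[j])
        if mass <= tol:
            break
        joint[(i, j)] = joint.get((i, j), 0.0) + mass
        # At least one of the two entries becomes exactly zero here,
        # so the loop ends after at most len(p) + len(q) steps.
        p[i] -= mass
        q[j] -= mass
    return joint


def entropy(joint):
    """Shannon entropy in bits of a joint distribution given as a dict."""
    return -sum(m * math.log2(m) for m in joint.values() if m > 0)


if __name__ == "__main__":
    coupling = greedy_coupling([0.5, 0.25, 0.25], [0.6, 0.4])
    print(coupling, entropy(coupling))
```

Summing the returned masses over `j` recovers the first marginal and summing over `i` recovers the second, up to floating-point error, which is what makes the result a coupling.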
-
Near to Mid-term Risks and Opportunities of Open-Source Generative AI
Authors:
Francisco Eiras,
Aleksandar Petrov,
Bertie Vidgen,
Christian Schroeder de Witt,
Fabio Pizzati,
Katherine Elkins,
Supratik Mukhopadhyay,
Adel Bibi,
Botos Csaba,
Fabro Steibel,
Fazl Barez,
Genevieve Smith,
Gianluca Guadagni,
Jon Chun,
Jordi Cabot,
Joseph Marvin Imperial,
Juan A. Nolazco-Flores,
Lori Landay,
Matthew Jackson,
Paul Röttger,
Philip H. S. Torr,
Trevor Darrell,
Yong Suk Lee,
Jakob Foerster
Abstract:
In the next few years, applications of Generative AI are expected to revolutionize a number of different areas, ranging from science & medicine to education. The potential for these seismic changes has triggered a lively debate about potential risks and resulted in calls for tighter regulation, in particular from some of the major tech companies who are leading in AI development. This regulation is likely to put at risk the budding field of open-source Generative AI. We argue for the responsible open sourcing of generative AI models in the near and medium term. To set the stage, we first introduce an AI openness taxonomy system and apply it to 40 current large language models. We then outline differential benefits and risks of open versus closed source AI and present potential risk mitigation, ranging from best practices to calls for technical and scientific contributions. We hope that this report will add a much needed missing voice to the current public discourse on near to mid-term AI safety and other societal impact.
Submitted 24 May, 2024; v1 submitted 25 April, 2024;
originally announced April 2024.
-
Runtime Analysis of a Multi-Valued Compact Genetic Algorithm on Generalized OneMax
Authors:
Sumit Adak,
Carsten Witt
Abstract:
A class of metaheuristic techniques called estimation-of-distribution algorithms (EDAs) are employed in optimization as more sophisticated substitutes for traditional strategies like evolutionary algorithms. EDAs generally drive the search for the optimum by creating explicit probabilistic models of potential candidate solutions through repeated sampling and selection from the underlying search space.
Most theoretical research on EDAs has focused on pseudo-Boolean optimization. Jedidia et al. (GECCO 2023) proposed the first EDAs for optimizing problems involving multi-valued decision variables. By building a framework, they have analyzed the runtime of a multi-valued UMDA on the r-valued LeadingOnes function. Using their framework, here we focus on the multi-valued compact genetic algorithm (r-cGA) and provide a first runtime analysis of a generalized OneMax function.
To prove our results, we investigate the effect of genetic drift and progress of the probabilistic model towards the optimum. After finding the right algorithm parameters, we prove that the r-cGA solves this r-valued OneMax problem efficiently. We show that with high probability, the runtime bound is $O(r^2 n \log^2 r \log^3 n)$. At the end of experiments, we state one conjecture related to the expected runtime of another variant of multi-valued OneMax function.
Submitted 17 April, 2024;
originally announced April 2024.
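As a rough illustration of the kind of algorithm analyzed in the entry above, the following sketch implements the classical binary compact genetic algorithm (the case of two values per variable) on the standard OneMax function. It is not the r-cGA or the generalized OneMax of the paper; the function name `compact_ga`, the parameter name `K` for the hypothetical population size, and the border values 1/n and 1 - 1/n are assumptions of this sketch.

```python
import random


def onemax(x):
    """Number of one-bits in the bit string x."""
    return sum(x)


def compact_ga(n, K, max_iters=100_000, seed=0):
    """Binary cGA on OneMax (illustrative; not the paper's r-cGA).

    Returns (solution, iterations); solution is None if the all-ones
    string was not sampled within max_iters iterations.
    """
    rng = random.Random(seed)
    lo, hi = 1.0 / n, 1.0 - 1.0 / n  # borders keep frequencies off 0 and 1
    freq = [0.5] * n                 # one marginal frequency per bit
    for t in range(1, max_iters + 1):
        # Sample two individuals from the current product distribution.
        x = [int(rng.random() < f) for f in freq]
        y = [int(rng.random() < f) for f in freq]
        if onemax(x) < onemax(y):
            x, y = y, x              # x is now the winner
        if onemax(x) == n:
            return x, t
        # Shift each frequency by 1/K towards the winner where they differ.
        for i in range(n):
            if x[i] != y[i]:
                step = 1.0 / K if x[i] == 1 else -1.0 / K
                freq[i] = min(hi, max(lo, freq[i] + step))
    return None, max_iters


if __name__ == "__main__":
    solution, iterations = compact_ga(n=20, K=50, seed=1)
    print(solution, iterations)
```

Small values of `K` make each update large, which is where the genetic drift mentioned in the abstract becomes visible: frequencies can wander to the wrong border by chance rather than through selection.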
-
Rethinking Out-of-Distribution Detection for Reinforcement Learning: Advancing Methods for Evaluation and Detection
Authors:
Linas Nasvytis,
Kai Sandbrink,
Jakob Foerster,
Tim Franzmeyer,
Christian Schroeder de Witt
Abstract:
While reinforcement learning (RL) algorithms have been successfully applied across numerous sequential decision-making problems, their generalization to unforeseen testing environments remains a significant concern. In this paper, we study the problem of out-of-distribution (OOD) detection in RL, which focuses on identifying situations at test time that RL agents have not encountered in their training environments. We first propose a clarification of terminology for OOD detection in RL, which aligns it with the literature from other machine learning domains. We then present new benchmark scenarios for OOD detection, which introduce anomalies with temporal autocorrelation into different components of the agent-environment loop. We argue that such scenarios have been understudied in the current literature, despite their relevance to real-world situations. Confirming our theoretical predictions, our experimental results suggest that state-of-the-art OOD detectors are not able to identify such anomalies. To address this problem, we propose a novel method for OOD detection, which we call DEXTER (Detection via Extraction of Time Series Representations). By treating environment observations as time series data, DEXTER extracts salient time series features, and then leverages an ensemble of isolation forest algorithms to detect anomalies. We find that DEXTER can reliably identify anomalies across benchmark scenarios, exhibiting superior performance compared to both state-of-the-art OOD detectors and high-dimensional changepoint detectors adopted from statistics.
Submitted 10 April, 2024;
originally announced April 2024.
-
The NANOGrav 15 yr Data Set: Looking for Signs of Discreteness in the Gravitational-wave Background
Authors:
Gabriella Agazie,
Paul T. Baker,
Bence Bécsy,
Laura Blecha,
Adam Brazier,
Paul R. Brook,
Lucas Brown,
Sarah Burke-Spolaor,
J. Andrew Casey-Clyde,
Maria Charisi,
Shami Chatterjee,
Tyler Cohen,
James M. Cordes,
Neil J. Cornish,
Fronefield Crawford,
H. Thankful Cromartie,
Megan E. DeCesar,
Paul B. Demorest,
Heling Deng,
Timothy Dolch,
Elizabeth C. Ferrara,
William Fiore,
Emmanuel Fonseca,
Gabriel E. Freedman,
Nate Garver-Daniels,
et al. (58 additional authors not shown)
Abstract:
The cosmic merger history of supermassive black hole binaries (SMBHBs) is expected to produce a low-frequency gravitational wave background (GWB). Here we investigate how signs of the discrete nature of this GWB can manifest in pulsar timing arrays through excursions from, and breaks in, the expected $f_{\mathrm{GW}}^{-2/3}$ power-law of the GWB strain spectrum. To do this, we create a semi-analytic SMBHB population model, fit to NANOGrav's 15 yr GWB amplitude, and with 1,000 realizations we study the populations' characteristic strain and residual spectra. Comparing our models to the NANOGrav 15 yr spectrum, we find two interesting excursions from the power-law. The first, at $2 \; \mathrm{nHz}$, is below our GWB realizations with $p$-value significance $p = 0.05$ to $0.06$ ($\approx 1.8\sigma - 1.9\sigma$). The second, at $16 \; \mathrm{nHz}$, is above our GWB realizations with $p = 0.04$ to $0.15$ ($\approx 1.4\sigma - 2.1\sigma$). We explore the properties of a loud SMBHB which could cause such an excursion. Our simulations also show that the expected number of SMBHBs decreases by three orders of magnitude, from $\sim 10^6$ to $\sim 10^3$, between $2\; \mathrm{nHz}$ and $20 \; \mathrm{nHz}$. This causes a break in the strain spectrum as the stochasticity of the background breaks down at $26^{+28}_{-19} \; \mathrm{nHz}$, consistent with predictions pre-dating GWB measurements. The diminished GWB signal from SMBHBs at frequencies above the $26$~nHz break opens a window for PTAs to detect continuous GWs from individual SMBHBs or GWs from the early universe.
Submitted 10 April, 2024;
originally announced April 2024.
-
A Flexible Evolutionary Algorithm With Dynamic Mutation Rate Archive
Authors:
Martin S. Krejca,
Carsten Witt
Abstract:
We propose a new, flexible approach for dynamically maintaining successful mutation rates in evolutionary algorithms using $k$-bit flip mutations. The algorithm adds successful mutation rates to an archive of promising rates that are favored in subsequent steps. Rates expire when their number of unsuccessful trials has exceeded a threshold, while rates currently not present in the archive can enter it in two ways: (i) via user-defined minimum selection probabilities for rates combined with a successful step or (ii) via a stagnation detection mechanism increasing the value for a promising rate after the current bit-flip neighborhood has been explored with high probability. For the minimum selection probabilities, we suggest different options, including heavy-tailed distributions.
We conduct rigorous runtime analysis of the flexible evolutionary algorithm on the OneMax and Jump functions, on general unimodal functions, on minimum spanning trees, and on a class of hurdle-like functions with varying hurdle width that benefit particularly from the archive of promising mutation rates. In all cases, the runtime bounds are close to or even outperform the best known results for both stagnation detection and heavy-tailed mutations.
Submitted 5 April, 2024;
originally announced April 2024.
-
Secret Collusion Among Generative AI Agents
Authors:
Sumeet Ramesh Motwani,
Mikhail Baranchuk,
Martin Strohmeier,
Vijay Bolina,
Philip H. S. Torr,
Lewis Hammond,
Christian Schroeder de Witt
Abstract:
Recent capability increases in large language models (LLMs) open up applications in which teams of communicating generative AI agents solve joint tasks. This poses privacy and security challenges concerning the unauthorised sharing of information, or other unwanted forms of agent coordination. Modern steganographic techniques could render such dynamics hard to detect. In this paper, we comprehensively formalise the problem of secret collusion in systems of generative AI agents by drawing on relevant concepts from both the AI and security literature. We study incentives for the use of steganography, and propose a variety of mitigation measures. Our investigations result in a model evaluation framework that systematically tests capabilities required for various forms of secret collusion. We provide extensive empirical results across a range of contemporary LLMs. While the steganographic capabilities of current models remain limited, GPT-4 displays a capability jump suggesting the need for continuous monitoring of steganographic frontier model capabilities. We conclude by laying out a comprehensive research program to mitigate future risks of collusion between generative AI models.
Submitted 12 February, 2024;
originally announced February 2024.
-
The Danger Of Arrogance: Welfare Equilibra As A Solution To Stackelberg Self-Play In Non-Coincidental Games
Authors:
Jake Levi,
Chris Lu,
Timon Willi,
Christian Schroeder de Witt,
Jakob Foerster
Abstract:
The increasing prevalence of multi-agent learning systems in society necessitates understanding how to learn effective and safe policies in general-sum multi-agent environments against a variety of opponents, including self-play. General-sum learning is difficult because of non-stationary opponents and misaligned incentives. Our first main contribution is to show that many recent approaches to general-sum learning can be derived as approximations to Stackelberg strategies, which suggests a framework for developing new multi-agent learning algorithms. We then define non-coincidental games as games in which the Stackelberg strategy profile is not a Nash Equilibrium. This notably includes several canonical matrix games and provides a normative theory for why existing algorithms fail in self-play in such games. We address this problem by introducing Welfare Equilibria (WE) as a generalisation of Stackelberg Strategies, which can recover desirable Nash Equilibria even in non-coincidental games. Finally, we introduce Welfare Function Search (WelFuSe) as a practical approach to finding desirable WE against unknown opponents, which finds more mutually desirable solutions in self-play, while preserving performance against naive learning opponents.
Submitted 27 March, 2024; v1 submitted 1 February, 2024;
originally announced February 2024.
-
A foundation model for atomistic materials chemistry
Authors:
Ilyes Batatia,
Philipp Benner,
Yuan Chiang,
Alin M. Elena,
Dávid P. Kovács,
Janosh Riebesell,
Xavier R. Advincula,
Mark Asta,
Matthew Avaylon,
William J. Baldwin,
Fabian Berger,
Noam Bernstein,
Arghya Bhowmik,
Samuel M. Blau,
Vlad Cărare,
James P. Darby,
Sandip De,
Flaviano Della Pia,
Volker L. Deringer,
Rokas Elijošius,
Zakariya El-Machachi,
Fabio Falcioni,
Edvin Fako,
Andrea C. Ferrari,
Annalena Genreith-Schriever,
et al. (51 additional authors not shown)
Abstract:
Machine-learned force fields have transformed the atomistic modelling of materials by enabling simulations of ab initio quality on unprecedented time and length scales. However, they are currently limited by: (i) the significant computational and human effort that must go into development and validation of potentials for each particular system of interest; and (ii) a general lack of transferability from one chemical system to the next. Here, using the state-of-the-art MACE architecture we introduce a single general-purpose ML model, trained on a public database of 150k inorganic crystals, that is capable of running stable molecular dynamics on molecules and materials. We demonstrate the power of the MACE-MP-0 model - and its qualitative and at times quantitative accuracy - on a diverse set of problems in the physical sciences, including the properties of solids, liquids, gases, chemical reactions, interfaces and even the dynamics of a small protein. The model can be applied out of the box and as a starting or "foundation model" for any atomistic system of interest and is thus a step towards democratising the revolution of ML force fields by lowering the barriers to entry.
Submitted 1 March, 2024; v1 submitted 29 December, 2023;
originally announced January 2024.
-
MACE-OFF23: Transferable Machine Learning Force Fields for Organic Molecules
Authors:
Dávid Péter Kovács,
J. Harry Moore,
Nicholas J. Browning,
Ilyes Batatia,
Joshua T. Horton,
Venkat Kapil,
William C. Witt,
Ioan-Bogdan Magdău,
Daniel J. Cole,
Gábor Csányi
Abstract:
Classical empirical force fields have dominated biomolecular simulation for over 50 years. Although widely used in drug discovery, crystal structure prediction, and biomolecular dynamics, they generally lack the accuracy and transferability required for predictive modelling. In this paper, we introduce MACE-OFF23, a transferable force field for organic molecules created using state-of-the-art machine learning technology and first-principles reference data computed with a high level of quantum mechanical theory. MACE-OFF23 demonstrates the remarkable capabilities of local, short-range models by accurately predicting a wide variety of gas and condensed phase properties of molecular systems. It produces accurate, easy-to-converge dihedral torsion scans of unseen molecules, as well as reliable descriptions of molecular crystals and liquids, including quantum nuclear effects. We further demonstrate the capabilities of MACE-OFF23 by determining free energy surfaces in explicit solvent, as well as the folding dynamics of peptides. Finally, we simulate a fully solvated small protein, observing accurate secondary structure and vibrational spectrum. These developments enable first-principles simulations of molecular systems for the broader chemistry community at high accuracy and low computational cost.
Submitted 29 December, 2023; v1 submitted 23 December, 2023;
originally announced December 2023.
-
Reliable Identification of Binary Supermassive Black Holes from Rubin Observatory Time-Domain Monitoring
Authors:
Megan C. Davis,
Kaylee E. Grace,
Jonathan R. Trump,
Jessie C. Runnoe,
Amelia Henkel,
Laura Blecha,
W. N. Brandt,
J. Andrew Casey-Clyde,
Maria Charisi,
Caitlin Witt
Abstract:
Periodic signatures in time-domain observations of quasars have been used to search for binary supermassive black holes. These searches, across existing time-domain surveys, have produced several hundred candidates. The general stochastic variability of quasars, however, can masquerade as a false-positive periodic signal, especially when monitoring cadence and duration are limited. In this work, we predict the detectability of binary supermassive black holes in the upcoming Rubin Observatory Legacy Survey of Space and Time (LSST). We apply computationally inexpensive sinusoidal curve fits to millions of simulated LSST Deep Drilling Field light curves of both single, isolated quasars and binary quasars. Period and phase of simulated binary signals can generally be disentangled from quasar variability. Binary amplitude is overestimated and poorly recovered for two-thirds of potential binaries due to quasar accretion variability. Quasars with strong intrinsic variability can obscure a binary signal too much for recovery. We also find that the most luminous quasars mimic current binary candidate light curves and their properties: false positive rates are 60\% for these quasars. The reliable recovery of binary period and phase for a wide range of input binary LSST light curves is promising for multi-messenger characterization of binary supermassive black holes. However, pure electromagnetic detections of binaries using photometric periodicity with amplitude greater than 0.1 magnitude will result in samples that are overwhelmed by false positives. This paper represents an important and computationally inexpensive way forward for understanding the true and false positive rates for binary candidates identified by Rubin.
Submitted 17 November, 2023;
originally announced November 2023.
-
JaxMARL: Multi-Agent RL Environments in JAX
Authors:
Alexander Rutherford,
Benjamin Ellis,
Matteo Gallici,
Jonathan Cook,
Andrei Lupu,
Gardar Ingvarsson,
Timon Willi,
Akbir Khan,
Christian Schroeder de Witt,
Alexandra Souly,
Saptarashmi Bandyopadhyay,
Mikayel Samvelyan,
Minqi Jiang,
Robert Tjarko Lange,
Shimon Whiteson,
Bruno Lacerda,
Nick Hawes,
Tim Rocktaschel,
Chris Lu,
Jakob Nicolaus Foerster
Abstract:
Benchmarks play an important role in the development of machine learning algorithms. For example, research in reinforcement learning (RL) has been heavily influenced by available environments and benchmarks. However, RL environments are traditionally run on the CPU, limiting their scalability with typical academic compute. Recent advancements in JAX have enabled the wider use of hardware acceleration to overcome these computational hurdles, enabling massively parallel RL training pipelines and environments. This is particularly useful for multi-agent reinforcement learning (MARL) research. First of all, multiple agents must be considered at each environment step, adding computational burden, and secondly, the sample complexity is increased due to non-stationarity, decentralised partial observability, or other MARL challenges. In this paper, we present JaxMARL, the first open-source code base that combines ease-of-use with GPU enabled efficiency, and supports a large number of commonly used MARL environments as well as popular baseline algorithms. When considering wall clock time, our experiments show that per-run our JAX-based training pipeline is up to 12500x faster than existing approaches. This enables efficient and thorough evaluations, with the potential to alleviate the evaluation crisis of the field. We also introduce and benchmark SMAX, a vectorised, simplified version of the popular StarCraft Multi-Agent Challenge, which removes the need to run the StarCraft II game engine. This not only enables GPU acceleration, but also provides a more flexible MARL environment, unlocking the potential for self-play, meta-learning, and other future applications in MARL. We provide code at https://github.com/flairox/jaxmarl.
Submitted 19 December, 2023; v1 submitted 16 November, 2023;
originally announced November 2023.
-
The NANOGrav 15-year data set: Search for Transverse Polarization Modes in the Gravitational-Wave Background
Authors:
Gabriella Agazie,
Akash Anumarlapudi,
Anne M. Archibald,
Zaven Arzoumanian,
Jeremy Baier,
Paul T. Baker,
Bence Bécsy,
Laura Blecha,
Adam Brazier,
Paul R. Brook,
Sarah Burke-Spolaor,
Rand Burnette,
Robin Case,
J. Andrew Casey-Clyde,
Maria Charisi,
Shami Chatterjee,
Tyler Cohen,
James M. Cordes,
Neil J. Cornish,
Fronefield Crawford,
H. Thankful Cromartie,
Kathryn Crowter,
Megan E. DeCesar,
Dallas DeGan,
Paul B. Demorest
, et al. (74 additional authors not shown)
Abstract:
Recently we found compelling evidence for a gravitational wave background with Hellings and Downs (HD) correlations in our 15-year data set. These correlations describe gravitational waves as predicted by general relativity, which has two transverse polarization modes. However, more general metric theories of gravity can have additional polarization modes which produce different interpulsar correlations. In this work we search the NANOGrav 15-year data set for evidence of a gravitational wave background with quadrupolar Hellings and Downs (HD) and Scalar Transverse (ST) correlations. We find that HD correlations are the best fit to the data, and no significant evidence in favor of ST correlations. While Bayes factors show strong evidence for a correlated signal, the data does not strongly prefer either correlation signature, with Bayes factors $\sim 2$ when comparing HD to ST correlations, and $\sim 1$ for HD plus ST correlations to HD correlations alone. However, when modeled alongside HD correlations, the amplitude and spectral index posteriors for ST correlations are uninformative, with the HD process accounting for the vast majority of the total signal. Using the optimal statistic, a frequentist technique that focuses on the pulsar-pair cross-correlations, we find median signal-to-noise-ratios of 5.0 for HD and 4.6 for ST correlations when fit for separately, and median signal-to-noise-ratios of 3.5 for HD and 3.0 for ST correlations when fit for simultaneously. While the signal-to-noise-ratios for each of the correlations are comparable, the estimated amplitude and spectral index for HD are a significantly better fit to the total signal, in agreement with our Bayesian analysis.
Submitted 18 October, 2023;
originally announced October 2023.
-
The NANOGrav 12.5-year data set: A computationally efficient eccentric binary search pipeline and constraints on an eccentric supermassive binary candidate in 3C 66B
Authors:
Gabriella Agazie,
Zaven Arzoumanian,
Paul T. Baker,
Bence Bécsy,
Laura Blecha,
Harsha Blumer,
Adam Brazier,
Paul R. Brook,
Sarah Burke-Spolaor,
J. Andrew Casey-Clyde,
Maria Charisi,
Shami Chatterjee,
Belinda D. Cheeseboro,
Tyler Cohen,
James M. Cordes,
Neil J. Cornish,
Fronefield Crawford,
H. Thankful Cromartie,
Megan E. DeCesar,
Paul B. Demorest,
Lankeswar Dey,
Timothy Dolch,
Justin A. Ellis,
Robert D. Ferdman,
Elizabeth C. Ferrara
, et al. (63 additional authors not shown)
Abstract:
The radio galaxy 3C 66B has been hypothesized to host a supermassive black hole binary (SMBHB) at its center based on electromagnetic observations. Its apparent 1.05-year period and low redshift ($\sim0.02$) make it an interesting testbed to search for low-frequency gravitational waves (GWs) using Pulsar Timing Array (PTA) experiments. This source has been subjected to multiple searches for continuous GWs from a circular SMBHB, resulting in progressively more stringent constraints on its GW amplitude and chirp mass. In this paper, we develop a pipeline for performing Bayesian targeted searches for eccentric SMBHBs in PTA data sets, and test its efficacy by applying it on simulated data sets with varying injected signal strengths. We also search for a realistic eccentric SMBHB source in 3C 66B using the NANOGrav 12.5-year data set employing PTA signal models containing Earth term-only as well as Earth+Pulsar term contributions using this pipeline. Due to limitations in our PTA signal model, we get meaningful results only when the initial eccentricity $e_0<0.5$ and the symmetric mass ratio $η>0.1$. We find no evidence for an eccentric SMBHB signal in our data, and therefore place 95% upper limits on the PTA signal amplitude of $88.1\pm3.7$ ns for the Earth term-only and $81.74\pm0.86$ ns for the Earth+Pulsar term searches for $e_0<0.5$ and $η>0.1$. Similar 95% upper limits on the chirp mass are $(1.98 \pm 0.05) \times 10^9\,M_{\odot}$ and $(1.81 \pm 0.01) \times 10^9\,M_{\odot}$. These upper limits, while less stringent than those calculated from a circular binary search in the NANOGrav 12.5-year data set, are consistent with the SMBHB model of 3C 66B developed from electromagnetic observations.
Submitted 15 January, 2024; v1 submitted 29 September, 2023;
originally announced September 2023.
-
How to Detect an Astrophysical Nanohertz Gravitational-Wave Background
Authors:
Bence Bécsy,
Neil J. Cornish,
Patrick M. Meyers,
Luke Zoltan Kelley,
Gabriella Agazie,
Akash Anumarlapudi,
Anne M. Archibald,
Zaven Arzoumanian,
Paul T. Baker,
Laura Blecha,
Adam Brazier,
Paul R. Brook,
Sarah Burke-Spolaor,
J. Andrew Casey-Clyde,
Maria Charisi,
Shami Chatterjee,
Katerina Chatziioannou,
Tyler Cohen,
James M. Cordes,
Fronefield Crawford,
H. Thankful Cromartie,
Kathryn Crowter,
Megan E. DeCesar,
Paul B. Demorest,
Timothy Dolch
, et al. (71 additional authors not shown)
Abstract:
Analysis of pulsar timing data has provided evidence for a stochastic gravitational wave background in the nHz frequency band. The most plausible source of such a background is the superposition of signals from millions of supermassive black hole binaries. The standard statistical techniques used to search for such a background and assess its significance make several simplifying assumptions, namely: i) Gaussianity; ii) isotropy; and most often iii) a power-law spectrum. However, a stochastic background from a finite collection of binaries does not exactly satisfy any of these assumptions. To understand the effect of these assumptions, we test standard analysis techniques on a large collection of realistic simulated datasets. The dataset length, observing schedule, and noise levels were chosen to emulate the NANOGrav 15-year dataset. Simulated signals from millions of binaries drawn from models based on the Illustris cosmological hydrodynamical simulation were added to the data. We find that the standard statistical methods perform remarkably well on these simulated datasets, despite their fundamental assumptions not being strictly met. They are able to achieve a confident detection of the background. However, even for a fixed set of astrophysical parameters, different realizations of the universe result in a large variance in the significance and recovered parameters of the background. We also find that the presence of loud individual binaries can bias the spectral recovery of the background if we do not account for them.
Submitted 1 December, 2023; v1 submitted 8 September, 2023;
originally announced September 2023.
-
ACEpotentials.jl: A Julia Implementation of the Atomic Cluster Expansion
Authors:
William C. Witt,
Cas van der Oord,
Elena Gelžinytė,
Teemu Järvinen,
Andres Ross,
James P. Darby,
Cheuk Hin Ho,
William J. Baldwin,
Matthias Sachs,
James Kermode,
Noam Bernstein,
Gábor Csányi,
Christoph Ortner
Abstract:
We introduce ACEpotentials.jl, a Julia-language software package that constructs interatomic potentials from quantum mechanical reference data using the Atomic Cluster Expansion (Drautz, 2019). As the latter provides a complete description of atomic environments, including invariance to overall translation and rotation as well as permutation of like atoms, the resulting potentials are systematically improvable and data efficient. Furthermore, the descriptor's expressiveness enables use of a linear model, facilitating rapid evaluation and straightforward application of Bayesian techniques for active learning. We summarize the capabilities of ACEpotentials.jl and demonstrate its strengths (simplicity, interpretability, robustness, performance) on a selection of prototypical atomistic modelling workflows.
Submitted 7 September, 2023; v1 submitted 6 September, 2023;
originally announced September 2023.
-
Comparing recent PTA results on the nanohertz stochastic gravitational wave background
Authors:
The International Pulsar Timing Array Collaboration,
G. Agazie,
J. Antoniadis,
A. Anumarlapudi,
A. M. Archibald,
P. Arumugam,
S. Arumugam,
Z. Arzoumanian,
J. Askew,
S. Babak,
M. Bagchi,
M. Bailes,
A. -S. Bak Nielsen,
P. T. Baker,
C. G. Bassa,
A. Bathula,
B. Bécsy,
A. Berthereau,
N. D. R. Bhat,
L. Blecha,
M. Bonetti,
E. Bortolas,
A. Brazier,
P. R. Brook,
M. Burgay
, et al. (220 additional authors not shown)
Abstract:
The Australian, Chinese, European, Indian, and North American pulsar timing array (PTA) collaborations recently reported, at varying levels, evidence for the presence of a nanohertz gravitational wave background (GWB). Given that each PTA made different choices in modeling their data, we perform a comparison of the GWB and individual pulsar noise parameters across the results reported from the PTAs that constitute the International Pulsar Timing Array (IPTA). We show that despite making different modeling choices, there is no significant difference in the GWB parameters that are measured by the different PTAs, agreeing within $1σ$. The pulsar noise parameters are also consistent between different PTAs for the majority of the pulsars included in these analyses. We bridge the differences in modeling choices by adopting a standardized noise model for all pulsars and PTAs, finding that under this model there is a reduction in the tension in the pulsar noise parameters. As part of this reanalysis, we "extended" each PTA's data set by adding extra pulsars that were not timed by that PTA. Under these extensions, we find better constraints on the GWB amplitude and a higher signal-to-noise ratio for the Hellings and Downs correlations. These extensions serve as a prelude to the benefits offered by a full combination of data across all pulsars in the IPTA, i.e., the IPTA's Data Release 3, which will involve not just adding in additional pulsars, but also including data from all three PTAs where any given pulsar is timed by more than a single PTA.
Submitted 1 September, 2023;
originally announced September 2023.
-
Bayesian Exploration Networks
Authors:
Mattie Fellows,
Brandon Kaplowitz,
Christian Schroeder de Witt,
Shimon Whiteson
Abstract:
Bayesian reinforcement learning (RL) offers a principled and elegant approach for sequential decision making under uncertainty. Most notably, Bayesian agents do not face an exploration/exploitation dilemma, a major pathology of frequentist methods. However, theoretical understanding of model-free approaches is lacking. In this paper, we introduce a novel Bayesian model-free formulation and the first analysis showing that model-free approaches can yield Bayes-optimal policies. We show all existing model-free approaches make approximations that yield policies that can be arbitrarily Bayes-suboptimal. As a first step towards model-free Bayes optimality, we introduce the Bayesian exploration network (BEN) which uses normalising flows to model both the aleatoric uncertainty (via density estimation) and epistemic uncertainty (via variational inference) in the Bellman operator. In the limit of complete optimisation, BEN learns true Bayes-optimal policies, but, as in variational expectation-maximisation, partial optimisation renders our approach tractable. Empirical results demonstrate that BEN can learn true Bayes-optimal policies in tasks where existing model-free approaches fail.
Submitted 25 June, 2024; v1 submitted 24 August, 2023;
originally announced August 2023.
-
The NANOGrav 12.5-year Data Set: Search for Gravitational Wave Memory
Authors:
Gabriella Agazie,
Zaven Arzoumanian,
Paul T. Baker,
Bence Bécsy,
Laura Blecha,
Harsha Blumer,
Adam Brazier,
Paul R. Brook,
Sarah Burke-Spolaor,
Rand Burnette,
Robin Case,
J. Andrew Casey-Clyde,
Maria Charisi,
Shami Chatterjee,
Tyler Cohen,
James M. Cordes,
Neil J. Cornish,
Fronefield Crawford,
H. Thankful Cromartie,
Megan E. DeCesar,
Dallas DeGan,
Paul B. Demorest,
Timothy Dolch,
Brendan Drachler,
Justin A. Ellis
, et al. (65 additional authors not shown)
Abstract:
We present the results of a Bayesian search for gravitational wave (GW) memory in the NANOGrav 12.5-yr data set. We find no convincing evidence for any gravitational wave memory signals in this data set (Bayes factor = 2.8). As such, we go on to place upper limits on the strain amplitude of GW memory events as a function of sky location and event epoch. These upper limits are computed using a signal model that assumes the existence of a common, spatially uncorrelated red noise in addition to a GW memory signal. The median strain upper limit as a function of sky position is approximately $3.3 \times 10^{-14}$. We also find that there are some differences in the upper limits as a function of sky position centered around PSR J0613$-$0200. This suggests that this pulsar has some excess noise which can be confounded with GW memory. Finally, the upper limits as a function of burst epoch continue to improve at later epochs. This improvement is attributable to the continued growth of the pulsar timing array.
Submitted 25 July, 2023;
originally announced July 2023.
-
First Steps Towards a Runtime Analysis of Neuroevolution
Authors:
Paul Fischer,
Emil Lundt Larsen,
Carsten Witt
Abstract:
We consider a simple setting in neuroevolution where an evolutionary algorithm optimizes the weights and activation functions of a simple artificial neural network. We then define simple example functions to be learned by the network and conduct rigorous runtime analyses for networks with a single neuron and for a more advanced structure with several neurons and two layers. Our results show that the proposed algorithm is generally efficient on two example problems designed for one neuron and efficient with at least constant probability on the example problem for a two-layer network. In particular, the so-called harmonic mutation operator, which chooses steps of size $j$ with probability proportional to $1/j$, turns out to be a good choice for the underlying search space. However, for the case of one neuron, we also identify situations with hard-to-overcome local optima. Experimental investigations of our neuroevolutionary algorithm and a state-of-the-art CMA-ES support the theoretical findings.
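As a rough illustration of the step-size distribution named in this abstract, the following Python sketch draws a step size $j$ with probability proportional to $1/j$. The abstract does not state the range of step sizes, the sign of a step, or how a step is applied to a weight, so the bound `max_step` and both function names are hypothetical choices made for this sketch, not the authors' implementation.

```python
import random


def harmonic_probabilities(max_step):
    """Return P(j) for j = 1..max_step, with P(j) proportional to 1/j.

    `max_step` is an assumed upper bound on the step size; the abstract
    does not specify one.
    """
    weights = [1.0 / j for j in range(1, max_step + 1)]
    total = sum(weights)  # equals the max_step-th harmonic number
    return [w / total for w in weights]


def sample_harmonic_step(max_step, rng=random):
    """Draw one step size j from 1..max_step with P(j) proportional to 1/j."""
    steps = list(range(1, max_step + 1))
    weights = [1.0 / j for j in steps]
    # random.choices normalises the weights internally.
    return rng.choices(steps, weights=weights, k=1)[0]


if __name__ == "__main__":
    rng = random.Random(0)
    print(harmonic_probabilities(4))
    print([sample_harmonic_step(4, rng) for _ in range(10)])
```

Under this distribution a step of size 1 is twice as likely as a step of size 2, yet large steps keep non-negligible probability, which is the qualitative property that makes heavy-tailed operators of this kind useful for escaping plateaus.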
Submitted 16 October, 2023; v1 submitted 3 July, 2023;
originally announced July 2023.
-
The NANOGrav 15-year Gravitational-Wave Background Analysis Pipeline
Authors:
Aaron D. Johnson,
Patrick M. Meyers,
Paul T. Baker,
Neil J. Cornish,
Jeffrey S. Hazboun,
Tyson B. Littenberg,
Joseph D. Romano,
Stephen R. Taylor,
Michele Vallisneri,
Sarah J. Vigeland,
Ken D. Olum,
Xavier Siemens,
Justin A. Ellis,
Rutger van Haasteren,
Sophie Hourihane,
Gabriella Agazie,
Akash Anumarlapudi,
Anne M. Archibald,
Zaven Arzoumanian,
Laura Blecha,
Adam Brazier,
Paul R. Brook,
Sarah Burke-Spolaor,
Bence Bécsy,
J. Andrew Casey-Clyde
, et al. (71 additional authors not shown)
Abstract:
This paper presents rigorous tests of pulsar timing array methods and software, examining their consistency across a wide range of injected parameters and signal strength. We discuss updates to the 15-year isotropic gravitational-wave background analyses and their corresponding code representations. Descriptions of the internal structure of the flagship algorithms \texttt{Enterprise} and \texttt{PTMCMCSampler} are given to facilitate understanding of the PTA likelihood structure, how models are built, and what methods are currently used in sampling the high-dimensional PTA parameter space. We introduce a novel version of the PTA likelihood that uses a two-step marginalization procedure that performs much faster when the white noise parameters remain fixed. We perform stringent tests of consistency and correctness of the Bayesian and frequentist analysis software. For the Bayesian analysis, we test prior recovery, injection recovery, and Bayes factors. For the frequentist analysis, we test that the cross-correlation-based optimal statistic, when modified to account for a non-negligible gravitational-wave background, accurately recovers the amplitude of the background. We also summarize recent advances and tests performed on the optimal statistic in the literature from both GWB detection and parameter estimation perspectives. The tests presented here validate current and future analyses of PTA data.
Submitted 7 July, 2023; v1 submitted 28 June, 2023;
originally announced June 2023.
-
The NANOGrav 15-year Data Set: Bayesian Limits on Gravitational Waves from Individual Supermassive Black Hole Binaries
Authors:
Gabriella Agazie,
Akash Anumarlapudi,
Anne M. Archibald,
Zaven Arzoumanian,
Paul T. Baker,
Bence Bécsy,
Laura Blecha,
Adam Brazier,
Paul R. Brook,
Sarah Burke-Spolaor,
Robin Case,
J. Andrew Casey-Clyde,
Maria Charisi,
Shami Chatterjee,
Tyler Cohen,
James M. Cordes,
Neil Cornish,
Fronefield Crawford,
H. Thankful Cromartie,
Kathryn Crowter,
Megan DeCesar,
Paul B. Demorest,
Matthew C. Digman,
Timothy Dolch,
Brendan Drachler
, et al. (74 additional authors not shown)
Abstract:
Evidence for a low-frequency stochastic gravitational wave background has recently been reported based on analyses of pulsar timing array data. The most likely source of such a background is a population of supermassive black hole binaries, the loudest of which may be individually detected in these datasets. Here we present the search for individual supermassive black hole binaries in the NANOGrav 15-year dataset. We introduce several new techniques, which enhance the efficiency and modeling accuracy of the analysis. The search uncovered weak evidence for two candidate signals, one with a gravitational-wave frequency of $\sim$4 nHz, and another at $\sim$170 nHz. The significance of the low-frequency candidate was greatly diminished when Hellings-Downs correlations were included in the background model. The high-frequency candidate was discounted due to the lack of a plausible host galaxy, the unlikely astrophysical prior odds of finding such a source, and since most of its support comes from a single pulsar with a commensurate binary period. Finding no compelling evidence for signals from individual binary systems, we place upper limits on the strain amplitude of gravitational waves emitted by such systems.
Submitted 28 June, 2023;
originally announced June 2023.
-
The NANOGrav 15-year Data Set: Search for Anisotropy in the Gravitational-Wave Background
Authors:
Gabriella Agazie,
Akash Anumarlapudi,
Anne M. Archibald,
Zaven Arzoumanian,
Paul T. Baker,
Bence Bécsy,
Laura Blecha,
Adam Brazier,
Paul R. Brook,
Sarah Burke-Spolaor,
J. Andrew Casey-Clyde,
Maria Charisi,
Shami Chatterjee,
Tyler Cohen,
James M. Cordes,
Neil J. Cornish,
Fronefield Crawford,
H. Thankful Cromartie,
Kathryn Crowter,
Megan E. DeCesar,
Paul B. Demorest,
Timothy Dolch,
Brendan Drachler,
Elizabeth C. Ferrara,
William Fiore
, et al. (68 additional authors not shown)
Abstract:
The North American Nanohertz Observatory for Gravitational Waves (NANOGrav) has reported evidence for the presence of an isotropic nanohertz gravitational wave background (GWB) in its 15 yr dataset. However, if the GWB is produced by a population of inspiraling supermassive black hole binary (SMBHB) systems, then the background is predicted to be anisotropic, depending on the distribution of these systems in the local Universe and the statistical properties of the SMBHB population. In this work, we search for anisotropy in the GWB using multiple methods and bases to describe the distribution of the GWB power on the sky. We do not find significant evidence of anisotropy, and place a Bayesian $95\%$ upper limit on the level of broadband anisotropy such that $(C_{l>0} / C_{l=0}) < 20\%$. We also derive conservative estimates on the anisotropy expected from a random distribution of SMBHB systems using astrophysical simulations conditioned on the isotropic GWB inferred in the 15-yr dataset, and show that this dataset has sufficient sensitivity to probe a large fraction of the predicted level of anisotropy. We end by highlighting the opportunities and challenges in searching for anisotropy in pulsar timing array data.
Submitted 28 June, 2023;
originally announced June 2023.
-
The NANOGrav 15-year Data Set: Constraints on Supermassive Black Hole Binaries from the Gravitational Wave Background
Authors:
Gabriella Agazie,
Akash Anumarlapudi,
Anne M. Archibald,
Paul T. Baker,
Bence Bécsy,
Laura Blecha,
Alexander Bonilla,
Adam Brazier,
Paul R. Brook,
Sarah Burke-Spolaor,
Rand Burnette,
Robin Case,
J. Andrew Casey-Clyde,
Maria Charisi,
Shami Chatterjee,
Katerina Chatziioannou,
Belinda D. Cheeseboro,
Siyuan Chen,
Tyler Cohen,
James M. Cordes,
Neil J. Cornish,
Fronefield Crawford,
H. Thankful Cromartie,
Kathryn Crowter,
Curt J. Cutler
, et al. (89 additional authors not shown)
Abstract:
The NANOGrav 15-year data set shows evidence for the presence of a low-frequency gravitational-wave background (GWB). While many physical processes can source such low-frequency gravitational waves, here we analyze the signal as coming from a population of supermassive black hole (SMBH) binaries distributed throughout the Universe. We show that astrophysically motivated models of SMBH binary populations are able to reproduce both the amplitude and shape of the observed low-frequency gravitational-wave spectrum. While multiple model variations are able to reproduce the GWB spectrum at our current measurement precision, our results highlight the importance of accurately modeling binary evolution for producing realistic GWB spectra. Additionally, while reasonable parameters are able to reproduce the 15-year observations, the implied GWB amplitude necessitates either a large number of parameters to be at the edges of expected values, or a small number of parameters to be notably different from standard expectations. While we are not yet able to definitively establish the origin of the inferred GWB signal, the consistency of the signal with astrophysical expectations offers a tantalizing prospect for confirming that SMBH binaries are able to form, reach sub-parsec separations, and eventually coalesce. As the significance grows over time, higher-order features of the GWB spectrum will definitively determine the nature of the GWB and allow for novel constraints on SMBH populations.
Submitted 18 July, 2023; v1 submitted 28 June, 2023;
originally announced June 2023.
-
The NANOGrav 15-year Data Set: Search for Signals from New Physics
Authors:
Adeela Afzal,
Gabriella Agazie,
Akash Anumarlapudi,
Anne M. Archibald,
Zaven Arzoumanian,
Paul T. Baker,
Bence Bécsy,
Jose Juan Blanco-Pillado,
Laura Blecha,
Kimberly K. Boddy,
Adam Brazier,
Paul R. Brook,
Sarah Burke-Spolaor,
Rand Burnette,
Robin Case,
Maria Charisi,
Shami Chatterjee,
Katerina Chatziioannou,
Belinda D. Cheeseboro,
Siyuan Chen,
Tyler Cohen,
James M. Cordes,
Neil J. Cornish,
Fronefield Crawford,
H. Thankful Cromartie
, et al. (98 additional authors not shown)
Abstract:
The 15-year pulsar timing data set collected by the North American Nanohertz Observatory for Gravitational Waves (NANOGrav) shows positive evidence for the presence of a low-frequency gravitational-wave (GW) background. In this paper, we investigate potential cosmological interpretations of this signal, specifically cosmic inflation, scalar-induced GWs, first-order phase transitions, cosmic strings, and domain walls. We find that, with the exception of stable cosmic strings of field theory origin, all these models can reproduce the observed signal. When compared to the standard interpretation in terms of inspiraling supermassive black hole binaries (SMBHBs), many cosmological models seem to provide a better fit resulting in Bayes factors in the range from 10 to 100. However, these results strongly depend on modeling assumptions about the cosmic SMBHB population and, at this stage, should not be regarded as evidence for new physics. Furthermore, we identify excluded parameter regions where the predicted GW signal from cosmological sources significantly exceeds the NANOGrav signal. These parameter constraints are independent of the origin of the NANOGrav signal and illustrate how pulsar timing data provide a new way to constrain the parameter space of these models. Finally, we search for deterministic signals produced by models of ultralight dark matter (ULDM) and dark matter substructures in the Milky Way. We find no evidence for either of these signals and thus report updated constraints on these models. In the case of ULDM, these constraints outperform torsion balance and atomic clock constraints for ULDM coupled to electrons, muons, or gluons.
Submitted 28 June, 2023;
originally announced June 2023.
-
The NANOGrav 15-Year Data Set: Detector Characterization and Noise Budget
Authors:
Gabriella Agazie,
Akash Anumarlapudi,
Anne M. Archibald,
Zaven Arzoumanian,
Paul T. Baker,
Bence Bécsy,
Laura Blecha,
Adam Brazier,
Paul R. Brook,
Sarah Burke-Spolaor,
Maria Charisi,
Shami Chatterjee,
Tyler Cohen,
James M. Cordes,
Neil J. Cornish,
Fronefield Crawford,
H. Thankful Cromartie,
Kathryn Crowter,
Megan E. Decesar,
Paul B. Demorest,
Timothy Dolch,
Brendan Drachler,
Elizabeth C. Ferrara,
William Fiore,
Emmanuel Fonseca
, et al. (66 additional authors not shown)
Abstract:
Pulsar timing arrays (PTAs) are galactic-scale gravitational wave detectors. Each individual arm, composed of a millisecond pulsar, a radio telescope, and a kiloparsecs-long path, differs in its properties but, in aggregate, can be used to extract low-frequency gravitational wave (GW) signals. We present a noise and sensitivity analysis to accompany the NANOGrav 15-year data release and associated papers, along with an in-depth introduction to PTA noise models. As a first step in our analysis, we characterize each individual pulsar data set with three types of white noise parameters and two red noise parameters. These parameters, along with the timing model and, particularly, a piecewise-constant model for the time-variable dispersion measure, determine the sensitivity curve over the low-frequency GW band we are searching. We tabulate information for all of the pulsars in this data release and present some representative sensitivity curves. We then combine the individual pulsar sensitivities using a signal-to-noise-ratio statistic to calculate the global sensitivity of the PTA to a stochastic background of GWs, obtaining a minimum noise characteristic strain of $7\times 10^{-15}$ at 5 nHz. A power law-integrated analysis shows rough agreement with the amplitudes recovered in NANOGrav's 15-year GW background analysis. While our phenomenological noise model does not model all known physical effects explicitly, it provides an accurate characterization of the noise in the data while preserving sensitivity to multiple classes of GW signals.
Submitted 28 June, 2023;
originally announced June 2023.
-
The NANOGrav 15-year Data Set: Observations and Timing of 68 Millisecond Pulsars
Authors:
Gabriella Agazie,
Md Faisal Alam,
Akash Anumarlapudi,
Anne M. Archibald,
Zaven Arzoumanian,
Paul T. Baker,
Laura Blecha,
Victoria Bonidie,
Adam Brazier,
Paul R. Brook,
Sarah Burke-Spolaor,
Bence Bécsy,
Christopher Chapman,
Maria Charisi,
Shami Chatterjee,
Tyler Cohen,
James M. Cordes,
Neil J. Cornish,
Fronefield Crawford,
H. Thankful Cromartie,
Kathryn Crowter,
Megan E. DeCesar,
Paul B. Demorest,
Timothy Dolch,
Brendan Drachler
, et al. (75 additional authors not shown)
Abstract:
We present observations and timing analyses of 68 millisecond pulsars (MSPs) comprising the 15-year data set of the North American Nanohertz Observatory for Gravitational Waves (NANOGrav). NANOGrav is a pulsar timing array (PTA) experiment that is sensitive to low-frequency gravitational waves. This is NANOGrav's fifth public data release, including both "narrowband" and "wideband" time-of-arrival (TOA) measurements and corresponding pulsar timing models. We have added 21 MSPs and extended our timing baselines by three years, now spanning nearly 16 years for some of our sources. The data were collected using the Arecibo Observatory, the Green Bank Telescope, and the Very Large Array between frequencies of 327 MHz and 3 GHz, with most sources observed approximately monthly. A number of notable methodological and procedural changes were made compared to our previous data sets. These improve the overall quality of the TOA data set and are part of the transition to new pulsar timing and PTA analysis software packages. For the first time, our data products are accompanied by a full suite of software to reproduce data reduction, analysis, and results. Our timing models include a variety of newly detected astrometric and binary pulsar parameters, including several significant improvements to pulsar mass constraints. We find that the time series of 23 pulsars contain detectable levels of red noise, 10 of which are new measurements. In this data set, we find evidence for a stochastic gravitational-wave background.
Submitted 28 June, 2023;
originally announced June 2023.
-
The NANOGrav 15-year Data Set: Evidence for a Gravitational-Wave Background
Authors:
Gabriella Agazie,
Akash Anumarlapudi,
Anne M. Archibald,
Zaven Arzoumanian,
Paul T. Baker,
Bence Becsy,
Laura Blecha,
Adam Brazier,
Paul R. Brook,
Sarah Burke-Spolaor,
Rand Burnette,
Robin Case,
Maria Charisi,
Shami Chatterjee,
Katerina Chatziioannou,
Belinda D. Cheeseboro,
Siyuan Chen,
Tyler Cohen,
James M. Cordes,
Neil J. Cornish,
Fronefield Crawford,
H. Thankful Cromartie,
Kathryn Crowter,
Curt J. Cutler,
Megan E. DeCesar
, et al. (89 additional authors not shown)
Abstract:
We report multiple lines of evidence for a stochastic signal that is correlated among 67 pulsars from the 15-year pulsar-timing data set collected by the North American Nanohertz Observatory for Gravitational Waves. The correlations follow the Hellings-Downs pattern expected for a stochastic gravitational-wave background. The presence of such a gravitational-wave background with a power-law-spectrum is favored over a model with only independent pulsar noises with a Bayes factor in excess of $10^{14}$, and this same model is favored over an uncorrelated common power-law-spectrum model with Bayes factors of 200-1000, depending on spectral modeling choices. We have built a statistical background distribution for these latter Bayes factors using a method that removes inter-pulsar correlations from our data set, finding $p = 10^{-3}$ (approx. $3σ$) for the observed Bayes factors in the null no-correlation scenario. A frequentist test statistic built directly as a weighted sum of inter-pulsar correlations yields $p = 5 \times 10^{-5} - 1.9 \times 10^{-4}$ (approx. $3.5 - 4σ$). Assuming a fiducial $f^{-2/3}$ characteristic-strain spectrum, as appropriate for an ensemble of binary supermassive black-hole inspirals, the strain amplitude is $2.4^{+0.7}_{-0.6} \times 10^{-15}$ (median + 90% credible interval) at a reference frequency of 1/(1 yr). The inferred gravitational-wave background amplitude and spectrum are consistent with astrophysical expectations for a signal from a population of supermassive black-hole binaries, although more exotic cosmological and astrophysical sources cannot be excluded. The observation of Hellings-Downs correlations points to the gravitational-wave origin of this signal.
Submitted 28 June, 2023;
originally announced June 2023.
-
Developments and Further Applications of Ephemeral Data Derived Potentials
Authors:
Pascal T. Salzbrenner,
Se Hun Joo,
Lewis J. Conway,
Peter I. C. Cooke,
Bonan Zhu,
Milosz P. Matraszek,
William C. Witt,
Chris J. Pickard
Abstract:
Machine-learned interatomic potentials are fast becoming an indispensable tool in computational materials science. One approach is the ephemeral data-derived potential (EDDP), which was designed to accelerate atomistic structure prediction. The EDDP is simple and cost-efficient. It relies on training data generated in small unit cells and is fit using a lightweight neural network, leading to smooth interactions which exhibit the robust transferability essential for structure prediction. Here, we present a variety of applications of EDDPs, enabled by recent developments of the open-source EDDP software. New features include interfaces to phonon and molecular dynamics codes, as well as deployment of the ensemble deviation for estimating the confidence in EDDP predictions. Through case studies ranging from elemental carbon and lead to the binary scandium hydride and the ternary zinc cyanide, we demonstrate that EDDPs can be trained to cover wide ranges of pressures and stoichiometries, and used to evaluate phonons, phase diagrams, superionicity, and thermal expansion. These developments complement continued success in accelerated structure prediction.
Submitted 2 October, 2023; v1 submitted 10 June, 2023;
originally announced June 2023.
-
Fast Pareto Optimization Using Sliding Window Selection
Authors:
Frank Neumann,
Carsten Witt
Abstract:
Pareto optimization using evolutionary multi-objective algorithms has been widely applied to solve constrained submodular optimization problems. A crucial factor determining the runtime of the used evolutionary algorithms to obtain good approximations is the population size of the algorithms which grows with the number of trade-offs that the algorithms encounter. In this paper, we introduce a sliding window speed up technique for recently introduced algorithms. We prove that our technique eliminates the population size as a crucial factor negatively impacting the runtime and achieves the same theoretical performance guarantees as previous approaches within less computation time. Our experimental investigations for the classical maximum coverage problem confirm that our sliding window technique clearly leads to better results for a wide range of instances and constraint settings.
Submitted 11 May, 2023;
originally announced May 2023.
-
How Well Does the Metropolis Algorithm Cope With Local Optima?
Authors:
Benjamin Doerr,
Taha El Ghazi El Houssaini,
Amirhossein Rajabi,
Carsten Witt
Abstract:
The Metropolis algorithm (MA) is a classic stochastic local search heuristic. It avoids getting stuck in local optima by occasionally accepting inferior solutions. To better and in a rigorous manner understand this ability, we conduct a mathematical runtime analysis of the MA on the CLIFF benchmark. Apart from one local optimum, cliff functions are monotonically increasing towards the global optimum. Consequently, to optimize a cliff function, the MA only once needs to accept an inferior solution. Despite seemingly being an ideal benchmark for the MA to profit from its main working principle, our mathematical runtime analysis shows that this hope does not come true. Even with the optimal temperature (the only parameter of the MA), the MA optimizes most cliff functions less efficiently than simple elitist evolutionary algorithms (EAs), which can only leave the local optimum by generating a superior solution possibly far away. This result suggests that our understanding of why the MA is often very successful in practice is not yet complete. Our work also suggests to equip the MA with global mutation operators, an idea supported by our preliminary experiments.
Submitted 15 May, 2023; v1 submitted 21 April, 2023;
originally announced April 2023.
-
Atoms, dimers, and nanoparticles from orbital-free density-potential functional theory
Authors:
Martin-Isbjörn Trappe,
William C. Witt,
Sergei Manzhos
Abstract:
Density-potential functional theory (DPFT) is an alternative formulation of orbital-free density functional theory that may be suitable for modeling the electronic structure of large systems. To date, DPFT has been applied mainly to quantum gases in one- and two-dimensional settings. In this work, we study the performance of DPFT when applied to real-life systems: atoms, dimers, and nanoparticles. We build on systematic Suzuki-Trotter factorizations of the quantum-mechanical propagator and on the Wigner function formalism, respectively, to derive nonlocal as well as semilocal functional approximations in complete analogy to their well-established lower-dimensional versions -- without resorting to system-specific approximations or ad-hoc measures of any kind. The cost for computing the associated semiclassical ground-state single-particle density scales (quasi-)linearly with particle number. We illustrate that the developed density formulae become relatively more accurate for larger particle numbers, can be improved systematically, are quite universally applicable, and, hence, may offer alternatives to existing orbital-free methods for mesoscopic quantum systems.
Submitted 19 April, 2023;
originally announced April 2023.
-
3-Objective Pareto Optimization for Problems with Chance Constraints
Authors:
Frank Neumann,
Carsten Witt
Abstract:
Evolutionary multi-objective algorithms have successfully been used in the context of Pareto optimization where a given constraint is relaxed into an additional objective. In this paper, we explore the use of 3-objective formulations for problems with chance constraints. Our formulation trades off the expected cost and variance of the stochastic component as well as the given deterministic constraint. We point out benefits that this 3-objective formulation has compared to a bi-objective one recently investigated for chance constraints with Normally distributed stochastic components. Our analysis shows that the 3-objective formulation allows to compute all required trade-offs using 1-bit flips only, when dealing with a deterministic cardinality constraint. Furthermore, we carry out experimental investigations for the chance constrained dominating set problem and show the benefit for this classical NP-hard problem.
Submitted 18 April, 2023;
originally announced April 2023.
-
Efficient large-scale, targeted gravitational-wave probes of supermassive black-hole binaries
Authors:
Maria Charisi,
Stephen R. Taylor,
Caitlin A. Witt,
Jessie Runnoe
Abstract:
Supermassive black hole binaries are promising sources of low-frequency gravitational waves (GWs) and bright electromagnetic emission. Pulsar timing array searches for resolved binaries are complex and computationally expensive and so far limited to only a few sources. We present an efficient approximation that empowers large-scale targeted multi-messenger searches by neglecting GW signal components from the pulsar term. This Earth-term approximation provides similar constraints on the total mass and GW frequency of the binary, yet is $>100$ times more efficient.
Submitted 7 April, 2023;
originally announced April 2023.
-
Searching for continuous Gravitational Waves in the second data release of the International Pulsar Timing Array
Authors:
M. Falxa,
S. Babak,
P. T. Baker,
B. Bécsy,
A. Chalumeau,
S. Chen,
Z. Chen,
N. J. Cornish,
L. Guillemot,
J. S. Hazboun,
C. M. F. Mingarelli,
A. Parthasarathy,
A. Petiteau,
N. S. Pol,
A. Sesana,
S. B. Spolaor,
S. R. Taylor,
G. Theureau,
M. Vallisneri,
S. J. Vigeland,
C. A. Witt,
X. Zhu,
J. Antoniadis,
Z. Arzoumanian,
M. Bailes
, et al. (102 additional authors not shown)
Abstract:
The International Pulsar Timing Array 2nd data release is the combination of datasets from worldwide collaborations. In this study, we search for continuous waves: gravitational wave signals produced by individual supermassive black hole binaries in the local universe. We consider binaries on circular orbits and neglect the evolution of orbital frequency over the observational span. We find no evidence for such signals and set sky-averaged 95% upper limits on their amplitude $h_{95}$. The most sensitive frequency is 10 nHz with $h_{95} = 9.1 \times 10^{-15}$. We achieved the best upper limit to date at low and high frequencies of the PTA band thanks to improved effective cadence of observations. In our analysis, we have taken into account the recently discovered common red noise process, which has an impact at low frequencies. We also find that the peculiar noise features present in some pulsars' data must be taken into account to reduce the false alarm. We show that using custom noise models is essential in searching for continuous gravitational wave signals and setting the upper limit.
Submitted 19 March, 2023;
originally announced March 2023.
-
Cheap Talk Discovery and Utilization in Multi-Agent Reinforcement Learning
Authors:
Yat Long Lo,
Christian Schroeder de Witt,
Samuel Sokota,
Jakob Nicolaus Foerster,
Shimon Whiteson
Abstract:
By enabling agents to communicate, recent cooperative multi-agent reinforcement learning (MARL) methods have demonstrated better task performance and more coordinated behavior. Most existing approaches facilitate inter-agent communication by allowing agents to send messages to each other through free communication channels, i.e., cheap talk channels. Current methods require these channels to be constantly accessible and known to the agents a priori. In this work, we lift these requirements such that the agents must discover the cheap talk channels and learn how to use them. Hence, the problem has two main parts: cheap talk discovery (CTD) and cheap talk utilization (CTU). We introduce a novel conceptual framework for both parts and develop a new algorithm based on mutual information maximization that outperforms existing algorithms in CTD/CTU settings. We also release a novel benchmark suite to stimulate future research in CTD/CTU.
Submitted 19 March, 2023;
originally announced March 2023.
-
Vibrational and thermal properties of amorphous alumina from first principles
Authors:
Angela F. Harper,
Kamil Iwanowski,
William C. Witt,
Mike C. Payne,
Michele Simoncelli
Abstract:
Amorphous alumina is employed ubiquitously as a high-dielectric-constant material in electronics, and its thermal-transport properties are of key relevance for heat management in electronic chips and devices. Experiments show that the thermal conductivity of alumina depends significantly on the synthesis process, indicating the need for a theoretical study to elucidate the atomistic origin of these variations. Here we employ first-principles simulations to characterize the atomistic structure, vibrational properties, and thermal conductivity of alumina at densities ranging from 2.28 g/cm$^3$ to 3.49 g/cm$^3$. Moreover, using an interatomic potential trained on first-principles data, we investigate how system size affects predictions of the thermal conductivity, showing that simulations containing 120 atoms can already reproduce the bulk limit of the conductivity. Finally, relying on the recently developed Wigner formulation of thermal transport, we shed light on the interplay between atomistic topological disorder and anharmonicity in the context of heat conduction, showing that the former dominates over the latter in determining the conductivity of alumina.
Submitted 24 December, 2023; v1 submitted 15 March, 2023;
originally announced March 2023.
-
The NANOGrav 12.5-year Data Set: Bayesian Limits on Gravitational Waves from Individual Supermassive Black Hole Binaries
Authors:
Zaven Arzoumanian,
Paul T. Baker,
Laura Blecha,
Harsha Blumer,
Adam Brazier,
Paul R. Brook,
Sarah Burke-Spolaor,
Bence Bécsy,
J. Andrew Casey-Clyde,
Maria Charisi,
Shami Chatterjee,
Siyuan Chen,
James M. Cordes,
Neil J. Cornish,
Fronefield Crawford,
H. Thankful Cromartie,
Megan E. DeCesar,
Paul B. Demorest,
Timothy Dolch,
Brendan Drachler,
Justin A. Ellis,
E. C. Ferrara,
William Fiore,
Emmanuel Fonseca,
Gabriel E. Freedman
, et al. (53 additional authors not shown)
Abstract:
Pulsar timing array collaborations, such as the North American Nanohertz Observatory for Gravitational Waves (NANOGrav), are seeking to detect nanohertz gravitational waves emitted by supermassive black hole binaries formed in the aftermath of galaxy mergers. We have searched for continuous waves from individual circular supermassive black hole binaries using the NANOGrav's recent 12.5-year data set. We created new methods to accurately model the uncertainties on pulsar distances in our analysis, and we implemented new techniques to account for a common red noise process in pulsar timing array data sets while searching for deterministic gravitational wave signals, including continuous waves. As we found no evidence for continuous waves in our data, we placed 95\% upper limits on the strain amplitude of continuous waves emitted by these sources. At our most sensitive frequency of 7.65 nanohertz, we placed a sky-averaged limit of $h_0 < $ $(6.82 \pm 0.35) \times 10^{-15}$, and $h_0 <$ $(2.66 \pm 0.15) \times 10^{-15}$ in our most sensitive sky location. Finally, we placed a multi-messenger limit of $\mathcal{M} <$ $(1.41 \pm 0.02) \times 10^9 M_\odot$ on the chirp mass of the supermassive black hole binary candidate 3C~66B.
Submitted 6 June, 2023; v1 submitted 9 January, 2023;
originally announced January 2023.
-
Automatic Differentiation for Orbital-Free Density Functional Theory
Authors:
Chuin Wei Tan,
Chris J. Pickard,
William C. Witt
Abstract:
Differentiable programming has facilitated numerous methodological advances in scientific computing. Physics engines supporting automatic differentiation have simpler code, accelerating the development process and reducing the maintenance burden. Furthermore, fully-differentiable simulation tools enable direct evaluation of challenging derivatives - including those directly related to properties measurable by experiment - that are conventionally computed with finite difference methods. Here, we investigate automatic differentiation in the context of orbital-free density functional theory (OFDFT) simulations of materials, introducing PROFESS-AD. Its automatic evaluation of properties derived from first derivatives, including functional potentials, forces, and stresses, facilitates the development and testing of new density functionals, while its direct evaluation of properties requiring higher-order derivatives, such as bulk moduli, elastic constants, and force constants, offers more concise implementations compared to conventional finite difference methods. For these reasons, PROFESS-AD serves as an excellent prototyping tool and provides new opportunities for OFDFT.
Submitted 2 April, 2023; v1 submitted 6 December, 2022;
originally announced December 2022.
-
Revealing Robust Oil and Gas Company Macro-Strategies using Deep Multi-Agent Reinforcement Learning
Authors:
Dylan Radovic,
Lucas Kruitwagen,
Christian Schroeder de Witt,
Ben Caldecott,
Shane Tomlinson,
Mark Workman
Abstract:
The energy transition potentially poses an existential risk for major international oil companies (IOCs) if they fail to adapt to low-carbon business models. Projections of energy futures, however, are met with diverging assumptions on its scale and pace, causing disagreement among IOC decision-makers and their stakeholders over what the business model of an incumbent fossil fuel company should be. In this work, we used deep multi-agent reinforcement learning to solve an energy systems wargame wherein players simulate IOC decision-making, including hydrocarbon and low-carbon investments decisions, dividend policies, and capital structure measures, through an uncertain energy transition to explore critical and non-linear governance questions, from leveraged transitions to reserve replacements. Adversarial play facilitated by state-of-the-art algorithms revealed decision-making strategies robust to energy transition uncertainty and against multiple IOCs. In all games, robust strategies emerged in the form of low-carbon business models as a result of early transition-oriented movement. IOCs adopting such strategies outperformed business-as-usual and delayed transition strategies regardless of hydrocarbon demand projections. In addition to maximizing value, these strategies benefit greater society by contributing substantial amounts of capital necessary to accelerate the global low-carbon energy transition. Our findings point towards the need for lenders and investors to effectively mobilize transition-oriented finance and engage with IOCs to ensure responsible reallocation of capital towards low-carbon business models that would enable the emergence of fossil fuel incumbents as future low-carbon leaders.
Submitted 20 November, 2022;
originally announced November 2022.
-
Perfectly Secure Steganography Using Minimum Entropy Coupling
Authors:
Christian Schroeder de Witt,
Samuel Sokota,
J. Zico Kolter,
Jakob Foerster,
Martin Strohmeier
Abstract:
Steganography is the practice of encoding secret information into innocuous content in such a manner that an adversarial third party would not realize that there is hidden meaning. While this problem has classically been studied in security literature, recent advances in generative models have led to a shared interest among security and machine learning researchers in developing scalable steganography techniques. In this work, we show that a steganography procedure is perfectly secure under Cachin (1998)'s information-theoretic model of steganography if and only if it is induced by a coupling. Furthermore, we show that, among perfectly secure procedures, a procedure maximizes information throughput if and only if it is induced by a minimum entropy coupling. These insights yield what are, to the best of our knowledge, the first steganography algorithms to achieve perfect security guarantees for arbitrary covertext distributions. To provide empirical validation, we compare a minimum entropy coupling-based approach to three modern baselines -- arithmetic coding, Meteor, and adaptive dynamic grouping -- using GPT-2, WaveRNN, and Image Transformer as communication channels. We find that the minimum entropy coupling-based approach achieves superior encoding efficiency, despite its stronger security constraints. In aggregate, these results suggest that it may be natural to view information-theoretic steganography through the lens of minimum entropy coupling.
Submitted 30 October, 2023; v1 submitted 24 October, 2022;
originally announced October 2022.
-
An unusual pulse shape change event in PSR J1713+0747 observed with the Green Bank Telescope and CHIME
Authors:
Ross J. Jennings,
James M. Cordes,
Shami Chatterjee,
Maura A. McLaughlin,
Paul B. Demorest,
Zaven Arzoumanian,
Paul T. Baker,
Harsha Blumer,
Paul R. Brook,
Tyler Cohen,
Fronefield Crawford,
H. Thankful Cromartie,
Megan E. DeCesar,
Timothy Dolch,
Elizabeth C. Ferrara,
Emmanuel Fonseca,
Deborah C. Good,
Jeffrey S. Hazboun,
Megan L. Jones,
David L. Kaplan,
Michael T. Lam,
T. Joseph W. Lazio,
Duncan R. Lorimer,
Jing Luo,
Ryan S. Lynch
, et al. (19 additional authors not shown)
Abstract:
The millisecond pulsar J1713+0747 underwent a sudden and significant pulse shape change between April 16 and 17, 2021 (MJDs 59320 and 59321). Subsequently, the pulse shape gradually recovered over the course of several months. We report the results of continued multi-frequency radio observations of the pulsar made using the Canadian Hydrogen Intensity Mapping Experiment (CHIME) and the 100-meter Green Bank Telescope (GBT) in a three-year period encompassing the shape change event, between February 2020 and February 2023. As of February 2023, the pulse shape had returned to a state similar to that seen before the event, but with measurable changes remaining. The amplitude of the shape change and the accompanying TOA residuals display a strong non-monotonic dependence on radio frequency, demonstrating that the event is neither a glitch (the effects of which should be independent of radio frequency, $ν$) nor a change in dispersion measure (DM) alone (which would produce a delay proportional to $ν^{-2}$). However, it does bear some resemblance to the two previous "chromatic timing events" observed in J1713+0747 (Demorest et al. 2013; Lam et al. 2016), as well as to a similar event observed in PSR J1643-1224 in 2015 (Shannon et al. 2016).
Submitted 31 January, 2024; v1 submitted 21 October, 2022;
originally announced October 2022.
-
Equivariant Networks for Zero-Shot Coordination
Authors:
Darius Muglich,
Christian Schroeder de Witt,
Elise van der Pol,
Shimon Whiteson,
Jakob Foerster
Abstract:
Successful coordination in Dec-POMDPs requires agents to adopt robust strategies and interpretable styles of play for their partner. A common failure mode is symmetry breaking, when agents arbitrarily converge on one out of many equivalent but mutually incompatible policies. Commonly these examples include partial observability, e.g. waving your right hand vs. left hand to convey a covert message. In this paper, we present a novel equivariant network architecture for use in Dec-POMDPs that effectively leverages environmental symmetry for improving zero-shot coordination, doing so more effectively than prior methods. Our method also acts as a ``coordination-improvement operator'' for generic, pre-trained policies, and thus may be applied at test-time in conjunction with any self-play algorithm. We provide theoretical guarantees of our work and test on the AI benchmark task of Hanabi, where we demonstrate our methods outperforming other symmetry-aware baselines in zero-shot coordination, as well as able to improve the coordination ability of a variety of pre-trained policies. In particular, we show our method can be used to improve on the state of the art for zero-shot coordination on the Hanabi benchmark.
Submitted 10 April, 2024; v1 submitted 21 October, 2022;
originally announced October 2022.
-
Discovered Policy Optimisation
Authors:
Chris Lu,
Jakub Grudzien Kuba,
Alistair Letcher,
Luke Metz,
Christian Schroeder de Witt,
Jakob Foerster
Abstract:
Tremendous progress has been made in reinforcement learning (RL) over the past decade. Most of these advancements came through the continual development of new algorithms, which were designed using a combination of mathematical derivations, intuitions, and experimentation. Such an approach of creating algorithms manually is limited by human understanding and ingenuity. In contrast, meta-learning provides a toolkit for automatic machine learning method optimisation, potentially addressing this flaw. However, black-box approaches which attempt to discover RL algorithms with minimal prior structure have thus far not outperformed existing hand-crafted algorithms. Mirror Learning, which includes RL algorithms, such as PPO, offers a potential middle-ground starting point: while every method in this framework comes with theoretical guarantees, components that differentiate them are subject to design. In this paper we explore the Mirror Learning space by meta-learning a "drift" function. We refer to the immediate result as Learnt Policy Optimisation (LPO). By analysing LPO we gain original insights into policy optimisation which we use to formulate a novel, closed-form RL algorithm, Discovered Policy Optimisation (DPO). Our experiments in Brax environments confirm state-of-the-art performance of LPO and DPO, as well as their transfer to unseen settings.
Submitted 12 October, 2022; v1 submitted 11 October, 2022;
originally announced October 2022.
-
Runtime Analysis of the (1+1) EA on Weighted Sums of Transformed Linear Functions
Authors:
Frank Neumann,
Carsten Witt
Abstract:
Linear functions play a key role in the runtime analysis of evolutionary algorithms and studies have provided a wide range of new insights and techniques for analyzing evolutionary computation methods. Motivated by studies on separable functions and the optimization behaviour of evolutionary algorithms as well as objective functions from the area of chance constrained optimization, we study the class of objective functions that are weighted sums of two transformed linear functions. Our results show that the (1+1) EA, with a mutation rate depending on the number of overlapping bits of the functions, obtains an optimal solution for these functions in expected time O(n log n), thereby generalizing a well-known result for linear functions to a much wider range of problems.
Submitted 11 August, 2022;
originally announced August 2022.
-
Disentangling Multiple Stochastic Gravitational Wave Background Sources in PTA Datasets
Authors:
Andrew R. Kaiser,
Nihan S. Pol,
Maura A. McLaughlin,
Siyuan Chen,
Jeffrey S. Hazboun,
Luke Zoltan Kelley,
Joseph Simon,
Stephen R. Taylor,
Sarah J. Vigeland,
Caitlin A. Witt
Abstract:
With strong evidence of a common-spectrum stochastic process in the most recent datasets from the NANOGrav Collaboration, the European Pulsar Timing Array (PTA), Parkes PTA, and the International PTA, it is crucial to assess the effects of the several astrophysical and cosmological sources that could contribute to the stochastic gravitational wave background (GWB). Using the same dataset creation and injection techniques as in Pol et al. (2021), we assess the separability of multiple GWBs by creating single and multiple GWB source datasets. We search for these injected sources using Bayesian PTA analysis techniques to assess recovery and separability of multiple astrophysical and cosmological backgrounds. For a GWB due to supermassive black hole binaries and an underlying weaker background due to primordial gravitational waves with a GW energy density ratio of $Ω_{\mathrm{PGW}}/Ω_{\mathrm{SMBHB}} = 0.5$, the Bayes' factor for a second process exceeds unity at 17 years, and increases with additional data. At 20 years of data, we are able to constrain the spectral index and amplitude of the weaker GWB at this density ratio to a fractional uncertainty of 64% and 110%, respectively, using current PTA methods and techniques. Using these methods and findings, we outline a basic protocol to search for multiple backgrounds in future PTA datasets.
Submitted 5 August, 2022; v1 submitted 3 August, 2022;
originally announced August 2022.
-
Illusory Attacks: Information-Theoretic Detectability Matters in Adversarial Attacks
Authors:
Tim Franzmeyer,
Stephen McAleer,
João F. Henriques,
Jakob N. Foerster,
Philip H. S. Torr,
Adel Bibi,
Christian Schroeder de Witt
Abstract:
Autonomous agents deployed in the real world need to be robust against adversarial attacks on sensory inputs. Robustifying agent policies requires anticipating the strongest attacks possible. We demonstrate that existing observation-space attacks on reinforcement learning agents have a common weakness: while effective, their lack of information-theoretic detectability constraints makes them detectable using automated means or human inspection. Detectability is undesirable to adversaries as it may trigger security escalations. We introduce ε-illusory, a novel form of adversarial attack on sequential decision-makers that is both effective and of ε-bounded statistical detectability. We propose a novel dual ascent algorithm to learn such attacks end-to-end. Compared to existing attacks, we empirically find ε-illusory to be significantly harder to detect with automated methods, and a small study with human participants (IRB approval under reference R84123/RE001) suggests they are similarly harder to detect for humans. Our findings suggest the need for better anomaly detectors, as well as effective hardware- and system-level defenses. The project website can be found at https://tinyurl.com/illusory-attacks.
Submitted 6 May, 2024; v1 submitted 20 July, 2022;
originally announced July 2022.