-
Massive IIA flux compactifications with dynamical open strings
Authors:
Juan Ramón Balaguer,
Valentina Bevilacqua,
Giuseppe Dibitetto,
Jose J. Fernández-Melgarejo,
Giuseppe Sudano
Abstract:
We consider massive type IIA compactifications down to 4 dimensions in presence of O6 planes and D6 branes parallel to them, in order to preserve half-maximal supersymmetry in 4D. The dynamics of open strings living on the spacetime filling branes is taken into account, in the gauged supergravity description, by adding extra vector multiplets and embedding tensor components. The scalar potential gets new terms that can be matched with contributions coming from dimensional reduction of the non-Abelian DBI and WZ brane actions. In this setting, we analyze the vacuum structure of the theory and find novel AdS$_4$ vacua, both supersymmetric and non-supersymmetric ones. Furthermore, we address their perturbative stability by computing their mass spectra. Some of the vacua are found to be perturbatively stable, despite their being non-supersymmetric. We conclude by discussing the reliability of our setup in terms of higher-derivative corrections.
Submitted 21 June, 2024;
originally announced June 2024.
-
Using deep reinforcement learning to promote sustainable human behaviour on a common pool resource problem
Authors:
Raphael Koster,
Miruna Pîslar,
Andrea Tacchetti,
Jan Balaguer,
Leqi Liu,
Romuald Elie,
Oliver P. Hauser,
Karl Tuyls,
Matt Botvinick,
Christopher Summerfield
Abstract:
A canonical social dilemma arises when finite resources are allocated to a group of people, who can choose to either reciprocate with interest, or keep the proceeds for themselves. What resource allocation mechanisms will encourage levels of reciprocation that sustain the commons? Here, in an iterated multiplayer trust game, we use deep reinforcement learning (RL) to design an allocation mechanism that endogenously promotes sustainable contributions from human participants to a common pool resource. We first trained neural networks to behave like human players, creating a simulated economy that allowed us to study how different mechanisms influenced the dynamics of receipt and reciprocation. We then used RL to train a social planner to maximise aggregate return to players. The social planner discovered a redistributive policy that led to a large surplus and an inclusive economy, in which players made roughly equal gains. The RL agent increased human surplus over baseline mechanisms based on unrestricted welfare or conditional cooperation, by conditioning its generosity on available resources and temporarily sanctioning defectors by allocating fewer resources to them. Examining the AI policy allowed us to develop an explainable mechanism that performed similarly and was more popular among players. Deep reinforcement learning can be used to discover mechanisms that promote sustainable human behaviour.
Submitted 23 April, 2024;
originally announced April 2024.
-
Gemini: A Family of Highly Capable Multimodal Models
Authors:
Gemini Team,
Rohan Anil,
Sebastian Borgeaud,
Jean-Baptiste Alayrac,
Jiahui Yu,
Radu Soricut,
Johan Schalkwyk,
Andrew M. Dai,
Anja Hauth,
Katie Millican,
David Silver,
Melvin Johnson,
Ioannis Antonoglou,
Julian Schrittwieser,
Amelia Glaese,
Jilin Chen,
Emily Pitler,
Timothy Lillicrap,
Angeliki Lazaridou,
Orhan Firat,
James Molloy,
Michael Isard,
Paul R. Barham,
Tom Hennigan,
Benjamin Lee,
et al. (1325 additional authors not shown)
Abstract:
This report introduces a new family of multimodal models, Gemini, that exhibit remarkable capabilities across image, audio, video, and text understanding. The Gemini family consists of Ultra, Pro, and Nano sizes, suitable for applications ranging from complex reasoning tasks to on-device memory-constrained use-cases. Evaluation on a broad range of benchmarks shows that our most-capable Gemini Ultra model advances the state of the art in 30 of 32 of these benchmarks - notably being the first model to achieve human-expert performance on the well-studied exam benchmark MMLU, and improving the state of the art in every one of the 20 multimodal benchmarks we examined. We believe that the new capabilities of the Gemini family in cross-modal reasoning and language understanding will enable a wide variety of use cases. We discuss our approach toward post-training and deploying Gemini models responsibly to users through services including Gemini, Gemini Advanced, Google AI Studio, and Cloud Vertex AI.
Submitted 17 June, 2024; v1 submitted 18 December, 2023;
originally announced December 2023.
-
Open Strings in IIB Orientifold Reductions
Authors:
Juan R. Balaguer,
Giuseppe Dibitetto,
Jose J. Fernandez-Melgarejo,
Alejandro Ruiperez
Abstract:
We consider type IIB compactifications on a general 4D group manifold with different types of possible spacetime filling O-planes and the corresponding D-branes parallel to them. Once fluxes allowed by the associated orientifold projection are included, a 6D $\mathcal{N}=(1,1)$ gauged supergravity is obtained. In this paper we show how the consistent coupling to dynamical open strings living on the spacetime filling D-branes may be captured by the inclusion of extra vector multiplets and extra embedding tensor deformations on the gauged supergravity side. As a result, the quadratic constraints on the embedding tensor consistently reproduce the source corrected 10D Bianchi identities. Furthermore, the field strength modifications induced by the open string sector could potentially be understood as U-dual versions of the Green-Schwarz terms. Finally, the entire scalar potential of the theory exactly matches the one obtained from reduction of the bulk action plus the source contributions.
Submitted 3 March, 2023;
originally announced March 2023.
-
Fine-tuning language models to find agreement among humans with diverse preferences
Authors:
Michiel A. Bakker,
Martin J. Chadwick,
Hannah R. Sheahan,
Michael Henry Tessler,
Lucy Campbell-Gillingham,
Jan Balaguer,
Nat McAleese,
Amelia Glaese,
John Aslanides,
Matthew M. Botvinick,
Christopher Summerfield
Abstract:
Recent work in large language modeling (LLMs) has used fine-tuning to align outputs with the preferences of a prototypical user. This work assumes that human preferences are static and homogeneous across individuals, so that aligning to a single "generic" user will confer more general alignment. Here, we embrace the heterogeneity of human preferences to consider a different challenge: how might a machine help people with diverse views find agreement? We fine-tune a 70 billion parameter LLM to generate statements that maximize the expected approval for a group of people with potentially diverse opinions. Human participants provide written opinions on thousands of questions touching on moral and political issues (e.g., "should we raise taxes on the rich?"), and rate the LLM's generated candidate consensus statements for agreement and quality. A reward model is then trained to predict individual preferences, enabling it to quantify and rank consensus statements in terms of their appeal to the overall group, defined according to different aggregation (social welfare) functions. The model produces consensus statements that are preferred by human users over those from prompted LLMs (>70%) and significantly outperforms a tight fine-tuned baseline that lacks the final ranking step. Further, our best model's consensus statements are preferred over the best human-generated opinions (>65%). We find that when we silently constructed consensus statements from only a subset of group members, those who were excluded were more likely to dissent, revealing the sensitivity of the consensus to individual contributions. These results highlight the potential to use LLMs to help groups of humans align their values with one another.
Submitted 27 November, 2022;
originally announced November 2022.
-
The Good Shepherd: An Oracle Agent for Mechanism Design
Authors:
Jan Balaguer,
Raphael Koster,
Christopher Summerfield,
Andrea Tacchetti
Abstract:
From social networks to traffic routing, artificial learning agents are playing a central role in modern institutions. We must therefore understand how to leverage these systems to foster outcomes and behaviors that align with our own values and aspirations. While multiagent learning has received considerable attention in recent years, artificial agents have been primarily evaluated when interacting with fixed, non-learning co-players. While this evaluation scheme has merit, it fails to capture the dynamics faced by institutions that must deal with adaptive and continually learning constituents. Here we address this limitation, and construct agents ("mechanisms") that perform well when evaluated over the learning trajectory of their adaptive co-players ("participants"). The algorithm we propose consists of two nested learning loops: an inner loop where participants learn to best respond to fixed mechanisms; and an outer loop where the mechanism agent updates its policy based on experience. We report the performance of our mechanism agents when paired with both artificial learning agents and humans as co-players. Our results show that our mechanisms are able to shepherd the participants' strategies towards favorable outcomes, indicating a path for modern institutions to effectively and automatically influence the strategies and behaviors of their constituents.
Submitted 21 February, 2022;
originally announced February 2022.
-
HCMD-zero: Learning Value Aligned Mechanisms from Data
Authors:
Jan Balaguer,
Raphael Koster,
Ari Weinstein,
Lucy Campbell-Gillingham,
Christopher Summerfield,
Matthew Botvinick,
Andrea Tacchetti
Abstract:
Artificial learning agents are mediating a larger and larger number of interactions among humans, firms, and organizations, and the intersection between mechanism design and machine learning has been heavily investigated in recent years. However, mechanism design methods often make strong assumptions on how participants behave (e.g. rationality), on the kind of knowledge designers have access to a priori (e.g. access to strong baseline mechanisms), or on what the goal of the mechanism should be (e.g. total welfare). Here we introduce HCMD-zero, a general purpose method to construct mechanisms making none of these three assumptions. HCMD-zero learns to mediate interactions among participants and adjusts the mechanism parameters to make itself more likely to be preferred by participants. It does so by remaining engaged in an electoral contest with copies of itself, thereby accessing direct feedback from participants. We test our method on a stylized resource allocation game that highlights the tension between productivity, equality and the temptation to free ride. HCMD-zero produces a mechanism that is preferred by human participants over a strong baseline, it does so automatically, without requiring prior knowledge, and using human behavioral trajectories sparingly and effectively. Our analysis shows HCMD-zero consistently makes the mechanism policy more and more likely to be preferred by human participants over the course of training, and that it results in a mechanism with an interpretable and intuitive policy.
Submitted 20 May, 2022; v1 submitted 21 February, 2022;
originally announced February 2022.
-
Human-centered mechanism design with Democratic AI
Authors:
Raphael Koster,
Jan Balaguer,
Andrea Tacchetti,
Ari Weinstein,
Tina Zhu,
Oliver Hauser,
Duncan Williams,
Lucy Campbell-Gillingham,
Phoebe Thacker,
Matthew Botvinick,
Christopher Summerfield
Abstract:
Building artificial intelligence (AI) that aligns with human values is an unsolved problem. Here, we developed a human-in-the-loop research pipeline called Democratic AI, in which reinforcement learning is used to design a social mechanism that humans prefer by majority. A large group of humans played an online investment game that involved deciding whether to keep a monetary endowment or to share it with others for collective benefit. Shared revenue was returned to players under two different redistribution mechanisms, one designed by the AI and the other by humans. The AI discovered a mechanism that redressed initial wealth imbalance, sanctioned free riders, and successfully won the majority vote. By optimizing for human preferences, Democratic AI may be a promising method for value-aligned policy innovation.
Submitted 27 January, 2022;
originally announced January 2022.
-
PonderNet: Learning to Ponder
Authors:
Andrea Banino,
Jan Balaguer,
Charles Blundell
Abstract:
In standard neural networks the amount of computation used grows with the size of the inputs, but not with the complexity of the problem being learnt. To overcome this limitation we introduce PonderNet, a new algorithm that learns to adapt the amount of computation based on the complexity of the problem at hand. PonderNet learns end-to-end the number of computational steps to achieve an effective compromise between training prediction accuracy, computational cost and generalization. On a complex synthetic problem, PonderNet dramatically improves performance over previous adaptive computation methods and additionally succeeds at extrapolation tests where traditional neural networks fail. Also, our method matched the current state of the art results on a real world question and answering dataset, but using less compute. Finally, PonderNet reached state of the art results on a complex task designed to test the reasoning capabilities of neural networks.
Submitted 2 September, 2021; v1 submitted 12 July, 2021;
originally announced July 2021.
-
New IIB intersecting brane solutions yielding supersymmetric AdS$_3$ vacua
Authors:
Juan R. Balaguer,
Giuseppe Dibitetto,
Jose J. Fernandez-Melgarejo
Abstract:
We consider genuine type IIB string theory (supersymmetric) brane intersections that preserve $(1+1)$D Lorentz symmetry. We provide the full supergravity solutions in their analytic form and discuss their physical properties. The Ansatz for the spacetime dependence of the different brane warp factors goes beyond the harmonic superposition principle. By studying the associated near-horizon geometry, we construct interesting classes of AdS$_3$ vacua in type IIB and highlight their relation to the existing classifications in the literature. Finally, we discuss their holographic properties.
Submitted 8 April, 2021;
originally announced April 2021.