-
Training language models to follow instructions with human feedback
Authors:
Long Ouyang,
Jeff Wu,
Xu Jiang,
Diogo Almeida,
Carroll L. Wainwright,
Pamela Mishkin,
Chong Zhang,
Sandhini Agarwal,
Katarina Slama,
Alex Ray,
John Schulman,
Jacob Hilton,
Fraser Kelton,
Luke Miller,
Maddie Simens,
Amanda Askell,
Peter Welinder,
Paul Christiano,
Jan Leike,
Ryan Lowe
Abstract:
Making language models bigger does not inherently make them better at following a user's intent. For example, large language models can generate outputs that are untruthful, toxic, or simply not helpful to the user. In other words, these models are not aligned with their users. In this paper, we show an avenue for aligning language models with user intent on a wide range of tasks by fine-tuning with human feedback. Starting with a set of labeler-written prompts and prompts submitted through the OpenAI API, we collect a dataset of labeler demonstrations of the desired model behavior, which we use to fine-tune GPT-3 using supervised learning. We then collect a dataset of rankings of model outputs, which we use to further fine-tune this supervised model using reinforcement learning from human feedback. We call the resulting models InstructGPT. In human evaluations on our prompt distribution, outputs from the 1.3B parameter InstructGPT model are preferred to outputs from the 175B GPT-3, despite having 100x fewer parameters. Moreover, InstructGPT models show improvements in truthfulness and reductions in toxic output generation while having minimal performance regressions on public NLP datasets. Even though InstructGPT still makes simple mistakes, our results show that fine-tuning with human feedback is a promising direction for aligning language models with human intent.
Submitted 4 March, 2022;
originally announced March 2022.
-
SafeLife 1.0: Exploring Side Effects in Complex Environments
Authors:
Carroll L. Wainwright,
Peter Eckersley
Abstract:
We present SafeLife, a publicly available reinforcement learning environment that tests the safety of reinforcement learning agents. It contains complex, dynamic, tunable, procedurally generated levels with many opportunities for unsafe behavior. Agents are graded both on their ability to maximize their explicit reward and on their ability to operate safely without unnecessary side effects. We train agents to maximize rewards using proximal policy optimization and score them on a suite of benchmark levels. The resulting agents are performant but not safe -- they tend to cause large side effects in their environments -- but they form a baseline against which future safety research can be measured.
Submitted 26 February, 2021; v1 submitted 3 December, 2019;
originally announced December 2019.
-
Simulating the Universe(s) III: Observables for the full bubble collision spacetime
Authors:
Matthew C. Johnson,
Carroll L. Wainwright,
Anthony Aguirre,
Hiranya V. Peiris
Abstract:
This is the third paper in a series establishing a quantitative relation between inflationary scalar field potential landscapes and the relic perturbations left by the collision between bubbles produced during eternal inflation. We introduce a new method for computing cosmological observables from numerical relativity simulations of bubble collisions in one space and one time dimension. This method tiles comoving hypersurfaces with locally-perturbed Friedmann-Robertson-Walker coordinate patches. The method extends previous work, which was limited to the spacetime region just inside the future light cone of the collision, and allows us to explore the full bubble-collision spacetime. We validate our new methods against previous work, and present a full set of predictions for the comoving curvature perturbation and local negative spatial curvature produced by identical and non-identical bubble collisions, in single scalar field models of eternal inflation. In both collision types, there is a non-zero contribution to the spatial curvature and cosmic microwave background quadrupole. Some collisions between non-identical bubbles excite wall modes, giving extra structure to the predicted temperature anisotropies. We comment on the implications of our results for future observational searches. For non-identical bubble collisions, we also find that the surfaces of constant field can readjust in the presence of a collision to produce spatially infinite sections that become nearly homogeneous deep into the region affected by the collision. Contrary to previous assumptions, this is true even in the bubble into which the domain wall is accelerating.
Submitted 10 February, 2017; v1 submitted 14 August, 2015;
originally announced August 2015.
-
Singlet-Catalyzed Electroweak Phase Transitions and Precision Higgs Studies
Authors:
Stefano Profumo,
Michael J. Ramsey-Musolf,
Carroll L. Wainwright,
Peter Winslow
Abstract:
We update the phenomenology of gauge singlet extensions of the Standard Model scalar sector and their implications for the electroweak phase transition. Considering the introduction of one real scalar singlet to the scalar potential, we analyze present constraints on the potential parameters from Higgs coupling measurements at the Large Hadron Collider (LHC) and electroweak precision observables for the kinematic regime in which no new scalar decay modes arise. We then show how future precision measurements of Higgs boson signal strengths and Higgs self-coupling could probe the scalar potential parameter space associated with a strong first-order electroweak phase transition. We illustrate using benchmark precision for several future collider options, including the High Luminosity LHC (HL-LHC), the International Linear Collider (ILC), TLEP, China Electron Positron Collider (CEPC), and a 100 TeV proton-proton collider, such as the Very High Energy LHC (VHE-LHC) or the Super proton-proton Collider (SPPC). For the regions of parameter space leading to a strong first order electroweak phase transition, we find that there exists considerable potential for observable deviations from purely Standard Model Higgs properties at these prospective future colliders.
Submitted 4 September, 2014; v1 submitted 20 July, 2014;
originally announced July 2014.
-
Cosmological Phase Transitions and their Properties in the NMSSM
Authors:
Jonathan Kozaczuk,
Stefano Profumo,
Laurel Stephenson Haskins,
Carroll L. Wainwright
Abstract:
We study cosmological phase transitions in the Next-to-Minimal Supersymmetric Standard Model (NMSSM) in light of the Higgs discovery. We use an effective field theory approach to calculate the finite temperature effective potential, focusing on regions with significant tree-level contributions to the Higgs mass, a viable neutralino dark matter candidate, 1-2 TeV stops, and with the remaining particle spectrum compatible with current LHC searches and results. The phase transition structure in viable regions of parameter space exhibits a rich phenomenology, potentially giving rise to one- or two-step first-order phase transitions in the singlet and/or $SU(2)$ directions. We compute several parameters pertaining to the bubble wall profile, including the bubble wall width and $Δβ$ (the variation of the ratio in Higgs vacuum expectation values across the wall). These quantities can vary significantly across small regions of parameter space and can be promising for successful electroweak baryogenesis. We estimate the wall velocity microphysically, taking into account the various sources of friction acting on the expanding bubble wall. Ultra-relativistic solutions to the bubble wall equations of motion typically exist when the electroweak phase transition features substantial supercooling. For somewhat weaker transitions, the bubble wall instead tends to be sub-luminal and, in fact, likely sub-sonic, suggesting that successful electroweak baryogenesis may indeed occur in regions of the NMSSM compatible with the Higgs discovery.
Submitted 11 March, 2015; v1 submitted 15 July, 2014;
originally announced July 2014.
-
Simulating the universe(s) II: phenomenology of cosmic bubble collisions in full General Relativity
Authors:
Carroll L. Wainwright,
Matthew C. Johnson,
Anthony Aguirre,
Hiranya V. Peiris
Abstract:
Observing the relics of collisions between bubble universes would provide direct evidence for the existence of an eternally inflating Multiverse; the non-observation of such events can also provide important constraints on inflationary physics. Realizing these prospects requires quantitative predictions for observables from the properties of the possible scalar field Lagrangians underlying eternal inflation. Building on previous work, we establish this connection in detail. We perform a fully relativistic numerical study of the phenomenology of bubble collisions in models with a single scalar field, computing the comoving curvature perturbation produced in a wide variety of models. We also construct a set of analytic predictions, allowing us to identify the phenomenologically relevant properties of the scalar field Lagrangian. The agreement between the analytic predictions and numerics in the relevant regions is excellent, and allows us to generalize our results beyond the models we adopt for the numerical studies. Specifically, the signature is completely determined by the spatial profile of the colliding bubble just before the collision, and the de Sitter invariant distance between the bubble centers. The analytic and numerical results support a power-law fit with an index $1< κ\lesssim 2$. For collisions between identical bubbles, we establish a lower-bound on the observed amplitude of collisions that is set by the present energy density in curvature.
Submitted 10 July, 2014;
originally announced July 2014.
-
Simulating the universe(s): from cosmic bubble collisions to cosmological observables with numerical relativity
Authors:
Carroll L. Wainwright,
Matthew C. Johnson,
Hiranya V. Peiris,
Anthony Aguirre,
Luis Lehner,
Steven L. Liebling
Abstract:
The theory of eternal inflation in an inflaton potential with multiple vacua predicts that our universe is one of many bubble universes nucleating and growing inside an ever-expanding false vacuum. The collision of our bubble with another could provide an important observational signature to test this scenario. We develop and implement an algorithm for accurately computing the cosmological observables arising from bubble collisions directly from the Lagrangian of a single scalar field. We first simulate the collision spacetime by solving Einstein's equations, starting from nucleation and ending at reheating. Taking advantage of the collision's hyperbolic symmetry, simulations are performed with a 1+1-dimensional fully relativistic code that uses adaptive mesh refinement. We then calculate the comoving curvature perturbation in an open Friedmann-Robertson-Walker universe, which is used to determine the temperature anisotropies of the cosmic microwave background radiation. For a fiducial Lagrangian, the anisotropies are well described by a power law in the cosine of the angular distance from the center of the collision signature. For a given form of the Lagrangian, the resulting observational predictions are inherently statistical due to stochastic elements of the bubble nucleation process. Further uncertainties arise due to our imperfect knowledge about inflationary and pre-recombination physics. We characterize observational predictions by computing the probability distributions over four phenomenological parameters which capture these intrinsic and model uncertainties. This represents the first fully-relativistic set of predictions from an ensemble of scalar field models giving rise to eternal inflation, yielding significant differences from previous non-relativistic approximations. Thus, our results provide a basis for a rigorous confrontation of these theories with cosmological data.
Submitted 23 June, 2014; v1 submitted 4 December, 2013;
originally announced December 2013.
-
Holographic Fluctuations from Unitary de Sitter Invariant Field Theory
Authors:
Tom Banks,
Willy Fischler,
T. J. Torres,
Carroll L. Wainwright
Abstract:
We continue the study of inflationary fluctuations in Holographic Space Time models of inflation. We argue that the holographic theory of inflation provides a physical context for what is often called dS/CFT. The holographic theory is a quantum theory which, in the limit of a large number of e-foldings, gives rise to a field theory on $S^3$, which is the representation space for a unitary representation of SO(1,4). This is not a conventional CFT, and we do not know the detailed non-perturbative axioms for correlation functions. However, the two- and three-point functions are completely determined by symmetry, and coincide up to a few constants (really functions of the background FRW geometry) with those calculated in a single field slow-roll inflation model. The only significant deviation from slow roll is in the tensor fluctuations. We predict zero tensor tilt and roughly equal weight for all three conformally invariant tensor 3-point functions (unless parity is imposed as a symmetry). We discuss the relation between our results and those of Maldacena, McFadden, Skenderis, and others. Current data can be explained in terms of symmetries and a few general principles, and is consistent with a large class of models, including HST.
Submitted 17 June, 2013;
originally announced June 2013.
-
Electroweak Baryogenesis And The Fermi Gamma-Ray Line
Authors:
Jonathan Kozaczuk,
Stefano Profumo,
Carroll L. Wainwright
Abstract:
Many particle physics models attempt to explain the 130 GeV gamma-ray feature that the Fermi-LAT observes in the Galactic Center. Neutralino dark matter in non-minimal supersymmetric models, such as the NMSSM, is an especially well-motivated theoretical setup which can explain the line. We explore the possibility that regions of the NMSSM consistent with the 130 GeV line can also produce the observed baryon asymmetry of the universe via electroweak baryogenesis. We find that such regions can in fact accommodate a strongly first-order electroweak phase transition (due to the singlet contribution to the effective potential), while also avoiding a light stop and producing a Standard Model-like Higgs in the observed mass range. Simultaneously, CP-violation from a complex phase in the wino-higgsino sector can account for the observed baryon asymmetry through resonant sources at the electroweak phase transition, while satisfying current constraints from dark matter, collider, and electric dipole moment (EDM) experiments. This result is possible by virtue of a relatively light pseudoscalar Higgs sector with a small degree of mixing, which yields efficient s-channel resonant neutralino annihilation consistent with indirect detection constraints, and of the moderate values of $μ$ required to obtain a bino-like LSP consistent with the line. The wino mass is essentially a free parameter which one can tune to satisfy electroweak baryogenesis. Thus, the NMSSM framework can potentially explain the origins of both baryonic and dark matter components in the Universe. The tightness of the constraints we impose on this scenario makes it extraordinarily predictive, and conclusively testable in the near future by modest improvements in EDM and dark matter search experiments.
Submitted 19 February, 2013;
originally announced February 2013.
-
Accidental Supersymmetric Dark Matter and Baryogenesis
Authors:
Jonathan Kozaczuk,
Stefano Profumo,
Carroll L. Wainwright
Abstract:
We show that "accidental" supersymmetry is a beyond-the-Standard Model framework that naturally accommodates a thermal relic dark matter candidate and successful electroweak baryogenesis, including the needed strongly first-order character of the electroweak phase transition. We study the phenomenology of this setup from the standpoint of both dark matter and baryogenesis. For energies around the electroweak phase transition temperature, the low-energy effective theory is similar to the MSSM with light super-partners of the third-generation quarks and of the Higgs and gauge bosons. We calculate the dark matter relic abundance and the baryon asymmetry across the accidental supersymmetry parameter space, including resonant and non-resonant CP-violating sources. We find that there are regions of parameter space producing both the observed value of the baryon asymmetry and a dark matter candidate with the correct relic density and conforming to present-day constraints from dark matter searches. This scenario makes sharp predictions for the particle spectrum, predicting a lightest neutralino mass between 200 and 500 GeV, with all charginos and neutralinos within less than a factor 2 of the lightest neutralino mass and the heavy Higgs sector within 20-25% of that mass, making it an interesting target for collider searches. In addition, we demonstrate that successful accidental supersymmetric dark matter and baryogenesis will be conclusively tested with improvements smaller than one order of magnitude to the current performance of electron electric dipole moment searches and of direct dark matter searches, as well as with IceCube plus Deep Core neutrino telescope data.
Submitted 25 August, 2012;
originally announced August 2012.
-
Supersymmetric Electroweak Baryogenesis Via Resonant Sfermion Sources
Authors:
Jonathan Kozaczuk,
Stefano Profumo,
Michael J. Ramsey-Musolf,
Carroll L. Wainwright
Abstract:
We calculate the baryon asymmetry produced at the electroweak phase transition by quasi-degenerate third generation sfermions in the minimal supersymmetric extension of the Standard Model. We evaluate constraints from Higgs searches, from collider searches for supersymmetric particles, and from null searches for the permanent electric dipole moment (EDM) of the electron, of the neutron and of atoms. We find that resonant sfermion sources can in principle provide a large enough baryon asymmetry in various corners of the sfermion parameter space, and we focus, in particular, on the case of large $\tanβ$, where third-generation down-type (s)fermions become relevant. We show that in the case of stop and sbottom sources, the viable parameter space is ruled out by constraints from the non-observation of the Mercury EDM. We introduce a new class of CP violating sources, quasi-degenerate staus, that escapes current EDM constraints while providing large enough net chiral currents to achieve successful "slepton-mediated" electroweak baryogenesis.
Submitted 15 November, 2012; v1 submitted 18 June, 2012;
originally announced June 2012.
-
Phase Transitions and Gauge Artifacts in an Abelian Higgs Plus Singlet Model
Authors:
Carroll L. Wainwright,
Stefano Profumo,
Michael J. Ramsey-Musolf
Abstract:
While the finite-temperature effective potential in a gauge theory is a gauge-dependent quantity, in several instances a first-order phase transition can be triggered by gauge-independent terms. A particularly interesting case occurs when the potential barrier separating the broken and symmetric vacua of a spontaneously broken symmetry is produced by tree-level terms in the potential. Here, we study this scenario in a simple Abelian Higgs model, for which the gauge-invariant potential is known, augmented with a singlet real scalar. We analyze the possible symmetry breaking patterns in the model, and illustrate in which cases gauge artifacts are expected to manifest themselves most severely. We then show that gauge artifacts can be pronounced even in the presence of a relatively large, tree-level singlet-Higgs cubic interaction. When the transition is strongly first order, these artifacts, while present, are more subtle than in the generic situation.
Submitted 15 October, 2012; v1 submitted 24 April, 2012;
originally announced April 2012.
-
CosmoTransitions: Computing Cosmological Phase Transition Temperatures and Bubble Profiles with Multiple Fields
Authors:
Carroll L. Wainwright
Abstract:
I present a numerical package (CosmoTransitions) for analyzing finite-temperature cosmological phase transitions driven by single or multiple scalar fields. The package analyzes the different vacua of a theory to determine their critical temperatures (where the vacuum energy levels are degenerate), their super-cooling temperatures, and the bubble wall profiles which separate the phases and describe their tunneling dynamics. I introduce a new method of path deformation to find the profiles of both thin- and thick-walled bubbles. CosmoTransitions is freely available for public use.
Submitted 19 September, 2011;
originally announced September 2011.
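
The CosmoTransitions abstract defines the critical temperature as the point where the vacuum energy levels of two phases are degenerate. As a rough illustration of that definition only, the sketch below finds such a temperature for a toy single-field potential by bisection. It does not use or reproduce the CosmoTransitions API; the potential form, the parameter values (D, E, LAM, T0), and all function names are assumptions chosen for the example.

```python
import math

# Hypothetical toy parameters; not taken from the paper or from CosmoTransitions.
D, E, LAM, T0 = 0.2, 0.05, 0.1, 100.0


def potential(phi, T):
    """Toy finite-temperature potential with a cubic term that creates a barrier."""
    return D * (T**2 - T0**2) * phi**2 - E * T * phi**3 + 0.25 * LAM * phi**4


def broken_minimum(T):
    """Field value of the symmetry-breaking minimum (larger nonzero root of dV/dphi = 0)."""
    disc = 9 * E**2 * T**2 - 8 * LAM * D * (T**2 - T0**2)
    return (3 * E * T + math.sqrt(max(disc, 0.0))) / (2 * LAM)


def free_energy_gap(T):
    """V(broken phase) - V(symmetric phase); zero at the critical temperature."""
    return potential(broken_minimum(T), T) - potential(0.0, T)


def critical_temperature(t_lo, t_hi, tol=1e-10):
    """Bisect on T until the two phases are degenerate to within tol."""
    f_lo = free_energy_gap(t_lo)
    if f_lo * free_energy_gap(t_hi) > 0:
        raise ValueError("bracket does not enclose a sign change")
    while t_hi - t_lo > tol:
        mid = 0.5 * (t_lo + t_hi)
        f_mid = free_energy_gap(mid)
        if f_lo * f_mid <= 0:
            t_hi = mid
        else:
            t_lo, f_lo = mid, f_mid
    return 0.5 * (t_lo + t_hi)


# Highest temperature at which the broken minimum still exists in this toy model.
t_max = T0 * math.sqrt(8 * LAM * D / (8 * LAM * D - 9 * E**2))

t_c = critical_temperature(T0, t_max)

# Closed-form result for this specific toy potential, used as a cross-check.
t_c_exact = T0 / math.sqrt(1 - E**2 / (LAM * D))
```

For this particular potential the numerical root can be compared against the closed-form value `t_c_exact`. A multi-field analysis of the kind the abstract describes would need numerical minimization in place of the analytic `broken_minimum`.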