Search | arXiv e-print repository

The Ethics of Advanced AI Assistants

Authors: Iason Gabriel, Arianna Manzini, Geoff Keeling, Lisa Anne Hendricks, Verena Rieser, Hasan Iqbal, Nenad Tomašev, Ira Ktena, Zachary Kenton, Mikel Rodriguez, Seliem El-Sayed, Sasha Brown, Canfer Akbulut, Andrew Trask, Edward Hughes, A. Stevie Bergman, Renee Shelby, Nahema Marchal, Conor Griffin, Juan Mateos-Garcia, Laura Weidinger, Winnie Street, Benjamin Lange, Alex Ingerman, Alison Lentz , et al. (32 additional authors not shown)

Abstract: This paper focuses on the opportunities and the ethical and societal risks posed by advanced AI assistants. We define advanced AI assistants as artificial agents with natural language interfaces, whose function is to plan and execute sequences of actions on behalf of a user, across one or more domains, in line with the user's expectations. The paper starts by considering the technology itself, pro… ▽ More This paper focuses on the opportunities and the ethical and societal risks posed by advanced AI assistants. We define advanced AI assistants as artificial agents with natural language interfaces, whose function is to plan and execute sequences of actions on behalf of a user, across one or more domains, in line with the user's expectations. The paper starts by considering the technology itself, providing an overview of AI assistants, their technical foundations and potential range of applications. It then explores questions around AI value alignment, well-being, safety and malicious uses. Extending the circle of inquiry further, we next consider the relationship between advanced AI assistants and individual users in more detail, exploring topics such as manipulation and persuasion, anthropomorphism, appropriate relationships, trust and privacy. With this analysis in place, we consider the deployment of advanced assistants at a societal scale, focusing on cooperation, equity and access, misinformation, economic impact, the environment and how best to evaluate advanced AI assistants. Finally, we conclude by providing a range of recommendations for researchers, developers, policymakers and public stakeholders. △ Less

Submitted 28 April, 2024; v1 submitted 24 April, 2024; originally announced April 2024.

arXiv:2404.15058 [pdf, other]

A Mechanism-Based Approach to Mitigating Harms from Persuasive Generative AI

Authors: Seliem El-Sayed, Canfer Akbulut, Amanda McCroskery, Geoff Keeling, Zachary Kenton, Zaria Jalan, Nahema Marchal, Arianna Manzini, Toby Shevlane, Shannon Vallor, Daniel Susser, Matija Franklin, Sophie Bridgers, Harry Law, Matthew Rahtz, Murray Shanahan, Michael Henry Tessler, Arthur Douillard, Tom Everitt, Sasha Brown

Abstract: Recent generative AI systems have demonstrated more advanced persuasive capabilities and are increasingly permeating areas of life where they can influence decision-making. Generative AI presents a new risk profile of persuasion due the opportunity for reciprocal exchange and prolonged interactions. This has led to growing concerns about harms from AI persuasion and how they can be mitigated, high… ▽ More Recent generative AI systems have demonstrated more advanced persuasive capabilities and are increasingly permeating areas of life where they can influence decision-making. Generative AI presents a new risk profile of persuasion due the opportunity for reciprocal exchange and prolonged interactions. This has led to growing concerns about harms from AI persuasion and how they can be mitigated, highlighting the need for a systematic study of AI persuasion. The current definitions of AI persuasion are unclear and related harms are insufficiently studied. Existing harm mitigation approaches prioritise harms from the outcome of persuasion over harms from the process of persuasion. In this paper, we lay the groundwork for the systematic study of AI persuasion. We first put forward definitions of persuasive generative AI. We distinguish between rationally persuasive generative AI, which relies on providing relevant facts, sound reasoning, or other forms of trustworthy evidence, and manipulative generative AI, which relies on taking advantage of cognitive biases and heuristics or misrepresenting information. We also put forward a map of harms from AI persuasion, including definitions and examples of economic, physical, environmental, psychological, sociocultural, political, privacy, and autonomy harm. We then introduce a map of mechanisms that contribute to harmful persuasion. Lastly, we provide an overview of approaches that can be used to mitigate against process harms of persuasion, including prompt engineering for manipulation classification and red teaming. Future work will operationalise these mitigations and study the interaction between different types of mechanisms of persuasion. △ Less

Submitted 23 April, 2024; originally announced April 2024.

arXiv:2312.10029 [pdf, other]

Challenges with unsupervised LLM knowledge discovery

Authors: Sebastian Farquhar, Vikrant Varma, Zachary Kenton, Johannes Gasteiger, Vladimir Mikulik, Rohin Shah

Abstract: We show that existing unsupervised methods on large language model (LLM) activations do not discover knowledge -- instead they seem to discover whatever feature of the activations is most prominent. The idea behind unsupervised knowledge elicitation is that knowledge satisfies a consistency structure, which can be used to discover knowledge. We first prove theoretically that arbitrary features (no… ▽ More We show that existing unsupervised methods on large language model (LLM) activations do not discover knowledge -- instead they seem to discover whatever feature of the activations is most prominent. The idea behind unsupervised knowledge elicitation is that knowledge satisfies a consistency structure, which can be used to discover knowledge. We first prove theoretically that arbitrary features (not just knowledge) satisfy the consistency structure of a particular leading unsupervised knowledge-elicitation method, contrast-consistent search (Burns et al. - arXiv:2212.03827). We then present a series of experiments showing settings in which unsupervised methods result in classifiers that do not predict knowledge, but instead predict a different prominent feature. We conclude that existing unsupervised methods for discovering latent knowledge are insufficient, and we contribute sanity checks to apply to evaluating future knowledge elicitation methods. Conceptually, we hypothesise that the identification issues explored here, e.g. distinguishing a model's knowledge from that of a simulated character's, will persist for future unsupervised methods. △ Less

Submitted 18 December, 2023; v1 submitted 15 December, 2023; originally announced December 2023.

Comments: 12 pages (38 including references and appendices). First three authors equal contribution, randomised order

arXiv:2309.02390 [pdf, other]

Explaining grokking through circuit efficiency

Authors: Vikrant Varma, Rohin Shah, Zachary Kenton, János Kramár, Ramana Kumar

Abstract: One of the most surprising puzzles in neural network generalisation is grokking: a network with perfect training accuracy but poor generalisation will, upon further training, transition to perfect generalisation. We propose that grokking occurs when the task admits a generalising solution and a memorising solution, where the generalising solution is slower to learn but more efficient, producing la… ▽ More One of the most surprising puzzles in neural network generalisation is grokking: a network with perfect training accuracy but poor generalisation will, upon further training, transition to perfect generalisation. We propose that grokking occurs when the task admits a generalising solution and a memorising solution, where the generalising solution is slower to learn but more efficient, producing larger logits with the same parameter norm. We hypothesise that memorising circuits become more inefficient with larger training datasets while generalising circuits do not, suggesting there is a critical dataset size at which memorisation and generalisation are equally efficient. We make and confirm four novel predictions about grokking, providing significant evidence in favour of our explanation. Most strikingly, we demonstrate two novel and surprising behaviours: ungrokking, in which a network regresses from perfect to low test accuracy, and semi-grokking, in which a network shows delayed generalisation to partial rather than perfect test accuracy. △ Less

Submitted 5 September, 2023; originally announced September 2023.

arXiv:2210.01790 [pdf, other]

Goal Misgeneralization: Why Correct Specifications Aren't Enough For Correct Goals

Authors: Rohin Shah, Vikrant Varma, Ramana Kumar, Mary Phuong, Victoria Krakovna, Jonathan Uesato, Zac Kenton

Abstract: The field of AI alignment is concerned with AI systems that pursue unintended goals. One commonly studied mechanism by which an unintended goal might arise is specification gaming, in which the designer-provided specification is flawed in a way that the designers did not foresee. However, an AI system may pursue an undesired goal even when the specification is correct, in the case of goal misgener… ▽ More The field of AI alignment is concerned with AI systems that pursue unintended goals. One commonly studied mechanism by which an unintended goal might arise is specification gaming, in which the designer-provided specification is flawed in a way that the designers did not foresee. However, an AI system may pursue an undesired goal even when the specification is correct, in the case of goal misgeneralization. Goal misgeneralization is a specific form of robustness failure for learning algorithms in which the learned program competently pursues an undesired goal that leads to good performance in training situations but bad performance in novel test situations. We demonstrate that goal misgeneralization can occur in practical systems by providing several examples in deep learning systems across a variety of domains. Extrapolating forward to more capable systems, we provide hypotheticals that illustrate how goal misgeneralization could lead to catastrophic risk. We suggest several research directions that could reduce the risk of goal misgeneralization for future systems. △ Less

Submitted 2 November, 2022; v1 submitted 4 October, 2022; originally announced October 2022.

arXiv:2208.08345 [pdf, other]

Discovering Agents

Authors: Zachary Kenton, Ramana Kumar, Sebastian Farquhar, Jonathan Richens, Matt MacDermott, Tom Everitt

Abstract: Causal models of agents have been used to analyse the safety aspects of machine learning systems. But identifying agents is non-trivial -- often the causal model is just assumed by the modeler without much justification -- and modelling failures can lead to mistakes in the safety analysis. This paper proposes the first formal causal definition of agents -- roughly that agents are systems that woul… ▽ More Causal models of agents have been used to analyse the safety aspects of machine learning systems. But identifying agents is non-trivial -- often the causal model is just assumed by the modeler without much justification -- and modelling failures can lead to mistakes in the safety analysis. This paper proposes the first formal causal definition of agents -- roughly that agents are systems that would adapt their policy if their actions influenced the world in a different way. From this we derive the first causal discovery algorithm for discovering agents from empirical data, and give algorithms for translating between causal models and game-theoretic influence diagrams. We demonstrate our approach by resolving some previous confusions caused by incorrect causal modelling of agents. △ Less

Submitted 24 August, 2022; v1 submitted 17 August, 2022; originally announced August 2022.

Comments: Some typos corrected

arXiv:2201.08102 [pdf, other]

Safe Deep RL in 3D Environments using Human Feedback

Authors: Matthew Rahtz, Vikrant Varma, Ramana Kumar, Zachary Kenton, Shane Legg, Jan Leike

Abstract: Agents should avoid unsafe behaviour during both training and deployment. This typically requires a simulator and a procedural specification of unsafe behaviour. Unfortunately, a simulator is not always available, and procedurally specifying constraints can be difficult or impossible for many real-world tasks. A recently introduced technique, ReQueST, aims to solve this problem by learning a neura… ▽ More Agents should avoid unsafe behaviour during both training and deployment. This typically requires a simulator and a procedural specification of unsafe behaviour. Unfortunately, a simulator is not always available, and procedurally specifying constraints can be difficult or impossible for many real-world tasks. A recently introduced technique, ReQueST, aims to solve this problem by learning a neural simulator of the environment from safe human trajectories, then using the learned simulator to efficiently learn a reward model from human feedback. However, it is yet unknown whether this approach is feasible in complex 3D environments with feedback obtained from real humans - whether sufficient pixel-based neural simulator quality can be achieved, and whether the human data requirements are viable in terms of both quantity and quality. In this paper we answer this question in the affirmative, using ReQueST to train an agent to perform a 3D first-person object collection task using data entirely from human contractors. We show that the resulting agent exhibits an order of magnitude reduction in unsafe behaviour compared to standard reinforcement learning. △ Less

Submitted 21 January, 2022; v1 submitted 20 January, 2022; originally announced January 2022.

arXiv:2112.04359 [pdf, other]

Ethical and social risks of harm from Language Models

Authors: Laura Weidinger, John Mellor, Maribeth Rauh, Conor Griffin, Jonathan Uesato, Po-Sen Huang, Myra Cheng, Mia Glaese, Borja Balle, Atoosa Kasirzadeh, Zac Kenton, Sasha Brown, Will Hawkins, Tom Stepleton, Courtney Biles, Abeba Birhane, Julia Haas, Laura Rimell, Lisa Anne Hendricks, William Isaac, Sean Legassick, Geoffrey Irving, Iason Gabriel

Abstract: This paper aims to help structure the risk landscape associated with large-scale Language Models (LMs). In order to foster advances in responsible innovation, an in-depth understanding of the potential risks posed by these models is needed. A wide range of established and anticipated risks are analysed in detail, drawing on multidisciplinary expertise and literature from computer science, linguist… ▽ More This paper aims to help structure the risk landscape associated with large-scale Language Models (LMs). In order to foster advances in responsible innovation, an in-depth understanding of the potential risks posed by these models is needed. A wide range of established and anticipated risks are analysed in detail, drawing on multidisciplinary expertise and literature from computer science, linguistics, and social sciences. We outline six specific risk areas: I. Discrimination, Exclusion and Toxicity, II. Information Hazards, III. Misinformation Harms, V. Malicious Uses, V. Human-Computer Interaction Harms, VI. Automation, Access, and Environmental Harms. The first area concerns the perpetuation of stereotypes, unfair discrimination, exclusionary norms, toxic language, and lower performance by social group for LMs. The second focuses on risks from private data leaks or LMs correctly inferring sensitive information. The third addresses risks arising from poor, false or misleading information including in sensitive domains, and knock-on risks such as the erosion of trust in shared information. The fourth considers risks from actors who try to use LMs to cause harm. The fifth focuses on risks specific to LLMs used to underpin conversational agents that interact with human users, including unsafe use, manipulation or deception. The sixth discusses the risk of environmental harm, job automation, and other challenges that may have a disparate effect on different social groups or communities. In total, we review 21 risks in-depth. We discuss the points of origin of different risks and point to potential mitigation approaches. Lastly, we discuss organisational responsibilities in implementing mitigations, and the role of collaboration and participation. We highlight directions for further research, particularly on expanding the toolkit for assessing and evaluating the outlined risks in LMs. △ Less

Submitted 8 December, 2021; originally announced December 2021.

arXiv:2103.14659 [pdf, other]

Alignment of Language Agents

Authors: Zachary Kenton, Tom Everitt, Laura Weidinger, Iason Gabriel, Vladimir Mikulik, Geoffrey Irving

Abstract: For artificial intelligence to be beneficial to humans the behaviour of AI agents needs to be aligned with what humans want. In this paper we discuss some behavioural issues for language agents, arising from accidental misspecification by the system designer. We highlight some ways that misspecification can occur and discuss some behavioural issues that could arise from misspecification, including… ▽ More For artificial intelligence to be beneficial to humans the behaviour of AI agents needs to be aligned with what humans want. In this paper we discuss some behavioural issues for language agents, arising from accidental misspecification by the system designer. We highlight some ways that misspecification can occur and discuss some behavioural issues that could arise from misspecification, including deceptive or manipulative language, and review some approaches for avoiding these issues. △ Less

Submitted 26 March, 2021; originally announced March 2021.

arXiv:2012.05672 [pdf, other]

Imitating Interactive Intelligence

Authors: Josh Abramson, Arun Ahuja, Iain Barr, Arthur Brussee, Federico Carnevale, Mary Cassin, Rachita Chhaparia, Stephen Clark, Bogdan Damoc, Andrew Dudzik, Petko Georgiev, Aurelia Guy, Tim Harley, Felix Hill, Alden Hung, Zachary Kenton, Jessica Landon, Timothy Lillicrap, Kory Mathewson, Soňa Mokrá, Alistair Muldal, Adam Santoro, Nikolay Savinov, Vikrant Varma, Greg Wayne , et al. (4 additional authors not shown)

Abstract: A common vision from science fiction is that robots will one day inhabit our physical spaces, sense the world as we do, assist our physical labours, and communicate with us through natural language. Here we study how to design artificial agents that can interact naturally with humans using the simplification of a virtual environment. This setting nevertheless integrates a number of the central cha… ▽ More A common vision from science fiction is that robots will one day inhabit our physical spaces, sense the world as we do, assist our physical labours, and communicate with us through natural language. Here we study how to design artificial agents that can interact naturally with humans using the simplification of a virtual environment. This setting nevertheless integrates a number of the central challenges of artificial intelligence (AI) research: complex visual perception and goal-directed physical control, grounded language comprehension and production, and multi-agent social interaction. To build agents that can robustly interact with humans, we would ideally train them while they interact with humans. However, this is presently impractical. Therefore, we approximate the role of the human with another learned agent, and use ideas from inverse reinforcement learning to reduce the disparities between human-human and agent-agent interactive behaviour. Rigorously evaluating our agents poses a great challenge, so we develop a variety of behavioural tests, including evaluation by humans who watch videos of agents or interact directly with them. These evaluations convincingly demonstrate that interactive training and auxiliary losses improve agent behaviour beyond what is achieved by supervised learning of actions alone. Further, we demonstrate that agent capabilities generalise beyond literal experiences in the dataset. Finally, we train evaluation models whose ratings of agents agree well with human judgement, thus permitting the evaluation of new agent models without additional effort. Taken together, our results in this virtual environment provide evidence that large-scale human behavioural imitation is a promising tool to create intelligent, interactive agents, and the challenge of reliably evaluating such agents is possible to surmount. △ Less

Submitted 20 January, 2021; v1 submitted 10 December, 2020; originally announced December 2020.

arXiv:1912.10481 [pdf, other]

A Systematic Comparison of Bayesian Deep Learning Robustness in Diabetic Retinopathy Tasks

Authors: Angelos Filos, Sebastian Farquhar, Aidan N. Gomez, Tim G. J. Rudner, Zachary Kenton, Lewis Smith, Milad Alizadeh, Arnoud de Kroon, Yarin Gal

Abstract: Evaluation of Bayesian deep learning (BDL) methods is challenging. We often seek to evaluate the methods' robustness and scalability, assessing whether new tools give `better' uncertainty estimates than old ones. These evaluations are paramount for practitioners when choosing BDL tools on-top of which they build their applications. Current popular evaluations of BDL methods, such as the UCI experi… ▽ More Evaluation of Bayesian deep learning (BDL) methods is challenging. We often seek to evaluate the methods' robustness and scalability, assessing whether new tools give `better' uncertainty estimates than old ones. These evaluations are paramount for practitioners when choosing BDL tools on-top of which they build their applications. Current popular evaluations of BDL methods, such as the UCI experiments, are lacking: Methods that excel with these experiments often fail when used in application such as medical or automotive, suggesting a pertinent need for new benchmarks in the field. We propose a new BDL benchmark with a diverse set of tasks, inspired by a real-world medical imaging application on \emph{diabetic retinopathy diagnosis}. Visual inputs (512x512 RGB images of retinas) are considered, where model uncertainty is used for medical pre-screening---i.e. to refer patients to an expert when model diagnosis is uncertain. Methods are then ranked according to metrics derived from expert-domain to reflect real-world use of model uncertainty in automated diagnosis. We develop multiple tasks that fall under this application, including out-of-distribution detection and robustness to distribution shift. We then perform a systematic comparison of well-tuned BDL techniques on the various tasks. From our comparison we conclude that some current techniques which solve benchmarks such as UCI `overfit' their uncertainty to the dataset---when evaluated on our benchmark these underperform in comparison to simpler baselines. The code for the benchmark, its baselines, and a simple API for evaluating new BDL tools are made available at https://github.com/oatml/bdl-benchmarks. △ Less

Submitted 22 December, 2019; originally announced December 2019.

arXiv:1907.01475 [pdf, other]

Generalizing from a few environments in safety-critical reinforcement learning

Authors: Zachary Kenton, Angelos Filos, Owain Evans, Yarin Gal

Abstract: Before deploying autonomous agents in the real world, we need to be confident they will perform safely in novel situations. Ideally, we would expose agents to a very wide range of situations during training, allowing them to learn about every possible danger, but this is often impractical. This paper investigates safety and generalization from a limited number of training environments in deep rein… ▽ More Before deploying autonomous agents in the real world, we need to be confident they will perform safely in novel situations. Ideally, we would expose agents to a very wide range of situations during training, allowing them to learn about every possible danger, but this is often impractical. This paper investigates safety and generalization from a limited number of training environments in deep reinforcement learning (RL). We find RL algorithms can fail dangerously on unseen test environments even when performing perfectly on training environments. Firstly, in a gridworld setting, we show that catastrophes can be significantly reduced with simple modifications, including ensemble model averaging and the use of a blocking classifier. In the more challenging CoinRun environment we find similar methods do not significantly reduce catastrophes. However, we do find that the uncertainty information from the ensemble is useful for predicting whether a catastrophe will occur within a few steps and hence whether human intervention should be requested. △ Less

Submitted 2 July, 2019; originally announced July 2019.

arXiv:1807.05031 [pdf, other]

On the Relation Between the Sharpest Directions of DNN Loss and the SGD Step Length

Authors: Stanisław Jastrzębski, Zachary Kenton, Nicolas Ballas, Asja Fischer, Yoshua Bengio, Amos Storkey

Abstract: Stochastic Gradient Descent (SGD) based training of neural networks with a large learning rate or a small batch-size typically ends in well-generalizing, flat regions of the weight space, as indicated by small eigenvalues of the Hessian of the training loss. However, the curvature along the SGD trajectory is poorly understood. An empirical investigation shows that initially SGD visits increasingly… ▽ More Stochastic Gradient Descent (SGD) based training of neural networks with a large learning rate or a small batch-size typically ends in well-generalizing, flat regions of the weight space, as indicated by small eigenvalues of the Hessian of the training loss. However, the curvature along the SGD trajectory is poorly understood. An empirical investigation shows that initially SGD visits increasingly sharp regions, reaching a maximum sharpness determined by both the learning rate and the batch-size of SGD. When studying the SGD dynamics in relation to the sharpest directions in this initial phase, we find that the SGD step is large compared to the curvature and commonly fails to minimize the loss along the sharpest directions. Furthermore, using a reduced learning rate along these directions can improve training speed while leading to both sharper and better generalizing solutions compared to vanilla SGD. In summary, our analysis of the dynamics of SGD in the subspace of the sharpest directions shows that they influence the regions that SGD steers to (where larger learning rate or smaller batch size result in wider regions visited), the overall training speed, and the generalization ability of the final model. △ Less

Submitted 23 December, 2019; v1 submitted 13 July, 2018; originally announced July 2018.

Journal ref: International Conference on Learning Representations (ICLR) 2019

arXiv:1711.04623 [pdf, other]

Three Factors Influencing Minima in SGD

Authors: Stanisław Jastrzębski, Zachary Kenton, Devansh Arpit, Nicolas Ballas, Asja Fischer, Yoshua Bengio, Amos Storkey

Abstract: We investigate the dynamical and convergent properties of stochastic gradient descent (SGD) applied to Deep Neural Networks (DNNs). Characterizing the relation between learning rate, batch size and the properties of the final minima, such as width or generalization, remains an open question. In order to tackle this problem we investigate the previously proposed approximation of SGD by a stochastic… ▽ More We investigate the dynamical and convergent properties of stochastic gradient descent (SGD) applied to Deep Neural Networks (DNNs). Characterizing the relation between learning rate, batch size and the properties of the final minima, such as width or generalization, remains an open question. In order to tackle this problem we investigate the previously proposed approximation of SGD by a stochastic differential equation (SDE). We theoretically argue that three factors - learning rate, batch size and gradient covariance - influence the minima found by SGD. In particular we find that the ratio of learning rate to batch size is a key determinant of SGD dynamics and of the width of the final minima, and that higher values of the ratio lead to wider minima and often better generalization. We confirm these findings experimentally. Further, we include experiments which show that learning rate schedules can be replaced with batch size schedules and that the ratio of learning rate to batch size is an important factor influencing the memorization process. △ Less

Submitted 13 September, 2018; v1 submitted 13 November, 2017; originally announced November 2017.

Comments: First two authors contributed equally. Short version accepted into ICLR workshop. Accepted to Artificial Neural Networks and Machine Learning, ICANN 2018

arXiv:1605.03435 [pdf, other]

doi 10.1088/1475-7516/2016/10/035

The Separate Universe Approach to Soft Limits

Authors: Zachary Kenton, David J. Mulryne

Abstract: We develop a formalism for calculating soft limits of $n$-point inflationary correlation functions using separate universe techniques. Our method naturally allows for multiple fields and leads to an elegant diagrammatic approach. As an application we focus on the trispectrum produced by inflation with multiple light fields, giving explicit formulae for all possible single- and double-soft limits.… ▽ More We develop a formalism for calculating soft limits of $n$-point inflationary correlation functions using separate universe techniques. Our method naturally allows for multiple fields and leads to an elegant diagrammatic approach. As an application we focus on the trispectrum produced by inflation with multiple light fields, giving explicit formulae for all possible single- and double-soft limits. We also investigate consistency relations and present an infinite tower of inequalities between soft correlation functions which generalise the Suyama-Yamaguchi inequality. △ Less

Submitted 9 December, 2016; v1 submitted 11 May, 2016; originally announced May 2016.

Comments: 28 pages, 7 figures. This is an author-created, un-copyedited version of an article published in JCAP. IOP Publishing Ltd is not responsible for any errors or omissions in this version of the manuscript or any version derived from it. The Version of Record is available online at the DOI below. v3: Updated to match version published in JCAP

arXiv:1507.08629 [pdf, other]

doi 10.1088/1475-7516/2015/10/018

The squeezed limit of the bispectrum in multi-field inflation

Authors: Zachary Kenton, David J. Mulryne

Abstract: We calculate the squeezed limit of the bispectrum produced by inflation with multiple light fields. To achieve this we allow for different horizon exit times for each mode and calculate the intrinsic field-space three-point function in the squeezed limit using soft-limit techniques. We then use the $δN$ formalism from the time the last mode exits the horizon to calculate the bispectrum of the prim… ▽ More We calculate the squeezed limit of the bispectrum produced by inflation with multiple light fields. To achieve this we allow for different horizon exit times for each mode and calculate the intrinsic field-space three-point function in the squeezed limit using soft-limit techniques. We then use the $δN$ formalism from the time the last mode exits the horizon to calculate the bispectrum of the primordial curvature perturbation. We apply our results to calculate the spectral index of the halo bias, $n_{δb}$, an important observational probe of the squeezed limit of the primordial bispectrum and compare our results with previous formulae. We give an example of a curvaton model with $n_{δb} \sim {\cal O}(n_s-1)$ for which we find a 20% correction to observable parameters for squeezings relevant to future experiments. For completeness, we also calculate the squeezed limit of three-point correlation functions involving gravitons for multiple field models. △ Less

Submitted 6 April, 2016; v1 submitted 30 July, 2015; originally announced July 2015.

Comments: 34 pages, 7 figures. Updated to match published version (corrected typos)

arXiv:1504.05736 [pdf, other]

doi 10.1103/PhysRevD.92.023505

Generating the cosmic microwave background power asymmetry with $g_{NL}$

Authors: Zachary Kenton, David J. Mulryne, Steven Thomas

Abstract: We consider a higher order term in the $δN$ expansion for the CMB power asymmetry generated by a superhorizon isocurvature field fluctuation. The term can generate the asymmetry without requiring a large value of $f_{NL}$. Instead it produces a non-zero value of $g_{NL}$. A combination of constraints lead to an allowed region in $f_{NL}-g_{NL}$ space. To produce the asymmetry with this term withou… ▽ More We consider a higher order term in the $δN$ expansion for the CMB power asymmetry generated by a superhorizon isocurvature field fluctuation. The term can generate the asymmetry without requiring a large value of $f_{NL}$. Instead it produces a non-zero value of $g_{NL}$. A combination of constraints lead to an allowed region in $f_{NL}-g_{NL}$ space. To produce the asymmetry with this term without a large value of $f_{NL}$ we find that the isocurvature field needs to contribute less than the inflaton towards the power spectrum of the curvature perturbation. △ Less

Submitted 7 July, 2015; v1 submitted 22 April, 2015; originally announced April 2015.

Comments: 6 pages, 1 figure. Updated to match published version. Minor typographical corrections

Journal ref: Phys. Rev. D 92, 023505 (2015)

arXiv:1409.1221 [pdf, other]

doi 10.1007/JHEP02(2015)127

D-brane Potentials in the Warped Resolved Conifold and Natural Inflation

Authors: Zachary Kenton, Steven Thomas

Abstract: In this paper we obtain a model of Natural Inflation from string theory with a Planckian decay constant. We investigate D-brane dynamics in the background of the warped resolved conifold (WRC) throat approximation of Type IIB string compactifications on Calabi-Yau manifolds. When we glue the throat to a compact bulk Calabi-Yau, we generate a D-brane potential which is a solution to the Laplace equ… ▽ More In this paper we obtain a model of Natural Inflation from string theory with a Planckian decay constant. We investigate D-brane dynamics in the background of the warped resolved conifold (WRC) throat approximation of Type IIB string compactifications on Calabi-Yau manifolds. When we glue the throat to a compact bulk Calabi-Yau, we generate a D-brane potential which is a solution to the Laplace equation on the resolved conifold. We can exactly solve this equation, including dependence on the angular coordinates. The solutions are valid down to the tip of the resolved conifold, which is not the case for the more commonly used deformed conifold. This allows us to exploit the effect of the war**, which is strongest at the tip. We inflate near the tip using an angular coordinate of a D5-brane in the WRC which has a discrete shift symmetry, and feels a cosine potential, giving us a model of Natural Inflation, from which it is possible to get a Planckian decay constant whilst maintaining control over the backreaction. This is because the decay constant for a wrapped brane contains powers of the warp factor, and so can be made large, while the wrap** parameter can be kept small enough so that backreaction is under control. △ Less

Submitted 27 February, 2015; v1 submitted 3 September, 2014; originally announced September 2014.

Comments: 41 pages, 3 appendices, 1 figure, PDFLaTex; various clarifications added along with a new appendix on b-axions and wrapped D5 branes;version matches the one published in JHEP

Journal ref: JHEP02(2015);127

Showing 1–18 of 18 results for author: Kenton, Z