
Showing 1–50 of 178 results for author: West, P

  1. arXiv:2407.02446  [pdf, other]

    cs.CL cs.AI

    Predicting vs. Acting: A Trade-off Between World Modeling & Agent Modeling

    Authors: Margaret Li, Weijia Shi, Artidoro Pagnoni, Peter West, Ari Holtzman

    Abstract: RLHF-aligned LMs have shown unprecedented ability on both benchmarks and long-form text generation, yet they struggle with one foundational task: next-token prediction. As RLHF models become agent models aimed at interacting with humans, they seem to lose their world modeling -- the ability to predict what comes next in arbitrary documents, which is the foundational training objective of the Base…

    Submitted 2 July, 2024; originally announced July 2024.

  2. arXiv:2403.13780  [pdf, other]

    cs.CL cs.AI

    Information-Theoretic Distillation for Reference-less Summarization

    Authors: Jaehun Jung, Ximing Lu, Liwei Jiang, Faeze Brahman, Peter West, Pang Wei Koh, Yejin Choi

    Abstract: The current winning recipe for automatic summarization is using proprietary large-scale language models (LLMs) such as ChatGPT as is, or imitation learning from them as teacher models. While increasingly ubiquitous dependence on such large-scale language models is convenient, there remains an important question of whether small-scale models could have achieved competitive results, if we were to se…

    Submitted 20 March, 2024; originally announced March 2024.

  3. arXiv:2403.13453  [pdf, ps, other]

    physics.hist-ph hep-th

    Memories of Abdus Salam and the early days of supersymmetry

    Authors: Peter West

    Abstract: I give an account of what it was like to be a PhD student of Abdus Salam and also to take part during the early stages of the development of supersymmetry.

    Submitted 20 March, 2024; originally announced March 2024.

    Comments: 11 pages. arXiv admin note: text overlap with arXiv:1609.06863

  4. arXiv:2312.05979  [pdf, other]

    cs.CL

    NovaCOMET: Open Commonsense Foundation Models with Symbolic Knowledge Distillation

    Authors: Peter West, Ronan Le Bras, Taylor Sorensen, Bill Yuchen Lin, Liwei Jiang, Ximing Lu, Khyathi Chandu, Jack Hessel, Ashutosh Baheti, Chandra Bhagavatula, Yejin Choi

    Abstract: We present NovaCOMET, an open commonsense knowledge model, that combines the best aspects of knowledge and general task models. Compared to previous knowledge models, NovaCOMET allows open-format relations enabling direct application to reasoning tasks; compared to general task models like Flan-T5, it explicitly centers knowledge, enabling superior performance for commonsense reasoning. NovaCOME…

    Submitted 10 December, 2023; originally announced December 2023.

  5. arXiv:2312.04837  [pdf, other]

    cs.AI cs.CL cs.CV

    Localized Symbolic Knowledge Distillation for Visual Commonsense Models

    Authors: Jae Sung Park, Jack Hessel, Khyathi Raghavi Chandu, Paul Pu Liang, Ximing Lu, Peter West, Youngjae Yu, Qiuyuan Huang, Jianfeng Gao, Ali Farhadi, Yejin Choi

    Abstract: Instruction following vision-language (VL) models offer a flexible interface that supports a broad range of multimodal tasks in a zero-shot fashion. However, interfaces that operate on full images do not directly enable the user to "point to" and access specific regions within images. This capability is important not only to support reference-grounded VL benchmarks, but also, for practical applica…

    Submitted 12 December, 2023; v1 submitted 8 December, 2023; originally announced December 2023.

    Comments: Neurips 2023

  6. arXiv:2311.00059  [pdf, other]

    cs.AI cs.CL cs.CV cs.LG

    The Generative AI Paradox: "What It Can Create, It May Not Understand"

    Authors: Peter West, Ximing Lu, Nouha Dziri, Faeze Brahman, Linjie Li, Jena D. Hwang, Liwei Jiang, Jillian Fisher, Abhilasha Ravichander, Khyathi Chandu, Benjamin Newman, Pang Wei Koh, Allyson Ettinger, Yejin Choi

    Abstract: The recent wave of generative AI has sparked unprecedented global attention, with both excitement and concern over potentially superhuman levels of artificial intelligence: models now take only seconds to produce outputs that would challenge or exceed the capabilities even of expert humans. At the same time, models still show basic errors in understanding that would not be expected even in non-exp…

    Submitted 31 October, 2023; originally announced November 2023.

  7. Value Kaleidoscope: Engaging AI with Pluralistic Human Values, Rights, and Duties

    Authors: Taylor Sorensen, Liwei Jiang, Jena Hwang, Sydney Levine, Valentina Pyatkin, Peter West, Nouha Dziri, Ximing Lu, Kavel Rao, Chandra Bhagavatula, Maarten Sap, John Tasioulas, Yejin Choi

    Abstract: Human values are crucial to human decision-making. Value pluralism is the view that multiple correct values may be held in tension with one another (e.g., when considering lying to a friend to protect their feelings, how does one balance honesty with friendship?). As statistical learners, AI systems fit to averages by default, washing out these potentially irreducible value conflicts. To improve A…

    Submitted 2 April, 2024; v1 submitted 1 September, 2023; originally announced September 2023.

    Comments: Proceedings of the AAAI Conference on Artificial Intelligence, 38

    Journal ref: Vol. 38 No. 18: AAAI-24 Technical Tracks 18; 2024; 19937-19947

  8. arXiv:2308.00189  [pdf, other]

    cs.LG cs.AI cs.CL

    Generative Models as a Complex Systems Science: How can we make sense of large language model behavior?

    Authors: Ari Holtzman, Peter West, Luke Zettlemoyer

    Abstract: Coaxing out desired behavior from pretrained models, while avoiding undesirable ones, has redefined NLP and is reshaping how we interact with computers. What was once a scientific engineering discipline, in which building blocks are stacked one on top of the other, is arguably already a complex systems science, in which emergent behaviors are sought out to support previously unimagined use cases.…

    Submitted 31 July, 2023; originally announced August 2023.

    Comments: 15 pages, 7 figures

  9. arXiv:2306.00924  [pdf, other]

    cs.CL cs.AI cs.LG

    Minding Language Models' (Lack of) Theory of Mind: A Plug-and-Play Multi-Character Belief Tracker

    Authors: Melanie Sclar, Sachin Kumar, Peter West, Alane Suhr, Yejin Choi, Yulia Tsvetkov

    Abstract: Theory of Mind (ToM), the ability to reason about the mental states of other people, is a key element of our social intelligence. Yet, despite their ever more impressive performance, large-scale neural language models still lack basic theory of mind capabilities out-of-the-box. We posit that simply scaling up models will not imbue them with theory of mind due to the i…

    Submitted 1 June, 2023; originally announced June 2023.

    Journal ref: ACL 2023

  10. arXiv:2305.18654  [pdf, other]

    cs.CL cs.AI cs.LG

    Faith and Fate: Limits of Transformers on Compositionality

    Authors: Nouha Dziri, Ximing Lu, Melanie Sclar, Xiang Lorraine Li, Liwei Jiang, Bill Yuchen Lin, Peter West, Chandra Bhagavatula, Ronan Le Bras, Jena D. Hwang, Soumya Sanyal, Sean Welleck, Xiang Ren, Allyson Ettinger, Zaid Harchaoui, Yejin Choi

    Abstract: Transformer large language models (LLMs) have sparked admiration for their exceptional performance on tasks that demand intricate multi-step reasoning. Yet, these models simultaneously show failures on surprisingly trivial problems. This begs the question: Are these errors incidental, or do they signal more substantial limitations? In an attempt to demystify transformer LLMs, we investigate the li…

    Submitted 31 October, 2023; v1 submitted 29 May, 2023; originally announced May 2023.

    Comments: 10 pages + appendix (40 pages)

  11. arXiv:2305.16635  [pdf, other]

    cs.CL cs.AI cs.LG

    Impossible Distillation: from Low-Quality Model to High-Quality Dataset & Model for Summarization and Paraphrasing

    Authors: Jaehun Jung, Peter West, Liwei Jiang, Faeze Brahman, Ximing Lu, Jillian Fisher, Taylor Sorensen, Yejin Choi

    Abstract: We present Impossible Distillation, a novel framework for paraphrasing and sentence summarization, that distills a high-quality dataset and model from a low-quality teacher that itself cannot perform these tasks. Unlike prior works that rely on an extreme-scale teacher model (e.g., GPT3) or task-specific architecture, we hypothesize and verify the paraphrastic proximity intrinsic to pre-trained LM…

    Submitted 5 April, 2024; v1 submitted 26 May, 2023; originally announced May 2023.

    Comments: NAACL 2024

  12. arXiv:2305.15065  [pdf, other]

    cs.CL

    Inference-Time Policy Adapters (IPA): Tailoring Extreme-Scale LMs without Fine-tuning

    Authors: Ximing Lu, Faeze Brahman, Peter West, Jaehun Jang, Khyathi Chandu, Abhilasha Ravichander, Lianhui Qin, Prithviraj Ammanabrolu, Liwei Jiang, Sahana Ramnath, Nouha Dziri, Jillian Fisher, Bill Yuchen Lin, Skyler Hallinan, Xiang Ren, Sean Welleck, Yejin Choi

    Abstract: While extreme-scale language models have demonstrated exceptional performance on a variety of language tasks, the degree of control over these language models through pure prompting can often be limited. Directly fine-tuning such language models can be effective for tailoring them, but it can be either extremely costly (e.g., GPT-3) or not even feasible for the broader community (e.g., GPT-4). W…

    Submitted 6 December, 2023; v1 submitted 24 May, 2023; originally announced May 2023.

    Comments: EMNLP 2023

  13. arXiv:2305.02884  [pdf, ps, other]

    hep-th

    Carrollian conformal fields and flat holography

    Authors: Kevin Nguyen, Peter West

    Abstract: The null conformal boundary $\mathscr{I}$ of Minkowski spacetime $\mathbb{M}$ plays a special role in scattering theory, as it is the locus where massless particle states are most naturally defined. We construct quantum fields on $\mathscr{I}$ which create these massless states from the vacuum and transform covariantly under Poincaré symmetries. Since the latter symmetries act as Carrollian confor…

    Submitted 26 August, 2023; v1 submitted 4 May, 2023; originally announced May 2023.

    Comments: 21 pages + appendix; v2: additional references and comments; v3: published version

  14. arXiv:2304.14399  [pdf, other]

    cs.CL

    We're Afraid Language Models Aren't Modeling Ambiguity

    Authors: Alisa Liu, Zhaofeng Wu, Julian Michael, Alane Suhr, Peter West, Alexander Koller, Swabha Swayamdipta, Noah A. Smith, Yejin Choi

    Abstract: Ambiguity is an intrinsic feature of natural language. Managing ambiguity is a key part of human language understanding, allowing us to anticipate misunderstanding as communicators and revise our interpretations as listeners. As language models (LMs) are increasingly employed as dialogue interfaces and writing aids, handling ambiguous language is critical to their success. We characterize ambiguit…

    Submitted 20 October, 2023; v1 submitted 27 April, 2023; originally announced April 2023.

    Comments: EMNLP 2023 camera-ready

  15. Spacetime and large local transformations

    Authors: Peter West

    Abstract: We argue that the existence of solitons in theories in which local symmetries are spontaneously broken requires spacetime to be enlarged by additional coordinates that are associated with large local transformations. In the context of gravity theories the usual coordinates of spacetime can be thought of arising in this way. E theory automatically contains such an enlarged spacetime. We propose tha…

    Submitted 26 April, 2023; v1 submitted 4 February, 2023; originally announced February 2023.

    Comments: A dedication to Lars Brink and references 45 and 46 added. A typo in equation (1.2.1) corrected

  16. arXiv:2301.09617  [pdf, other]

    cs.CV

    Fully transformer-based biomarker prediction from colorectal cancer histology: a large-scale multicentric study

    Authors: Sophia J. Wagner, Daniel Reisenbüchler, Nicholas P. West, Jan Moritz Niehues, Gregory Patrick Veldhuizen, Philip Quirke, Heike I. Grabsch, Piet A. van den Brandt, Gordon G. A. Hutchins, Susan D. Richman, Tanwei Yuan, Rupert Langer, Josien Christina Anna Jenniskens, Kelly Offermans, Wolfram Mueller, Richard Gray, Stephen B. Gruber, Joel K. Greenson, Gad Rennert, Joseph D. Bonner, Daniel Schmolze, Jacqueline A. James, Maurice B. Loughrey, Manuel Salto-Tellez, Hermann Brenner, et al. (6 additional authors not shown)

    Abstract: Background: Deep learning (DL) can extract predictive and prognostic biomarkers from routine pathology slides in colorectal cancer. For example, a DL test for the diagnosis of microsatellite instability (MSI) in CRC has been approved in 2022. Current approaches rely on convolutional neural networks (CNNs). Transformer networks are outperforming CNNs and are replacing them in many applications, but…

    Submitted 1 March, 2023; v1 submitted 23 January, 2023; originally announced January 2023.

    Comments: Updated Figure 2 and Table A.5

  17. arXiv:2301.03354  [pdf]

    econ.GN

    Action needed to make carbon offsets from tropical forest conservation work for climate change mitigation

    Authors: Thales A. P. West, Sven Wunder, Erin O. Sills, Jan Börner, Sami W. Rifai, Alexandra N. Neidermeier, Andreas Kontoleon

    Abstract: Carbon offsets from voluntarily avoided deforestation projects are generated based on performance vis-à-vis ex-ante deforestation baselines. We examined the impacts of 27 forest conservation projects in six countries on three continents using synthetic control methods for causal inference. We compare the project baselines with ex-post counterfactuals based on observed deforestation in control site…

    Submitted 5 January, 2023; originally announced January 2023.

  18. arXiv:2212.10465  [pdf, other]

    cs.CL

    SODA: Million-scale Dialogue Distillation with Social Commonsense Contextualization

    Authors: Hyunwoo Kim, Jack Hessel, Liwei Jiang, Peter West, Ximing Lu, Youngjae Yu, Pei Zhou, Ronan Le Bras, Malihe Alikhani, Gunhee Kim, Maarten Sap, Yejin Choi

    Abstract: Data scarcity has been a long standing issue in the field of open-domain social dialogue. To quench this thirst, we present SODA: the first publicly available, million-scale high-quality social dialogue dataset. By contextualizing social commonsense knowledge from a knowledge graph, we are able to distill an exceptionally broad spectrum of social interactions from a large language model. Human eva…

    Submitted 23 October, 2023; v1 submitted 20 December, 2022; originally announced December 2022.

    Comments: EMNLP 2023. Dataset, model, and code can be found at https://hyunw.kim/sodaverse

  19. arXiv:2212.09246  [pdf, other]

    cs.CL

    I2D2: Inductive Knowledge Distillation with NeuroLogic and Self-Imitation

    Authors: Chandra Bhagavatula, Jena D. Hwang, Doug Downey, Ronan Le Bras, Ximing Lu, Lianhui Qin, Keisuke Sakaguchi, Swabha Swayamdipta, Peter West, Yejin Choi

    Abstract: Commonsense capabilities of pre-trained language models dramatically improve with scale, leading many to believe that scale is the only winning recipe. But is it? Here, we investigate an alternative that a priori seems impossible: can smaller language models (e.g., GPT-2) win over models that are orders of magnitude larger and better (e.g., GPT-3), if powered with novel commonsense distillation al…

    Submitted 26 May, 2023; v1 submitted 18 December, 2022; originally announced December 2022.

    Comments: ACL 2023

  20. arXiv:2211.00053  [pdf, other]

    cs.CL

    Generating Sequences by Learning to Self-Correct

    Authors: Sean Welleck, Ximing Lu, Peter West, Faeze Brahman, Tianxiao Shen, Daniel Khashabi, Yejin Choi

    Abstract: Sequence generation applications require satisfying semantic constraints, such as ensuring that programs are correct, using certain keywords, or avoiding undesirable content. Language models, whether fine-tuned or prompted with few-shot demonstrations, frequently violate these constraints, and lack a mechanism to iteratively revise their outputs. Moreover, some powerful language models are of extr…

    Submitted 31 October, 2022; originally announced November 2022.

  21. Universal derivation of the asymptotic charges of bosonic massless particles

    Authors: Kevin Nguyen, Peter West

    Abstract: We present a unified treatment of the conserved asymptotic charges associated with any bosonic massless particle in any spacetime dimension. In particular we provide master formulae for the asymptotic charges and the central extensions in the corresponding charge algebras. These formulae can be explicitly evaluated for any given theory. For illustration we apply them to electromagnetism and gravit…

    Submitted 23 June, 2023; v1 submitted 26 October, 2022; originally announced October 2022.

    Comments: 9 pages, References added and commented on

  22. arXiv:2210.13800  [pdf, other]

    cs.CL cs.AI cs.LG

    Referee: Reference-Free Sentence Summarization with Sharper Controllability through Symbolic Knowledge Distillation

    Authors: Melanie Sclar, Peter West, Sachin Kumar, Yulia Tsvetkov, Yejin Choi

    Abstract: We present Referee, a novel framework for sentence summarization that can be trained reference-free (i.e., requiring no gold summaries for supervision), while allowing direct control for compression ratio. Our work is the first to demonstrate that reference-free, controlled sentence summarization is feasible via the conceptual framework of Symbolic Knowledge Distillation (West et al., 2022), where…

    Submitted 25 October, 2022; originally announced October 2022.

    Journal ref: Empirical Methods in Natural Language Processing 2022 (EMNLP 2022)

  23. Higher dualisations of linearised gravity and the $A_1^{+++}$ algebra

    Authors: Nicolas Boulanger, Paul P. Cook, Josh A. O'Connor, Peter West

    Abstract: The non-linear realisation based on $A_1^{+++}$ is known to describe gravity in terms of both the graviton and the dual graviton. We extend this analysis at the linearised level to find the equations of motion for the first higher dual description of gravity that it contains. We also give a systematic method for finding the additional fields beyond those in the non-linear realisation that are requ…

    Submitted 12 June, 2023; v1 submitted 24 August, 2022; originally announced August 2022.

    Comments: 46 pages, no figures. Published version. One reference added, some content moved to an appendix

  24. Conserved asymptotic charges for any massless particle

    Authors: Kevin Nguyen, Peter West

    Abstract: We compute the conserved charges associated with the asymptotic symmetries of massless particles by examining their free theory in Minkowski spacetime. We give a procedure to systematically deduce the fall off of the massless fields at spatial infinity and show that it has a universal behaviour when expressed in tangent space. We do this for generic massless particles. We do not impose gauge fixin…

    Submitted 20 June, 2023; v1 submitted 17 August, 2022; originally announced August 2022.

    Comments: 25 pages. In this new version we added an acknowledgement, namely The work of KN is supported by the ERC Consolidator Grant N. 681908, Quantum black holes: A microscopic window into the microstructure of gravity

  25. arXiv:2205.13636  [pdf, other]

    cs.CL cs.LG

    Quark: Controllable Text Generation with Reinforced Unlearning

    Authors: Ximing Lu, Sean Welleck, Jack Hessel, Liwei Jiang, Lianhui Qin, Peter West, Prithviraj Ammanabrolu, Yejin Choi

    Abstract: Large-scale language models often learn behaviors that are misaligned with user expectations. Generated text may contain offensive or toxic language, contain significant repetition, or be of a different sentiment than desired by the user. We consider the task of unlearning these misalignments by fine-tuning the language model on signals of what not to do. We introduce Quantized Reward Konditioning…

    Submitted 16 November, 2022; v1 submitted 26 May, 2022; originally announced May 2022.

    Journal ref: NeurIPS 2022 (Oral Selection)

  26. arXiv:2203.10133  [pdf, other]

    cs.CL cs.AI

    Probing Factually Grounded Content Transfer with Factual Ablation

    Authors: Peter West, Chris Quirk, Michel Galley, Yejin Choi

    Abstract: Despite recent success, large neural models often generate factually incorrect text. Compounding this is the lack of a standard automatic evaluation for factuality--it cannot be meaningfully improved if it cannot be measured. Grounded generation promises a path to solving both of these problems: models draw on a reliable external document (grounding) for factual information, simplifying the challe…

    Submitted 28 March, 2022; v1 submitted 18 March, 2022; originally announced March 2022.

  27. arXiv:2203.08609  [pdf]

    physics.chem-ph cond-mat.mtrl-sci

    The Sabatier principle for Battery Anodes: Chemical Kinetics and Reversible Electrodeposition at Heterointerfaces

    Authors: Jingxu Zheng, Yue Deng, Wenzao Li, Jiefu Yin, Patrick J. West, Tian Tang, Xiao Tong, David C. Bock, Shuo Jin, Qing Zhao, Regina Garcia-Mendez, Kenneth J. Takeuchi, Esther S. Takeuchi, Amy C. Marschilok, Lynden A. Archer

    Abstract: How surface chemistry influences reactions occurring thereupon has been a long-standing question of broad scientific and technological interest for centuries. Recently, it has re-emerged as a critical question in a subdiscipline of chemistry - electrochemistry at heterointerphases, where the answers have implications for both how, and in what forms, humanity stores the rising quantities of renewab…

    Submitted 25 September, 2022; v1 submitted 14 March, 2022; originally announced March 2022.

    Comments: accepted at Science Advances, in press

  28. The string little algebra

    Authors: Keith Glennon, Peter West

    Abstract: We consider the string, like point particles and branes, to be an irreducible representation of the semi-direct product of the Cartan involution invariant subalgebra of E11 and its vector representation. We show that the subalgebra that preserves the string charges, the string little algebra, is essentially the Borel subalgebra of E9. We also show that the known string physical states carry a repr…

    Submitted 2 February, 2022; originally announced February 2022.

  29. The role of the 1.5 order formalism and the gauging of spacetime groups in the development of gravity and supergravity theories

    Authors: Ali H. Chamseddine, Peter West

    Abstract: The 1.5 formalism played a key role in the discovery of supergravity and it has been used to prove the invariance of essentially all supergravity theories under local supersymmetry. It emerged from the gauging of the super Poincare group to find supergravity. We review both of these developments as well as the auxiliary fields for simple supergravity and its most general coupling to matter using t…

    Submitted 18 January, 2022; originally announced January 2022.

    Comments: 14 pages

  30. arXiv:2112.08726  [pdf, other]

    cs.CL

    NeuroLogic A*esque Decoding: Constrained Text Generation with Lookahead Heuristics

    Authors: Ximing Lu, Sean Welleck, Peter West, Liwei Jiang, Jungo Kasai, Daniel Khashabi, Ronan Le Bras, Lianhui Qin, Youngjae Yu, Rowan Zellers, Noah A. Smith, Yejin Choi

    Abstract: The dominant paradigm for neural text generation is left-to-right decoding from autoregressive language models. Constrained or controllable generation under complex lexical constraints, however, requires foresight to plan ahead feasible future paths. Drawing inspiration from the A* search algorithm, we propose NeuroLogic A*esque, a decoding algorithm that incorporates heuristic estimates of futu…

    Submitted 16 December, 2021; originally announced December 2021.

  31. arXiv:2110.08387  [pdf, other]

    cs.CL

    Generated Knowledge Prompting for Commonsense Reasoning

    Authors: Jiacheng Liu, Alisa Liu, Ximing Lu, Sean Welleck, Peter West, Ronan Le Bras, Yejin Choi, Hannaneh Hajishirzi

    Abstract: It remains an open question whether incorporating external knowledge benefits commonsense reasoning while maintaining the flexibility of pretrained sequence models. To investigate this question, we develop generated knowledge prompting, which consists of generating knowledge from a language model, then providing the knowledge as additional input when answering a question. Our method does not requi…

    Submitted 28 September, 2022; v1 submitted 15 October, 2021; originally announced October 2021.

    Comments: ACL 2022 main conference

  32. arXiv:2110.07178  [pdf, other]

    cs.CL

    Symbolic Knowledge Distillation: from General Language Models to Commonsense Models

    Authors: Peter West, Chandra Bhagavatula, Jack Hessel, Jena D. Hwang, Liwei Jiang, Ronan Le Bras, Ximing Lu, Sean Welleck, Yejin Choi

    Abstract: The common practice for training commonsense models has gone from-human-to-corpus-to-machine: humans author commonsense knowledge graphs in order to train commonsense models. In this work, we investigate an alternative, from-machine-to-corpus-to-machine: general language models author these commonsense knowledge graphs to train commonsense models. Our study leads to a new framework, Symbolic Knowl…

    Submitted 28 November, 2022; v1 submitted 14 October, 2021; originally announced October 2021.

  33. arXiv:2109.13986  [pdf, other]

    cs.LG

    Symbolic Brittleness in Sequence Models: on Systematic Generalization in Symbolic Mathematics

    Authors: Sean Welleck, Peter West, Jize Cao, Yejin Choi

    Abstract: Neural sequence models trained with maximum likelihood estimation have led to breakthroughs in many tasks, where success is defined by the gap between training and test performance. However, their ability to achieve stronger forms of generalization remains unclear. We consider the problem of symbolic mathematical integration, as it requires generalizing systematically beyond the test set. We devel…

    Submitted 24 February, 2022; v1 submitted 28 September, 2021; originally announced September 2021.

    Comments: AAAI 2022

  34. arXiv:2108.02247  [pdf]

    physics.app-ph cond-mat.mes-hall cond-mat.mtrl-sci

    Quasi-HfO$_x$/ AlO$_y$ and AlO$_y$/ HfO$_x$ Based Memristor Devices: Role of Bi-layered Oxides in Digital Set and Analog Reset Switching

    Authors: Pradip Basnet, Erik Anderson, Bhaswar Chakrabarti, Matthew P. West, Fabia Farlin Athena, Eric M. Vogel

    Abstract: Understanding the resistive switching behavior, or the resistance change, of oxide-based memristor devices, is critical to predicting their responses with known electrical inputs. Also, with the known electrical response of a memristor, one can confirm its usefulness in non-volatile memory and/or in artificial neural networks. Although bi- or multi-layered oxides have been reported to improve the…

    Submitted 2 October, 2021; v1 submitted 4 August, 2021; originally announced August 2021.

    Comments: 7 pages, 5 figures

  35. arXiv:2104.08315  [pdf, other]

    cs.CL

    Surface Form Competition: Why the Highest Probability Answer Isn't Always Right

    Authors: Ari Holtzman, Peter West, Vered Shwartz, Yejin Choi, Luke Zettlemoyer

    Abstract: Large language models have shown promising results in zero-shot settings (Brown et al., 2020; Radford et al., 2019). For example, they can perform multiple choice tasks simply by conditioning on a question and selecting the answer with the highest probability. However, ranking by string probability can be problematic due to surface form competition, wherein different surface forms compete for prob…

    Submitted 20 November, 2022; v1 submitted 16 April, 2021; originally announced April 2021.

  36. The massless irreducible representation in E theory and how bosons can appear as spinors

    Authors: Keith Glennon, Peter West

    Abstract: We study in detail the irreducible representation of E theory that corresponds to massless particles. This has little algebra Ic(E9) and contains 128 physical states that belong to the spinor representation of SO(16). These are the degrees of freedom of maximal supergravity in eleven dimensions. This smaller number of the degrees of freedom, compared to what might be expected, is due to an infinit…

    Submitted 3 February, 2021; originally announced February 2021.

  37. Supersymmetry anomalies and the Wess-Zumino Model in a supergravity background

    Authors: Giorgos Eleftheriou, Peter West

    Abstract: We briefly recall the procedure for computing the Ward Identities in the presence of a regulator which violates the symmetry being considered. We compute the first non-trivial correction to the supersymmetry Ward Identity of the Wess-Zumino model in the presence of background supergravity using dimensional regularisation. We find that the result can be removed using a finite local counter term and…

    Submitted 16 December, 2020; originally announced December 2020.

    Comments: seven pages

  38. arXiv:2011.14243  [pdf, other]

    cs.DC

    Srifty: Swift and Thrifty Distributed Training on the Cloud

    Authors: Liang Luo, Peter West, Arvind Krishnamurthy, Luis Ceze

    Abstract: Finding the best VM configuration is key to achieve lower cost and higher throughput, two primary concerns in cloud-based distributed neural network (NN) training today. Optimal VM selection that meets user constraints requires efficiently navigating a large search space while controlling for the performance variance associated with sharing cloud instances and networks. In this work, we characteri…

    Submitted 1 July, 2022; v1 submitted 28 November, 2020; originally announced November 2020.

  39. arXiv:2010.12884  [pdf, other]

    cs.CL

    NeuroLogic Decoding: (Un)supervised Neural Text Generation with Predicate Logic Constraints

    Authors: Ximing Lu, Peter West, Rowan Zellers, Ronan Le Bras, Chandra Bhagavatula, Yejin Choi

    Abstract: Conditional text generation often requires lexical constraints, i.e., which words should or shouldn't be included in the output text. While the dominant recipe for conditional text generation has been large-scale pretrained language models that are finetuned on the task-specific training data, such models do not learn to follow the underlying constraints reliably, even when supervised with large a…

    Submitted 20 April, 2021; v1 submitted 24 October, 2020; originally announced October 2020.

    Comments: NAACL 2021

  40. arXiv:2010.08566  [pdf, other]

    cs.CL

    Reflective Decoding: Beyond Unidirectional Generation with Off-the-Shelf Language Models

    Authors: Peter West, Ximing Lu, Ari Holtzman, Chandra Bhagavatula, Jena Hwang, Yejin Choi

    Abstract: Publicly available, large pretrained Language Models (LMs) generate text with remarkable quality, but only sequentially from left to right. As a result, they are not immediately applicable to generation tasks that break the unidirectional assumption, such as paraphrasing or text-infilling, necessitating task-specific supervision. In this paper, we present Reflective Decoding, a novel unsupervised…

    Submitted 24 December, 2021; v1 submitted 16 October, 2020; originally announced October 2020.

  41. arXiv:2010.05906  [pdf, other]

    cs.CL cs.AI cs.LG

    Back to the Future: Unsupervised Backprop-based Decoding for Counterfactual and Abductive Commonsense Reasoning

    Authors: Lianhui Qin, Vered Shwartz, Peter West, Chandra Bhagavatula, Jena Hwang, Ronan Le Bras, Antoine Bosselut, Yejin Choi

    Abstract: Abductive and counterfactual reasoning, core abilities of everyday human cognition, require reasoning about what might have happened at time t, while conditioning on multiple contexts from the relative past and future. However, simultaneous incorporation of past and future contexts using generative language models (LMs) can be challenging, as they are trained either to condition only on the past c…

    Submitted 2 August, 2021; v1 submitted 12 October, 2020; originally announced October 2020.

    Comments: EMNLP 2020

  42. arXiv:2009.09961  [pdf, other]

    cs.CL

    Adjusting for Confounders with Text: Challenges and an Empirical Evaluation Framework for Causal Inference

    Authors: Galen Weld, Peter West, Maria Glenski, David Arbour, Ryan Rossi, Tim Althoff

    Abstract: Causal inference studies using textual social media data can provide actionable insights on human behavior. Making accurate causal inferences with text requires controlling for confounding which could otherwise impart bias. Recently, many different methods for adjusting for confounders have been proposed, and we show that these existing methods disagree with one another on two datasets inspired by…

    Submitted 6 May, 2022; v1 submitted 21 September, 2020; originally announced September 2020.

    Comments: to appear at ICWSM 2022

  43. Kac-Moody algebras and the cosmological constant

    Authors: Peter West

    Abstract: We show that the theory of gravity constructed from the non-linear realisation of the semi-direct product of the Kac-Moody algebra A1+++ with its vector representation does not allow a cosmological constant. The existence of a cosmological constant in this theory is related to the breaking of the gravitational duality symmetry.

    Submitted 23 July, 2020; originally announced July 2020.

  44. The non-linear dual gravity equation of motion in eleven dimensions

    Authors: Keith Glennon, Peter West

    Abstract: We derive the non-linear dual graviton equation of motion in eleven dimensions in the context of E theory.

    Submitted 3 June, 2020; originally announced June 2020.

  45. arXiv:2004.05483  [pdf, other]

    cs.CL

    Unsupervised Commonsense Question Answering with Self-Talk

    Authors: Vered Shwartz, Peter West, Ronan Le Bras, Chandra Bhagavatula, Yejin Choi

    Abstract: Natural language understanding involves reading between the lines with implicit background knowledge. Current systems either rely on pre-trained language models as the sole implicit source of world knowledge, or resort to external knowledge bases (KBs) to incorporate additional relevant knowledge. We propose an unsupervised framework based on self-talk as a novel alternative to multiple-choice com…

    Submitted 15 September, 2020; v1 submitted 11 April, 2020; originally announced April 2020.

    Comments: EMNLP 2020

  46. Gravity, Dual Gravity and A1+++

    Authors: Keith Glennon, Peter West

    Abstract: We construct the non-linear realisation of the semi-direct product of the very extended algebra A1+++ and its vector representation. This theory has an infinite number of fields that depend on a spacetime with an infinite number of coordinates. Discarding all except the lowest level field and coordinates the dynamics is just Einstein's equation for the graviton field. We show that the gravity fiel…

    Submitted 7 April, 2020; originally announced April 2020.

    Comments: 27 pages

  47. arXiv:1912.03545  [pdf]

    cond-mat.mtrl-sci

    Substrate Dependent Resistive Switching in Amorphous-HfOx Memristors: An Experimental and Computational Investigation

    Authors: Pradip Basnet, Darshan G Pahinkar, Matthew P. West, Christopher J. Perini, Samuel Graham, Eric M. Vogel

    Abstract: While two-terminal HfOX (x<2) memristor devices have been studied for ion transport and current evolution, there have been limited reports on the effect of the long-range thermal environment on their performance. In this work, amorphous-HfOX based memristor devices on two different substrates, thin SiO2(280 nm)/Si and glass, with different thermal conductivities in the range from 1.2 to 138 W/m-K…

    Submitted 1 April, 2020; v1 submitted 7 December, 2019; originally announced December 2019.

    Comments: 8 pages, 9 figures. Journal of Materials Chemistry C, 2020

  48. arXiv:1911.03015  [pdf, ps, other]

    physics.atom-ph hep-ex quant-ph

    The metastable Q $^3Δ_2$ state of ThO: A new resource for the ACME electron EDM search

    Authors: Xing Wu, Zhen Han, James Chow, Daniel G. Ang, Cole Meisenhelder, Cristian D. Panda, Elizabeth P. West, Gerald Gabrielse, John M. Doyle, David DeMille

    Abstract: The best upper limit for the electron electric dipole moment was recently set by the ACME collaboration. This experiment measures an electron spin-precession in a cold beam of ThO molecules in their metastable $H~(^3Δ_1)$ state. Improvement in the statistical and systematic uncertainties is possible with more efficient use of molecules from the source and better magnetometry in the experiment, res…

    Submitted 7 November, 2019; originally announced November 2019.

    Comments: 21 pages, 6 figures, 5 pages appendices

    Journal ref: New Journal of Physics, 22 023013 (2020)

  49. arXiv:1909.10434  [pdf, other]

    astro-ph.GA astro-ph.CO

    Clocking the formation of today's largest galaxies: Wide field integral spectroscopy of Brightest Cluster Galaxies and their surroundings

    Authors: Louise O. V. Edwards, Matthew Salinas, Steffanie Stanley, Priscilla E. Holguin West, Isabella Trierweiler, Hannah Alpert, Paula Coelho, Saisneha Koppaka, Grant R. Tremblay, Hugo Martel, Yuan Li

    Abstract: The formation and evolution of local brightest cluster galaxies (BCGs) is investigated by determining the stellar populations and dynamics from the galaxy core, through the outskirts and into the intracluster light (ICL). Integral spectroscopy of 23 BCGs observed out to 4 r_e is collected and high signal-to-noise regions are identified. Stellar population synthesis codes are used to determine the a…

    Submitted 23 September, 2019; originally announced September 2019.

    Comments: 22 pages, 21 figures, MNRAS, accepted

  50. arXiv:1909.07405  [pdf, other]

    cs.CL

    BottleSum: Unsupervised and Self-supervised Sentence Summarization using the Information Bottleneck Principle

    Authors: Peter West, Ari Holtzman, Jan Buys, Yejin Choi

    Abstract: The principle of the Information Bottleneck (Tishby et al. 1999) is to produce a summary of information X optimized to predict some other relevant information Y. In this paper, we propose a novel approach to unsupervised sentence summarization by mapping the Information Bottleneck principle to a conditional language modelling objective: given a sentence, our approach seeks a compressed sentence th…

    Submitted 20 September, 2019; v1 submitted 16 September, 2019; originally announced September 2019.