Showing 1–25 of 25 results for author: Siegel, N

  1. arXiv:2405.05348  [pdf, other]

    cs.CL cs.AI cs.LG

    The Effect of Model Size on LLM Post-hoc Explainability via LIME

    Authors: Henning Heyen, Amy Widdicombe, Noah Y. Siegel, Maria Perez-Ortiz, Philip Treleaven

    Abstract: Large language models (LLMs) are becoming bigger to boost performance. However, little is known about how explainability is affected by this trend. This work explores LIME explanations for DeBERTaV3 models of four different sizes on natural language inference (NLI) and zero-shot classification (ZSC) tasks. We evaluate the explanations based on their faithfulness to the models' internal decision pr…

    Submitted 8 May, 2024; originally announced May 2024.

    Comments: Published at ICLR 2024 Workshop on Secure and Trustworthy Large Language Models

  2. arXiv:2404.03189  [pdf, other]

    cs.CL cs.AI

    The Probabilities Also Matter: A More Faithful Metric for Faithfulness of Free-Text Explanations in Large Language Models

    Authors: Noah Y. Siegel, Oana-Maria Camburu, Nicolas Heess, Maria Perez-Ortiz

    Abstract: In order to oversee advanced AI systems, it is important to understand their underlying decision-making process. When prompted, large language models (LLMs) can provide natural language explanations or reasoning traces that sound plausible and receive high ratings from human annotators. However, it is unclear to what extent these explanations are faithful, i.e., truly capture the factors responsib…

    Submitted 7 June, 2024; v1 submitted 4 April, 2024; originally announced April 2024.

    Comments: To be published in ACL 2024. 19 pages, 2 figures

  3. arXiv:2312.16259  [pdf, ps, other]

    math.CO

    On the General Dead-Ending Universe of Partizan Games

    Authors: Aaron N. Siegel

    Abstract: The universe $\mathcal{E}$ of dead-ending partizan games has emerged as an important structure in the study of misère play. Here we attempt a systematic investigation of the structure of $\mathcal{E}$ and its subuniverses. We begin by showing that the dead-ends exhibit a rich "absolute" structure, in the sense that they behave identically in any universe in which they appear. We will use this re…

    Submitted 26 December, 2023; originally announced December 2023.

    MSC Class: 05A99; 91A46

  4. arXiv:2309.15534  [pdf]

    physics.chem-ph cond-mat.mes-hall

    Universal click-chemistry approach for the DNA functionalization of nanoparticles

    Authors: Nicole Siegel, Hiroaki Hasebe, German Chiarelli, Denis Garoli, Hiroshi Sugimoto, Minoru Fujii, Guillermo P. Acuna, Karol Kolataj

    Abstract: Nanotechnology has revolutionized the fabrication of hybrid species with tailored functionalities. A milestone in this field is the DNA conjugation of nanoparticles, introduced almost 30 years ago, which typically exploits the affinity between thiol groups and metallic surfaces. Over the last decades, developments in colloidal research have enabled the synthesis of an assortment of non-metallic st…

    Submitted 29 December, 2023; v1 submitted 27 September, 2023; originally announced September 2023.

  5. Learning Agile Soccer Skills for a Bipedal Robot with Deep Reinforcement Learning

    Authors: Tuomas Haarnoja, Ben Moran, Guy Lever, Sandy H. Huang, Dhruva Tirumala, Jan Humplik, Markus Wulfmeier, Saran Tunyasuvunakool, Noah Y. Siegel, Roland Hafner, Michael Bloesch, Kristian Hartikainen, Arunkumar Byravan, Leonard Hasenclever, Yuval Tassa, Fereshteh Sadeghi, Nathan Batchelor, Federico Casarini, Stefano Saliceti, Charles Game, Neil Sreendra, Kushal Patel, Marlon Gwira, Andrea Huber, Nicole Hurley, et al. (3 additional authors not shown)

    Abstract: We investigate whether Deep Reinforcement Learning (Deep RL) is able to synthesize sophisticated and safe movement skills for a low-cost, miniature humanoid robot that can be composed into complex behavioral strategies in dynamic environments. We used Deep RL to train a humanoid robot with 20 actuated joints to play a simplified one-versus-one (1v1) soccer game. The resulting agent exhibits robust…

    Submitted 11 April, 2024; v1 submitted 26 April, 2023; originally announced April 2023.

    Comments: Project website: https://sites.google.com/view/op3-soccer

  6. The James Webb Space Telescope Mission

    Authors: Jonathan P. Gardner, John C. Mather, Randy Abbott, James S. Abell, Mark Abernathy, Faith E. Abney, John G. Abraham, Roberto Abraham, Yasin M. Abul-Huda, Scott Acton, Cynthia K. Adams, Evan Adams, David S. Adler, Maarten Adriaensen, Jonathan Albert Aguilar, Mansoor Ahmed, Nasif S. Ahmed, Tanjira Ahmed, Rüdeger Albat, Loïc Albert, Stacey Alberts, David Aldridge, Mary Marsha Allen, Shaune S. Allen, Martin Altenburg, et al. (983 additional authors not shown)

    Abstract: Twenty-six years ago a small committee report, building on earlier studies, expounded a compelling and poetic vision for the future of astronomy, calling for an infrared-optimized space telescope with an aperture of at least $4m$. With the support of their governments in the US, Europe, and Canada, 20,000 people realized that vision as the $6.5m$ James Webb Space Telescope. A generation of astrono…

    Submitted 10 April, 2023; originally announced April 2023.

    Comments: Accepted by PASP for the special issue on The James Webb Space Telescope Overview, 29 pages, 4 figures

  7. arXiv:2211.14275  [pdf, other]

    cs.LG cs.AI cs.CL

    Solving math word problems with process- and outcome-based feedback

    Authors: Jonathan Uesato, Nate Kushman, Ramana Kumar, Francis Song, Noah Siegel, Lisa Wang, Antonia Creswell, Geoffrey Irving, Irina Higgins

    Abstract: Recent work has shown that asking language models to generate reasoning steps improves performance on many reasoning tasks. When moving beyond prompting, this raises the question of how we should supervise such models: outcome-based approaches which supervise the final result, or process-based approaches which supervise the reasoning process itself? Differences between these approaches might natur…

    Submitted 25 November, 2022; originally announced November 2022.

  8. arXiv:2203.17138  [pdf, other]

    cs.RO cs.AI cs.LG

    Imitate and Repurpose: Learning Reusable Robot Movement Skills From Human and Animal Behaviors

    Authors: Steven Bohez, Saran Tunyasuvunakool, Philemon Brakel, Fereshteh Sadeghi, Leonard Hasenclever, Yuval Tassa, Emilio Parisotto, Jan Humplik, Tuomas Haarnoja, Roland Hafner, Markus Wulfmeier, Michael Neunert, Ben Moran, Noah Siegel, Andrea Huber, Francesco Romano, Nathan Batchelor, Federico Casarini, Josh Merel, Raia Hadsell, Nicolas Heess

    Abstract: We investigate the use of prior knowledge of human and animal movement to learn reusable locomotion skills for real legged robots. Our approach builds upon previous work on imitating human or dog Motion Capture (MoCap) data to learn a movement skill module. Once learned, this skill module can be reused for complex downstream tasks. Importantly, due to the prior imposed by the MoCap data, our appro…

    Submitted 31 March, 2022; originally announced March 2022.

    Comments: 30 pages, 9 figures, 8 tables, 14 videos at https://bit.ly/robot-npmp , submitted to Science Robotics

  9. arXiv:2112.02721  [pdf, other]

    cs.CL cs.AI cs.LG

    NL-Augmenter: A Framework for Task-Sensitive Natural Language Augmentation

    Authors: Kaustubh D. Dhole, Varun Gangal, Sebastian Gehrmann, Aadesh Gupta, Zhenhao Li, Saad Mahamood, Abinaya Mahendiran, Simon Mille, Ashish Shrivastava, Samson Tan, Tongshuang Wu, Jascha Sohl-Dickstein, Jinho D. Choi, Eduard Hovy, Ondrej Dusek, Sebastian Ruder, Sajant Anand, Nagender Aneja, Rabin Banjade, Lisa Barthe, Hanna Behnke, Ian Berlot-Attwell, Connor Boyle, Caroline Brun, Marco Antonio Sobrevilla Cabezudo, et al. (101 additional authors not shown)

    Abstract: Data augmentation is an important component in the robustness evaluation of models in natural language processing (NLP) and in enhancing the diversity of the data they are trained on. In this paper, we present NL-Augmenter, a new participatory Python-based natural language augmentation framework which supports the creation of both transformations (modifications to the data) and filters (data split…

    Submitted 11 October, 2022; v1 submitted 5 December, 2021; originally announced December 2021.

    Comments: 39 pages, repository at https://github.com/GEM-benchmark/NL-Augmenter

  10. arXiv:2105.12196  [pdf, other]

    cs.AI cs.MA cs.NE cs.RO

    From Motor Control to Team Play in Simulated Humanoid Football

    Authors: Siqi Liu, Guy Lever, Zhe Wang, Josh Merel, S. M. Ali Eslami, Daniel Hennes, Wojciech M. Czarnecki, Yuval Tassa, Shayegan Omidshafiei, Abbas Abdolmaleki, Noah Y. Siegel, Leonard Hasenclever, Luke Marris, Saran Tunyasuvunakool, H. Francis Song, Markus Wulfmeier, Paul Muller, Tuomas Haarnoja, Brendan D. Tracey, Karl Tuyls, Thore Graepel, Nicolas Heess

    Abstract: Intelligent behaviour in the physical world exhibits structure at multiple spatial and temporal scales. Although movements are ultimately executed at the level of instantaneous muscle tensions or joint torques, they must be selected to serve goals defined on much longer timescales, and in terms of relations that extend far beyond the body itself, ultimately involving coordination with other agents…

    Submitted 25 May, 2021; originally announced May 2021.

  11. arXiv:2012.08554  [pdf, ps, other]

    math.CO

    On the Structure of Misère Impartial Games

    Authors: Aaron N. Siegel

    Abstract: We consider the abstract structure of the monoid M of misère impartial game values. Several new results are presented, including a proof that the group of fractions of M is almost torsion-free; a method of calculating the number of distinct games born by day 7; and some new results on the structure of prime games. Also included are proofs of a few older results due to Conway, such as the Cancellat…

    Submitted 3 September, 2021; v1 submitted 15 December, 2020; originally announced December 2020.

    MSC Class: 05A99; 91A46

  12. arXiv:2010.15492  [pdf, other]

    cs.RO

    "What, not how": Solving an under-actuated insertion task from scratch

    Authors: Giulia Vezzani, Michael Neunert, Markus Wulfmeier, Rae Jeong, Thomas Lampe, Noah Siegel, Roland Hafner, Abbas Abdolmaleki, Martin Riedmiller, Francesco Nori

    Abstract: Robot manipulation requires a complex set of skills that need to be carefully combined and coordinated to solve a task. Yet, most Reinforcement Learning (RL) approaches in robotics study tasks which actually consist only of a single manipulation skill, such as grasping an object or inserting a pre-grasped object. As a result the skill ('how' to solve the task) but not the actual goal of a complete…

    Submitted 30 October, 2020; v1 submitted 29 October, 2020; originally announced October 2020.

  13. arXiv:2007.15588  [pdf, other]

    cs.LG cs.AI cs.RO stat.ML

    Data-efficient Hindsight Off-policy Option Learning

    Authors: Markus Wulfmeier, Dushyant Rao, Roland Hafner, Thomas Lampe, Abbas Abdolmaleki, Tim Hertweck, Michael Neunert, Dhruva Tirumala, Noah Siegel, Nicolas Heess, Martin Riedmiller

    Abstract: We introduce Hindsight Off-policy Options (HO2), a data-efficient option learning algorithm. Given any trajectory, HO2 infers likely option choices and backpropagates through the dynamic programming inference procedure to robustly train all policy components off-policy and end-to-end. The approach outperforms existing option learning methods on common benchmarks. To better understand the option fr…

    Submitted 15 June, 2021; v1 submitted 30 July, 2020; originally announced July 2020.

    Comments: Published at ICML2021

  14. arXiv:2006.15134  [pdf, other]

    cs.LG cs.AI stat.ML

    Critic Regularized Regression

    Authors: Ziyu Wang, Alexander Novikov, Konrad Zolna, Jost Tobias Springenberg, Scott Reed, Bobak Shahriari, Noah Siegel, Josh Merel, Caglar Gulcehre, Nicolas Heess, Nando de Freitas

    Abstract: Offline reinforcement learning (RL), also known as batch RL, offers the prospect of policy optimization from large pre-recorded datasets without online environment interaction. It addresses challenges with regard to the cost of data collection and safety, both of which are particularly pertinent to real-world applications of RL. Unfortunately, most off-policy algorithms perform poorly when learnin…

    Submitted 22 September, 2021; v1 submitted 26 June, 2020; originally announced June 2020.

    Comments: 24 pages; presented at NeurIPS 2020

  15. arXiv:2005.07541  [pdf, other]

    cs.LG cs.AI cs.RO stat.ML

    Simple Sensor Intentions for Exploration

    Authors: Tim Hertweck, Martin Riedmiller, Michael Bloesch, Jost Tobias Springenberg, Noah Siegel, Markus Wulfmeier, Roland Hafner, Nicolas Heess

    Abstract: Modern reinforcement learning algorithms can learn solutions to increasingly difficult control problems while at the same time reduce the amount of prior knowledge needed for their application. One of the remaining challenges is the definition of reward schemes that appropriately facilitate exploration without biasing the solution in undesirable ways, and that can be implemented on real robotic sy…

    Submitted 15 May, 2020; originally announced May 2020.

  16. arXiv:2002.08396  [pdf, other]

    cs.LG cs.RO stat.ML

    Keep Doing What Worked: Behavioral Modelling Priors for Offline Reinforcement Learning

    Authors: Noah Y. Siegel, Jost Tobias Springenberg, Felix Berkenkamp, Abbas Abdolmaleki, Michael Neunert, Thomas Lampe, Roland Hafner, Nicolas Heess, Martin Riedmiller

    Abstract: Off-policy reinforcement learning algorithms promise to be applicable in settings where only a fixed data-set (batch) of environment interactions is available and no new experience can be acquired. This property makes these algorithms appealing for real world problems such as robot control. In practice, however, standard off-policy algorithms fail in the batch setting for continuous control. In th…

    Submitted 17 June, 2020; v1 submitted 19 February, 2020; originally announced February 2020.

    ACM Class: I.2.6; I.2.9

    Journal ref: ICLR 2020

  17. arXiv:1912.10517  [pdf, other]

    math.CO

    Memgames

    Authors: Urban Larsson, Simon Rubinstein-Salzedo, Aaron N. Siegel

    Abstract: In this article, we study the structure, and in particular the Grundy values, of a family of games known as memgames.

    Submitted 30 October, 2023; v1 submitted 22 December, 2019; originally announced December 2019.

    Comments: Feedback welcome!

  18. arXiv:1910.04142  [pdf, other]

    cs.RO cs.AI cs.CV cs.LG cs.NE

    Imagined Value Gradients: Model-Based Policy Optimization with Transferable Latent Dynamics Models

    Authors: Arunkumar Byravan, Jost Tobias Springenberg, Abbas Abdolmaleki, Roland Hafner, Michael Neunert, Thomas Lampe, Noah Siegel, Nicolas Heess, Martin Riedmiller

    Abstract: Humans are masters at quickly learning many complex tasks, relying on an approximate understanding of the dynamics of their environments. In much the same way, we would like our learning agents to quickly adapt to new tasks. In this paper, we explore how model-based Reinforcement Learning (RL) can facilitate transfer to new tasks. We develop an algorithm that learns an action-conditional, predicti…

    Submitted 9 October, 2019; originally announced October 2019.

    Comments: To appear at the 3rd annual Conference on Robot Learning, Osaka, Japan (CoRL 2019). 24 pages including appendix (main paper - 8 pages)

  19. arXiv:1906.11228  [pdf, other]

    cs.LG cs.AI cs.RO stat.ML

    Compositional Transfer in Hierarchical Reinforcement Learning

    Authors: Markus Wulfmeier, Abbas Abdolmaleki, Roland Hafner, Jost Tobias Springenberg, Michael Neunert, Tim Hertweck, Thomas Lampe, Noah Siegel, Nicolas Heess, Martin Riedmiller

    Abstract: The successful application of general reinforcement learning algorithms to real-world robotics applications is often limited by their high data requirements. We introduce Regularized Hierarchical Policy Optimization (RHPO) to improve data-efficiency for domains with multiple dominant tasks and ultimately reduce required platform time. To this end, we employ compositional inductive biases on multip…

    Submitted 19 May, 2020; v1 submitted 26 June, 2019; originally announced June 2019.

    Comments: Robotics Science and Systems 2020

  20. Extracting Scientific Figures with Distantly Supervised Neural Networks

    Authors: Noah Siegel, Nicholas Lourie, Russell Power, Waleed Ammar

    Abstract: Non-textual components such as charts, diagrams and tables provide key information in many scientific documents, but the lack of large labeled datasets has impeded the development of data-driven methods for scientific figure extraction. In this paper, we induce high-quality training labels for the task of figure extraction in a large number of scientific documents, with no human intervention. To a…

    Submitted 30 May, 2018; v1 submitted 6 April, 2018; originally announced April 2018.

    Comments: 10 pages, 5 figures, paper accepted at JCDL 2018

  21. arXiv:0705.2404  [pdf, ps, other]

    math.CO math.AC

    Misere quotients for impartial games: Supplementary material

    Authors: Thane E. Plambeck, Aaron N. Siegel

    Abstract: We provide supplementary appendices to the paper Misere quotients for impartial games. These include detailed solutions to many of the octal games discussed in the paper, and descriptions of the algorithms used to compute most of our solutions.

    Submitted 16 May, 2007; originally announced May 2007.

    Comments: Supplement to the paper Misere Quotients for Impartial Games. 17 pages

    MSC Class: 91A46

  22. arXiv:math/0703565  [pdf, ps, other]

    math.CO

    Misère canonical forms of partizan games

    Authors: Aaron N. Siegel

    Abstract: We show that partizan games admit canonical forms in misère play. The proof is a synthesis of the canonical form theorems for normal-play partizan games and misère-play impartial games. It is fully constructive, and algorithms readily emerge for comparing misère games and calculating their canonical forms. We use these techniques to show that there are precisely 256 games born by day 2, and to…

    Submitted 19 March, 2007; originally announced March 2007.

    Comments: 12 pages

    MSC Class: 91A46

  23. arXiv:math/0703070  [pdf, ps, other]

    math.CO math.AC

    The structure and classification of misère quotients

    Authors: Aaron N. Siegel

    Abstract: A \emph{bipartite monoid} is a commutative monoid $\Q$ together with an identified subset $\P \subset \Q$. In this paper we study a class of bipartite monoids, known as \emph{misère quotients}, that are naturally associated to impartial combinatorial games. We introduce a structure theory for misère quotients with $|\P| = 2$, and give a complete classification of all such quotients up to isomorph…

    Submitted 2 March, 2007; originally announced March 2007.

    Comments: 23 pages

    MSC Class: 91A46

  24. arXiv:math/0612616  [pdf, ps, other]

    math.CO math.AC

    Misère Games and Misère Quotients

    Authors: Aaron N. Siegel

    Abstract: These lecture notes are based on a short course on misère quotients offered at the Weizmann Institute of Science in Rehovot, Israel, in November 2006. They include an introduction to impartial games, starting from the beginning; the basic misère quotient construction; a proof of the Guy--Smith--Plambeck Periodicity Theorem; and statements of some recent results and open problems in the subject.

    Submitted 21 December, 2006; v1 submitted 20 December, 2006; originally announced December 2006.

    Comments: 34 pages; fixed references

    MSC Class: 91A46

  25. arXiv:math/0609825  [pdf, ps, other]

    math.CO math.AC

    Misere quotients for impartial games

    Authors: Thane E. Plambeck, Aaron N. Siegel

    Abstract: We announce misere-play solutions to several previously-unsolved combinatorial games. The solutions are described in terms of misere quotients--commutative monoids that encode the additive structure of specific misere-play games. We also introduce several advances in the structure theory of misere quotients, including a connection between the combinatorial structure of normal and misere play.

    Submitted 13 August, 2007; v1 submitted 28 September, 2006; originally announced September 2006.

    Comments: Paper has been split into two parts: this part, and a supplement at arXiv:0705.2404v1

    MSC Class: 91A46; 20M14

    Journal ref: Journal of Combinatorial Theory, Series A (May 2008) pp 593-622