Showing 1–3 of 3 results for author: Mills, E

  1. arXiv:2312.12747  [pdf, other]

    cs.LG cs.AI cs.CL stat.ML

    ALMANACS: A Simulatability Benchmark for Language Model Explainability

    Authors: Edmund Mills, Shiye Su, Stuart Russell, Scott Emmons

    Abstract: How do we measure the efficacy of language model explainability methods? While many explainability methods have been developed, they are typically evaluated on bespoke tasks, preventing an apples-to-apples comparison. To help fill this gap, we present ALMANACS, a language model explainability benchmark. ALMANACS scores explainability methods on simulatability, i.e., how well the explanations impro…

    Submitted 19 December, 2023; originally announced December 2023.

    Comments: Code is available at https://github.com/edmundmills/ALMANACS

  2. arXiv:2208.03776  [pdf, other]

    cs.LG math.NA

    Stochastic Scaling in Loss Functions for Physics-Informed Neural Networks

    Authors: Ethan Mills, Alexey Pozdnyakov

    Abstract: Differential equations are used in a wide variety of disciplines, describing the complex behavior of the physical world. Analytic solutions to these equations are often difficult to solve for, limiting our current ability to solve complex differential equations and necessitating sophisticated numerical methods to approximate solutions. Trained neural networks act as universal function approximator…

    Submitted 7 August, 2022; originally announced August 2022.

    Comments: 26 pages, 11 figures

  3. arXiv:2204.07123  [pdf, other]

    cs.AI

    Retrospective on the 2021 BASALT Competition on Learning from Human Feedback

    Authors: Rohin Shah, Steven H. Wang, Cody Wild, Stephanie Milani, Anssi Kanervisto, Vinicius G. Goecks, Nicholas Waytowich, David Watkins-Valls, Bharat Prakash, Edmund Mills, Divyansh Garg, Alexander Fries, Alexandra Souly, Chan Jun Shern, Daniel del Castillo, Tom Lieberum

    Abstract: We held the first-ever MineRL Benchmark for Agents that Solve Almost-Lifelike Tasks (MineRL BASALT) Competition at the Thirty-fifth Conference on Neural Information Processing Systems (NeurIPS 2021). The goal of the competition was to promote research towards agents that use learning from human feedback (LfHF) techniques to solve open-world tasks. Rather than mandating the use of LfHF techniques,…

    Submitted 14 April, 2022; originally announced April 2022.

    Comments: Accepted to the PMLR NeurIPS 2021 Demo & Competition Track volume