Showing 1–30 of 30 results for author: Farquhar, S

  1. arXiv:2406.18563  [pdf]

    cs.CY cs.AI cs.CV cs.LG

    Interdisciplinary Expertise to Advance Equitable Explainable AI

    Authors: Chloe R. Bennett, Heather Cole-Lewis, Stephanie Farquhar, Naama Haamel, Boris Babenko, Oran Lang, Mat Fleck, Ilana Traynis, Charles Lau, Ivor Horn, Courtney Lyles

    Abstract: The field of artificial intelligence (AI) is rapidly influencing health and healthcare, but bias and poor performance persists for populations who face widespread structural oppression. Previous work has clearly outlined the need for more rigorous attention to data representativeness and model performance to advance equity and reduce bias. However, there is an opportunity to also improve the expla…

    Submitted 29 May, 2024; originally announced June 2024.

  2. arXiv:2404.14068  [pdf, other]

    cs.AI cs.LG

    Holistic Safety and Responsibility Evaluations of Advanced AI Models

    Authors: Laura Weidinger, Joslyn Barnhart, Jenny Brennan, Christina Butterfield, Susie Young, Will Hawkins, Lisa Anne Hendricks, Ramona Comanescu, Oscar Chang, Mikel Rodriguez, Jennifer Beroshi, Dawn Bloxwich, Lev Proleev, Jilin Chen, Sebastian Farquhar, Lewis Ho, Iason Gabriel, Allan Dafoe, William Isaac

    Abstract: Safety and responsibility evaluations of advanced AI models are a critical but developing field of research and practice. In the development of Google DeepMind's advanced AI models, we innovated on and applied a broad set of approaches to safety evaluation. In this report, we summarise and share elements of our evolving approach as well as lessons learned for a broad audience. Key lessons learned…

    Submitted 22 April, 2024; originally announced April 2024.

    Comments: 10 pages excluding bibliography

  3. arXiv:2403.13793  [pdf, other]

    cs.LG

    Evaluating Frontier Models for Dangerous Capabilities

    Authors: Mary Phuong, Matthew Aitchison, Elliot Catt, Sarah Cogan, Alexandre Kaskasoli, Victoria Krakovna, David Lindner, Matthew Rahtz, Yannis Assael, Sarah Hodkinson, Heidi Howard, Tom Lieberum, Ramana Kumar, Maria Abi Raad, Albert Webson, Lewis Ho, Sharon Lin, Sebastian Farquhar, Marcus Hutter, Gregoire Deletang, Anian Ruoss, Seliem El-Sayed, Sasha Brown, Anca Dragan, Rohin Shah, et al. (2 additional authors not shown)

    Abstract: To understand the risks posed by a new AI system, we must understand what it can and cannot do. Building on prior work, we introduce a programme of new "dangerous capability" evaluations and pilot them on Gemini 1.0 models. Our evaluations cover four areas: (1) persuasion and deception; (2) cyber-security; (3) self-proliferation; and (4) self-reasoning. We do not find evidence of strong dangerous…

    Submitted 5 April, 2024; v1 submitted 20 March, 2024; originally announced March 2024.

  4. arXiv:2312.10029  [pdf, other]

    cs.LG cs.AI

    Challenges with unsupervised LLM knowledge discovery

    Authors: Sebastian Farquhar, Vikrant Varma, Zachary Kenton, Johannes Gasteiger, Vladimir Mikulik, Rohin Shah

    Abstract: We show that existing unsupervised methods on large language model (LLM) activations do not discover knowledge -- instead they seem to discover whatever feature of the activations is most prominent. The idea behind unsupervised knowledge elicitation is that knowledge satisfies a consistency structure, which can be used to discover knowledge. We first prove theoretically that arbitrary features (no…

    Submitted 18 December, 2023; v1 submitted 15 December, 2023; originally announced December 2023.

    Comments: 12 pages (38 including references and appendices). First three authors equal contribution, randomised order

  5. arXiv:2305.15324  [pdf, other]

    cs.AI

    Model evaluation for extreme risks

    Authors: Toby Shevlane, Sebastian Farquhar, Ben Garfinkel, Mary Phuong, Jess Whittlestone, Jade Leung, Daniel Kokotajlo, Nahema Marchal, Markus Anderljung, Noam Kolt, Lewis Ho, Divya Siddarth, Shahar Avin, Will Hawkins, Been Kim, Iason Gabriel, Vijay Bolina, Jack Clark, Yoshua Bengio, Paul Christiano, Allan Dafoe

    Abstract: Current approaches to building general-purpose AI systems tend to produce systems with both beneficial and harmful capabilities. Further progress in AI development could lead to capabilities that pose extreme risks, such as offensive cyber capabilities or strong manipulation skills. We explain why model evaluation is critical for addressing extreme risks. Developers must be able to identify danger…

    Submitted 22 September, 2023; v1 submitted 24 May, 2023; originally announced May 2023.

    Comments: Fixed typos; added citation

    ACM Class: K.4.1

  6. arXiv:2304.08151  [pdf, other]

    cs.LG stat.ML

    Prediction-Oriented Bayesian Active Learning

    Authors: Freddie Bickford Smith, Andreas Kirsch, Sebastian Farquhar, Yarin Gal, Adam Foster, Tom Rainforth

    Abstract: Information-theoretic approaches to active learning have traditionally focused on maximising the information gathered about the model parameters, most commonly by optimising the BALD score. We highlight that this can be suboptimal from the perspective of predictive performance. For example, BALD lacks a notion of an input distribution and so is prone to prioritise data of limited relevance. To add…

    Submitted 17 April, 2023; originally announced April 2023.

    Comments: Published at AISTATS 2023

  7. arXiv:2302.09664  [pdf, other]

    cs.CL cs.AI cs.LG

    Semantic Uncertainty: Linguistic Invariances for Uncertainty Estimation in Natural Language Generation

    Authors: Lorenz Kuhn, Yarin Gal, Sebastian Farquhar

    Abstract: We introduce a method to measure uncertainty in large language models. For tasks like question answering, it is essential to know when we can trust the natural language outputs of foundation models. We show that measuring uncertainty in natural language is challenging because of "semantic equivalence" -- different sentences can mean the same thing. To overcome these challenges we introduce semanti…

    Submitted 15 April, 2023; v1 submitted 19 February, 2023; originally announced February 2023.

    Comments: International Conference on Learning Representations 2023 (Spotlight)

  8. arXiv:2301.05062  [pdf, other]

    cs.LG cs.AI stat.ML

    Tracr: Compiled Transformers as a Laboratory for Interpretability

    Authors: David Lindner, János Kramár, Sebastian Farquhar, Matthew Rahtz, Thomas McGrath, Vladimir Mikulik

    Abstract: We show how to "compile" human-readable programs into standard decoder-only transformer models. Our compiler, Tracr, generates models with known structure. This structure can be used to design experiments. For example, we use it to study "superposition" in transformers that execute multi-step algorithms. Additionally, the known structure of Tracr-compiled models can serve as ground-truth for evalu…

    Submitted 3 November, 2023; v1 submitted 12 January, 2023; originally announced January 2023.

    Comments: Presented at NeurIPS 2023 (Spotlight)

  9. arXiv:2212.07769  [pdf, other]

    cs.CL cs.AI cs.LG

    CLAM: Selective Clarification for Ambiguous Questions with Generative Language Models

    Authors: Lorenz Kuhn, Yarin Gal, Sebastian Farquhar

    Abstract: Users often ask dialogue systems ambiguous questions that require clarification. We show that current language models rarely ask users to clarify ambiguous questions and instead provide incorrect answers. To address this, we introduce CLAM: a framework for getting language models to selectively ask for clarification about ambiguous user questions. In particular, we show that we can prompt language…

    Submitted 20 February, 2023; v1 submitted 15 December, 2022; originally announced December 2022.

  10. arXiv:2211.06291  [pdf, other]

    cs.LG cs.AI stat.ML

    Do Bayesian Neural Networks Need To Be Fully Stochastic?

    Authors: Mrinank Sharma, Sebastian Farquhar, Eric Nalisnick, Tom Rainforth

    Abstract: We investigate the benefit of treating all the parameters in a Bayesian neural network stochastically and find compelling theoretical and empirical evidence that this standard construction may be unnecessary. To this end, we prove that expressive predictive distributions require only small amounts of stochasticity. In particular, partially stochastic networks with only $n$ stochastic biases are un…

    Submitted 20 February, 2023; v1 submitted 11 November, 2022; originally announced November 2022.

    Comments: Published at AISTATS 2023 (Oral)

  11. arXiv:2211.06139  [pdf, other]

    stat.ML cs.LG

    Understanding Approximation for Bayesian Inference in Neural Networks

    Authors: Sebastian Farquhar

    Abstract: Bayesian inference has theoretical attractions as a principled framework for reasoning about beliefs. However, the motivations of Bayesian inference which claim it to be the only 'rational' kind of reasoning do not apply in practice. They create a binary split in which all approximate inference is equally 'irrational'. Instead, we should ask ourselves how to define a spectrum of more- and less-rat…

    Submitted 11 November, 2022; originally announced November 2022.

    Comments: Accepted as a thesis satisfying the requirements of a D.Phil at the University of Oxford

  12. arXiv:2208.08345  [pdf, other]

    cs.AI cs.LG

    Discovering Agents

    Authors: Zachary Kenton, Ramana Kumar, Sebastian Farquhar, Jonathan Richens, Matt MacDermott, Tom Everitt

    Abstract: Causal models of agents have been used to analyse the safety aspects of machine learning systems. But identifying agents is non-trivial -- often the causal model is just assumed by the modeler without much justification -- and modelling failures can lead to mistakes in the safety analysis. This paper proposes the first formal causal definition of agents -- roughly that agents are systems that woul…

    Submitted 24 August, 2022; v1 submitted 17 August, 2022; originally announced August 2022.

    Comments: Some typos corrected

  13. arXiv:2206.07137  [pdf, other]

    cs.LG cs.AI cs.CL cs.CV

    Prioritized Training on Points that are Learnable, Worth Learning, and Not Yet Learnt

    Authors: Sören Mindermann, Jan Brauner, Muhammed Razzak, Mrinank Sharma, Andreas Kirsch, Winnie Xu, Benedikt Höltgen, Aidan N. Gomez, Adrien Morisot, Sebastian Farquhar, Yarin Gal

    Abstract: Training on web-scale data can take months. But most computation and time is wasted on redundant and noisy points that are already learnt or not learnable. To accelerate training, we introduce Reducible Holdout Loss Selection (RHO-LOSS), a simple but principled technique which selects approximately those points for training that most reduce the model's generalization loss. As a result, RHO-LOSS mi…

    Submitted 26 September, 2022; v1 submitted 14 June, 2022; originally announced June 2022.

    Comments: ICML 2022

  14. arXiv:2204.10018  [pdf, other]

    cs.AI stat.ML

    Path-Specific Objectives for Safer Agent Incentives

    Authors: Sebastian Farquhar, Ryan Carey, Tom Everitt

    Abstract: We present a general framework for training safe agents whose naive incentives are unsafe. As an example, manipulative or deceptive behaviour can improve rewards but should be avoided. Most approaches fail here: agents maximize expected return by any means necessary. We formally describe settings with 'delicate' parts of the state which should not be used as a means to an end. We then train agents…

    Submitted 21 April, 2022; originally announced April 2022.

    Comments: Presented at AAAI 2022

  15. arXiv:2202.08132  [pdf, other]

    cs.LG

    Prospect Pruning: Finding Trainable Weights at Initialization using Meta-Gradients

    Authors: Milad Alizadeh, Shyam A. Tailor, Luisa M Zintgraf, Joost van Amersfoort, Sebastian Farquhar, Nicholas Donald Lane, Yarin Gal

    Abstract: Pruning neural networks at initialization would enable us to find sparse models that retain the accuracy of the original network while consuming fewer computational resources for training and inference. However, current methods are insufficient to enable this optimization and lead to a large degradation in model performance. In this paper, we identify a fundamental limitation in the formulation of…

    Submitted 5 April, 2022; v1 submitted 16 February, 2022; originally announced February 2022.

  16. arXiv:2202.06881  [pdf, other]

    cs.LG stat.ML

    Active Surrogate Estimators: An Active Learning Approach to Label-Efficient Model Evaluation

    Authors: Jannik Kossen, Sebastian Farquhar, Yarin Gal, Tom Rainforth

    Abstract: We propose Active Surrogate Estimators (ASEs), a new method for label-efficient model evaluation. Evaluating model performance is a challenging and important problem when labels are expensive. ASEs address this active testing problem using a surrogate-based estimation approach that interpolates the errors of points with unknown labels, rather than forming a Monte Carlo estimator. ASEs actively lea…

    Submitted 18 October, 2022; v1 submitted 14 February, 2022; originally announced February 2022.

    Comments: Accepted for publication at NeurIPS 2022

  17. arXiv:2107.02565  [pdf, other]

    cs.LG cs.IT

    Prioritized training on points that are learnable, worth learning, and not yet learned (workshop version)

    Authors: Sören Mindermann, Muhammed Razzak, Winnie Xu, Andreas Kirsch, Mrinank Sharma, Adrien Morisot, Aidan N. Gomez, Sebastian Farquhar, Jan Brauner, Yarin Gal

    Abstract: We introduce Goldilocks Selection, a technique for faster model training which selects a sequence of training points that are "just right". We propose an information-theoretic acquisition function -- the reducible validation loss -- and compute it with a small proxy model -- GoldiProx -- to efficiently choose training points that maximize information about a validation set. We show that the "hard"…

    Submitted 17 October, 2023; v1 submitted 6 July, 2021; originally announced July 2021.

    Journal ref: ICML 2021 Workshop on Subset Selection in Machine Learning

  18. arXiv:2106.12059  [pdf, other]

    cs.LG stat.ML

    Stochastic Batch Acquisition: A Simple Baseline for Deep Active Learning

    Authors: Andreas Kirsch, Sebastian Farquhar, Parmida Atighehchian, Andrew Jesson, Frederic Branchaud-Charron, Yarin Gal

    Abstract: We examine a simple stochastic strategy for adapting well-known single-point acquisition functions to allow batch active learning. Unlike acquiring the top-K points from the pool set, score- or rank-based sampling takes into account that acquisition scores change as new data are acquired. This simple strategy for adapting standard single-sample acquisition strategies can even perform just as well…

    Submitted 19 September, 2023; v1 submitted 22 June, 2021; originally announced June 2021.

    Comments: TMLR Paper: https://openreview.net/forum?id=vcHwQyNBjW

  19. arXiv:2106.04015  [pdf, other]

    cs.LG

    Uncertainty Baselines: Benchmarks for Uncertainty & Robustness in Deep Learning

    Authors: Zachary Nado, Neil Band, Mark Collier, Josip Djolonga, Michael W. Dusenberry, Sebastian Farquhar, Qixuan Feng, Angelos Filos, Marton Havasi, Rodolphe Jenatton, Ghassen Jerfel, Jeremiah Liu, Zelda Mariet, Jeremy Nixon, Shreyas Padhy, Jie Ren, Tim G. J. Rudner, Faris Sbahi, Yeming Wen, Florian Wenzel, Kevin Murphy, D. Sculley, Balaji Lakshminarayanan, Jasper Snoek, Yarin Gal, et al. (1 additional author not shown)

    Abstract: High-quality estimates of uncertainty and robustness are crucial for numerous real-world applications, especially for deep learning which underlies many deployed ML systems. The ability to compare techniques for improving these estimates is therefore very important for research and practice alike. Yet, competitive comparisons of methods are often lacking due to a range of reasons, including: compu…

    Submitted 5 January, 2022; v1 submitted 7 June, 2021; originally announced June 2021.

  20. arXiv:2103.05331  [pdf, other]

    stat.ML cs.LG

    Active Testing: Sample-Efficient Model Evaluation

    Authors: Jannik Kossen, Sebastian Farquhar, Yarin Gal, Tom Rainforth

    Abstract: We introduce a new framework for sample-efficient model evaluation that we call active testing. While approaches like active learning reduce the number of labels needed for model training, existing literature largely ignores the cost of labeling test data, typically unrealistically assuming large test sets for model evaluation. This creates a disconnect to real applications, where test labels are…

    Submitted 14 June, 2021; v1 submitted 9 March, 2021; originally announced March 2021.

    Comments: Published at the 38th International Conference on Machine Learning (ICML 2021)

  21. arXiv:2102.03276  [pdf]

    q-bio.PE physics.bio-ph

    Does Non-Genetic Heterogeneity Facilitate the Development of Genetic Drug Resistance?

    Authors: Kevin S. Farquhar, Samira Rasouli Koohi, Daniel A. Charlebois

    Abstract: Non-genetic forms of antimicrobial drug resistance can result from cell-to-cell variability that is not encoded in the genetic material. Data from recent studies also suggest that non-genetic mechanisms can facilitate the development of genetic drug resistance. In this Perspective article, we speculate on how the interplay between non-genetic and genetic mechanisms may affect microbial adaptation…

    Submitted 5 February, 2021; originally announced February 2021.

    Comments: 11 pages, 2 figures

    Journal ref: BioEssays, 43: e2100043 (2021)

  22. arXiv:2101.11665  [pdf, other]

    stat.ML cs.LG

    On Statistical Bias In Active Learning: How and When To Fix It

    Authors: Sebastian Farquhar, Yarin Gal, Tom Rainforth

    Abstract: Active learning is a powerful tool when labelling data is expensive, but it introduces a bias because the training data no longer follows the population distribution. We formalize this bias and investigate the situations in which it can be harmful and sometimes even helpful. We further introduce novel corrective weights to remove bias when doing so is beneficial. Through this, our work not only pr…

    Submitted 31 May, 2021; v1 submitted 27 January, 2021; originally announced January 2021.

    Comments: Published at ICLR 2021 (Spotlight)

  23. arXiv:2007.00389  [pdf, other]

    cs.LG stat.ML

    Single Shot Structured Pruning Before Training

    Authors: Joost van Amersfoort, Milad Alizadeh, Sebastian Farquhar, Nicholas Lane, Yarin Gal

    Abstract: We introduce a method to speed up training by 2x and inference by 3x in deep neural networks using structured pruning applied before training. Unlike previous works on pruning before training which prune individual weights, our work develops a methodology to remove entire channels and hidden units with the explicit aim of speeding up training and inference. We introduce a compute-aware scoring mec…

    Submitted 1 July, 2020; originally announced July 2020.

  24. arXiv:2002.03704  [pdf, other]

    cs.LG stat.ML

    Liberty or Depth: Deep Bayesian Neural Nets Do Not Need Complex Weight Posterior Approximations

    Authors: Sebastian Farquhar, Lewis Smith, Yarin Gal

    Abstract: We challenge the longstanding assumption that the mean-field approximation for variational inference in Bayesian neural networks is severely restrictive, and show this is not the case in deep networks. We prove several results indicating that deep mean-field variational weight posteriors can induce similar distributions in function-space to those induced by shallower networks with complex weight p…

    Submitted 10 March, 2021; v1 submitted 10 February, 2020; originally announced February 2020.

    Comments: Advances In Neural Information Processing Systems. 2020

  25. arXiv:1912.10481  [pdf, other]

    stat.ML cs.LG eess.IV

    A Systematic Comparison of Bayesian Deep Learning Robustness in Diabetic Retinopathy Tasks

    Authors: Angelos Filos, Sebastian Farquhar, Aidan N. Gomez, Tim G. J. Rudner, Zachary Kenton, Lewis Smith, Milad Alizadeh, Arnoud de Kroon, Yarin Gal

    Abstract: Evaluation of Bayesian deep learning (BDL) methods is challenging. We often seek to evaluate the methods' robustness and scalability, assessing whether new tools give 'better' uncertainty estimates than old ones. These evaluations are paramount for practitioners when choosing BDL tools on-top of which they build their applications. Current popular evaluations of BDL methods, such as the UCI experi…

    Submitted 22 December, 2019; originally announced December 2019.

  26. arXiv:1907.00865  [pdf, other]

    stat.ML cs.LG

    Radial Bayesian Neural Networks: Beyond Discrete Support In Large-Scale Bayesian Deep Learning

    Authors: Sebastian Farquhar, Michael Osborne, Yarin Gal

    Abstract: We propose Radial Bayesian Neural Networks (BNNs): a variational approximate posterior for BNNs which scales well to large models while maintaining a distribution over weight-space with full support. Other scalable Bayesian deep learning methods, like MC dropout or deep ensembles, have discrete support -- they assign zero probability to almost all of the weight-space. Unlike these discrete support me…

    Submitted 31 May, 2021; v1 submitted 1 July, 2019; originally announced July 2019.

    Journal ref: AI Stats, PMLR 108:1352-1362, 2020

  27. arXiv:1902.06497  [pdf, other]

    stat.ML cs.LG

    Differentially Private Continual Learning

    Authors: Sebastian Farquhar, Yarin Gal

    Abstract: Catastrophic forgetting can be a significant problem for institutions that must delete historic data for privacy reasons. For example, hospitals might not be able to retain patient data permanently. But neural networks trained on recent data alone will tend to forget lessons learned on old data. We present a differentially private continual learning framework based on variational inference. We est…

    Submitted 18 February, 2019; originally announced February 2019.

    Comments: Presented at the Privacy in Machine Learning and AI workshop at ICML 2018

  28. arXiv:1902.06494  [pdf, other]

    stat.ML cs.LG

    A Unifying Bayesian View of Continual Learning

    Authors: Sebastian Farquhar, Yarin Gal

    Abstract: Some machine learning applications require continual learning - where data comes in a sequence of datasets, each is used for training and then permanently discarded. From a Bayesian perspective, continual learning seems straightforward: Given the model posterior one would simply use this as the prior for the next task. However, exact posterior evaluation is intractable with many models, especially…

    Submitted 18 February, 2019; originally announced February 2019.

    Comments: Presented at the Bayesian Deep Learning Workshop at Neural Information Processing Systems December 2018

  29. arXiv:1805.09733  [pdf, other]

    stat.ML cs.LG

    Towards Robust Evaluations of Continual Learning

    Authors: Sebastian Farquhar, Yarin Gal

    Abstract: Experiments used in current continual learning research do not faithfully assess fundamental challenges of learning continually. Instead of assessing performance on challenging and representative experiment designs, recent research has focused on increased dataset difficulty, while still using flawed experiment set-ups. We examine standard evaluations and show why these evaluations make some conti…

    Submitted 26 June, 2019; v1 submitted 24 May, 2018; originally announced May 2018.

  30. arXiv:1802.07228  [pdf]

    cs.AI cs.CR cs.CY

    The Malicious Use of Artificial Intelligence: Forecasting, Prevention, and Mitigation

    Authors: Miles Brundage, Shahar Avin, Jack Clark, Helen Toner, Peter Eckersley, Ben Garfinkel, Allan Dafoe, Paul Scharre, Thomas Zeitzoff, Bobby Filar, Hyrum Anderson, Heather Roff, Gregory C. Allen, Jacob Steinhardt, Carrick Flynn, Seán Ó hÉigeartaigh, Simon Beard, Haydn Belfield, Sebastian Farquhar, Clare Lyle, Rebecca Crootof, Owain Evans, Michael Page, Joanna Bryson, Roman Yampolskiy, et al. (1 additional author not shown)

    Abstract: This report surveys the landscape of potential security threats from malicious uses of AI, and proposes ways to better forecast, prevent, and mitigate these threats. After analyzing the ways in which AI may influence the threat landscape in the digital, physical, and political domains, we make four high-level recommendations for AI researchers and other stakeholders. We also suggest several promis…

    Submitted 20 February, 2018; originally announced February 2018.