Showing 1–2 of 2 results for author: Pan, A Y

  1. arXiv:2309.15840  [pdf, other]

    cs.CL cs.AI cs.LG

    How to Catch an AI Liar: Lie Detection in Black-Box LLMs by Asking Unrelated Questions

    Authors: Lorenzo Pacchiardi, Alex J. Chan, Sören Mindermann, Ilan Moscovitz, Alexa Y. Pan, Yarin Gal, Owain Evans, Jan Brauner

    Abstract: Large language models (LLMs) can "lie", which we define as outputting false statements despite "knowing" the truth in a demonstrable sense. LLMs might "lie", for example, when instructed to output misinformation. Here, we develop a simple lie detector that requires neither access to the LLM's activations (black-box) nor ground-truth knowledge of the fact in question. The detector works by asking a…

    Submitted 26 September, 2023; originally announced September 2023.

  2. Improved Sensitivity of the DRIFT-IId Directional Dark Matter Experiment using Machine Learning

    Authors: J. B. R. Battat, C. Eldridge, A. C. Ezeribe, O. P. Gaunt, J. -L. Gauvreau, R. R. Marcelo Gregorio, E. K. K. Habich, K. E. Hall, J. L. Harton, I. Ingabire, R. Lafler, D. Loomba, W. A. Lynch, S. M. Paling, A. Y. Pan, A. Scarff, F. G. Schuckman II, D. P. Snowden-Ifft, N. J. C. Spooner, C. Toth, A. A. Xu

    Abstract: We demonstrate a new type of analysis for the DRIFT-IId directional dark matter detector using a machine learning algorithm called a Random Forest Classifier. The analysis labels events as signal or background based on a series of selection parameters, rather than solely applying hard cuts. The analysis efficiency is shown to be comparable to our previous result at high energy but with increased e…

    Submitted 8 June, 2021; v1 submitted 11 March, 2021; originally announced March 2021.