Showing 1–8 of 8 results for author: Favaro, M

  1. arXiv:2401.05566  [pdf, other]

    cs.CR cs.AI cs.CL cs.LG cs.SE

    Sleeper Agents: Training Deceptive LLMs that Persist Through Safety Training

    Authors: Evan Hubinger, Carson Denison, Jesse Mu, Mike Lambert, Meg Tong, Monte MacDiarmid, Tamera Lanham, Daniel M. Ziegler, Tim Maxwell, Newton Cheng, Adam Jermyn, Amanda Askell, Ansh Radhakrishnan, Cem Anil, David Duvenaud, Deep Ganguli, Fazl Barez, Jack Clark, Kamal Ndousse, Kshitij Sachan, Michael Sellitto, Mrinank Sharma, Nova DasSarma, Roger Grosse, Shauna Kravec, et al. (14 additional authors not shown)

    Abstract: Humans are capable of strategically deceptive behavior: behaving helpfully in most situations, but then behaving very differently in order to pursue alternative objectives when given the opportunity. If an AI system learned such a deceptive strategy, could we detect it and remove it using current state-of-the-art safety training techniques? To study this question, we construct proof-of-concept exa…

    Submitted 17 January, 2024; v1 submitted 10 January, 2024; originally announced January 2024.

    Comments: updated to add missing acknowledgements

  2. arXiv:2308.00862  [pdf, ps, other]

    cs.CY

    Confidence-Building Measures for Artificial Intelligence: Workshop Proceedings

    Authors: Sarah Shoker, Andrew Reddie, Sarah Barrington, Ruby Booth, Miles Brundage, Husanjot Chahal, Michael Depp, Bill Drexel, Ritwik Gupta, Marina Favaro, Jake Hecla, Alan Hickey, Margarita Konaev, Kirthi Kumar, Nathan Lambert, Andrew Lohn, Cullen O'Keefe, Nazneen Rajani, Michael Sellitto, Robert Trager, Leah Walker, Alexa Wehsener, Jessica Young

    Abstract: Foundation models could eventually introduce several pathways for undermining state security: accidents, inadvertent escalation, unintentional conflict, the proliferation of weapons, and the interference with human diplomacy are just a few on a long list. The Confidence-Building Measures for Artificial Intelligence workshop hosted by the Geopolitics Team at OpenAI and the Berkeley Risk and Securit…

    Submitted 3 August, 2023; v1 submitted 1 August, 2023; originally announced August 2023.

  3. arXiv:2306.17682  [pdf]

    cs.CY

    ADS Standardization Landscape: Making Sense of its Status and of the Associated Research Questions

    Authors: Scott Schnelle, Francesca M. Favaro

    Abstract: Automated Driving Systems (ADS) hold great potential to increase safety, mobility, and equity. However, without public acceptance, none of these promises can be fulfilled. To engender public trust, many entities in the ADS community participate in standards development organizations (SDOs) with the goal of enhancing safety for the entire industry through a collaborative approach. The breadth and d…

    Submitted 30 June, 2023; originally announced June 2023.

    Comments: 13 pages, 1 figure

  4. arXiv:2306.14923  [pdf]

    cs.SE

    Interpreting Safety Outcomes: Waymo's Performance Evaluation in the Context of a Broader Determination of Safety Readiness

    Authors: Francesca M. Favaro, Trent Victor, Henning Hohnhold, Scott Schnelle

    Abstract: This paper frames recent publications from Waymo within the broader context of the safety readiness determination for an Automated Driving System (ADS). Starting from a brief overview of safety performance outcomes reported by Waymo (i.e., contact events experienced during fully autonomous operations), this paper highlights the need for a diversified approach to safety determination that complemen…

    Submitted 23 June, 2023; originally announced June 2023.

  5. arXiv:2104.08090  [pdf]

    physics.app-ph cond-mat.mtrl-sci

    Near ambient pressure photoelectron spectro-microscopy: from gas-solid interface to operando devices

    Authors: Matteo Amati, Luca Gregoratti, Patrick Zeller, Mark Greiner, Mattia Scardamaglia, Benjamin Junker, Tamara Ruß, Udo Weimar, Nicolae Barsan, Marco Favaro, Abdulaziz Alharbi, Ingvild J. T. Jensen, Ayaz Ali, Branson D. Belle

    Abstract: Near Ambient Pressure Scanning Photoelectron Microscopy adds to the widely used photoemission spectroscopy and its chemically selective capability two key features: (i) the possibility to chemically analyse samples in a more realistic environmental, gas pressure condition, and (ii) the capability to investigate a system at the relevant spatial scale. To achieve these goals the approach developed a…

    Submitted 16 April, 2021; originally announced April 2021.

    Journal ref: Journal of Physics D: Applied Physics (2021)

  6. arXiv:2012.10779  [pdf]

    physics.ins-det physics.chem-ph

    Soft X-ray spectroscopies in liquids and at solid-liquid interface at BACH beamline at Elettra

    Authors: Silvia Nappini, Luca D'Amario, Marco Favaro, Simone Dal Zilio, Federico Salvador, Erik Betz-Guttner, Andrea Fondacaro, Igor Pis, Luca Romanzin, Alessandro Gambitta, Federica Bondino, Marco Lazzarino, Elena Magnano

    Abstract: The Beamline for Advanced diCHroism (BACH) of the Istituto Officina dei Materiali-Consiglio Nazionale delle Ricerche (IOM-CNR), operating at Elettra synchrotron in Trieste (Italy), works in the extreme ultra violet (EUV)-soft X-ray photon energy range with selectable light polarization, high energy resolution, brilliance and time resolution. The beamline offers a multi-technique approach for the i…

    Submitted 19 December, 2020; originally announced December 2020.

    Comments: 30 pages, 14 figures, accepted by Review of Scientific Instruments

  7. arXiv:1909.01752  [pdf, other]

    cs.CR cs.SC

    SATURN -- Software Deobfuscation Framework Based on LLVM

    Authors: Peter Garba, Matteo Favaro

    Abstract: The strength of obfuscated software has increased over the recent years. Compiler based obfuscation has become the de facto standard in the industry and recent papers also show that injection of obfuscation techniques is done at the compiler level. In this paper we discuss a generic approach for deobfuscation and recompilation of obfuscated code based on the compiler framework LLVM. We show how bi…

    Submitted 5 September, 2019; v1 submitted 4 September, 2019; originally announced September 2019.

    Comments: reverse engineering, llvm, code lifting, obfuscation, deobfuscation, static software analysis, binary recompilation, binary rewriting

    Journal ref: 3rd International Workshop on Software PROtection, Nov 2019, London, United Kingdom

  8. arXiv:1612.05621  [pdf, other]

    physics.ins-det hep-ex hep-ph

    Intrinsic limits on resolutions in muon- and electron-neutrino charged-current events in the KM3NeT/ORCA detector

    Authors: S. Adrián-Martínez, M. Ageron, S. Aiello, A. Albert, F. Ameli, E. G. Anassontzis, M. Andre, G. Androulakis, M. Anghinolfi, G. Anton, M. Ardid, T. Avgitas, G. Barbarino, E. Barbarito, B. Baret, J. Barrios-Martí, A. Belias, E. Berbee, A. van den Berg, V. Bertin, S. Beurthey, V. van Beveren, N. Beverini, S. Biagi, A. Biagioni, et al. (228 additional authors not shown)

    Abstract: Studying atmospheric neutrino oscillations in the few-GeV range with a multimegaton detector promises to determine the neutrino mass hierarchy. This is the main science goal pursued by the future KM3NeT/ORCA water Cherenkov detector in the Mediterranean Sea. In this paper, the processes that limit the obtainable resolution in both energy and direction in charged-current neutrino events in the ORCA…

    Submitted 19 May, 2017; v1 submitted 29 November, 2016; originally announced December 2016.

    Comments: 37 pages, 28 figures, JHEP published version

    Journal ref: The KM3NeT collaboration, Adrián-Martínez, S., Ageron, M. et al. J. High Energ. Phys. (2017) 2017: 8