Search | arXiv e-print repository

Can large language models democratize access to dual-use biotechnology?

Authors: Emily H. Soice, Rafael Rocha, Kimberlee Cordova, Michael Specter, Kevin M. Esvelt

Abstract: Large language models (LLMs) such as those embedded in 'chatbots' are accelerating and democratizing research by providing comprehensible information and expertise from many different fields. However, these models may also confer easy access to dual-use technologies capable of inflicting great harm. To evaluate this risk, the 'Safeguarding the Future' course at MIT tasked non-scientist students wi… ▽ More Large language models (LLMs) such as those embedded in 'chatbots' are accelerating and democratizing research by providing comprehensible information and expertise from many different fields. However, these models may also confer easy access to dual-use technologies capable of inflicting great harm. To evaluate this risk, the 'Safeguarding the Future' course at MIT tasked non-scientist students with investigating whether LLM chatbots could be prompted to assist non-experts in causing a pandemic. In one hour, the chatbots suggested four potential pandemic pathogens, explained how they can be generated from synthetic DNA using reverse genetics, supplied the names of DNA synthesis companies unlikely to screen orders, identified detailed protocols and how to troubleshoot them, and recommended that anyone lacking the skills to perform reverse genetics engage a core facility or contract research organization. Collectively, these results suggest that LLMs will make pandemic-class agents widely accessible as soon as they are credibly identified, even to people with little or no laboratory training. Promising nonproliferation measures include pre-release evaluations of LLMs by third parties, curating training datasets to remove harmful concepts, and verifiably screening all DNA generated by synthesis providers or used by contract research organizations and robotic cloud laboratories to engineer organisms or viruses. △ Less

Submitted 6 June, 2023; originally announced June 2023.

Comments: 6 pages, 0 figures

arXiv:2304.02810 [pdf, other]

Robust, privacy-preserving, transparent, and auditable on-device blocklisting

Authors: Kurt Thomas, Sarah Meiklejohn, Michael A. Specter, Xiang Wang, Xavier Llorà, Stephan Somogyi, David Kleidermacher

Abstract: With the accelerated adoption of end-to-end encryption, there is an opportunity to re-architect security and anti-abuse primitives in a manner that preserves new privacy expectations. In this paper, we consider two novel protocols for on-device blocklisting that allow a client to determine whether an object (e.g., URL, document, image, etc.) is harmful based on threat information possessed by a so… ▽ More With the accelerated adoption of end-to-end encryption, there is an opportunity to re-architect security and anti-abuse primitives in a manner that preserves new privacy expectations. In this paper, we consider two novel protocols for on-device blocklisting that allow a client to determine whether an object (e.g., URL, document, image, etc.) is harmful based on threat information possessed by a so-called remote enforcer in a way that is both privacy-preserving and trustworthy. Our protocols leverage a unique combination of private set intersection to promote privacy, cryptographic hashes to ensure resilience to false positives, cryptographic signatures to improve transparency, and Merkle inclusion proofs to ensure consistency and auditability. We benchmark our protocols -- one that is time-efficient, and the other space-efficient -- to demonstrate their practical use for applications such as email, messaging, storage, and other applications. We also highlight remaining challenges, such as privacy and censorship tensions that exist with logging or reporting. We consider our work to be a critical first step towards enabling complex, multi-stakeholder discussions on how best to provide on-device protections. △ Less

Submitted 5 April, 2023; originally announced April 2023.

arXiv:2107.04940 [pdf, other]

You Really Shouldn't Roll Your Own Crypto: An Empirical Study of Vulnerabilities in Cryptographic Libraries

Authors: Jenny Blessing, Michael A. Specter, Daniel J. Weitzner

Abstract: The security of the Internet rests on a small number of open-source cryptographic libraries: a vulnerability in any one of them threatens to compromise a significant percentage of web traffic. Despite this potential for security impact, the characteristics and causes of vulnerabilities in cryptographic software are not well understood. In this work, we conduct the first comprehensive analysis of c… ▽ More The security of the Internet rests on a small number of open-source cryptographic libraries: a vulnerability in any one of them threatens to compromise a significant percentage of web traffic. Despite this potential for security impact, the characteristics and causes of vulnerabilities in cryptographic software are not well understood. In this work, we conduct the first comprehensive analysis of cryptographic libraries and the vulnerabilities affecting them. We collect data from the National Vulnerability Database, individual project repositories and mailing lists, and other relevant sources for eight widely used cryptographic libraries. Among our most interesting findings is that only 27.2% of vulnerabilities in cryptographic libraries are cryptographic issues while 37.2% of vulnerabilities are memory safety issues, indicating that systems-level bugs are a greater security concern than the actual cryptographic procedures. In our investigation of the causes of these vulnerabilities, we find evidence of a strong correlation between the complexity of these libraries and their (in)security, empirically demonstrating the potential risks of bloated cryptographic codebases. We further compare our findings with non-cryptographic systems, observing that these systems are, indeed, more complex than similar counterparts, and that this excess complexity appears to produce significantly more vulnerabilities in cryptographic libraries than in non-cryptographic software. △ Less

Submitted 10 July, 2021; originally announced July 2021.

arXiv:2012.04770 [pdf, other]

SonicPACT: An Ultrasonic Ranging Method for the Private Automated Contact Tracing (PACT) Protocol

Authors: John Meklenburg, Michael Specter, Michael Wentz, Hari Balakrishnan, Anantha Chandrakasan, John Cohn, Gary Hatke, Louise Ivers, Ronald Rivest, Gerald Jay Sussman, Daniel Weitzner

Abstract: Throughout the course of the COVID-19 pandemic, several countries have developed and released contact tracing and exposure notification smartphone applications (apps) to help slow the spread of the disease. To support such apps, Apple and Google have released Exposure Notification Application Programming Interfaces (APIs) to infer device (user) proximity using Bluetooth Low Energy (BLE) beacons. T… ▽ More Throughout the course of the COVID-19 pandemic, several countries have developed and released contact tracing and exposure notification smartphone applications (apps) to help slow the spread of the disease. To support such apps, Apple and Google have released Exposure Notification Application Programming Interfaces (APIs) to infer device (user) proximity using Bluetooth Low Energy (BLE) beacons. The Private Automated Contact Tracing (PACT) team has shown that accurately estimating the distance between devices using only BLE radio signals is challenging. This paper describes the design and implementation of the SonicPACT protocol to use near-ultrasonic signals on commodity iOS and Android smartphones to estimate distances using time-of-flight measurements. The protocol allows Android and iOS devices to interoperate, augmenting and improving the current exposure notification APIs. Our initial experimental results are promising, suggesting that SonicPACT should be considered for implementation by Apple and Google. △ Less

Submitted 8 December, 2020; originally announced December 2020.

arXiv:1904.06425 [pdf, other]

KeyForge: Mitigating Email Breaches with Forward-Forgeable Signatures

Authors: Michael Specter, Sunoo Park, Matthew Green

Abstract: Email breaches are commonplace, and they expose a wealth of personal, business, and political data that may have devastating consequences. The current email system allows any attacker who gains access to your email to prove the authenticity of the stolen messages to third parties -- a property arising from a necessary anti-spam / anti-spoofing protocol called DKIM. This exacerbates the problem of… ▽ More Email breaches are commonplace, and they expose a wealth of personal, business, and political data that may have devastating consequences. The current email system allows any attacker who gains access to your email to prove the authenticity of the stolen messages to third parties -- a property arising from a necessary anti-spam / anti-spoofing protocol called DKIM. This exacerbates the problem of email breaches by greatly increasing the potential for attackers to damage the users' reputation, blackmail them, or sell the stolen information to third parties. In this paper, we introduce "non-attributable email", which guarantees that a wide class of adversaries are unable to convince any third party of the authenticity of stolen emails. We formally define non-attributability, and present two practical system proposals -- KeyForge and TimeForge -- that provably achieve non-attributability while maintaining the important protection against spam and spoofing that is currently provided by DKIM. Moreover, we implement KeyForge and demonstrate that that scheme is practical, achieving competitive verification and signing speed while also requiring 42% less bandwidth per email than RSA2048. △ Less

Submitted 12 April, 2019; originally announced April 2019.

arXiv:1904.05572 [pdf, other]

doi 10.1145/3448609

The Android Platform Security Model (2023)

Authors: René Mayrhofer, Jeffrey Vander Stoep, Chad Brubaker, Dianne Hackborn, Bram Bonné, Güliz Seray Tuncay, Roger Piqueras Jover, Michael A. Specter

Abstract: Android is the most widely deployed end-user focused operating system. With its growing set of use cases encompassing communication, navigation, media consumption, entertainment, finance, health, and access to sensors, actuators, cameras, or microphones, its underlying security model needs to address a host of practical threats in a wide variety of scenarios while being useful to non-security expe… ▽ More Android is the most widely deployed end-user focused operating system. With its growing set of use cases encompassing communication, navigation, media consumption, entertainment, finance, health, and access to sensors, actuators, cameras, or microphones, its underlying security model needs to address a host of practical threats in a wide variety of scenarios while being useful to non-security experts. To support this flexibility, Android's security model must strike a difficult balance between security, privacy, and usability for end users; provide assurances for app developers; and maintain system performance under tight hardware constraints. This paper aims to both document the assumed threat model and discuss its implications, with a focus on the ecosystem context in which Android exists. We analyze how different security measures in past and current Android implementations work together to mitigate these threats, and, where there are special cases in applying the security model in practice; we discuss these deliberate deviations and examine their impact. △ Less

Submitted 8 January, 2024; v1 submitted 11 April, 2019; originally announced April 2019.

Journal ref: ACM Transactions on Privacy and Security, Volume 24, Issue 3, Article No. 19, 2021, pp 1-35

arXiv:1806.00069 [pdf, ps, other]

Explaining Explanations: An Overview of Interpretability of Machine Learning

Authors: Leilani H. Gilpin, David Bau, Ben Z. Yuan, Ayesha Bajwa, Michael Specter, Lalana Kagal

Abstract: There has recently been a surge of work in explanatory artificial intelligence (XAI). This research area tackles the important problem that complex machines and algorithms often cannot provide insights into their behavior and thought processes. XAI allows users and parts of the internal system to be more transparent, providing explanations of their decisions in some level of detail. These explanat… ▽ More There has recently been a surge of work in explanatory artificial intelligence (XAI). This research area tackles the important problem that complex machines and algorithms often cannot provide insights into their behavior and thought processes. XAI allows users and parts of the internal system to be more transparent, providing explanations of their decisions in some level of detail. These explanations are important to ensure algorithmic fairness, identify potential bias/problems in the training data, and to ensure that the algorithms perform as expected. However, explanations produced by these systems is neither standardized nor systematically assessed. In an effort to create best practices and identify open challenges, we provide our definition of explainability and show how it can be used to classify existing literature. We discuss why current approaches to explanatory methods especially for deep neural networks are insufficient. Finally, based on our survey, we conclude with suggested future research directions for explanatory artificial intelligence. △ Less

Submitted 3 February, 2019; v1 submitted 31 May, 2018; originally announced June 2018.

Comments: The 5th IEEE International Conference on Data Science and Advanced Analytics (DSAA 2018). [Research Track]

Showing 1–7 of 7 results for author: Specter, M