-
Phantom: General Trigger Attacks on Retrieval Augmented Language Generation
Authors:
Harsh Chaudhari,
Giorgio Severi,
John Abascal,
Matthew Jagielski,
Christopher A. Choquette-Choo,
Milad Nasr,
Cristina Nita-Rotaru,
Alina Oprea
Abstract:
Retrieval Augmented Generation (RAG) expands the capabilities of modern large language models (LLMs) in chatbot applications, enabling developers to adapt and personalize the LLM output without expensive training or fine-tuning. RAG systems use an external knowledge database to retrieve the most relevant documents for a given query, providing this context to the LLM generator. While RAG achieves impressive utility in many applications, its adoption to enable personalized generative models introduces new security risks. In this work, we propose new attack surfaces for an adversary to compromise a victim's RAG system, by injecting a single malicious document in its knowledge database. We design Phantom, a general two-step attack framework against RAG-augmented LLMs. The first step involves crafting a poisoned document designed to be retrieved by the RAG system within the top-k results only when an adversarial trigger, a specific sequence of words acting as a backdoor, is present in the victim's queries. In the second step, a specially crafted adversarial string within the poisoned document triggers various adversarial attacks in the LLM generator, including denial of service, reputation damage, privacy violations, and harmful behaviors. We demonstrate our attacks on multiple LLM architectures, including Gemma, Vicuna, and Llama.
Submitted 30 May, 2024;
originally announced May 2024.
-
Can a Multichoice Dataset be Repurposed for Extractive Question Answering?
Authors:
Teresa Lynn,
Malik H. Altakrori,
Samar Mohamed Magdy,
Rocktim Jyoti Das,
Chenyang Lyu,
Mohamed Nasr,
Younes Samih,
Alham Fikri Aji,
Preslav Nakov,
Shantanu Godbole,
Salim Roukos,
Radu Florian,
Nizar Habash
Abstract:
The rapid evolution of Natural Language Processing (NLP) has favored major languages such as English, leaving a significant gap for many others due to limited resources. This is especially evident in the context of data annotation, a task whose importance should not be underestimated, but which is time-consuming and costly. Thus, any dataset for resource-poor languages is precious, in particular when it is task-specific. Here, we explore the feasibility of repurposing existing datasets for a new NLP task: we repurposed the Belebele dataset (Bandarkar et al., 2023), which was designed for multiple-choice question answering (MCQA), to enable extractive QA (EQA) in the style of machine reading comprehension. We present annotation guidelines and a parallel EQA dataset for English and Modern Standard Arabic (MSA). We also present QA evaluation results for several monolingual and cross-lingual QA pairs including English, MSA, and five Arabic dialects. Our aim is to enable others to adapt our approach for the 120+ other language variants in Belebele, many of which are deemed under-resourced. We also conduct a thorough analysis and share our insights from the process, which we hope will contribute to a deeper understanding of the challenges and the opportunities associated with task reformulation in NLP research.
Submitted 26 April, 2024;
originally announced April 2024.
-
Stealing Part of a Production Language Model
Authors:
Nicholas Carlini,
Daniel Paleka,
Krishnamurthy Dj Dvijotham,
Thomas Steinke,
Jonathan Hayase,
A. Feder Cooper,
Katherine Lee,
Matthew Jagielski,
Milad Nasr,
Arthur Conmy,
Eric Wallace,
David Rolnick,
Florian Tramèr
Abstract:
We introduce the first model-stealing attack that extracts precise, nontrivial information from black-box production language models like OpenAI's ChatGPT or Google's PaLM-2. Specifically, our attack recovers the embedding projection layer (up to symmetries) of a transformer model, given typical API access. For under 20 USD, our attack extracts the entire projection matrix of OpenAI's Ada and Babbage language models. We thereby confirm, for the first time, that these black-box models have a hidden dimension of 1024 and 2048, respectively. We also recover the exact hidden dimension size of the gpt-3.5-turbo model, and estimate it would cost under 2,000 USD in queries to recover the entire projection matrix. We conclude with potential defenses and mitigations, and discuss the implications of possible future work that could extend our attack.
Submitted 11 March, 2024;
originally announced March 2024.
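The abstract of "Stealing Part of a Production Language Model" does not describe the mechanism, so the following is only a minimal sketch of one linear-algebra fact that can make a hidden dimension observable in principle. It assumes (this is an assumption, not a claim from the listing) that final logits are a linear map `W` applied to a hidden vector, so that logit vectors collected over many queries form a matrix whose rank is at most the hidden dimension. The toy matrices `W` and `H` and the helper `matrix_rank` are hypothetical; a real API returns noisy, restricted outputs that this sketch ignores.

```python
from fractions import Fraction


def matrix_rank(rows):
    """Exact rank via Gaussian elimination over the rationals."""
    m = [[Fraction(v) for v in row] for row in rows]
    rank = 0
    for col in range(len(m[0])):
        # Find a row at or below the current rank with a nonzero entry in this column.
        pivot = next((r for r in range(rank, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        for r in range(rank + 1, len(m)):
            factor = m[r][col] / m[rank][col]
            m[r] = [a - factor * b for a, b in zip(m[r], m[rank])]
        rank += 1
        if rank == len(m):
            break
    return rank


# Toy setup (all sizes are illustrative assumptions).
hidden, vocab, n_queries = 3, 6, 5
# W: vocab x hidden projection; H: hidden x n_queries hidden states.
W = [[i ** d for d in range(hidden)] for i in range(1, vocab + 1)]
H = [[j ** d for j in range(1, n_queries + 1)] for d in range(hidden)]
# Each column of `logits` is what one query would expose: W @ h.
logits = [
    [sum(W[v][d] * H[d][q] for d in range(hidden)) for q in range(n_queries)]
    for v in range(vocab)
]

recovered_dim = matrix_rank(logits)
```

Although `logits` is a 6 x 5 matrix, its rank equals the toy hidden size of 3, which is the quantity an observer of the outputs could read off.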
-
Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context
Authors:
Gemini Team,
Petko Georgiev,
Ving Ian Lei,
Ryan Burnell,
Libin Bai,
Anmol Gulati,
Garrett Tanzer,
Damien Vincent,
Zhufeng Pan,
Shibo Wang,
Soroosh Mariooryad,
Yifan Ding,
Xinyang Geng,
Fred Alcober,
Roy Frostig,
Mark Omernick,
Lexi Walker,
Cosmin Paduraru,
Christina Sorokin,
Andrea Tacchetti,
Colin Gaffney,
Samira Daruki,
Olcan Sercinoglu,
Zach Gleicher,
Juliette Love
, et al. (1092 additional authors not shown)
Abstract:
In this report, we introduce the Gemini 1.5 family of models, representing the next generation of highly compute-efficient multimodal models capable of recalling and reasoning over fine-grained information from millions of tokens of context, including multiple long documents and hours of video and audio. The family includes two new models: (1) an updated Gemini 1.5 Pro, which exceeds the February version on the great majority of capabilities and benchmarks; (2) Gemini 1.5 Flash, a more lightweight variant designed for efficiency with minimal regression in quality. Gemini 1.5 models achieve near-perfect recall on long-context retrieval tasks across modalities, improve the state-of-the-art in long-document QA, long-video QA and long-context ASR, and match or surpass Gemini 1.0 Ultra's state-of-the-art performance across a broad set of benchmarks. Studying the limits of Gemini 1.5's long-context ability, we find continued improvement in next-token prediction and near-perfect retrieval (>99%) up to at least 10M tokens, a generational leap over existing models such as Claude 3.0 (200k) and GPT-4 Turbo (128k). Finally, we highlight real-world use cases, such as Gemini 1.5 collaborating with professionals on completing their tasks achieving 26 to 75% time savings across 10 different job categories, as well as surprising new capabilities of large language models at the frontier; when given a grammar manual for Kalamang, a language with fewer than 200 speakers worldwide, the model learns to translate English to Kalamang at a similar level to a person who learned from the same content.
Submitted 14 June, 2024; v1 submitted 8 March, 2024;
originally announced March 2024.
-
Query-Based Adversarial Prompt Generation
Authors:
Jonathan Hayase,
Ema Borevkovic,
Nicholas Carlini,
Florian Tramèr,
Milad Nasr
Abstract:
Recent work has shown it is possible to construct adversarial examples that cause an aligned language model to emit harmful strings or perform harmful behavior. Existing attacks work either in the white-box setting (with full access to the model weights), or through transferability: the phenomenon that adversarial examples crafted on one model often remain effective on other models. We improve on prior work with a query-based attack that leverages API access to a remote language model to construct adversarial examples that cause the model to emit harmful strings with (much) higher probability than with transfer-only attacks. We validate our attack on GPT-3.5 and OpenAI's safety classifier; we can cause GPT-3.5 to emit harmful strings that current transfer attacks fail at, and we can evade the safety classifier with nearly 100% probability.
Submitted 19 February, 2024;
originally announced February 2024.
-
Auditing Private Prediction
Authors:
Karan Chadha,
Matthew Jagielski,
Nicolas Papernot,
Christopher Choquette-Choo,
Milad Nasr
Abstract:
Differential privacy (DP) offers a theoretical upper bound on the potential privacy leakage of an algorithm, while empirical auditing establishes a practical lower bound. Auditing techniques exist for DP training algorithms. However, machine learning can also be made private at inference. We propose the first framework for auditing private prediction where we instantiate adversaries with varying poisoning and query capabilities. This enables us to study the privacy leakage of four private prediction algorithms: PATE [Papernot et al., 2016], CaPC [Choquette-Choo et al., 2020], PromptPATE [Duan et al., 2023], and Private-kNN [Zhu et al., 2020]. To conduct our audit, we introduce novel techniques to empirically evaluate privacy leakage in terms of Renyi DP. Our experiments show that (i) the privacy analysis of private prediction can be improved, (ii) algorithms which are easier to poison lead to much higher privacy leakage, and (iii) the privacy leakage is significantly lower for adversaries without query control than those with full control.
Submitted 14 February, 2024;
originally announced February 2024.
-
Private Fine-tuning of Large Language Models with Zeroth-order Optimization
Authors:
Xinyu Tang,
Ashwinee Panda,
Milad Nasr,
Saeed Mahloujifar,
Prateek Mittal
Abstract:
Fine-tuning large pretrained models on private datasets may run the risk of violating privacy. Differential privacy is a framework for mitigating privacy risks by enforcing algorithmic stability. DP-SGD enables training models with private data in a privacy-preserving manner, but raises new obstacles in the form of performance loss and significant engineering challenges. We introduce DP-ZO, a new method for fine-tuning large language models that preserves the privacy of training data by privatizing zeroth-order optimization. A key insight into the design of our method is that the direction of the gradient in SPSA, the zeroth-order algorithm we use, is always random and the only information that depends on private data is the step size, i.e., a scalar. Therefore, we only need to privatize the scalar step size, which is memory-efficient. DP-ZO, which can be instantiated with either Laplace or Gaussian noise, provides a strong privacy-utility trade-off across different tasks and model sizes under conservative privacy budgets. One noteworthy result is that DP-ZO exhibits just $1.86\%$ performance degradation due to privacy at $(1,10^{-5})$-DP when fine-tuning OPT-66B on 1000 training samples from SQuAD.
Submitted 8 January, 2024;
originally announced January 2024.
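The key idea in the DP-ZO abstract above (a random, data-independent direction plus a single data-dependent scalar that gets privatized) can be sketched roughly as follows. This is an illustrative toy, not the authors' implementation: the function name `dp_zo_step`, the per-example clipping, and all default values are assumptions, and `sigma` is not calibrated to any particular privacy budget.

```python
import random


def dp_zo_step(params, examples, loss_fn, lr=0.1, eps=1e-3,
               clip=1.0, sigma=1.0, rng=None):
    """One SPSA-style step in which only a scalar is noised (illustrative)."""
    rng = rng or random.Random(0)
    # Random direction: drawn without looking at the private data.
    z = [rng.gauss(0.0, 1.0) for _ in params]
    plus = [p + eps * zi for p, zi in zip(params, z)]
    minus = [p - eps * zi for p, zi in zip(params, z)]
    # Per-example finite differences, clipped to bound each example's influence
    # (clipping is an assumption borrowed from standard DP practice).
    total = 0.0
    for ex in examples:
        diff = (loss_fn(plus, ex) - loss_fn(minus, ex)) / (2 * eps)
        total += max(-clip, min(clip, diff))
    # Gaussian noise is added to the single aggregated scalar only.
    noisy_scalar = (total + rng.gauss(0.0, sigma * clip)) / len(examples)
    # Move along the public random direction, scaled by the noisy scalar.
    return [p - lr * noisy_scalar * zi for p, zi in zip(params, z)]
```

Because the direction `z` never touches the data, the noise cost does not grow with the number of parameters, which is the memory-efficiency point the abstract makes.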
-
SpatialVisVR: An Immersive, Multiplexed Medical Image Viewer With Contextual Similar-Patient Search
Authors:
Jai Prakash Veerla,
Partha Sai Guttikonda,
Amir Hajighasemi,
Jillur Rahman Saurav,
Aarti Darji,
Cody T. Reynolds,
Mohamed Mohamed,
Mohammad S. Nasr,
Helen H. Shang,
Jacob M. Luber
Abstract:
In contemporary pathology, multiplexed immunofluorescence (mIF) and multiplex immunohistochemistry (mIHC) present both significant opportunities and challenges. These methodologies shed light on intricate tumor microenvironment interactions, emphasizing the need for intuitive visualization tools to analyze vast biological datasets effectively. As electronic health records (EHR) proliferate and physicians face increasing information overload, the integration of advanced technologies becomes imperative. SpatialVisVR emerges as a versatile VR platform tailored for comparing medical images, with adaptability for data privacy on embedded hardware. Clinicians can capture pathology slides in real-time via mobile devices, leveraging SpatialVisVR's deep learning algorithm to match and display similar mIF images. This interface supports the manipulation of up to 100 multiplexed protein channels, thereby assisting in immuno-oncology decision-making. Ultimately, SpatialVisVR aims to streamline diagnostic processes, advocating for a comprehensive and efficient approach to immuno-oncology research and treatment.
Submitted 11 May, 2024; v1 submitted 5 January, 2024;
originally announced January 2024.
-
Demonstration of an Adversarial Attack Against a Multimodal Vision Language Model for Pathology Imaging
Authors:
Poojitha Thota,
Jai Prakash Veerla,
Partha Sai Guttikonda,
Mohammad S. Nasr,
Shirin Nilizadeh,
Jacob M. Luber
Abstract:
In the context of medical artificial intelligence, this study explores the vulnerabilities of the Pathology Language-Image Pretraining (PLIP) model, a Vision Language Foundation model, under targeted attacks. Leveraging the Kather Colon dataset with 7,180 H&E images across nine tissue types, our investigation employs Projected Gradient Descent (PGD) adversarial perturbation attacks to induce misclassifications intentionally. The outcomes reveal a 100% success rate in manipulating PLIP's predictions, underscoring its susceptibility to adversarial perturbations. The qualitative analysis of adversarial examples delves into the interpretability challenges, shedding light on nuanced changes in predictions induced by adversarial manipulations. These findings contribute crucial insights into the interpretability, domain adaptation, and trustworthiness of Vision Language Models in medical imaging. The study emphasizes the pressing need for robust defenses to ensure the reliability of AI models. The source code for this experiment can be found at https://github.com/jaiprakash1824/VLM_Adv_Attack.
Submitted 7 May, 2024; v1 submitted 4 January, 2024;
originally announced January 2024.
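For readers unfamiliar with Projected Gradient Descent, the general technique named in the abstract above can be sketched on a toy model. This is not the study's code: it uses a two-feature logistic classifier instead of PLIP, an untargeted rather than targeted objective, and hypothetical names and step sizes (`pgd_linf`, `eps`, `alpha`).

```python
import math


def sign(v):
    """Return -1.0, 0.0, or 1.0 depending on the sign of v."""
    return (v > 0) - (v < 0) + 0.0


def pgd_linf(x, y, w, b, eps=0.3, alpha=0.1, steps=10):
    """L-infinity PGD against p = sigmoid(w.x + b), with label y in {0, 1}."""
    x_adv = list(x)
    for _ in range(steps):
        z = sum(wi * xi for wi, xi in zip(w, x_adv)) + b
        p = 1.0 / (1.0 + math.exp(-z))
        # Gradient of the cross-entropy loss with respect to the input.
        grad = [(p - y) * wi for wi in w]
        # Ascent step along the gradient sign to increase the loss.
        x_adv = [xa + alpha * sign(g) for xa, g in zip(x_adv, grad)]
        # Projection back onto the eps-ball around the clean input.
        x_adv = [min(xi + eps, max(xi - eps, xa)) for xa, xi in zip(x_adv, x)]
    return x_adv


# Toy usage: a clean point classified as 1 is pushed across the boundary.
w, b = [1.0, -1.0], 0.2
x_clean = [1.0, 1.0]
x_adv = pgd_linf(x_clean, 1, w, b)
score_adv = sum(wi * xi for wi, xi in zip(w, x_adv)) + b
```

The projection step is what distinguishes PGD from plain gradient ascent: the perturbation never exceeds `eps` in any coordinate, yet the decision score changes sign.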
-
Predicting Future States with Spatial Point Processes in Single Molecule Resolution Spatial Transcriptomics
Authors:
Parisa Boodaghi Malidarreh,
Biraaj Rout,
Mohammad Sadegh Nasr,
Priyanshi Borad,
Jillur Rahman Saurav,
Jai Prakash Veerla,
Kelli Fenelon,
Theodora Koromila,
Jacob M. Luber
Abstract:
In this paper, we introduce a pipeline based on Random Forest Regression to predict the future distribution of cells expressing the Sog-D gene (active cells) along both the anterior-to-posterior (AP) and dorsal-to-ventral (DV) axes of the Drosophila embryo during embryogenesis. This method provides insights into how cells and living organisms control gene expression, using super-resolution whole-embryo spatial transcriptomics imaging at subcellular, single-molecule resolution. A Random Forest Regression model was used to predict the next-stage distribution of active cells based on the previous one. To achieve this goal, we leveraged temporally resolved spatial point processes by including Ripley's K-function in conjunction with the cell's state in each stage of embryogenesis, and found average predictive accuracy of active cell distribution. This tool is analogous to RNA Velocity for spatially resolved developmental biology: from one data point, we can predict future spatially resolved gene expression using features from the spatial point processes.
Submitted 4 January, 2024;
originally announced January 2024.
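Ripley's K-function, which the abstract above uses as a spatial feature, can be illustrated with a naive estimator. This is a minimal sketch without edge correction, not the paper's pipeline; the function name `ripley_k` and the toy coordinates are assumptions.

```python
import math


def ripley_k(points, r, area):
    """Naive Ripley's K estimate at radius r (no edge correction)."""
    n = len(points)
    pairs = 0
    # Count ordered pairs of distinct points that lie within distance r.
    for i, (xi, yi) in enumerate(points):
        for j, (xj, yj) in enumerate(points):
            if i != j and math.hypot(xi - xj, yi - yj) <= r:
                pairs += 1
    # Scale by the study area and the number of ordered pairs.
    return area * pairs / (n * (n - 1))


# Toy usage: three clustered points and one distant point in a 6 x 6 window.
cells = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (5.0, 5.0)]
k_at_1 = ripley_k(cells, r=1.0, area=36.0)
```

Under complete spatial randomness K(r) is approximately pi * r^2, so values well above that (as in this clustered toy example) indicate clustering at scale r. Evaluating K at several radii gives a feature vector that a regressor could consume.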
-
Real-Time Diagnostic Integrity Meets Efficiency: A Novel Platform-Agnostic Architecture for Physiological Signal Compression
Authors:
Neel R Vora,
Amir Hajighasemi,
Cody T. Reynolds,
Amirmohammad Radmehr,
Mohamed Mohamed,
Jillur Rahman Saurav,
Abdul Aziz,
Jai Prakash Veerla,
Mohammad S Nasr,
Hayden Lotspeich,
Partha Sai Guttikonda,
Thuong Pham,
Aarti Darji,
Parisa Boodaghi Malidarreh,
Helen H Shang,
Jay Harvey,
Kan Ding,
Phuc Nguyen,
Jacob M Luber
Abstract:
Head-based signals such as EEG, EMG, EOG, and ECG collected by wearable systems will play a pivotal role in the clinical diagnosis, monitoring, and treatment of important brain disorders.
However, the real-time transmission of a large corpus of physiological signals over extended periods consumes substantial power and time, limiting the viability of battery-dependent physiological monitoring wearables.
This paper presents a novel deep-learning framework employing a variational autoencoder (VAE) for physiological signal compression to reduce wearables' computational complexity and energy consumption.
Our approach achieves an impressive compression ratio of 1:293 specifically for spectrogram data, surpassing state-of-the-art compression techniques such as JPEG2000, H.264, Discrete Cosine Transform (DCT), and Huffman Encoding, which do not excel in handling physiological signals.
We validate the efficacy of the compression algorithms using physiological signals collected from real patients in a hospital and deploy the solution on commonly used embedded AI chips (i.e., ARM Cortex V8 and Jetson Nano). The proposed framework achieves a 91% seizure detection accuracy using XGBoost, confirming the approach's reliability, practicality, and scalability.
Submitted 4 January, 2024; v1 submitted 19 December, 2023;
originally announced December 2023.
-
Gemini: A Family of Highly Capable Multimodal Models
Authors:
Gemini Team,
Rohan Anil,
Sebastian Borgeaud,
Jean-Baptiste Alayrac,
Jiahui Yu,
Radu Soricut,
Johan Schalkwyk,
Andrew M. Dai,
Anja Hauth,
Katie Millican,
David Silver,
Melvin Johnson,
Ioannis Antonoglou,
Julian Schrittwieser,
Amelia Glaese,
Jilin Chen,
Emily Pitler,
Timothy Lillicrap,
Angeliki Lazaridou,
Orhan Firat,
James Molloy,
Michael Isard,
Paul R. Barham,
Tom Hennigan,
Benjamin Lee
, et al. (1325 additional authors not shown)
Abstract:
This report introduces a new family of multimodal models, Gemini, that exhibit remarkable capabilities across image, audio, video, and text understanding. The Gemini family consists of Ultra, Pro, and Nano sizes, suitable for applications ranging from complex reasoning tasks to on-device memory-constrained use-cases. Evaluation on a broad range of benchmarks shows that our most-capable Gemini Ultra model advances the state of the art in 30 of 32 of these benchmarks - notably being the first model to achieve human-expert performance on the well-studied exam benchmark MMLU, and improving the state of the art in every one of the 20 multimodal benchmarks we examined. We believe that the new capabilities of the Gemini family in cross-modal reasoning and language understanding will enable a wide variety of use cases. We discuss our approach toward post-training and deploying Gemini models responsibly to users through services including Gemini, Gemini Advanced, Google AI Studio, and Cloud Vertex AI.
Submitted 17 June, 2024; v1 submitted 18 December, 2023;
originally announced December 2023.
-
Scalable Extraction of Training Data from (Production) Language Models
Authors:
Milad Nasr,
Nicholas Carlini,
Jonathan Hayase,
Matthew Jagielski,
A. Feder Cooper,
Daphne Ippolito,
Christopher A. Choquette-Choo,
Eric Wallace,
Florian Tramèr,
Katherine Lee
Abstract:
This paper studies extractable memorization: training data that an adversary can efficiently extract by querying a machine learning model without prior knowledge of the training dataset. We show an adversary can extract gigabytes of training data from open-source language models like Pythia or GPT-Neo, semi-open models like LLaMA or Falcon, and closed models like ChatGPT. Existing techniques from the literature suffice to attack unaligned models; in order to attack the aligned ChatGPT, we develop a new divergence attack that causes the model to diverge from its chatbot-style generations and emit training data at a rate 150x higher than when behaving properly. Our methods show practical attacks can recover far more data than previously thought, and reveal that current alignment techniques do not eliminate memorization.
Submitted 28 November, 2023;
originally announced November 2023.
-
Report of the 1st Workshop on Generative AI and Law
Authors:
A. Feder Cooper,
Katherine Lee,
James Grimmelmann,
Daphne Ippolito,
Christopher Callison-Burch,
Christopher A. Choquette-Choo,
Niloofar Mireshghallah,
Miles Brundage,
David Mimno,
Madiha Zahrah Choksi,
Jack M. Balkin,
Nicholas Carlini,
Christopher De Sa,
Jonathan Frankle,
Deep Ganguli,
Bryant Gipson,
Andres Guadamuz,
Swee Leng Harris,
Abigail Z. Jacobs,
Elizabeth Joh,
Gautam Kamath,
Mark Lemley,
Cass Matthews,
Christine McLeavey,
Corynne McSherry
, et al. (10 additional authors not shown)
Abstract:
This report presents the takeaways of the inaugural Workshop on Generative AI and Law (GenLaw), held in July 2023. A cross-disciplinary group of practitioners and scholars from computer science and law convened to discuss the technical, doctrinal, and policy challenges presented by law for Generative AI, and by Generative AI for law, with an emphasis on U.S. law in particular. We begin the report with a high-level statement about why Generative AI is both immensely significant and immensely challenging for law. To meet these challenges, we conclude that there is an essential need for 1) a shared knowledge base that provides a common conceptual language for experts across disciplines; 2) clarification of the distinctive technical capabilities of generative-AI systems, as compared and contrasted to other computer and AI systems; 3) a logical taxonomy of the legal issues these systems raise; and, 4) a concrete research agenda to promote collaboration and knowledge-sharing on emerging issues at the intersection of Generative AI and law. In this report, we synthesize the key takeaways from the GenLaw workshop that begin to address these needs. All of the listed authors contributed to the workshop upon which this report is based, but they and their organizations do not necessarily endorse all of the specific claims in this report.
Submitted 2 December, 2023; v1 submitted 10 November, 2023;
originally announced November 2023.
-
SoccerNet 2023 Challenges Results
Authors:
Anthony Cioppa,
Silvio Giancola,
Vladimir Somers,
Floriane Magera,
Xin Zhou,
Hassan Mkhallati,
Adrien Deliège,
Jan Held,
Carlos Hinojosa,
Amir M. Mansourian,
Pierre Miralles,
Olivier Barnich,
Christophe De Vleeschouwer,
Alexandre Alahi,
Bernard Ghanem,
Marc Van Droogenbroeck,
Abdullah Kamal,
Adrien Maglo,
Albert Clapés,
Amr Abdelaziz,
Artur Xarles,
Astrid Orcesi,
Atom Scott,
Bin Liu,
Byoungkwon Lim
, et al. (77 additional authors not shown)
Abstract:
The SoccerNet 2023 challenges were the third annual video understanding challenges organized by the SoccerNet team. For this third edition, the challenges were composed of seven vision-based tasks split into three main themes. The first theme, broadcast video understanding, is composed of three high-level tasks related to describing events occurring in the video broadcasts: (1) action spotting, focusing on retrieving all timestamps related to global actions in soccer, (2) ball action spotting, focusing on retrieving all timestamps related to the soccer ball change of state, and (3) dense video captioning, focusing on describing the broadcast with natural language and anchored timestamps. The second theme, field understanding, relates to the single task of (4) camera calibration, focusing on retrieving the intrinsic and extrinsic camera parameters from images. The third and last theme, player understanding, is composed of three low-level tasks related to extracting information about the players: (5) re-identification, focusing on retrieving the same players across multiple views, (6) multiple object tracking, focusing on tracking players and the ball through unedited video streams, and (7) jersey number recognition, focusing on recognizing the jersey number of players from tracklets. Compared to the previous editions of the SoccerNet challenges, tasks (2), (3), and (7) are novel, including new annotations and data, task (4) was enhanced with more data and annotations, and task (6) now focuses on end-to-end approaches. More information on the tasks, challenges, and leaderboards is available at https://www.soccer-net.org. Baselines and development kits can be found at https://github.com/SoccerNet.
Submitted 12 September, 2023;
originally announced September 2023.
-
Privacy Side Channels in Machine Learning Systems
Authors:
Edoardo Debenedetti,
Giorgio Severi,
Nicholas Carlini,
Christopher A. Choquette-Choo,
Matthew Jagielski,
Milad Nasr,
Eric Wallace,
Florian Tramèr
Abstract:
Most current approaches for protecting privacy in machine learning (ML) assume that models exist in a vacuum, when in reality, ML models are part of larger systems that include components for training data filtering, output monitoring, and more. In this work, we introduce privacy side channels: attacks that exploit these system-level components to extract private information at far higher rates than is otherwise possible for standalone models. We propose four categories of side channels that span the entire ML lifecycle (training data filtering, input preprocessing, output post-processing, and query filtering) and allow for either enhanced membership inference attacks or even novel threats such as extracting users' test queries. For example, we show that deduplicating training data before applying differentially-private training creates a side-channel that completely invalidates any provable privacy guarantees. Moreover, we show that systems which block language models from regenerating training data can be exploited to allow exact reconstruction of private keys contained in the training set -- even if the model did not memorize these keys. Taken together, our results demonstrate the need for a holistic, end-to-end privacy analysis of machine learning.
Submitted 11 September, 2023;
originally announced September 2023.
-
Reverse-Engineering Decoding Strategies Given Blackbox Access to a Language Generation System
Authors:
Daphne Ippolito,
Nicholas Carlini,
Katherine Lee,
Milad Nasr,
Yun William Yu
Abstract:
Neural language models are increasingly deployed into APIs and websites that allow a user to pass in a prompt and receive generated text. Many of these systems do not reveal generation parameters. In this paper, we present methods to reverse-engineer the decoding method used to generate text (i.e., top-$k$ or nucleus sampling). Our ability to discover which decoding strategy was used has implications for detecting generated text. Additionally, the process of discovering the decoding strategy can reveal biases caused by selecting decoding settings which severely truncate a model's predicted distributions. We perform our attack on several families of open-source language models, as well as on production systems (e.g., ChatGPT).
Submitted 9 September, 2023;
originally announced September 2023.
-
Universal and Transferable Adversarial Attacks on Aligned Language Models
Authors:
Andy Zou,
Zifan Wang,
Nicholas Carlini,
Milad Nasr,
J. Zico Kolter,
Matt Fredrikson
Abstract:
Because "out-of-the-box" large language models are capable of generating a great deal of objectionable content, recent work has focused on aligning these models in an attempt to prevent undesirable generation. While there has been some success at circumventing these measures -- so-called "jailbreaks" against LLMs -- these attacks have required significant human ingenuity and are brittle in practice. In this paper, we propose a simple and effective attack method that causes aligned language models to generate objectionable behaviors. Specifically, our approach finds a suffix that, when attached to a wide range of queries for an LLM to produce objectionable content, aims to maximize the probability that the model produces an affirmative response (rather than refusing to answer). However, instead of relying on manual engineering, our approach automatically produces these adversarial suffixes by a combination of greedy and gradient-based search techniques, and also improves over past automatic prompt generation methods.
Surprisingly, we find that the adversarial prompts generated by our approach are quite transferable, including to black-box, publicly released LLMs. Specifically, we train an adversarial attack suffix on multiple prompts (i.e., queries asking for many different types of objectionable content), as well as multiple models (in our case, Vicuna-7B and 13B). When doing so, the resulting attack suffix is able to induce objectionable content in the public interfaces to ChatGPT, Bard, and Claude, as well as open source LLMs such as LLaMA-2-Chat, Pythia, Falcon, and others. In total, this work significantly advances the state-of-the-art in adversarial attacks against aligned language models, raising important questions about how such systems can be prevented from producing objectionable information. Code is available at github.com/llm-attacks/llm-attacks.
Submitted 20 December, 2023; v1 submitted 27 July, 2023;
originally announced July 2023.
-
Histopathology Slide Indexing and Search: Are We There Yet?
Authors:
Helen H. Shang,
Mohammad Sadegh Nasr,
Jai Prakash Veerla,
Parisa Boodaghi Malidarreh,
MD Jillur Rahman Saurav,
Amir Hajighasemi,
Manfred Huber,
Chace Moleta,
Jitin Makker,
Jacob M. Luber
Abstract:
The search and retrieval of digital histopathology slides is an important task that has yet to be solved. In this case study, we investigate the clinical readiness of three state-of-the-art histopathology slide search engines, Yottixel, SISH, and RetCCL, on three patients with solid tumors. We provide a qualitative assessment of each model's performance in providing retrieval results that are reliable and useful to pathologists. We found that all three image search engines fail to produce consistently reliable results and have difficulties in capturing granular and subtle features of malignancy, limiting their diagnostic accuracy. Based on our findings, we also propose a minimal set of requirements to further advance the development of accurate and reliable histopathology image search engines for successful clinical adoption.
Submitted 4 January, 2024; v1 submitted 29 June, 2023;
originally announced June 2023.
-
The State of Applying Artificial Intelligence to Tissue Imaging for Cancer Research and Early Detection
Authors:
Michael Robben,
Amir Hajighasemi,
Mohammad Sadegh Nasr,
Jai Prakesh Veerla,
Anne M. Alsup,
Biraaj Rout,
Helen H. Shang,
Kelli Fowlds,
Parisa Boodaghi Malidarreh,
Paul Koomey,
MD Jillur Rahman Saurav,
Jacob M. Luber
Abstract:
Artificial intelligence represents a new frontier in human medicine that could save more lives and reduce the costs, thereby increasing accessibility. As a consequence, the rate of advancement of AI in cancer medical imaging and more particularly tissue pathology has exploded, opening it to ethical and technical questions that could impede its adoption into existing systems. In order to chart the path of AI in its application to cancer tissue imaging, we review current work and identify how it can improve cancer pathology diagnostics and research. In this review, we identify 5 core tasks that models are developed for, including regression, classification, segmentation, generation, and compression tasks. We address the benefits and challenges that such methods face, and how they can be adapted for use in cancer prevention and treatment. The studies looked at in this paper represent the beginning of this field and future experiments will build on the foundations that we highlight.
Submitted 29 June, 2023;
originally announced June 2023.
-
Are aligned neural networks adversarially aligned?
Authors:
Nicholas Carlini,
Milad Nasr,
Christopher A. Choquette-Choo,
Matthew Jagielski,
Irena Gao,
Anas Awadalla,
Pang Wei Koh,
Daphne Ippolito,
Katherine Lee,
Florian Tramer,
Ludwig Schmidt
Abstract:
Large language models are now tuned to align with the goals of their creators, namely to be "helpful and harmless." These models should respond helpfully to user questions, but refuse to answer requests that could cause harm. However, adversarial users can construct inputs which circumvent attempts at alignment. In this work, we study adversarial alignment, and ask to what extent these models remain aligned when interacting with an adversarial user who constructs worst-case inputs (adversarial examples). These inputs are designed to cause the model to emit harmful content that would otherwise be prohibited. We show that existing NLP-based optimization attacks are insufficiently powerful to reliably attack aligned text models: even when current NLP-based attacks fail, we can find adversarial inputs with brute force. As a result, the failure of current attacks should not be seen as proof that aligned text models remain aligned under adversarial inputs.
However, the recent trend in large-scale ML models is multimodal models that allow users to provide images that influence the text that is generated. We show these models can be easily attacked, i.e., induced to perform arbitrary un-aligned behavior through adversarial perturbation of the input image. We conjecture that improved NLP attacks may demonstrate this same level of adversarial control over text-only models.
Submitted 6 May, 2024; v1 submitted 26 June, 2023;
originally announced June 2023.
-
Multimodal Pathology Image Search Between H&E Slides and Multiplexed Immunofluorescent Images
Authors:
Amir Hajighasemi,
MD Jillur Rahman Saurav,
Mohammad S Nasr,
Jai Prakash Veerla,
Aarti Darji,
Parisa Boodaghi Malidarreh,
Michael Robben,
Helen H Shang,
Jacob M Luber
Abstract:
We present an approach for multimodal pathology image search, using dynamic time warping (DTW) on Variational Autoencoder (VAE) latent space that is fed into a ranked choice voting scheme to retrieve multiplexed immunofluorescent imaging (mIF) that is most similar to a query H&E slide. Through training the VAE and applying DTW, we align and compare mIF and H&E slides. Our method improves differential diagnosis and therapeutic decisions by integrating morphological H&E data with immunophenotyping from mIF, providing clinicians a rich perspective of disease states. This facilitates an understanding of the spatial relationships in tissue samples and could revolutionize the diagnostic process, enhancing precision and enabling personalized therapy selection. Our technique demonstrates feasibility using colorectal cancer and healthy tonsil samples. An exhaustive ablation study was conducted on a search engine designed to explore the correlation between multiplexed Immunofluorescence (mIF) and Hematoxylin and Eosin (H&E) staining, in order to validate its ability to map these distinct modalities into a unified vector space. Despite extreme class imbalance, the system demonstrated robustness and utility by returning similar results across various data features, which suggests potential for future use in multimodal histopathology data analysis.
Submitted 11 June, 2023;
originally announced June 2023.
-
Privacy Auditing with One (1) Training Run
Authors:
Thomas Steinke,
Milad Nasr,
Matthew Jagielski
Abstract:
We propose a scheme for auditing differentially private machine learning systems with a single training run. This exploits the parallelism of being able to add or remove multiple training examples independently. We analyze this using the connection between differential privacy and statistical generalization, which avoids the cost of group privacy. Our auditing scheme requires minimal assumptions about the algorithm and can be applied in the black-box or white-box setting.
Submitted 15 May, 2023;
originally announced May 2023.
-
Synthetic Query Generation for Privacy-Preserving Deep Retrieval Systems using Differentially Private Language Models
Authors:
Aldo Gael Carranza,
Rezsa Farahani,
Natalia Ponomareva,
Alex Kurakin,
Matthew Jagielski,
Milad Nasr
Abstract:
We address the challenge of ensuring differential privacy (DP) guarantees in training deep retrieval systems. Training these systems often involves the use of contrastive-style losses, which are typically non-per-example decomposable, making them difficult to directly DP-train with since common techniques require per-example gradients. To address this issue, we propose an approach that prioritizes ensuring query privacy prior to training a deep retrieval system. Our method employs DP language models (LMs) to generate private synthetic queries representative of the original data. These synthetic queries can be used in downstream retrieval system training without compromising privacy. Our approach demonstrates a significant enhancement in retrieval quality compared to direct DP-training, all while maintaining query-level privacy guarantees. This work highlights the potential of harnessing LMs to overcome limitations in standard DP-training methods.
Submitted 23 May, 2024; v1 submitted 10 May, 2023;
originally announced May 2023.
-
Clinically Relevant Latent Space Embedding of Cancer Histopathology Slides through Variational Autoencoder Based Image Compression
Authors:
Mohammad Sadegh Nasr,
Amir Hajighasemi,
Paul Koomey,
Parisa Boodaghi Malidarreh,
Michael Robben,
Jillur Rahman Saurav,
Helen H. Shang,
Manfred Huber,
Jacob M. Luber
Abstract:
In this paper, we introduce a Variational Autoencoder (VAE) based training approach that can compress and decompress cancer pathology slides at a compression ratio of 1:512, which is better than the previously reported state of the art (SOTA) in the literature, while still maintaining accuracy in clinical validation tasks. The compression approach was tested on more common computer vision datasets such as CIFAR10, and we explore which image characteristics enable this compression ratio on cancer imaging data but not generic images. We generate and visualize embeddings from the compressed latent space and demonstrate how they are useful for clinical interpretation of data, and how in the future such latent embeddings can be used to accelerate search of clinical imaging data.
Submitted 23 March, 2023;
originally announced March 2023.
-
Students Parrot Their Teachers: Membership Inference on Model Distillation
Authors:
Matthew Jagielski,
Milad Nasr,
Christopher Choquette-Choo,
Katherine Lee,
Nicholas Carlini
Abstract:
Model distillation is frequently proposed as a technique to reduce the privacy leakage of machine learning. These empirical privacy defenses rely on the intuition that distilled "student" models protect the privacy of training data, as they only interact with this data indirectly through a "teacher" model. In this work, we design membership inference attacks to systematically study the privacy provided by knowledge distillation to both the teacher and student training sets. Our new attacks show that distillation alone provides only limited privacy across a number of domains. We explain the success of our attacks on distillation by showing that membership inference attacks on a private dataset can succeed even if the target model is *never* queried on any actual training points, but only on inputs whose predictions are highly influenced by training data. Finally, we show that our attacks are strongest when student and teacher sets are similar, or when the attacker can poison the teacher set.
Submitted 6 March, 2023;
originally announced March 2023.
-
Why Is Public Pretraining Necessary for Private Model Training?
Authors:
Arun Ganesh,
Mahdi Haghifam,
Milad Nasr,
Sewoong Oh,
Thomas Steinke,
Om Thakkar,
Abhradeep Thakurta,
Lun Wang
Abstract:
In the privacy-utility tradeoff of a model trained on benchmark language and vision tasks, remarkable improvements have been widely reported with the use of pretraining on publicly available data. This is in part due to the benefits of transfer learning, which is the standard motivation for pretraining in non-private settings. However, the stark contrast in the improvement achieved through pretraining under privacy compared to non-private settings suggests that there may be a deeper, distinct cause driving these gains. To explain this phenomenon, we hypothesize that the non-convex loss landscape of a model training necessitates an optimization algorithm to go through two phases. In the first, the algorithm needs to select a good "basin" in the loss landscape. In the second, the algorithm solves an easy optimization within that basin. The former is a harder problem to solve with private data, while the latter is harder to solve with public data due to a distribution shift or data scarcity. Guided by this intuition, we provide theoretical constructions that provably demonstrate the separation between private training with and without public pretraining. Further, systematic experiments on CIFAR10 and LibriSpeech provide supporting evidence for our hypothesis.
Submitted 19 February, 2023;
originally announced February 2023.
-
Tight Auditing of Differentially Private Machine Learning
Authors:
Milad Nasr,
Jamie Hayes,
Thomas Steinke,
Borja Balle,
Florian Tramèr,
Matthew Jagielski,
Nicholas Carlini,
Andreas Terzis
Abstract:
Auditing mechanisms for differential privacy use probabilistic means to empirically estimate the privacy level of an algorithm. For private machine learning, existing auditing mechanisms are tight: the empirical privacy estimate (nearly) matches the algorithm's provable privacy guarantee. But these auditing techniques suffer from two limitations. First, they only give tight estimates under implausible worst-case assumptions (e.g., a fully adversarial dataset). Second, they require thousands or millions of training runs to produce non-trivial statistical estimates of the privacy leakage.
This work addresses both issues. We design an improved auditing scheme that yields tight privacy estimates for natural (not adversarially crafted) datasets -- if the adversary can see all model updates during training. Prior auditing works rely on the same assumption, which is permitted under the standard differential privacy threat model. This threat model is also applicable, e.g., in federated learning settings. Moreover, our auditing scheme requires only two training runs (instead of thousands) to produce tight privacy estimates, by adapting recent advances in tight composition theorems for differential privacy. We demonstrate the utility of our improved auditing schemes by surfacing implementation bugs in private machine learning code that eluded prior auditing techniques.
Submitted 15 February, 2023;
originally announced February 2023.
-
Extracting Training Data from Diffusion Models
Authors:
Nicholas Carlini,
Jamie Hayes,
Milad Nasr,
Matthew Jagielski,
Vikash Sehwag,
Florian Tramèr,
Borja Balle,
Daphne Ippolito,
Eric Wallace
Abstract:
Image diffusion models such as DALL-E 2, Imagen, and Stable Diffusion have attracted significant attention due to their ability to generate high-quality synthetic images. In this work, we show that diffusion models memorize individual images from their training data and emit them at generation time. With a generate-and-filter pipeline, we extract over a thousand training examples from state-of-the-art models, ranging from photographs of individual people to trademarked company logos. We also train hundreds of diffusion models in various settings to analyze how different modeling and data decisions affect privacy. Overall, our results show that diffusion models are much less private than prior generative models such as GANs, and that mitigating these vulnerabilities may require new advances in privacy-preserving training.
Submitted 30 January, 2023;
originally announced January 2023.
-
Preventing Verbatim Memorization in Language Models Gives a False Sense of Privacy
Authors:
Daphne Ippolito,
Florian Tramèr,
Milad Nasr,
Chiyuan Zhang,
Matthew Jagielski,
Katherine Lee,
Christopher A. Choquette-Choo,
Nicholas Carlini
Abstract:
Studying data memorization in neural language models helps us understand the risks (e.g., to privacy or copyright) associated with models regurgitating training data and aids in the development of countermeasures. Many prior works -- and some recently deployed defenses -- focus on "verbatim memorization", defined as a model generation that exactly matches a substring from the training set. We argue that verbatim memorization definitions are too restrictive and fail to capture more subtle forms of memorization. Specifically, we design and implement an efficient defense that perfectly prevents all verbatim memorization. And yet, we demonstrate that this "perfect" filter does not prevent the leakage of training data. Indeed, it is easily circumvented by plausible and minimally modified "style-transfer" prompts -- and in some cases even the non-modified original prompts -- to extract memorized information. We conclude by discussing potential alternative definitions and why defining memorization is a difficult yet crucial open question for neural language models.
Submitted 11 September, 2023; v1 submitted 31 October, 2022;
originally announced October 2022.
-
No Free Lunch in "Privacy for Free: How does Dataset Condensation Help Privacy"
Authors:
Nicholas Carlini,
Vitaly Feldman,
Milad Nasr
Abstract:
New methods designed to preserve data privacy require careful scrutiny. Failure to preserve privacy is hard to detect, and yet can lead to catastrophic results when a system implementing a "privacy-preserving" method is attacked. A recent work selected for an Outstanding Paper Award at ICML 2022 (Dong et al., 2022) claims that dataset condensation (DC) significantly improves data privacy when training machine learning models. This claim is supported by theoretical analysis of a specific dataset condensation technique and an empirical evaluation of resistance to some existing membership inference attacks.
In this note we examine the claims in the work of Dong et al. (2022) and describe major flaws in the empirical evaluation of the method and its theoretical analysis. These flaws imply that their work does not provide statistically significant evidence that DC improves the privacy of training ML models over a naive baseline. Moreover, previously published results show that DP-SGD, the standard approach to privacy preserving ML, simultaneously gives better accuracy and achieves a (provably) lower membership attack success rate.
Submitted 29 September, 2022;
originally announced September 2022.
-
FEL: High Capacity Learning for Recommendation and Ranking via Federated Ensemble Learning
Authors:
Meisam Hejazinia,
Dzmitry Huba,
Ilias Leontiadis,
Kiwan Maeng,
Mani Malek,
Luca Melis,
Ilya Mironov,
Milad Nasr,
Kaikai Wang,
Carole-Jean Wu
Abstract:
Federated learning (FL) has emerged as an effective approach to address consumer privacy needs. FL has been successfully applied to certain machine learning tasks, such as training smart keyboard models and keyword spotting. Despite FL's initial success, many important deep learning use cases, such as ranking and recommendation tasks, have been limited from on-device learning. One of the key challenges faced by practical FL adoption for DL-based ranking and recommendation is the prohibitive resource requirements that cannot be satisfied by modern mobile systems. We propose Federated Ensemble Learning (FEL) as a solution to tackle the large memory requirement of deep learning ranking and recommendation tasks. FEL enables large-scale ranking and recommendation model training on-device by simultaneously training multiple model versions on disjoint clusters of client devices. FEL integrates the trained sub-models via an over-arch layer into an ensemble model that is hosted on the server. Our experiments demonstrate that FEL leads to 0.43-2.31% model quality improvement over traditional on-device federated learning - a significant improvement for ranking and recommendation system use cases.
Submitted 7 June, 2022;
originally announced June 2022.
-
A SSIM Guided cGAN Architecture For Clinically Driven Generative Image Synthesis of Multiplexed Spatial Proteomics Channels
Authors:
Jillur Rahman Saurav,
Mohammad Sadegh Nasr,
Paul Koomey,
Michael Robben,
Manfred Huber,
Jon Weidanz,
Bríd Ryan,
Eytan Ruppin,
Peng Jiang,
Jacob M. Luber
Abstract:
Here we present a structural similarity index measure (SSIM) guided conditional Generative Adversarial Network (cGAN) that generatively performs image-to-image (i2i) synthesis to generate photo-accurate protein channels in multiplexed spatial proteomics images. This approach can be utilized to accurately generate missing spatial proteomics channels that were not included during experimental data collection either at the bench or the clinic. Experimental spatial proteomic data from the Human BioMolecular Atlas Program (HuBMAP) was used to generate spatial representations of missing proteins through a U-Net based image synthesis pipeline. HuBMAP channels were hierarchically clustered by SSIM as a heuristic to obtain the minimal set needed to recapitulate the underlying biology represented by the spatial landscape of proteins. We subsequently prove that our SSIM based architecture allows for scaling of generative image synthesis to slides with up to 100 channels, which is better than current state-of-the-art algorithms, which are limited to data with 11 channels. We validate these claims by generating a new experimental spatial proteomics data set from human lung adenocarcinoma tissue sections and show that a model trained on HuBMAP can accurately synthesize channels from our new data set. The ability to recapitulate experimental data from sparsely stained multiplexed histological slides containing spatial proteomic data will have tremendous impact on medical diagnostics and drug development, and also raises important questions on the medical ethics of utilizing data produced by generative image synthesis in the clinical setting. The algorithm that we present in this paper will allow researchers and clinicians to save time and costs in proteomics based histological staining while also increasing the amount of data that they can generate through their experiments.
Submitted 11 June, 2023; v1 submitted 20 May, 2022;
originally announced May 2022.
-
Membership Inference Attacks From First Principles
Authors:
Nicholas Carlini,
Steve Chien,
Milad Nasr,
Shuang Song,
Andreas Terzis,
Florian Tramer
Abstract:
A membership inference attack allows an adversary to query a trained machine learning model to predict whether or not a particular example was contained in the model's training dataset. These attacks are currently evaluated using average-case "accuracy" metrics that fail to characterize whether the attack can confidently identify any members of the training set. We argue that attacks should instead be evaluated by computing their true-positive rate at low (e.g., <0.1%) false-positive rates, and find most prior attacks perform poorly when evaluated in this way. To address this we develop a Likelihood Ratio Attack (LiRA) that carefully combines multiple ideas from the literature. Our attack is 10x more powerful at low false-positive rates, and also strictly dominates prior attacks on existing metrics.
Submitted 12 April, 2022; v1 submitted 7 December, 2021;
originally announced December 2021.
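The evaluation metric advocated in the abstract above, true-positive rate at a fixed low false-positive rate, can be illustrated with a minimal sketch. The function name `tpr_at_fpr`, the score lists, and the convention that a higher score means "more likely a member" are assumptions made for illustration; this is not the LiRA attack itself, only the metric applied to scores from some attack.

```python
def tpr_at_fpr(member_scores, nonmember_scores, target_fpr):
    """True-positive rate at the strictest threshold whose FPR <= target_fpr.

    Assumption: a higher score means the attack is more confident
    that the example was in the training set.
    """
    neg = sorted(nonmember_scores, reverse=True)
    # Number of non-members we are allowed to flag incorrectly.
    allowed_fp = int(target_fpr * len(neg))
    # An example is flagged only if its score is strictly above the threshold,
    # so at most `allowed_fp` non-members can be flagged.
    threshold = neg[allowed_fp] if allowed_fp < len(neg) else float("-inf")
    true_positives = sum(s > threshold for s in member_scores)
    return true_positives / len(member_scores)


# Hypothetical attack scores, for illustration only.
members = [0.9, 0.8, 0.4, 0.3]
nonmembers = [0.7, 0.5, 0.2, 0.1]

tpr_zero_fp = tpr_at_fpr(members, nonmembers, 0.0)   # no false positives allowed
tpr_half_fp = tpr_at_fpr(members, nonmembers, 0.5)   # half of non-members may be flagged
```

In this toy data the attack identifies only half of the members when no false positives are tolerated, even though it identifies all of them at a 50% false-positive rate, which is the gap that an average-case accuracy number would hide.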
-
Mitigating Membership Inference Attacks by Self-Distillation Through a Novel Ensemble Architecture
Authors:
Xinyu Tang,
Saeed Mahloujifar,
Liwei Song,
Virat Shejwalkar,
Milad Nasr,
Amir Houmansadr,
Prateek Mittal
Abstract:
Membership inference attacks are a key measure to evaluate privacy leakage in machine learning (ML) models. These attacks aim to distinguish training members from non-members by exploiting differential behavior of the models on member and non-member inputs. The goal of this work is to train ML models that have high membership privacy while largely preserving their utility; we therefore aim for an empirical membership privacy guarantee as opposed to the provable privacy guarantees provided by techniques like differential privacy, as such techniques are shown to deteriorate model utility. Specifically, we propose a new framework to train privacy-preserving models that induces similar behavior on member and non-member inputs to mitigate membership inference attacks. Our framework, called SELENA, has two major components. The first component and the core of our defense is a novel ensemble architecture for training. This architecture, which we call Split-AI, splits the training data into random subsets, and trains a model on each subset of the data. We use an adaptive inference strategy at test time: our ensemble architecture aggregates the outputs of only those models that did not contain the input sample in their training data. We prove that our Split-AI architecture defends against a large family of membership inference attacks; however, it is susceptible to new adaptive attacks. Therefore, we use a second component in our framework called Self-Distillation to protect against such stronger attacks. The Self-Distillation component (self-)distills the training dataset through our Split-AI ensemble, without using any external public datasets. Through extensive experiments on major benchmark datasets we show that SELENA presents a superior trade-off between membership privacy and utility compared to the state of the art.
Submitted 15 October, 2021;
originally announced October 2021.
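The adaptive inference step described in the abstract above can be sketched as follows. This is a toy illustration, not the authors' implementation: the function `split_ai_predict`, the representation of models as callables, and the use of sample IDs to track subset membership are all assumptions. The abstract does not say how inputs outside the training set are handled; the sketch simply averages over every model in that case, which is also an assumption.

```python
def split_ai_predict(models, train_subsets, x, x_id=None):
    """Average the outputs of the models whose training subset excludes x_id.

    models:        list of callables mapping an input to class probabilities
    train_subsets: list of sets of training-sample IDs, one per model
    x_id:          ID of x if it is a training sample, else None (assumption)
    """
    eligible = [m for m, ids in zip(models, train_subsets) if x_id not in ids]
    if not eligible:
        raise ValueError("every model was trained on this sample")
    outputs = [m(x) for m in eligible]
    n_classes = len(outputs[0])
    # Element-wise mean of the eligible models' probability vectors.
    return [sum(o[c] for o in outputs) / len(outputs) for c in range(n_classes)]


# Three hypothetical models with fixed outputs, each trained on two of three samples.
toy_models = [lambda x: [1.0, 0.0], lambda x: [0.0, 1.0], lambda x: [0.5, 0.5]]
toy_subsets = [{0, 1}, {1, 2}, {0, 2}]

pred_member = split_ai_predict(toy_models, toy_subsets, x="sample-0", x_id=0)
pred_unseen = split_ai_predict(toy_models, toy_subsets, x="new-input")
```

For training sample 0, only the second model is consulted, since it is the only one whose subset excludes that sample; the prediction therefore never comes from a model that saw the input during training.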
-
Smart Healthcare in the Age of AI: Recent Advances, Challenges, and Future Prospects
Authors:
Mahmoud Nasr,
MD. Milon Islam,
Shady Shehata,
Fakhri Karray,
Yuri Quintana
Abstract:
The significant increase in the number of individuals with chronic ailments (including the elderly and disabled) has dictated an urgent need for an innovative model for healthcare systems. The evolved model will be more personalized and less reliant on traditional brick-and-mortar healthcare institutions such as hospitals, nursing homes, and long-term healthcare centers. The smart healthcare system is a topic of recently growing interest and has become increasingly required due to major developments in modern technologies, especially in artificial intelligence (AI) and machine learning (ML). This paper aims to discuss the current state-of-the-art smart healthcare systems, highlighting major areas like wearable and smartphone devices for health monitoring, machine learning for disease diagnosis, and assistive frameworks, including social robots developed for the ambient assisted living environment. Additionally, the paper demonstrates software integration architectures that are essential for creating smart healthcare systems, seamlessly integrating the benefits of data analytics and other tools of AI. The discussion of the developed systems focuses on several facets: the contribution of each developed framework, the detailed working procedure, the performance as outcomes, and the comparative merits and limitations. The current research challenges with potential future directions are addressed to highlight the drawbacks of existing systems and the possible methods to introduce novel frameworks, respectively. This review aims at providing comprehensive insights into the recent developments of smart healthcare systems to equip experts to contribute to the field.
Submitted 24 June, 2021;
originally announced July 2021.
-
Robust Adversarial Attacks Against DNN-Based Wireless Communication Systems
Authors:
Alireza Bahramali,
Milad Nasr,
Amir Houmansadr,
Dennis Goeckel,
Don Towsley
Abstract:
Deep Neural Networks (DNNs) have become prevalent in wireless communication systems due to their promising performance. However, similar to other DNN-based applications, they are vulnerable to adversarial examples. In this work, we propose an input-agnostic, undetectable, and robust adversarial attack against DNN-based wireless communication systems in both white-box and black-box scenarios. We design tailored Universal Adversarial Perturbations (UAPs) to perform the attack. We also use a Generative Adversarial Network (GAN) to enforce an undetectability constraint for our attack. Furthermore, we investigate the robustness of our attack against countermeasures. We show that in the presence of defense mechanisms deployed by the communicating parties, our attack performs significantly better compared to existing attacks against DNN-based wireless systems. In particular, the results demonstrate that even when employing well-considered defenses, DNN-based wireless communications are vulnerable to adversarial attacks.
Submitted 1 February, 2021;
originally announced February 2021.
-
Adversary Instantiation: Lower Bounds for Differentially Private Machine Learning
Authors:
Milad Nasr,
Shuang Song,
Abhradeep Thakurta,
Nicolas Papernot,
Nicholas Carlini
Abstract:
Differentially private (DP) machine learning allows us to train models on private data while limiting data leakage. DP formalizes this data leakage through a cryptographic game, where an adversary must predict if a model was trained on a dataset D, or a dataset D' that differs in just one example. If observing the training algorithm does not meaningfully increase the adversary's odds of successfully guessing which dataset the model was trained on, then the algorithm is said to be differentially private. Hence, the purpose of privacy analysis is to upper bound the probability that any adversary could successfully guess which dataset the model was trained on. In our paper, we instantiate this hypothetical adversary in order to establish lower bounds on the probability that this distinguishing game can be won. We use this adversary to evaluate the importance of the adversary capabilities allowed in the privacy analysis of DP training algorithms. For DP-SGD, the most common method for training neural networks with differential privacy, our lower bounds are tight and match the theoretical upper bound. This implies that in order to prove better upper bounds, it will be necessary to make use of additional assumptions. Fortunately, we find that our attacks are significantly weaker when additional (realistic) restrictions are put in place on the adversary's capabilities. Thus, in the practical setting common to many real-world deployments, there is a gap between our lower bounds and the upper bounds provided by the analysis: differential privacy is conservative and adversaries may not be able to leak as much information as suggested by the theoretical bound.
Submitted 11 January, 2021;
originally announced January 2021.
-
Benchmarking Meta-heuristic Optimization
Authors:
Mona Nasr,
Omar Farouk,
Ahmed Mohamedeen,
Ali Elrafie,
Marwan Bedeir,
Ali Khaled
Abstract:
Solving an optimization task in any domain is a very challenging problem, especially when dealing with nonlinear problems and non-convex functions. Many meta-heuristic algorithms are very efficient when solving nonlinear functions. A meta-heuristic algorithm is a problem-independent technique that can be applied to a broad range of problems. In this experiment, some of the evolutionary algorithms will be tested, evaluated, and compared with each other. We will go through the Genetic Algorithm, Differential Evolution, Particle Swarm Optimization Algorithm, Grey Wolf Optimizer, and Simulated Annealing. They will be evaluated from several points of view, such as how the algorithm performs across generations and how close the algorithm's result is to the optimal result. Other points of evaluation are discussed in depth in later sections.
Submitted 27 July, 2020;
originally announced July 2020.
-
Improving Deep Learning with Differential Privacy using Gradient Encoding and Denoising
Authors:
Milad Nasr,
Reza Shokri,
Amir Houmansadr
Abstract:
Deep learning models leak significant amounts of information about their training datasets. Previous work has investigated training models with differential privacy (DP) guarantees through adding DP noise to the gradients. However, such solutions (specifically, DPSGD), result in large degradations in the accuracy of the trained models. In this paper, we aim at training deep learning models with DP guarantees while preserving model accuracy much better than previous works. Our key technique is to encode gradients to map them to a smaller vector space, therefore enabling us to obtain DP guarantees for different noise distributions. This allows us to investigate and choose noise distributions that best preserve model accuracy for a target privacy budget. We also take advantage of the post-processing property of differential privacy by introducing the idea of denoising, which further improves the utility of the trained models without degrading their DP guarantees. We show that our mechanism outperforms the state-of-the-art DPSGD; for instance, for the same model accuracy of $96.1\%$ on MNIST, our technique results in a privacy bound of $ε=3.2$ compared to $ε=6$ of DPSGD, which is a significant improvement.
Submitted 22 July, 2020;
originally announced July 2020.
-
Blind Adversarial Network Perturbations
Authors:
Milad Nasr,
Alireza Bahramali,
Amir Houmansadr
Abstract:
Deep Neural Networks (DNNs) are commonly used for various traffic analysis problems, such as website fingerprinting and flow correlation, as they outperform traditional (e.g., statistical) techniques by large margins. However, deep neural networks are known to be vulnerable to adversarial examples: adversarial inputs to the model that get labeled incorrectly by the model due to small adversarial perturbations. In this paper, for the first time, we show that an adversary can defeat DNN-based traffic analysis techniques by applying adversarial perturbations on the patterns of live network traffic.
Submitted 15 February, 2020;
originally announced February 2020.
-
The TechQA Dataset
Authors:
Vittorio Castelli,
Rishav Chakravarti,
Saswati Dana,
Anthony Ferritto,
Radu Florian,
Martin Franz,
Dinesh Garg,
Dinesh Khandelwal,
Scott McCarley,
Mike McCawley,
Mohamed Nasr,
Lin Pan,
Cezar Pendus,
John Pitrelli,
Saurabh Pujar,
Salim Roukos,
Andrzej Sakrajda,
Avirup Sil,
Rosario Uceda-Sosa,
Todd Ward,
Rong Zhang
Abstract:
We introduce TechQA, a domain-adaptation question answering dataset for the technical support domain. The TechQA corpus highlights two real-world issues from the automated customer support domain. First, it contains actual questions posed by users on a technical forum, rather than questions generated specifically for a competition or a task. Second, it has a real-world size -- 600 training, 310 dev, and 490 evaluation question/answer pairs -- thus reflecting the cost of creating large labeled datasets with actual data. Consequently, TechQA is meant to stimulate research in domain adaptation rather than being a resource to build QA systems from scratch. The dataset was obtained by crawling the IBM Developer and IBM DeveloperWorks forums for questions with accepted answers that appear in a published IBM Technote---a technical document that addresses a specific technical issue. We also release a collection of the 801,998 publicly available Technotes as of April 4, 2019 as a companion resource that might be used for pretraining, to learn representations of the IT domain language.
Submitted 7 November, 2019;
originally announced November 2019.
-
Bidding Strategies with Gender Nondiscrimination: Constraints for Online Ad Auctions
Authors:
Milad Nasr,
Michael Tschantz
Abstract:
Interactions between bids to show ads online can lead to an advertiser's ad being shown to more men than women even when the advertiser does not target towards men. We design bidding strategies that advertisers can use to avoid such emergent discrimination without having to modify the auction mechanism. We mathematically analyze the strategies to determine the additional cost to the advertiser for avoiding discrimination, proving our strategies to be optimal in some settings. We use simulations to understand other settings.
Submitted 4 September, 2019;
originally announced September 2019.
-
Comprehensive Privacy Analysis of Deep Learning: Passive and Active White-box Inference Attacks against Centralized and Federated Learning
Authors:
Milad Nasr,
Reza Shokri,
Amir Houmansadr
Abstract:
Deep neural networks are susceptible to various inference attacks as they remember information about their training data. We design white-box inference attacks to perform a comprehensive privacy analysis of deep learning models. We measure the privacy leakage through parameters of fully trained models as well as the parameter updates of models during training. We design inference algorithms for both centralized and federated learning, with respect to passive and active inference attackers, and assuming different adversary prior knowledge.
We evaluate our novel white-box membership inference attacks against deep learning algorithms to trace their training data records. We show that a straightforward extension of the known black-box attacks to the white-box setting (through analyzing the outputs of activation functions) is ineffective. We therefore design new algorithms tailored to the white-box setting by exploiting the privacy vulnerabilities of the stochastic gradient descent algorithm, which is the algorithm used to train deep neural networks. We investigate the reasons why deep learning models may leak information about their training data. We then show that even well-generalized models are significantly susceptible to white-box membership inference attacks, by analyzing state-of-the-art pre-trained and publicly available models for the CIFAR dataset. We also show how adversarial participants, in the federated learning setting, can successfully run active membership inference attacks against other participants, even when the global model achieves high prediction accuracies.
Submitted 6 June, 2020; v1 submitted 3 December, 2018;
originally announced December 2018.
-
DeepCorr: Strong Flow Correlation Attacks on Tor Using Deep Learning
Authors:
Milad Nasr,
Alireza Bahramali,
Amir Houmansadr
Abstract:
Flow correlation is the core technique used in a multitude of deanonymization attacks on Tor. Despite the importance of flow correlation attacks on Tor, existing flow correlation techniques are considered to be ineffective and unreliable in linking Tor flows when applied at a large scale, i.e., they impose high false positive rates or require impractically long flow observations to be able to make reliable correlations. In this paper, we show that, unfortunately, flow correlation attacks can be conducted on Tor traffic with drastically higher accuracies than before by leveraging emerging learning mechanisms. We particularly design a system, called DeepCorr, that outperforms the state-of-the-art by significant margins in correlating Tor connections. DeepCorr leverages an advanced deep learning architecture to learn a flow correlation function tailored to Tor's complex network; this is in contrast to previous works' use of generic statistical correlation metrics to correlate Tor flows. We show that with moderate learning, DeepCorr can correlate Tor connections (and therefore break its anonymity) with accuracies significantly higher than existing algorithms, and using substantially shorter lengths of flow observations. For instance, by collecting only about 900 packets of each target Tor flow (roughly 900KB of Tor data), DeepCorr provides a flow correlation accuracy of 96% compared to 4% by the state-of-the-art system of RAPTOR using the same exact setting.
We hope that our work demonstrates the escalating threat of flow correlation attacks on Tor given recent advances in learning algorithms, calling for the timely deployment of effective countermeasures by the Tor community.
Submitted 22 August, 2018;
originally announced August 2018.
-
Machine Learning with Membership Privacy using Adversarial Regularization
Authors:
Milad Nasr,
Reza Shokri,
Amir Houmansadr
Abstract:
Machine learning models leak information about the datasets on which they are trained. An adversary can build an algorithm to trace the individual members of a model's training dataset. As a fundamental inference attack, he aims to distinguish between data points that were part of the model's training set and any other data points from the same distribution. This is known as the tracing (and also membership inference) attack. In this paper, we focus on such attacks against black-box models, where the adversary can only observe the output of the model, but not its parameters. This is the current setting of machine learning as a service in the Internet.
We introduce a privacy mechanism to train machine learning models that provably achieve membership privacy: the model's predictions on its training data are indistinguishable from its predictions on other data points from the same distribution. We design a strategic mechanism where the privacy mechanism anticipates the membership inference attacks. The objective is to train a model such that not only does it have the minimum prediction error (high utility), but also it is the most robust model against its corresponding strongest inference attack (high privacy). We formalize this as a min-max game optimization problem, and design an adversarial training algorithm that minimizes the classification loss of the model as well as the maximum gain of the membership inference attack against it. This strategy, which guarantees membership privacy (as prediction indistinguishability), acts also as a strong regularizer and significantly generalizes the model.
We evaluate our privacy mechanism on deep neural networks using different benchmark datasets. We show that our min-max strategy can mitigate the risk of membership inference attacks (close to the random guess) with a negligible cost in terms of the classification error.
Submitted 16 July, 2018;
originally announced July 2018.
-
Enemy At the Gateways: A Game Theoretic Approach to Proxy Distribution
Authors:
Milad Nasr,
Sadegh Farhang,
Amir Houmansadr,
Jens Grossklags
Abstract:
A core technique used by popular proxy-based circumvention systems like Tor, Psiphon, and Lantern is to secretly share the IP addresses of circumvention proxies with the censored clients for them to be able to use such systems. For instance, such secretly shared proxies are known as bridges in Tor. However, a key challenge to this mechanism is the insider attack problem: censoring agents can impersonate benign censored clients in order to obtain (and then block) such secretly shared circumvention proxies.
In this paper, we perform a fundamental study on the problem of insider attack on proxy-based circumvention systems. We model the proxy distribution problem using game theory, based on which we derive the optimal strategies of the parties involved, i.e., the censors and circumvention system operators.
That is, we derive the optimal proxy distribution mechanism of a circumvention system like Tor, against the censorship adversary who also takes his optimal censorship strategies.
This is unlike previous works that design ad hoc mechanisms for proxy distribution, against non-optimal censors.
We perform extensive simulations to evaluate our optimal proxy assignment algorithm under various adversarial and network settings. Compared with the state-of-the-art prior work, we show that our optimal proxy assignment algorithm has superior performance, i.e., better resistance to censorship even against the strongest censorship adversary who takes her optimal actions. We conclude with lessons and recommendations for the design of proxy-based circumvention systems.
Submitted 12 September, 2017;
originally announced September 2017.