Whispers in the Machine: Confidentiality in LLM-integrated Systems

Evertz, Jonathan; Chlosta, Merlin; Schönherr, Lea; Eisenhofer, Thorsten

Computer Science > Cryptography and Security

arXiv:2402.06922 (cs)

[Submitted on 10 Feb 2024]

Title:Whispers in the Machine: Confidentiality in LLM-integrated Systems

Authors:Jonathan Evertz, Merlin Chlosta, Lea Schönherr, Thorsten Eisenhofer

View PDF HTML (experimental)

Abstract:Large Language Models (LLMs) are increasingly integrated with external tools. While these integrations can significantly improve the functionality of LLMs, they also create a new attack surface where confidential data may be disclosed between different components. Specifically, malicious tools can exploit vulnerabilities in the LLM itself to manipulate the model and compromise the data of other services, raising the question of how private data can be protected in the context of LLM integrations.
In this work, we provide a systematic way of evaluating confidentiality in LLM-integrated systems. For this, we formalize a "secret key" game that can capture the ability of a model to conceal private information. This enables us to compare the vulnerability of a model against confidentiality attacks and also the effectiveness of different defense strategies. In this framework, we evaluate eight previously published attacks and four defenses. We find that current defenses lack generalization across attack strategies. Building on this analysis, we propose a method for robustness fine-tuning, inspired by adversarial training. This approach is effective in lowering the success rate of attackers and in improving the system's resilience against unknown attacks.

Subjects:	Cryptography and Security (cs.CR); Machine Learning (cs.LG)
Cite as:	arXiv:2402.06922 [cs.CR]
	(or arXiv:2402.06922v1 [cs.CR] for this version)
	https://doi.org/10.48550/arXiv.2402.06922

Submission history

From: Jonathan Evertz [view email]
[v1] Sat, 10 Feb 2024 11:07:24 UTC (165 KB)

Computer Science > Cryptography and Security

Title:Whispers in the Machine: Confidentiality in LLM-integrated Systems

Submission history

Access Paper:

References & Citations

Bookmark

Bibliographic and Citation Tools

Code, Data and Media Associated with this Article

Demos

Recommenders and Search Tools

arXivLabs: experimental projects with community collaborators

Computer Science > Cryptography and Security

Title:Whispers in the Machine: Confidentiality in LLM-integrated Systems

Submission history

Access Paper:

References & Citations

BibTeX formatted citation

Bookmark

Bibliographic and Citation Tools

Code, Data and Media Associated with this Article

Demos

Recommenders and Search Tools

arXivLabs: experimental projects with community collaborators