-
Vulnerability Detection with Code Language Models: How Far Are We?
Authors:
Yangruibo Ding,
Yanjun Fu,
Omniyyah Ibrahim,
Chawin Sitawarin,
Xinyun Chen,
Basel Alomair,
David Wagner,
Baishakhi Ray,
Yizheng Chen
Abstract:
In the context of the rising interest in code language models (code LMs) and vulnerability detection, we study the effectiveness of code LMs for detecting vulnerabilities. Our analysis reveals significant shortcomings in existing vulnerability datasets, including poor data quality, low label accuracy, and high duplication rates, leading to unreliable model performance in realistic vulnerability detection scenarios. Additionally, the evaluation methods used with these datasets are not representative of real-world vulnerability detection.
To address these challenges, we introduce PrimeVul, a new dataset for training and evaluating code LMs for vulnerability detection. PrimeVul incorporates a novel set of data labeling techniques that achieve comparable label accuracy to human-verified benchmarks while significantly expanding the dataset. It also implements a rigorous data de-duplication and chronological data splitting strategy to mitigate data leakage issues, alongside introducing more realistic evaluation metrics and settings. This comprehensive approach aims to provide a more accurate assessment of code LMs' performance in real-world conditions.
Evaluating code LMs on PrimeVul reveals that existing benchmarks significantly overestimate the performance of these models. For instance, a state-of-the-art 7B model scored 68.26% F1 on BigVul but only 3.09% F1 on PrimeVul. Attempts to improve performance through advanced training techniques and larger models like GPT-3.5 and GPT-4 were unsuccessful, with results akin to random guessing in the most stringent settings. These findings underscore the considerable gap between current capabilities and the practical requirements for deploying code LMs in security roles, highlighting the need for more innovative research in this domain.
Submitted 27 March, 2024;
originally announced March 2024.
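The de-duplication and chronological splitting mentioned in the abstract can be illustrated with a minimal sketch. This is not the PrimeVul implementation: the `Sample` fields, the whitespace-stripping normalization, and the 80/20 split ratio are all assumptions made for illustration.

```python
import hashlib
import re
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Sample:
    code: str         # function source text
    commit_time: int  # Unix timestamp of the originating commit (assumed field)
    label: int        # 1 = vulnerable, 0 = benign


def normalize(code: str) -> str:
    # Assumption: strip all whitespace so formatting-only variants collide.
    return re.sub(r"\s+", "", code)


def deduplicate(samples: List[Sample]) -> List[Sample]:
    """Keep the first sample for each normalized-code hash."""
    seen = set()
    unique = []
    for s in samples:
        digest = hashlib.sha256(normalize(s.code).encode("utf-8")).hexdigest()
        if digest not in seen:
            seen.add(digest)
            unique.append(s)
    return unique


def chronological_split(
    samples: List[Sample], train_frac: float = 0.8
) -> Tuple[List[Sample], List[Sample]]:
    """Oldest samples go to training, newest to testing."""
    ordered = sorted(samples, key=lambda s: s.commit_time)
    cut = int(len(ordered) * train_frac)
    return ordered[:cut], ordered[cut:]
```

Splitting by time rather than at random means no training sample postdates a test sample, which is one way to limit the leakage the abstract describes.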
-
Jatmo: Prompt Injection Defense by Task-Specific Finetuning
Authors:
Julien Piet,
Maha Alrashed,
Chawin Sitawarin,
Sizhe Chen,
Zeming Wei,
Elizabeth Sun,
Basel Alomair,
David Wagner
Abstract:
Large Language Models (LLMs) are attracting significant research attention due to their instruction-following abilities, allowing users and developers to leverage LLMs for a variety of tasks. However, LLMs are vulnerable to prompt-injection attacks: a class of attacks that hijack the model's instruction-following abilities, changing responses to prompts to undesired, possibly malicious ones. In this work, we introduce Jatmo, a method for generating task-specific models resilient to prompt-injection attacks. Jatmo leverages the fact that LLMs can only follow instructions once they have undergone instruction tuning. It harnesses a teacher instruction-tuned model to generate a task-specific dataset, which is then used to fine-tune a base model (i.e., a non-instruction-tuned model). Jatmo only needs a task prompt and a dataset of inputs for the task: it uses the teacher model to generate outputs. For situations with no pre-existing datasets, Jatmo can use a single example, or in some cases none at all, to produce a fully synthetic dataset. Our experiments on seven tasks show that Jatmo models provide similar quality of outputs on their specific task as standard LLMs, while being resilient to prompt injections. The best attacks succeeded in less than 0.5% of cases against our models, versus 87% success rate against GPT-3.5-Turbo. We release Jatmo at https://github.com/wagner-group/prompt-injection-defense.
Submitted 8 January, 2024; v1 submitted 29 December, 2023;
originally announced December 2023.
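The dataset-generation step described in the abstract, where a teacher model labels task inputs, might look roughly like the sketch below. It is not the released Jatmo code: `build_task_dataset`, the `teacher` callable, and the record layout are hypothetical. Storing only the raw input and the teacher output, without the task prompt, is also an assumption.

```python
from typing import Callable, Dict, List


def build_task_dataset(
    task_prompt: str,
    inputs: List[str],
    teacher: Callable[[str], str],
) -> List[Dict[str, str]]:
    """Pair each task input with the output of an instruction-tuned teacher.

    `teacher` is a hypothetical callable wrapping whatever model API is used.
    """
    dataset = []
    for text in inputs:
        # The teacher sees the task prompt; the stored record does not
        # (assumption: the base model is fine-tuned on input/output only).
        output = teacher(f"{task_prompt}\n\n{text}")
        dataset.append({"input": text, "output": output})
    return dataset
```

The resulting records would then be used to fine-tune a base (non-instruction-tuned) model with any standard fine-tuning pipeline.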
-
Can LLMs Follow Simple Rules?
Authors:
Norman Mu,
Sarah Chen,
Zifan Wang,
Sizhe Chen,
David Karamardian,
Lulwa Aljeraisy,
Basel Alomair,
Dan Hendrycks,
David Wagner
Abstract:
As Large Language Models (LLMs) are deployed with increasing real-world responsibilities, it is important to be able to specify and constrain the behavior of these systems in a reliable manner. Model developers may wish to set explicit rules for the model, such as "do not generate abusive content", but these may be circumvented by jailbreaking techniques. Existing evaluations of adversarial attacks and defenses on LLMs generally require either expensive manual review or unreliable heuristic checks. To address this issue, we propose Rule-following Language Evaluation Scenarios (RuLES), a programmatic framework for measuring rule-following ability in LLMs. RuLES consists of 14 simple text scenarios in which the model is instructed to obey various rules while interacting with the user. Each scenario has a programmatic evaluation function to determine whether the model has broken any rules in a conversation. Our evaluations of proprietary and open models show that almost all current models struggle to follow scenario rules, even on straightforward test cases. We also demonstrate that simple optimization attacks suffice to significantly increase failure rates on test cases. We conclude by exploring two potential avenues for improvement: test-time steering and supervised fine-tuning.
Submitted 8 March, 2024; v1 submitted 6 November, 2023;
originally announced November 2023.
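A programmatic evaluation function of the kind the abstract describes can be as simple as a string check over the conversation. The sketch below is illustrative only and does not reproduce any actual RuLES scenario: the `Message` type, the secrecy rule, and the function name are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Message:
    role: str     # "user" or "assistant"
    content: str


def broke_secrecy_rule(conversation: List[Message], secret: str) -> bool:
    """Return True if any assistant message reveals the secret string."""
    return any(
        m.role == "assistant" and secret in m.content for m in conversation
    )
```

Because the check is deterministic, it needs neither manual review nor a heuristic judge model to decide whether the rule was broken.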
-
Secret-Key Agreement with Public Discussion subject to an Amplitude Constraint
Authors:
Marwen Zorgui,
Zouheir Rezki,
Basel Alomair,
Mohamed-Slim Alouini
Abstract:
This paper considers the problem of secret-key agreement with public discussion subject to a peak power constraint $A$ on the channel input. The optimal input distribution is proved to be discrete with finite support. The result is obtained by first transforming the secret-key channel model into an equivalent Gaussian wiretap channel with better noise statistics at the legitimate receiver and then using the fact that the optimal distribution of the Gaussian wiretap channel is discrete. To overcome the computationally heavy search for the optimal discrete distribution, several suboptimal schemes are proposed and shown numerically to perform close to the capacity. Moreover, lower and upper bounds for the secret-key capacity are provided and used to prove that the secret-key capacity converges, for asymptotically high values of $A$, to the secret-key capacity with an average power constraint $A^2$. Finally, when the amplitude constraint $A$ is small ($A \to 0$), the secret-key capacity is proved to be asymptotically equal to the capacity of the legitimate user with an amplitude constraint $A$ and no secrecy constraint.
Submitted 31 March, 2016;
originally announced April 2016.
-
Adaptive Mitigation of Multi-Virus Propagation: A Passivity-Based Approach
Authors:
Phillip Lee,
Andrew Clark,
Basel Alomair,
Linda Bushnell,
Radha Poovendran
Abstract:
Malware propagation poses a growing threat to networked systems such as computer networks and cyber-physical systems. Current approaches to defending against malware propagation are based on patching or filtering susceptible nodes at a fixed rate. When the propagation dynamics are unknown or uncertain, however, the static rate that is chosen may be either insufficient to remove all viruses or too high, incurring additional performance cost. In this paper, we formulate adaptive strategies for mitigating multiple malware epidemics when the propagation rate is unknown, using patching and filtering-based defense mechanisms. To identify conditions for ensuring that all viruses are asymptotically removed, we show that the malware propagation, patching, and filtering processes can be modeled as coupled passive dynamical systems. We prove that the patching rate required to remove all viruses is bounded above by the passivity index of the coupled system, and formulate the problem of selecting the minimum-cost mitigation strategy. Our results are evaluated through a numerical study.
Submitted 20 September, 2016; v1 submitted 14 March, 2016;
originally announced March 2016.