-
CyberSecEval 2: A Wide-Ranging Cybersecurity Evaluation Suite for Large Language Models
Authors:
Manish Bhatt,
Sahana Chennabasappa,
Yue Li,
Cyrus Nikolaidis,
Daniel Song,
Shengye Wan,
Faizan Ahmad,
Cornelius Aschermann,
Yaohui Chen,
Dhaval Kapil,
David Molnar,
Spencer Whitman,
Joshua Saxe
Abstract:
Large language models (LLMs) introduce new security risks, but there are few comprehensive evaluation suites to measure and reduce these risks. We present BenchmarkName, a novel benchmark to quantify LLM security risks and capabilities. We introduce two new areas for testing: prompt injection and code interpreter abuse. We evaluated multiple state-of-the-art (SOTA) LLMs, including GPT-4, Mistral,…
▽ More
Large language models (LLMs) introduce new security risks, but there are few comprehensive evaluation suites to measure and reduce these risks. We present BenchmarkName, a novel benchmark to quantify LLM security risks and capabilities. We introduce two new areas for testing: prompt injection and code interpreter abuse. We evaluated multiple state-of-the-art (SOTA) LLMs, including GPT-4, Mistral, Meta Llama 3 70B-Instruct, and Code Llama. Our results show that conditioning away risk of attack remains an unsolved problem; for example, all tested models showed between 26% and 41% successful prompt injection tests. We further introduce the safety-utility tradeoff: conditioning an LLM to reject unsafe prompts can cause the LLM to falsely reject answering benign prompts, which lowers utility. We propose quantifying this tradeoff using False Refusal Rate (FRR). As an illustration, we introduce a novel test set to quantify FRR for cyberattack helpfulness risk. We find many LLMs able to successfully comply with "borderline" benign requests while still rejecting most unsafe requests. Finally, we quantify the utility of LLMs for automating a core cybersecurity task, that of exploiting software vulnerabilities. This is important because the offensive capabilities of LLMs are of intense interest; we quantify this by creating novel test sets for four representative problems. We find that models with coding capabilities perform better than those without, but that further work is needed for LLMs to become proficient at exploit generation. Our code is open source and can be used to evaluate other LLMs.
△ Less
Submitted 19 April, 2024;
originally announced April 2024.
-
Purple Llama CyberSecEval: A Secure Coding Benchmark for Language Models
Authors:
Manish Bhatt,
Sahana Chennabasappa,
Cyrus Nikolaidis,
Shengye Wan,
Ivan Evtimov,
Dominik Gabi,
Daniel Song,
Faizan Ahmad,
Cornelius Aschermann,
Lorenzo Fontana,
Sasha Frolov,
Ravi Prakash Giri,
Dhaval Kapil,
Yiannis Kozyrakis,
David LeBlanc,
James Milazzo,
Aleksandar Straumann,
Gabriel Synnaeve,
Varun Vontimitta,
Spencer Whitman,
Joshua Saxe
Abstract:
This paper presents CyberSecEval, a comprehensive benchmark developed to help bolster the cybersecurity of Large Language Models (LLMs) employed as coding assistants. As what we believe to be the most extensive unified cybersecurity safety benchmark to date, CyberSecEval provides a thorough evaluation of LLMs in two crucial security domains: their propensity to generate insecure code and their lev…
▽ More
This paper presents CyberSecEval, a comprehensive benchmark developed to help bolster the cybersecurity of Large Language Models (LLMs) employed as coding assistants. As what we believe to be the most extensive unified cybersecurity safety benchmark to date, CyberSecEval provides a thorough evaluation of LLMs in two crucial security domains: their propensity to generate insecure code and their level of compliance when asked to assist in cyberattacks. Through a case study involving seven models from the Llama 2, Code Llama, and OpenAI GPT large language model families, CyberSecEval effectively pinpointed key cybersecurity risks. More importantly, it offered practical insights for refining these models. A significant observation from the study was the tendency of more advanced models to suggest insecure code, highlighting the critical need for integrating security considerations in the development of sophisticated LLMs. CyberSecEval, with its automated test case generation and evaluation pipeline covers a broad scope and equips LLM designers and researchers with a tool to broadly measure and enhance the cybersecurity safety properties of LLMs, contributing to the development of more secure AI systems.
△ Less
Submitted 7 December, 2023;
originally announced December 2023.
-
Federated Learning for Early Dropout Prediction on Healthy Ageing Applications
Authors:
Christos Chrysanthos Nikolaidis,
Vasileios Perifanis,
Nikolaos Pavlidis,
Pavlos S. Efraimidis
Abstract:
The provision of social care applications is crucial for elderly people to improve their quality of life and enables operators to provide early interventions. Accurate predictions of user dropouts in healthy ageing applications are essential since they are directly related to individual health statuses. Machine Learning (ML) algorithms have enabled highly accurate predictions, outperforming tradit…
▽ More
The provision of social care applications is crucial for elderly people to improve their quality of life and enables operators to provide early interventions. Accurate predictions of user dropouts in healthy ageing applications are essential since they are directly related to individual health statuses. Machine Learning (ML) algorithms have enabled highly accurate predictions, outperforming traditional statistical methods that struggle to cope with individual patterns. However, ML requires a substantial amount of data for training, which is challenging due to the presence of personal identifiable information (PII) and the fragmentation posed by regulations. In this paper, we present a federated machine learning (FML) approach that minimizes privacy concerns and enables distributed training, without transferring individual data. We employ collaborative training by considering individuals and organizations under FML, which models both cross-device and cross-silo learning scenarios. Our approach is evaluated on a real-world dataset with non-independent and identically distributed (non-iid) data among clients, class imbalance and label ambiguity. Our results show that data selection and class imbalance handling techniques significantly improve the predictive accuracy of models trained under FML, demonstrating comparable or superior predictive performance than traditional ML models.
△ Less
Submitted 8 September, 2023;
originally announced September 2023.
-
Behind Closed Doors: Process-Level Rootkit Attacks in Cyber-Physical Microgrid Systems
Authors:
Suman Rath,
Ioannis Zografopoulos,
Pedro P. Vergara,
Vassilis C. Nikolaidis,
Charalambos Konstantinou
Abstract:
Embedded controllers, sensors, actuators, advanced metering infrastructure, etc. are cornerstone components of cyber-physical energy systems such as microgrids (MGs). Harnessing their monitoring and control functionalities, sophisticated schemes enhancing MG stability can be deployed. However, the deployment of `smart' assets increases the threat surface. Power systems possess mechanisms capable o…
▽ More
Embedded controllers, sensors, actuators, advanced metering infrastructure, etc. are cornerstone components of cyber-physical energy systems such as microgrids (MGs). Harnessing their monitoring and control functionalities, sophisticated schemes enhancing MG stability can be deployed. However, the deployment of `smart' assets increases the threat surface. Power systems possess mechanisms capable of detecting abnormal operations. Furthermore, the lack of sophistication in attack strategies can render them detectable since they blindly violate power system semantics. On the other hand, the recent increase of process-aware rootkits that can attain persistence and compromise operations in undetectable ways requires special attention. In this work, we investigate the steps followed by stealthy rootkits at the process level of control systems pre- and post-compromise. We investigate the rootkits' precompromise stage involving the deployment to multiple system locations and aggregation of system-specific information to build a neural network-based virtual data-driven model (VDDM) of the system. Then, during the weaponization phase, we demonstrate how the VDDM measurement predictions are paramount, first to orchestrate crippling attacks from multiple system standpoints, maximizing the impact, and second, impede detection blinding system operator situational awareness.
△ Less
Submitted 20 February, 2022;
originally announced February 2022.