Search | arXiv e-print repository

NYU CTF Dataset: A Scalable Open-Source Benchmark Dataset for Evaluating LLMs in Offensive Security

Authors: Minghao Shao, Sofija Jancheska, Meet Udeshi, Brendan Dolan-Gavitt, Haoran Xi, Kimberly Milner, Boyuan Chen, Max Yin, Siddharth Garg, Prashanth Krishnamurthy, Farshad Khorrami, Ramesh Karri, Muhammad Shafique

Abstract: Large Language Models (LLMs) are being deployed across various domains today. However, their capacity to solve Capture the Flag (CTF) challenges in cybersecurity has not been thoroughly evaluated. To address this, we develop a novel method to assess LLMs in solving CTF challenges by creating a scalable, open-source benchmark database specifically designed for these applications. This database incl… ▽ More Large Language Models (LLMs) are being deployed across various domains today. However, their capacity to solve Capture the Flag (CTF) challenges in cybersecurity has not been thoroughly evaluated. To address this, we develop a novel method to assess LLMs in solving CTF challenges by creating a scalable, open-source benchmark database specifically designed for these applications. This database includes metadata for LLM testing and adaptive learning, compiling a diverse range of CTF challenges from popular competitions. Utilizing the advanced function calling capabilities of LLMs, we build a fully automated system with an enhanced workflow and support for external tool calls. Our benchmark dataset and automated framework allow us to evaluate the performance of five LLMs, encompassing both black-box and open-source models. This work lays the foundation for future research into improving the efficiency of LLMs in interactive cybersecurity tasks and automated task planning. By providing a specialized dataset, our project offers an ideal platform for develo**, testing, and refining LLM-based approaches to vulnerability detection and resolution. Evaluating LLMs on these challenges and comparing with human performance yields insights into their potential for AI-driven cybersecurity solutions to perform real-world threat management. We make our dataset open source to public https://github.com/NYU-LLM-CTF/LLM_CTF_Database along with our playground automated framework https://github.com/NYU-LLM-CTF/llm_ctf_automation. △ Less

Submitted 8 June, 2024; originally announced June 2024.

arXiv:1308.5788 [pdf, other]

doi 10.4086/toc.2015.v011a003

Quantum interactive proofs and the complexity of separability testing

Authors: Gus Gutoski, Patrick Hayden, Kevin Milner, Mark M. Wilde

Abstract: We identify a formal connection between physical problems related to the detection of separable (unentangled) quantum states and complexity classes in theoretical computer science. In particular, we show that to nearly every quantum interactive proof complexity class (including BQP, QMA, QMA(2), and QSZK), there corresponds a natural separability testing problem that is complete for that class. Of… ▽ More We identify a formal connection between physical problems related to the detection of separable (unentangled) quantum states and complexity classes in theoretical computer science. In particular, we show that to nearly every quantum interactive proof complexity class (including BQP, QMA, QMA(2), and QSZK), there corresponds a natural separability testing problem that is complete for that class. Of particular interest is the fact that the problem of determining whether an isometry can be made to produce a separable state is either QMA-complete or QMA(2)-complete, depending upon whether the distance between quantum states is measured by the one-way LOCC norm or the trace norm. We obtain strong hardness results by proving that for each n-qubit maximally entangled state there exists a fixed one-way LOCC measurement that distinguishes it from any separable state with error probability that decays exponentially in n. △ Less

Submitted 30 September, 2014; v1 submitted 27 August, 2013; originally announced August 2013.

Comments: v2: 43 pages, 5 figures, completely rewritten and in Theory of Computing (ToC) journal format

Journal ref: Theory of Computing vol. 11, article 3, pages 59-103, March 2015

arXiv:1211.6120 [pdf, other]

doi 10.1109/CCC.2013.24

Two-message quantum interactive proofs and the quantum separability problem

Authors: Patrick Hayden, Kevin Milner, Mark M. Wilde

Abstract: Suppose that a polynomial-time mixed-state quantum circuit, described as a sequence of local unitary interactions followed by a partial trace, generates a quantum state shared between two parties. One might then wonder, does this quantum circuit produce a state that is separable or entangled? Here, we give evidence that it is computationally hard to decide the answer to this question, even if one… ▽ More Suppose that a polynomial-time mixed-state quantum circuit, described as a sequence of local unitary interactions followed by a partial trace, generates a quantum state shared between two parties. One might then wonder, does this quantum circuit produce a state that is separable or entangled? Here, we give evidence that it is computationally hard to decide the answer to this question, even if one has access to the power of quantum computation. We begin by exhibiting a two-message quantum interactive proof system that can decide the answer to a promise version of the question. We then prove that the promise problem is hard for the class of promise problems with "quantum statistical zero knowledge" (QSZK) proof systems by demonstrating a polynomial-time Karp reduction from the QSZK-complete promise problem "quantum state distinguishability" to our quantum separability problem. By exploiting Knill's efficient encoding of a matrix description of a state into a description of a circuit to generate the state, we can show that our promise problem is NP-hard with respect to Cook reductions. Thus, the quantum separability problem (as phrased above) constitutes the first nontrivial promise problem decidable by a two-message quantum interactive proof system while being hard for both NP and QSZK. We also consider a variant of the problem, in which a given polynomial-time mixed-state quantum circuit accepts a quantum state as input, and the question is to decide if there is an input to this circuit which makes its output separable across some bipartite cut. We prove that this problem is a complete promise problem for the class QIP of problems decidable by quantum interactive proof systems. Finally, we show that a two-message quantum interactive proof system can also decide a multipartite generalization of the quantum separability problem. △ Less

Submitted 6 September, 2013; v1 submitted 26 November, 2012; originally announced November 2012.

Comments: 34 pages, 6 figures; v2: technical improvements and new result for the multipartite quantum separability problem; v3: minor changes to address referee comments, accepted for presentation at the 2013 IEEE Conference on Computational Complexity; v4: changed problem names; v5: updated references and added a paragraph to the conclusion to connect with prior work on separability testing

Journal ref: Proceedings of the 28th IEEE Conference on Computational Complexity, pages 156-167, Palo Alto, California, June 2013

Showing 1–3 of 3 results for author: Milner, K