Skip to main content

Showing 1–3 of 3 results for author: Chennabasappa, S

Searching in archive cs. Search in all archives.
.
  1. arXiv:2405.02435  [pdf, other

    cs.CR cs.SE

    Bridging the Gap: A Study of AI-based Vulnerability Management between Industry and Academia

    Authors: Shengye Wan, Joshua Saxe, Craig Gomes, Sahana Chennabasappa, Avilash Rath, Kun Sun, Xinda Wang

    Abstract: Recent research advances in Artificial Intelligence (AI) have yielded promising results for automated software vulnerability management. AI-based models are reported to greatly outperform traditional static analysis tools, indicating a substantial workload relief for security engineers. However, the industry remains very cautious and selective about integrating AI-based techniques into their secur… ▽ More

    Submitted 3 May, 2024; originally announced May 2024.

    Comments: Accepted by IEEE/IFIP International Conference on Dependable Systems and Networks, Industry Track, 2024

  2. arXiv:2404.13161  [pdf, other

    cs.CR cs.LG

    CyberSecEval 2: A Wide-Ranging Cybersecurity Evaluation Suite for Large Language Models

    Authors: Manish Bhatt, Sahana Chennabasappa, Yue Li, Cyrus Nikolaidis, Daniel Song, Shengye Wan, Faizan Ahmad, Cornelius Aschermann, Yaohui Chen, Dhaval Kapil, David Molnar, Spencer Whitman, Joshua Saxe

    Abstract: Large language models (LLMs) introduce new security risks, but there are few comprehensive evaluation suites to measure and reduce these risks. We present BenchmarkName, a novel benchmark to quantify LLM security risks and capabilities. We introduce two new areas for testing: prompt injection and code interpreter abuse. We evaluated multiple state-of-the-art (SOTA) LLMs, including GPT-4, Mistral,… ▽ More

    Submitted 19 April, 2024; originally announced April 2024.

  3. arXiv:2312.04724  [pdf, other

    cs.CR cs.LG

    Purple Llama CyberSecEval: A Secure Coding Benchmark for Language Models

    Authors: Manish Bhatt, Sahana Chennabasappa, Cyrus Nikolaidis, Shengye Wan, Ivan Evtimov, Dominik Gabi, Daniel Song, Faizan Ahmad, Cornelius Aschermann, Lorenzo Fontana, Sasha Frolov, Ravi Prakash Giri, Dhaval Kapil, Yiannis Kozyrakis, David LeBlanc, James Milazzo, Aleksandar Straumann, Gabriel Synnaeve, Varun Vontimitta, Spencer Whitman, Joshua Saxe

    Abstract: This paper presents CyberSecEval, a comprehensive benchmark developed to help bolster the cybersecurity of Large Language Models (LLMs) employed as coding assistants. As what we believe to be the most extensive unified cybersecurity safety benchmark to date, CyberSecEval provides a thorough evaluation of LLMs in two crucial security domains: their propensity to generate insecure code and their lev… ▽ More

    Submitted 7 December, 2023; originally announced December 2023.