Showing 1–12 of 12 results for author: Saxe, J

  1. arXiv:2405.02435  [pdf, other]

    cs.CR cs.SE

    Bridging the Gap: A Study of AI-based Vulnerability Management between Industry and Academia

    Authors: Shengye Wan, Joshua Saxe, Craig Gomes, Sahana Chennabasappa, Avilash Rath, Kun Sun, Xinda Wang

    Abstract: Recent research advances in Artificial Intelligence (AI) have yielded promising results for automated software vulnerability management. AI-based models are reported to greatly outperform traditional static analysis tools, indicating a substantial workload relief for security engineers. However, the industry remains very cautious and selective about integrating AI-based techniques into their secur…

    Submitted 3 May, 2024; originally announced May 2024.

    Comments: Accepted by IEEE/IFIP International Conference on Dependable Systems and Networks, Industry Track, 2024

  2. arXiv:2404.13161  [pdf, other]

    cs.CR cs.LG

    CyberSecEval 2: A Wide-Ranging Cybersecurity Evaluation Suite for Large Language Models

    Authors: Manish Bhatt, Sahana Chennabasappa, Yue Li, Cyrus Nikolaidis, Daniel Song, Shengye Wan, Faizan Ahmad, Cornelius Aschermann, Yaohui Chen, Dhaval Kapil, David Molnar, Spencer Whitman, Joshua Saxe

    Abstract: Large language models (LLMs) introduce new security risks, but there are few comprehensive evaluation suites to measure and reduce these risks. We present CyberSecEval 2, a novel benchmark to quantify LLM security risks and capabilities. We introduce two new areas for testing: prompt injection and code interpreter abuse. We evaluated multiple state-of-the-art (SOTA) LLMs, including GPT-4, Mistral,…

    Submitted 19 April, 2024; originally announced April 2024.

  3. arXiv:2312.04724  [pdf, other]

    cs.CR cs.LG

    Purple Llama CyberSecEval: A Secure Coding Benchmark for Language Models

    Authors: Manish Bhatt, Sahana Chennabasappa, Cyrus Nikolaidis, Shengye Wan, Ivan Evtimov, Dominik Gabi, Daniel Song, Faizan Ahmad, Cornelius Aschermann, Lorenzo Fontana, Sasha Frolov, Ravi Prakash Giri, Dhaval Kapil, Yiannis Kozyrakis, David LeBlanc, James Milazzo, Aleksandar Straumann, Gabriel Synnaeve, Varun Vontimitta, Spencer Whitman, Joshua Saxe

    Abstract: This paper presents CyberSecEval, a comprehensive benchmark developed to help bolster the cybersecurity of Large Language Models (LLMs) employed as coding assistants. As what we believe to be the most extensive unified cybersecurity safety benchmark to date, CyberSecEval provides a thorough evaluation of LLMs in two crucial security domains: their propensity to generate insecure code and their lev…

    Submitted 7 December, 2023; originally announced December 2023.

  4. arXiv:2010.03484  [pdf, other]

    cs.CR

    CATBERT: Context-Aware Tiny BERT for Detecting Social Engineering Emails

    Authors: Younghoo Lee, Joshua Saxe, Richard Harang

    Abstract: Targeted phishing emails are on the rise and facilitate the theft of billions of dollars from organizations a year. While malicious signals from attached files or malicious URLs in emails can be detected by conventional malware signatures or machine learning technologies, it is challenging to identify hand-crafted social engineering emails which don't contain any malicious code and don't share wor…

    Submitted 7 October, 2020; originally announced October 2020.

  5. arXiv:1906.01032  [pdf, other]

    cs.LG cs.CL cs.SE stat.ML

    A Language-Agnostic Model for Semantic Source Code Labeling

    Authors: Ben Gelman, Bryan Hoyle, Jessica Moore, Joshua Saxe, David Slater

    Abstract: Code search and comprehension have become more difficult in recent years due to the rapid expansion of available source code. Current tools lack a way to label arbitrary code at scale while maintaining up-to-date representations of new programming languages, libraries, and functionalities. Comprehensive labeling of source code enables users to search for documents of interest and obtain a high-lev…

    Submitted 3 June, 2019; originally announced June 2019.

    Comments: MASES 2018 Publication

  6. arXiv:1804.08162  [pdf, other]

    cs.CR

    MEADE: Towards a Malicious Email Attachment Detection Engine

    Authors: Ethan M. Rudd, Richard Harang, Joshua Saxe

    Abstract: Malicious email attachments are a growing delivery vector for malware. While machine learning has been successfully applied to portable executable (PE) malware detection, we ask, can we extend similar approaches to detect malware across heterogeneous file types commonly found in email attachments? In this paper, we explore the feasibility of applying machine learning as a static countermeasure to…

    Submitted 22 April, 2018; originally announced April 2018.

    Comments: Pre-print of a manuscript submitted to IEEE Symposium on Technologies for Homeland Security (HST)

  7. arXiv:1804.05020  [pdf, other]

    cs.CR cs.LG stat.ML

    A Deep Learning Approach to Fast, Format-Agnostic Detection of Malicious Web Content

    Authors: Joshua Saxe, Richard Harang, Cody Wild, Hillary Sanders

    Abstract: Malicious web content is a serious problem on the Internet today. In this paper we propose a deep learning approach to detecting malevolent web pages. While past work on web content detection has relied on syntactic parsing or on emulation of HTML and Javascript to extract features, our approach operates directly on a language-agnostic stream of tokens extracted directly from static HTML files wit…

    Submitted 13 April, 2018; originally announced April 2018.

  8. arXiv:1702.08568  [pdf, ps, other]

    cs.CR cs.LG

    eXpose: A Character-Level Convolutional Neural Network with Embeddings For Detecting Malicious URLs, File Paths and Registry Keys

    Authors: Joshua Saxe, Konstantin Berlin

    Abstract: For years security machine learning research has promised to obviate the need for signature based detection by automatically learning to detect indicators of attack. Unfortunately, this vision hasn't come to fruition: in fact, developing and maintaining today's security machine learning systems can require engineering resources that are comparable to that of signature-based detection systems, due…

    Submitted 27 February, 2017; originally announced February 2017.

  9. arXiv:1608.00669  [pdf, ps, other]

    cs.CR cs.PF

    Improving Zero-Day Malware Testing Methodology Using Statistically Significant Time-Lagged Test Samples

    Authors: Konstantin Berlin, Joshua Saxe

    Abstract: Enterprise networks are in constant danger of being breached by cyber-attackers, but making the decision about what security tools to deploy to mitigate this risk requires carefully designed evaluation of security products. One of the most important metrics for a protection product is how well it is able to stop malware, specifically on "zero"-day malware that has not been seen by the security com…

    Submitted 1 August, 2016; originally announced August 2016.

  10. arXiv:1605.08642  [pdf]

    cs.CR

    CrowdSource: Automated Inference of High Level Malware Functionality from Low-Level Symbols Using a Crowd Trained Machine Learning Model

    Authors: Joshua Saxe, Rafael Turner, Kristina Blokhin

    Abstract: In this paper we introduce CrowdSource, a statistical natural language processing system designed to make rapid inferences about malware functionality based on printable character strings extracted from malware binaries. CrowdSource "learns" a mapping between low-level language and high-level software functionality by leveraging millions of web technical documents from StackExchange, a popular net…

    Submitted 27 May, 2016; originally announced May 2016.

  11. arXiv:1508.03096  [pdf, ps, other]

    cs.CR

    Deep Neural Network Based Malware Detection Using Two Dimensional Binary Program Features

    Authors: Joshua Saxe, Konstantin Berlin

    Abstract: Malware remains a serious problem for corporations, government agencies, and individuals, as attackers continue to use it as a tool to effect frequent and costly network intrusions. Machine learning holds the promise of automating the work required to detect newly discovered malware families, and could potentially learn generalizations about malware and benign software that support the detection o…

    Submitted 3 September, 2015; v1 submitted 12 August, 2015; originally announced August 2015.

  12. arXiv:1506.04200  [pdf, ps, other]

    cs.CR

    Malicious Behavior Detection using Windows Audit Logs

    Authors: Konstantin Berlin, David Slater, Joshua Saxe

    Abstract: As antivirus and network intrusion detection systems have increasingly proven insufficient to detect advanced threats, large security operations centers have moved to deploy endpoint-based sensors that provide deeper visibility into low-level events across their enterprises. Unfortunately, for many organizations in government and industry, the installation, maintenance, and resource requirements o…

    Submitted 25 August, 2015; v1 submitted 12 June, 2015; originally announced June 2015.