
Showing 1–27 of 27 results for author: Pearce, H

  1. arXiv:2405.02326  [pdf, other]

    cs.AR cs.AI cs.CL cs.LG cs.PL

    Evaluating LLMs for Hardware Design and Test

    Authors: Jason Blocklove, Siddharth Garg, Ramesh Karri, Hammond Pearce

    Abstract: Large Language Models (LLMs) have demonstrated capabilities for producing code in Hardware Description Languages (HDLs). However, most of the focus remains on their abilities to write functional code, not test code. The hardware design process consists of both design and test, and so eschewing validation and verification leaves considerable potential benefit unexplored, given that a design and tes…

    Submitted 23 April, 2024; originally announced May 2024.

  2. arXiv:2404.15446  [pdf, other]

    cs.CR eess.SY

    OffRAMPS: An FPGA-based Intermediary for Analysis and Modification of Additive Manufacturing Control Systems

    Authors: Jason Blocklove, Md Raz, Prithwish Basu Roy, Hammond Pearce, Prashanth Krishnamurthy, Farshad Khorrami, Ramesh Karri

    Abstract: Cybersecurity threats in Additive Manufacturing (AM) are an increasing concern as AM adoption continues to grow. AM is now being used for parts in the aerospace, transportation, and medical domains. Threat vectors which allow for part compromise are particularly concerning, as any failure in these domains would have life-threatening consequences. A major challenge to investigation of AM part-compr…

    Submitted 23 April, 2024; originally announced April 2024.

  3. arXiv:2404.07235  [pdf, other]

    cs.AR cs.AI cs.PL cs.SE

    Explaining EDA synthesis errors with LLMs

    Authors: Siyu Qiu, Benjamin Tan, Hammond Pearce

    Abstract: Training new engineers in digital design is a challenge, particularly when it comes to teaching the complex electronic design automation (EDA) tooling used in this domain. Learners will typically deploy designs in the Verilog and VHDL hardware description languages to Field Programmable Gate Arrays (FPGAs) from Altera (Intel) and Xilinx (AMD) via proprietary closed-source toolchains (Quartus Prime…

    Submitted 7 April, 2024; originally announced April 2024.

    Comments: 6 pages, 6 figures

  4. arXiv:2312.12575  [pdf, other]

    cs.CR

    LLMs Cannot Reliably Identify and Reason About Security Vulnerabilities (Yet?): A Comprehensive Evaluation, Framework, and Benchmarks

    Authors: Saad Ullah, Mingji Han, Saurabh Pujar, Hammond Pearce, Ayse Coskun, Gianluca Stringhini

    Abstract: Large Language Models (LLMs) have been suggested for use in automated vulnerability repair, but benchmarks showing they can consistently identify security-related bugs are lacking. We thus develop SecLLMHolmes, a fully automated evaluation framework that performs the most detailed investigation to date on whether LLMs can reliably identify and reason about security-related bugs. We construct a set…

    Submitted 13 April, 2024; v1 submitted 19 December, 2023; originally announced December 2023.

    Comments: Accepted for publication in IEEE Symposium on Security and Privacy 2024

  5. arXiv:2311.04887  [pdf, other]

    cs.PL

    AutoChip: Automating HDL Generation Using LLM Feedback

    Authors: Shailja Thakur, Jason Blocklove, Hammond Pearce, Benjamin Tan, Siddharth Garg, Ramesh Karri

    Abstract: Traditionally, designs are written in Verilog hardware description language (HDL) and debugged by hardware engineers. While this approach is effective, it is time-consuming and error-prone for complex designs. Large language models (LLMs) are promising in automating HDL code generation. LLMs are trained on massive datasets of text and code, and they can learn to generate code that compiles and is…

    Submitted 4 June, 2024; v1 submitted 8 November, 2023; originally announced November 2023.

  6. arXiv:2310.10560  [pdf, other]

    cs.LG cs.AI cs.AR cs.PL

    Towards the Imagenets of ML4EDA

    Authors: Animesh Basak Chowdhury, Shailja Thakur, Hammond Pearce, Ramesh Karri, Siddharth Garg

    Abstract: Despite the growing interest in ML-guided EDA tools from RTL to GDSII, there are no standard datasets or prototypical learning tasks defined for the EDA problem domain. Experience from the computer vision community suggests that such datasets are crucial to spur further progress in ML for EDA. Here we describe our experience curating two large-scale, high-quality datasets for Verilog code generati…

    Submitted 16 October, 2023; originally announced October 2023.

    Comments: Invited paper, ICCAD 2023

    Report number: October 16 Update

    Journal ref: ICCAD 2023

  7. arXiv:2310.05135  [pdf, other]

    cs.CL cs.AI cs.LG

    Are Emily and Greg Still More Employable than Lakisha and Jamal? Investigating Algorithmic Hiring Bias in the Era of ChatGPT

    Authors: Akshaj Kumar Veldanda, Fabian Grob, Shailja Thakur, Hammond Pearce, Benjamin Tan, Ramesh Karri, Siddharth Garg

    Abstract: Large Language Models (LLMs) such as GPT-3.5, Bard, and Claude exhibit applicability across numerous tasks. One domain of interest is their use in algorithmic hiring, specifically in matching resumes with job categories. Yet, this introduces issues of bias on protected attributes like gender, race and maternity status. The seminal work of Bertrand & Mullainathan (2003) set the gold-standard for id…

    Submitted 8 October, 2023; originally announced October 2023.

  8. arXiv:2308.11873  [pdf, other]

    cs.SE cs.LG cs.PL

    Dcc --help: Generating Context-Aware Compiler Error Explanations with Large Language Models

    Authors: Andrew Taylor, Alexandra Vassar, Jake Renzella, Hammond Pearce

    Abstract: In the challenging field of introductory programming, high enrollments and failure rates drive us to explore tools and systems to enhance student outcomes, especially automated tools that scale to large cohorts. This paper presents and evaluates the dcc --help tool, an integration of a Large Language Model (LLM) into the Debugging C Compiler (DCC) to generate unique, novice-focused explanations ta…

    Submitted 15 October, 2023; v1 submitted 22 August, 2023; originally announced August 2023.

    Comments: 7 pages, 2 figures. Accepted in SIGCSE'24

  9. arXiv:2308.00708  [pdf, other]

    cs.PL cs.LG cs.SE

    VeriGen: A Large Language Model for Verilog Code Generation

    Authors: Shailja Thakur, Baleegh Ahmad, Hammond Pearce, Benjamin Tan, Brendan Dolan-Gavitt, Ramesh Karri, Siddharth Garg

    Abstract: In this study, we explore the capability of Large Language Models (LLMs) to automate hardware design by generating high-quality Verilog code, a common language for designing and modeling digital systems. We fine-tune pre-existing LLMs on Verilog datasets compiled from GitHub and Verilog textbooks. We evaluate the functional correctness of the generated Verilog code using a specially designed test…

    Submitted 27 July, 2023; originally announced August 2023.

    Comments: arXiv admin note: text overlap with arXiv:2212.11140

  10. arXiv:2306.14027  [pdf, other]

    cs.CR cs.AI

    LLM-assisted Generation of Hardware Assertions

    Authors: Rahul Kande, Hammond Pearce, Benjamin Tan, Brendan Dolan-Gavitt, Shailja Thakur, Ramesh Karri, Jeyavijayan Rajendran

    Abstract: The security of computer systems typically relies on a hardware root of trust. As vulnerabilities in hardware can have severe implications on a system, there is a need for techniques to support security verification activities. Assertion-based verification is a popular verification technique that involves capturing design intent in a set of assertions that can be used in formal verification or tes…

    Submitted 24 June, 2023; originally announced June 2023.

  11. arXiv:2306.12643  [pdf, other]

    cs.CR cs.AI cs.SE

    FLAG: Finding Line Anomalies (in code) with Generative AI

    Authors: Baleegh Ahmad, Benjamin Tan, Ramesh Karri, Hammond Pearce

    Abstract: Code contains security and functional bugs. The process of identifying and localizing them is difficult and relies on human labor. In this work, we present a novel approach (FLAG) to assist human debuggers. FLAG is based on the lexical capabilities of generative AI, specifically, Large Language Models (LLMs). Here, we input a code file then extract and regenerate each line within that file for sel…

    Submitted 21 June, 2023; originally announced June 2023.

  12. Chip-Chat: Challenges and Opportunities in Conversational Hardware Design

    Authors: Jason Blocklove, Siddharth Garg, Ramesh Karri, Hammond Pearce

    Abstract: Modern hardware design starts with specifications provided in natural language. These are then translated by hardware engineers into appropriate Hardware Description Languages (HDLs) such as Verilog before synthesizing circuit elements. Automating this translation could reduce sources of human error from the engineering process. But, it is only recently that artificial intelligence (AI) has demons…

    Submitted 14 November, 2023; v1 submitted 22 May, 2023; originally announced May 2023.

    Comments: 6 pages, 8 figures. Accepted in 2023 ACM/IEEE 5th Workshop on Machine Learning for CAD (MLCAD)

  13. arXiv:2305.06902  [pdf, other]

    cs.CR

    REMaQE: Reverse Engineering Math Equations from Executables

    Authors: Meet Udeshi, Prashanth Krishnamurthy, Hammond Pearce, Ramesh Karri, Farshad Khorrami

    Abstract: Cybersecurity attacks on embedded devices for industrial control systems and cyber-physical systems may cause catastrophic physical damage as well as economic loss. This could be achieved by infecting device binaries with malware that modifies the physical characteristics of the system operation. Mitigating such attacks benefits from reverse engineering tools that recover sufficient semantic knowl…

    Submitted 11 April, 2024; v1 submitted 11 May, 2023; originally announced May 2023.

    ACM Class: C.3; D.2.5

  14. Fixing Hardware Security Bugs with Large Language Models

    Authors: Baleegh Ahmad, Shailja Thakur, Benjamin Tan, Ramesh Karri, Hammond Pearce

    Abstract: Novel AI-based code-writing Large Language Models (LLMs) such as OpenAI's Codex have demonstrated capabilities in many coding-adjacent domains. In this work we consider how LLMs may be leveraged to automatically repair security-relevant bugs present in hardware designs. We focus on bug repair in code written in the Hardware Description Language Verilog. For this study we build a corpus of domain-re…

    Submitted 2 February, 2023; originally announced February 2023.

  15. arXiv:2301.10336  [pdf, other]

    cs.CR

    A survey of Digital Manufacturing Hardware and Software Trojans

    Authors: Prithwish Basu Roy, Mudit Bhargava, Chia-Yun Chang, Ellen Hui, Nikhil Gupta, Ramesh Karri, Hammond Pearce

    Abstract: Digital Manufacturing (DM) refers to the on-going adoption of smarter, more agile manufacturing processes and cyber-physical systems. This includes modern techniques and technologies such as Additive Manufacturing (AM)/3D printing, as well as the Industrial Internet of Things (IIoT) and the broader trend toward Industry 4.0. However, this adoption is not without risks: with a growing complexity an…

    Submitted 24 January, 2023; originally announced January 2023.

    Comments: 15 pages

  16. arXiv:2212.11140  [pdf, other]

    cs.PL cs.LG cs.SE

    Benchmarking Large Language Models for Automated Verilog RTL Code Generation

    Authors: Shailja Thakur, Baleegh Ahmad, Zhenxing Fan, Hammond Pearce, Benjamin Tan, Ramesh Karri, Brendan Dolan-Gavitt, Siddharth Garg

    Abstract: Automating hardware design could obviate a significant amount of human error from the engineering process and lead to fewer errors. Verilog is a popular hardware description language to model and design digital systems, thus generating Verilog code is a critical first step. Emerging large language models (LLMs) are able to write high-quality code in other programming languages. In this paper, we c…

    Submitted 13 December, 2022; originally announced December 2022.

    Comments: Accepted in DATE 2023. 7 pages, 4 tables, 7 figures

  17. Don't CWEAT It: Toward CWE Analysis Techniques in Early Stages of Hardware Design

    Authors: Baleegh Ahmad, Wei-Kai Liu, Luca Collini, Hammond Pearce, Jason M. Fung, Jonathan Valamehr, Mohammad Bidmeshki, Piotr Sapiecha, Steve Brown, Krishnendu Chakrabarty, Ramesh Karri, Benjamin Tan

    Abstract: To help prevent hardware security vulnerabilities from propagating to later design stages where fixes are costly, it is crucial to identify security concerns as early as possible, such as in RTL designs. In this work, we investigate the practical implications and feasibility of producing a set of security-specific scanners that operate on Verilog source files. The scanners indicate parts of code t…

    Submitted 2 September, 2022; originally announced September 2022.

  18. arXiv:2208.09727  [pdf, other]

    cs.CR

    Lost at C: A User Study on the Security Implications of Large Language Model Code Assistants

    Authors: Gustavo Sandoval, Hammond Pearce, Teo Nys, Ramesh Karri, Siddharth Garg, Brendan Dolan-Gavitt

    Abstract: Large Language Models (LLMs) such as OpenAI Codex are increasingly being used as AI-based coding assistants. Understanding the impact of these tools on developers' code is paramount, especially as recent work showed that LLMs may suggest cybersecurity vulnerabilities. We conduct a security-driven user study (N=58) to assess code written by student programmers when assisted by LLMs. Given the poten…

    Submitted 27 February, 2023; v1 submitted 20 August, 2022; originally announced August 2022.

    Comments: Accepted for publication in USENIX'23. For associated dataset see https://doi.org/10.5281/zenodo.7187359. 18 pages, 12 figures. G. Sandoval and H. Pearce contributed equally to this work

  19. High-Level Approaches to Hardware Security: A Tutorial

    Authors: Hammond Pearce, Ramesh Karri, Benjamin Tan

    Abstract: Designers use third-party intellectual property (IP) cores and outsource various steps in the integrated circuit (IC) design and manufacturing flow. As a result, security vulnerabilities have been rising. This is forcing IC designers and end users to re-evaluate their trust in ICs. If attackers get hold of an unprotected IC, they can reverse engineer the IC and pirate the IP. Similarly, if attacke…

    Submitted 6 March, 2023; v1 submitted 21 July, 2022; originally announced July 2022.

    Comments: Accepted in IEEE TECS. 41 pages, 13 figures

  20. arXiv:2202.01142  [pdf, other]

    cs.SE cs.CR cs.LG

    Pop Quiz! Can a Large Language Model Help With Reverse Engineering?

    Authors: Hammond Pearce, Benjamin Tan, Prashanth Krishnamurthy, Farshad Khorrami, Ramesh Karri, Brendan Dolan-Gavitt

    Abstract: Large language models (such as OpenAI's Codex) have demonstrated impressive zero-shot multi-task capabilities in the software domain, including code explanation. In this work, we examine if this ability can be used to help with reverse engineering. Specifically, we investigate prompting Codex to identify the purpose, capabilities, and important variable names or values from code, even when the cod…

    Submitted 2 February, 2022; originally announced February 2022.

    Comments: 18 pages, 19 figures. Linked dataset: https://doi.org/10.5281/zenodo.5949075

  21. arXiv:2112.02125  [pdf, other]

    cs.CR cs.AI

    Examining Zero-Shot Vulnerability Repair with Large Language Models

    Authors: Hammond Pearce, Benjamin Tan, Baleegh Ahmad, Ramesh Karri, Brendan Dolan-Gavitt

    Abstract: Human developers can produce code with cybersecurity bugs. Can emerging 'smart' code completion tools help repair those bugs? In this work, we examine the use of large language models (LLMs) for code (such as OpenAI's Codex and AI21's Jurassic J-1) for zero-shot vulnerability repair. We investigate challenges in the design of prompts that coax LLMs into generating repaired versions of insecure cod…

    Submitted 15 August, 2022; v1 submitted 3 December, 2021; originally announced December 2021.

    Comments: 18 pages, 19 figures. Accepted for publication in 2023 IEEE Symposium on Security and Privacy (SP)

  22. Needle in a Haystack: Detecting Subtle Malicious Edits to Additive Manufacturing G-code Files

    Authors: Caleb Beckwith, Harsh Sankar Naicker, Svara Mehta, Viba R. Udupa, Nghia Tri Nim, Varun Gadre, Hammond Pearce, Gary Mac, Nikhil Gupta

    Abstract: Increasing usage of Digital Manufacturing (DM) in safety-critical domains is increasing attention on the cybersecurity of the manufacturing process, as malicious third parties might aim to introduce defects in digital designs. In general, the DM process involves creating a digital object (as CAD files) before using a slicer program to convert the models into printing instructions (e.g. g-code) sui…

    Submitted 24 November, 2021; originally announced November 2021.

    Comments: To appear in IEEE Embedded Systems Letters

  23. arXiv:2110.01974  [pdf, other]

    cs.SE cs.FL cs.RO

    Runtime Interchange for Adaptive Re-use of Intelligent Cyber-Physical System Controllers

    Authors: Hammond Pearce, Xin Yang, Srinivas Pinisetty, Partha S. Roop

    Abstract: Cyber-Physical Systems (CPSs) such as those found within autonomous vehicles are increasingly adopting Artificial Neural Network (ANN)-based controllers. To ensure the safety of these controllers, there is a spate of recent activity to formally verify the ANN-based designs. There are two challenges with these approaches: (1) The verification of such systems is difficult and time consuming. (2) The…

    Submitted 23 September, 2021; originally announced October 2021.

    Comments: 10 pages, 7 figures

  24. arXiv:2108.09293  [pdf, other]

    cs.CR cs.AI

    Asleep at the Keyboard? Assessing the Security of GitHub Copilot's Code Contributions

    Authors: Hammond Pearce, Baleegh Ahmad, Benjamin Tan, Brendan Dolan-Gavitt, Ramesh Karri

    Abstract: There is burgeoning interest in designing AI-based systems to assist humans in designing computing systems, including tools that automatically generate computer code. The most notable of these comes in the form of the first self-described 'AI pair programmer', GitHub Copilot, a language model trained over open-source GitHub code. However, code often contains bugs - and so, given the vast quantity…

    Submitted 16 December, 2021; v1 submitted 20 August, 2021; originally announced August 2021.

    Comments: Accepted for publication in IEEE Symposium on Security and Privacy 2022

  25. arXiv:2104.09562  [pdf, other]

    cs.CR

    FLAW3D: A Trojan-based Cyber Attack on the Physical Outcomes of Additive Manufacturing

    Authors: Hammond Pearce, Kaushik Yanamandra, Nikhil Gupta, Ramesh Karri

    Abstract: Additive Manufacturing (AM) systems such as 3D printers use inexpensive microcontrollers that rarely feature cybersecurity defenses. This is a risk, especially given the rising threat landscape within the larger digital manufacturing domain. In this work we demonstrate this risk by presenting the design and study of a malicious Trojan (the FLAW3D bootloader) for AVR-based Marlin-compatible 3D prin…

    Submitted 19 April, 2021; originally announced April 2021.

    Comments: 8 pages, 11 figures

  26. arXiv:2009.01026  [pdf, other]

    cs.SE cs.CL cs.LG stat.ML

    DAVE: Deriving Automatically Verilog from English

    Authors: Hammond Pearce, Benjamin Tan, Ramesh Karri

    Abstract: While specifications for digital systems are provided in natural language, engineers undertake significant efforts to translate them into the programming languages understood by compilers for digital systems. Automating this process allows designers to work with the language in which they are most comfortable -- the original natural language -- and focus instead on other downstream design challenge…

    Submitted 27 August, 2020; originally announced September 2020.

    Comments: 6 pages, 2 figures

  27. Designing Neural Networks for Real-Time Systems

    Authors: Hammond Pearce, Xin Yang, Partha S. Roop, Marc Katzef, Tórur Biskopstø Strøm

    Abstract: Artificial Neural Networks (ANNs) are increasingly being used within safety-critical Cyber-Physical Systems (CPSs). They are often co-located with traditional embedded software, and may perform advisory or control-based roles. It is important to validate both the timing and functional correctness of these systems. However, most approaches in the literature consider guaranteeing only the functional…

    Submitted 26 August, 2020; originally announced August 2020.

    Comments: 4 pages, 2 figures. IEEE Embedded Systems Letters, 2020