Showing 1–9 of 9 results for author: Santos, J C S

  1. arXiv:2404.10155  [pdf, other]

    cs.SE cs.LG

    Quality Assessment of Prompts Used in Code Generation

    Authors: Mohammed Latif Siddiq, Simantika Dristi, Joy Saha, Joanna C. S. Santos

    Abstract: Large Language Models (LLMs) are gaining popularity among software engineers. A crucial aspect of developing effective code-generation LLMs is to evaluate these models using a robust benchmark. Evaluation benchmarks with quality issues can provide a false sense of performance. In this work, we conduct the first-of-its-kind study of the quality of prompts within benchmarks used to compare the perfo…

    Submitted 15 April, 2024; originally announced April 2024.

    Comments: Under review

  2. arXiv:2403.10646  [pdf]

    cs.LG cs.CR

    A Survey of Source Code Representations for Machine Learning-Based Cybersecurity Tasks

    Authors: Beatrice Casey, Joanna C. S. Santos, George Perry

    Abstract: Machine learning techniques for cybersecurity-related software engineering tasks are becoming increasingly popular. The representation of source code is a key portion of the technique that can impact the way the model is able to learn the features of the source code. With an increasing number of these techniques being developed, it is valuable to see the current state of the field to better unders…

    Submitted 15 March, 2024; originally announced March 2024.

  3. arXiv:2311.00943  [pdf]

    cs.SE

    Sound Call Graph Construction for Java Object Deserialization

    Authors: Joanna C. S. Santos, Mehdi Mirakhorli, Ali Shokri

    Abstract: Object serialization and deserialization are widely used for storing and preserving objects in files, memory, or databases, as well as for transporting them across machines, enabling remote interaction among processes, and many more. This mechanism relies on reflection, a dynamic language feature that introduces serious challenges for static analyses. Current state-of-the-art call graph construction algorithm…

    Submitted 1 November, 2023; originally announced November 2023.

  4. arXiv:2311.00889  [pdf, other]

    cs.SE cs.AI

    Generate and Pray: Using SALLMS to Evaluate the Security of LLM Generated Code

    Authors: Mohammed Latif Siddiq, Joanna C. S. Santos, Sajith Devareddy, Anna Muller

    Abstract: With the growing popularity of Large Language Models (LLMs) in software engineers' daily practices, it is important to ensure that the code generated by these tools is not only functionally correct but also free of vulnerabilities. Although LLMs can help developers to be more productive, prior empirical studies have shown that LLMs can generate insecure code. There are two contributing factors to…

    Submitted 3 June, 2024; v1 submitted 1 November, 2023; originally announced November 2023.

    Comments: Under review; 12 Pages

  5. arXiv:2307.08220  [pdf, other]

    cs.SE cs.LG

    A Lightweight Framework for High-Quality Code Generation

    Authors: Mohammed Latif Siddiq, Beatrice Casey, Joanna C. S. Santos

    Abstract: In recent years, the use of automated source code generation utilizing transformer-based generative models has expanded, and these models can generate functional code according to the requirements of the developers. However, recent research revealed that these automatically generated source codes can contain vulnerabilities and other quality issues. Despite researchers' and practitioners' attempts…

    Submitted 16 July, 2023; originally announced July 2023.

    Comments: Under Review

  6. arXiv:2305.00418  [pdf, other]

    cs.SE cs.LG

    Using Large Language Models to Generate JUnit Tests: An Empirical Study

    Authors: Mohammed Latif Siddiq, Joanna C. S. Santos, Ridwanul Hasan Tanvir, Noshin Ulfat, Fahmid Al Rifat, Vinicius Carvalho Lopes

    Abstract: A code generation model generates code by taking a prompt from a code comment, existing code, or a combination of both. Although code generation models (e.g., GitHub Copilot) are increasingly being adopted in practice, it is unclear whether they can successfully be used for unit test generation without fine-tuning for a strongly typed language like Java. To fill this gap, we investigated how well…

    Submitted 8 March, 2024; v1 submitted 30 April, 2023; originally announced May 2023.

    Comments: Accepted in Research Track of The 28th International Conference on Evaluation and Assessment in Software Engineering (EASE 2024)

  7. arXiv:2304.07840  [pdf, other]

    cs.LG cs.SE

    Enhancing Automated Program Repair through Fine-tuning and Prompt Engineering

    Authors: Rishov Paul, Md. Mohib Hossain, Mohammed Latif Siddiq, Masum Hasan, Anindya Iqbal, Joanna C. S. Santos

    Abstract: Sequence-to-sequence models have been used to transform erroneous programs into correct ones when trained with a large enough dataset. Some recent studies also demonstrated strong empirical evidence that code review could improve the program repair further. Large language models, trained with Natural Language (NL) and Programming Language (PL), can contain inherent knowledge of both. In this study…

    Submitted 21 July, 2023; v1 submitted 16 April, 2023; originally announced April 2023.

    Comments: 12 pages, 2 figures, 4 tables

  8. ArCode: Facilitating the Use of Application Frameworks to Implement Tactics and Patterns

    Authors: Ali Shokri, Joanna C. S. Santos, Mehdi Mirakhorli

    Abstract: Software designers and developers are increasingly relying on application frameworks as first-class design concepts. They instantiate the services that frameworks provide to implement various architectural tactics and patterns. One of the challenges in using frameworks for such tasks is the difficulty of learning and correctly using frameworks' APIs. This paper introduces a learning-based approach…

    Submitted 16 February, 2021; originally announced February 2021.

    Comments: Accepted in the main track of the 2021 IEEE International Conference on Software Architecture (ICSA 2021)

  9. A Large-Scale Study on the Usage of Testing Patterns that Address Maintainability Attributes (Patterns for Ease of Modification, Diagnoses, and Comprehension)

    Authors: Danielle Gonzalez, Joanna C. S. Santos, Andrew Popovich, Mehdi Mirakhorli, Mei Nagappan

    Abstract: Test case maintainability is an important concern, especially in open source and distributed development environments where projects typically have high contributor turnover with varying backgrounds and experience, and where code ownership changes often. Similar to design patterns, patterns for unit testing promote maintainability quality attributes such as ease of diagnoses, modifiability, and co…

    Submitted 26 April, 2017; originally announced April 2017.

    Comments: Mining Software Repositories (MSR) 2017 Research Track

    Journal ref: 2017 IEEE/ACM 14th International Conference on Mining Software Repositories (MSR), Buenos Aires, 2017, pp. 391-401