Showing 1–21 of 21 results for author: Moonen, L

  1. arXiv:2401.07994  [pdf, other]

    cs.SE cs.CL cs.LG

    A Novel Approach for Automatic Program Repair using Round-Trip Translation with Large Language Models

    Authors: Fernando Vallecillos Ruiz, Anastasiia Grishina, Max Hort, Leon Moonen

    Abstract: Research shows that grammatical mistakes in a sentence can be corrected by translating it to another language and back using neural machine translation with language models. We investigate whether this correction capability of Large Language Models (LLMs) extends to Automatic Program Repair (APR). Current generative models for APR are pre-trained on source code and fine-tuned for repair. This pape…

    Submitted 15 January, 2024; originally announced January 2024.

  2. arXiv:2307.02443  [pdf, other]

    cs.SE cs.AI cs.CL cs.LG cs.NE

    An Exploratory Literature Study on Sharing and Energy Use of Language Models for Source Code

    Authors: Max Hort, Anastasiia Grishina, Leon Moonen

    Abstract: Large language models trained on source code can support a variety of software development tasks, such as code recommendation and program repair. Large amounts of data for training such models benefit the models' performance. However, the size of the data and models results in long training times and high energy consumption. While publishing source code allows for replicability, users need to repe…

    Submitted 5 July, 2023; originally announced July 2023.

    Comments: Accepted for publication in the 17th ACM/IEEE International Symposium on Empirical Software Engineering and Measurement (ESEM 2023)

  3. arXiv:2305.04940  [pdf, other]

    cs.SE cs.AI cs.CL cs.LG

    The EarlyBIRD Catches the Bug: On Exploiting Early Layers of Encoder Models for More Efficient Code Classification

    Authors: Anastasiia Grishina, Max Hort, Leon Moonen

    Abstract: The use of modern Natural Language Processing (NLP) techniques has been shown to be beneficial for software engineering tasks, such as vulnerability detection and type inference. However, training deep NLP models requires significant computational resources. This paper explores techniques that aim at achieving the best usage of resources and available information in these models. We propose a generic…

    Submitted 11 September, 2023; v1 submitted 8 May, 2023; originally announced May 2023.

    Comments: The content in this pre-print is the same as in the CRC accepted for publication in the ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC/FSE 2023)

  4. arXiv:2305.00382  [pdf, other]

    cs.CR cs.AI cs.CL cs.SE

    Constructing a Knowledge Graph from Textual Descriptions of Software Vulnerabilities in the National Vulnerability Database

    Authors: Anders Mølmen Høst, Pierre Lison, Leon Moonen

    Abstract: Knowledge graphs have shown promise for several cybersecurity tasks, such as vulnerability assessment and threat analysis. In this work, we present a new method for constructing a vulnerability knowledge graph from information in the National Vulnerability Database (NVD). Our approach combines named entity recognition (NER), relation extraction (RE), and entity prediction using a combination of ne…

    Submitted 15 May, 2023; v1 submitted 30 April, 2023; originally announced May 2023.

    Comments: Accepted for publication in the 24th Nordic Conference on Computational Linguistics (NoDaLiDa), Tórshavn, Faroe Islands, May 22nd-24th, 2023. [v2]: added funding acknowledgments

  5. arXiv:2304.10423  [pdf, other]

    cs.SE cs.AI cs.NE

    Fully Autonomous Programming with Large Language Models

    Authors: Vadim Liventsev, Anastasiia Grishina, Aki Härmä, Leon Moonen

    Abstract: Current approaches to program synthesis with Large Language Models (LLMs) exhibit a "near miss syndrome": they tend to generate programs that semantically resemble the correct answer (as measured by text similarity metrics or human evaluation), but achieve a low or even zero accuracy as measured by unit tests due to small imperfections, such as the wrong input or output format. This calls for an a…

    Submitted 20 April, 2023; originally announced April 2023.

    Comments: Accepted for publication in the Genetic and Evolutionary Computation Conference (GECCO 2023)

  6. arXiv:2303.07283  [pdf, other]

    cs.SE cs.NE

    CHESS: A Framework for Evaluation of Self-adaptive Systems based on Chaos Engineering

    Authors: Sehrish Malik, Moeen Ali Naqvi, Leon Moonen

    Abstract: There is an increasing need to assess the correct behavior of self-adaptive and self-healing systems due to their adoption in critical and highly dynamic environments. However, there is a lack of systematic evaluation methods for self-adaptive and self-healing systems. We proposed CHESS, a novel approach to address this gap by evaluating self-adaptive and self-healing systems through fault injecti…

    Submitted 13 March, 2023; originally announced March 2023.

    Comments: Accepted for publication in the 18th Symposium on Software Engineering for Adaptive and Self-Managing Systems (SEAMS 2023)

  7. arXiv:2211.03911  [pdf, other]

    cs.SE

    Towards Extending the Range of Bugs That Automated Program Repair Can Handle

    Authors: Omar I. Al-Bataineh, Leon Moonen

    Abstract: Modern automated program repair (APR) is well-tuned to finding and repairing bugs that introduce observable erroneous behavior to a program. However, a significant class of bugs does not lead to such observable behavior (e.g., liveness/termination bugs, non-functional bugs, and information flow bugs). Such bugs can generally not be handled with current APR approaches, so, as a community, we need t…

    Submitted 7 November, 2022; originally announced November 2022.

    Comments: Accepted for publication in the 22nd IEEE International Conference on Software Quality, Reliability and Security (QRS 2022)

  8. arXiv:2208.13244  [pdf, other]

    cs.SE

    Assessing the Impact of Execution Environment on Observation-Based Slicing

    Authors: David Binkley, Leon Moonen

    Abstract: Program slicing reduces a program to a smaller version that retains a chosen computation, referred to as a slicing criterion. One recent multi-lingual slicing approach, observation-based slicing (ORBS), speculatively deletes parts of the program and then executes the code. If the behavior of the slicing criteria is unchanged, the speculative deletion is made permanent. While this makes ORBS lang…

    Submitted 28 August, 2022; originally announced August 2022.

  9. On Evaluating Self-Adaptive and Self-Healing Systems using Chaos Engineering

    Authors: Moeen Ali Naqvi, Sehrish Malik, Merve Astekin, Leon Moonen

    Abstract: With the growing adoption of self-adaptive systems in various domains, there is an increasing need for strategies to assess their correct behavior. In particular self-healing systems, which aim to provide resilience and fault-tolerance, often deal with unanticipated failures in critical and highly dynamic environments. Their reactive and complex behavior makes it challenging to assess if these sys…

    Submitted 28 August, 2022; originally announced August 2022.

    Comments: 10 pages

  10. Featherweight Assisted Vulnerability Discovery

    Authors: David Binkley, Leon Moonen, Sibren Isaacman

    Abstract: Predicting vulnerable source code helps to focus attention on those parts of the code that need to be examined with more scrutiny. Recent work proposed the use of function names as semantic cues that can be learned by a deep neural network (DNN) to aid in the hunt for vulnerability of functions. Combining identifier splitting, which splits each function name into its constituent words, with a no…

    Submitted 5 February, 2022; originally announced February 2022.

    Comments: 17 pages, 6 figures, 6 tables

    Journal ref: Information and Software Technology, 2022

  11. arXiv:2111.05713  [pdf, ps, other]

    cs.SE

    Towards More Reliable Automated Program Repair by Integrating Static Analysis Techniques

    Authors: Omar I. Al-Bataineh, Anastasiia Grishina, Leon Moonen

    Abstract: A long-standing open challenge for automated program repair is the overfitting problem, which is caused by having insufficient or incomplete specifications to validate whether a generated patch is correct or not. Most available repair systems rely on weak specifications (i.e., specifications that are synthesized from test cases) which limits the quality of generated repairs. To strengthen specific…

    Submitted 10 November, 2021; originally announced November 2021.

    Comments: Accepted at the 21st IEEE International Conference on Software Quality, Reliability, and Security (QRS 2021)

  12. arXiv:2107.08760  [pdf, other]

    cs.SE cs.AI cs.CR cs.LG

    CVEfixes: Automated Collection of Vulnerabilities and Their Fixes from Open-Source Software

    Authors: Guru Prasad Bhandari, Amara Naseer, Leon Moonen

    Abstract: Data-driven research on the automated discovery and repair of security vulnerabilities in source code requires comprehensive datasets of real-life vulnerable code and their fixes. To assist in such research, we propose a method to automatically collect and curate a comprehensive vulnerability dataset from Common Vulnerabilities and Exposures (CVE) records in the public National Vulnerability Datab…

    Submitted 19 July, 2021; originally announced July 2021.

    Comments: Accepted for publication in Proceedings of the 17th International Conference on Predictive Models and Data Analytics in Software Engineering (PROMISE '21), August 19-20, 2021, Athens, Greece

  13. arXiv:2101.02534  [pdf, other]

    cs.SE cs.NE

    Adaptive Immunity for Software: Towards Autonomous Self-healing Systems

    Authors: Moeen Ali Naqvi, Merve Astekin, Sehrish Malik, Leon Moonen

    Abstract: Testing and code reviews are known techniques to improve the quality and robustness of software. Unfortunately, the complexity of modern software systems makes it impossible to anticipate all possible problems that can occur at runtime, which limits what issues can be found using testing and reviews. Thus, it is of interest to consider autonomous self-healing software systems, which can automatica…

    Submitted 7 January, 2021; originally announced January 2021.

    Comments: 5 pages, 2 figures

  14. arXiv:2009.03257  [pdf, other]

    cs.SE cs.IR cs.LG

    Improving Problem Identification via Automated Log Clustering using Dimensionality Reduction

    Authors: Carl Martin Rosenberg, Leon Moonen

    Abstract: Goal: We consider the problem of automatically grouping logs of runs that failed for the same underlying reasons, so that they can be treated more effectively, and investigate the following questions: (1) Does an approach developed to identify problems in system logs generalize to identifying problems in continuous deployment logs? (2) How does dimensionality reduction affect the quality of automa…

    Submitted 7 September, 2020; originally announced September 2020.

    Journal ref: Published in ESEM'18, Proceedings of the 12th ACM/IEEE International Symposium on Empirical Software Engineering and Measurement, October 2018, Article: 16, pp. 1-10.

  15. Spectrum-Based Log Diagnosis

    Authors: Carl Martin Rosenberg, Leon Moonen

    Abstract: We present and evaluate Spectrum-Based Log Diagnosis (SBLD), a method to help developers quickly diagnose problems found in complex integration and deployment runs. Inspired by Spectrum-Based Fault Localization, SBLD leverages the differences in event occurrences between logs for failing and passing runs, to highlight events that are more strongly associated with failing runs. Using data provided by…

    Submitted 16 August, 2020; originally announced August 2020.

    Comments: Published in ESEM'20: ACM/IEEE International Symposium on Empirical Software Engineering and Measurement (ESEM), October 8-9, 2020, Bari, Italy. ACM, 12 pages

  16. arXiv:0707.2291  [pdf]

    cs.SE

    An Integrated Crosscutting Concern Migration Strategy and its Application to JHotDraw

    Authors: Marius Marin, Leon Moonen, Arie van Deursen

    Abstract: In this paper we propose a systematic strategy for migrating crosscutting concerns in existing object-oriented systems to aspect-based solutions. The proposed strategy consists of four steps: mining, exploration, documentation and refactoring of crosscutting concerns. We discuss in detail a new approach to aspect refactoring that is fully integrated with our strategy, and apply the whole strateg…

    Submitted 22 July, 2007; v1 submitted 16 July, 2007; originally announced July 2007.

    Comments: 10+4 pages

    Report number: TUD-SERG-2007-019 ACM Class: D.2

  17. arXiv:cs/0609147  [pdf]

    cs.SE

    Identifying Crosscutting Concerns Using Fan-in Analysis

    Authors: Marius Marin, Arie van Deursen, Leon Moonen

    Abstract: Aspect mining is a reverse engineering process that aims at finding crosscutting concerns in existing systems. This paper proposes an aspect mining approach based on determining methods that are called from many different places, and hence have a high fan-in, which can be seen as a symptom of crosscutting functionality. The approach is semi-automatic, and consists of three steps: metric calculat…

    Submitted 19 February, 2007; v1 submitted 26 September, 2006; originally announced September 2006.

    Comments: 34+4 pages; Extended version [Marin et al. 2004a]

    Report number: TUD-SERG-2006-013 ACM Class: D.2.3; D.2.7; D.2.8

    Journal ref: ACM Transactions on Software Engineering and Methodology, 2007

  18. arXiv:cs/0607063  [pdf]

    cs.SE

    Prioritizing Software Inspection Results using Static Profiling

    Authors: Cathal Boogerd, Leon Moonen

    Abstract: Static software checking tools are useful as an additional automated software inspection step that can easily be integrated in the development cycle and assist in creating secure, reliable and high quality code. However, an often quoted disadvantage of these tools is that they generate an overly large number of warnings, including many false positives due to the approximate analysis techniques.…

    Submitted 12 July, 2006; originally announced July 2006.

    Comments: 14 pages

    Report number: TUD-SERG-2006-001

  19. arXiv:cs/0607006  [pdf]

    cs.SE cs.PL

    Applying and Combining Three Different Aspect Mining Techniques

    Authors: Mariano Ceccato, Marius Marin, Kim Mens, Leon Moonen, Paolo Tonella, Tom Tourwe

    Abstract: Understanding a software system at source-code level requires understanding the different concerns that it addresses, which in turn requires a way to identify these concerns in the source code. Whereas some concerns are explicitly represented by program entities (like classes, methods and variables) and thus are easy to identify, crosscutting concerns are not captured by a single program entity…

    Submitted 2 July, 2006; originally announced July 2006.

    Comments: 28 pages

    Report number: TUD-SERG-2006-002

  20. A common framework for aspect mining based on crosscutting concern sorts

    Authors: Marius Marin, Leon Moonen, Arie van Deursen

    Abstract: The increasing number of aspect mining techniques proposed in literature calls for a methodological way of comparing and combining them in order to assess, and improve on, their quality. This paper addresses this situation by proposing a common framework based on crosscutting concern sorts which allows for consistent assessment, comparison and combination of aspect mining techniques. The framewo…

    Submitted 27 June, 2006; originally announced June 2006.

    Comments: 14 pages

    Report number: TUD-SERG-2006-009

    Journal ref: Proceedings Working Conference on Reverse Engineering (WCRE), IEEE Computer Society, 2006, pages 29-38

  21. arXiv:cs/0503015  [pdf, ps, other]

    cs.SE cs.PL

    A Systematic Aspect-Oriented Refactoring and Testing Strategy, and its Application to JHotDraw

    Authors: Arie van Deursen, Marius Marin, Leon Moonen

    Abstract: Aspect oriented programming aims at achieving better modularization for a system's crosscutting concerns in order to improve its key quality attributes, such as evolvability and reusability. Consequently, the adoption of aspect-oriented techniques in existing (legacy) software systems is of interest to remediate software aging. The refactoring of existing systems to employ aspect-orientation wil…

    Submitted 5 March, 2005; originally announced March 2005.

    Comments: 25 pages

    ACM Class: D.2.7; D.2.5; D.1.5