
Showing 1–30 of 30 results for author: Tonella, P

  1. arXiv:2404.18573  [pdf, other]

    cs.LG cs.RO cs.SE

    Predicting Safety Misbehaviours in Autonomous Driving Systems using Uncertainty Quantification

    Authors: Ruben Grewal, Paolo Tonella, Andrea Stocco

    Abstract: The automated real-time recognition of unexpected situations plays a crucial role in the safety of autonomous vehicles, especially in unsupported and unpredictable scenarios. This paper evaluates different Bayesian uncertainty quantification methods from the deep learning domain for the anticipatory testing of safety-critical misbehaviours during system-level simulation-based testing. Specifically…

    Submitted 29 April, 2024; originally announced April 2024.

    Comments: In Proceedings of 17th IEEE International Conference on Software Testing, Verification and Validation 2024 (ICST '24)

  2. arXiv:2403.13729  [pdf, other]

    cs.SE cs.AI cs.LG cs.RO

    Reinforcement Learning for Online Testing of Autonomous Driving Systems: a Replication and Extension Study

    Authors: Luca Giamattei, Matteo Biagiola, Roberto Pietrantuono, Stefano Russo, Paolo Tonella

    Abstract: In a recent study, Reinforcement Learning (RL), used in combination with many-objective search, has been shown to outperform alternative techniques (random search and many-objective search) for online testing of Deep Neural Network-enabled systems. The empirical evaluation of these techniques was conducted on a state-of-the-art Autonomous Driving System (ADS). This work is a replication and extensi…

    Submitted 20 March, 2024; originally announced March 2024.

  3. GenMorph: Automatically Generating Metamorphic Relations via Genetic Programming

    Authors: Jon Ayerdi, Valerio Terragni, Gunel Jahangirova, Aitor Arrieta, Paolo Tonella

    Abstract: Metamorphic testing is a popular approach that aims to alleviate the oracle problem in software testing. At the core of this approach are Metamorphic Relations (MRs), specifying properties that hold among multiple test inputs and corresponding outputs. Deriving MRs is mostly a manual activity, since their automated generation is a challenging and largely unexplored problem. This paper presents G…

    Submitted 5 June, 2024; v1 submitted 23 December, 2023; originally announced December 2023.

  4. arXiv:2307.10590  [pdf, other]

    cs.SE cs.AI cs.RO

    Boundary State Generation for Testing and Improvement of Autonomous Driving Systems

    Authors: Matteo Biagiola, Paolo Tonella

    Abstract: Recent advances in Deep Neural Networks (DNNs) and sensor technologies are enabling autonomous driving systems (ADSs) with an ever-increasing level of autonomy. However, assessing their dependability remains a critical concern. State-of-the-art ADS testing approaches modify the controllable attributes of a simulated driving environment until the ADS misbehaves. Such approaches have two main drawba…

    Submitted 20 July, 2023; originally announced July 2023.

  5. arXiv:2306.07400  [pdf, other]

    cs.SE

    Neural Embeddings for Web Testing

    Authors: Andrea Stocco, Alexandra Willi, Luigi Libero Lucio Starace, Matteo Biagiola, Paolo Tonella

    Abstract: Web test automation techniques employ web crawlers to automatically produce a web app model that is used for test generation. Existing crawlers rely on app-specific, threshold-based algorithms to assess state equivalence. Such algorithms are hard to tune in the general case and cannot accurately identify and remove near-duplicate web pages from crawl models. Failing to retrieve an accurate web ap…

    Submitted 12 June, 2023; originally announced June 2023.

    Comments: 12 pages; in revision

  6. arXiv:2305.12751  [pdf, other]

    cs.SE cs.AI cs.LG

    Testing of Deep Reinforcement Learning Agents with Surrogate Models

    Authors: Matteo Biagiola, Paolo Tonella

    Abstract: Deep Reinforcement Learning (DRL) has received a lot of attention from the research community in recent years. As the technology moves away from game playing to practical contexts, such as autonomous vehicles and robotics, it is crucial to evaluate the quality of DRL agents. In this paper, we propose a search-based approach to test such agents. Our approach, implemented in a tool called Indago, tr…

    Submitted 11 November, 2023; v1 submitted 22 May, 2023; originally announced May 2023.

    ACM Class: D.2.5

  7. Two is Better Than One: Digital Siblings to Improve Autonomous Driving Testing

    Authors: Matteo Biagiola, Andrea Stocco, Vincenzo Riccio, Paolo Tonella

    Abstract: Simulation-based testing represents an important step to ensure the reliability of autonomous driving software. In practice, when companies rely on third-party general-purpose simulators, either for in-house or outsourced testing, the generalizability of testing results to real autonomous vehicles is at stake. In this paper, we enhance simulation-based testing by introducing the notion of digital…

    Submitted 24 April, 2024; v1 submitted 14 May, 2023; originally announced May 2023.

    ACM Class: D.2.5

  8. arXiv:2304.02654  [pdf, other]

    cs.LG cs.SE

    Adopting Two Supervisors for Efficient Use of Large-Scale Remote Deep Neural Networks

    Authors: Michael Weiss, Paolo Tonella

    Abstract: Recent decades have seen the rise of large-scale Deep Neural Networks (DNNs) to achieve human-competitive performance in a variety of artificial intelligence tasks. Often consisting of hundreds of millions, if not hundreds of billions, of parameters, these DNNs are too large to be deployed to, or efficiently run on, resource-constrained devices such as mobile phones or IoT microcontrollers. Systems rely…

    Submitted 5 April, 2023; originally announced April 2023.

    Comments: A corresponding Registered Report (i.e., paper without results) was accepted at ACM TOSEM. This preprint includes the (not yet peer-reviewed) results

  9. arXiv:2301.11568  [pdf, other]

    cs.SE

    Repairing DNN Architecture: Are We There Yet?

    Authors: Jinhan Kim, Nargiz Humbatova, Gunel Jahangirova, Paolo Tonella, Shin Yoo

    Abstract: As Deep Neural Networks (DNNs) are rapidly being adopted within large software systems, software developers are increasingly required to design, train, and deploy such models into the systems they develop. Consequently, testing and improving the robustness of these models have received a lot of attention lately. However, relatively little effort has been made to address the difficulties developers…

    Submitted 27 January, 2023; originally announced January 2023.

    Comments: Paper accepted to ICST 2023

  10. arXiv:2212.11368  [pdf, ps, other]

    cs.SE cs.LG

    When and Why Test Generators for Deep Learning Produce Invalid Inputs: an Empirical Study

    Authors: Vincenzo Riccio, Paolo Tonella

    Abstract: Testing Deep Learning (DL) based systems inherently requires large and representative test sets to evaluate whether DL systems generalise beyond their training datasets. Diverse Test Input Generators (TIGs) have been proposed to produce artificial inputs that expose issues of the DL systems by triggering misbehaviours. Unfortunately, such generated inputs may be invalid, i.e., not recognisable as…

    Submitted 21 December, 2022; originally announced December 2022.

    Comments: To be published in Proceedings of the 45th ACM/IEEE International Conference on Software Engineering (ICSE 2023)

    ACM Class: D.2.5

  11. arXiv:2212.07118  [pdf, other]

    cs.SE cs.LG

    Uncertainty Quantification for Deep Neural Networks: An Empirical Comparison and Usage Guidelines

    Authors: Michael Weiss, Paolo Tonella

    Abstract: Deep Neural Networks (DNNs) are increasingly used as components of larger software systems that need to process complex data, such as images, written texts, and audio/video signals. DNN predictions cannot be assumed to be always correct for several reasons, among which the huge input space that is dealt with, the ambiguity of some input data, as well as the intrinsic properties of learning algorithms,…

    Submitted 14 December, 2022; originally announced December 2022.

    Comments: Accepted for publication at the Journal of Software: Testing, Verification and Reliability. arXiv admin note: substantial text overlap with arXiv:2102.00902

  12. arXiv:2207.10495  [pdf, other]

    cs.SE cs.LG

    Generating and Detecting True Ambiguity: A Forgotten Danger in DNN Supervision Testing

    Authors: Michael Weiss, André García Gómez, Paolo Tonella

    Abstract: Deep Neural Networks (DNNs) are becoming a crucial component of modern software systems, but they are prone to fail under conditions that are different from the ones observed during training (out-of-distribution inputs) or on inputs that are truly ambiguous, i.e., inputs that admit multiple classes with nonzero probability in their labels. Recent work proposed DNN supervisors to detect high-uncert…

    Submitted 8 September, 2023; v1 submitted 21 July, 2022; originally announced July 2022.

    Comments: Accepted for publication at Springer's "Empirical Software Engineering" (EMSE)

  13. arXiv:2205.00664  [pdf, other]

    cs.LG cs.AI cs.SE

    Simple Techniques Work Surprisingly Well for Neural Network Test Prioritization and Active Learning (Replicability Study)

    Authors: Michael Weiss, Paolo Tonella

    Abstract: Test Input Prioritizers (TIP) for Deep Neural Networks (DNN) are an important technique to handle the typically very large test datasets efficiently, saving computation and labeling costs. This is particularly true for large-scale, deployed systems, where inputs observed in production are recorded to serve as potential test or training data for the next versions of the system. Feng et al. propose…

    Submitted 24 May, 2022; v1 submitted 2 May, 2022; originally announced May 2022.

    Comments: Accepted at ISSTA 2022

  14. arXiv:2112.11255  [pdf, other]

    cs.SE cs.AI cs.RO

    Mind the Gap! A Study on the Transferability of Virtual vs Physical-world Testing of Autonomous Driving Systems

    Authors: Andrea Stocco, Brian Pulfer, Paolo Tonella

    Abstract: Safe deployment of self-driving cars (SDC) necessitates thorough simulated and in-field testing. Most testing techniques consider virtualized SDCs within a simulation environment, whereas less effort has been directed towards assessing whether such techniques transfer to and are effective with a physical real-world vehicle. In this paper, we shed light on the problem of generalizing testing result…

    Submitted 25 August, 2022; v1 submitted 21 December, 2021; originally announced December 2021.

    Comments: 13 pages; Accepted for publication in the IEEE Transactions on Software Engineering (TSE)

  15. arXiv:2109.07514  [pdf, other]

    cs.SE cs.AI

    DeepMetis: Augmenting a Deep Learning Test Set to Increase its Mutation Score

    Authors: Vincenzo Riccio, Nargiz Humbatova, Gunel Jahangirova, Paolo Tonella

    Abstract: Deep Learning (DL) components are routinely integrated into software systems that need to perform complex tasks such as image or natural language processing. The adequacy of the test data used to test such systems can be assessed by their ability to expose artificially injected faults (mutations) that simulate real DL faults. In this paper, we describe an approach to automatically generate new tes…

    Submitted 15 September, 2021; originally announced September 2021.

    Comments: To be published in Proceedings of the 36th IEEE/ACM International Conference on Automated Software Engineering (ASE 2021)

    ACM Class: D.2.5

  16. arXiv:2107.06997  [pdf, other]

    cs.LG cs.AI cs.SE

    DeepHyperion: Exploring the Feature Space of Deep Learning-Based Systems through Illumination Search

    Authors: Tahereh Zohdinasab, Vincenzo Riccio, Alessio Gambi, Paolo Tonella

    Abstract: Deep Learning (DL) has been successfully applied to a wide range of application domains, including safety-critical ones. Several DL testing approaches have been recently proposed in the literature but none of them aims to assess how different interpretable features of the generated inputs affect the system's behaviour. In this paper, we resort to Illumination Search to find the highest-performing…

    Submitted 5 July, 2021; originally announced July 2021.

    Comments: To be published in Proceedings of the 30th ACM SIGSOFT International Symposium on Software Testing and Analysis (ISSTA '21), July 11-17, 2021, Virtual, Denmark. ACM, New York, NY, USA, 12 pages

    ACM Class: D.2.5

  17. arXiv:2103.05939  [pdf, other]

    cs.LG cs.SE

    A Review and Refinement of Surprise Adequacy

    Authors: Michael Weiss, Rwiddhi Chakraborty, Paolo Tonella

    Abstract: Surprise Adequacy (SA) is one of the emerging and most promising adequacy criteria for Deep Learning (DL) testing. As an adequacy criterion, it has been used to assess the strength of DL test suites. In addition, it has also been used to find inputs to a Deep Neural Network (DNN) which were not sufficiently represented in the training data, or to select samples for DNN retraining. However, computa…

    Submitted 10 March, 2021; originally announced March 2021.

    Comments: Accepted at DeepTest 2021 (ICSE Workshop)

  18. arXiv:2103.02901  [pdf, other]

    cs.SE

    GAssert: A Fully Automated Tool to Improve Assertion Oracles

    Authors: Valerio Terragni, Gunel Jahangirova, Paolo Tonella, Mauro Pezzè

    Abstract: This demo presents the implementation and usage details of GASSERT, the first tool to automatically improve assertion oracles. Assertion oracles are executable boolean expressions placed inside the program that should pass (return true) for all correct executions and fail (return false) for all incorrect executions. Because designing perfect assertion oracles is difficult, assertions are prone to…

    Submitted 4 March, 2021; originally announced March 2021.

    Comments: 4 pages, published at the 43rd IEEE/ACM International Conference on Software Engineering, Demonstration Track ICSE-DEMO 2021

  19. arXiv:2102.00902  [pdf, ps, other]

    cs.SE cs.LG

    Fail-Safe Execution of Deep Learning based Systems through Uncertainty Monitoring

    Authors: Michael Weiss, Paolo Tonella

    Abstract: Modern software systems rely on Deep Neural Networks (DNN) when processing complex, unstructured inputs, such as images, videos, natural language texts or audio signals. Provided the intractably large size of such input spaces, the intrinsic limitations of learning algorithms, and the ambiguity about the expected predictions for some of the inputs, not only there is no guarantee that DNN's predict…

    Submitted 1 February, 2021; originally announced February 2021.

    Comments: Accepted at IEEE International Conference on Software Testing, Verification and Validation 2021

  20. Deep Reinforcement Learning for Black-Box Testing of Android Apps

    Authors: Andrea Romdhana, Alessio Merlo, Mariano Ceccato, Paolo Tonella

    Abstract: The state space of Android apps is huge and its thorough exploration during testing remains a major challenge. In fact, the best exploration strategy is highly dependent on the features of the app under test. Reinforcement Learning (RL) is a machine learning technique that learns the optimal strategy to solve a task by trial and error, guided by positive or negative reward, rather than by explicit…

    Submitted 15 January, 2021; v1 submitted 7 January, 2021; originally announced January 2021.

    Journal ref: ACM Transactions on Software Engineering and Methodology, 2022

  21. arXiv:2101.00982  [pdf, ps, other]

    cs.LG cs.SE

    Uncertainty-Wizard: Fast and User-Friendly Neural Network Uncertainty Quantification

    Authors: Michael Weiss, Paolo Tonella

    Abstract: Uncertainty and confidence have been shown to be useful metrics in a wide variety of techniques proposed for deep learning testing, including test data selection and system supervision. We present uncertainty-wizard, a tool that allows users to quantify such uncertainty and confidence in artificial neural networks. It is built on top of the industry-leading tf.keras deep learning API and it provides a ne…

    Submitted 28 January, 2021; v1 submitted 29 December, 2020; originally announced January 2021.

    Comments: Accepted for publication at the IEEE International Conference on Software Testing, Verification and Validation 2021

  22. arXiv:2011.10787  [pdf, other]

    cs.SE

    An Empirical Study on Failed Error Propagation in Java Programs with Real Faults

    Authors: Gunel Jahangirova, David Clark, Mark Harman, Paolo Tonella

    Abstract: During testing, developers can place oracles externally or internally with respect to a method. Given a faulty execution state, i.e., one that differs from the expected one, an oracle might be unable to expose the fault if it is placed at a program point with no access to the incorrect program state or where the program state is no longer corrupted. In such a case, the oracle is subject to failed…

    Submitted 21 November, 2020; originally announced November 2020.

  23. arXiv:2007.02787  [pdf, other]

    cs.SE cs.AI cs.LG

    Model-based Exploration of the Frontier of Behaviours for Deep Learning System Testing

    Authors: Vincenzo Riccio, Paolo Tonella

    Abstract: With the increasing adoption of Deep Learning (DL) for critical tasks, such as autonomous driving, the evaluation of the quality of systems that rely on DL has become crucial. Once trained, DL systems produce an output for any arbitrary numeric vector provided as input, regardless of whether it is within or outside the validity domain of the system under test. Hence, the quality of such systems is…

    Submitted 6 July, 2020; originally announced July 2020.

    Comments: To be published in the Proceedings of the ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC/FSE 2020); 13 pages, 6 figures

    ACM Class: D.2.5

  24. arXiv:2002.01785  [pdf, other]

    cs.SE

    A Framework for In-Vivo Testing of Mobile Applications

    Authors: Mariano Ceccato, Davide Corradini, Luca Gazzola, Fitsum Meshesha Kifetew, Leonardo Mariani, Matteo Orrù, Paolo Tonella

    Abstract: The ecosystem in which mobile applications run is highly heterogeneous and configurable. All layers upon which mobile apps are built offer wide possibilities of variations, from the device and the hardware, to the operating system and middleware, up to the user preferences and settings. Testing all possible configurations exhaustively, before releasing the app, is unaffordable. As a consequence, t…

    Submitted 5 February, 2020; originally announced February 2020.

    Comments: Research paper accepted to ICST'20, 10+1 pages

  25. arXiv:1910.11015  [pdf, other]

    cs.SE cs.AI cs.LG

    Taxonomy of Real Faults in Deep Learning Systems

    Authors: Nargiz Humbatova, Gunel Jahangirova, Gabriele Bavota, Vincenzo Riccio, Andrea Stocco, Paolo Tonella

    Abstract: The growing application of deep neural networks in safety-critical domains makes the analysis of faults that occur in such systems of enormous importance. In this paper we introduce a large taxonomy of faults in deep learning (DL) systems. We have manually analysed 1059 artefacts gathered from GitHub commits and issues of projects that use the most popular DL frameworks (TensorFlow, Keras and PyTo…

    Submitted 7 November, 2019; v1 submitted 24 October, 2019; originally announced October 2019.

  26. arXiv:1910.04443  [pdf, other]

    eess.SP

    Misbehaviour Prediction for Autonomous Driving Systems

    Authors: Andrea Stocco, Michael Weiss, Marco Calzana, Paolo Tonella

    Abstract: Deep Neural Networks (DNNs) are the core component of modern autonomous driving systems. To date, it is still unrealistic that a DNN will generalize correctly in all driving conditions. Current testing techniques consist of offline solutions that identify adversarial or corner cases for improving the training phase, and little has been done for enabling online healing of DNN-based vehicles. In thi…

    Submitted 10 October, 2019; originally announced October 2019.

    Comments: 11 pages

  27. Web Test Dependency Detection

    Authors: Matteo Biagiola, Andrea Stocco, Ali Mesbah, Filippo Ricca, Paolo Tonella

    Abstract: E2E web test suites are prone to test dependencies due to the heterogeneous multi-tiered nature of modern web apps, which makes it difficult for developers to create isolated program states for each test case. In this paper, we present the first approach for detecting and validating test dependencies present in E2E web test suites. Our approach employs string analysis to extract an approximated se…

    Submitted 10 October, 2019; v1 submitted 1 May, 2019; originally announced May 2019.

    Comments: 11 pages, published in the Proceedings of the 2019 27th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC/FSE 2019), pp. 154-164

  28. How Professional Hackers Understand Protected Code while Performing Attack Tasks

    Authors: Mariano Ceccato, Paolo Tonella, Cataldo Basile, Bart Coppens, Bjorn De Sutter, Paolo Falcarin, Marco Torchiano

    Abstract: Code protections aim at blocking (or at least delaying) reverse engineering and tampering attacks to critical assets within programs. Knowing the way hackers understand protected code and perform attacks is important to achieve a stronger protection of the software assets, based on realistic assumptions about the hackers' behaviour. However, building such knowledge is difficult because hackers can…

    Submitted 26 May, 2017; v1 submitted 10 April, 2017; originally announced April 2017.

    Comments: Post-print for ICPC 2017 conference

  29. Assessment of Source Code Obfuscation Techniques

    Authors: Alessio Viticchié, Leonardo Regano, Marco Torchiano, Cataldo Basile, Mariano Ceccato, Paolo Tonella, Roberto Tiella

    Abstract: Obfuscation techniques are a general category of software protections widely adopted to prevent malicious tampering of the code by making applications more difficult to understand and thus harder to modify. Obfuscation techniques are divided into code and data obfuscation, depending on the protected asset. While preliminary empirical studies have been conducted to determine the impact of code obfusc…

    Submitted 7 April, 2017; originally announced April 2017.

    Comments: Post-print, SCAM 2016

  30. arXiv:cs/0607006  [pdf]

    cs.SE cs.PL

    Applying and Combining Three Different Aspect Mining Techniques

    Authors: Mariano Ceccato, Marius Marin, Kim Mens, Leon Moonen, Paolo Tonella, Tom Tourwe

    Abstract: Understanding a software system at source-code level requires understanding the different concerns that it addresses, which in turn requires a way to identify these concerns in the source code. Whereas some concerns are explicitly represented by program entities (like classes, methods and variables) and thus are easy to identify, crosscutting concerns are not captured by a single program entity…

    Submitted 2 July, 2006; originally announced July 2006.

    Comments: 28 pages

    Report number: TUD-SERG-2006-002