Search | arXiv e-print repository

ESBMC v7.6: Enhanced Model Checking of C++ Programs with Clang AST

Authors: Xianzhiyu Li, Kunjian Song, Mikhail R. Gadelha, Franz Brauße, Rafael S. Menezes, Konstantin Korovin, Lucas C. Cordeiro

Abstract: This paper presents Efficient SMT-Based Context-Bounded Model Checker (ESBMC) v7.6, an extended version based on previous work on ESBMC v7.3 by K. Song et al. The v7.3 introduced a new Clang-based C++ front-end to address the challenges posed by modern C++ programs. Although the new front-end has demonstrated significant potential in previous studies, it remains in the developmental stage and lacks several essential features. ESBMC v7.6 further enhanced this foundation by adding and extending features based on the Clang AST, such as 1) exception handling, 2) extended memory management and memory safety verification, including dangling pointers, duplicate deallocation, memory leaks and rvalue references and 3) new operational models for STL updating the outdated C++ operational models. Our extensive experiments demonstrate that ESBMC v7.6 can handle a significantly broader range of C++ features introduced in recent versions of the C++ standard.

Submitted 25 June, 2024; originally announced June 2024.

Comments: 27 pages, 2 figures. arXiv admin note: substantial text overlap with arXiv:2308.05649

arXiv:2312.14746 [pdf, ps, other]

ESBMC v7.4: Harnessing the Power of Intervals

Authors: Rafael Menezes, Mohannad Aldughaim, Bruno Farias, Xianzhiyu Li, Edoardo Manino, Fedor Shmarov, Kunjian Song, Franz Brauße, Mikhail R. Gadelha, Norbert Tihanyi, Konstantin Korovin, Lucas C. Cordeiro

Abstract: ESBMC implements many state-of-the-art techniques for model checking. We report on new and improved features that allow us to obtain verification results for previously unsupported programs and properties. ESBMC employs a new static interval analysis of expressions in programs to increase verification performance. This includes interval-based reasoning over booleans and integers, forward and backward contractors, and particular optimizations related to singleton intervals because of their ubiquity. Other relevant improvements concern the verification of concurrent programs, as well as several operational models, internal ones, and also those of libraries such as pthread and the C mathematics library. An extended memory safety analysis now allows tracking of memory leaks that are considered still reachable.

Submitted 22 December, 2023; originally announced December 2023.

arXiv:2308.05649 [pdf, other]

ESBMC v7.3: Model Checking C++ Programs using Clang AST

Authors: Kunjian Song, Mikhail R. Gadelha, Franz Brauße, Rafael S. Menezes, Lucas C. Cordeiro

Abstract: This paper introduces ESBMC v7.3, the latest Efficient SMT-Based Context-Bounded Model Checker version, which now incorporates a new clang-based C++ front-end. While the previous CPROVER-based front-end served well for handling C++03 programs, it encountered challenges keeping up with the evolving C++ language. As new language and library features were added in each C++ version, the limitations of the old front-end became apparent, leading to difficult-to-maintain code. Consequently, modern C++ programs were challenging to verify. To overcome this obstacle, we redeveloped the front-end, opting for a more robust approach using clang. The new front-end efficiently traverses the Abstract Syntax Tree (AST) in-memory using clang APIs and transforms each AST node into ESBMC's Intermediate Representation. Through extensive experimentation, our results demonstrate that ESBMC v7.3 with the new front-end significantly reduces parse and conversion errors, enabling successful verification of a wide range of C++ programs, thereby outperforming previous ESBMC versions.

Submitted 10 August, 2023; originally announced August 2023.

arXiv:2107.01093 [pdf, other]

Model Checking C++ Programs

Authors: Felipe R. Monteiro, Mikhail R. Gadelha, Lucas C. Cordeiro

Abstract: In the last three decades, memory safety issues in system programming languages such as C or C++ have been one of the significant sources of security vulnerabilities. However, there exist only a few attempts with limited success to cope with the complexity of C++ program verification. Here we describe and evaluate a novel verification approach based on bounded model checking (BMC) and satisfiability modulo theories (SMT) to verify C++ programs formally. Our verification approach analyzes bounded C++ programs by encoding into SMT various sophisticated features that the C++ programming language offers, such as templates, inheritance, polymorphism, exception handling, and the Standard C++ Libraries. We formalize these features within our formal verification framework using a decidable fragment of first-order logic and then show how state-of-the-art SMT solvers can efficiently handle that. We implemented our verification approach on top of ESBMC. We compare ESBMC to LLBMC and DIVINE, which are state-of-the-art verifiers to check C++ programs directly from the LLVM bitcode. Experimental results show that ESBMC can handle a wide range of C++ programs, presenting a higher number of correct verification results. At the same time, it reduces the verification time if compared to LLBMC and DIVINE tools. Additionally, ESBMC has been applied to a commercial C++ application in the telecommunication domain and successfully detected arithmetic overflow errors, potentially leading to security vulnerabilities.

Submitted 2 July, 2021; originally announced July 2021.

Comments: 30 pages

arXiv:2012.11223 [pdf, other]

FuSeBMC: A White-Box Fuzzer for Finding Security Vulnerabilities in C Programs

Authors: Kaled M. Alshmrany, Rafael S. Menezes, Mikhail R. Gadelha, Lucas C. Cordeiro

Abstract: We describe and evaluate a novel white-box fuzzer for C programs named FuSeBMC, which combines fuzzing and symbolic execution, and applies Bounded Model Checking (BMC) to find security vulnerabilities in C programs. FuSeBMC explores and analyzes C programs (1) to find execution paths that lead to property violations and (2) to incrementally inject labels to guide the fuzzer and the BMC engine to produce test-cases for code coverage. FuSeBMC successfully participates in Test-Comp'21 and achieves first place in the Cover-Error category and second place in the Overall category.

Submitted 21 December, 2020; originally announced December 2020.

Comments: 4 pages

arXiv:2004.12699 [pdf, other]

An Efficient Floating-Point Bit-Blasting API for Verifying C Programs

Authors: Mikhail R. Gadelha, Lucas C. Cordeiro, Denis A. Nicole

Abstract: We describe a new SMT bit-blasting API for floating-points and evaluate it using different out-of-the-shelf SMT solvers during the verification of several C programs. The new floating-point API is part of the SMT backend in ESBMC, a state-of-the-art bounded model checker for C and C++. For the evaluation, we compared our floating-point API against the native floating-point APIs in Z3 and MathSAT. We show that Boolector, when using floating-point API, outperforms the solvers with native support for floating-points, correctly verifying more programs in less time. Experimental results also show that our floating-point API implemented in ESBMC is on par with other state-of-the-art software verifiers. Furthermore, when verifying programs with floating-point arithmetic, our new floating-point API produced no wrong answers.

Submitted 29 April, 2020; v1 submitted 27 April, 2020; originally announced April 2020.

Comments: 20 pages

arXiv:1909.00271 [pdf, other]

Exploring Reproducibility and FAIR Principles in Data Science Using Ecological Niche Modeling as a Case Study

Authors: Maria Luiza Mondelli, A. Townsend Peterson, Luiz M. R. Gadelha Jr

Abstract: Reproducibility is a fundamental requirement of the scientific process since it enables outcomes to be replicated and verified. Computational scientific experiments can benefit from improved reproducibility for many reasons, including validation of results and reuse by other scientists. However, designing reproducible experiments remains a challenge and hence the need for developing methodologies and tools that can support this process. Here, we propose a conceptual model for reproducibility to specify its main attributes and properties, along with a framework that allows for computational experiments to be findable, accessible, interoperable, and reusable. We present a case study in ecological niche modeling to demonstrate and evaluate the implementation of this framework.

Submitted 31 August, 2019; originally announced September 2019.

Comments: 10 pages, 4 figures

arXiv:1907.12933 [pdf, other]

Incremental Bounded Model Checking of Artificial Neural Networks in CUDA

Authors: Luiz H. Sena, Iury V. Bessa, Mikhail R. Gadelha, Lucas C. Cordeiro, Edjard Mota

Abstract: Artificial Neural networks (ANNs) are powerful computing systems employed for various applications due to their versatility to generalize and to respond to unexpected inputs/patterns. However, implementations of ANNs for safety-critical systems might lead to failures, which are hardly predicted in the design phase since ANNs are highly parallel and their parameters are hardly interpretable. Here we develop and evaluate a novel symbolic software verification framework based on incremental bounded model checking (BMC) to check for adversarial cases and coverage methods in multi-layer perceptron (MLP). In particular, we further develop the efficient SMT-based Context-Bounded Model Checker for Graphical Processing Units (ESBMC-GPU) in order to ensure the reliability of certain safety properties in which safety-critical systems can fail and make incorrect decisions, thereby leading to unwanted material damage or even put lives in danger. This paper marks the first symbolic verification framework to reason over ANNs implemented in CUDA. Our experimental results show that our approach implemented in ESBMC-GPU can successfully verify safety properties and covering methods in ANNs and correctly generate 28 adversarial cases in MLPs.

Submitted 30 July, 2019; originally announced July 2019.

Comments: 8 pages

arXiv:1904.06152 [pdf, ps, other]

Boost the Impact of Continuous Formal Verification in Industry

Authors: Felipe R. Monteiro, Mikhail R. Gadelha, Lucas C. Cordeiro

Abstract: Software model checking has experienced significant progress in the last two decades, however, one of its major bottlenecks for practical applications remains its scalability and adaptability. Here, we describe an approach to integrate software model checking techniques into the DevOps culture by exploiting practices such as continuous integration and regression tests. In particular, our proposed approach looks at the modifications to the software system since its last verification, and submits them to a continuous formal verification process, guided by a set of regression test cases. Our vision is to focus on the developer in order to integrate formal verification techniques into the developer workflow by using their main software development methodologies and tools.

Submitted 17 July, 2019; v1 submitted 12 April, 2019; originally announced April 2019.

Comments: 7 pages

arXiv:1904.02501 [pdf, other]

Beyond k-induction: Learning from Counterexamples to Bidirectionally Explore the State Space

Authors: Mikhail R. Gadelha, Felipe R. Monteiro, Enrico Steffinlongo, Lucas C. Cordeiro, Denis A. Nicole

Abstract: We describe and evaluate a novel k-induction proof rule called bidirectional k-induction (bkind), which substantially improves the k-induction bug-finding capabilities. Particularly, bkind exploits the counterexamples generated by the over-approximation step to derive new properties and feed them back to the bounded model checking procedure. We also combine an interval invariant generator and bkind to significantly improve the number of correct verification results. Experimental results show that bkind can considerably reduce the verification time compared to the naive k-induction proof rule, since it only requires half the number of steps to find a given safety property violation in an unsafe program. The bkind algorithm outperforms 2LS, another state-of-the-art k-induction verifier, and produces more than twice correct proofs and about 35% more correct alarms than when analysing a large set of public available benchmarks.

Submitted 4 April, 2019; originally announced April 2019.

Comments: 17 pages

arXiv:1810.12041 [pdf, other]

SMT-Based Refutation of Spurious Bug Reports in the Clang Static Analyzer

Authors: Mikhail R. Gadelha, Enrico Steffinlongo, Lucas C. Cordeiro, Bernd Fischer, Denis A. Nicole

Abstract: We describe and evaluate a bug refutation extension for the Clang Static Analyzer (CSA) that addresses the limitations of the existing built-in constraint solver. In particular, we complement CSA's existing heuristics that remove spurious bug reports. We encode the path constraints produced by CSA as Satisfiability Modulo Theories (SMT) problems, use SMT solvers to precisely check them for satisfiability, and remove bug reports whose associated path constraints are unsatisfiable. Our refutation extension refutes spurious bug reports in 8 out of 12 widely used open-source applications; on average, it refutes ca. 7% of all bug reports, and never refutes any true bug report. It incurs only negligible performance overheads, and on average adds 1.2% to the runtime of the full Clang/LLVM toolchain. A demonstration is available at https://www.youtube.com/watch?v=ylW5iRYNsGA.

Submitted 30 November, 2018; v1 submitted 29 October, 2018; originally announced October 2018.

Comments: 4 pages

arXiv:1810.00224 [pdf, other]

doi 10.1002/widm.1394

A survey of biodiversity informatics: Concepts, practices, and challenges

Authors: Luiz M. R. Gadelha Jr., Pedro C. de Siracusa, Artur Ziviani, Eduardo Couto Dalcin, Helen Michelle Affe, Marinez Ferreira de Siqueira, Luís Alexandre Estevão da Silva, Douglas A. Augusto, Eduardo Krempser, Marcia Chame, Raquel Lopes Costa, Pedro Milet Meirelles, Fabiano Thompson

Abstract: The unprecedented size of the human population, along with its associated economic activities, have an ever increasing impact on global environments. Across the world, countries are concerned about the growing resource consumption and the capacity of ecosystems to provide them. To effectively conserve biodiversity, it is essential to make indicators and knowledge openly available to decision-makers in ways that they can effectively use them. The development and deployment of mechanisms to produce these indicators depend on having access to trustworthy data from field surveys and automated sensors, biological collections, molecular data, and historic academic literature. The transformation of this raw data into synthesized information that is fit for use requires going through many refinement steps. The methodologies and techniques used to manage and analyze this data comprise an area often called biodiversity informatics (or e-Biodiversity). Biodiversity data follows a life cycle consisting of planning, collection, certification, description, preservation, discovery, integration, and analysis. Researchers, whether producers or consumers of biodiversity data, will likely perform activities related to at least one of these steps. This article explores each stage of the life cycle of biodiversity data, discussing its methodologies, tools, and challenges.

Submitted 7 December, 2020; v1 submitted 29 September, 2018; originally announced October 2018.

Journal ref: WIREs Data Mining and Knowledge Discovery (2020)

arXiv:1801.03915 [pdf, ps, other]

doi 10.7717/peerj.5551

BioWorkbench: A High-Performance Framework for Managing and Analyzing Bioinformatics Experiments

Authors: Maria Luiza Mondelli, Thiago Magalhães, Guilherme Loss, Michael Wilde, Ian Foster, Marta Mattoso, Daniel S. Katz, Helio J. C. Barbosa, Ana Tereza R. Vasconcelos, Kary Ocaña, Luiz M. R. Gadelha Jr

Abstract: Advances in sequencing techniques have led to exponential growth in biological data, demanding the development of large-scale bioinformatics experiments. Because these experiments are computation- and data-intensive, they require high-performance computing (HPC) techniques and can benefit from specialized technologies such as Scientific Workflow Management Systems (SWfMS) and databases. In this work, we present BioWorkbench, a framework for managing and analyzing bioinformatics experiments. This framework automatically collects provenance data, including both performance data from workflow execution and data from the scientific domain of the workflow application. Provenance data can be analyzed through a web application that abstracts a set of queries to the provenance database, simplifying access to provenance information. We evaluate BioWorkbench using three case studies: SwiftPhylo, a phylogenetic tree assembly workflow; SwiftGECKO, a comparative genomics workflow; and RASflow, a RASopathy analysis workflow. We analyze each workflow from both computational and scientific domain perspectives, by using queries to a provenance and annotation database. Some of these queries are available as a pre-built feature of the BioWorkbench web application. Through the provenance data, we show that the framework is scalable and achieves high-performance, reducing up to 98% of the case studies execution time. We also show how the application of machine learning techniques can enrich the analysis process.

Submitted 11 January, 2018; originally announced January 2018.

Journal ref: PeerJ, 6 (2018), e5551

Showing 1–13 of 13 results for author: Gadelha, M R