
Supporting Error Chains in Static Analysis for Precise Evaluation Results and Enhanced Usability

Authors: Anna-Katharina Wickert, Michael Schlichtig, Marvin Vogel, Lukas Winter, Mira Mezini, Eric Bodden

Abstract: Context: Static analyses are well established as an aid to understanding bugs or vulnerabilities during the development process or in large-scale studies. A low false-positive rate is essential for adoption in practice and for precise results in empirical studies. Unfortunately, static analyses tend to report where a vulnerability manifests rather than the fix location. This can cause presumed false positives or imprecise results. Method: To address this problem, we designed an adaptation of an existing static analysis algorithm that can distinguish between a manifestation and fix location, and reports error chains. An error chain represents at least two interconnected errors that occur successively, thus building the connection between the fix and manifestation location. We used our tool CogniCryptSUBS for a case study on 471 GitHub repositories, a performance benchmark to compare different analysis configurations, and conducted an expert interview. Result: We found that 50 % of the projects with a report had at least one error chain. Our runtime benchmark demonstrated that our improvement caused only a minimal runtime overhead of less than 4 %. The results of our expert interview indicate that with our adapted version participants require fewer executions of the analysis. Conclusion: Our results indicate that error chains occur frequently in real-world projects, and ignoring them can lead to imprecise evaluation results. The runtime benchmark indicates that our tool is a feasible and efficient solution for detecting error chains in real-world projects. Further, our results hint that the usability of static analyses may benefit from supporting error chains.

Submitted 12 March, 2024; originally announced March 2024.

Comments: 12 pages, 4 figures, accepted by the IEEE International Conference on Software Analysis, Evolution and Reengineering (SANER), March 12-15, 2024, Rovaniemi, Finland at the research papers track

arXiv:2403.07501 [pdf, other]

doi 10.1145/3643796.3648464

Detecting Security-Relevant Methods using Multi-label Machine Learning

Authors: Oshando Johnson, Goran Piskachev, Ranjith Krishnamurthy, Eric Bodden

Abstract: To detect security vulnerabilities, static analysis tools need to be configured with security-relevant methods. Current approaches can automatically identify such methods using binary relevance machine learning approaches. However, they ignore dependencies among security-relevant methods, over-generalize and perform poorly in practice. Additionally, users nevertheless have to manually configure static analysis tools using the detected methods. Based on feedback from users and our observations, the excessive manual steps can often be tedious, error-prone and counter-intuitive. In this paper, we present Dev-Assist, an IntelliJ IDEA plugin that detects security-relevant methods using a multi-label machine learning approach that considers dependencies among labels. The plugin can automatically generate configurations for static analysis tools, run the static analysis, and show the results in IntelliJ IDEA. Our experiments reveal that Dev-Assist's machine learning approach has a higher F1-Measure than related approaches. Moreover, the plugin reduces and simplifies the manual effort required when configuring and using static analysis tools.

Submitted 12 March, 2024; originally announced March 2024.

Comments: 6 pages, 3 figures, The IDE Workshop

arXiv:2402.17679 [pdf, ps, other]

The Emergence of Large Language Models in Static Analysis: A First Look through Micro-Benchmarks

Authors: Ashwin Prasad Shivarpatna Venkatesh, Samkutty Sabu, Amir M. Mir, Sofia Reis, Eric Bodden

Abstract: The application of Large Language Models (LLMs) in software engineering, particularly in static analysis tasks, represents a paradigm shift in the field. In this paper, we investigate the role that current LLMs can play in improving callgraph analysis and type inference for Python programs. Using the PyCG, HeaderGen, and TypeEvalPy micro-benchmarks, we evaluate 26 LLMs, including OpenAI's GPT series and open-source models such as LLaMA. Our study reveals that LLMs show promising results in type inference, demonstrating higher accuracy than traditional methods, yet they exhibit limitations in callgraph analysis. This contrast emphasizes the need for specialized fine-tuning of LLMs to better suit specific static analysis tasks. Our findings provide a foundation for further research towards integrating LLMs for static analysis tasks.

Submitted 27 February, 2024; originally announced February 2024.

Comments: To be published in: ICSE FORGE 2024 (AI Foundation Models and Software Engineering)

arXiv:2402.07889 [pdf, other]

Toward an Android Static Analysis Approach for Data Protection

Authors: Mugdha Khedkar, Eric Bodden

Abstract: Android applications collecting data from users must protect it according to the current legal frameworks. Such data protection has become even more important since the European Union rolled out the General Data Protection Regulation (GDPR). Since app developers are not legal experts, they find it difficult to write privacy-aware source code. Moreover, they have limited tool support to reason about data protection throughout their app development process. This paper motivates the need for a static analysis approach to diagnose and explain data protection in Android apps. The analysis will recognize personal data sources in the source code, and aims to further examine the data flow originating from these sources. App developers can then address key questions about data manipulation, derived data, and the presence of technical measures. Despite challenges, we explore to what extent one can realize this analysis through static taint analysis, a common method for identifying security vulnerabilities. This is a first step towards designing a tool-based approach that aids app developers and assessors in ensuring data protection in Android apps, based on automated static program analysis.

Submitted 12 February, 2024; originally announced February 2024.

Comments: Accepted at MOBILESoft 2024 Research Forum Track

arXiv:2401.14813 [pdf, other]

Symbol-Specific Sparsification of Interprocedural Distributive Environment Problems

Authors: Kadiray Karakaya, Eric Bodden

Abstract: Previous work has shown that one can often greatly speed up static analysis by computing data flows not for every edge in the program's control-flow graph but instead only along definition-use chains. This yields a so-called sparse static analysis. Recent work on SparseDroid has shown that specifically taint analysis can be "sparsified" with extraordinary effectiveness because the taint state of one variable does not depend on those of others. This allows one to soundly omit more flow-function computations than in the general case. In this work, we now assess whether this result carries over to the more generic setting of so-called Interprocedural Distributive Environment (IDE) problems. In contrast to taint analysis, IDE comprises distributive problems with large or even infinitely broad domains, such as typestate analysis or linear constant propagation. Specifically, this paper presents Sparse IDE, a framework that realizes sparsification for any static analysis that fits the IDE framework. We implement Sparse IDE in SparseHeros, as an extension to the popular Heros IDE solver, and evaluate its performance on real-world Java libraries by comparing it to the baseline IDE algorithm. To this end, we design, implement and evaluate a linear constant propagation analysis client on top of SparseHeros. Our experiments show that, although IDE analyses can only be sparsified with respect to symbols and not (numeric) values, Sparse IDE can nonetheless yield significantly lower runtimes and often also lower memory consumption compared to the original IDE.

Submitted 26 January, 2024; originally announced January 2024.

Comments: To be published in ICSE 2024
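
The sparsification idea in the abstract above can be illustrated with a toy example. The following is a minimal sketch, not the SparseDroid or Sparse IDE algorithm: it assumes a hypothetical straight-line program in single-assignment form with no aliasing or branching, and the names `PROGRAM`, `SOURCE`, `dense_taint`, and `sparse_taint` are invented for illustration. It contrasts a dense pass that visits every statement with a sparse pass that follows only definition-use edges starting at tainted variables.

```python
from collections import defaultdict, deque

# Hypothetical straight-line program in single-assignment form:
# each entry is (defined_variable, [used_variables]).
PROGRAM = [
    ("a", []),          # 0: a = source()  (taint source, by assumption)
    ("b", []),          # 1: b = 1
    ("c", ["b"]),       # 2: c = b + 1
    ("d", ["a"]),       # 3: d = a
    ("e", ["c"]),       # 4: e = c * 2
    ("f", ["d", "e"]),  # 5: f = d + e
]
SOURCE = "a"


def dense_taint(program, source):
    """Visit every statement in program order, as a dense analysis would."""
    tainted, visited = {source}, 0
    for target, uses in program:
        visited += 1
        if any(u in tainted for u in uses):
            tainted.add(target)
    return tainted, visited


def sparse_taint(program, source):
    """Visit only statements reachable via def-use edges from tainted variables."""
    users = defaultdict(list)  # variable -> indices of statements that use it
    for index, (_, uses) in enumerate(program):
        for u in uses:
            users[u].append(index)

    tainted, visited = {source}, 0
    worklist = deque([source])
    while worklist:
        var = worklist.popleft()
        for index in users[var]:
            visited += 1
            target = program[index][0]
            if target not in tainted:
                tainted.add(target)
                worklist.append(target)
    return tainted, visited
```

On this toy input both passes compute the same tainted set, while the sparse pass touches only the statements that actually use a tainted variable. Real solvers additionally handle control flow, kills, and interprocedural edges, which this sketch deliberately leaves out.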

arXiv:2312.16882 [pdf, ps, other]

TypeEvalPy: A Micro-benchmarking Framework for Python Type Inference Tools

Authors: Ashwin Prasad Shivarpatna Venkatesh, Samkutty Sabu, Jiawei Wang, Amir M. Mir, Li Li, Eric Bodden

Abstract: In light of the growing interest in type inference research for Python, both researchers and practitioners require a standardized process to assess the performance of various type inference techniques. This paper introduces TypeEvalPy, a comprehensive micro-benchmarking framework for evaluating type inference tools. TypeEvalPy contains 154 code snippets with 845 type annotations across 18 categories that target various Python features. The framework manages the execution of containerized tools, transforms inferred types into a standardized format, and produces meaningful metrics for assessment. Through our analysis, we compare the performance of six type inference tools, highlighting their strengths and limitations. Our findings provide a foundation for further research and optimization in the domain of Python type inference.

Submitted 2 January, 2024; v1 submitted 28 December, 2023; originally announced December 2023.

Comments: To be published in ICSE 2024

arXiv:2310.06758 [pdf, other]

slash: A Technique for Static Configuration-Logic Identification

Authors: Mohannad Alhanahnah, Philipp Schubert, Thomas Reps, Somesh Jha, Eric Bodden

Abstract: Researchers have recently devised tools for debloating software and detecting configuration errors. Several of these tools rely on the observation that programs are composed of an initialization phase followed by a main-computation phase. Users of these tools are required to manually annotate the boundary that separates these phases, a task that can be time-consuming and error-prone (typically, the user has to read and understand the source code or trace executions with a debugger). Because errors can impair the tool's accuracy and functionality, the manual-annotation requirement hinders the ability to apply the tools on a large scale. In this paper, we present a field study of 24 widely-used C/C++ programs, identifying common boundary properties in 96% of them. We then introduce slash, an automated tool that locates the boundary based on the identified properties. slash successfully identifies the boundary in 87.5% of the studied programs within 8.5 minutes, using up to 4.4 GB memory. In an independent test, carried out after slash was developed, slash identified the boundary in 85.7% of a dataset of 21 popular C/C++ GitHub repositories. Finally, we demonstrate slash's potential to streamline the boundary-identification process of software-debloating and error-detection tools.

Submitted 20 November, 2023; v1 submitted 10 October, 2023; originally announced October 2023.

arXiv:2301.04419 [pdf, other]

Static Analysis Driven Enhancements for Comprehension in Machine Learning Notebooks

Authors: Ashwin Prasad Shivarpatna Venkatesh, Samkutty Sabu, Mouli Chekkapalli, Jiawei Wang, Li Li, Eric Bodden

Abstract: Jupyter notebooks enable developers to interleave code snippets with rich-text and in-line visualizations. Data scientists use Jupyter notebooks as the de-facto standard for creating and sharing machine-learning based solutions, primarily written in Python. Recent studies have demonstrated, however, that a large portion of Jupyter notebooks available on public platforms are undocumented and lack a narrative structure. This reduces the readability of these notebooks. To address this shortcoming, this paper presents HeaderGen, a novel tool-based approach that automatically annotates code cells with categorical markdown headers based on a taxonomy of ML operations, and classifies and displays function calls according to this taxonomy. For this functionality to be realized, HeaderGen enhances an existing call graph analysis in PyCG. To improve precision, HeaderGen extends PyCG's analysis with support for handling external library code and flow-sensitivity. The former is realized by facilitating the resolution of function return-types. The evaluation on 15 real-world Jupyter notebooks from Kaggle shows that HeaderGen's underlying call graph analysis yields high accuracy (95.6% precision and 95.3% recall). This is because HeaderGen can resolve return-types of external libraries where existing type inference tools such as pytype (by Google), pyright (by Microsoft), and Jedi fall short. The header generation has a precision of 85.7% and a recall rate of 92.8%. In a user study, HeaderGen helps participants finish comprehension and navigation tasks faster. To further evaluate the type inference capability of tools, we introduce TypeEvalPy, a framework for evaluating type inference tools with a micro-benchmark containing 154 code snippets and 845 type annotations. Our comparative analysis on four tools revealed that HeaderGen outperforms other tools in exact matches with the ground truth.

Submitted 11 June, 2024; v1 submitted 11 January, 2023; originally announced January 2023.

Comments: To be published in: EMSE Journal
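
Several entries in this listing report precision, recall, or an F1-Measure. As a reminder of how these standard metrics relate, here is a minimal sketch using their textbook definitions. The function names and the counts in the usage note are hypothetical and are not taken from any of the listed papers.

```python
def precision_recall(tp, fp, fn):
    """Return (precision, recall) from true-positive, false-positive,
    and false-negative counts; a ratio with a zero denominator is 0.0."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


def f1_score(precision, recall):
    """Harmonic mean of precision and recall; 0.0 if both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```

For example, with hypothetical counts of 90 true positives, 10 false positives, and 30 false negatives, `precision_recall(90, 10, 30)` gives a precision of 0.9 and a recall of 0.75, and `f1_score` combines the two into a single value between them that is pulled toward the lower one.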

arXiv:2208.08173 [pdf, other]

doi 10.1145/3554732

An In-depth Study of Java Deserialization Remote-Code Execution Exploits and Vulnerabilities

Authors: Imen Sayar, Alexandre Bartel, Eric Bodden, Yves Le Traon

Abstract: Nowadays, an increasing number of applications use deserialization. This technique, based on rebuilding the instance of objects from serialized byte streams, can be dangerous since it can open the application to attacks such as remote code execution (RCE) if the data to deserialize originates from an untrusted source. Deserialization vulnerabilities are so critical that they are in OWASP's list of top 10 security risks for web applications. This is mainly caused by faults in the development process of applications and by flaws in their dependencies, i.e., flaws in the libraries used by these applications. No previous work has studied deserialization attacks in-depth: How are they performed? How are weaknesses introduced and patched? And for how long are vulnerabilities present in the codebase? To yield a deeper understanding of this important kind of vulnerability, we perform two main analyses: one on attack gadgets, i.e., exploitable pieces of code, present in Java libraries, and one on vulnerabilities present in Java applications. For the first analysis, we conduct an exploratory large-scale study by running 256515 experiments in which we vary the versions of libraries for each of the 19 publicly available exploits. Such attacks rely on a combination of gadgets present in one or multiple Java libraries. A gadget is a method that uses objects or fields that can be attacker-controlled. Our goal is to precisely identify library versions containing gadgets and to understand how gadgets have been introduced and how they have been patched. We observe that the modification of one innocent-looking detail in a class -- such as making it public -- can already introduce a gadget. Furthermore, we noticed that among the studied libraries, 37.5% are not patched, leaving gadgets available for future attacks. For the second analysis, we manually analyze 104 deserialization-vulnerability CVEs to understand how vulnerabilities are introduced and patched in real-life Java applications. Results indicate that the vulnerabilities are not always completely patched or that a workaround solution is proposed. With a workaround solution, applications are still vulnerable since the code itself is unchanged.

Submitted 17 August, 2022; originally announced August 2022.

Comments: ACM Transactions on Software Engineering and Methodology, Association for Computing Machinery, 2022

arXiv:2208.06136 [pdf, ps, other]

How far are German companies in improving security through static program analysis tools?

Authors: Goran Piskachev, Stefan Dziwok, Thorsten Koch, Sven Merschjohan, Eric Bodden

Abstract: As security becomes more relevant for many companies, the popularity of static program analysis (SPA) tools is increasing. In this paper, we target the use of SPA tools among companies in Germany with a focus on security. We give insights on the current issues and the developers' willingness to configure the tools to overcome these issues. Compared to previous studies, our study considers the companies' culture and processes for using SPA tools. We conducted an online survey with 256 responses and semi-structured interviews with 17 product owners and executives from multiple companies. Our results show a diversity in the usage of tools. Only half of our survey participants use SPA tools. The free tools tend to be more popular among software developers. In most companies, software developers are encouraged to use free tools, whereas commercial tools can be requested. However, the product owners and executives in our interviews reported that their developers do not request new tools. We also find that automatic security checks with tools are rarely performed on each release.

Submitted 12 August, 2022; originally announced August 2022.

Comments: IEEE Secure Development Conference 2022

arXiv:2207.09379 [pdf, ps, other]

To what extent can we analyze Kotlin programs using existing Java taint analysis tools? (Extended Version)

Authors: Ranjith Krishnamurthy, Goran Piskachev, Eric Bodden

Abstract: As an alternative to Java, Kotlin has gained rapid popularity since its introduction and has become the default choice for developing Android apps. However, due to its interoperability with Java, Kotlin programs may contain almost the same security vulnerabilities as their Java counterparts. Hence, we question: to what extent can one use an existing Java static taint analysis on Kotlin code? In this paper, we investigate the challenges in implementing a taint analysis for Kotlin compared to Java. To answer this question, we performed an exploratory study where each Kotlin construct was examined and compared to its Java equivalent. We identified 18 engineering challenges that static-analysis writers need to handle differently due to Kotlin's unique constructs or the differences in the generated bytecode between the Kotlin and Java compilers. For eight of them, we provide a conceptual solution, while six of those we implemented as part of SecuCheck-Kotlin, an extension to the existing Java taint analysis SecuCheck.

Submitted 29 July, 2022; v1 submitted 19 July, 2022; originally announced July 2022.

Comments: 12 pages, Technical Report

arXiv:2204.06447 [pdf, ps, other]

CamBench -- Cryptographic API Misuse Detection Tool Benchmark Suite

Authors: Michael Schlichtig, Anna-Katharina Wickert, Stefan Krüger, Eric Bodden, Mira Mezini

Abstract: Context: Cryptographic APIs are often misused in real-world applications. Therefore, many cryptographic API misuse detection tools have been introduced. However, there exists no established reference benchmark for a fair and comprehensive comparison and evaluation of these tools. While there are benchmarks, they often only address a subset of the domain or were only used to evaluate a subset of existing misuse detection tools. Objective: To fairly compare cryptographic API misuse detection tools and to drive future development in this domain, we will devise such a benchmark. Openness and transparency in the generation process are key factors to fairly generate and establish the needed benchmark. Method: We propose an approach where we derive the benchmark generation methodology from the literature which consists of general best practices in benchmarking and domain-specific benchmark generation. A part of this methodology is transparency and openness of the generation process, which is achieved by pre-registering this work. Based on our methodology we design CamBench, a fair "Cryptographic API Misuse Detection Tool Benchmark Suite". We will implement the first version of CamBench limiting the domain to Java, the JCA, and static analyses. Finally, we will use CamBench to compare current misuse detection tools and compare CamBench to related benchmarks of its domain.

Submitted 13 April, 2022; originally announced April 2022.

Comments: 8 pages, accepted at the MSR 2022 Registered Reports Track as an In-Principle Acceptance (IPA)

arXiv:2204.03089 [pdf, other]

Fluently specifying taint-flow queries with fluentTQL

Authors: Goran Piskachev, Johannes Späth, Ingo Budde, Eric Bodden

Abstract: Previous work has shown that taint analyses are only useful if correctly customized to the context in which they are used. Existing domain-specific languages (DSLs) allow such customization through the definition of deny-listing data-flow rules that describe potentially vulnerable taint-flows. These languages, however, are designed primarily for security experts who are knowledgeable in taint analysis. Software developers consider these languages to be complex. This paper presents fluentTQL, a query language particularly for taint-flow. fluentTQL is an internal Java DSL and uses a fluent-interface design. fluentTQL queries can express various taint-style vulnerability types, e.g. injections, cross-site scripting or path traversal. This paper describes fluentTQL's abstract and concrete syntax and defines its runtime semantics. The semantics are independent of any underlying analysis and allow evaluation of fluentTQL queries by a variety of taint analyses. Instantiations of fluentTQL, on top of two taint analysis solvers, Boomerang and FlowDroid, show and validate fluentTQL's expressiveness. Based on existing examples from the literature, we implemented queries for 11 popular security vulnerability types in Java. Using our SQL injection specification, the Boomerang-based taint analysis found all 17 known taint-flows in the OWASP WebGoat application, whereas with FlowDroid 13 taint-flows were found. Similarly, in a vulnerable version of the Java PetClinic application, the Boomerang-based taint analysis found all seven expected taint-flows. In seven real-world Android apps with 25 expected taint-flows, 18 were detected. In a user study with 26 software developers, fluentTQL reached a high usability score. In comparison to CodeQL, the state-of-the-art DSL by Semmle/GitHub, participants found fluentTQL more usable and with it they were able to specify taint analysis queries in shorter time.

Submitted 6 April, 2022; originally announced April 2022.

Comments: 39 pages, Springer Journal on Empirical Software Engineering

arXiv:2105.04950 [pdf, other]

Dealing with Variability in API Misuse Specification

Authors: Rodrigo Bonifacio, Stefan Krüger, Krishna Narasimhan, Eric Bodden, Mira Mezini

Abstract: APIs are the primary mechanism for developers to gain access to externally defined services and tools. However, previous research has revealed API misuses that violate the contract of APIs to be prevalent. Such misuses can have harmful consequences, especially in the context of cryptographic libraries. Various API misuse detectors have been proposed to address this issue, including CogniCrypt, one of the most versatile such detectors, which uses the language CrySL to specify cryptographic API usage contracts. Nonetheless, existing approaches to detect API misuse have not been designed for systematic reuse, ignoring the fact that different versions of a library, different versions of a platform, and different recommendations or guidelines might introduce variability in the correct usage of an API. Yet, little is known about how such variability impacts the specification of the correct API usage. This paper investigates this question by analyzing the impact of various sources of variability on widely used Java cryptographic libraries including JCA, Bouncy Castle, and Google Tink. The results of our investigation show that sources of variability like new versions of the API and security standards significantly impact the specifications. We then use the insights gained from our investigation to motivate an extension to the CrySL language named MetaCrySL, which builds on meta programming concepts. We evaluate MetaCrySL by specifying usage rules for a family of Android versions and illustrate that MetaCrySL can model all forms of variability we identified and drastically reduce the size of a family of specifications for the correct usage of cryptographic APIs.

Submitted 17 May, 2021; v1 submitted 11 May, 2021; originally announced May 2021.

Comments: 28 pages, 16 figures

MSC Class: 68N19 ACM Class: D.2.1; D.3.3

arXiv:1908.01489 [pdf, other]

The Impact of Developer Experience in Using Java Cryptography

Authors: Mohammadreza Hazhirpasand, Mohammad Ghafari, Stefan Krüger, Eric Bodden, Oscar Nierstrasz

Abstract: Previous research has shown that crypto APIs are hard for developers to understand and difficult for them to use. They consequently rely on unvalidated boilerplate code from online resources where security vulnerabilities are common. We analyzed 2,324 open-source Java projects that rely on Java Cryptography Architecture (JCA) to understand how crypto APIs are used in practice, and what factors account for the performance of developers in using these APIs. We found that, in general, the experience of developers in using JCA does not correlate with their performance. In particular, none of the factors such as the number or frequency of committed lines of code, the number of JCA APIs developers use, or the number of projects they are involved in correlate with developer performance in this domain. We call for qualitative studies to shed light on the reasons underlying the success of developers who are expert in using cryptography. Also, detailed investigation at API level is necessary to further clarify developer obstacles in this domain.

Submitted 5 August, 2019; originally announced August 2019.

Comments: The ACM/IEEE International Symposium on Empirical Software Engineering and Measurement (ESEM)

arXiv:1901.03603 [pdf, other]

ACMiner: Extraction and Analysis of Authorization Checks in Android's Middleware

Authors: Sigmund Albert Gorski III, Benjamin Andow, Adwait Nadkarni, Sunil Manandhar, William Enck, Eric Bodden, Alexandre Bartel

Abstract: Billions of users rely on the security of the Android platform to protect phones, tablets, and many different types of consumer electronics. While Android's permission model is well studied, the enforcement of the protection policy has received relatively little attention. Much of this enforcement is spread across system services, taking the form of hard-coded checks within their implementations. In this paper, we propose Authorization Check Miner (ACMiner), a framework for evaluating the correctness of Android's access control enforcement through consistency analysis of authorization checks. ACMiner combines program and text analysis techniques to generate a rich set of authorization checks, mines the corresponding protection policy for each service entry point, and uses association rule mining at a service granularity to identify inconsistencies that may correspond to vulnerabilities. We used ACMiner to study the AOSP version of Android 7.1.1 to identify 28 vulnerabilities relating to missing authorization checks. In doing so, we demonstrate ACMiner's ability to help domain experts process thousands of authorization checks scattered across millions of lines of code.

Submitted 11 January, 2019; originally announced January 2019.

arXiv:1804.02903 [pdf, other]

doi 10.1145/3236024.3236029

Do Android Taint Analysis Tools Keep Their Promises?

Authors: Felix Pauck, Eric Bodden, Heike Wehrheim

Abstract: In recent years, researchers have developed a number of tools to conduct taint analysis of Android applications. While all the respective papers aim at providing a thorough empirical evaluation, comparability is hindered by varying or unclear evaluation targets. Sometimes, the apps used for evaluation are not precisely described. In other cases, authors use an established benchmark but cover it only partially. In yet other cases, the evaluations differ in terms of the data leaks searched for, or lack a ground truth to compare against. All those limitations make it impossible to truly compare the tools based on those published evaluations. We thus present ReproDroid, a framework allowing the accurate comparison of Android taint analysis tools. ReproDroid supports researchers in inferring the ground truth for data leaks in apps, in automatically applying tools to benchmarks, and in evaluating the obtained results. We use ReproDroid to comparatively evaluate on equal grounds the six prominent taint analysis tools Amandroid, DIALDroid, DidFail, DroidSafe, FlowDroid and IccTA. The results are largely positive although four tools violate some promises concerning features and accuracy. Finally, we contribute to the area of unbiased benchmarking with a new and improved version of the open test suite DroidBench.

Submitted 30 July, 2019; v1 submitted 9 April, 2018; originally announced April 2018.

arXiv:1801.04894 [pdf, other]

Debugging Static Analysis

Authors: Lisa Nguyen Quang Do, Stefan Krüger, Patrick Hill, Karim Ali, Eric Bodden

Abstract: To detect and fix bugs and security vulnerabilities, software companies use static analysis as part of the development process. However, static analysis code itself is also prone to bugs. To ensure a consistent level of precision, as analyzed programs grow more complex, a static analysis has to handle more code constructs, frameworks, and libraries that the programs use. While more complex analyses are written and used in production systems every day, the cost of debugging and fixing them also increases tremendously. To better understand the difficulties of debugging static analyses, we surveyed 115 static analysis writers. From their responses, we extracted the core requirements to build a debugger for static analysis, which revolve around two main issues: (1) abstracting from two code bases at the same time (the analysis code and the analyzed code) and (2) tracking the analysis internal state throughout both code bases. Most current debugging tools that our survey participants use lack the capabilities to address both issues. Focusing on those requirements, we introduce VisuFlow, a debugging environment for static data-flow analysis that is integrated in the Eclipse development environment. VisuFlow features graph visualizations that enable users to view the state of a data-flow analysis and its intermediate results at any time. Special breakpoints in VisuFlow help users step through the analysis code and the analyzed code simultaneously. To evaluate the usefulness of VisuFlow, we have conducted a user study on 20 static analysis writers. Using VisuFlow helped our sample of analysis writers identify 25% and fix 50% more errors in the analysis code compared to using the standard Eclipse debugging environment.

Submitted 15 January, 2018; originally announced January 2018.

arXiv:1710.07430 [pdf, other]

Self-adaptive static analysis

Authors: Eric Bodden

Abstract: Static code analysis is a powerful approach to detect quality deficiencies such as performance bottlenecks, safety violations or security vulnerabilities already during a software system's implementation. Yet, as current software systems continue to grow, current static-analysis systems more frequently face the problem of insufficient scalability. We argue that this is mainly due to the fact that current static analyses are implemented fully manually, often in general-purpose programming languages such as Java or C, or in declarative languages such as Datalog. This design choice predefines the way in which the static analysis evaluates, and limits the optimizations and extensions static-analysis designers can apply. To boost scalability to a new level, we propose to fuse static-analysis with just-in-time-optimization technology, introducing for the first time static analyses that are managed and inherently self-adaptive. Those analyses automatically adapt themselves to yield a performance/precision tradeoff that is optimal with respect to the analyzed software system and to the analysis itself. Self-adaptivity is enabled by the novel idea of designing a dedicated intermediate representation, not for the analyzed program but for the analysis itself. This representation allows for an automatic optimization and adaptation of the analysis code, both ahead-of-time (through static analysis of the static analysis) as well as just-in-time during the analysis' execution, similar to just-in-time compilers.

Submitted 20 October, 2017; originally announced October 2017.

arXiv:1710.00564 [pdf, ps, other]

CrySL: Validating Correct Usage of Cryptographic APIs

Authors: Stefan Krüger, Johannes Späth, Karim Ali, Eric Bodden, Mira Mezini

Abstract: Various studies have empirically shown that the majority of Java and Android apps misuse cryptographic libraries, causing devastating breaches of data security. Therefore, it is crucial to detect such misuses early in the development process. The fact that insecure usages are not the exception but the norm precludes approaches based on property inference and anomaly detection. In this paper, we present CrySL, a definition language that enables cryptography experts to specify the secure usage of the cryptographic libraries that they provide. CrySL combines the generic concepts of method-call sequences and data-flow constraints with domain-specific constraints related to cryptographic algorithms and their parameters. We have implemented a compiler that translates a CrySL ruleset into a context- and flow-sensitive demand-driven static analysis. The analysis automatically checks a given Java or Android app for violations of the CrySL-encoded rules. We empirically evaluated our ruleset through analyzing 10,001 Android apps. Our results show that misuse of cryptographic APIs is still widespread, with 96% of apps containing at least one misuse. However, we observed fewer of the misuses that were reported in previous work.

Submitted 2 October, 2017; originally announced October 2017.

Comments: 11 pages

arXiv:1710.00390 [pdf, other]

Computation on Encrypted Data using Data Flow Authentication

Authors: Andreas Fischer, Benny Fuhry, Florian Kerschbaum, Eric Bodden

Abstract: Encrypting data before sending it to the cloud protects it against hackers and malicious insiders, but requires the cloud to compute on encrypted data. Trusted (hardware) modules, e.g., secure enclaves like Intel's SGX, can very efficiently run entire programs in encrypted memory. However, it has already been demonstrated that software vulnerabilities give an attacker ample opportunity to insert arbitrary code into the program. This code can then modify the data flow of the program and leak any secret in the program to an observer in the cloud via SGX side-channels. Since any larger program is rife with software vulnerabilities, it is not a good idea to outsource entire programs to an SGX enclave. A secure alternative with a small trusted code base would be fully homomorphic encryption (FHE) -- the holy grail of encrypted computation. However, due to its high computational complexity it is unlikely to be adopted in the near future. As a result, researchers have made several proposals for transforming programs to perform encrypted computations on less powerful encryption schemes. Yet, current approaches fail on programs that make control-flow decisions based on encrypted data. In this paper, we introduce the concept of data flow authentication (DFAuth). DFAuth prevents an adversary from arbitrarily deviating from the data flow of a program. Hence, an attacker cannot perform an attack as outlined before on SGX. This allows all programs, even those including operations on control-flow decision variables, to be computed on encrypted data. We implemented DFAuth using a novel authenticated homomorphic encryption scheme, a Java bytecode-to-bytecode compiler producing fully executable programs, and SGX enclaves. A transformed neural network that performs machine learning on sensitive medical data can be evaluated on encrypted inputs and encrypted weights in 0.86 seconds.

Submitted 1 October, 2017; originally announced October 2017.

arXiv:1605.08159 [pdf, ps, other]

Analyzing the Gadgets Towards a Metric to Measure Gadget Quality

Authors: Andreas Follner, Alexandre Bartel, Eric Bodden

Abstract: Current low-level exploits often rely on code-reuse, whereby short sections of code (gadgets) are chained together into a coherent exploit that can be executed without the need to inject any code. Several protection mechanisms attempt to eliminate this attack vector by applying code transformations to reduce the number of available gadgets. Nevertheless, it has emerged that the residual gadgets can still be sufficient to conduct a successful attack. Crucially, the lack of a common metric for "gadget quality" hinders the effective comparison of current mitigations. This work proposes four metrics that assign scores to a set of gadgets, measuring quality, usefulness, and practicality. We apply these metrics to binaries produced when compiling programs for architectures implementing Intel's recent MPX CPU extensions. Our results demonstrate a 17% increase in useful gadgets in MPX binaries, and a decrease in side-effects and preconditions, making them better suited for ROP attacks.

Submitted 26 May, 2016; originally announced May 2016.

Comments: International Symposium on Engineering Secure Software and Systems, Apr 2016, London, United Kingdom

arXiv:1504.02288 [pdf, ps, other]

ROPocop - Dynamic Mitigation of Code-Reuse Attacks

Authors: Andreas Follner, Eric Bodden

Abstract: Control-flow attacks, usually achieved by exploiting a buffer-overflow vulnerability, have been a serious threat to system security for over fifteen years. Researchers have answered the threat with various mitigation techniques, but nevertheless, new exploits that successfully bypass these technologies still appear on a regular basis. In this paper, we propose ROPocop, a novel approach for detecting and preventing the execution of injected code and for mitigating code-reuse attacks such as return-oriented programming (ROP). ROPocop uses dynamic binary instrumentation, requiring neither access to source code or debug symbols nor changes to the operating system. It mitigates attacks both by monitoring the program counter at potentially dangerous points and by detecting suspicious program flows. We have implemented ROPocop for Windows x86 using PIN, a dynamic program instrumentation framework from Intel. Benchmarks using the SPEC CPU2006 suite show an average overhead of 2.4x, which is comparable to similar approaches, which give weaker guarantees. Real-world applications show only an initially noticeable input lag and no stutter. In our evaluation our tool successfully detected all 11 of the latest real-world code-reuse exploits, with no false alarms. Therefore, despite the overhead, it is a viable, temporary solution to secure critical systems against exploits if a vendor patch is not yet available.

Submitted 9 April, 2015; originally announced April 2015.

arXiv:1404.7431 [pdf, ps, other]

I know what leaked in your pocket: uncovering privacy leaks on Android Apps with Static Taint Analysis

Authors: Li Li, Alexandre Bartel, Jacques Klein, Yves Le Traon, Steven Arzt, Siegfried Rasthofer, Eric Bodden, Damien Octeau, Patrick McDaniel

Abstract: Android applications may leak privacy data carelessly or maliciously. In this work we perform inter-component data-flow analysis to detect privacy leaks between components of Android applications. Unlike all current approaches, our tool, called IccTA, propagates the context between the components, which improves the precision of the analysis. IccTA outperforms all other available tools by reaching a precision of 95.0% and a recall of 82.6% on DroidBench. Our approach detects 147 inter-component based privacy leaks in 14 applications in a set of 3000 real-world applications with a precision of 88.4%. With the help of ApkCombiner, our approach is able to detect inter-app based privacy leaks.

Submitted 29 April, 2014; originally announced April 2014.

Report number: 978-2-87971-129-4_TR-SNT-2014-9 ACM Class: D.2.4; D.4.6
