Runtime Verification and Field Testing for ROS-Based Robotic Systems

Authors: Ricardo Caldas, Juan Antonio Piñera García, Matei Schiopu, Patrizio Pelliccione, Genaína Rodrigues, Thorsten Berger

Abstract: Robotic systems are becoming pervasive and adopted in increasingly many domains, such as manufacturing, healthcare, and space exploration. To this end, engineering software has emerged as a crucial discipline for building maintainable and reusable robotic systems. Robotics software engineering research has received increasing attention, fostering autonomy as a fundamental goal. However, robotics developers are still challenged trying to achieve this goal given that simulation is not able to deliver solutions to realistically emulate real-world phenomena. Robots also need to operate in unpredictable and uncontrollable environments, which require safe and trustworthy self-adaptation capabilities implemented in software. Typical techniques to address the challenges are runtime verification, field-based testing, and mitigation techniques that enable fail-safe solutions. However, there is no clear guidance to architect ROS-based systems to enable and facilitate runtime verification and field-based testing. This paper aims to fill in this gap by providing guidelines that can help developers and QA teams when developing, verifying or testing their robots in the field. These guidelines are carefully tailored to address the challenges and requirements of testing robotics systems in real-world scenarios. We conducted a literature review on studies addressing runtime verification and field-based testing for robotic systems, mined ROS-based application repositories, and validated the applicability, clarity, and usefulness via two questionnaires with 55 answers. We contribute 20 guidelines formulated for researchers and practitioners in robotic software engineering. Finally, we map our guidelines to open challenges thus far in runtime verification and field-based testing for ROS-based systems, and we outline promising research directions in the field.

Submitted 17 April, 2024; originally announced April 2024.

arXiv:2312.06845 [pdf, other]

High-Cadence Thermospheric Density Estimation enabled by Machine Learning on Solar Imagery

Authors: Shreshth A. Malik, James Walsh, Giacomo Acciarini, Thomas E. Berger, Atılım Güneş Baydin

Abstract: Accurate estimation of thermospheric density is critical for precise modeling of satellite drag forces in low Earth orbit (LEO). Improving this estimation is crucial to tasks such as state estimation, collision avoidance, and re-entry calculations. The largest source of uncertainty in determining thermospheric density is modeling the effects of space weather driven by solar and geomagnetic activity. Current operational models rely on ground-based proxy indices which imperfectly correlate with the complexity of solar outputs and geomagnetic responses. In this work, we directly incorporate NASA's Solar Dynamics Observatory (SDO) extreme ultraviolet (EUV) spectral images into a neural thermospheric density model to determine whether the predictive performance of the model is increased by using space-based EUV imagery data instead of, or in addition to, the ground-based proxy indices. We demonstrate that EUV imagery can enable predictions with much higher temporal resolution and replace ground-based proxies while significantly increasing performance relative to current operational models. Our method paves the way for assimilating EUV image data into operational thermospheric density forecasting models for use in LEO satellite navigation processes.

Submitted 12 November, 2023; originally announced December 2023.

Comments: Accepted at the Machine Learning and the Physical Sciences workshop, NeurIPS 2023

arXiv:2310.02395 [pdf, other]

Detecting Semantic Conflicts with Unit Tests

Authors: Léuson Da Silva, Paulo Borba, Toni Maciel, Wardah Mahmood, Thorsten Berger, João Moisakis, Aldiberg Gomes, Vinícius Leite

Abstract: Branching and merging are common practices in collaborative software development, increasing developers' productivity. Despite such benefits, developers need to merge software and resolve merge conflicts. While modern merge techniques can resolve textual conflicts automatically, they fail when the conflict arises at the semantic level. Although semantic merge tools have been proposed, they are usually based on heavyweight static analyses or need explicit specifications of program behavior. In this work, we take a different route and propose SAM (SemAntic Merge), a semantic merge tool based on the automated generation of unit tests that are used as partial specifications. To evaluate SAM's feasibility for detecting conflicts, we perform an empirical study analyzing more than 80 pairs of changes integrated into common class elements from 51 merge scenarios. Furthermore, we also assess how the four unit-test generation tools used by SAM contribute to conflict identification. We propose and assess the adoption of Testability Transformations and Serialization. Our results show that SAM best performs when combining only the tests generated by Differential EvoSuite and EvoSuite and using the proposed Testability Transformations (nine detected conflicts out of 28). These results reinforce previous findings about the potential of using test-case generation to detect test conflicts.

Submitted 3 October, 2023; originally announced October 2023.

Comments: 53 pages, 10 figures

arXiv:2310.01039 [pdf, other]

Software Reconfiguration in Robotics

Authors: Sven Peldszus, Davide Brugali, Daniel Strüber, Patrizio Pelliccione, Thorsten Berger

Abstract: Since it has often been claimed by academics that reconfiguration is essential, many approaches to reconfiguration, especially of robotic systems, have been developed. Accordingly, the literature on robotics is rich in techniques for reconfiguring robotic systems. However, when talking to researchers in the domain, there seems to be no common understanding of what exactly reconfiguration is and how it relates to other concepts such as adaptation. Beyond this academic perspective, robotics frameworks provide mechanisms for dynamically loading and unloading parts of robotics applications. While we have a fuzzy picture of the state-of-the-art in robotic reconfiguration from an academic perspective, we lack a picture of the state-of-practice from a practitioner perspective. To fill this gap, we survey the literature on reconfiguration in robotic systems by identifying and analyzing 98 relevant papers, review how four major robotics frameworks support reconfiguration, and finally investigate the realization of reconfiguration in 48 robotics applications. When comparing the state-of-the-art with the state-of-practice, we observed a significant discrepancy between them, in particular, the scientific community focuses on complex structural reconfiguration, while in practice only parameter reconfiguration is widely used. Based on our observations, we discuss possible reasons for this discrepancy and conclude with a takeaway message for academics and practitioners interested in robotics.

Submitted 2 October, 2023; originally announced October 2023.

arXiv:2209.11633 [pdf, ps, other]

Formal Semantics of the CDL Language

Authors: Thorsten Berger, Steven She

Abstract: We reverse-engineer a formal semantics of the Component Definition Language (CDL), which is part of the highly configurable, embedded operating system eCos. This work provides the basis for an analysis and comparison of the two variability-modeling languages Kconfig and CDL. The semantics given in this document are based on analyzing the CDL documentation, inspecting the source code of the toolchain, as well as testing the tools on particular examples.

Submitted 23 September, 2022; originally announced September 2022.

Comments: Technical Note, Department of Computer Science, University of Leipzig, Germany

arXiv:2209.04916 [pdf, ps, other]

Formal Semantics of the Kconfig Language

Authors: Steven She, Thorsten Berger

Abstract: The Kconfig language defines a set of symbols that are assigned a value in a configuration. We describe the semantics of the Kconfig language according to the behavior exhibited in the xconfig configurator. We assume an abstract syntax representation for concepts in the Kconfig language and delegate the details of the translation from concrete to abstract syntaxes to a later document.

Submitted 11 September, 2022; originally announced September 2022.

Comments: Technical Note, Department of Electrical and Computer Engineering, University of Waterloo, Canada

arXiv:2208.04211 [pdf, other]

Behavior Trees and State Machines in Robotics Applications

Authors: Razan Ghzouli, Thorsten Berger, Einar Broch Johnsen, Andrzej Wasowski, Swaib Dragule

Abstract: Autonomous robots combine skills to form increasingly complex behaviors, called missions. While skills are often programmed at a relatively low abstraction level, their coordination is architecturally separated and often expressed in higher-level languages or frameworks. State machines have been the go-to language to model behavior for decades, but recently, behavior trees have gained attention among roboticists. Although several implementations of behavior trees are in use, little is known about their usage and scope in the real world. How do concepts offered by behavior trees relate to traditional languages, such as state machines? How are concepts in behavior trees and state machines used in actual applications? This paper is a study of the key language concepts in behavior trees as realized in domain-specific languages (DSLs), internal and external DSLs offered as libraries, and their use in open-source robotic applications supported by the Robot Operating System (ROS). We analyze behavior-tree DSLs and compare them to the standard language for behavior models in robotics: state machines. We identify DSLs for both behavior-modeling languages, and we analyze five in-depth. We mine open-source repositories for robotic applications that use the analyzed DSLs and analyze their usage. We identify similarities between behavior trees and state machines in terms of language design and the concepts offered to accommodate the needs of the robotics domain. We observed that the usage of behavior-tree DSLs in open-source projects is increasing rapidly. We observed similar usage patterns at model structure and at code reuse in the behavior-tree and state-machine models within the mined open-source projects. We contribute all extracted models as a dataset, hoping to inspire the community to use and further develop behavior trees, associated tools, and analysis techniques.

Submitted 6 March, 2023; v1 submitted 8 August, 2022; originally announced August 2022.

Comments: 22 pages, 11 figures. This work is submitted to IEEE TSE Journal. arXiv admin note: substantial text overlap with arXiv:2010.06256

ACM Class: D.0; D.2.13; D.2.2

arXiv:2205.02911 [pdf, other]

doi 10.1109/TITS.2024.3373531

A Driver-Vehicle Model for ADS Scenario-based Testing

Authors: Rodrigo Queiroz, Divit Sharma, Ricardo Caldas, Krzysztof Czarnecki, Sergio García, Thorsten Berger, Patrizio Pelliccione

Abstract: Scenario-based testing for automated driving systems (ADS) must be able to simulate traffic scenarios that rely on interactions with other vehicles. Although many languages for high-level scenario modelling have been proposed, they lack the features to precisely and reliably control the required micro-simulation, while also supporting behavior reuse and test reproducibility for a wide range of interactive scenarios. To fill this gap between scenario design and execution, we propose the Simulated Driver-Vehicle (SDV) model to represent and simulate vehicles as dynamic entities with their behavior being constrained by scenario design and goals set by testers. The model combines driver and vehicle as a single entity. It is based on human-like driving and the mechanical limitations of real vehicles for realistic simulation. The model leverages behavior trees to express high-level behaviors in terms of lower-level maneuvers, affording multiple driving styles and reuse. Furthermore, optimization-based maneuver planners guide the simulated vehicles towards the desired behavior. Our extensive evaluation shows the model's design effectiveness using NHTSA pre-crash scenarios, its motion realism in comparison to naturalistic urban traffic, and its scalability with traffic density. Finally, we show the applicability of our SDV model to test a real ADS and to identify crash scenarios, which are impractical to represent using predefined vehicle trajectories. The SDV model instances can be injected into existing simulation environments via co-simulation.

Submitted 29 May, 2024; v1 submitted 5 May, 2022; originally announced May 2022.

Comments: 15 pages, 15 figures

arXiv:2112.01315 [pdf, other]

A Generator Framework For Evolving Variant-Rich Software

Authors: Christoph Derks, Daniel Strüber, Thorsten Berger

Abstract: Evolving software is challenging, even more when it exists in many different variants. Such software evolves not only in time, but also in space--another dimension of complexity. While evolution in space is supported by a variety of product-line and variability management tools, many of which originate from research, their level of evaluation varies significantly, which threatens their relevance for practitioners and future research. Many tools have only been evaluated on ad hoc datasets, minimal examples or available preprocessor-based product lines, missing the early clone & own phases and the re-engineering into configurable platforms--large parts of the actual evolution lifecycle of variant-rich systems. Our long-term goal is to provide benchmarks to increase the maturity of evaluating such tools. However, providing manually curated benchmarks that cover the whole evolution lifecycle and that are detailed enough to serve as ground truths is challenging. We present the framework vpbench to generate source-code histories of variant-rich systems. Vpbench comprises several modular generators relying on evolution operators that systematically and automatically evolve real codebases and document the evolution in detail. We provide simple and more advanced generators--e.g., relying on code transplantation techniques to obtain whole features from external, real-world projects. We define requirements and demonstrate how vpbench addresses them for the generated version histories, focusing on support for evolution in time and space, the generation of detailed meta-data about the evolution, also considering compileability and extensibility.

Submitted 2 December, 2021; originally announced December 2021.

Comments: 9 pages, 5 figures

arXiv:2108.08139 [pdf, other]

doi 10.1109/ACSOS-C52956.2021.00067

Towards Mapping Control Theory and Software Engineering Properties using Specification Patterns

Authors: Ricardo Caldas, Razan Ghzouli, Alessandro V. Papadopoulos, Patrizio Pelliccione, Danny Weyns, Thorsten Berger

Abstract: A traditional approach to realize self-adaptation in software engineering (SE) is by means of feedback loops. The goals of the system can be specified as formal properties that are verified against models of the system. On the other hand, control theory (CT) provides a well-established foundation for designing feedback loop systems and providing guarantees for essential properties, such as stability, settling time, and steady state error. Currently, it is an open question whether and how traditional SE approaches to self-adaptation consider properties from CT. Answering this question is challenging given the principal differences in representing properties in both fields. In this paper, we take a first step to answer this question. We follow a bottom-up approach where we specify a control design (in Simulink) for a case inspired by Scuderia Ferrari (F1) and provide evidence for stability and safety. The design is then transferred into code (in C) that is further optimized. Next, we define properties that enable verifying whether the control properties still hold at code level. Then, we consolidate the solution by mapping the properties in both worlds using specification patterns as common language and we verify the correctness of this mapping. The mapping offers a reusable artifact to solve similar problems. Finally, we outline opportunities for future work, particularly to refine and extend the mapping and investigate how it can improve the engineering of self-adaptive systems for both SE and CT engineers.

Submitted 23 May, 2022; v1 submitted 18 August, 2021; originally announced August 2021.

Journal ref: 2021 IEEE International Conference on Autonomic Computing and Self-Organizing Systems Companion (ACSOS-C)

arXiv:2107.02063 [pdf, other]

doi 10.3389/frobt.2021.721890

Using Probabilistic Movement Primitives in Analyzing Human Motion Difference under Transcranial Current Stimulation

Authors: Honghu Xue, Rebecca Herzog, Till M Berger, Tobias Bäumer, Anne Weissbach, Elmar Rueckert

Abstract: In medical tasks such as human motion analysis, computer-aided auxiliary systems have become the preferred choice for human experts because of their high efficiency. However, conventional approaches are typically based on user-defined features such as movement onset times, peak velocities, motion vectors or frequency domain analyses. Such approaches entail careful data post-processing or specific domain knowledge to achieve a meaningful feature extraction. Besides, they are prone to noise and the manually defined features could hardly be re-used for other analyses. In this paper, we proposed probabilistic movement primitives (ProMPs), a widely-used approach in robot skill learning, to model human motions. The benefit of ProMPs is that the features are directly learned from the data and ProMPs can capture important features describing the trajectory shape, which can easily be extended to other tasks. Distinct from previous research, where classification tasks are mostly investigated, we applied ProMPs together with a variant of Kullback-Leibler (KL) divergence to quantify the effect of different transcranial current stimulation methods on human motions. We presented an initial result with 10 participants. The results validate ProMPs as a robust and effective feature extractor for human motions.

Submitted 5 July, 2021; originally announced July 2021.

Journal ref: https://www.frontiersin.org/articles/10.3389/frobt.2021.721890/full

arXiv:2104.06161 [pdf, other]

Feature-Oriented Defect Prediction: Scenarios, Metrics, and Classifiers

Authors: Mukelabai Mukelabai, Stefan Strüder, Daniel Strüber, Thorsten Berger

Abstract: Several software defect prediction techniques have been developed over the past decades. These techniques predict defects at the granularity of typical software assets, such as components and files. In this paper, we investigate feature-oriented defect prediction: predicting defects at the granularity of features -- domain-entities that represent software functionality and often cross-cut software assets. Feature-oriented defect prediction can be beneficial since: (i) some features might be more error-prone than others, (ii) characteristics of defective features might be useful to predict other error-prone features, and (iii) feature-specific code might be prone to faults arising from feature interactions. We explore the feasibility and solution space for feature-oriented defect prediction. Our study relies on 12 software projects from which we analyzed 13,685 bug-introducing and corrective commits, and systematically generated 62,868 training and test datasets to evaluate classifiers, metrics, and scenarios. The datasets were generated based on the 13,685 commits, 81 releases, and 24,532 permutations of our 12 projects depending on the scenario addressed. We covered scenarios such as just-in-time (JIT) and cross-project defect prediction. Our results confirm the feasibility of feature-oriented defect prediction. We found the best performance (i.e., precision and robustness) when using the Random Forest classifier, with process and structure metrics. Surprisingly, single-project JIT and release-level predictions had median AUC-ROC values greater than 95% and 90% respectively, contrary to studies that assert poor performance due to insufficient training data. We also found that a model trained on release-level data from one of the twelve projects could predict defect-proneness of features in the other eleven projects with median AUC-ROC of 82%, without retraining.

Submitted 13 April, 2021; originally announced April 2021.

Comments: 16 pages, 10 figures, 14 tables, journal

arXiv:2103.00437 [pdf, other]

Seamless Variability Management With the Virtual Platform

Authors: Wardah Mahmood, Daniel Strüber, Thorsten Berger, Ralf Lämmel, Mukelabai Mukelabai

Abstract: Customization is a general trend in software engineering, demanding systems that support variable stakeholder requirements. Two opposing strategies are commonly used to create variants: software clone & own and software configuration with an integrated platform. Organizations often start with the former, which is cheap, agile, and supports quick innovation, but does not scale. The latter scales by establishing an integrated platform that shares software assets between variants, but requires high up-front investments or risky migration processes. So, could we have a method that allows an easy transition or even combine the benefits of both strategies? We propose a method and tool that supports a truly incremental development of variant-rich systems, exploiting a spectrum between both opposing strategies. We design, formalize, and prototype the variability-management framework virtual platform. It bridges clone & own and platform-oriented development. Relying on programming-language-independent conceptual structures representing software assets, it offers operators for engineering and evolving a system, comprising: traditional, asset-oriented operators and novel, feature-oriented operators for incrementally adopting concepts of an integrated platform. The operators record meta-data that is exploited by other operators to support the transition. Among others, they eliminate expensive feature-location effort or the need to trace clones. Our evaluation simulates the evolution of a real-world, clone-based system, measuring its costs and benefits.

Submitted 2 March, 2021; v1 submitted 28 February, 2021; originally announced March 2021.

Comments: 13 pages, 10 figures; accepted for publication at the 43rd International Conference on Software Engineering (ICSE 2021), main technical track

arXiv:2102.06919 [pdf, other]

Asset Management in Machine Learning: A Survey

Authors: Samuel Idowu, Daniel Strüber, Thorsten Berger

Abstract: Machine Learning (ML) techniques are becoming essential components of many software systems today, causing an increasing need to adapt traditional software engineering practices and tools to the development of ML-based software systems. This need is especially pronounced due to the challenges associated with the large-scale development and deployment of ML systems. Among the most commonly reported challenges during the development, production, and operation of ML-based systems are experiment management, dependency management, monitoring, and logging of ML assets. In recent years, we have seen several efforts to address these challenges as witnessed by an increasing number of tools for tracking and managing ML experiments and their assets. To facilitate research and practice on engineering intelligent systems, it is essential to understand the nature of the current tool support for managing ML assets. What kind of support is provided? What asset types are tracked? What operations are offered to users for managing those assets? We discuss and position ML asset management as an important discipline that provides methods and tools for ML assets as structures and the ML development activities as their operations. We present a feature-based survey of 17 tools with ML asset management support identified in a systematic search. We overview these tools' features for managing the different types of assets used for engineering ML-based systems and performing experiments. We found that most of the asset management support depends on traditional version control systems, while only a few tools support an asset granularity level that differentiates between important ML assets, such as datasets and models.

Submitted 17 February, 2021; v1 submitted 13 February, 2021; originally announced February 2021.

Comments: 10 pages, 8 figures. Accepted for publication at ICSE-SEIP 2021: International Conference on Software Engineering, track on Software Engineering in Practice

arXiv:2012.15342 [pdf, other]

ConfigFix: Interactive Configuration Conflict Resolution for the Linux Kernel

Authors: Patrick Franz, Thorsten Berger, Ibrahim Fayaz, Sarah Nadi, Evgeny Groshev

Abstract: Highly configurable systems are highly complex systems, with the Linux kernel arguably being one of the most well-known ones. Since 2007, it has been a frequent target of the research community, conducting empirical studies and building dedicated methods and tools for analyzing, configuring, testing, optimizing, and maintaining the kernel in the light of its vast configuration space. However, despite a large body of work, mainly bug fixes that were the result of such research made it back into the kernel's source tree. Unfortunately, Linux users still struggle with kernel configuration and resolving configuration conflicts, since the kernel largely lacks automated support. Additionally, there are technical and community requirements for supporting automated conflict resolution in the kernel, such as, for example, using a pure C-based solution that uses only compatible third-party libraries (if any). With the aim of contributing back to the Linux community, we present CONFIGFIX, a tooling that we integrated with the kernel configurator, that is purely implemented in C, and that is finally a working solution able to produce fixes for configuration conflicts. In this experience report, we describe our experiences ranging over a decade of building upon the large body of work from research on the Linux kernel configuration mechanisms as well as how we designed and realized CONFIGFIX while adhering to the Linux kernel's community requirements and standards. While CONFIGFIX helps Linux kernel users obtain their desired configuration, the sound semantic abstraction we implement provides the basis for many of the above techniques supporting kernel configuration, helping researchers and kernel developers.

Submitted 30 December, 2020; originally announced December 2020.

arXiv:2012.14405 [pdf, other]

Shape-based Feature Engineering for Solar Flare Prediction

Authors: Varad Deshmukh, Thomas Berger, James Meiss, Elizabeth Bradley

Abstract: Solar flares are caused by magnetic eruptions in active regions (ARs) on the surface of the sun. These events can have significant impacts on human activity, many of which can be mitigated with enough advance warning from good forecasts. To date, machine learning-based flare-prediction methods have employed physics-based attributes of the AR images as features; more recently, there has been some work that uses features deduced automatically by deep learning methods (such as convolutional neural networks). We describe a suite of novel shape-based features extracted from magnetogram images of the Sun using the tools of computational topology and computational geometry. We evaluate these features in the context of a multi-layer perceptron (MLP) neural network and compare their performance against the traditional physics-based attributes. We show that these abstract shape-based features outperform the features chosen by the human experts, and that a combination of the two feature sets improves the forecasting capability even further.

Submitted 28 December, 2020; originally announced December 2020.

Comments: To be published in Proceedings for Innovative Applications of Artificial Intelligence Conference 2021

Journal ref: AAAI Conference on Artificial Intelligence, 35(17), 2021, 15293-15300

arXiv:2012.11976 [pdf, other]

doi 10.1145/3412841.3442046

A Maturity Assessment Framework for Conversational AI Development Platforms

Authors: Johan Aronsson, Philip Lu, Daniel Strüber, Thorsten Berger

Abstract: Conversational Artificial Intelligence (AI) systems have recently sky-rocketed in popularity and are now used in many applications, from car assistants to customer support. The development of conversational AI systems is supported by a large variety of software platforms, all with similar goals, but different focus points and functionalities. A systematic foundation for classifying conversational AI platforms is currently lacking. We propose a framework for assessing the maturity level of conversational AI development platforms. Our framework is based on a systematic literature review, in which we extracted common and distinguishing features of various open-source and commercial (or in-house) platforms. Inspired by language reference frameworks, we identify different maturity levels that a conversational AI development platform may exhibit in understanding and responding to user inputs. Our framework can guide organizations in selecting a conversational AI development platform according to their needs, as well as helping researchers and platform developers improving the maturity of their platforms.

Submitted 22 December, 2020; originally announced December 2020.

Comments: 10 pages, 10 figures. Accepted for publication at SAC 2021: ACM/SIGAPP Symposium On Applied Computing

arXiv:2010.06256 [pdf]

doi 10.1145/3426425.3426942

Behavior Trees in Action: A Study of Robotics Applications

Authors: Razan Ghzouli, Thorsten Berger, Einar Broch Johnsen, Swaib Dragule, Andrzej Wąsowski

Abstract: Autonomous robots combine a variety of skills to form increasingly complex behaviors called missions. While the skills are often programmed at a relatively low level of abstraction, their coordination is architecturally separated and often expressed in higher-level languages or frameworks. Recently, the language of Behavior Trees gained attention among roboticists for this reason. Originally designed for computer games to model autonomous actors, Behavior Trees offer an extensible tree-based representation of missions. However, even though, several implementations of the language are in use, little is known about its usage and scope in the real world. How do behavior trees relate to traditional languages for describing behavior? How are behavior tree concepts used in applications? What are the benefits of using them? We present a study of the key language concepts in Behavior Trees and their use in real-world robotic applications. We identify behavior tree languages and compare their semantics to the most well-known behavior modeling languages: state and activity diagrams. We mine open source repositories for robotics applications that use the language and analyze this usage. We find that Behavior Trees are a pragmatic language, not fully specified, allowing projects to extend it even for just one model. Behavior trees clearly resemble the models-at-runtime paradigm. We contribute a dataset of real-world behavior models, hoping to inspire the community to use and further develop this language, associated tools, and analysis techniques.

Submitted 11 November, 2020; v1 submitted 13 October, 2020; originally announced October 2020.

Comments: 14 pages, 5 figures, 13th ACM SIGPLAN International Conference on Software Language Engineering (SLE) (SLE 2020)

arXiv:2006.10608 [pdf, other]

doi 10.1145/3368089.3409743

Robotics Software Engineering: A Perspective from the Service Robotics Domain

Authors: Sergio García, Daniel Strüber, Davide Brugali, Thorsten Berger, Patrizio Pelliccione

Abstract: Robots that support humans by performing useful tasks (a.k.a., service robots) are booming worldwide. In contrast to industrial robots, the development of service robots comes with severe software engineering challenges, since they require high levels of robustness and autonomy to operate in highly heterogeneous environments. As a domain with critical safety implications, service robotics faces a need for sound software development practices. In this paper, we present the first large-scale empirical study to assess the state of the art and practice of robotics software engineering. We conducted 18 semi-structured interviews with industrial practitioners working in 15 companies from 9 different countries and a survey with 156 respondents (from 26 countries) from the robotics domain. Our results provide a comprehensive picture of (i) the practices applied by robotics industrial and academic practitioners, including processes, paradigms, languages, tools, frameworks, and reuse practices, (ii) the distinguishing characteristics of robotics software engineering, and (iii) recurrent challenges usually faced, together with adopted solutions. The paper concludes by discussing observations, derived hypotheses, and proposed actions for researchers and practitioners.

Submitted 8 September, 2020; v1 submitted 18 June, 2020; originally announced June 2020.

Comments: 11 pages + 1 page for references, 3 figures, 3 tables, in proceedings of ESEC/FSE 2020

arXiv:2003.11985 [pdf]

Is the Juice Worth the Squeeze? Machine Learning (ML) In and For Agent-Based Modelling (ABM)

Authors: Johannes Dahlke, Kristina Bogner, Matthias Mueller, Thomas Berger, Andreas Pyka, Bernd Ebersberger

Abstract: In recent years, many scholars praised the seemingly endless possibilities of using machine learning (ML) techniques in and for agent-based simulation models (ABM). To get a more comprehensive understanding of these possibilities, we conduct a systematic literature review (SLR) and classify the literature on the application of ML in and for ABM according to a theoretically derived classification scheme. We do so to investigate how exactly machine learning has been utilized in and for agent-based models so far and to critically discuss the combination of these two promising methods. We find that, indeed, there is a broad range of possible applications of ML to support and complement ABMs in many different ways, already applied in many different disciplines. We see that, so far, ML is mainly used in ABM for two broad cases: First, the modelling of adaptive agents equipped with experience learning and, second, the analysis of outcomes produced by a given ABM. While these are the most frequent, there also exist a variety of many more interesting applications. This being the case, researchers should dive deeper into the analysis of when and how which kinds of ML techniques can support ABM, e.g. by conducting a more in-depth analysis and comparison of different use cases. Nonetheless, as the application of ML in and for ABM comes at certain costs, researchers should not use ML for ABMs just for the sake of doing it.

Submitted 26 March, 2020; originally announced March 2020.

Comments: 25 pages, 3 figures, 2 tables, discussion paper

arXiv:1901.02077 [pdf, other]

Specification Patterns for Robotic Missions

Authors: Claudio Menghi, Christos Tsigkanos, Patrizio Pelliccione, Carlo Ghezzi, Thorsten Berger

Abstract: Mobile and general-purpose robots increasingly support our everyday life, requiring dependable robotics control software. Creating such software mainly amounts to implementing their complex behaviors known as missions. Recognizing the need, a large number of domain-specific specification languages has been proposed. These, in addition to traditional logical languages, allow the use of formally specified missions for synthesis, verification, simulation, or guiding the implementation. For instance, the logical language LTL is commonly used by experts to specify missions, as an input for planners, which synthesize the behavior a robot should have. Unfortunately, domain-specific languages are usually tied to specific robot models, while logical languages such as LTL are difficult to use by non-experts. We present a catalog of 22 mission specification patterns for mobile robots, together with tooling for instantiating, composing, and compiling the patterns to create mission specifications. The patterns provide solutions for recurrent specification problems, each of which detailing the usage intent, known uses, relationships to other patterns, and---most importantly---a template mission specification in temporal logic. Our tooling produces specifications expressed in the LTL and CTL temporal logics to be used by planners, simulators, or model checkers.
The patterns originate from 245 realistic textual mission requirements extracted from the robotics literature, and they are evaluated upon a total of 441 real-world mission requirements and 1251 mission specifications. Five of these reflect scenarios we defined with two well-known industrial partners developing human-size robots. We validated our patterns' correctness with simulators and two real robots.

Submitted 7 January, 2019; originally announced January 2019.

arXiv:1704.07882 [pdf, ps, other]

Generalized subspace subcodes with application in cryptology

Authors: Thierry P. Berger, Cheikh Thiécoumba Gueye, Jean Belo Klamti

Abstract: Most of the codes that have an algebraic decoding algorithm are derived from the Reed Solomon codes. They are obtained by taking equivalent codes, for example the generalized Reed Solomon codes, or by using the so-called subfield subcode method, which leads to Alternant codes and Goppa codes over the underlying prime field, or over some intermediate subfield. The main advantages of these constructions is to preserve both the minimum distance and the decoding algorithm of the underlying Reed Solomon code. In this paper, we propose a generalization of the subfield subcode construction by introducing the notion of subspace subcodes and a generalization of the equivalence of codes which leads to the notion of generalized subspace subcodes. When the dimension of the selected subspaces is equal to one, we show that our approach gives exactly the family of the codes obtained by equivalence and subfield subcode technique. However, our approach highlights the links between the subfield subcode of a code defined over an extension field and the operation of puncturing the $q$-ary image of this code. When the dimension of the subspaces is greater than one, we obtain codes whose alphabet is no longer a finite field, but a set of r-uples. We explain why these codes are practically as efficient for applications as the codes defined on an extension of degree r. In addition, they make it possible to obtain decodable codes over a large alphabet having parameters previously inaccessible.
As an application, we give some examples that can be used in public key cryptosystems such as McEliece.

Submitted 25 April, 2017; originally announced April 2017.

arXiv:1402.0972

Construction of dyadic MDS matrices for cryptographic applications

Authors: Thierry P. Berger

Abstract: Many recent block ciphers use Maximum Distance Separable (MDS) matrices in their diffusion layer. The main objective of this operation is to spread as much as possible the differences between the outputs of nonlinear Sboxes. So they generally act at nibble or at byte level. The MDS matrices are associated to MDS codes of ratio 1/2. The most famous example is the MixColumns operation of the AES block cipher. In this example, the MDS matrix was carefully chosen to obtain compact and efficient implementations. However, this MDS matrix is dedicated to 8-bit words, and is not always adapted to lightweight applications. Recently, several studies have been devoted to the construction of recursive diffusion layers. Such a method allows to apply an MDS matrix using an iterative process which looks like a Feistel network with linear functions instead of nonlinear. Our approach is quite different. We present a generic construction of classical MDS matrices that are not recursively computed, but that are strong symmetric in order to either accelerate their evaluation with a minimal number of look-up tables, or to perform this evaluation with a minimal number of gates in a circuit. We call this particular kind of matrices "dyadic matrices", since they are related to dyadic codes. We study some basic properties of such matrices. We introduce a generic construction of involutive dyadic MDS matrices from Reed Solomon codes. Finally, we discuss the implementation aspects of these dyadic MDS matrices in order to build efficient block ciphers.

Submitted 5 March, 2014; v1 submitted 5 February, 2014; originally announced February 2014.

Comments: This paper has been withdrawn. Indeed, similar results to those presented in this paper have been obtained in [1]. [1] A. M. Youssef, S. Mister, and S. E. Tavares, "On the design of linear transformations for substitution permutation encryption networks," SAC'97, 1997, pp. 40--48

arXiv:1304.3780 [pdf, other]

doi 10.23943/princeton/9780691164038.003.0005

Solving the Tower of Hanoi with Random Moves

Authors: Max A. Alekseyev, Toby Berger

Abstract: We prove the exact formulae for the expected number of moves to solve several variants of the Tower of Hanoi puzzle with 3 pegs and n disks, when each move is chosen uniformly randomly from the set of all valid moves. We further present an alternative proof for one of the formulae that couples a theorem about expected commute times of random walks on graphs with the delta-to-wye transformation used in the analysis of three-phase AC systems for electrical power distribution.

Submitted 18 September, 2014; v1 submitted 13 April, 2013; originally announced April 2013.

Journal ref: In: The Mathematics of Various Entertaining Subjects: Research in Recreational Math, Princeton University Press, 2016, pp. 65-79. ISBN 978-0-691-16403-8

arXiv:1004.4806 [pdf, ps, other]

Revisiting LFSMs

Authors: François Arnault, Thierry Berger, Marine Minier, Benjamin Pousse

Abstract: Linear Finite State Machines (LFSMs) are particular primitives widely used in information theory, coding theory and cryptography. Among those linear automata, a particular case of study is Linear Feedback Shift Registers (LFSRs) used in many cryptographic applications such as design of stream ciphers or pseudo-random generation. LFSRs could be seen as particular LFSMs without inputs. In this paper, we first recall the description of LFSMs using traditional matrices representation. Then, we introduce a new matrices representation with polynomial fractional coefficients. This new representation leads to sparse representations and implementations. As direct applications, we focus our work on the Windmill LFSRs case, used for example in the E0 stream cipher and on other general applications that use this new representation. In a second part, a new design criterion called diffusion delay for LFSRs is introduced and well compared with existing related notions. This criterion represents the diffusion capacity of an LFSR. Thus, using the matrices representation, we present a new algorithm to randomly pick LFSRs with good properties (including the new one) and sparse descriptions dedicated to hardware and software designs. We present some examples of LFSRs generated using our algorithm to show the relevance of our approach.

Submitted 25 March, 2011; v1 submitted 27 April, 2010; originally announced April 2010.

Comments: Submitted to IEEE-IT

arXiv:cs/0604091 [pdf, ps, other]

Robust Distributed Source Coding

Authors: Jun Chen, Toby Berger

Abstract: We consider a distributed source coding system in which several observations are communicated to the decoder using limited transmission rate. The observations must be separately coded. We introduce a robust distributed coding scheme which flexibly trades off between system robustness and compression efficiency. The optimality of this coding scheme is proved for various special cases.

Submitted 23 April, 2006; originally announced April 2006.

Comments: 40 pages, submitted to the IEEE Transactions on Information Theory

arXiv:cs/0604077 [pdf, ps, other]

doi 10.1109/TIT.2008.917687

Successive Wyner-Ziv Coding Scheme and its Application to the Quadratic Gaussian CEO Problem

Authors: Jun Chen, Toby Berger

Abstract: We introduce a distributed source coding scheme called successive Wyner-Ziv coding. We show that any point in the rate region of the quadratic Gaussian CEO problem can be achieved via the successive Wyner-Ziv coding. The concept of successive refinement in the single source coding is generalized to the distributed source coding scenario, which we refer to as distributed successive refinement. For the quadratic Gaussian CEO problem, we establish a necessary and sufficient condition for distributed successive refinement, where the successive Wyner-Ziv coding scheme plays an important role.

Submitted 19 April, 2006; originally announced April 2006.

Comments: 28 pages, submitted to the IEEE Transactions on Information Theory

arXiv:cs/0504003 [pdf, ps, other]

doi 10.1109/TIT.2006.885498

Multiple Description Quantization via Gram-Schmidt Orthogonalization

Authors: Jun Chen, Chao Tian, Toby Berger, Sheila Hemami

Abstract: The multiple description (MD) problem has received considerable attention as a model of information transmission over unreliable channels. A general framework for designing efficient multiple description quantization schemes is proposed in this paper. We provide a systematic treatment of the El Gamal-Cover (EGC) achievable MD rate-distortion region, and show that any point in the EGC region can be achieved via a successive quantization scheme along with quantization splitting. For the quadratic Gaussian case, the proposed scheme has an intrinsic connection with the Gram-Schmidt orthogonalization, which implies that the whole Gaussian MD rate-distortion region is achievable with a sequential dithered lattice-based quantization scheme as the dimension of the (optimal) lattice quantizers becomes large. Moreover, this scheme is shown to be universal for all i.i.d. smooth sources with performance no worse than that for an i.i.d. Gaussian source with the same variance and asymptotically optimal at high resolution. A class of low-complexity MD scalar quantizers in the proposed general framework also is constructed and is illustrated geometrically; the performance is analyzed in the high resolution regime, which exhibits a noticeable improvement over the existing MD scalar quantization schemes.

Submitted 1 April, 2005; originally announced April 2005.

Comments: 48 pages; submitted to IEEE Transactions on Information Theory

Showing 1–28 of 28 results for author: Berger, T