-
Evaluating Collaborative Autonomy in Opposed Environments using Maritime Capture-the-Flag Competitions
Authors:
Jordan Beason,
Michael Novitzky,
John Kliem,
Tyler Errico,
Zachary Serlin,
Kevin Becker,
Tyler Paine,
Michael Benjamin,
Prithviraj Dasgupta,
Peter Crowley,
Charles O'Donnell,
John James
Abstract:
The objective of this work is to evaluate multi-agent artificial intelligence methods when deployed on teams of unmanned surface vehicles (USV) in an adversarial environment. Autonomous agents were evaluated in real-world scenarios using the Aquaticus test-bed, which is a Capture-the-Flag (CTF) style competition involving teams of USV systems. Cooperative teaming algorithms of various foundations in behavior-based optimization and deep reinforcement learning (RL) were deployed on these USV systems in two versus two teams and tested against each other during a competition period in the fall of 2023. Deep reinforcement learning applied to USV agents was achieved via the Pyquaticus test bed, a lightweight gymnasium environment that allows simulated CTF training in a low-level environment. The results of the experiment demonstrate that rule-based cooperation for behavior-based agents outperformed those trained in deep reinforcement learning paradigms as implemented in these competitions. Further integration of the Pyquaticus gymnasium environment for RL with MOOS-IvP in terms of configuration and control schema will allow for more competitive CTF games in future studies. As the development of experimental deep RL methods continues, the authors expect that the competitive gap between behavior-based autonomy and deep RL will be reduced. As such, this report outlines the overall competition, methods, and results with an emphasis on future works such as reward shaping and sim-to-real methodologies and extending rule-based cooperation among agents to react to safety and security events in accordance with human experts' intent/rules for executing safety and security processes.
Submitted 25 April, 2024;
originally announced April 2024.
-
The METRIC-framework for assessing data quality for trustworthy AI in medicine: a systematic review
Authors:
Daniel Schwabe,
Katinka Becker,
Martin Seyferth,
Andreas Klaß,
Tobias Schäffter
Abstract:
The adoption of machine learning (ML) and, more specifically, deep learning (DL) applications into all major areas of our lives is underway. The development of trustworthy AI is especially important in medicine due to the large implications for patients' lives. While trustworthiness concerns various aspects including ethical, technical and privacy requirements, we focus on the importance of data quality (training/test) in DL. Since data quality dictates the behaviour of ML products, evaluating data quality will play a key part in the regulatory approval of medical AI products. We perform a systematic review following PRISMA guidelines using the databases PubMed and ACM Digital Library. We identify 2362 studies, out of which 62 records fulfil our eligibility criteria. From this literature, we synthesise the existing knowledge on data quality frameworks and combine it with the perspective of ML applications in medicine. As a result, we propose the METRIC-framework, a specialised data quality framework for medical training data comprising 15 awareness dimensions, along which developers of medical ML applications should investigate a dataset. This knowledge helps to reduce biases as a major source of unfairness, increase robustness, facilitate interpretability and thus lays the foundation for trustworthy AI in medicine. Incorporating such systematic assessment of medical datasets into regulatory approval processes has the potential to accelerate the approval of ML products and builds the basis for new standards.
Submitted 21 February, 2024;
originally announced February 2024.
-
Observation of high-energy neutrinos from the Galactic plane
Authors:
R. Abbasi,
M. Ackermann,
J. Adams,
J. A. Aguilar,
M. Ahlers,
M. Ahrens,
J. M. Alameddine,
A. A. Alves Jr.,
N. M. Amin,
K. Andeen,
T. Anderson,
G. Anton,
C. Argüelles,
Y. Ashida,
S. Athanasiadou,
S. Axani,
X. Bai,
A. Balagopal V.,
S. W. Barwick,
V. Basu,
S. Baur,
R. Bay,
J. J. Beatty,
K. -H. Becker,
J. Becker Tjus
, et al. (364 additional authors not shown)
Abstract:
The origin of high-energy cosmic rays, atomic nuclei that continuously impact Earth's atmosphere, has been a mystery for over a century. Due to deflection in interstellar magnetic fields, cosmic rays from the Milky Way arrive at Earth from random directions. However, near their sources and during propagation, cosmic rays interact with matter and produce high-energy neutrinos. We search for neutrino emission using machine learning techniques applied to ten years of data from the IceCube Neutrino Observatory. We identify neutrino emission from the Galactic plane at the 4.5$σ$ level of significance, by comparing diffuse emission models to a background-only hypothesis. The signal is consistent with modeled diffuse emission from the Galactic plane, but could also arise from a population of unresolved point sources.
Submitted 10 July, 2023;
originally announced July 2023.
-
3D Coverage Path Planning for Efficient Construction Progress Monitoring
Authors:
Katrin Becker,
Martin Oehler,
Oskar von Stryk
Abstract:
On construction sites, progress must be monitored continuously to ensure that the current state corresponds to the planned state in order to increase efficiency, safety and detect construction defects at an early stage. Autonomous mobile robots can document the state of construction with high data quality and consistency. However, finding a path that fully covers the construction site is a challenging task as it can be large, slowly changing over time, and contain dynamic objects. Existing approaches are either exploration approaches that require a long time to explore the entire building, object scanning approaches that are not suitable for large and complex buildings, or planning approaches that only consider 2D coverage. In this paper, we present a novel approach for planning an efficient 3D path for progress monitoring on large construction sites with multiple levels. By making use of an existing 3D model we ensure that all surfaces of the building are covered by the sensor payload such as a 360-degree camera or a lidar. This enables the consistent and reliable monitoring of construction site progress with an autonomous ground robot. We demonstrate the effectiveness of the proposed planner on an artificial and a real building model, showing that much shorter paths and better coverage are achieved than with a traditional exploration planner.
Submitted 2 February, 2023;
originally announced February 2023.
-
Graph Neural Networks for Low-Energy Event Classification & Reconstruction in IceCube
Authors:
R. Abbasi,
M. Ackermann,
J. Adams,
N. Aggarwal,
J. A. Aguilar,
M. Ahlers,
M. Ahrens,
J. M. Alameddine,
A. A. Alves Jr.,
N. M. Amin,
K. Andeen,
T. Anderson,
G. Anton,
C. Argüelles,
Y. Ashida,
S. Athanasiadou,
S. Axani,
X. Bai,
A. Balagopal V.,
M. Baricevic,
S. W. Barwick,
V. Basu,
R. Bay,
J. J. Beatty,
K. -H. Becker
, et al. (359 additional authors not shown)
Abstract:
IceCube, a cubic-kilometer array of optical sensors built to detect atmospheric and astrophysical neutrinos between 1 GeV and 1 PeV, is deployed 1.45 km to 2.45 km below the surface of the ice sheet at the South Pole. The classification and reconstruction of events from the in-ice detectors play a central role in the analysis of data from IceCube. Reconstructing and classifying events is a challenge due to the irregular detector geometry, inhomogeneous scattering and absorption of light in the ice and, below 100 GeV, the relatively low number of signal photons produced per event. To address this challenge, it is possible to represent IceCube events as point cloud graphs and use a Graph Neural Network (GNN) as the classification and reconstruction method. The GNN is capable of distinguishing neutrino events from cosmic-ray backgrounds, classifying different neutrino event types, and reconstructing the deposited energy, direction and interaction vertex. Based on simulation, we provide a comparison in the 1-100 GeV energy range to the current state-of-the-art maximum likelihood techniques used in current IceCube analyses, including the effects of known systematic uncertainties. For neutrino event classification, the GNN increases the signal efficiency by 18% at a fixed false positive rate (FPR), compared to current IceCube methods. Alternatively, the GNN offers a reduction of the FPR by over a factor 8 (to below half a percent) at a fixed signal efficiency. For the reconstruction of energy, direction, and interaction vertex, the resolution improves by an average of 13%-20% compared to current maximum likelihood techniques in the energy range of 1-30 GeV. The GNN, when run on a GPU, is capable of processing IceCube events at a rate nearly double of the median IceCube trigger rate of 2.7 kHz, which opens the possibility of using low energy neutrinos in online searches for transient events.
Submitted 11 October, 2022; v1 submitted 7 September, 2022;
originally announced September 2022.
-
An additive framework for kirigami design
Authors:
Levi H. Dudte,
Gary P. T. Choi,
Kaitlyn P. Becker,
L. Mahadevan
Abstract:
We present an additive approach for the inverse design of kirigami-based mechanical metamaterials by focusing on the empty (negative) spaces instead of the solid tiles. By considering each negative space as a four-bar linkage, we identify a simple recursive relationship between adjacent linkages, yielding an efficient method for creating kirigami patterns. This allows us to solve the kirigami design problem using elementary linear algebra, with compatibility, reconfigurability and rigid-deployability encoded into an iterative procedure involving simple matrix multiplications. The resulting linear design strategy circumvents the solution of a non-convex global optimization problem and allows us to control the degrees of freedom in the deployment angle field, linkage offsets and boundary conditions. We demonstrate this by creating a large variety of rigid-deployable, compact, reconfigurable kirigami patterns. We then realize our kirigami designs physically using two simple but effective fabrication strategies with very different materials. Altogether, our additive approaches present routes for efficient mechanical metamaterial design and fabrication based on ori/kirigami art forms.
Submitted 25 May, 2023; v1 submitted 5 July, 2022;
originally announced July 2022.
-
IoT-Scan: Network Reconnaissance for the Internet of Things
Authors:
Stefan Gvozdenovic,
Johannes K Becker,
John Mikulskis,
David Starobinski
Abstract:
Network reconnaissance is a core networking and security procedure aimed at discovering devices and their properties. For IP-based networks, several network reconnaissance tools are available, such as Nmap. For the Internet of Things (IoT), there is currently no similar tool capable of discovering devices across multiple protocols. In this paper, we present IoT-Scan, a universal IoT network reconnaissance tool.
IoT-Scan is based on software defined radio (SDR) technology, which allows for a flexible software-based implementation of radio protocols. We present a series of passive, active, multi-channel, and multi-protocol scanning algorithms to speed up the discovery of devices with IoT-Scan. We benchmark the passive scanning algorithms against a theoretical traffic model based on the non-uniform coupon collector problem. We implement the scanning algorithms and compare their performance for four popular IoT protocols: Zigbee, Bluetooth LE, Z-Wave, and LoRa. Through extensive experiments with dozens of IoT devices, we demonstrate that our implementation experiences minimal packet losses and achieves performance near the theoretical benchmark. Using multi-protocol scanning, we further demonstrate a reduction of 70% in the discovery times of Bluetooth and Zigbee devices in the 2.4 GHz band and of LoRa and Z-Wave devices in the 900 MHz band, compared to sequential passive scanning. We make our implementation and data available to the research community to allow independent replication of our results and facilitate further development of the tool.
Submitted 5 April, 2022;
originally announced April 2022.
-
Efficient Floating Point Arithmetic for Quantum Computers
Authors:
Raphael Seidel,
Nikolay Tcholtchev,
Sebastian Bock,
Colin Kai-Uwe Becker,
Manfred Hauswirth
Abstract:
One of the major promises of quantum computing is the realization of SIMD (single instruction - multiple data) operations using the phenomenon of superposition. Since the dimension of the state space grows exponentially with the number of qubits, we can easily reach situations where we pay less than a single quantum gate per data point for data-processing instructions which would be rather expensive in classical computing. Formulating such instructions in terms of quantum gates, however, still remains a challenging task. Laying out the foundational functions for more advanced data-processing is therefore a subject of paramount importance for advancing the realm of quantum computing. In this paper, we introduce the formalism of encoding so-called semi-boolean polynomials. As it turns out, arithmetic $\mathbb{Z}/2^n\mathbb{Z}$ ring operations can be formulated as semi-boolean polynomial evaluations, which allows convenient generation of unsigned integer arithmetic quantum circuits. For arithmetic evaluations, the resulting algorithm has been known as Fourier-arithmetic. We extend this type of algorithm with additional features, such as ancilla-free in-place multiplication and integer coefficient polynomial evaluation. Furthermore, we introduce a tailor-made method for encoding signed integers succeeded by an encoding for arbitrary floating-point numbers. This representation of floating-point numbers and their processing can be applied to any quantum algorithm that performs unsigned modular integer arithmetic. We discuss some further performance enhancements of the semi-boolean polynomial encoder and finally supply a complexity estimation. The application of our methods to a 32-bit unsigned integer multiplication demonstrated a 90% circuit depth reduction compared to carry-ripple approaches.
Submitted 20 December, 2021;
originally announced December 2021.
-
Automatic Generation of Grover Quantum Oracles for Arbitrary Data Structures
Authors:
Raphael Seidel,
Colin Kai-Uwe Becker,
Sebastian Bock,
Nikolay Tcholtchev,
Ilie-Daniel Gheorge-Pop,
Manfred Hauswirth
Abstract:
The steadily growing research interest in quantum computing - together with the accompanying technological advances in the realization of quantum hardware - fuels the development of meaningful real-world applications, as well as implementations for well-known quantum algorithms. One of the most prominent examples till today is Grover's algorithm, which can be used for efficient search in unstructured databases. Quantum oracles that are frequently masked as black boxes play an important role in Grover's algorithm. Hence, the automatic generation of oracles is of paramount importance. Moreover, the automatic generation of the corresponding circuits for a Grover quantum oracle is deeply linked to the synthesis of reversible quantum logic, which - despite numerous advances in the field - still remains a challenge till today in terms of synthesizing efficient and scalable circuits for complex boolean functions.
In this paper, we present a flexible method for automatically encoding unstructured databases into oracles, which can then be efficiently searched with Grover's algorithm. Furthermore, we develop a tailor-made method for quantum logic synthesis, which vastly improves circuit complexity over other current approaches. Finally, we present another logic synthesis method that considers the requirements of scaling onto real world backends. We compare our method with other approaches through evaluating the oracle generation for random databases and analyzing the resulting circuit complexities using various metrics.
Submitted 14 October, 2021;
originally announced October 2021.
-
Analysis of the influence of political polarization in the vaccination stance: the Brazilian COVID-19 scenario
Authors:
Régis Ebeling,
Carlos Abel Córdova Sáenz,
Jeferson Nobre,
Karin Becker
Abstract:
The outbreak of COVID-19 had a huge global impact, and non-scientific beliefs and political polarization have significantly influenced the population's behavior. In this context, COVID vaccines were made available in an unprecedented time, but a high level of hesitance has been observed that can undermine community immunization. Traditionally, anti-vaccination attitudes are more related to conspiratorial thinking rather than political bias. In Brazil, a country with an exemplar tradition in large-scale vaccination programs, all COVID-related topics have also been discussed under a strong political bias. In this paper, we use a multidimensional analysis framework to understand if anti/pro-vaccination stances expressed by Brazilians in social media are influenced by political polarization. The analysis framework incorporates techniques to automatically infer from users their political orientation, topic modeling to discover their concerns, network analysis to characterize their social behavior, and the characterization of information sources and external influence. Our main findings confirm that anti/pro stances are biased by political polarization, right and left, respectively. While a significant proportion of pro-vaxxers display haste for an immunization program and criticize the government's actions, the anti-vaxxers distrust a vaccine developed in a record time. Anti-vaccination stance is also related to prejudice against China (anti-sinovaxxers), revealing conspiratorial theories related to communism. All groups display an "echo chamber" behavior, revealing they are not open to distinct views.
Submitted 7 October, 2021;
originally announced October 2021.
-
A Convolutional Neural Network based Cascade Reconstruction for the IceCube Neutrino Observatory
Authors:
R. Abbasi,
M. Ackermann,
J. Adams,
J. A. Aguilar,
M. Ahlers,
M. Ahrens,
C. Alispach,
A. A. Alves Jr.,
N. M. Amin,
R. An,
K. Andeen,
T. Anderson,
I. Ansseau,
G. Anton,
C. Argüelles,
S. Axani,
X. Bai,
A. Balagopal V.,
A. Barbano,
S. W. Barwick,
B. Bastian,
V. Basu,
V. Baum,
S. Baur,
R. Bay
, et al. (343 additional authors not shown)
Abstract:
Continued improvements on existing reconstruction methods are vital to the success of high-energy physics experiments, such as the IceCube Neutrino Observatory. In IceCube, further challenges arise as the detector is situated at the geographic South Pole where computational resources are limited. However, to perform real-time analyses and to issue alerts to telescopes around the world, powerful and fast reconstruction methods are desired. Deep neural networks can be extremely powerful, and their usage is computationally inexpensive once the networks are trained. These characteristics make a deep learning-based approach an excellent candidate for the application in IceCube. A reconstruction method based on convolutional architectures and hexagonally shaped kernels is presented. The presented method is robust towards systematic uncertainties in the simulation and has been tested on experimental data. In comparison to standard reconstruction methods in IceCube, it can improve upon the reconstruction accuracy, while reducing the time necessary to run the reconstruction by two to three orders of magnitude.
Submitted 26 July, 2021; v1 submitted 27 January, 2021;
originally announced January 2021.
-
How Many Annotators Do We Need? -- A Study on the Influence of Inter-Observer Variability on the Reliability of Automatic Mitotic Figure Assessment
Authors:
Frauke Wilm,
Christof A. Bertram,
Christian Marzahl,
Alexander Bartel,
Taryn A. Donovan,
Charles-Antoine Assenmacher,
Kathrin Becker,
Mark Bennett,
Sarah Corner,
Brieuc Cossic,
Daniela Denk,
Martina Dettwiler,
Beatriz Garcia Gonzalez,
Corinne Gurtner,
Annika Lehmbecker,
Sophie Merz,
Stephanie Plog,
Anja Schmidt,
Rebecca C. Smedley,
Marco Tecilla,
Tuddow Thaiwong,
Katharina Breininger,
Matti Kiupel,
Andreas Maier,
Robert Klopfleisch
, et al. (1 additional authors not shown)
Abstract:
Density of mitotic figures in histologic sections is a prognostically relevant characteristic for many tumours. Due to high inter-pathologist variability, deep learning-based algorithms are a promising solution to improve tumour prognostication. Pathologists are the gold standard for database development, however, labelling errors may hamper development of accurate algorithms. In the present work we evaluated the benefit of multi-expert consensus (n = 3, 5, 7, 9, 11) on algorithmic performance. While training with individual databases resulted in highly variable F$_1$ scores, performance was notably increased and more consistent when using the consensus of three annotators. Adding more annotators only resulted in minor improvements. We conclude that databases by few pathologists and high label accuracy may be the best compromise between high algorithmic performance and time investment.
Submitted 8 January, 2021; v1 submitted 4 December, 2020;
originally announced December 2020.
-
Flying Unicorn: Developing a Game for a Quantum Computer
Authors:
Kory Becker
Abstract:
What is it like to create a game for a quantum computer? With its ability to perform calculations and processing in a distinctly different way than classical computers, quantum computing has the potential for becoming the next revolution in information technology. Flying Unicorn is a game developed for a quantum computer. It is designed to explore the properties of superposition and uncertainty. In this paper, we explore the development of the game, using Python Qiskit. We detail the usage of qubits and an implementation of Grover's search. Finally, we compare and contrast a classical implementation of the game against the quantum computing design, including execution and performance on a physical quantum computer at IBMQ.
Submitted 17 October, 2019;
originally announced October 2019.
-
UFRGS Participation on the WMT Biomedical Translation Shared Task
Authors:
Felipe Soares,
Karin Becker
Abstract:
This paper describes the machine translation systems developed by the Universidade Federal do Rio Grande do Sul (UFRGS) team for the biomedical translation shared task. Our systems are based on statistical machine translation and neural machine translation, using the Moses and OpenNMT toolkits, respectively. We participated in four translation directions for the English/Spanish and English/Portuguese language pairs. To create our training data, we concatenated several parallel corpora, both from in-domain and out-of-domain sources, as well as terminological resources from UMLS. Our systems achieved the best BLEU scores according to the official shared task evaluation.
Submitted 6 May, 2019;
originally announced May 2019.
-
A Large Parallel Corpus of Full-Text Scientific Articles
Authors:
Felipe Soares,
Viviane Pereira Moreira,
Karin Becker
Abstract:
The Scielo database is an important source of scientific information in Latin America, containing articles from several research domains. A striking characteristic of Scielo is that many of its full-text contents are presented in more than one language, thus being a potential source of parallel corpora. In this article, we present the development of a parallel corpus from Scielo in three languages: English, Portuguese, and Spanish. Sentences were automatically aligned using the Hunalign algorithm for all language pairs, and for a subset of trilingual articles also. We demonstrate the capabilities of our corpus by training a Statistical Machine Translation system (Moses) for each language pair, which outperformed related works on scientific articles. Sentence alignment was also manually evaluated, presenting an average of 98.8% correctly aligned sentences across all languages. Our parallel corpus is freely available in the TMX format, with complementary information regarding article metadata.
Submitted 6 May, 2019;
originally announced May 2019.
-
AI Programmer: Autonomously Creating Software Programs Using Genetic Algorithms
Authors:
Kory Becker,
Justin Gottschlich
Abstract:
In this paper, we present the first-of-its-kind machine learning (ML) system, called AI Programmer, that can automatically generate full software programs requiring only minimal human guidance. At its core, AI Programmer uses genetic algorithms (GA) coupled with a tightly constrained programming language that minimizes the overhead of its ML search space. Part of AI Programmer's novelty stems from (i) its unique system design, including an embedded, hand-crafted interpreter for efficiency and security and (ii) its augmentation of GAs to include instruction-gene randomization bindings and programming language-specific genome construction and elimination techniques. We provide a detailed examination of AI Programmer's system design, several examples detailing how the system works, and experimental data demonstrating its software generation capabilities and performance using only mainstream CPUs.
Submitted 17 September, 2017;
originally announced September 2017.
-
Deployment Calculation and Analysis for a Fail-Operational Automotive Platform
Authors:
Klaus Becker,
Bernhard Schatz,
Christian Buckl,
Michael Armbruster
Abstract:
In domains like automotive, safety-critical features are increasingly realized by software. Some features might even require fail-operational behavior, so that they must be provided even in the presence of random hardware failures. A new fault-tolerant SW/HW architecture for electric vehicles provides inherent safety capabilities that enable fail-operational features. In this paper we introduce a formal model of this architecture and an approach to calculate valid deployments of mixed-critical software-components to the execution nodes, while ensuring fail-operational behavior of certain components. Calculated redeployments cover the cases in which faulty execution nodes have to be isolated. This allows a formal analysis of which set of features can be provided as the available execution resources decrease.
Submitted 7 May, 2014; v1 submitted 30 April, 2014;
originally announced April 2014.
-
The IceProd Framework: Distributed Data Processing for the IceCube Neutrino Observatory
Authors:
M. G. Aartsen,
R. Abbasi,
M. Ackermann,
J. Adams,
J. A. Aguilar,
M. Ahlers,
D. Altmann,
C. Arguelles,
J. Auffenberg,
X. Bai,
M. Baker,
S. W. Barwick,
V. Baum,
R. Bay,
J. J. Beatty,
J. Becker Tjus,
K. -H. Becker,
S. BenZvi,
P. Berghaus,
D. Berley,
E. Bernardini,
A. Bernhard,
D. Z. Besson,
G. Binder,
D. Bindig
, et al. (262 additional authors not shown)
Abstract:
IceCube is a one-gigaton instrument located at the geographic South Pole, designed to detect cosmic neutrinos, identify the particle nature of dark matter, and study high-energy neutrinos themselves. Simulation of the IceCube detector and processing of data require a significant amount of computational resources. IceProd is a distributed management system based on Python, XML-RPC and GridFTP. It is driven by a central database in order to coordinate and administer production of simulations and processing of data produced by the IceCube detector. IceProd runs as a separate layer on top of other middleware and can take advantage of a variety of computing resources, including grids and batch systems such as CREAM, Condor, and PBS. This is accomplished by a set of dedicated daemons that process job submission in a coordinated fashion through the use of middleware plugins that serve to abstract the details of job submission and job management from the framework.
Submitted 22 August, 2014; v1 submitted 22 November, 2013;
originally announced November 2013.