-
Learning Decentralized Multi-Biped Control for Payload Transport
Authors:
Bikram Pandit,
Ashutosh Gupta,
Mohitvishnu S. Gadde,
Addison Johnson,
Aayam Kumar Shrestha,
Helei Duan,
Jeremy Dao,
Alan Fern
Abstract:
Payload transport over flat terrain via multi-wheel robot carriers is well-understood, highly effective, and configurable. In this paper, our goal is to provide similar effectiveness and configurability for transport over rough terrain that is more suitable for legs than for wheels. For this purpose, we consider multi-biped robot carriers, where wheels are replaced by multiple bipedal robots attached to the carrier. Our main contribution is to design a decentralized controller for such systems that can be effectively applied to varying numbers and configurations of rigidly attached bipedal robots without retraining. We present a reinforcement learning approach for training the controller in simulation that supports transfer to the real world. Our experiments in simulation provide quantitative metrics showing the effectiveness of the approach over a wide variety of simulated transport scenarios. In addition, we demonstrate the controller in the real world for systems composed of two and three Cassie robots. To our knowledge, this is the first example of a scalable multi-biped payload transport system.
Submitted 25 June, 2024;
originally announced June 2024.
-
The Penalized Inverse Probability Measure for Conformal Classification
Authors:
Paul Melki,
Lionel Bombrun,
Boubacar Diallo,
Jérôme Dias,
Jean-Pierre da Costa
Abstract:
The deployment of safe and trustworthy machine learning systems, and particularly complex black box neural networks, in real-world applications requires reliable and certified guarantees on their performance. The conformal prediction framework offers such formal guarantees by transforming any point predictor into a set predictor with valid, finite-sample guarantees on the coverage of the true label at a chosen level of confidence. Central to this methodology is the notion of the nonconformity score function that assigns to each example a measure of "strangeness" in comparison with the previously seen observations. While the coverage guarantees are maintained regardless of the nonconformity measure, the point predictor and the dataset, previous research has shown that the performance of a conformal model, as measured by its efficiency (the average size of the predicted sets) and its informativeness (the proportion of prediction sets that are singletons), is influenced by the choice of the nonconformity score function. The current work introduces the Penalized Inverse Probability (PIP) nonconformity score, and its regularized version RePIP, which allow the joint optimization of both efficiency and informativeness. Through toy examples and empirical results on the task of crop and weed image classification in agricultural robotics, the current work shows how PIP-based conformal classifiers exhibit precisely the desired behavior in comparison with other nonconformity measures and strike a good balance between informativeness and efficiency.
Submitted 13 June, 2024;
originally announced June 2024.
-
Analyzing constrained LLM through PDFA-learning
Authors:
Matías Carrasco,
Franz Mayr,
Sergio Yovine,
Johny Kidd,
Martín Iturbide,
Juan Pedro da Silva,
Alejo Garat
Abstract:
We define a congruence that copes with null next-symbol probabilities that arise when the output of a language model is constrained by some means during text generation. We develop an algorithm for efficiently learning the quotient with respect to this congruence and evaluate it on case studies for analyzing statistical properties of LLM.
Submitted 15 June, 2024; v1 submitted 12 June, 2024;
originally announced June 2024.
-
Twitter should now be referred to as X: How academics, journals and publishers need to make the nomenclatural transition
Authors:
Jaime A. Teixeira da Silva,
Serhii Nazarovets
Abstract:
Here, we note how academics, journals and publishers should no longer refer to the social media platform Twitter as such, but rather as X. Relying on Google Scholar, we found 16 examples of papers published in the last months of 2023 - essentially during the transition period between Twitter and X - that used Twitter and X, but in different ways. Unlike that transition period, in which the binary Twitter/X could have been used in academic papers, we suggest that papers should no longer refer to Twitter as Twitter, but only as X, except for historical studies about that social media platform, because such use would be factually incorrect.
Submitted 31 May, 2024;
originally announced May 2024.
-
A Careful Examination of Large Language Model Performance on Grade School Arithmetic
Authors:
Hugh Zhang,
Jeff Da,
Dean Lee,
Vaughn Robinson,
Catherine Wu,
Will Song,
Tiffany Zhao,
Pranav Raja,
Dylan Slack,
Qin Lyu,
Sean Hendryx,
Russell Kaplan,
Michele Lunati,
Summer Yue
Abstract:
Large language models (LLMs) have achieved impressive success on many benchmarks for mathematical reasoning. However, there is growing concern that some of this performance actually reflects dataset contamination, where data closely resembling benchmark questions leaks into the training data, instead of true reasoning ability. To investigate this claim rigorously, we commission Grade School Math 1000 (GSM1k). GSM1k is designed to mirror the style and complexity of the established GSM8k benchmark, the gold standard for measuring elementary mathematical reasoning. We ensure that the two benchmarks are comparable across important metrics such as human solve rates, number of steps in solution, answer magnitude, and more. When evaluating leading open- and closed-source LLMs on GSM1k, we observe accuracy drops of up to 13%, with several families of models (e.g., Phi and Mistral) showing evidence of systematic overfitting across almost all model sizes. At the same time, many models, especially those on the frontier (e.g., Gemini/GPT/Claude), show minimal signs of overfitting. Further analysis suggests a positive relationship (Spearman's r^2=0.32) between a model's probability of generating an example from GSM8k and its performance gap between GSM8k and GSM1k, suggesting that many models may have partially memorized GSM8k.
Submitted 3 May, 2024; v1 submitted 1 May, 2024;
originally announced May 2024.
-
Revisiting Reward Design and Evaluation for Robust Humanoid Standing and Walking
Authors:
Bart van Marum,
Aayam Shrestha,
Helei Duan,
Pranay Dugar,
Jeremy Dao,
Alan Fern
Abstract:
A necessary capability for humanoid robots is the ability to stand and walk while rejecting natural disturbances. Recent progress has been made using sim-to-real reinforcement learning (RL) to train such locomotion controllers, with approaches differing mainly in their reward functions. However, prior works lack a clear method to systematically test new reward functions and compare controller performance through repeatable experiments. This limits our understanding of the trade-offs between approaches and hinders progress. To address this, we propose a low-cost, quantitative benchmarking method to evaluate and compare the real-world performance of standing and walking (SaW) controllers on metrics like command following, disturbance recovery, and energy efficiency. We also revisit reward function design and construct a minimally constraining reward function to train SaW controllers. We experimentally verify that our benchmarking framework can identify areas for improvement, which can be systematically addressed to enhance the policies. We also compare our new controller to state-of-the-art controllers on the Digit humanoid robot. The results provide clear quantitative trade-offs among the controllers and suggest directions for future improvements to the reward functions and expansion of the benchmarks.
Submitted 29 April, 2024;
originally announced April 2024.
-
A Survey of Large Language Models in Cybersecurity
Authors:
Gabriel de Jesus Coelho da Silva,
Carlos Becker Westphall
Abstract:
Large Language Models (LLMs) have quickly risen to prominence due to their ability to perform at or close to the state-of-the-art in a variety of fields while handling natural language. An important field of research is the application of such models in the cybersecurity context. This survey aims to identify where in the field of cybersecurity LLMs have already been applied, the ways in which they are being used, and their limitations in the field. Finally, suggestions are made on how to address such limitations and what can be expected from these systems once these limitations are overcome.
Submitted 26 February, 2024;
originally announced February 2024.
-
Position Paper: Toward New Frameworks for Studying Model Representations
Authors:
Satvik Golechha,
James Dao
Abstract:
Mechanistic interpretability (MI) aims to understand AI models by reverse-engineering the exact algorithms neural networks learn. Most works in MI so far have studied behaviors and capabilities that are trivial and token-aligned. However, most capabilities are not that trivial, which advocates for the study of hidden representations inside these networks as the unit of analysis. We do a literature review, formalize representations for features and behaviors, highlight their importance and evaluation, and perform some basic exploration in the mechanistic interpretability of representations. With discussion and exploratory results, we justify our position that studying representations is an important and under-studied field, and that currently established methods in MI are not sufficient to understand representations, thus pushing for the research community to work toward new frameworks for studying representations.
Submitted 6 February, 2024;
originally announced February 2024.
-
Kairos: Efficient Temporal Graph Analytics on a Single Machine
Authors:
Joana M. F. da Trindade,
Julian Shun,
Samuel Madden,
Nesime Tatbul
Abstract:
Many important societal problems are naturally modeled as algorithms over temporal graphs. To date, however, most graph processing systems remain inefficient as they rely on distributed processing even for graphs that fit well within a commodity server's available storage. In this paper, we introduce Kairos, a temporal graph analytics system that provides application developers a framework for efficiently implementing and executing algorithms over temporal graphs on a single machine. Specifically, Kairos relies on fork-join parallelism and a highly optimized parallel data structure as core primitives to maximize performance of graph processing tasks needed for temporal graph analytics. Furthermore, we introduce the notion of selective indexing and show how it can be used with an efficient index to speed up temporal queries. Our experiments on a 24-core server show that our algorithms obtain good parallel speedups, and are significantly faster than equivalent algorithms in existing temporal graph processing systems: up to 60x against a shared-memory approach, and several orders of magnitude when compared with distributed processing of graphs that fit within a single server.
Submitted 4 January, 2024;
originally announced January 2024.
-
AdaSub: Stochastic Optimization Using Second-Order Information in Low-Dimensional Subspaces
Authors:
João Victor Galvão da Mata,
Martin S. Andersen
Abstract:
We introduce AdaSub, a stochastic optimization algorithm that computes a search direction based on second-order information in a low-dimensional subspace that is defined adaptively based on available current and past information. Compared to first-order methods, second-order methods exhibit better convergence characteristics, but the need to compute the Hessian matrix at each iteration results in excessive computational expenses, making them impractical. To address this issue, our approach enables the management of computational expenses and algorithm efficiency by enabling the selection of the subspace dimension for the search. Our code is freely available on GitHub, and our preliminary numerical results demonstrate that AdaSub surpasses popular stochastic optimizers in terms of time and number of iterations required to reach a given accuracy.
Submitted 6 November, 2023; v1 submitted 30 October, 2023;
originally announced October 2023.
-
Linear decomposition of approximate multi-controlled single qubit gates
Authors:
Jefferson D. S. Silva,
Thiago Melo D. Azevedo,
Israel F. Araujo,
Adenilton J. da Silva
Abstract:
We provide a method for compiling approximate multi-controlled single qubit gates into quantum circuits without ancilla qubits. The total number of elementary gates to decompose an n-qubit multi-controlled gate is proportional to 32n, and the previous best approximate approach without auxiliary qubits requires 32nk elementary operations, where k is a function that depends on the error threshold. The proposed decomposition depends on an optimization technique that minimizes the CNOT gate count for multi-target and multi-controlled CNOT and SU(2) gates. Computational experiments show the reduction in the number of CNOT gates to apply multi-controlled U(2) gates. As multi-controlled single-qubit gates serve as fundamental components of quantum algorithms, the proposed decomposition offers a comprehensive solution that can significantly decrease the count of elementary operations employed in quantum computing applications.
Submitted 23 October, 2023;
originally announced October 2023.
-
An Adversarial Example for Direct Logit Attribution: Memory Management in gelu-4l
Authors:
James Dao,
Yeu-Tong Lau,
Can Rager,
Jett Janiak
Abstract:
How do language models deal with the limited bandwidth of the residual stream? Prior work has suggested that some attention heads and MLP layers may perform a "memory management" role. That is, clearing residual stream directions set by earlier layers by reading in information and writing out the negative version. In this work, we present concrete evidence for this phenomenon in a 4-layer transformer. We identify several heads in layer 2 that consistently remove the output of a single layer 0 head. We then verify that this erasure causally depends on the original written direction. We further demonstrate that direct logit attribution (DLA) suggests that writing and erasing heads directly contribute to predictions, when in fact their effects cancel out. Then we present adversarial prompts for which this effect is particularly salient. These findings reveal that memory management can make DLA results misleading. Accordingly, we make concrete recommendations for circuit analysis to prevent interpretability illusions.
Submitted 9 November, 2023; v1 submitted 11 October, 2023;
originally announced October 2023.
-
Sim-to-Real Learning for Humanoid Box Loco-Manipulation
Authors:
Jeremy Dao,
Helei Duan,
Alan Fern
Abstract:
In this work we propose a learning-based approach to box loco-manipulation for a humanoid robot. This is a particularly challenging problem due to the need for whole-body coordination in order to lift boxes of varying weight, position, and orientation while maintaining balance. To address this challenge, we present a sim-to-real reinforcement learning approach for training general box pickup and carrying skills for the bipedal robot Digit. Our reward functions are designed to produce the desired interactions with the box while also valuing balance and gait quality. We combine the learned skills into a full system for box loco-manipulation to achieve the task of moving boxes from one table to another with a variety of sizes, weights, and initial configurations. In addition to quantitative simulation results, we demonstrate successful sim-to-real transfer on the humanoid r
Submitted 4 October, 2023;
originally announced October 2023.
-
Learning Vision-Based Bipedal Locomotion for Challenging Terrain
Authors:
Helei Duan,
Bikram Pandit,
Mohitvishnu S. Gadde,
Bart van Marum,
Jeremy Dao,
Chanho Kim,
Alan Fern
Abstract:
Reinforcement learning (RL) for bipedal locomotion has recently demonstrated robust gaits over moderate terrains using only proprioceptive sensing. However, such blind controllers will fail in environments where robots must anticipate and adapt to local terrain, which requires visual perception. In this paper, we propose a fully-learned system that allows bipedal robots to react to local terrain while maintaining commanded travel speed and direction. Our approach first trains a controller in simulation using a heightmap expressed in the robot's local frame. Next, data is collected in simulation to train a heightmap predictor, whose input is the history of depth images and robot states. We demonstrate that with appropriate domain randomization, this approach allows for successful sim-to-real transfer with no explicit pose estimation and no fine-tuning using real-world data. To the best of our knowledge, this is the first example of sim-to-real learning for vision-based bipedal locomotion over challenging terrains.
Submitted 25 September, 2023;
originally announced September 2023.
-
Desenvolvimento de modelo para predição de cotações de ação baseada em análise de sentimentos de tweets [Development of a model for predicting stock prices based on sentiment analysis of tweets]
Authors:
Mario Mitsuo Akita,
Everton Josue da Silva
Abstract:
Training machine learning models to predict stock prices has been an active area of research since automated trading of such securities became available in real time. While most work in this field trains neural networks on past share prices, in this work we use the iFeel 2.0 platform to extract 19 sentiment features from posts on the microblogging platform Twitter that mention the company Petrobras. We then use those features to train XBoot models to predict future stock prices for the company. Finally, we simulate the trading of Petrobras shares based on the model's outputs and find a net gain of R$88,82 over a 250-day period when compared to the average performance of 100 random models.
Submitted 11 September, 2023;
originally announced September 2023.
-
Group-Conditional Conformal Prediction via Quantile Regression Calibration for Crop and Weed Classification
Authors:
Paul Melki,
Lionel Bombrun,
Boubacar Diallo,
Jérôme Dias,
Jean-Pierre da Costa
Abstract:
As deep learning predictive models become an integral part of a large spectrum of precision agricultural systems, a barrier to the adoption of such automated solutions is the lack of user trust in these highly complex, opaque and uncertain models. Indeed, deep neural networks are not equipped with any explicit guarantees that can be used to certify the system's performance, especially in highly varying uncontrolled environments such as the ones typically faced in computer vision for agriculture. Fortunately, certain methods developed in other communities can prove to be important for agricultural applications. This article presents the conformal prediction framework that provides valid statistical guarantees on the predictive performance of any black box prediction machine, with almost no assumptions, applied to the problem of deep visual classification of weeds and crops in real-world conditions. The framework is presented with a focus on its practical aspects and special attention accorded to the Adaptive Prediction Sets (APS) approach that delivers marginal guarantees on the model's coverage. Marginal results are then shown to be insufficient to guarantee performance on all groups of individuals in the population as characterized by their environmental and pedo-climatic auxiliary data gathered during image acquisition. To tackle this shortcoming, group-conditional conformal approaches are presented: the "classical" method that consists of iteratively applying the APS procedure on all groups, and a proposed elegant reformulation and implementation of the procedure using quantile regression on group membership indicators. Empirical results showing the validity of the proposed approach are presented, compared to the marginal APS, and then discussed.
Submitted 29 August, 2023;
originally announced August 2023.
-
Efficient set-theoretic algorithms for computing high-order Forman-Ricci curvature on abstract simplicial complexes
Authors:
Danillo Barros de Souza,
Jonatas T. S. da Cunha,
Fernando A. N. Santos,
Jürgen Jost,
Serafim Rodrigues
Abstract:
Forman-Ricci curvature (FRC) is a powerful tool for analysing empirical networks, as the distribution of the curvature values can identify structural information that is not readily detected by other geometrical methods. Crucially, FRC captures higher-order structural information of clique complexes of a graph or Vietoris-Rips complexes, which is not readily accessible to alternative methods. However, existing FRC platforms are prohibitively computationally expensive. Therefore, herein we develop an efficient set-theoretic formulation for computing such high-order FRC in simplicial complexes. Significantly, our set theory representation reveals previous computational bottlenecks and also accelerates the computation of FRC. Finally, we provide pseudo-code, a software implementation coined FastForman, as well as a benchmark comparison with alternative implementations. We envisage that FastForman will be used in Topological and Geometrical Data analysis for high-dimensional complex data sets. Moreover, our development paves the way for future generalisations towards efficient computations of FRC on cell complexes.
Submitted 9 May, 2024; v1 submitted 22 August, 2023;
originally announced August 2023.
-
RobôCIn Small Size League Extended Team Description Paper for RoboCup 2023
Authors:
Aline Lima de Oliveira,
Cauê Addae da Silva Gomes,
Cecília Virginia Santos da Silva,
Charles Matheus de Sousa Alves,
Danilo Andrade Martins de Souza,
Driele Pires Ferreira Araújo Xavier,
Edgleyson Pereira da Silva,
Felipe Bezerra Martins,
Lucas Henrique Cavalcanti Santos,
Lucas Dias Maciel,
Matheus Paixão Gumercindo dos Santos,
Matheus Lafayette Vasconcelos,
Matheus Vinícius Teotonio do Nascimento Andrade,
João Guilherme Oliveira Carvalho de Melo,
João Pedro Souza Pereira de Moura,
José Ronald da Silva,
José Victor Silva Cruz,
Pedro Henrique Santana de Morais,
Pedro Paulo Salman de Oliveira,
Riei Joaquim Matos Rodrigues,
Roberto Costa Fernandes,
Ryan Vinicius Santos Morais,
Tamara Mayara Ramos Teobaldo,
Washington Igor dos Santos Silva,
Edna Natividade Silva Barros
Abstract:
RobôCIn has participated in RoboCup Small Size League since 2019, won its first world title in 2022 (Division B), and is currently a three-time Latin-American champion. This paper presents our improvements to defend the Small Size League (SSL) Division B title in RoboCup 2023 in Bordeaux, France. This paper aims to share some of the academic research that our team developed over the past year. Our team has successfully published 2 articles related to SSL at two high-impact conferences: the 25th RoboCup International Symposium and the 19th IEEE Latin American Robotics Symposium (LARS 2022). Over the last year, we have been continuously migrating from our past codebase to Unification. We will describe the new architecture implemented and some points of software and AI refactoring. In addition, we discuss the process of integrating machined components into the mechanical system, our development for participating in the vision blackout challenge last year and what we are preparing for this year.
Submitted 19 July, 2023;
originally announced July 2023.
-
ISP meets Deep Learning: A Survey on Deep Learning Methods for Image Signal Processing
Authors:
Matheus Henrique Marques da Silva,
Jhessica Victoria Santos da Silva,
Rodrigo Reis Arrais,
Wladimir Barroso Guedes de Araújo Neto,
Leonardo Tadeu Lopes,
Guilherme Augusto Bileki,
Iago Oliveira Lima,
Lucas Borges Rondon,
Bruno Melo de Souza,
Mayara Costa Regazio,
Rodolfo Coelho Dalapicola,
Claudio Filipi Gonçalves dos Santos
Abstract:
The entire Image Signal Processor (ISP) of a camera relies on several processes to transform the data from the Color Filter Array (CFA) sensor, such as demosaicing, denoising, and enhancement. These processes can be executed either in hardware or via software. In recent years, Deep Learning has emerged as one solution for some of them or even to replace the entire ISP using a single neural network for the task. In this work, we investigate several recent pieces of research in this area and provide deeper analysis and comparison among them, including results and possible points of improvement for future researchers.
Submitted 23 May, 2023; v1 submitted 19 May, 2023;
originally announced May 2023.
-
Visual Question Answering: A Survey on Techniques and Common Trends in Recent Literature
Authors:
Ana Cláudia Akemi Matsuki de Faria,
Felype de Castro Bastos,
José Victor Nogueira Alves da Silva,
Vitor Lopes Fabris,
Valeska de Sousa Uchoa,
Décio Gonçalves de Aguiar Neto,
Claudio Filipi Goncalves dos Santos
Abstract:
Visual Question Answering (VQA) is an emerging area of interest for researchers, being a recent problem in natural language processing and image prediction. In this area, an algorithm needs to answer questions about certain images. As of the writing of this survey, 25 recent studies were analyzed. In addition, 6 datasets were analyzed, and links to download them are provided. In this work, several recent pieces of research in this area were investigated, and a deeper analysis and comparison among them is provided, including results, the state of the art, common errors, and possible points of improvement for future researchers.
Submitted 2 June, 2023; v1 submitted 18 May, 2023;
originally announced May 2023.
-
eXplainable Artificial Intelligence on Medical Images: A Survey
Authors:
Matteus Vargas Simão da Silva,
Rodrigo Reis Arrais,
Jhessica Victoria Santos da Silva,
Felipe Souza Tânios,
Mateus Antonio Chinelatto,
Natalia Backhaus Pereira,
Renata De Paris,
Lucas Cesar Ferreira Domingos,
Rodrigo Dória Villaça,
Vitor Lopes Fabris,
Nayara Rossi Brito da Silva,
Ana Claudia Akemi Matsuki de Faria,
Jose Victor Nogueira Alves da Silva,
Fabiana Cristina Queiroz de Oliveira Marucci,
Francisco Alves de Souza Neto,
Danilo Xavier Silva,
Vitor Yukio Kondo,
Claudio Filipi Gonçalves dos Santos
Abstract:
Over the last few years, the number of works about deep learning applied to the medical field has increased enormously. A rigorous assessment of these models is required to explain their results to everyone involved in medical exams. A recent field in the machine learning area is explainable artificial intelligence, also known as XAI, which aims to explain the results of such black box models to permit the desired assessment. This survey analyses several recent studies in the XAI field applied to medical diagnosis research, allowing some explainability of the machine learning results in several different diseases, such as cancers and COVID-19.
Submitted 12 May, 2023;
originally announced May 2023.
-
Decomposition of Multi-controlled Special Unitary Single-Qubit Gates
Authors:
Rafaella Vale,
Thiago Melo D. Azevedo,
Ismael C. S. Araújo,
Israel F. Araujo,
Adenilton J. da Silva
Abstract:
Multi-controlled unitary gates have been a subject of interest in quantum computing since its inception, and are widely used in quantum algorithms. The current state-of-the-art approach to implementing n-qubit multi-controlled gates involves the use of a quadratic number of single-qubit and CNOT gates. However, linear solutions are possible for the case where the controlled gate is a special unitary SU(2). The most widely-used decomposition of an n-qubit multi-controlled SU(2) gate requires a circuit with a number of CNOT gates proportional to 28n. In this work, we present a new decomposition of n-qubit multi-controlled SU(2) gates that requires a circuit with a number of CNOT gates proportional to 20n, and proportional to 16n if the SU(2) gate has at least one real-valued diagonal. This new approach significantly improves the existing algorithm by reducing the number of CNOT gates and the overall circuit depth. As an application, we show the use of this decomposition for sparse quantum state preparation. Our results are further validated by demonstrating a proof of principle on a quantum device accessed through quantum cloud services.
Submitted 13 February, 2023;
originally announced February 2023.
-
Do Androids Laugh at Electric Sheep? Humor "Understanding" Benchmarks from The New Yorker Caption Contest
Authors:
Jack Hessel,
Ana Marasović,
Jena D. Hwang,
Lillian Lee,
Jeff Da,
Rowan Zellers,
Robert Mankoff,
Yejin Choi
Abstract:
Large neural networks can now generate jokes, but do they really "understand" humor? We challenge AI models with three tasks derived from the New Yorker Cartoon Caption Contest: matching a joke to a cartoon, identifying a winning caption, and explaining why a winning caption is funny. These tasks encapsulate progressively more sophisticated aspects of "understanding" a cartoon; key elements are the complex, often surprising relationships between images and captions and the frequent inclusion of indirect and playful allusions to human experience and culture. We investigate both multimodal and language-only models: the former are challenged with the cartoon images directly, while the latter are given multifaceted descriptions of the visual scene to simulate human-level visual understanding. We find that both types of models struggle at all three tasks. For example, our best multimodal models fall 30 accuracy points behind human performance on the matching task, and, even when provided ground-truth visual scene descriptors, human-authored explanations are preferred head-to-head over the best machine-authored ones (few-shot GPT-4) in more than 2/3 of cases. We release models, code, leaderboard, and corpus, which includes newly-gathered annotations describing the image's locations/entities, what's unusual in the scene, and an explanation of the joke.
Submitted 6 July, 2023; v1 submitted 13 September, 2022;
originally announced September 2022.
-
Dynamic Bipedal Maneuvers through Sim-to-Real Reinforcement Learning
Authors:
Fangzhou Yu,
Ryan Batke,
Jeremy Dao,
Jonathan Hurst,
Kevin Green,
Alan Fern
Abstract:
For legged robots to match the athletic capabilities of humans and animals, they must not only produce robust periodic walking and running, but also seamlessly switch between nominal locomotion gaits and more specialized transient maneuvers. Despite recent advancements in controls of bipedal robots, there has been little focus on producing highly dynamic behaviors. Recent work utilizing reinforcement learning to produce policies for control of legged robots has demonstrated success in producing robust walking behaviors. However, these learned policies have difficulty expressing a multitude of different behaviors on a single network. Inspired by conventional optimization-based control techniques for legged robots, this work applies a recurrent policy to execute four-step, 90-degree turns trained using reference data generated from optimized single rigid body model trajectories. We present a novel training framework using epilogue terminal rewards for learning specific behaviors from pre-computed trajectory data and demonstrate a successful transfer to hardware on the bipedal robot Cassie.
Submitted 16 July, 2022;
originally announced July 2022.
-
Optimizing Bipedal Maneuvers of Single Rigid-Body Models for Reinforcement Learning
Authors:
Ryan Batke,
Fangzhou Yu,
Jeremy Dao,
Jonathan Hurst,
Ross L. Hatton,
Alan Fern,
Kevin Green
Abstract:
In this work, we propose a method to generate reduced-order model reference trajectories for general classes of highly dynamic maneuvers for bipedal robots for use in sim-to-real reinforcement learning. Our approach is to utilize a single rigid-body model (SRBM) to optimize libraries of trajectories offline to be used as expert references in the reward function of a learned policy. This method translates the model's dynamically rich rotational and translational behaviour to a full-order robot model and successfully transfers to real hardware. The SRBM's simplicity allows for fast iteration and refinement of behaviors, while the robustness of learning-based controllers allows for highly dynamic motions to be transferred to hardware. Within this work we introduce a set of transferability constraints that amend the SRBM dynamics to actual bipedal robot hardware, our framework for creating optimal trajectories for a variety of highly dynamic maneuvers, as well as our approach to integrating reference trajectories for a high-speed running reinforcement learning policy. We validate our methods on the bipedal robot Cassie, on which we successfully demonstrated highly dynamic grounded running gaits up to 3.0 m/s.
Submitted 8 July, 2022;
originally announced July 2022.
-
Learning Dynamic Bipedal Walking Across Stepping Stones
Authors:
Helei Duan,
Ashish Malik,
Mohitvishnu S. Gadde,
Jeremy Dao,
Alan Fern,
Jonathan Hurst
Abstract:
In this work, we propose a learning approach for 3D dynamic bipedal walking when footsteps are constrained to stepping stones. While recent work has shown progress on this problem, real-world demonstrations have been limited to relatively simple open-loop, perception-free scenarios. Our main contribution is a more advanced learning approach that enables real-world demonstrations, using the Cassie robot, of closed-loop dynamic walking over moderately difficult stepping-stone patterns. Our approach first uses reinforcement learning (RL) in simulation to train a controller that maps footstep commands onto joint actions without any reference motion information. We then learn a model of that controller's capabilities, which enables prediction of feasible footsteps given the robot's current dynamic state. The resulting controller and model are then integrated with a real-time overhead camera system for detecting stepping stone locations. For evaluation, we develop a benchmark set of stepping stone patterns, which are used to test performance in both simulation and the real world. Overall, we demonstrate that sim-to-real learning is extremely promising for enabling dynamic locomotion over stepping stones. We also identify challenges remaining that motivate important future research directions.
Submitted 3 May, 2022;
originally announced May 2022.
-
Sim-to-Real Learning for Bipedal Locomotion Under Unsensed Dynamic Loads
Authors:
Jeremy Dao,
Kevin Green,
Helei Duan,
Alan Fern,
Jonathan Hurst
Abstract:
Recent work on sim-to-real learning for bipedal locomotion has demonstrated new levels of robustness and agility over a variety of terrains. However, that work, and most prior bipedal locomotion work, have not considered locomotion under a variety of external loads that can significantly influence the overall system dynamics. In many applications, robots will need to maintain robust locomotion under a wide range of potential dynamic loads, such as pulling a cart or carrying a large container of sloshing liquid, ideally without requiring additional load-sensing capabilities. In this work, we explore the capabilities of reinforcement learning (RL) and sim-to-real transfer for bipedal locomotion under dynamic loads using only proprioceptive feedback. We show that prior RL policies trained for unloaded locomotion fail for some loads and that simply training in the context of loads is enough to result in successful and improved policies. We also compare training specialized policies for each load versus a single policy for all considered loads and analyze how the resulting gaits change to accommodate different loads. Finally, we demonstrate sim-to-real transfer, which is successful but shows a wider sim-to-real gap than prior unloaded work, which points to interesting future research.
Submitted 8 April, 2022;
originally announced April 2022.
-
Linear-depth quantum circuits for multiqubit controlled gates
Authors:
Adenilton J. da Silva,
Daniel K. Park
Abstract:
Quantum circuit depth minimization is critical for practical applications of circuit-based quantum computation. In this work, we present a systematic procedure to decompose multiqubit controlled unitary gates, which are essential in many quantum algorithms, into controlled-NOT and single-qubit gates such that the quantum circuit depth increases only linearly with the number of control qubits. Our algorithm does not require any ancillary qubits and achieves a quadratic reduction of the circuit depth against known methods. We show the advantage of our algorithm with proof-of-principle experiments on the IBM quantum cloud platform.
Submitted 4 October, 2022; v1 submitted 22 March, 2022;
originally announced March 2022.
-
Sim-to-Real Learning of Footstep-Constrained Bipedal Dynamic Walking
Authors:
Helei Duan,
Ashish Malik,
Jeremy Dao,
Aseem Saxena,
Kevin Green,
Jonah Siekmann,
Alan Fern,
Jonathan Hurst
Abstract:
Recently, work on reinforcement learning (RL) for bipedal robots has successfully learned controllers for a variety of dynamic gaits with robust sim-to-real demonstrations. In order to maintain balance, the learned controllers have full freedom of where to place the feet, resulting in highly robust gaits. In the real world however, the environment will often impose constraints on the feasible footstep locations, typically identified by perception systems. Unfortunately, most demonstrated RL controllers on bipedal robots do not allow for specifying and responding to such constraints. This missing control interface greatly limits the real-world application of current RL controllers. In this paper, we aim to maintain the robust and dynamic nature of learned gaits while also respecting footstep constraints imposed externally. We develop an RL formulation for training dynamic gait controllers that can respond to specified touchdown locations. We then successfully demonstrate simulation and sim-to-real performance on the bipedal robot Cassie. In addition, we use supervised learning to induce a transition model for accurately predicting the next touchdown locations that the controller can achieve given the robot's proprioceptive observations. This model paves the way for integrating the learned controller into a full-order robot locomotion planner that robustly satisfies both balance and environmental constraints.
Submitted 3 May, 2022; v1 submitted 14 March, 2022;
originally announced March 2022.
-
Cyber security and the Leviathan
Authors:
Joseph Da Silva
Abstract:
Dedicated cyber-security functions are common in commercial businesses, who are confronted by evolving and pervasive threats of data breaches and other perilous security events. Such businesses are enmeshed with the wider societies in which they operate. Using data gathered from in-depth, semi-structured interviews with 15 Chief Information Security Officers, as well as six senior organisational leaders, we show that the work of political philosopher Thomas Hobbes, particularly Leviathan, offers a useful lens through which to understand the context of these functions and of cyber security in Western society. Our findings indicate that cyber security within these businesses demonstrates a number of Hobbesian features that are further implicated in, and provide significant benefits to, the wider Leviathan-esque state. These include the normalisation of intrusive controls, such as surveillance, and the stimulation of consumption. We conclude by suggesting implications for cyber-security practitioners, in particular, the reflexivity that these perspectives offer, as well as for businesses and other researchers.
Submitted 10 March, 2022;
originally announced March 2022.
-
'Cyber security is a dark art': The CISO as soothsayer
Authors:
Joseph Da Silva,
Rikke Bjerg Jensen
Abstract:
Commercial organisations continue to face a growing and evolving threat of data breaches and system compromises, making their cyber-security function critically important. Many organisations employ a Chief Information Security Officer (CISO) to lead such a function. We conducted in-depth, semi-structured interviews with 15 CISOs and six senior organisational leaders, between October 2019 and July 2020, as part of a wider exploration into the purpose of CISOs and cyber-security functions. In this paper, we employ broader security scholarship related to ontological security and sociological notions of identity work to provide an interpretative analysis of the CISO role in organisations. Research findings reveal that cyber security is an expert system that positions the CISO as an interpreter of something that is mystical, unknown and fearful to the uninitiated. They show how the fearful nature of cyber security contributes to it being considered an ontological threat by the organisation, while responding to that threat contributes to the organisation's overall identity. We further show how cyber security is analogous to a belief system and how one of the roles of the CISO is akin to that of a modern-day soothsayer for senior management; that this role is precarious and, at the same time, superior, leading to alienation within the organisation. Our study also highlights that the CISO identity of protector-from-threat, linked to the precarious position, motivates self-serving actions that we term 'cyber sophistry'. We conclude by outlining a series of implications for both organisations and CISOs.
Submitted 25 February, 2022;
originally announced February 2022.
-
Low-rank quantum state preparation
Authors:
Israel F. Araujo,
Carsten Blank,
Ismael C. S. Araújo,
Adenilton J. da Silva
Abstract:
Ubiquitous in quantum computing is the step to encode data into a quantum state. This process is called quantum state preparation, and its complexity for non-structured data is exponential in the number of qubits. Several works address this problem, for instance, by using variational methods that train a fixed depth circuit with manageable complexity. These methods have their limitations, such as the lack of a back-propagation technique and barren plateaus. This work proposes an algorithm to reduce state preparation circuit depth by offloading computational complexity to a classical computer. The initialized quantum state can be exact or an approximation, and we show that the approximation is better on today's quantum processors than the initialization of the original state. Experimental evaluation demonstrates that the proposed method enables more efficient initialization of probability distributions in a quantum state.
Submitted 27 July, 2023; v1 submitted 4 November, 2021;
originally announced November 2021.
-
Exploring the Use of Static and Dynamic Analysis to Improve the Performance of the Mining Sandbox Approach for Android Malware Identification
Authors:
Francisco Handrick da Costa,
Ismael Medeiros,
Thales Menezes,
João Victor da Silva,
Ingrid Lorraine da Silva,
Rodrigo Bonifácio,
Krishna Narasimhan,
Márcio Ribeiro
Abstract:
The Android mining sandbox approach consists in running dynamic analysis tools on a benign version of an Android app and recording every call to sensitive APIs. Later, one can use this information to (a) prevent calls to other sensitive APIs (those not previously recorded) or (b) run the dynamic analysis tools again in a different version of the app -- in order to identify possible malicious behavior. Although the use of dynamic analysis for mining Android sandboxes has been empirically investigated before, little is known about the potential benefits of combining static analysis with the mining sandbox approach for identifying malicious behavior. As such, in this paper we present the results of two empirical studies: The first is a non-exact replication of a previous research work from Bao et al., which compares the performance of test case generation tools for mining Android sandboxes. The second is a new experiment to investigate the implications of using taint analysis algorithms to complement the mining sandbox approach in the task of identifying malicious behavior. Our study brings several findings. For instance, the first study reveals that a static analysis component of DroidFax (a tool used for instrumenting Android apps in the Bao et al. study) contributes substantially to the performance of the dynamic analysis tools explored in the previous work. The results of the second study show that taint analysis is also a practical complement to the mining sandbox approach, improving the performance of the latter strategy by up to 28.57%.
Submitted 14 September, 2021;
originally announced September 2021.
-
Sense representations for Portuguese: experiments with sense embeddings and deep neural language models
Authors:
Jessica Rodrigues da Silva,
Helena de Medeiros Caseli
Abstract:
Sense representations have gone beyond word representations like Word2Vec, GloVe and FastText and achieved innovative performance on a wide range of natural language processing tasks. Although very useful in many applications, the traditional approaches for generating word embeddings have a strict drawback: they produce a single vector representation for a given word, ignoring the fact that ambiguous words can assume different meanings. In this paper, we explore unsupervised sense representations which, unlike traditional word embeddings, are able to induce different senses of a word by analyzing its contextual semantics in a text. The unsupervised sense representations investigated in this paper are sense embeddings and deep neural language models. We present the first experiments carried out for generating sense embeddings for Portuguese. Our experiments show that the sense embedding model (Sense2vec) outperformed traditional word embeddings in the syntactic and semantic analogies task, proving that the language resource generated here can improve the performance of NLP tasks in Portuguese. We also evaluated the performance of pre-trained deep neural language models (ELMo and BERT) in two transfer learning approaches, feature-based and fine-tuning, in the semantic textual similarity task. Our experiments indicate that the fine-tuned Multilingual and Portuguese BERT language models were able to achieve better accuracy than the ELMo model and baselines.
Submitted 31 August, 2021;
originally announced September 2021.
-
Double sparse quantum state preparation
Authors:
Tiago M. L. de Veras,
Leon D. da Silva,
Adenilton J. da Silva
Abstract:
Initializing classical data in a quantum device is an essential step in many quantum algorithms. As a consequence of measurement and noisy operations, some algorithms need to reinitialize the prepared state several times during their execution. In this work, we propose a quantum state preparation algorithm called CVO-QRAM with computational cost $O(kM)$, where $M$ is the number of nonzero probability amplitudes and $k$ is the maximum number of bits with value 1 in the patterns to be stored. The proposed algorithm can be an alternative to create sparse states in future NISQ devices.
Submitted 30 August, 2021;
originally announced August 2021.
-
Configurable sublinear circuits for quantum state preparation
Authors:
Israel F. Araujo,
Daniel K. Park,
Teresa B. Ludermir,
Wilson R. Oliveira,
Francesco Petruccione,
Adenilton J. da Silva
Abstract:
The theory of quantum algorithms promises unprecedented benefits of harnessing the laws of quantum mechanics for solving certain computational problems. A persistent obstacle to using such algorithms for solving a wide range of real-world problems is the cost of loading classical data to a quantum state. Several quantum circuit-based methods have been proposed for encoding classical data as probability amplitudes of a quantum state. However, they require either quantum circuit depth or width to grow linearly with the data size, even though the other dimension of the quantum circuit grows logarithmically. In this paper, we present a configurable bidirectional procedure that addresses this problem by tailoring the resource trade-off between quantum circuit width and depth. In particular, we show a configuration that encodes an $N$-dimensional state by a quantum circuit with $O(\sqrt{N})$ width and depth and entangled information in ancillary qubits. We show a proof-of-principle on five quantum computers and compare the results.
Submitted 2 March, 2022; v1 submitted 23 August, 2021;
originally announced August 2021.
-
On the inversion number of oriented graphs
Authors:
Jørgen Bang-Jensen,
Jonas Costa Ferreira da Silva,
Frédéric Havet
Abstract:
Let $D$ be an oriented graph. The inversion of a set $X$ of vertices in $D$ consists in reversing the direction of all arcs with both ends in $X$. The inversion number of $D$, denoted by ${\rm inv}(D)$, is the minimum number of inversions needed to make $D$ acyclic. Denoting by $τ(D)$, $τ' (D)$, and $ν(D)$ the cycle transversal number, the cycle arc-transversal number and the cycle packing number of $D$ respectively, one shows that ${\rm inv}(D) \leq τ' (D)$, ${\rm inv}(D) \leq 2τ(D)$ and there exists a function $g$ such that ${\rm inv}(D)\leq g(ν(D))$. We conjecture that for any two oriented graphs $L$ and $R$, ${\rm inv}(L\rightarrow R) ={\rm inv}(L) +{\rm inv}(R)$ where $L\rightarrow R$ is the dijoin of $L$ and $R$. This would imply that the first two inequalities are tight. We prove this conjecture when ${\rm inv}(L)\leq 1$ and ${\rm inv}(R)\leq 2$ and when ${\rm inv}(L) ={\rm inv}(R)=2$ and $L$ and $R$ are strongly connected. We also show that the function $g$ of the third inequality satisfies $g(1)\leq 4$.
We then consider the complexity of deciding whether ${\rm inv}(D)\leq k$ for a given oriented graph $D$. We show that it is NP-complete for $k=1$, which together with the above conjecture would imply that it is NP-complete for every $k$. This contrasts with a result of Belkhechine et al. which states that deciding whether ${\rm inv}(T)\leq k$ for a given tournament $T$ is polynomial-time solvable.
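The definitions above can be made concrete with a small brute-force sketch. It assumes an oriented graph given as a vertex list and a set of arcs `(u, v)`; the function names are illustrative and not taken from the paper. The breadth-first search over sets of inversions is exponential and only practical for tiny graphs, which is consistent with the hardness result stated in the abstract.

```python
from itertools import combinations


def invert(arcs, X):
    """Reverse every arc whose two endpoints both lie in X."""
    X = set(X)
    return {(v, u) if (u in X and v in X) else (u, v) for (u, v) in arcs}


def is_acyclic(vertices, arcs):
    """Kahn's algorithm: True iff the digraph has no directed cycle."""
    indeg = {v: 0 for v in vertices}
    out = {v: [] for v in vertices}
    for u, v in arcs:
        out[u].append(v)
        indeg[v] += 1
    stack = [v for v in vertices if indeg[v] == 0]
    removed = 0
    while stack:
        u = stack.pop()
        removed += 1
        for w in out[u]:
            indeg[w] -= 1
            if indeg[w] == 0:
                stack.append(w)
    return removed == len(vertices)


def inversion_number(vertices, arcs):
    """Brute-force inv(D): fewest inversions that make D acyclic."""
    vertices = list(vertices)
    # Sets with fewer than two vertices contain no arc, so skip them.
    subsets = [c for r in range(2, len(vertices) + 1)
               for c in combinations(vertices, r)]
    frontier = {frozenset(arcs)}
    seen = set(frontier)
    k = 0
    while True:
        if any(is_acyclic(vertices, a) for a in frontier):
            return k
        nxt = set()
        for a in frontier:
            for X in subsets:
                b = frozenset(invert(a, X))
                if b not in seen:
                    seen.add(b)
                    nxt.add(b)
        frontier = nxt
        k += 1


# Directed triangle 0 -> 1 -> 2 -> 0: inverting {0, 1} breaks the cycle.
triangle = {(0, 1), (1, 2), (2, 0)}
print(inversion_number(range(3), triangle))
```

The search always terminates because inverting a two-vertex set reverses a single arc, so every orientation of the underlying graph, including an acyclic one, is reachable.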
Submitted 18 December, 2022; v1 submitted 10 May, 2021;
originally announced May 2021.
-
Metadata Interpretation Driven Development
Authors:
Júlio G. S. F. da Costa,
Reinaldo A. Petta,
Samuel Xavier-de-Souza
Abstract:
Despite decades of engineering and scientific research efforts, separation of concerns in software development remains not fully achieved. The challenge has been to avoid the crosscutting of concerns phenomenon, which has no apparent complete solution. In this paper, we show that business-domain coding plays an even larger role in this challenge. We then introduce a new approach called \emph{Metadata Interpretation Driven Development} (MIDD), which suggests a way to enhance the current way of realizing separation of concerns by eliminating the need to code functional concerns. We propose to code non-functional concerns as metadata interpreters. This interpretation occurs at run-time and is possible because it assumes the existence of such metadata in artefacts created in previous stages of the process, such as the modelling phase. We show how this can increase the (re)use of the constructs. Furthermore, we show that a single interpreter, due to its semantic disconnection from the domain, can simultaneously serve different business domains with no concerns regarding the need to rewrite or refactor code. Although high-reuse software construction is considered a relatively mature field, changes in the software services scenario demand constant evolution of the actual solutions. The emergence of new software architectures, such as serverless computing, reinforces the need to rethink software construction. This approach is presented as a response to this need.
Submitted 8 October, 2021; v1 submitted 2 May, 2021;
originally announced May 2021.
-
Design Principles for Packet Deparsers on FPGAs
Authors:
Thomas Luinaud,
Jeferson Santiago da Silva,
J. M. Pierre Langlois,
Yvon Savaria
Abstract:
The P4 language has drastically changed the networking field, as it allows new networking applications to be described and implemented quickly. Although a large variety of applications can be described with the P4 language, current programmable switch architectures impose significant constraints on P4 programs. To address this shortcoming, FPGAs have been explored as potential targets for P4 applications. P4 applications are described using three abstractions: a packet parser, match-action tables, and a packet deparser, which reassembles the output packet with the result of the match-action tables. While implementations of packet parsers and match-action tables on FPGAs have been widely covered in the literature, no general design principles have been presented for the packet deparser. Indeed, implementing a high-speed and efficient deparser on FPGAs remains an open issue because it requires a large number of interconnections and the architecture must be tailored to a P4 program. As a result, in several works where a P4 application is implemented on FPGAs, the deparser consumes a significant proportion of chip resources. Hence, in this paper, we address this issue by presenting design principles for efficient and high-speed deparsers on FPGAs. As an artifact, we introduce a tool that generates an efficient vendor-agnostic deparser architecture from a P4 program. Our design has been validated and simulated with a cocotb-based framework. The resulting architecture is implemented on Xilinx Ultrascale+ FPGAs and supports a throughput of more than 200 Gbps while reducing resource usage by almost 10$\times$ compared to other solutions.
Submitted 13 March, 2021;
originally announced March 2021.
-
Analyzing Commonsense Emergence in Few-shot Knowledge Models
Authors:
Jeff Da,
Ronan Le Bras,
Ximing Lu,
Yejin Choi,
Antoine Bosselut
Abstract:
Recently, commonsense knowledge models - pretrained language models (LM) fine-tuned on knowledge graph (KG) tuples - showed that considerable amounts of commonsense knowledge can be encoded in the parameters of large language models. However, as parallel studies show that LMs are poor hypothesizers of declarative commonsense relationships on their own, it remains unclear whether this knowledge is learned during pretraining or from fine-tuning on KG examples. To investigate this question, we train commonsense knowledge models in few-shot settings to study the emergence of their commonsense representation abilities. Our results show that commonsense knowledge models can rapidly adapt from limited examples, indicating that KG fine-tuning serves to learn an interface to encoded knowledge learned during pretraining. Importantly, our analysis of absolute, angular, and distributional parameter changes during few-shot fine-tuning provides novel insights into how this interface is learned.
Submitted 9 September, 2021; v1 submitted 1 January, 2021;
originally announced January 2021.
-
Analysis of co-authorship networks among Brazilian graduate programs in computer science
Authors:
Alex Junior Nunes da Silva,
Matheus Montanini Breve,
Jesús Pascual Mena-Chalco,
Fabrício Martins Lopes
Abstract:
The growth and popularization of platforms on scientific production have been the subject of several studies, producing relevant analyses of coauthorship behavior among groups of researchers. Researchers and their scientific productions can be analyzed as coauthorship social networks, in which researchers are linked through common publications. In this context, coauthorship networks can be analyzed to find patterns that can describe or characterize them. This work presents the analysis and characterization of coauthorship networks of Brazilian academic graduate programs in computer science. To this end, data from the curricula of Brazilian researchers were collected and modeled as coauthorship networks among the graduate programs in which the researchers participate. Each network topology was analyzed with respect to complex network measurements and three qualitative indices that evaluate publication quality. In addition, the coauthorship networks of the graduate programs were characterized in relation to the evaluation received from CAPES, which assigns a qualitative grade to graduate programs in Brazil. The results indicate some of the most relevant topological measures for characterizing the programs and evaluating them at different qualitative grades, and reveal a pattern among the graduate programs best evaluated by CAPES.
Submitted 22 December, 2020;
originally announced December 2020.
-
Edited Media Understanding: Reasoning About Implications of Manipulated Images
Authors:
Jeff Da,
Maxwell Forbes,
Rowan Zellers,
Anthony Zheng,
Jena D. Hwang,
Antoine Bosselut,
Yejin Choi
Abstract:
Multimodal disinformation, from `deepfakes' to simple edits that deceive, is an important societal problem. Yet at the same time, the vast majority of media edits are harmless -- such as a filtered vacation photo. The difference between this example, and harmful edits that spread disinformation, is one of intent. Recognizing and describing this intent is a major challenge for today's AI systems.
We present the task of Edited Media Understanding, requiring models to answer open-ended questions that capture the intent and implications of an image edit. We introduce a dataset for our task, EMU, with 48k question-answer pairs written in rich natural language. We evaluate a wide variety of vision-and-language models for our task, and introduce a new model PELICAN, which builds upon recent progress in pretrained multimodal representations. Our model obtains promising results on our dataset, with humans rating its answers as accurate 40.35% of the time. At the same time, there is still much work to be done -- humans prefer human-annotated captions 93.56% of the time -- and we provide analysis that highlights areas for further progress.
Submitted 8 December, 2020;
originally announced December 2020.
-
Recent Trends in Wearable Computing Research: A Systematic Review
Authors:
Vicente J. P. Amorim,
Ricardo A. O. Oliveira,
Mauricio Jose da Silva
Abstract:
Wearable devices are a trending topic in both commercial and academic areas. Increasing demand for innovation has led to increased research and new products, addressing new challenges and creating profitable opportunities. However, despite a number of reviews and surveys on wearable computing, a study outlining how this area has recently evolved, which provides a broad and objective view of the main topics addressed by scientists, is lacking. The systematic review of literature presented in this paper investigates recent trends in wearable computing studies, taking into account a set of constraints applied to relevant studies over a window of ten years. The extracted articles were considered as a means to extract valuable information, creating a useful data set to represent the current status. Results of this study faithfully portray evolving interests in wearable devices. The analysis conducted here involving studies made over the past ten years allows evaluation of the areas, research focus, and technologies that are currently at the forefront of wearable device development. Conclusions presented in this review aim to assist scientists to better perceive recent demand trends and how wearable technology can further evolve. Finally, this study should assist in outlining the next steps in current and future development.
Submitted 27 November, 2020;
originally announced November 2020.
-
Circuit-based quantum random access memory for classical data with continuous amplitudes
Authors:
Tiago M. L. de Veras,
Ismael C. S. de Araujo,
Daniel K. Park,
Adenilton J. da Silva
Abstract:
Loading data in a quantum device is required in several quantum computing applications. Without an efficient loading procedure, the cost to initialize the algorithms can dominate the overall computational cost. A circuit-based quantum random access memory named FF-QRAM can load M n-bit patterns with computational cost O(CMn) to load continuous data where C depends on the data distribution. In this work, we propose a strategy to load continuous data without post-selection with computational cost O(Mn). The proposed method is based on the probabilistic quantum memory, a strategy to load binary data in quantum devices, and the FF-QRAM using standard quantum gates, and is suitable for noisy intermediate-scale quantum computers.
Submitted 16 November, 2020;
originally announced November 2020.
-
Learning Task Space Actions for Bipedal Locomotion
Authors:
Helei Duan,
Jeremy Dao,
Kevin Green,
Taylor Apgar,
Alan Fern,
Jonathan Hurst
Abstract:
Recent work has demonstrated the success of reinforcement learning (RL) for training bipedal locomotion policies for real robots. This prior work, however, has focused on learning joint-coordination controllers based on an objective of following joint trajectories produced by already available controllers. As such, it is difficult to train these approaches to achieve higher-level goals of legged locomotion, such as simply specifying the desired end-effector foot movement or ground reaction forces. In this work, we propose an approach for integrating knowledge of the robot system into RL to allow for learning at the level of task space actions in terms of feet setpoints. In particular, we integrate learning a task space policy with a model-based inverse dynamics controller, which translates task space actions into joint-level controls. With this natural action space for learning locomotion, the approach is more sample efficient and produces desired task space dynamics compared to learning purely joint space actions. We demonstrate the approach in simulation and also show that the learned policies are able to transfer to the real bipedal robot Cassie. This result encourages further research towards incorporating bipedal control techniques into the structure of the learning process to enable dynamic behaviors.
Submitted 5 May, 2021; v1 submitted 9 November, 2020;
originally announced November 2020.
-
A Software Architecture for Autonomous Vehicles: Team LRM-B Entry in the First CARLA Autonomous Driving Challenge
Authors:
Luis Alberto Rosero,
Iago Pacheco Gomes,
Júnior Anderson Rodrigues da Silva,
Tiago Cesar dos Santos,
Angelica Tiemi Mizuno Nakamura,
Jean Amaro,
Denis Fernando Wolf,
Fernando Santos Osório
Abstract:
The objective of the first CARLA autonomous driving challenge was to deploy autonomous driving systems to deal with complex traffic scenarios where all participants faced the same challenging traffic situations. According to the organizers, this competition emerged as a way to democratize and accelerate the research and development of autonomous vehicles around the world using the CARLA simulator, contributing to the development of the autonomous vehicle area. This paper therefore presents the architecture design for the navigation of an autonomous vehicle in a simulated urban environment that attempts to commit the fewest traffic infractions, using as its baseline the original architecture of the platform for autonomous navigation CaRINA 2. Our agent traveled in simulated scenarios for several hours, demonstrating its capabilities, winning three out of the four tracks of the challenge, and being ranked second in the remaining track.
Our architecture was designed to meet the requirements of the CARLA Autonomous Driving Challenge and has components for obstacle detection using 3D point clouds, traffic sign detection and classification employing Convolutional Neural Networks (CNN) and depth information, risk assessment with collision detection using short-term motion prediction, decision-making with a Markov Decision Process (MDP), and control using Model Predictive Control (MPC).
Submitted 23 October, 2020;
originally announced October 2020.
-
Learning Spring Mass Locomotion: Guiding Policies with a Reduced-Order Model
Authors:
Kevin Green,
Yesh Godse,
Jeremy Dao,
Ross L. Hatton,
Alan Fern,
Jonathan Hurst
Abstract:
In this paper, we describe an approach to achieve dynamic legged locomotion on physical robots which combines existing methods for control with reinforcement learning. Specifically, our goal is a control hierarchy in which highest-level behaviors are planned through reduced-order models, which describe the fundamental physics of legged locomotion, and lower level controllers utilize a learned policy that can bridge the gap between the idealized, simple model and the complex, full order robot. The high-level planner can use a model of the environment and be task specific, while the low-level learned controller can execute a wide range of motions so that it applies to many different tasks. In this letter we describe this learned dynamic walking controller and show that a range of walking motions from reduced-order models can be used as the command and primary training signal for learned policies. The resulting policies do not attempt to naively track the motion (as a traditional trajectory tracking controller would) but instead balance immediate motion tracking with long term stability. The resulting controller is demonstrated on a human scale, unconstrained, untethered bipedal robot at speeds up to 1.2 m/s. This letter builds the foundation of a generic, dynamic learned walking controller that can be applied to many different tasks.
Submitted 11 March, 2021; v1 submitted 21 October, 2020;
originally announced October 2020.
-
COMET-ATOMIC 2020: On Symbolic and Neural Commonsense Knowledge Graphs
Authors:
Jena D. Hwang,
Chandra Bhagavatula,
Ronan Le Bras,
Jeff Da,
Keisuke Sakaguchi,
Antoine Bosselut,
Yejin Choi
Abstract:
Recent years have brought about a renewed interest in commonsense representation and reasoning in the field of natural language understanding. The development of new commonsense knowledge graphs (CSKG) has been central to these advances as their diverse facts can be used and referenced by machine learning models for tackling new and challenging tasks. At the same time, there remain questions about the quality and coverage of these resources due to the massive scale required to comprehensively encompass general commonsense knowledge.
In this work, we posit that manually constructed CSKGs will never achieve the coverage necessary to be applicable in all situations encountered by NLP agents. Therefore, we propose a new evaluation framework for testing the utility of KGs based on how effectively implicit knowledge representations can be learned from them.
With this new goal, we propose ATOMIC 2020, a new CSKG of general-purpose commonsense knowledge containing knowledge that is not readily available in pretrained language models. We evaluate its properties in comparison with other leading CSKGs, performing the first large-scale pairwise study of commonsense knowledge resources. Next, we show that ATOMIC 2020 is better suited for training knowledge models that can generate accurate, representative knowledge for new, unseen entities and events. Finally, through human evaluation, we show that the few-shot performance of GPT-3 (175B parameters), while impressive, remains ~12 absolute points lower than a BART-based knowledge model trained on ATOMIC 2020 despite using over 430x fewer parameters.
Submitted 16 December, 2021; v1 submitted 12 October, 2020;
originally announced October 2020.
-
Detecting soccer balls with reduced neural networks: a comparison of multiple architectures under constrained hardware scenarios
Authors:
Douglas De Rizzo Meneghetti,
Thiago Pedro Donadon Homem,
Jonas Henrique Renolfi de Oliveira,
Isaac Jesus da Silva,
Danilo Hernani Perico,
Reinaldo Augusto da Costa Bianchi
Abstract:
Object detection techniques that achieve state-of-the-art detection accuracy employ convolutional neural networks, implemented to have optimal performance in graphics processing units. Some hardware systems, such as mobile robots, operate under constrained hardware situations, but still benefit from object detection capabilities. Multiple network models have been proposed, achieving comparable accuracy with reduced architectures and leaner operations. Motivated by the need to create an object detection system for a soccer team of mobile robots, this work provides a comparative study of recent proposals of neural networks targeted towards constrained hardware environments, in the specific task of soccer ball detection. We train multiple open implementations of MobileNetV2 and MobileNetV3 models with different underlying architectures, as well as YOLOv3, TinyYOLOv3, YOLOv4 and TinyYOLOv4 in an annotated image data set captured using a mobile robot. We then report their mean average precision on a test data set and their inference times in videos of different resolutions, under constrained and unconstrained hardware configurations. Results show that MobileNetV3 models have a good trade-off between mAP and inference time in constrained scenarios only, while MobileNetV2 with high width multipliers are appropriate for server-side inference. YOLO models in their official implementations are not suitable for inference in CPUs.
Submitted 21 February, 2021; v1 submitted 28 September, 2020;
originally announced September 2020.
-
YNU-HPCC at SemEval-2020 Task 11: LSTM Network for Detection of Propaganda Techniques in News Articles
Authors:
Jiaxu Dao,
Jin Wang,
Xuejie Zhang
Abstract:
This paper summarizes our studies on propaganda detection techniques for news articles in the SemEval-2020 task 11. This task is divided into the SI and TC subtasks. We implemented the GloVe word representation, the BERT pretraining model, and the LSTM model architecture to accomplish this task. Our approach achieved good results for both the SI and TC subtasks. The macro-F1-score for the SI subtask is 0.406, and the micro-F1-score for the TC subtask is 0.505. Our method significantly outperforms the officially released baseline method, and the SI and TC subtasks rank 17th and 22nd, respectively, for the test set. This paper also compares the performances of different deep learning model architectures, such as the Bi-LSTM, LSTM, BERT, and XGBoost models, on the detection of news promotion techniques. The code of this paper is available at: https://github.com/daojiaxu/semeval_11.
Submitted 25 August, 2020; v1 submitted 23 August, 2020;
originally announced August 2020.