Search | arXiv e-print repository

Ego-Foresight: Agent Visuomotor Prediction as Regularization for RL

Authors: Manuel S. Nunes, Atabak Dehban, Yiannis Demiris, José Santos-Victor

Abstract: Despite the significant advancements in Deep Reinforcement Learning (RL) observed in the last decade, the amount of training experience necessary to learn effective policies remains one of the primary concerns both in simulated and real environments. Looking to solve this issue, previous work has shown that improved training efficiency can be achieved by separately modeling agent and environment,… ▽ More Despite the significant advancements in Deep Reinforcement Learning (RL) observed in the last decade, the amount of training experience necessary to learn effective policies remains one of the primary concerns both in simulated and real environments. Looking to solve this issue, previous work has shown that improved training efficiency can be achieved by separately modeling agent and environment, but usually requiring a supervisory agent mask. In contrast to RL, humans can perfect a new skill from a very small number of trials and in most cases do so without a supervisory signal, making neuroscientific studies of human development a valuable source of inspiration for RL. In particular, we explore the idea of motor prediction, which states that humans develop an internal model of themselves and of the consequences that their motor commands have on the immediate sensory inputs. Our insight is that the movement of the agent provides a cue that allows the duality between agent and environment to be learned. To instantiate this idea, we present Ego-Foresight, a self-supervised method for disentangling agent and environment based on motion and prediction. Our main finding is that visuomotor prediction of the agent provides regularization to the RL algorithm, by encouraging the actions to stay within predictable bounds. To test our approach, we first study the ability of our model to visually predict agent movement irrespective of the environment, in real-world robotic interactions. Then, we integrate Ego-Foresight with a model-free RL algorithm to solve simulated robotic manipulation tasks, showing an average improvement of 23% in efficiency and 8% in performance. △ Less

Submitted 27 May, 2024; originally announced July 2024.

Comments: 9 pages, 9 figures, 3 tables, conference

arXiv:2209.02384 [pdf, other]

Data Centred Intelligent Geosciences: Research Agenda and Opportunities, Position Paper

Authors: Aderson Farias do Nascimento, Martin A. Musicante, Umberto Souza da Costa, Bruno M. Carvalho, Marcus Alexandre Nunes, Genoveva Vargas-Solar

Abstract: This paper describes and discusses our vision to develop and reason about best practices and novel ways of curating data-centric geosciences knowledge (data, experiments, models, methods, conclusions, and interpretations). This knowledge is produced from applying statistical modelling, Machine Learning, and modern data analytics methods on geo-data collections. The problems address open methodolog… ▽ More This paper describes and discusses our vision to develop and reason about best practices and novel ways of curating data-centric geosciences knowledge (data, experiments, models, methods, conclusions, and interpretations). This knowledge is produced from applying statistical modelling, Machine Learning, and modern data analytics methods on geo-data collections. The problems address open methodological questions in model building, models' assessment, prediction, and forecasting workflows. △ Less

Submitted 20 August, 2022; originally announced September 2022.

arXiv:2101.03564 [pdf, other]

Cybersecurity of Industrial Cyber-Physical Systems: A Review

Authors: Hakan Kayan, Matthew Nunes, Omer Rana, Pete Burnap, Charith Perera

Abstract: Industrial cyber-physical systems (ICPSs) manage critical infrastructures by controlling the processes based on the "physics" data gathered by edge sensor networks. Recent innovations in ubiquitous computing and communication technologies have prompted the rapid integration of highly interconnected systems to ICPSs. Hence, the "security by obscurity" principle provided by air-gap** is no longer… ▽ More Industrial cyber-physical systems (ICPSs) manage critical infrastructures by controlling the processes based on the "physics" data gathered by edge sensor networks. Recent innovations in ubiquitous computing and communication technologies have prompted the rapid integration of highly interconnected systems to ICPSs. Hence, the "security by obscurity" principle provided by air-gap** is no longer followed. As the interconnectivity in ICPSs increases, so does the attack surface. Industrial vulnerability assessment reports have shown that a variety of new vulnerabilities have occurred due to this transition while the most common ones are related to weak boundary protection. Although there are existing surveys in this context, very little is mentioned regarding these reports. This paper bridges this gap by defining and reviewing ICPSs from a cybersecurity perspective. In particular, multi-dimensional adaptive attack taxonomy is presented and utilized for evaluating real-life ICPS cyber incidents. We also identify the general shortcomings and highlight the points that cause a gap in existing literature while defining future research directions. △ Less

Submitted 10 January, 2021; originally announced January 2021.

Comments: 32 pages, 10 figures

arXiv:2008.00077 [pdf, other]

doi 10.1007/978-3-030-61377-8

Neural Architecture Search in Graph Neural Networks

Authors: Matheus Nunes, Gisele L. Pappa

Abstract: Performing analytical tasks over graph data has become increasingly interesting due to the ubiquity and large availability of relational information. However, unlike images or sentences, there is no notion of sequence in networks. Nodes (and edges) follow no absolute order, and it is hard for traditional machine learning (ML) algorithms to recognize a pattern and generalize their predictions on th… ▽ More Performing analytical tasks over graph data has become increasingly interesting due to the ubiquity and large availability of relational information. However, unlike images or sentences, there is no notion of sequence in networks. Nodes (and edges) follow no absolute order, and it is hard for traditional machine learning (ML) algorithms to recognize a pattern and generalize their predictions on this type of data. Graph Neural Networks (GNN) successfully tackled this problem. They became popular after the generalization of the convolution concept to the graph domain. However, they possess a large number of hyperparameters and their design and optimization is currently hand-made, based on heuristics or empirical intuition. Neural Architecture Search (NAS) methods appear as an interesting solution to this problem. In this direction, this paper compares two NAS methods for optimizing GNN: one based on reinforcement learning and a second based on evolutionary algorithms. Results consider 7 datasets over two search spaces and show that both methods obtain similar accuracies to a random search, raising the question of how many of the search space dimensions are actually relevant to the problem. △ Less

Submitted 31 July, 2020; originally announced August 2020.

arXiv:2005.02217 [pdf, other]

Long short-term memory networks and laglasso for bond yield forecasting: Pee** inside the black box

Authors: Manuel Nunes, Enrico Gerding, Frank McGroarty, Mahesan Niranjan

Abstract: Modern decision-making in fixed income asset management benefits from intelligent systems, which involve the use of state-of-the-art machine learning models and appropriate methodologies. We conduct the first study of bond yield forecasting using long short-term memory (LSTM) networks, validating its potential and identifying its memory advantage. Specifically, we model the 10-year bond yield usin… ▽ More Modern decision-making in fixed income asset management benefits from intelligent systems, which involve the use of state-of-the-art machine learning models and appropriate methodologies. We conduct the first study of bond yield forecasting using long short-term memory (LSTM) networks, validating its potential and identifying its memory advantage. Specifically, we model the 10-year bond yield using univariate LSTMs with three input sequences and five forecasting horizons. We compare those with multilayer perceptrons (MLP), univariate and with the most relevant features. To demystify the notion of black box associated with LSTMs, we conduct the first internal study of the model. To this end, we calculate the LSTM signals through time, at selected locations in the memory cell, using sequence-to-sequence architectures, uni and multivariate. We then proceed to explain the states' signals using exogenous information, for what we develop the LSTM-LagLasso methodology. The results show that the univariate LSTM model with additional memory is capable of achieving similar results as the multivariate MLP using macroeconomic and market information. Furthermore, shorter forecasting horizons require smaller input sequences and vice-versa. The most remarkable property found consistently in the LSTM signals, is the activation/deactivation of units through time, and the specialisation of units by yield range or feature. Those signals are complex but can be explained by exogenous variables. Additionally, some of the relevant features identified via LSTM-LagLasso are not commonly used in forecasting models. In conclusion, our work validates the potential of LSTMs and methodologies for bonds, providing additional tools for financial practitioners. △ Less

Submitted 5 May, 2020; originally announced May 2020.

Comments: 27 pages, 16 figures

arXiv:1910.02564 [pdf, other]

Action-conditioned Benchmarking of Robotic Video Prediction Models: a Comparative Study

Authors: Manuel Serra Nunes, Atabak Dehban, Plinio Moreno, José Santos-Victor

Abstract: A defining characteristic of intelligent systems is the ability to make action decisions based on the anticipated outcomes. Video prediction systems have been demonstrated as a solution for predicting how the future will unfold visually, and thus, many models have been proposed that are capable of predicting future frames based on a history of observed frames~(and sometimes robot actions). However… ▽ More A defining characteristic of intelligent systems is the ability to make action decisions based on the anticipated outcomes. Video prediction systems have been demonstrated as a solution for predicting how the future will unfold visually, and thus, many models have been proposed that are capable of predicting future frames based on a history of observed frames~(and sometimes robot actions). However, a comprehensive method for determining the fitness of different video prediction models at guiding the selection of actions is yet to be developed. Current metrics assess video prediction models based on human perception of frame quality. In contrast, we argue that if these systems are to be used to guide action, necessarily, the actions the robot performs should be encoded in the predicted frames. In this paper, we are proposing a new metric to compare different video prediction models based on this argument. More specifically, we propose an action inference system and quantitatively rank different models based on how well we can infer the robot actions from the predicted frames. Our extensive experiments show that models with high perceptual scores can perform poorly in the proposed action inference tests and thus, may not be suitable options to be used in robot planning systems. △ Less

Submitted 6 October, 2019; originally announced October 2019.

arXiv:1905.12804

A Music Classification Model based on Metric Learning and Feature Extraction from MP3 Audio Files

Authors: Angelo C. Mendes da Silva, Mauricio A. Nunes, Raul Fonseca Neto

Abstract: The development of models for learning music similarity and feature extraction from audio media files is an increasingly important task for the entertainment industry. This work proposes a novel music classification model based on metric learning and feature extraction from MP3 audio files. The metric learning process considers the learning of a set of parameterized distances employing a structure… ▽ More The development of models for learning music similarity and feature extraction from audio media files is an increasingly important task for the entertainment industry. This work proposes a novel music classification model based on metric learning and feature extraction from MP3 audio files. The metric learning process considers the learning of a set of parameterized distances employing a structured prediction approach from a set of MP3 audio files containing several music genres. The main objective of this work is to make possible learning a personalized metric for each customer. To extract the acoustic information we use the Mel-Frequency Cepstral Coefficient (MFCC) and make a dimensionality reduction with the use of Principal Components Analysis. We attest the model validity performing a set of experiments and comparing the training and testing results with baseline algorithms, such as K-means and Soft Margin Linear Support Vector Machine (SVM). Experiments show promising results and encourage the future development of an online version of the learning model. △ Less

Submitted 17 September, 2019; v1 submitted 29 May, 2019; originally announced May 2019.

Comments: In a review process, I found some errors and made some changes in methodology that improved my results. Once I finish the experiments, I will upload the new version

arXiv:1712.08917 [pdf, ps, other]

Building a Sentiment Corpus of Tweets in Brazilian Portuguese

Authors: Henrico Bertini Brum, Maria das Graças Volpe Nunes

Abstract: The large amount of data available in social media, forums and websites motivates researches in several areas of Natural Language Processing, such as sentiment analysis. The popularity of the area due to its subjective and semantic characteristics motivates research on novel methods and approaches for classification. Hence, there is a high demand for datasets on different domains and different lan… ▽ More The large amount of data available in social media, forums and websites motivates researches in several areas of Natural Language Processing, such as sentiment analysis. The popularity of the area due to its subjective and semantic characteristics motivates research on novel methods and approaches for classification. Hence, there is a high demand for datasets on different domains and different languages. This paper introduces TweetSentBR, a sentiment corpora for Brazilian Portuguese manually annotated with 15.000 sentences on TV show domain. The sentences were labeled in three classes (positive, neutral and negative) by seven annotators, following literature guidelines for ensuring reliability on the annotation. We also ran baseline experiments on polarity classification using three machine learning methods, reaching 80.99% on F-Measure and 82.06% on accuracy in binary classification, and 59.85% F-Measure and 64.62% on accuracy on three point classification. △ Less

Submitted 24 December, 2017; originally announced December 2017.

Comments: Accepted for publication in 11th International Conference on Language Resources and Evaluation (LREC 2018)

arXiv:1704.02963 [pdf, other]

Exploring Word Embeddings for Unsupervised Textual User-Generated Content Normalization

Authors: Thales Felipe Costa Bertaglia, Maria das Graças Volpe Nunes

Abstract: Text normalization techniques based on rules, lexicons or supervised training requiring large corpora are not scalable nor domain interchangeable, and this makes them unsuitable for normalizing user-generated content (UGC). Current tools available for Brazilian Portuguese make use of such techniques. In this work we propose a technique based on distributed representation of words (or word embeddin… ▽ More Text normalization techniques based on rules, lexicons or supervised training requiring large corpora are not scalable nor domain interchangeable, and this makes them unsuitable for normalizing user-generated content (UGC). Current tools available for Brazilian Portuguese make use of such techniques. In this work we propose a technique based on distributed representation of words (or word embeddings). It generates continuous numeric vectors of high-dimensionality to represent words. The vectors explicitly encode many linguistic regularities and patterns, as well as syntactic and semantic word relationships. Words that share semantic similarity are represented by similar vectors. Based on these features, we present a totally unsupervised, expandable and language and domain independent method for learning normalization lexicons from word embeddings. Our approach obtains high correction rate of orthographic errors and internet slang in product reviews, outperforming the current available tools for Brazilian Portuguese. △ Less

Submitted 10 April, 2017; originally announced April 2017.

Comments: Published in Proceedings of the 2nd Workshop on Noisy User-generated Text, 9 pages

arXiv:1005.3063 [pdf, ps, other]

doi 10.1007/s11192-012-0630-z

Good practices for a literature survey are not followed by authors while preparing scientific manuscripts

Authors: D. R. Amancio, M. G. V. Nunes, O. N. Oliveira Jr., L. da F. Costa

Abstract: The number of citations received by authors in scientific journals has become a major parameter to assess individual researchers and the journals themselves through the impact factor. A fair assessment therefore requires that the criteria for selecting references in a given manuscript should be unbiased with respect to the authors or the journals cited. In this paper, we advocate that authors shou… ▽ More The number of citations received by authors in scientific journals has become a major parameter to assess individual researchers and the journals themselves through the impact factor. A fair assessment therefore requires that the criteria for selecting references in a given manuscript should be unbiased with respect to the authors or the journals cited. In this paper, we advocate that authors should follow two mandatory principles to select papers (later reflected in the list of references) while studying the literature for a given research: i) consider similarity of content with the topics investigated, lest very related work should be reproduced or ignored; ii) perform a systematic search over the network of citations including seminal or very related papers. We use formalisms of complex networks for two datasets of papers from the arXiv repository to show that neither of these two criteria is fulfilled in practice. △ Less

Submitted 22 September, 2012; v1 submitted 17 May, 2010; originally announced May 2010.

Journal ref: Scientometrics, v. 90, p. 2, (2012)

Showing 1–10 of 10 results for author: Nunes, M