Showing 1–35 of 35 results for author: Alexandre, L A

Abstract: Recent advancements in imitation learning have been largely fueled by the integration of sequence models, which provide a structured flow of information to effectively mimic task behaviours. Currently, Decision Transformer (DT) and, subsequently, the Hierarchical Decision Transformer (HDT) presented Transformer-based approaches to learn task policies. Recently, the Mamba architecture has been shown to outperform Transformers across various task domains. In this work, we introduce two novel methods, Decision Mamba (DM) and Hierarchical Decision Mamba (HDM), aimed at enhancing the performance of the Transformer models. Through extensive experimentation across diverse environments such as OpenAI Gym and D4RL, leveraging varying demonstration data sets, we demonstrate the superiority of Mamba models over their Transformer counterparts in a majority of tasks. Results show that HDM outperforms other methods in most settings. The code can be found at https://github.com/meowatthemoon/HierarchicalDecisionMamba.

Submitted 13 May, 2024; originally announced May 2024.

arXiv:2403.15569 [pdf, other]

Music to Dance as Language Translation using Sequence Models

Abstract: Synthesising appropriate choreographies from music remains an open problem. We introduce MDLT, a novel approach that frames the choreography generation problem as a translation task. Our method leverages an existing data set to learn to translate sequences of audio into corresponding dance poses. We present two variants of MDLT: one utilising the Transformer architecture and the other employing the Mamba architecture. We train our method on AIST++ and PhantomDance data sets to teach a robotic arm to dance, but our method can be applied to a full humanoid robot. Evaluation metrics, including Average Joint Error and Frechet Inception Distance, consistently demonstrate that, when given a piece of music, MDLT excels at producing realistic and high-quality choreography. The code can be found at github.com/meowatthemoon/MDLT.

Submitted 22 March, 2024; originally announced March 2024.

arXiv:2309.12484 [pdf, other]

Robust Energy Consumption Prediction with a Missing Value-Resilient Metaheuristic-based Neural Network in Mobile App Development

Authors: Seyed Jalaleddin Mousavirad, Luís A. Alexandre

Abstract: Energy consumption is a fundamental concern in mobile application development, bearing substantial significance for both developers and end-users. The main objective of this research is to propose a novel neural network-based framework, enhanced by a metaheuristic approach, to achieve robust energy prediction in the context of mobile app development. The metaheuristic approach here aims to achieve two goals: 1) identifying suitable learning algorithms and their corresponding hyperparameters, and 2) determining the optimal number of layers and neurons within each layer. Moreover, due to limitations in accessing certain aspects of a mobile phone, there might be missing data in the data set, and the proposed framework can handle this. In addition, we conducted an optimal algorithm selection strategy, employing 13 base and advanced metaheuristic algorithms, to identify the best algorithm based on accuracy and resistance to missing values. The representation in our proposed metaheuristic algorithm is variable-size, meaning that the length of the candidate solutions changes over time. We compared the algorithms based on the architecture found by each algorithm at different levels of missing values, accuracy, F-measure, and stability analysis. Additionally, we conducted a Wilcoxon signed-rank test for statistical comparison of the results. The extensive experiments show that our proposed approach significantly improves energy consumption prediction. Particularly, the JADE algorithm, a variant of Differential Evolution (DE), DE, and the Covariance Matrix Adaptation Evolution Strategy deliver superior results under various conditions and across different missing value levels.

Submitted 4 June, 2024; v1 submitted 21 September, 2023; originally announced September 2023.

Comments: This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible

arXiv:2306.09931 [pdf, other]

A Metaheuristic-based Machine Learning Approach for Energy Prediction in Mobile App Development

Authors: Seyed Jalaleddin Mousavirad, Luís A. Alexandre

Abstract: Energy consumption plays a vital role in mobile App development for developers and end-users, and it is considered one of the most crucial factors for purchasing a smartphone. In addition, in terms of sustainability, it is essential to find methods to reduce the energy consumption of mobile devices since the extensive use of billions of smartphones worldwide significantly impacts the environment. Despite the existence of several energy-efficient programming practices in Android, the leading mobile ecosystem, machine learning-based energy prediction algorithms for mobile App development have yet to be reported. Therefore, this paper proposes a histogram-based gradient boosting classification machine (HGBC), boosted by a metaheuristic approach, for energy prediction in mobile App development. Our metaheuristic approach is responsible for two issues. First, it finds redundant and irrelevant features without any noticeable change in performance. Second, it performs a hyper-parameter tuning for the HGBC algorithm. Since our proposed metaheuristic approach is algorithm-independent, we selected 12 algorithms for the search strategy to find the optimal search algorithm. Our finding shows that a success-history-based parameter adaption for differential evolution with linear population size (L-SHADE) offers the best performance. It can improve performance and decrease the number of features effectively. Our extensive set of experiments clearly shows that our proposed approach can provide significant results for energy consumption prediction.

Submitted 16 June, 2023; originally announced June 2023.

Comments: This paper is submitted to a related journal

arXiv:2303.16938 [pdf, other]

Are Neural Architecture Search Benchmarks Well Designed? A Deeper Look Into Operation Importance

Authors: Vasco Lopes, Bruno Degardin, Luís A. Alexandre

Abstract: Neural Architecture Search (NAS) benchmarks significantly improved the capability of developing and comparing NAS methods while at the same time drastically reduced the computational overhead by providing meta-information about thousands of trained neural networks. However, tabular benchmarks have several drawbacks that can hinder fair comparisons and provide unreliable results. These usually focus on providing a small pool of operations in heavily constrained search spaces -- usually cell-based neural networks with pre-defined outer-skeletons. In this work, we conducted an empirical analysis of the widely used NAS-Bench-101, NAS-Bench-201 and TransNAS-Bench-101 benchmarks in terms of their generability and how different operations influence the performance of the generated architectures. We found that only a subset of the operation pool is required to generate architectures close to the upper-bound of the performance range. Also, the performance distribution is negatively skewed, having a higher density of architectures in the upper-bound range. We consistently found convolution layers to have the highest impact on the architecture's performance, and that specific combination of operations favors top-scoring architectures. These findings shed insights on the correct evaluation and comparison of NAS methods using NAS benchmarks, showing that directly searching on NAS-Bench-201, ImageNet16-120 and TransNAS-Bench-101 produces more reliable results than searching only on CIFAR-10. Furthermore, with this work we provide suggestions for future benchmark evaluations and design. The code used to conduct the evaluations is available at https://github.com/VascoLopes/NAS-Benchmark-Evaluation.

Submitted 29 March, 2023; originally announced March 2023.

Comments: 15 pages; 11 figures; 10 tables

arXiv:2303.12876 [pdf, other]

A Survey on Task Allocation and Scheduling in Robotic Network Systems

Abstract: Cloud Robotics is helping to create a new generation of robots that leverage the nearly unlimited resources of large data centers (i.e., the cloud), overcoming the limitations imposed by on-board resources. Different processing power, capabilities, resource sizes, energy consumption, and so forth, make scheduling and task allocation critical components. The basic idea of task allocation and scheduling is to optimize performance by minimizing completion time, energy consumption, delays between two consecutive tasks, along with others, and maximizing resource utilization, number of completed tasks in a given time interval, and suchlike. In the past, several works have addressed various aspects of task allocation and scheduling. In this paper, we provide a comprehensive overview of task allocation and scheduling strategies and related metrics suitable for robotic network cloud systems. We discuss the issues related to allocation and scheduling methods and the limitations that need to be overcome. The literature review is organized according to three different viewpoints: Architectures and Applications, Methods and Parameters. In addition, the limitations of each method are highlighted for future research.

Submitted 22 March, 2023; originally announced March 2023.

arXiv:2303.11191 [pdf, other]

A Survey of Demonstration Learning

Abstract: With the fast improvement of machine learning, reinforcement learning (RL) has been used to automate human tasks in different areas. However, training such agents is difficult and restricted to expert users. Moreover, it is mostly limited to simulation environments due to the high cost and safety concerns of interactions in the real world. Demonstration Learning is a paradigm in which an agent learns to perform a task by imitating the behavior of an expert shown in demonstrations. It is a relatively recent area in machine learning, but it is gaining significant traction due to having tremendous potential for learning complex behaviors from demonstrations. Learning from demonstration accelerates the learning process by improving sample efficiency, while also reducing the effort of the programmer. Due to learning without interacting with the environment, demonstration learning would allow the automation of a wide range of real world applications such as robotics and healthcare. This paper provides a survey of demonstration learning, where we formally introduce the demonstration problem along with its main challenges and provide a comprehensive overview of the process of learning from demonstrations from the creation of the demonstration data set, to learning methods from demonstrations, and optimization by combining demonstration learning with different machine learning methods. We also review the existing benchmarks and identify their strengths and limitations. Additionally, we discuss the advantages and disadvantages of the paradigm as well as its main applications. Lastly, we discuss our perspective on open problems and research directions for this rapidly growing field.

Submitted 20 March, 2023; originally announced March 2023.

Comments: 35 pages, 9 figures

arXiv:2212.06313 [pdf, other]

Metaheuristic-based Energy-aware Image Compression for Mobile App Development

Authors: Seyed Jalaleddin Mousavirad, Luís A Alexandre

Abstract: The JPEG standard is widely used in different image processing applications. One of the main components of the JPEG standard is the quantisation table (QT) since it plays a vital role in the image properties such as image quality and file size. In recent years, several efforts based on population-based metaheuristic (PBMH) algorithms have been performed to find the proper QT(s) for a specific image, although they do not take into consideration the user opinion in advance. Take an Android developer as an example, who prefers a small-size image, while the optimisation process results in a high-quality image, leading to a huge file size. Another pitfall of the current works is a lack of comprehensive coverage, meaning that the QT(s) cannot provide all possible combinations of file size and quality. Therefore, this paper aims to propose three distinct contributions. First, to include the user opinion in the compression process, the file size of the output image can be controlled by a user in advance. To this end, we propose a novel objective function for population-based JPEG image compression. Second, to tackle the lack of comprehensive coverage, we suggest a novel representation. Our proposed representation can not only provide more comprehensive coverage but also find the proper value for the quality factor for a specific image without any background knowledge. Both changes in representation and objective function are independent of the search strategies and can be used with any type of population-based metaheuristic (PBMH) algorithm. Therefore, as the third contribution, we also provide a comprehensive benchmark on 22 state-of-the-art and recently-introduced PBMH algorithms. Our extensive experiments on different benchmark images and in terms of different criteria show that our novel formulation for JPEG image compression can work effectively.

Submitted 20 April, 2023; v1 submitted 12 December, 2022; originally announced December 2022.

Comments: This paper is submitted to the related journal

arXiv:2209.10447 [pdf, other]

Hierarchical Decision Transformer

Abstract: Sequence models in reinforcement learning require task knowledge to estimate the task policy. This paper presents a hierarchical algorithm for learning a sequence model from demonstrations. The high-level mechanism guides the low-level controller through the task by selecting sub-goals for the latter to reach. This sequence replaces the returns-to-go of previous methods, improving its performance overall, especially in tasks with longer episodes and scarcer rewards. We validate our method in multiple tasks of OpenAIGym, D4RL and RoboMimic benchmarks. Our method outperforms the baselines in eight out of ten tasks of varied horizons and reward frequencies without prior task knowledge, showing the advantages of the hierarchical model approach for learning from demonstrations using a sequence model.
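
Note: the returns-to-go signal that the abstract says is replaced by sub-goals can be illustrated with a minimal sketch. It assumes the usual undiscounted definition (the sum of rewards from the current step to the end of the episode); the function name `returns_to_go` and the reward lists are hypothetical and are not taken from the paper or its code.

```python
from typing import List


def returns_to_go(rewards: List[float]) -> List[float]:
    """Return, for each time step, the undiscounted sum of rewards from that step onward."""
    rtg: List[float] = []
    running = 0.0
    # Walk the episode backwards so each step adds its own reward to the tail sum.
    for reward in reversed(rewards):
        running += reward
        rtg.append(running)
    rtg.reverse()
    return rtg


if __name__ == "__main__":
    # Hypothetical sparse-reward episode: the only reward arrives at the last step.
    print(returns_to_go([0.0, 0.0, 1.0]))
```

With sparse rewards every step carries the same value, which suggests why such a signal can be uninformative in long episodes with scarce rewards.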

Submitted 21 September, 2022; originally announced September 2022.

arXiv:2209.04374 [pdf, other]

Energy-Aware JPEG Image Compression: A Multi-Objective Approach

Authors: Seyed Jalaleddin Mousavirad, Luís A. Alexandre

Abstract: Customer satisfaction is crucially affected by energy consumption in mobile devices. One of the most energy-consuming parts of an application is images. While different images with different quality consume different amounts of energy, there are no straightforward methods to calculate the energy consumption of an operation in a typical image. This paper, first, investigates that there is a correlation between energy consumption and image quality as well as image file size. Therefore, these two can be considered as a proxy for energy consumption. Then, we propose a multi-objective strategy to enhance image quality and reduce image file size based on the quantisation tables in JPEG image compression. To this end, we have used two general multi-objective metaheuristic approaches: scalarisation and Pareto-based. Scalarisation methods find a single optimal solution based on combining different objectives, while Pareto-based techniques aim to achieve a set of solutions. In this paper, we embed our strategy into five scalarisation algorithms, including energy-aware multi-objective genetic algorithm (EnMOGA), energy-aware multi-objective particle swarm optimisation (EnMOPSO), energy-aware multi-objective differential evolution (EnMODE), energy-aware multi-objective evolutionary strategy (EnMOES), and energy-aware multi-objective pattern search (EnMOPS). Also, two Pareto-based methods, including a non-dominated sorting genetic algorithm (NSGA-II) and a reference-point-based NSGA-II (NSGA-III) are used for the embedding scheme, and two Pareto-based algorithms, EnNSGAII and EnNSGAIII, are presented. Experimental studies show that the performance of the baseline algorithm is improved by embedding the proposed strategy into metaheuristic algorithms.
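
Note: the difference between scalarisation and Pareto-based selection described in this abstract can be sketched as follows. This is only an illustration under stated assumptions: both objectives are minimised, the weights are equal, and the candidate values, the `weighted_sum` combination and all function names are hypothetical rather than the paper's objective functions or algorithms.

```python
from typing import List, Tuple

# Hypothetical candidate: (quality_loss, file_size), both to be minimised.
Solution = Tuple[float, float]


def weighted_sum(sol: Solution, w_quality: float = 0.5, w_size: float = 0.5) -> float:
    """Scalarisation: collapse both objectives into one score (weights are assumptions)."""
    return w_quality * sol[0] + w_size * sol[1]


def best_scalarised(solutions: List[Solution]) -> Solution:
    """Return the single solution with the lowest weighted-sum score."""
    return min(solutions, key=weighted_sum)


def dominates(a: Solution, b: Solution) -> bool:
    """a dominates b if it is no worse in every objective and strictly better in one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(solutions: List[Solution]) -> List[Solution]:
    """Pareto-based selection: keep every solution that no other solution dominates."""
    return [s for s in solutions if not any(dominates(o, s) for o in solutions)]


if __name__ == "__main__":
    candidates = [(0.2, 0.9), (0.4, 0.4), (0.9, 0.1), (0.6, 0.6)]
    print(best_scalarised(candidates))  # one compromise solution
    print(pareto_front(candidates))     # a set of trade-off solutions
```

The scalarised search commits to one trade-off through its weights, whereas the Pareto front leaves the choice among non-dominated trade-offs to the user.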

Submitted 9 September, 2022; originally announced September 2022.

Comments: 42 pages, this paper is submitted to the related journal

arXiv:2208.09193 [pdf, other]

Exploration, Path Planning with Obstacle and Collision Avoidance in a Dynamic Environment

Abstract: If we give a robot the task of moving an object from its current position to another location in an unknown environment, the robot must explore the map, identify all types of obstacles, and then determine the best route to complete the task. We proposed a mathematical model to find an optimal path planning that avoids collisions with all static and moving obstacles and has the minimum completion time and the minimum distance traveled. In this model, the bounding box around obstacles and robots is not considered, so the robot can move very close to the obstacles without colliding with them. We considered two types of obstacles: deterministic, which include all static obstacles such as walls that do not move and all moving obstacles whose movements have a fixed pattern, and non-deterministic, which include all obstacles whose movements can occur in any direction with some probability distribution at any time. We also consider the acceleration and deceleration of the robot to improve collision avoidance.

Submitted 19 August, 2022; originally announced August 2022.

arXiv:2208.06475 [pdf]

Guided Evolutionary Neural Architecture Search With Efficient Performance Estimation

Authors: Vasco Lopes, Miguel Santos, Bruno Degardin, Luís A. Alexandre

Abstract: Neural Architecture Search (NAS) methods have been successfully applied to image tasks with excellent results. However, NAS methods are often complex and tend to converge to local minima as soon as generated architectures seem to yield good results. This paper proposes GEA, a novel approach for guided NAS. GEA guides the evolution by exploring the search space by generating and evaluating several architectures in each generation at initialisation stage using a zero-proxy estimator, where only the highest-scoring architecture is trained and kept for the next generation. Subsequently, GEA continuously extracts knowledge about the search space without increased complexity by generating several off-springs from an existing architecture at each generation. Moreover, GEA forces exploitation of the most performant architectures by descendant generation while simultaneously driving exploration through parent mutation and favouring younger architectures to the detriment of older ones. Experimental results demonstrate the effectiveness of the proposed method, and extensive ablation studies evaluate the importance of different parameters. Results show that GEA achieves state-of-the-art results on all data sets of NAS-Bench-101, NAS-Bench-201 and TransNAS-Bench-101 benchmarks.

Submitted 22 July, 2022; originally announced August 2022.

Comments: 10 pages, 7 figures, 4 tables. arXiv admin note: substantial text overlap with arXiv:2110.15232

arXiv:2207.05620 [pdf]

LudVision -- Remote Detection of Exotic Invasive Aquatic Floral Species using Drone-Mounted Multispectral Data

Authors: António J. Abreu, Luís A. Alexandre, João A. Santos, Filippo Basso

Abstract: Remote sensing is the process of detecting and monitoring the physical characteristics of an area by measuring its reflected and emitted radiation at a distance. It is being broadly used to monitor ecosystems, mainly for their preservation. Ever-growing reports of invasive species have affected the natural balance of ecosystems. Exotic invasive species have a critical impact when introduced into new ecosystems and may lead to the extinction of native species. In this study, we focus on Ludwigia peploides, considered by the European Union as an aquatic invasive species. Its presence can negatively impact the surrounding ecosystem and human activities such as agriculture, fishing, and navigation. Our goal was to develop a method to identify the presence of the species. We used images collected by a drone-mounted multispectral sensor to achieve this, creating our LudVision data set. To identify the targeted species on the collected images, we propose a new method for detecting Ludwigia p. in multispectral images. The method is based on existing state-of-the-art semantic segmentation methods modified to handle multispectral data. The proposed method achieved a producer's accuracy of 79.9% and a user's accuracy of 95.5%.
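
Note: the two metrics quoted at the end of this abstract follow the usual remote-sensing convention, in which producer's accuracy corresponds to recall (how much of the reference class was detected) and user's accuracy corresponds to precision (how much of what was labelled as the class is correct). The sketch below assumes that convention; the pixel counts are hypothetical and are not the paper's data.

```python
def producers_accuracy(true_pos: int, false_neg: int) -> float:
    """Fraction of reference pixels of the class that the map labels correctly (recall)."""
    return true_pos / (true_pos + false_neg)


def users_accuracy(true_pos: int, false_pos: int) -> float:
    """Fraction of pixels labelled as the class that truly belong to it (precision)."""
    return true_pos / (true_pos + false_pos)


if __name__ == "__main__":
    # Hypothetical counts for a single target class.
    tp, fn, fp = 80, 20, 5
    print(f"producer's accuracy: {producers_accuracy(tp, fn):.3f}")
    print(f"user's accuracy:     {users_accuracy(tp, fp):.3f}")
```

A producer's accuracy lower than the user's accuracy, as reported above, would mean the method misses some of the target species but is rarely wrong when it does flag it.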

Submitted 13 July, 2022; v1 submitted 12 July, 2022; originally announced July 2022.

arXiv:2203.05508 [pdf, other]

Towards Less Constrained Macro-Neural Architecture Search

Abstract: Networks found with Neural Architecture Search (NAS) achieve state-of-the-art performance in a variety of tasks, out-performing human-designed networks. However, most NAS methods heavily rely on human-defined assumptions that constrain the search: architecture's outer-skeletons, number of layers, parameter heuristics and search spaces. Additionally, common search spaces consist of repeatable modules (cells) instead of fully exploring the architecture's search space by designing entire architectures (macro-search). Imposing such constraints requires deep human expertise and restricts the search to pre-defined settings. In this paper, we propose LCMNAS, a method that pushes NAS to less constrained search spaces by performing macro-search without relying on pre-defined heuristics or bounded search spaces. LCMNAS introduces three components for the NAS pipeline: i) a method that leverages information about well-known architectures to autonomously generate complex search spaces based on Weighted Directed Graphs with hidden properties, ii) an evolutionary search strategy that generates complete architectures from scratch, and iii) a mixed-performance estimation approach that combines information about architectures at initialization stage and lower fidelity estimates to infer their trainability and capacity to model complex functions. We present experiments in 13 different data sets showing that LCMNAS is capable of generating both cell and macro-based architectures with minimal GPU computation and state-of-the-art results. Moreover, we conduct extensive studies on the importance of different NAS components in both cell and macro-based settings. Code for reproducibility is public at https://github.com/VascoLopes/LCMNAS.

Submitted 6 January, 2023; v1 submitted 10 March, 2022; originally announced March 2022.

Comments: 13 pages double-column, 9 tables, 6 figures

arXiv:2201.12813 [pdf, other]

Contrastive Learning from Demonstrations

Abstract: This paper presents a framework for learning visual representations from unlabeled video demonstrations captured from multiple viewpoints. We show that these representations are applicable for imitating several robotic tasks, including pick and place. We optimize a recently proposed self-supervised learning algorithm by applying contrastive learning to enhance task-relevant information while suppressing irrelevant information in the feature embeddings. We validate the proposed method on the publicly available Multi-View Pouring and a custom Pick and Place data sets and compare it with the TCN triplet baseline. We evaluate the learned representations using three metrics: viewpoint alignment, stage classification and reinforcement learning, and in all cases the results improve when compared to state-of-the-art approaches, with the added benefit of reduced number of training iterations.

Submitted 7 November, 2022; v1 submitted 30 January, 2022; originally announced January 2022.

Journal ref: IEEE Robotic Computing, Naples, Italy, December 5-7, 2022

arXiv:2111.09378 [pdf, other]

MPF6D: Masked Pyramid Fusion 6D Pose Estimation

Authors: Nuno Pereira, Luís A. Alexandre

Abstract: Object pose estimation has multiple important applications, such as robotic grasping and augmented reality. We present a new method to estimate the 6D pose of objects that improves upon the accuracy of current proposals and can still be used in real-time. Our method uses RGB-D data as input to segment objects and estimate their pose. It uses a neural network with multiple heads to identify the objects in the scene, generate the appropriate masks and estimate the values of the translation vectors and the quaternion that represents the objects' rotation. These heads leverage a pyramid architecture used during feature extraction and feature fusion. We conduct an empirical evaluation using the two most common datasets in the area, and compare against state-of-the-art approaches, illustrating the capabilities of MPF6D. Our method can be used in real-time with its low inference time and high accuracy.

Submitted 4 February, 2022; v1 submitted 17 November, 2021; originally announced November 2021.

arXiv:2110.15232 [pdf, other]

Guided Evolution for Neural Architecture Search

Authors: Vasco Lopes, Miguel Santos, Bruno Degardin, Luís A. Alexandre

Abstract: Neural Architecture Search (NAS) methods have been successfully applied to image tasks with excellent results. However, NAS methods are often complex and tend to converge to local minima as soon as generated architectures seem to yield good results. In this paper, we propose G-EA, a novel approach for guided evolutionary NAS. The rationale behind G-EA is to explore the search space by generating and evaluating several architectures in each generation at initialization stage using a zero-proxy estimator, where only the highest-scoring network is trained and kept for the next generation. This evaluation at initialization stage allows continuous extraction of knowledge from the search space without increasing computation, thus allowing the search to be efficiently guided. Moreover, G-EA forces exploitation of the most performant networks by descendant generation while at the same time forcing exploration by parent mutation and by favouring younger architectures to the detriment of older ones. Experimental results demonstrate the effectiveness of the proposed method, showing that G-EA achieves state-of-the-art results in NAS-Bench-201 search space in CIFAR-10, CIFAR-100 and ImageNet16-120, with mean accuracies of 93.98%, 72.12% and 45.94% respectively.

Submitted 28 October, 2021; originally announced October 2021.

Comments: Paper accepted at 35th Conference on Neural Information Processing Systems (NeurIPS) - New In ML. 9 pages, 2 figures, 1 table

arXiv:2104.12710 [pdf, other]

doi 10.1016/j.robot.2022.104144

Optimal Algorithm Allocation for Robotic Network Cloud Systems

Authors: Saeid Alirezazadeh, André Correia, Luís A. Alexandre

Abstract: A robotic network is a system with multiple robots connected by a communication network. Certain tasks that cannot be accomplished with available robotic resources are candidates for the use of cloud robotics, which overcomes the limitations of the robot network by adding to the network either local or remote servers or cloud infrastructure, to aid in computationally demanding tasks or storage. Previous studies have mainly focused on minimizing the cost of the robots in retrieving resources by knowing the resource allocation in advance. We develop a method for a robotic network cloud system that includes robots, fog and cloud nodes, to determine where each algorithm should be allocated so that the system achieves optimal performance, regardless of which robot initiates the request. We can find the minimum required memory for the robots and the optimal way to allocate the algorithms with the shortest time to complete each task. We experimentally compare our method with a state-of-the-art method, using real-world data, showing the improvements that can be obtained.

Submitted 27 May, 2022; v1 submitted 26 April, 2021; originally announced April 2021.

Comments: This work is accepted for publication in the Elsevier Journal of Robotics and Autonomous Systems. Personal use of this material is permitted. Permission from Elsevier must be obtained for all other uses

Journal ref: Robotics and Autonomous Systems, 154, August 2022, 104144

arXiv:2102.08099 [pdf, other]

doi 10.1007/978-3-030-86383-8_44

EPE-NAS: Efficient Performance Estimation Without Training for Neural Architecture Search

Authors: Vasco Lopes, Saeid Alirezazadeh, Luís A. Alexandre

Abstract: Neural Architecture Search (NAS) has shown excellent results in designing architectures for computer vision problems. NAS alleviates the need for human-defined settings by automating architecture design and engineering. However, NAS methods tend to be slow, as they require large amounts of GPU computation. This bottleneck is mainly due to the performance estimation strategy, which requires the evaluation of the generated architectures, mainly by training them, to update the sampler method. In this paper, we propose EPE-NAS, an efficient performance estimation strategy, that mitigates the problem of evaluating networks, by scoring untrained networks and creating a correlation with their trained performance. We perform this process by looking at intra and inter-class correlations of an untrained network. We show that EPE-NAS can produce a robust correlation and that by incorporating it into a simple random sampling strategy, we are able to search for competitive networks, without requiring any training, in a matter of seconds using a single GPU. Moreover, EPE-NAS is agnostic to the search method, since it focuses on the evaluation of untrained networks, making it easy to integrate into almost any NAS method.

Submitted 16 February, 2021; originally announced February 2021.

arXiv:2102.08092 [pdf, other]

doi 10.1109/IJCNN52387.2021.9533552

An AutoML-based Approach to Multimodal Image Sentiment Analysis

Authors: Vasco Lopes, António Gaspar, Luís A. Alexandre, João Cordeiro

Abstract: Sentiment analysis is a research topic focused on analysing data to extract information related to the sentiment that it causes. Applications of sentiment analysis are wide, ranging from recommendation systems and marketing to customer satisfaction. Recent approaches evaluate textual content using Machine Learning techniques that are trained over large corpora. However, as social media has grown, other data types have emerged in large quantities, such as images. Sentiment analysis in images has been shown to be a valuable complement to textual data since it enables the inference of the underlying message polarity by creating context and connections. Multimodal sentiment analysis approaches intend to leverage information of both textual and image content to perform an evaluation. Despite recent advances, current solutions still flounder in combining both image and textual information to classify social media data, mainly due to subjectivity, inter-class homogeneity and fusion data differences. In this paper, we propose a method that combines both textual and image individual sentiment analysis into a final fused classification based on AutoML, that performs a random search to find the best model. Our method achieved state-of-the-art performance in the B-T4SA dataset, with 95.19% accuracy.

Submitted 16 February, 2021; originally announced February 2021.

arXiv:2012.03555 [pdf, other]

Improving Makespan in Dynamic Task Scheduling for Cloud Robotic Systems with Time Window Constraints

Abstract: A scheduling method in a robotic network cloud system with minimal makespan is beneficial as the system can complete all the tasks assigned to it in the fastest way. Robotic network cloud systems can be translated into graphs where nodes represent hardware with independent computing power and edges represent data transmissions between nodes. Time window constraints on tasks are a natural way to order tasks. The makespan is the maximum amount of time between when the first node to receive a task starts executing its first scheduled task and when all nodes have completed their last scheduled task. Load balancing allocation and scheduling ensures that the time between when the first node completes its scheduled tasks and when all other nodes complete their scheduled tasks is as short as possible. We propose a grid of all tasks to ensure that the time window constraints for tasks are met. We propose a grid of all tasks balancing algorithm for distributing and scheduling tasks with minimum makespan. We theoretically prove the correctness of the proposed algorithm and present simulations illustrating the obtained results.

Submitted 7 June, 2022; v1 submitted 7 December, 2020; originally announced December 2020.

Comments: This work has been submitted to the Springer Nature for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible

arXiv:2009.01573 [pdf, other]

doi 10.1007/978-3-030-63830-6_12

Auto-Classifier: A Robust Defect Detector Based on an AutoML Head

Abstract: The dominant approach for surface defect detection is the use of hand-crafted feature-based methods. However, this falls short when conditions vary that affect extracted images. So, in this paper, we sought to determine how well several state-of-the-art Convolutional Neural Networks perform in the task of surface defect detection. Moreover, we propose two methods: CNN-Fusion, that fuses the prediction of all the networks into a final one, and Auto-Classifier, which is a novel proposal that improves a Convolutional Neural Network by modifying its classification component using AutoML. We carried out experiments to evaluate the proposed methods in the task of surface defect detection using different datasets from DAGM2007. We show that the use of Convolutional Neural Networks achieves better results than traditional methods, and also, that Auto-Classifier out-performs all other methods, by achieving 100% accuracy and 100% AUC results throughout all the datasets.

Submitted 3 September, 2020; originally announced September 2020.

Comments: 12 pages, 2 figures. Published in ICONIP2020, proceedings published in the Springer's series of Lecture Notes in Computer Science

arXiv:2007.16149 [pdf, other]

HMCNAS: Neural Architecture Search using Hidden Markov Chains and Bayesian Optimization

Abstract: Neural Architecture Search has achieved state-of-the-art performance in a variety of tasks, out-performing human-designed networks. However, many assumptions, that require human definition, related with the problems being solved or the models generated are still needed: final model architectures, number of layers to be sampled, forced operations, small search spaces, which ultimately contributes to having models with higher performances at the cost of inducing bias into the system. In this paper, we propose HMCNAS, which is composed of two novel components: i) a method that leverages information about human-designed models to autonomously generate a complex search space, and ii) an Evolutionary Algorithm with Bayesian Optimization that is capable of generating competitive CNNs from scratch, without relying on human-defined parameters or small search spaces. The experimental results show that the proposed approach results in competitive architectures obtained in a very short time. HMCNAS provides a step towards generalizing NAS, by providing a way to create competitive models, without requiring any human knowledge about the specific task.

Submitted 31 July, 2020; originally announced July 2020.

Comments: 9 pages, 1 figure, 2 tables, neural architecture search, macro-search

arXiv:2007.11534 [pdf, other]

doi 10.1109/ISPA-BDCloud-SocialCom-SustainCom51426.2020.00181

Dynamic Task Allocation for Robotic Network Cloud Systems

Abstract: Every robotic network cloud system can be seen as a graph with nodes as hardware with independent computational processing powers and edges as data transmissions between nodes. When assigning a task to a node we may change several values corresponding to the node such as distance to other nodes, the time to complete all of its tasks, the energy level of the node, energy consumed while performing all of its tasks, geometrical position, communication with other nodes, and so on. These values can be seen as fingerprints for the current state of the node which can be evaluated as a subspace of a hyperspace. We proposed a theoretical model describing how assigning tasks to a node will change the subspace of the hyperspace, and from that, we show how to obtain the optimal task allocation. We described the communication instability between nodes and the capability of nodes as subspaces of a hyperspace. We translate task scheduling to nodes as finding the maximum volume of the hyperspace.

Submitted 22 July, 2020; originally announced July 2020.

Journal ref: 2020 IEEE Intl Conf on Parallel & Distributed Processing with Applications, Big Data & Cloud Computing, Sustainable Computing & Communications, Social Computing & Networking (ISPA/BDCloud/SocialCom/SustainCom)

arXiv:2003.08683 [pdf, other]

doi 10.1109/TCC.2021.3093489

Optimal Algorithm Allocation for Single Robot Cloud Systems

Abstract: In order for a robot to perform a task, several algorithms need to be executed, sometimes, simultaneously. Algorithms can be run either on the robot itself or, upon request, be performed on cloud infrastructure. The term cloud infrastructure is used to describe hardware, storage, abstracted resources, and network resources related to cloud computing. Depending on the decisions on where to execute the algorithms, the overall execution time and necessary memory space for the robot will change accordingly. The price of a robot depends, among other things, on its memory capacity and computational power. We answer the question of how to keep a given performance and use a cheaper robot (lower resources) by assigning computational tasks to the cloud infrastructure, depending on memory, computational power, and communication constraints. Also, for a fixed robot, our model provides a way to have optimal overall performance. We provide a general model for the optimal decision of algorithm allocation under certain constraints. We exemplify the model with simulation results. The main advantage of our model is that it provides an optimal task allocation simultaneously for memory and time.

Submitted 8 February, 2022; v1 submitted 19 March, 2020; originally announced March 2020.

Comments: © 2021 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works

Journal ref: 2021 IEEE Transactions on Cloud Computing

arXiv:1911.07771 [pdf, other]

MaskedFusion: Mask-based 6D Object Pose Estimation

Authors: Nuno Pereira, Luís A. Alexandre

Abstract: MaskedFusion is a framework to estimate the 6D pose of objects using RGB-D data, with an architecture that leverages multiple sub-tasks in a pipeline to achieve accurate 6D poses. 6D pose estimation is an open challenge due to complex world objects and many possible problems when capturing data from the real world, e.g., occlusions, truncations, and noise in the data. Achieving accurate 6D poses will improve results in other open problems like robot grasping or positioning objects in augmented reality. MaskedFusion improves the state-of-the-art by using object masks to eliminate non-relevant data. With the inclusion of the masks on the neural network that estimates the 6D pose of an object we also have features that represent the object shape. MaskedFusion is a modular pipeline where each sub-task can have different methods that achieve the objective. MaskedFusion achieved 97.3% on average using the ADD metric on the LineMOD dataset and 93.3% using the ADD-S AUC metric on YCB-Video Dataset, which is an improvement, compared to the state-of-the-art methods. The code is available on GitHub (https://github.com/kroglice/MaskedFusion).

Submitted 18 March, 2020; v1 submitted 18 November, 2019; originally announced November 2019.

arXiv:1906.10923 [pdf, other]

A crossinggram for random fields on lattices

Authors: Helena Ferreira, Marta Ferreira, Luís A. Alexandre

Abstract: The modeling of risk situations that occur in a space-time framework can be done using max-stable random fields on lattices. Although the summary coefficients for the spatial and temporal behaviour do not characterize the finite-dimensional distributions of the random field, they have the advantage of being immediate to interpret and easier to estimate. The coefficients that we propose give us information about the tendency of a random field for local oscillations of its values in relation to real valued high levels. It is not the magnitude of the oscillations that is being evaluated, but rather the greater or lesser number of oscillations, that is, the tendency of the trajectories to oscillate. The higher the crossinggram value, the smoother the surface trajectories we can observe over a region. It takes values in $[0,1]$ and increases with the concordance of the variables of the random field.

Submitted 13 February, 2020; v1 submitted 26 June, 2019; originally announced June 2019.

arXiv:1904.04306 [pdf, other]

A Time-Segmented Consortium Blockchain for Robotic Event Registration

Authors: Miguel Fernandes, Luís A. Alexandre

Abstract: A blockchain, during its lifetime, records large amounts of data that, in common usage, is kept in its entirety. In a robotics environment, the old information is useful for human evaluation, or oracles interfacing with the blockchain, but it is not useful for the robots that require only current information in order to continue their work. This causes a storage problem in blockchain nodes that have limited storage capacity, such as in the case of nodes attached to robots that are usually built around embedded solutions. This paper presents a time-segmentation solution for devices with limited storage capacity, integrated in a particular robot-directed blockchain called RobotChain. Results are presented regarding the proposed solution that show that the goal of restricting each node's capacity is reached without compromising all the benefits that arise from the use of blockchains in these contexts, and on the contrary, it allows for cheap nodes to use this blockchain, reduces storage costs and allows faster deployment of new nodes.

Submitted 14 May, 2019; v1 submitted 8 April, 2019; originally announced April 2019.

arXiv:1903.00660 [pdf, other]

Controlling Robots using Artificial Intelligence and a Consortium Blockchain

Authors: Vasco Lopes, Luís A. Alexandre, Nuno Pereira

Abstract: Blockchain is a disruptive technology that is normally used within financial applications, however it can be very beneficial also in certain robotic contexts, such as when an immutable register of events is required. Among the several properties of Blockchain that can be useful within robotic environments, we find not just immutability but also decentralization of the data, irreversibility, accessibility and non-repudiation. In this paper, we propose an architecture that uses blockchain as a ledger and smart-contract technology for robotic control by using external parties, Oracles, to process data. We show how to register events in a secure way, how it is possible to use smart-contracts to control robots and how to interface with external Artificial Intelligence algorithms for image analysis. The proposed architecture is modular and can be used in multiple contexts such as in manufacturing, network control, robot control, and others, since it is easy to integrate, adapt, maintain and extend to new domains.

Submitted 2 March, 2019; originally announced March 2019.

arXiv:1810.00329 [pdf, ps, other]

An Overview of Blockchain Integration with Robotics and Artificial Intelligence