-
Deep Dive into MRI: Exploring Deep Learning Applications in 0.55T and 7T MRI
Authors:
Ana Carolina Alves,
André Ferreira,
Behrus Puladi,
Jan Egger,
Victor Alves
Abstract:
The development of magnetic resonance imaging (MRI) has provided a leap forward in diagnosis, offering a safe, non-invasive alternative to techniques that involve exposure to ionising radiation. The underlying phenomenon was described by Bloch and Purcell in 1946, and it was not until 1980 that the first clinical application of MRI became available. Since then, MRI has gone through many advances and has altered the way diagnostic procedures are performed. Thanks to its constant improvement, MRI has become common practice across several medical specialisations. In particular, 0.55T and 7T MRI technologies have shown enhanced preservation of image detail and advanced tissue characterisation. This review examines the integration of deep learning (DL) techniques into these MRI modalities, surveying and exploring the reported applications. It highlights how DL contributes to 0.55T and 7T MRI data, showcasing the potential of DL in improving and refining these technologies. The review ends with a brief overview of how MRI technology will evolve in the coming years.
Submitted 1 July, 2024;
originally announced July 2024.
-
Eery Space: Facilitating Virtual Meetings Through Remote Proxemics
Authors:
Maurício Sousa,
Daniel Mendes,
Alfredo Ferreira,
João Madeiras Pereira,
Joaquim Jorge
Abstract:
Virtual meetings have become increasingly common with modern video-conference and collaborative software. While they allow obvious savings in time and resources, current technologies add unproductive layers of protocol to the flow of communication between participants, rendering the interactions far from seamless. In this work we introduce Remote Proxemics, an extension of proxemics aimed at bringing the syntax of co-located proximal interactions to virtual meetings. We propose Eery Space, a shared virtual locus that results from merging multiple remote areas, where meeting participants are located side-by-side as if they shared the same physical location. Eery Space promotes collaborative content creation and seamless mediation of communication channels based on virtual proximity. Results from user evaluation suggest that our approach is effective at enhancing mutual awareness between participants and sufficient to initiate proximal exchanges regardless of their geolocation, while promoting smooth interactions between local and remote people alike. These results hold even in the absence of visual avatars and other social devices such as eye contact, which are largely the focus of previous approaches.
Submitted 1 June, 2024;
originally announced June 2024.
-
4Doodle: Two-handed Gestures for Immersive Sketching of Architectural Models
Authors:
Fernando Fonseca,
Maurício Sousa,
Daniel Mendes,
Alfredo Ferreira,
Joaquim Jorge
Abstract:
Three-dimensional immersive sketching for content creation and modeling has been studied for some time. However, research in this domain has mainly focused on CAVE-like scenarios. These setups can be expensive and offer a narrow interaction space. Building more affordable setups using head-mounted displays is possible, allowing greater immersion and a larger space for users' physical movements. This paper presents a fully immersive environment using bi-manual gestures to sketch and create content freely in the virtual world. This approach can be applied to many scenarios, allowing people to express their ideas or review existing designs. To cope with the known motor difficulties and inaccuracy of freehand 3D sketching, we explore proxy geometry and a laser-like metaphor to draw content directly from models and create content surfaces. Our current prototype offers 24 cubic meters for movement, limited by the room size. It features infinite virtual drawing space through pan and scale techniques and is larger than the typical 6-sided CAVE at a fraction of the cost. In a preliminary study conducted with architects and engineers, our system showed clear promise as a tool for sketching and 3D content creation in virtual reality with a strong emphasis on bi-manual gestures.
Submitted 27 June, 2024; v1 submitted 29 May, 2024;
originally announced May 2024.
-
Neural Network Learning of Black-Scholes Equation for Option Pricing
Authors:
Daniel de Souza Santos,
Tiago Alessandro Espinola Ferreira
Abstract:
One of the most discussed problems in the financial world is stock option pricing. The Black-Scholes Equation is a Parabolic Partial Differential Equation which provides an option pricing model. The present work proposes an approach based on Neural Networks to solve the Black-Scholes Equation. Real-world data from the stock options market were used as the initial boundary to solve the Black-Scholes Equation. In particular, time series of call option prices of the Brazilian companies Petrobras and Vale were employed. The results indicate that the network can learn to solve the Black-Scholes Equation for a specific real-world stock options time series. The experimental results showed that neural network option pricing based on the Black-Scholes Equation solution can produce more accurate option price forecasts than the traditional Black-Scholes analytical solutions. The experimental results make it possible to use this methodology for short-term call option price forecasts in options markets.
Submitted 9 May, 2024;
originally announced May 2024.
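For orientation, the "traditional Black-Scholes analytical solution" that the abstract above uses as its baseline can be sketched in a few lines. This is a minimal illustration of the textbook closed-form price of a European call on a non-dividend-paying stock, not the authors' neural network method; the function and parameter names are assumptions chosen for this example.

```python
import math


def norm_cdf(x: float) -> float:
    """Standard normal cumulative distribution function, via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def black_scholes_call(spot: float, strike: float, maturity: float,
                       rate: float, vol: float) -> float:
    """Textbook Black-Scholes price of a European call (no dividends).

    spot: current stock price, strike: exercise price,
    maturity: time to expiry in years, rate: risk-free rate (annual),
    vol: volatility (annual). All names are illustrative assumptions.
    """
    if spot <= 0 or strike <= 0 or maturity <= 0 or vol <= 0:
        raise ValueError("spot, strike, maturity and vol must be positive")
    sqrt_t = math.sqrt(maturity)
    d1 = (math.log(spot / strike) + (rate + 0.5 * vol ** 2) * maturity) / (vol * sqrt_t)
    d2 = d1 - vol * sqrt_t
    # Discounted expected payoff under the risk-neutral measure.
    return spot * norm_cdf(d1) - strike * math.exp(-rate * maturity) * norm_cdf(d2)


if __name__ == "__main__":
    # At-the-money call, one year to expiry, 5% rate, 20% volatility.
    print(round(black_scholes_call(100.0, 100.0, 1.0, 0.05, 0.2), 4))
```

A learned solver such as the one described in the abstract would be compared against prices of this form on the same option series.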
-
How we won BraTS 2023 Adult Glioma challenge? Just faking it! Enhanced Synthetic Data Augmentation and Model Ensemble for brain tumour segmentation
Authors:
André Ferreira,
Naida Solak,
Jianning Li,
Philipp Dammann,
Jens Kleesiek,
Victor Alves,
Jan Egger
Abstract:
Deep Learning is the state-of-the-art technology for segmenting brain tumours. However, this requires a lot of high-quality data, which is difficult to obtain, especially in the medical field. Therefore, our solutions address this problem by using unconventional mechanisms for data augmentation. Generative adversarial networks and registration are used to massively increase the number of available samples for training three different deep learning models for brain tumour segmentation, the first task of the BraTS2023 challenge. The first model is the standard nnU-Net, the second is the Swin UNETR and the third is the winning solution of the BraTS 2021 Challenge. The entire pipeline is built on the nnU-Net implementation, except for the generation of the synthetic data. Combining convolutional algorithms and transformers allows each to fill the other's knowledge gaps. Using the new metric, our best solution achieves Dice scores of 0.9005, 0.8673, 0.8509 and HD95 values of 14.940, 14.467, 17.699 (whole tumour, tumour core and enhancing tumour) on the validation set.
Submitted 27 February, 2024;
originally announced February 2024.
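The Dice scores quoted in the abstract above measure overlap between a predicted and a reference segmentation. The sketch below shows the plain voxel-wise form on flattened binary masks; it is an assumption-laden illustration only, since the "new metric" the abstract mentions is the challenge's own variant and is not reproduced here. The function name and the convention for two empty masks are choices made for this example.

```python
from typing import Sequence


def dice_score(pred: Sequence[int], truth: Sequence[int]) -> float:
    """Voxel-wise Dice coefficient 2|A ∩ B| / (|A| + |B|) for flat binary masks.

    Any non-zero entry counts as foreground. Returning 1.0 when both masks
    are empty is a convention assumed for this sketch.
    """
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of voxels")
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    if total == 0:
        return 1.0
    return 2.0 * intersection / total


if __name__ == "__main__":
    predicted = [1, 1, 0, 0]
    reference = [1, 0, 0, 0]
    print(dice_score(predicted, reference))
```

In practice the masks would be 3D volumes, one per tumour sub-region, flattened before scoring.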
-
Deep PCCT: Photon Counting Computed Tomography Deep Learning Applications Review
Authors:
Ana Carolina Alves,
André Ferreira,
Gijs Luijten,
Jens Kleesiek,
Behrus Puladi,
Jan Egger,
Victor Alves
Abstract:
Medical imaging faces challenges such as limited spatial resolution, interference from electronic noise and poor contrast-to-noise ratios. Photon Counting Computed Tomography (PCCT) has emerged as a solution, addressing these issues with its innovative technology. This review delves into the recent developments and applications of PCCT in pre-clinical research, emphasizing its potential to overcome traditional imaging limitations. For example, PCCT has demonstrated remarkable efficacy in improving the detection of subtle abnormalities in the breast, providing a level of detail previously unattainable. Examining the current literature on PCCT, the review presents a comprehensive analysis of the technology, highlighting the main features of scanners and their varied applications. In addition, it explores the integration of deep learning into PCCT, along with the study of radiomic features, presenting successful applications in data processing. While acknowledging these advances, it also discusses the existing challenges in this field, paving the way for future research and improvements in medical imaging technologies. Despite the limited number of articles on this subject, due to the recent integration of PCCT at a clinical level, its potential benefits extend to various diagnostic applications.
Submitted 6 February, 2024;
originally announced February 2024.
-
Polygon Detection from a Set of Lines
Authors:
Alfredo Ferreira Jr.,
Manuel J. Fonseca,
Joaquim A. Jorge
Abstract:
Detecting polygons defined by a set of line segments in a plane is an important step in analyzing vector drawings. This paper presents an approach combining several algorithms to detect basic polygons from arbitrary line segments. The resulting algorithm runs in polynomial time and space, with complexities of $O\bigl((N + M)^4\bigr)$ and $O\bigl((N + M)^2\bigr)$, where $N$ is the number of line segments and $M$ is the number of intersections between line segments. Our choice of algorithms was made to strike a good compromise between efficiency and ease of implementation. The result is a simple and efficient solution to detect polygons from lines.
Submitted 26 December, 2023;
originally announced December 2023.
-
Beyond the Screen: Reshaping the Workplace with Virtual and Augmented Reality
Authors:
Nuno Verdelho Trindade,
Alfredo Ferreira,
João Madeiras Pereira
Abstract:
Although extended reality technologies have enjoyed an explosion in popularity in recent years, few applications are effectively used outside the entertainment or academic contexts. This work consists of a literature review regarding the effective integration of such technologies in the workplace. It aims to provide an updated view of how they are being used in that context. First, we examine existing research concerning virtual, augmented, and mixed-reality applications. We also analyze which have made their way to the workflows of companies and institutions. Furthermore, we circumscribe the aspects of extended reality technologies that determined this applicability.
Submitted 1 December, 2023;
originally announced December 2023.
-
Visualizing Plasma Physics Simulations in Immersive Environments
Authors:
Nuno Verdelho Trindade,
Oscar Amaro,
David Bras,
Daniel Goncalves,
João Madeiras Pereira,
Alfredo Ferreira
Abstract:
Plasma physics simulations create complex datasets for which researchers need state-of-the-art visualization tools to gain insights. These datasets are 3D in nature but are commonly depicted and analyzed using 2D idioms displayed on 2D screens. These offer limited understandability in a domain where spatial awareness is key. Virtual reality (VR) can be used as an alternative to conventional means for analyzing such datasets. VR has been known to improve depth and spatial relationship perception, which are fundamental for obtaining insights into 3D plasma morphology. Likewise, VR can potentially increase user engagement by offering more immersive and enjoyable experiences. This study presents PlasmaVR, a proof-of-concept VR tool for visualizing datasets resulting from plasma physics simulations. It enables immersive multidimensional data visualization of particles, scalar fields, and vector fields and uses a more natural interface. The study includes a user evaluation with domain experts in which PlasmaVR was employed to assess the possible benefits of immersive environments in plasma physics visualization. The experimental group comprised five plasma physics researchers who were asked to perform tasks designed to represent their typical analysis workflow. To assess the suitability of the prototype for the different types of tasks, a set of objective metrics, such as completion time and number of errors, was measured. The prototype's usability was also evaluated using a standard System Usability Scale questionnaire.
Submitted 24 November, 2023;
originally announced November 2023.
-
CAPIVARA: Cost-Efficient Approach for Improving Multilingual CLIP Performance on Low-Resource Languages
Authors:
Gabriel Oliveira dos Santos,
Diego A. B. Moreira,
Alef Iury Ferreira,
Jhessica Silva,
Luiz Pereira,
Pedro Bueno,
Thiago Sousa,
Helena Maia,
Nádia Da Silva,
Esther Colombini,
Helio Pedrini,
Sandra Avila
Abstract:
This work introduces CAPIVARA, a cost-efficient framework designed to enhance the performance of multilingual CLIP models in low-resource languages. While CLIP has excelled in zero-shot vision-language tasks, the resource-intensive nature of model training remains challenging. Many datasets lack linguistic diversity, featuring solely English descriptions for images. CAPIVARA addresses this by augmenting text data using image captioning and machine translation to generate multiple synthetic captions in low-resource languages. We optimize the training pipeline with LiT, LoRA, and gradient checkpointing to alleviate the computational cost. Through extensive experiments, CAPIVARA emerges as state of the art in zero-shot tasks involving images and Portuguese texts. We show the potential for significant improvements in other low-resource languages, achieved by fine-tuning the pre-trained multilingual CLIP using CAPIVARA on a single GPU for 2 hours. Our model and code are available at https://github.com/hiaac-nlp/CAPIVARA.
Submitted 23 October, 2023; v1 submitted 20 October, 2023;
originally announced October 2023.
-
Multilingual Natural Language Processing Model for Radiology Reports -- The Summary is all you need!
Authors:
Mariana Lindo,
Ana Sofia Santos,
André Ferreira,
Jianning Li,
Gijs Luijten,
Gustavo Correia,
Moon Kim,
Benedikt Michael Schaarschmidt,
Cornelius Deuschl,
Johannes Haubold,
Jens Kleesiek,
Jan Egger,
Victor Alves
Abstract:
The impression section of a radiology report summarizes important radiology findings and plays a critical role in communicating these findings to physicians. However, the preparation of these summaries is time-consuming and error-prone for radiologists. Recently, numerous models for radiology report summarization have been developed. Nevertheless, there is currently no model that can summarize these reports in multiple languages. Such a model could greatly improve future research and the development of Deep Learning models that incorporate data from patients with different ethnic backgrounds. In this study, the generation of radiology impressions in different languages was automated by fine-tuning a model, publicly available, based on a multilingual text-to-text Transformer to summarize findings available in English, Portuguese, and German radiology reports. In a blind test, two board-certified radiologists indicated that for at least 70% of the system-generated summaries, the quality matched or exceeded the corresponding human-written summaries, suggesting substantial clinical reliability. Furthermore, this study showed that the multilingual model outperformed other models that specialized in summarizing radiology reports in only one language, as well as models that were not specifically designed for summarizing radiology reports, such as ChatGPT.
Submitted 13 January, 2024; v1 submitted 29 September, 2023;
originally announced October 2023.
-
Deepfake audio as a data augmentation technique for training automatic speech to text transcription models
Authors:
Alexandre R. Ferreira,
Cláudio E. C. Campelo
Abstract:
To train transcription models that produce robust results, a large and diverse labeled dataset is required. Finding such data with the necessary characteristics is a challenging task, especially for languages less popular than English. Moreover, producing such data requires significant effort and often money. Therefore, a strategy to mitigate this problem is the use of data augmentation techniques. In this work, we propose a framework that approaches data augmentation based on deepfake audio. To validate the produced framework, experiments were conducted using existing deepfake and transcription models. A voice cloner and an English-language dataset recorded by Indian speakers were selected, ensuring the presence of a single accent in the dataset. Subsequently, the augmented data was used to train speech-to-text models in various scenarios.
Submitted 22 September, 2023;
originally announced September 2023.
-
MedShapeNet -- A Large-Scale Dataset of 3D Medical Shapes for Computer Vision
Authors:
Jianning Li,
Zongwei Zhou,
Jiancheng Yang,
Antonio Pepe,
Christina Gsaxner,
Gijs Luijten,
Chongyu Qu,
Tiezheng Zhang,
Xiaoxi Chen,
Wenxuan Li,
Marek Wodzinski,
Paul Friedrich,
Kangxian Xie,
Yuan Jin,
Narmada Ambigapathy,
Enrico Nasca,
Naida Solak,
Gian Marco Melito,
Viet Duc Vu,
Afaque R. Memon,
Christopher Schlachta,
Sandrine De Ribaupierre,
Rajnikant Patel,
Roy Eagleson,
Xiaojun Chen
, et al. (132 additional authors not shown)
Abstract:
Prior to the deep learning era, shape was commonly used to describe objects. Nowadays, state-of-the-art (SOTA) algorithms in medical imaging are predominantly diverging from computer vision, where voxel grids, meshes, point clouds, and implicit surface models are used. This is seen from numerous shape-related publications in premier vision conferences as well as the growing popularity of ShapeNet (about 51,300 models) and Princeton ModelNet (127,915 models). For the medical domain, we present a large collection of anatomical shapes (e.g., bones, organs, vessels) and 3D models of surgical instruments, called MedShapeNet, created to facilitate the translation of data-driven vision algorithms to medical applications and to adapt SOTA vision algorithms to medical problems. As a unique feature, we directly model the majority of shapes on the imaging data of real patients. As of today, MedShapeNet includes 23 datasets with more than 100,000 shapes that are paired with annotations (ground truth). Our data is freely accessible via a web interface and a Python application programming interface (API) and can be used for discriminative, reconstructive, and variational benchmarks as well as various applications in virtual, augmented, or mixed reality, and 3D printing. As examples, we present use cases in the fields of classification of brain tumors, facial and skull reconstructions, multi-class anatomy completion, education, and 3D printing. In the future, we will extend the data and improve the interfaces. The project pages are: https://medshapenet.ikim.nrw/ and https://github.com/Jianningli/medshapenet-feedback
Submitted 12 December, 2023; v1 submitted 30 August, 2023;
originally announced August 2023.
-
Revisiting N-CNN for Clinical Practice
Authors:
Leonardo Antunes Ferreira,
Lucas Pereira Carlini,
Gabriel de Almeida Sá Coutrin,
Tatiany Marcondes Heideirich,
Marina Carvalho de Moraes Barros,
Ruth Guinsburg,
Carlos Eduardo Thomaz
Abstract:
This paper revisits the Neonatal Convolutional Neural Network (N-CNN) by optimizing its hyperparameters and evaluating how they affect its classification metrics, explainability and reliability, discussing their potential impact in clinical practice. We have chosen hyperparameters that do not modify the original N-CNN architecture, but mainly modify its learning rate and training regularization. The optimization was done by evaluating the improvement in F1 Score for each hyperparameter individually, and the best hyperparameters were chosen to create a Tuned N-CNN. We also applied soft labels derived from the Neonatal Facial Coding System, proposing a novel approach for training facial expression classification models for neonatal pain assessment. Interestingly, while the Tuned N-CNN results point towards improvements in classification metrics and explainability, these improvements did not directly translate to calibration performance. We believe that such insights might have the potential to contribute to the development of more reliable pain evaluation tools for newborns, aiding healthcare professionals in delivering appropriate interventions and improving patient outcomes.
Submitted 10 August, 2023;
originally announced August 2023.
-
PESC -- Parallel Experiment for Sequential Code
Authors:
Henrique C. T. Santos,
Luciano S. de Souza,
Jonathan H. A. de Carvalho,
Tiago A. E. Ferreira
Abstract:
The need for computational resources grows as computational algorithms gain popularity in different sectors of the scientific community. This demand has stimulated the development of several cloud platforms that abstract the complexity of computational infrastructure. Unfortunately, the cost of accessing these resources could exclude various studies that could be carried out on a simpler infrastructure. In this article, we present a platform for distributing computer simulations across resources available on a network using containers, which abstracts the complexity needed to configure these execution environments and allows any user to benefit from this infrastructure. Simulations can be developed in any programming language (like Python, Java, C, R) and with specific execution needs, putting them within reach of the scientific community in general. We present results obtained from running simulations that required more than 1000 runs with different initial parameters, as well as various other experiments that benefited from using the platform.
Submitted 13 January, 2023;
originally announced January 2023.
-
Open-Source Skull Reconstruction with MONAI
Authors:
Jianning Li,
André Ferreira,
Behrus Puladi,
Victor Alves,
Michael Kamp,
Moon-Sung Kim,
Felix Nensa,
Jens Kleesiek,
Seyed-Ahmad Ahmadi,
Jan Egger
Abstract:
We present a deep learning-based approach for skull reconstruction for MONAI, which has been pre-trained on the MUG500+ skull dataset. The implementation follows the MONAI contribution guidelines; hence, it can be easily tried out, used, and extended by MONAI users. The primary goal of this paper lies in the investigation of open-sourcing code and pre-trained deep learning models under the MONAI framework. Nowadays, open-sourcing software, especially (pre-trained) deep learning models, has become increasingly important. Over the years, medical image analysis has experienced a tremendous transformation. Over a decade ago, algorithms had to be implemented and optimized with low-level programming languages, like C or C++, to run in a reasonable time on a desktop PC, which was not as powerful as today's computers. Nowadays, users have high-level scripting languages like Python, and frameworks like PyTorch and TensorFlow, along with a sea of public code repositories at hand. As a result, implementations that had thousands of lines of C or C++ code in the past can now be scripted with a few lines and, in addition, executed in a fraction of the time. Taking this a step further, the Medical Open Network for Artificial Intelligence (MONAI) framework makes medical imaging research an even more convenient process, which can boost and push the whole field. The MONAI framework is a freely available, community-supported, open-source and PyTorch-based framework that also makes it possible to share research contributions with pre-trained models. Code and pre-trained weights for skull reconstruction are publicly available at: https://github.com/Project-MONAI/research-contributions/tree/master/SkullRec
Submitted 15 June, 2023; v1 submitted 25 November, 2022;
originally announced November 2022.
-
Safe Real-World Autonomous Driving by Learning to Predict and Plan with a Mixture of Experts
Authors:
Stefano Pini,
Christian S. Perone,
Aayush Ahuja,
Ana Sofia Rufino Ferreira,
Moritz Niendorf,
Sergey Zagoruyko
Abstract:
The goal of autonomous vehicles is to navigate public roads safely and comfortably. To enforce safety, traditional planning approaches rely on handcrafted rules to generate trajectories. Machine learning-based systems, on the other hand, scale with data and are able to learn more complex behaviors. However, they often ignore that agents and self-driving vehicle trajectory distributions can be leveraged to improve safety. In this paper, we propose modeling a distribution over multiple future trajectories for both the self-driving vehicle and other road agents, using a unified neural network architecture for prediction and planning. During inference, we select the planning trajectory that minimizes a cost taking into account safety and the predicted probabilities. Our approach does not depend on any rule-based planners for trajectory generation or optimization, improves with more training data and is simple to implement. We extensively evaluate our method through a realistic simulator and show that the predicted trajectory distribution corresponds to different driving profiles. We also successfully deploy it on a self-driving vehicle on urban public roads, confirming that it drives safely without compromising comfort. The code for training and testing our model on a public prediction dataset and the video of the road test are available at https://woven.mobi/safepathnet
Submitted 3 November, 2022;
originally announced November 2022.
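The inference step described in this abstract (choosing the planned trajectory that minimizes a cost combining safety and predicted probability) can be illustrated with a minimal sketch. The cost form, the `safety_weight` parameter and the function name `select_plan` are assumptions made for illustration only; the abstract does not specify the actual cost used by the authors.

```python
import numpy as np


def select_plan(probabilities, safety_costs, safety_weight=10.0):
    """Return the index of the candidate trajectory with the lowest cost.

    Hypothetical cost: a weighted safety penalty plus the negative
    log-probability of each candidate (lower is better).
    """
    probabilities = np.asarray(probabilities, dtype=float)
    safety_costs = np.asarray(safety_costs, dtype=float)
    # Small epsilon avoids log(0) for candidates with zero probability.
    cost = safety_weight * safety_costs - np.log(probabilities + 1e-9)
    return int(np.argmin(cost))


# Three candidate trajectories: the most likely one is flagged as unsafe.
probs = [0.6, 0.3, 0.1]
unsafe = [1.0, 0.0, 0.0]
best = select_plan(probs, unsafe)
```

With these illustrative numbers, the most probable candidate is rejected because of its safety penalty and the second candidate is selected instead.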
-
CW-ERM: Improving Autonomous Driving Planning with Closed-loop Weighted Empirical Risk Minimization
Authors:
Eesha Kumar,
Yiming Zhang,
Stefano Pini,
Simon Stent,
Ana Ferreira,
Sergey Zagoruyko,
Christian S. Perone
Abstract:
The imitation learning of self-driving vehicle policies through behavioral cloning is often carried out in an open-loop fashion, ignoring the effect of actions on future states. Training such policies purely with Empirical Risk Minimization (ERM) can be detrimental to real-world performance, as it biases policy networks towards matching only open-loop behavior, showing poor results when evaluated in closed-loop. In this work, we develop an efficient and simple-to-implement principle called Closed-loop Weighted Empirical Risk Minimization (CW-ERM), in which a closed-loop evaluation procedure is first used to identify training data samples that are important for practical driving performance, and these samples are then used to help debias the policy network. We evaluate CW-ERM on a challenging urban driving dataset and show that this procedure yields a significant reduction in collisions as well as other non-differentiable closed-loop metrics.
Submitted 11 October, 2022; v1 submitted 5 October, 2022;
originally announced October 2022.
-
AutoPET Challenge: Combining nn-Unet with Swin UNETR Augmented by Maximum Intensity Projection Classifier
Authors:
Lars Heiliger,
Zdravko Marinov,
Max Hasin,
André Ferreira,
Jana Fragemann,
Kelsey Pomykala,
Jacob Murray,
David Kersting,
Victor Alves,
Rainer Stiefelhagen,
Jan Egger,
Jens Kleesiek
Abstract:
Tumor volume and changes in tumor characteristics over time are important biomarkers for cancer therapy. In this context, FDG-PET/CT scans are routinely used for staging and re-staging of cancer, as the radiolabeled fluorodeoxyglucose is taken up in regions of high metabolism. Unfortunately, these regions with high metabolism are not specific to tumors and can also represent physiological uptake by normal functioning organs, inflammation, or infection, making detailed and reliable tumor segmentation in these scans a demanding task. This gap in research is addressed by the AutoPET challenge, which provides a public data set with FDG-PET/CT scans from 900 patients to encourage further improvement in this field. Our contribution to this challenge is an ensemble of two state-of-the-art segmentation models, the nn-Unet and the Swin UNETR, augmented by a maximum intensity projection classifier that acts like a gating mechanism. If it predicts the existence of lesions, both segmentations are combined by a late fusion approach. Our solution achieves a Dice score of 72.12% on patients diagnosed with lung cancer, melanoma, and lymphoma in our cross-validation. Code: https://github.com/heiligerl/autopet_submission
Submitted 14 October, 2022; v1 submitted 2 September, 2022;
originally announced September 2022.
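The gating and late-fusion idea from this abstract, together with the Dice score it reports, can be sketched as follows. This is not the authors' implementation (their code is at the URL given in the abstract): the averaging rule, both thresholds and the names `gated_late_fusion` and `dice_score` are assumptions chosen for a small, self-contained illustration.

```python
import numpy as np


def dice_score(pred, target):
    """Dice coefficient between two binary masks (1.0 if both are empty)."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    denom = pred.sum() + target.sum()
    if denom == 0:
        return 1.0
    return 2.0 * np.logical_and(pred, target).sum() / denom


def gated_late_fusion(prob_a, prob_b, lesion_probability,
                      gate_threshold=0.5, mask_threshold=0.5):
    """Fuse two probability maps only if the gate predicts lesions.

    Assumed fusion rule: average the two maps, then threshold.
    If the gate is closed, an empty mask is returned.
    """
    prob_a = np.asarray(prob_a, dtype=float)
    prob_b = np.asarray(prob_b, dtype=float)
    if lesion_probability < gate_threshold:
        return np.zeros(prob_a.shape, dtype=bool)
    return (prob_a + prob_b) / 2.0 >= mask_threshold


# Toy 1D "volume" with three voxels and two model outputs.
model_a = [0.9, 0.4, 0.1]
model_b = [0.7, 0.8, 0.2]
truth = [True, False, False]

mask = gated_late_fusion(model_a, model_b, lesion_probability=0.9)
score = dice_score(mask, truth)
empty = gated_late_fusion(model_a, model_b, lesion_probability=0.1)
```

The gate prevents false-positive segmentations on scans the classifier considers lesion-free, which is where physiological uptake would otherwise be mislabelled.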
-
An Evolutionary Approach for Creating of Diverse Classifier Ensembles
Authors:
Alvaro R. Ferreira Jr,
Fabio A. Faria,
Gustavo Carneiro,
Vinicius V. de Melo
Abstract:
Classification is one of the most studied tasks in data mining and machine learning areas and many works in the literature have been presented to solve classification problems for multiple fields of knowledge such as medicine, biology, security, and remote sensing. Since there is no single classifier that achieves the best results for all kinds of applications, a good alternative is to adopt classifier fusion strategies. A key point in the success of classifier fusion approaches is the combination of diversity and accuracy among classifiers belonging to an ensemble. With a large amount of classification models available in the literature, one challenge is the choice of the most suitable classifiers to compose the final classification system, which generates the need of classifier selection strategies. We address this point by proposing a framework for classifier selection and fusion based on a four-step protocol called CIF-E (Classifiers, Initialization, Fitness function, and Evolutionary algorithm). We implement and evaluate 24 varied ensemble approaches following the proposed CIF-E protocol and we are able to find the most accurate approach. A comparative analysis has also been performed among the best approaches and many other baselines from the literature. The experiments show that the proposed evolutionary approach based on Univariate Marginal Distribution Algorithm (UMDA) can outperform the state-of-the-art literature approaches in many well-known UCI datasets.
Submitted 23 August, 2022;
originally announced August 2022.
-
Domain Specific Wav2vec 2.0 Fine-tuning For The SE&R 2022 Challenge
Authors:
Alef Iury Siqueira Ferreira,
Gustavo dos Reis Oliveira
Abstract:
This paper presents our efforts to build a robust ASR model for the shared task Automatic Speech Recognition for spontaneous and prepared speech & Speech Emotion Recognition in Portuguese (SE&R 2022). The goal of the challenge is to advance ASR research for the Portuguese language, considering prepared and spontaneous speech in different dialects. Our method consists of fine-tuning an ASR model in a domain-specific approach, applying gain normalization and selective noise insertion. The proposed method improved over the strong baseline provided on the test set in 3 of the 4 tracks available.
Submitted 28 July, 2022;
originally announced July 2022.
-
FakeNews: GAN-based generation of realistic 3D volumetric data -- A systematic review and taxonomy
Authors:
André Ferreira,
Jianning Li,
Kelsey L. Pomykala,
Jens Kleesiek,
Victor Alves,
Jan Egger
Abstract:
With the massive proliferation of data-driven algorithms, such as deep learning-based approaches, the availability of high-quality data is of great interest. Volumetric data is very important in medicine, as it ranges from disease diagnoses to therapy monitoring. When the dataset is sufficient, models can be trained to help doctors with these tasks. Unfortunately, there are scenarios where large amounts of data are unavailable. For example, rare diseases and privacy issues can lead to restricted data availability. In non-medical fields, the high cost of obtaining enough high-quality data can also be a concern. A solution to these problems can be the generation of realistic synthetic data using Generative Adversarial Networks (GANs). The existence of these mechanisms is a good asset, especially in healthcare, as the data must be of good quality, realistic, and without privacy issues. Therefore, most of the publications on volumetric GANs are within the medical domain. In this review, we provide a summary of works that generate realistic volumetric synthetic data using GANs. We therefore outline GAN-based methods in these areas with common architectures, loss functions and evaluation metrics, including their advantages and disadvantages. We present a novel taxonomy, evaluations, challenges, and research opportunities to provide a holistic overview of the current state of volumetric GANs.
Submitted 14 February, 2024; v1 submitted 4 July, 2022;
originally announced July 2022.
-
Fusing Multiscale Texture and Residual Descriptors for Multilevel 2D Barcode Rebroadcasting Detection
Authors:
Anselmo Ferreira,
Changcheng Chen,
Mauro Barni
Abstract:
Nowadays, 2D barcodes have been widely used for advertisement, mobile payment, and product authentication. However, in applications related to product authentication, an authentic 2D barcode can be illegally copied and attached to a counterfeited product in such a way to bypass the authentication scheme. In this paper, we employ a proprietary 2D barcode pattern and use multimedia forensics methods to analyse the scanning and printing artefacts resulting from the copy (rebroadcasting) attack. A diverse and complementary feature set is proposed to quantify the barcode texture distortions introduced during the illegal copying process. The proposed features are composed of global and local descriptors, which characterize the multi-scale texture appearance and the points of interest distribution, respectively. The proposed descriptors are compared against some existing texture descriptors and deep learning-based approaches under various scenarios, such as cross-datasets and cross-size. Experimental results highlight the practicality of the proposed method in real-world settings.
Submitted 16 May, 2022;
originally announced May 2022.
-
Brazilian COVID-19 data streaming
Authors:
Nívea B. da Silva,
Luis Iván O. Valencia,
Fábio M. H. S. Filho,
Andressa C. S. Ferreira,
Felipe A. C. Pereira,
Guilherme L. de Oliveira,
Paloma F. Oliveira,
Moreno S. Rodrigues,
Pablo I. P. Ramos,
Juliane F. Oliveira
Abstract:
We collected individualized (unidentifiable) and aggregated openly available data from various sources related to suspected/confirmed SARS-CoV-2 infections, vaccinations, non-pharmaceutical government interventions, human mobility, and levels of population inequality in Brazil. In addition, a data structure allowing real-time data collection, curation, integration, and extract-transform-load processes for different objectives was developed. The granularity of this dataset (state- and municipality-wide) enables its application to individualized and ecological epidemiological studies, statistical, mathematical, and computational modeling, data visualization as well as the scientific dissemination of information on the COVID-19 pandemic in Brazil.
Submitted 10 May, 2022;
originally announced May 2022.
-
Private delegated computations using strong isolation
Authors:
Mathias Brossard,
Guilhem Bryant,
Basma El Gaabouri,
Xinxin Fan,
Alexandre Ferreira,
Edmund Grimley-Evans,
Christopher Haster,
Evan Johnson,
Derek Miller,
Fan Mo,
Dominic P. Mulligan,
Nick Spinale,
Eric van Hensbergen,
Hugo J. M. Vincent,
Shale Xiong
Abstract:
Sensitive computations are now routinely delegated to third-parties. In response, Confidential Computing technologies are being introduced to microprocessors, offering a protected processing environment, which we generically call an isolate, providing confidentiality and integrity guarantees to code and data hosted within -- even in the face of a privileged attacker. Isolates, with an attestation protocol, permit remote third-parties to establish a trusted "beachhead" containing known code and data on an otherwise untrusted machine. Yet, the rise of these technologies introduces many new problems, including: how to ease provisioning of computations safely into isolates; how to develop distributed systems spanning multiple classes of isolate; and what to do about the billions of "legacy" devices without support for Confidential Computing?
Tackling the problems above, we introduce Veracruz, a framework that eases the design and implementation of complex privacy-preserving, collaborative, delegated computations among a group of mutually mistrusting principals. Veracruz supports multiple isolation technologies and provides a common programming model and attestation protocol across all of them, smoothing deployment of delegated computations over supported technologies. We demonstrate Veracruz in operation, on private in-cloud object detection on encrypted video streaming from a video camera. In addition to supporting hardware-backed isolates -- like AWS Nitro Enclaves and Arm Confidential Computing Architecture Realms -- Veracruz also provides pragmatic "software isolates" on Armv8-A devices without hardware Confidential Computing capability, using the high-assurance seL4 microkernel and our IceCap framework.
Submitted 6 May, 2022;
originally announced May 2022.
-
Generation of Synthetic Rat Brain MRI scans with a 3D Enhanced Alpha-GAN
Authors:
André Ferreira,
Ricardo Magalhães,
Sébastien Mériaux,
Victor Alves
Abstract:
Translational brain research using Magnetic Resonance Imaging (MRI) is becoming increasingly popular as animal models are an essential part of scientific studies and more ultra-high-field scanners are becoming available. Some disadvantages of MRI are the availability of MRI scanners and the time required for a full scanning session (it usually takes over 30 minutes). Privacy laws and the 3Rs ethics rule also make it difficult to create large datasets for training deep learning models. Generative Adversarial Networks (GANs) can perform data augmentation with higher quality than other techniques. In this work, the alpha-GAN architecture is used to test its ability to produce realistic 3D MRI scans of the rat brain. As far as the authors are aware, this is the first time that a GAN-based approach has been used for data augmentation in preclinical data. The generated scans are evaluated using various qualitative and quantitative metrics. A Turing test conducted by 4 experts has shown that the generated scans can trick almost any expert. The generated scans were also used to evaluate their impact on the performance of an existing deep learning model developed for segmenting the rat brain into white matter, grey matter and cerebrospinal fluid. The models were compared using the Dice score. The best results for whole brain and white matter segmentation were obtained when 174 real scans and 348 synthetic scans were used, with improvements of 0.0172 and 0.0129, respectively. Using 174 real scans and 87 synthetic scans resulted in improvements of 0.0038 and 0.0764 for grey matter and CSF segmentation, respectively. Thus, by using the proposed new normalisation layer and loss functions, it was possible to improve the realism of the generated rat MRI scans and it was shown that using the generated data improved the segmentation model more than using the conventional data augmentation.
Submitted 4 January, 2022; v1 submitted 27 December, 2021;
originally announced December 2021.
-
Infinite Servers Queue Systems Busy Period Time Length Distribution and Parameters Study through Computational Simulation
Authors:
Manuel Alberto M. Ferreira
Abstract:
A FORTRAN program that simulates the operation of infinite-server queues is presented in this work. Poisson arrival processes are considered, but not exclusively. For many parameters of interest in the study or application of queuing systems, either there are no theoretical results or the existing ones are mathematically intractable, which makes their utility doubtful. In such cases, one possible approach is to use simulation methods to obtain more useful results. Indeed, using simulation, experiments can be performed and their results used to conjecture about certain quantities of interest in queue systems. In this paper, this procedure is followed to learn more about quantities of interest for infinite-server queue systems, in particular busy period parameters and probability distributions.
Submitted 18 October, 2021;
originally announced October 2021.
-
Distributed Mission Planning of Complex Tasks for Heterogeneous Multi-Robot Teams
Authors:
Barbara Arbanas Ferreira,
Tamara Petrović,
Stjepan Bogdan
Abstract:
In this paper, we propose a distributed multi-stage optimization method for planning complex missions for heterogeneous multi-robot teams. This class of problems involves tasks that can be executed in different ways and are associated with cross-schedule dependencies that constrain the schedules of the different robots in the system. The proposed approach involves a multi-objective heuristic search of the mission, represented as a hierarchical tree that defines the mission goal. This procedure outputs several favorable ways to fulfill the mission, which directly feed into the next stage of the method. We propose a distributed metaheuristic based on evolutionary computation to allocate tasks and generate schedules for the set of chosen decompositions. The method is evaluated in a simulation setup of an automated greenhouse use case, where we demonstrate the method's ability to adapt the planning strategy depending on the available robots and the given optimization criteria.
Submitted 21 September, 2021;
originally announced September 2021.
-
A QUBO Formulation for Minimum Loss Spanning Tree Reconfiguration Problems in Electric Power Networks
Authors:
Filipe F. C. Silva,
Pedro M. S. Carvalho,
Luis A. F. M. Ferreira,
Yasser Omar
Abstract:
We introduce a novel quadratic unconstrained binary optimization (QUBO) formulation for a classical problem in electrical engineering -- the optimal reconfiguration of distribution grids. For a given graph representing the grid infrastructure and known nodal loads, the problem consists in finding the spanning tree that minimizes the total link ohmic losses. A set of constraints is initially defined to impose topologically valid solutions. These constraints are then converted to a QUBO model as penalty terms. The electrical losses terms are finally added to the model as the objective function to minimize. In order to maximize the performance of solution searching with classical solvers, with hybrid quantum-classical solvers and with quantum annealers, our QUBO formulation has the goal of being very efficient in terms of variables usage. A standard 33-node test network is used as an illustrative example of our general formulation. Model metrics for this example are presented and discussed. Finally, the optimal solution for this example was obtained and validated through comparison with the optimal solution from an independent method.
Submitted 15 March, 2022; v1 submitted 20 September, 2021;
originally announced September 2021.
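The general pattern this abstract describes (constraints converted into penalty terms, with the losses added as the objective) can be shown on a toy problem. The sketch below is not the authors' spanning-tree formulation: it assumes a much simpler constraint, "select exactly one of n candidate links", and the names `build_qubo`, `brute_force_minimum` and the penalty value are hypothetical. The constraint (sum of x_i minus 1) squared expands, using x_i squared equals x_i for binary variables, into a diagonal term of minus the penalty and an off-diagonal term of twice the penalty, plus a constant that is omitted from the matrix.

```python
import itertools

import numpy as np


def build_qubo(losses, penalty):
    """Upper-triangular QUBO matrix for: minimize loss, pick exactly one link.

    Energy x^T Q x equals sum(loss_i * x_i) + penalty * (sum(x_i) - 1)^2
    minus the constant `penalty`, which does not affect the minimizer.
    """
    n = len(losses)
    Q = np.zeros((n, n))
    for i in range(n):
        Q[i, i] = losses[i] - penalty
        for j in range(i + 1, n):
            Q[i, j] = 2.0 * penalty
    return Q


def brute_force_minimum(Q):
    """Enumerate all binary vectors; only feasible for very small n."""
    n = Q.shape[0]
    best_x, best_energy = None, float("inf")
    for bits in itertools.product([0, 1], repeat=n):
        x = np.array(bits)
        energy = float(x @ Q @ x)
        if energy < best_energy:
            best_x, best_energy = bits, energy
    return best_x, best_energy


# Three candidate links with illustrative ohmic losses.
losses = [3.0, 1.0, 2.0]
Q = build_qubo(losses, penalty=10.0)
solution, energy = brute_force_minimum(Q)
```

The penalty must be large enough that violating the constraint always costs more than any achievable saving in the objective; a real solver or annealer would replace the brute-force enumeration.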
-
Distributed Allocation and Scheduling of Tasks with Cross-Schedule Dependencies for Heterogeneous Multi-Robot Teams
Authors:
Barbara Arbanas Ferreira,
Tamara Petrović,
Matko Orsag,
J. Ramiro Martínez-de-Dios,
Stjepan Bogdan
Abstract:
To enable safe and efficient use of multi-robot systems in everyday life, a robust and fast method for coordinating their actions must be developed. In this paper, we present a distributed task allocation and scheduling algorithm for missions where the tasks of different robots are tightly coupled with temporal and precedence constraints. The approach is based on representing the problem as a variant of the vehicle routing problem, and the solution is found using a distributed metaheuristic algorithm based on evolutionary computation (CBM-pop). Such an approach allows a fast and near-optimal allocation and can therefore be used for online replanning in case of task changes. Simulation results show that the approach has better computational speed and scalability without loss of optimality compared to the state-of-the-art distributed methods. An application of the planning procedure to a practical use case of a greenhouse maintained by a multi-robot system is given.
Submitted 7 September, 2021;
originally announced September 2021.
-
On Applying the Lackadaisical Quantum Walk Algorithm to Search for Multiple Solutions on Grids
Authors:
Jonathan H. A. de Carvalho,
Luciano S. de Souza,
Fernando M. de Paula Neto,
Tiago A. E. Ferreira
Abstract:
Quantum computing promises to improve the information processing power to levels unreachable by classical computation. Quantum walks are heading the development of quantum algorithms for searching information on graphs more efficiently than their classical counterparts. A quantum-walk-based algorithm standing out in the literature is the lackadaisical quantum walk. The lackadaisical quantum walk is an algorithm developed to search graph structures whose vertices have a self-loop of weight $l$. This paper addresses several issues related to applying the lackadaisical quantum walk to search for multiple solutions on grids successfully. Firstly, we show that only one of the two stopping conditions found in the literature is suitable for simulations. We also demonstrate that the final success probability depends on both the space density of solutions and the relative distance between solutions. Furthermore, this work generalizes the lackadaisical quantum walk to search for multiple solutions on grids of arbitrary dimensions. In addition, we propose an optimal adjustment of the self-loop weight $l$ for such $d$-dimensional grids. It turns out other fits of $l$ found in the literature are particular cases. Finally, we observe a two-to-one relation between the steps of the lackadaisical quantum walk and Grover's algorithm, which requires modifications in the stopping condition. In conclusion, this work deals with practical issues one should consider when applying the lackadaisical quantum walk, besides expanding the technique to a broader range of search problems.
Submitted 9 January, 2023; v1 submitted 11 June, 2021;
originally announced June 2021.
-
Performance Analysis of a Foreground Segmentation Neural Network Model
Authors:
Joel Tomás Morais,
António Ramires Fernandes,
André Leite Ferreira,
Bruno Faria
Abstract:
In recent years the interest in segmentation has been growing, being used in a wide range of applications such as fraud detection, anomaly detection in public health and intrusion detection. We present an ablation study of FgSegNet_v2, analysing its three stages: (i) Encoder, (ii) Feature Pooling Module and (iii) Decoder. The result of this study is a proposal of a variation of the aforementioned method that surpasses state of the art results. Three datasets are used for testing: CDNet2014, SBI2015 and CityScapes. In CDNet2014 we got an overall improvement compared to the state of the art, mainly in the LowFrameRate subset. The presented approach is promising as it produces comparable results with the state of the art (SBI2015 and Cityscapes datasets) in very different conditions, such as different lighting conditions.
Submitted 25 May, 2021;
originally announced May 2021.
-
On The Gap Between Software Maintenance Theory and Practitioners' Approaches
Authors:
Mívian Ferreira,
Mariza Bigonha,
Kecia A. M. Ferreira
Abstract:
The way practitioners perform maintenance tasks in practice is little known by researchers. In turn, practitioners are not always up to date with the proposals provided by the research community. This work investigates the gap between software maintenance techniques proposed by the research community and the software maintenance practice. We carried out a survey with 112 practitioners from 92 companies and 12 countries. We concentrate on analyzing if and how practitioners understand and apply the following subjects: bad smells, refactoring, software metrics, and change impact analysis. This study shows that there is a large gap between research approaches and industry practice in those subjects, especially in change impact analysis and software metrics.
Submitted 8 April, 2021;
originally announced April 2021.
-
NemaNet: A convolutional neural network model for identification of nematodes soybean crop in brazil
Authors:
Andre da Silva Abade,
Lucas Faria Porto,
Paulo Afonso Ferreira,
Flavio de Barros Vidal
Abstract:
Phytoparasitic nematodes (or phytonematodes) are causing severe damage to crops and generating large-scale economic losses worldwide. In soybean crops, annual losses are estimated at 10.6% of world production. Moreover, identifying these species through microscopic analysis by an expert with taxonomy knowledge is often laborious, time-consuming, and susceptible to failure. From this perspective, robust and automatic approaches to identifying phytonematodes are necessary, capable of providing correct diagnoses for the classification of species and of supporting all control and prevention measures. This work presents a new public data set called NemaDataset containing 3,063 microscopic images from five nematode species with the most significant damage relevance for the soybean crop. Additionally, we propose a new Convolutional Neural Network (CNN) model defined as NemaNet and a comparative assessment with thirteen popular CNN models, all of them representing the state of the art in classification and recognition. In from-scratch training, the NemaNet model reached an average accuracy of 96.99%, while the best evaluation fold reached 98.03%. In training with transfer learning, the average accuracy reached 98.88% and the best evaluation fold reached 99.34%. Compared to other popular models, NemaNet achieves an overall accuracy improvement of over 6.83% and 4.1% for from-scratch and transfer learning training, respectively.
Submitted 5 March, 2021;
originally announced March 2021.
-
VIPPrint: A Large Scale Dataset of Printed and Scanned Images for Synthetic Face Images Detection and Source Linking
Authors:
Anselmo Ferreira,
Ehsan Nowroozi,
Mauro Barni
Abstract:
The possibility of carrying out a meaningful forensics analysis on printed and scanned images plays a major role in many applications. First of all, printed documents are often associated with criminal activities, such as terrorist plans, child pornography pictures, and even fake packages. Additionally, printing and scanning can be used to hide the traces of image manipulation or the synthetic nature of images, since the artifacts commonly found in manipulated and synthetic images are gone after the images are printed and scanned. A problem hindering research in this area is the lack of large scale reference datasets to be used for algorithm development and benchmarking. Motivated by this issue, we present a new dataset composed of a large number of synthetic and natural printed face images. To highlight the difficulties associated with the analysis of the images of the dataset, we carried out an extensive set of experiments comparing several printer attribution methods. We also verified that state-of-the-art methods to distinguish natural and synthetic face images fail when applied to print and scanned images. We envision that the availability of the new dataset and the preliminary experiments we carried out will motivate and facilitate further research in this area.
Submitted 1 February, 2021;
originally announced February 2021.
-
A New Artificial Neuron Proposal with Trainable Simultaneous Local and Global Activation Function
Authors:
Tiago A. E. Ferreira,
Marios Mattheakis,
Pavlos Protopapas
Abstract:
The activation function plays a fundamental role in the artificial neural network learning process. However, there is no obvious choice or procedure to determine the best activation function, which depends on the problem. This study proposes a new artificial neuron, named the global-local neuron, with a trainable activation function composed of two components, one global and one local. The global component is a mathematical function that describes a general feature present across the whole problem domain. The local component is a function that can represent a localized behavior, such as a transient or a perturbation. This new neuron can define the importance of each activation function component in the learning phase. Depending on the problem, it results in a purely global, a purely local, or a mixed global and local activation function after the training phase. Here, the trigonometric sine function was employed for the global component and the hyperbolic tangent for the local component. The proposed neuron was tested on problems where the target was a purely global function, a purely local function, or a composition of global and local functions. Two classes of test problems were investigated: regression problems and the solving of differential equations. The experimental tests demonstrated the Global-Local Neuron network's superior performance compared with simple neural networks with a sine or hyperbolic tangent activation function, and with a hybrid network that combines these two simple neural networks.
Submitted 15 January, 2021;
originally announced January 2021.
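The global-local neuron described in the abstract above can be illustrated with a minimal sketch. The abstract states only that a sine (global) and a hyperbolic tangent (local) component are combined with trainable importance; the exact parameterization below (a shared pre-activation `z` and two scalar weights `alpha` and `beta`) is an assumption made for illustration, not the authors' formulation.

```python
import math


def global_local_neuron(x, w, b, alpha, beta):
    """Forward pass of one hypothetical global-local neuron.

    x, w  : input vector and weight vector (same length)
    b     : bias
    alpha : importance of the global (sine) component, assumed trainable
    beta  : importance of the local (tanh) component, assumed trainable
    """
    # Shared pre-activation (an assumption; the paper may use separate ones).
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    # Weighted mix of the global and local activation components.
    return alpha * math.sin(z) + beta * math.tanh(z)


# alpha=1, beta=0 gives a purely global neuron; alpha=0, beta=1 a purely local one.
purely_global = global_local_neuron([0.5], [2.0], 0.0, alpha=1.0, beta=0.0)
purely_local = global_local_neuron([0.5], [2.0], 0.0, alpha=0.0, beta=1.0)
```

In a trained network, `alpha` and `beta` would be learned together with `w` and `b`, which is how the neuron could settle on a purely global, purely local, or mixed activation.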
-
Towards fast machine-learning-assisted Bayesian posterior inference of microseismic event location and source mechanism
Authors:
Davide Piras,
Alessio Spurio Mancini,
Ana M. G. Ferreira,
Benjamin Joachimi,
Michael P. Hobson
Abstract:
Bayesian inference applied to microseismic activity monitoring allows the accurate location of microseismic events from recorded seismograms and the estimation of the associated uncertainties. However, the forward modelling of these microseismic events, which is necessary to perform Bayesian source inversion, can be prohibitively expensive in terms of computational resources. A viable solution is to train a surrogate model based on machine learning techniques, to emulate the forward model and thus accelerate Bayesian inference. In this paper, we substantially enhance previous work, which considered only sources with isotropic moment tensors. We train a machine learning algorithm on the power spectrum of the recorded pressure wave and show that the trained emulator allows complete and fast event locations for $\textit{any}$ source mechanism. Moreover, we show that our approach is computationally inexpensive, as it can be run in less than 1 hour on a commercial laptop, while yielding accurate results using less than $10^4$ training seismograms. We additionally demonstrate how the trained emulators can be used to identify the source mechanism through the estimation of the Bayesian evidence. Finally, we demonstrate that our approach is robust to real noise as measured in field data. This work lays the foundations for efficient, accurate future joint determinations of event location and moment tensor, and associated uncertainties, which are ultimately key for accurately characterising human-induced and natural earthquakes, and for enhanced quantitative seismic hazard assessments.
Submitted 28 October, 2022; v1 submitted 12 January, 2021;
originally announced January 2021.
-
Localizacao em ambientes internos utilizando redes Wi-Fi
Authors:
David Alan de Oliveira Ferreira,
Celso Barbosa Carvalho,
Edjair de Souza Mota
Abstract:
This paper presents a localization method for indoor environments that improves the location accuracy otherwise hampered by RSSI instability in IEEE 802.11 networks. The method employs the k-Nearest Neighbors (kNN) algorithm and quartile analysis in the data representation. The proposal achieved zero error with only four APs and 10 readings per sample from each AP, taking just 0.69 seconds to locate. These values are important contributions, confirming that the method is promising for locating objects in indoor environments.
Submitted 29 December, 2020;
originally announced December 2020.
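As a rough illustration of the idea in the abstract above, the sketch below summarizes the RSSI readings of each AP by their quartiles and classifies a query sample with kNN. The feature layout (three quartile cut points per AP), the Euclidean distance, and the majority vote are assumptions for illustration; the abstract does not specify these details, and the names `quartile_features` and `knn_locate` are hypothetical.

```python
import math
from collections import Counter
from statistics import quantiles


def quartile_features(readings_per_ap):
    """Turn a list of per-AP RSSI reading lists into one feature vector.

    Each AP contributes its three quartile cut points (Q1, median, Q3),
    which damps the effect of individual unstable readings.
    """
    features = []
    for readings in readings_per_ap:
        features.extend(quantiles(readings, n=4))
    return features


def knn_locate(train, query, k=3):
    """Return the majority location label among the k nearest samples.

    train : list of (feature_vector, location_label) pairs
    query : feature vector built with quartile_features
    """
    ranked = sorted(train, key=lambda item: math.dist(item[0], query))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Toy fingerprints for two rooms seen from two APs (RSSI in dBm, made up).
train = [
    (quartile_features([[-40, -41, -42, -40], [-70, -71, -72, -70]]), "room A"),
    (quartile_features([[-70, -71, -72, -70], [-40, -41, -42, -40]]), "room B"),
]
query = quartile_features([[-41, -42, -40, -41], [-69, -71, -70, -72]])
location = knn_locate(train, query, k=1)
```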
-
Cluster structure of optimal solutions in bipartitioning of small worlds
Authors:
Adam Lipowski,
Antonio L. Ferreira,
Dorota Lipowska
Abstract:
Using simulated annealing, we examine a bipartitioning of small worlds obtained by adding a fraction of randomly chosen links to a one-dimensional chain or a square lattice. Models defined on small worlds typically exhibit a mean-field behaviour, regardless of the underlying lattice. Our work demonstrates that the bipartitioning of small worlds does depend on the underlying lattice. Simulations show that for one-dimensional small worlds, optimal partitions are finite size clusters for any fraction of additional links. In the two-dimensional case, we observe two regimes: when the fraction of additional links is sufficiently small, the optimal partitions have a stripe-like shape, which is lost for a larger number of additional links as optimal partitions become disordered. Some arguments, which interpret additional links as thermal excitations and refer to the thermodynamics of Ising models, suggest a qualitative explanation of such behaviour. The histogram of overlaps suggests that replica symmetry is broken in a one-dimensional small world. In the two-dimensional case, the replica symmetry seems to hold but with some additional degeneracy of stripe-like partitions.
Submitted 19 November, 2020;
originally announced November 2020.
-
Reinforced Deep Markov Models With Applications in Automatic Trading
Authors:
Tadeu A. Ferreira
Abstract:
Inspired by the developments in deep generative models, we propose a model-based RL approach, coined Reinforced Deep Markov Model (RDMM), designed to integrate desirable properties of a reinforcement learning algorithm acting as an automatic trading system. The network architecture allows for the possibility that market dynamics are partially visible and are potentially modified by the agent's actions. The RDMM filters incomplete and noisy data, to create better-behaved input data for RL planning. The policy search optimisation also properly accounts for state uncertainty. Due to the complexity of the RKDF model architecture, we performed ablation studies to understand the contributions of individual components of the approach better. To test the financial performance of the RDMM we implement policies using variants of Q-Learning, DynaQ-ARIMA and DynaQ-LSTM algorithms. The experiments show that the RDMM is data-efficient and provides financial gains compared to the benchmarks in the optimal execution problem. The performance improvement becomes more pronounced when price dynamics are more complex, and this has been demonstrated using real data sets from the limit order book of Facebook, Intel, Vodafone and Microsoft.
Submitted 9 November, 2020;
originally announced November 2020.
-
CAPTION: Correction by Analyses, POS-Tagging and Interpretation of Objects using only Nouns
Authors:
Leonardo Anjoletto Ferreira,
Douglas De Rizzo Meneghetti,
Paulo Eduardo Santos
Abstract:
Recently, Deep Learning (DL) methods have shown excellent performance in image captioning and visual question answering. However, despite their performance, DL methods do not learn the semantics of the words that are being used to describe a scene, making it difficult to spot incorrect words used in captions or to interchange words that have similar meanings. This work proposes a combination of DL methods for object detection and natural language processing to validate image captions. We test our method on the FOIL-COCO data set, since it provides correct and incorrect captions for various images using only objects represented in the MS-COCO image data set. Results show that our method has a good overall performance, in some cases similar to human performance.
Submitted 2 October, 2020;
originally announced October 2020.
-
Plant Diseases recognition on images using Convolutional Neural Networks: A Systematic Review
Authors:
Andre S. Abade,
Paulo Afonso Ferreira,
Flavio de Barros Vidal
Abstract:
Plant diseases are considered one of the main factors influencing food production; to minimize production losses, it is essential that crop diseases be detected and recognized quickly. The recent expansion of deep learning methods has found application in plant disease detection, offering a robust tool with highly accurate results. In this context, this work presents a systematic review of the literature that aims to identify the state of the art in the use of convolutional neural networks (CNN) for the identification and classification of plant diseases, delimiting trends and indicating gaps. To this end, we present 121 papers selected from the last ten years with different approaches to aspects related to disease detection, characteristics of the data set, and the crops and pathogens investigated. From the results of the systematic review, it is possible to understand the innovative trends regarding the use of CNNs in the identification of plant diseases and to identify the gaps that need the attention of the research community.
Submitted 9 September, 2020;
originally announced September 2020.
-
Refining Network Intents for Self-Driving Networks
Authors:
Arthur Selle Jacobs,
Ricardo José Pfitscher,
Ronaldo Alves Ferreira,
Lisandro Zambenedetti Granville
Abstract:
Recent advances in artificial intelligence (AI) offer an opportunity for the adoption of self-driving networks. However, network operators or home-network users still do not have the right tools to exploit these new advancements in AI, since they have to rely on low-level languages to specify network policies. Intent-based networking (IBN) allows operators to specify high-level policies that dictate how the network should behave without worrying how they are translated into configuration commands in the network devices. However, the existing research proposals for IBN fail to exploit the knowledge and feedback from the network operator to validate or improve the translation of intents. In this paper, we introduce a novel intent-refinement process that uses machine learning and feedback from the operator to translate the operator's utterances into network configurations. Our refinement process uses a sequence-to-sequence learning model to extract intents from natural language and the feedback from the operator to improve learning. The key insight of our process is an intermediate representation that resembles natural language that is suitable to collect feedback from the operator but is structured enough to facilitate precise translations. Our prototype interacts with a network operator using natural language and translates the operator input to the intermediate representation before translating to SDN rules. Our experimental results show that our process achieves a correlation coefficient squared (i.e., R-squared) of 0.99 for a dataset with 5000 entries and the operator feedback significantly improves the accuracy of our model.
Submitted 12 August, 2020;
originally announced August 2020.
-
Deep Dense and Convolutional Autoencoders for Unsupervised Anomaly Detection in Machine Condition Sounds
Authors:
Alexandrine Ribeiro,
Luis Miguel Matos,
Pedro Jose Pereira,
Eduardo C. Nunes,
Andre L. Ferreira,
Paulo Cortez,
Andre Pilastri
Abstract:
This technical report describes two methods that were developed for Task 2 of the DCASE 2020 challenge. The challenge involves unsupervised learning to detect anomalous sounds; thus, only normal machine working condition samples are available during the training process. The two methods involve deep autoencoders, based on dense and convolutional architectures, that use mel-spectrogram processed sound features. Experiments were conducted using the six machine type datasets of the challenge. Overall, competitive results were achieved by the proposed dense and convolutional AE, outperforming the baseline challenge method.
Submitted 19 June, 2020; v1 submitted 18 June, 2020;
originally announced June 2020.
-
Applying Genetic Programming to Improve Interpretability in Machine Learning Models
Authors:
Leonardo Augusto Ferreira,
Frederico Gadelha Guimarães,
Rodrigo Silva
Abstract:
Explainable Artificial Intelligence (or xAI) has become an important research topic in the fields of Machine Learning and Deep Learning. In this paper, we propose a Genetic Programming (GP) based approach, named Genetic Programming Explainer (GPX), to the problem of explaining decisions computed by AI systems. The method generates a noise set located in the neighborhood of the point of interest, whose prediction should be explained, and fits a local explanation model for the analyzed sample. The tree structure generated by GPX provides a comprehensible, analytical, possibly non-linear symbolic expression which reflects the local behavior of the complex model. We considered three machine learning techniques that can be recognized as complex black-box models: Random Forest, Deep Neural Network and Support Vector Machine, on twenty data sets for regression and classification problems. Our results indicate that GPX is able to produce a more accurate understanding of complex models than the state of the art. The results validate the proposed approach as a novel way to deploy GP to improve interpretability.
Submitted 18 May, 2020;
originally announced May 2020.
-
Uncovering Spatiotemporal and Semantic Aspects of Tourists Mobility Using Social Sensing
Authors:
Ana P G Ferreira,
Thiago H Silva,
Antonio A F Loureiro
Abstract:
Tourism favors more economic activities, employment, revenues and plays a significant role in development; thus, the improvement of this activity is a strategic task. In this work, we show how social sensing can be used to understand the key characteristics of the behavior of tourists and residents. We observe distinct behavioral patterns in those classes, considering the spatial and temporal dimensions, where cultural and regional aspects might play an important role. Besides, we investigate how tourists move and the factors that influence their movements in London, New York, Rio de Janeiro and Tokyo. In addition, we propose a new approach based on a topic model that enables the automatic identification of mobility pattern themes, ultimately leading to a better understanding of users' profiles. The applicability of our results is broad, helping to provide better applications and services in the tourism segment.
Submitted 18 May, 2020;
originally announced May 2020.
-
Gravitational Wave Detection and Information Extraction via Neural Networks
Authors:
Gerson R. Santos,
Marcela P. Figueiredo,
Antonio de Pádua Santos,
Pavlos Protopapas,
Tiago A. E. Ferreira
Abstract:
The Laser Interferometer Gravitational-Wave Observatory (LIGO) was the first laboratory to measure gravitational waves. An exceptional experimental design was needed to measure distance changes much smaller than the radius of a proton. Likewise, the data analysis needed to confirm the detection and extract information is a tremendously hard task. Here, a computational procedure based on artificial neural networks is presented to detect a gravitational wave event and extract its ring-down time from the LIGO data. With this proposal, it is possible to build a probabilistic thermometer for gravitational wave detection and obtain physical information about the astronomical body system that created the phenomenon. Here, the ring-down time is determined by a direct data measurement, without the need for numerical relativity techniques and high computational power.
Submitted 22 March, 2020;
originally announced March 2020.
-
Wine quality rapid detection using a compact electronic nose system: application focused on spoilage thresholds by acetic acid
Authors:
Juan C. Rodriguez Gamboa,
Eva Susana Albarracin E.,
Adenilton J. da Silva,
Luciana Leite,
Tiago A. E. Ferreira
Abstract:
It is crucial for the wine industry to have methods such as electronic nose systems (E-Noses) for real-time monitoring of acetic acid thresholds in wines, preventing spoilage or determining quality. In this paper, we demonstrate that our portable and compact self-developed E-Nose, based on thin film semiconductor (SnO2) sensors and trained with an approach that uses a deep Multilayer Perceptron (MLP) neural network, can perform early detection of wine spoilage thresholds in routine wine quality control tasks. To obtain rapid and online detection, we propose a rising-window method focused on raw data processing to find an early portion of the sensor signals with the best recognition performance. Our approach was compared with the conventional approach employed in E-Noses for gas recognition, which involves feature extraction and selection techniques for preprocessing data, followed by a Support Vector Machine (SVM) classifier. The results show that it is possible to classify three wine spoilage levels in 2.7 seconds after the gas injection point, making the methodology 63 times faster than the conventional approach in our experimental setup.
Submitted 16 January, 2020;
originally announced January 2020.
-
Models under which random forests perform badly; consequences for applications
Authors:
José A. Ferreira
Abstract:
We give examples of data-generating models under which Breiman's random forest may be extremely slow to converge to the optimal predictor or even fail to be consistent. The evidence provided for these properties is based on mostly intuitive arguments, similar to those used earlier with simpler examples, and on numerical experiments. Although one can always choose models under which random forests perform very badly, we show that simple methods based on statistics of `variable use' and `variable importance' can often be used to construct a much better predictor based on a `many-armed' random forest obtained by forcing initial splits on variables which the default version of the algorithm tends to ignore.
Submitted 30 November, 2021; v1 submitted 2 October, 2019;
originally announced October 2019.
-
Heuristics, Answer Set Programming and Markov Decision Process for Solving a Set of Spatial Puzzles
Authors:
Thiago Freitas dos Santos,
Paulo E. Santos,
Leonardo A. Ferreira,
Reinaldo A. C. Bianchi,
Pedro Cabalar
Abstract:
Spatial puzzles composed of rigid objects, flexible strings and holes offer interesting domains for reasoning about spatial entities that are common in everyday human activities. The goal of this work is to investigate the automated solution of this kind of puzzle by adapting an algorithm that combines Answer Set Programming (ASP) with Markov Decision Process (MDP), oASP(MDP), to use heuristics that accelerate the learning process. ASP is applied to represent the domain as an MDP, while a Reinforcement Learning algorithm (Q-Learning) is used to find the optimal policies. In this work, the heuristics were obtained from the solution of relaxed versions of the puzzles. Experiments were performed on deterministic, non-deterministic and non-stationary versions of the puzzles. Results show that the proposed approach can accelerate the learning process, presenting an advantage when compared to the non-heuristic versions of oASP(MDP) and Q-Learning.
Submitted 15 February, 2019;
originally announced March 2019.
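For readers unfamiliar with the Q-Learning component mentioned in the abstract above, the following is a minimal sketch of the standard tabular Q-Learning update. It covers only the textbook update rule; how oASP(MDP) builds the MDP with ASP and how the heuristics bias learning are not described in the abstract and are not modeled here. The state and action names are hypothetical.

```python
from collections import defaultdict


def q_learning_update(q, state, action, reward, next_state, actions,
                      lr=0.1, gamma=0.9):
    """Apply one tabular Q-Learning update and return the new Q-value.

    q       : mapping from (state, action) to estimated return
    actions : actions available in next_state
    lr      : learning rate
    gamma   : discount factor
    """
    # Value of the best action available in the next state.
    best_next = max(q[(next_state, a)] for a in actions)
    # Temporal-difference target and in-place update of the table.
    td_target = reward + gamma * best_next
    q[(state, action)] += lr * (td_target - q[(state, action)])
    return q[(state, action)]


# Unseen (state, action) pairs default to a Q-value of 0.0.
q_table = defaultdict(float)
first = q_learning_update(q_table, "s0", "pull", 1.0, "s1",
                          ["pull", "push"], lr=0.5, gamma=0.9)
```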