-
Offshore Wind Plant Instance Segmentation Using Sentinel-1 Time Series, GIS, and Semantic Segmentation Models
Authors:
Osmar Luiz Ferreira de Carvalho,
Osmar Abilio de Carvalho Junior,
Anesmar Olino de Albuquerque,
Daniel Guerreiro e Silva
Abstract:
Offshore wind farms represent a renewable energy source with a significant global growth trend, and their monitoring is strategic for territorial and environmental planning. This study's primary objective is to detect offshore wind plants at an instance level using semantic segmentation models and Sentinel-1 time series. The secondary objectives are: (a) to develop a database consisting of labeled data and S-1 time series; (b) to compare the performance of five deep semantic segmentation architectures (U-Net, U-Net++, Feature Pyramid Network - FPN, DeepLabv3+, and LinkNet); (c) to develop a novel augmentation strategy that shuffles the positions of the images within the time series; (d) to investigate different dimensions of time series intervals (1, 5, 10, and 15 images); and (e) to evaluate the semantic-to-instance conversion procedure. LinkNet was the top-performing model, followed by U-Net++ and U-Net, while FPN and DeepLabv3+ presented the worst results. The evaluation of semantic segmentation models reveals enhanced Intersection over Union (IoU) (25%) and F-score metrics (18%) with the augmentation of time series images. The study showcases the augmentation strategy's capability to mitigate biases and precisely detect invariant targets. Furthermore, the conversion from semantic to instance segmentation demonstrates its efficacy in accurately isolating individual instances within classified regions - simplifying training data and reducing annotation effort and complexity.
Submitted 14 December, 2023;
originally announced December 2023.
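The augmentation strategy named in the abstract above, shuffling the positions of the images within the time series, can be illustrated with a minimal sketch. The function name `shuffle_time_series` and the list-of-frames representation are assumptions made here for illustration; the abstract does not describe the authors' implementation. The idea is that targets which do not change over time (such as turbine foundations) look the same under any frame order, so permuting frames yields new training samples with unchanged labels.

```python
import random
from typing import List, Sequence, TypeVar

Frame = TypeVar("Frame")


def shuffle_time_series(frames: Sequence[Frame], rng: random.Random) -> List[Frame]:
    """Return a copy of the time series with the frame order randomly permuted.

    Hypothetical helper: each element stands for one Sentinel-1 acquisition
    (for example a 2D array); only the order along the time axis changes.
    """
    shuffled = list(frames)  # copy so the caller's stack is left untouched
    rng.shuffle(shuffled)    # in-place permutation of the copy
    return shuffled


# Placeholder frame identifiers instead of real image arrays.
stack = ["t0", "t1", "t2", "t3", "t4"]
augmented = shuffle_time_series(stack, random.Random(0))
```

Passing an explicit `random.Random` instance keeps the augmentation reproducible when a seed is fixed, which is a design choice of this sketch rather than something stated in the source.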
-
Neural-based Compression Scheme for Solar Image Data
Authors:
Ali Zafari,
Atefeh Khoshkhahtinat,
Jeremy A. Grajeda,
Piyush M. Mehta,
Nasser M. Nasrabadi,
Laura E. Boucheron,
Barbara J. Thompson,
Michael S. F. Kirk,
Daniel da Silva
Abstract:
Studying the solar system and especially the Sun relies on the data gathered daily from space missions. These missions are data-intensive and compressing this data to make them efficiently transferable to the ground station is a twofold decision to make. Stronger compression methods, by distorting the data, can increase data throughput at the cost of accuracy which could affect scientific analysis of the data. On the other hand, preserving subtle details in the compressed data requires a high amount of data to be transferred, reducing the desired gains from compression. In this work, we propose a neural network-based lossy compression method to be used in NASA's data-intensive imagery missions. We chose NASA's SDO mission which transmits 1.4 terabytes of data each day as a proof of concept for the proposed algorithm. In this work, we propose an adversarially trained neural network, equipped with local and non-local attention modules to capture both the local and global structure of the image resulting in a better trade-off in rate-distortion (RD) compared to conventional hand-engineered codecs. The RD variational autoencoder used in this work is jointly trained with a channel-dependent entropy model as a shared prior between the analysis and synthesis transforms to make the entropy coding of the latent code more effective. Our neural image compression algorithm outperforms currently-in-use and state-of-the-art codecs such as JPEG and JPEG-2000 in terms of the RD performance when compressing extreme-ultraviolet (EUV) data. As a proof of concept for use of this algorithm in SDO data analysis, we have performed coronal hole (CH) detection using our compressed images, and generated consistent segmentations, even at a compression rate of $\sim0.1$ bits per pixel (compared to 8 bits per pixel on the original data) using EUV data from SDO.
Submitted 5 November, 2023;
originally announced November 2023.
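The rates quoted in the abstract above (about 0.1 bits per pixel after compression versus 8 bits per pixel originally) translate into a compression ratio by simple division. The sketch below works through that arithmetic; the function names and the 4096 x 4096 image size are assumptions chosen for illustration and are not taken from the abstract.

```python
def compression_ratio(original_bpp: float, compressed_bpp: float) -> float:
    """Ratio of original to compressed rate, both in bits per pixel."""
    if original_bpp <= 0 or compressed_bpp <= 0:
        raise ValueError("bit rates must be positive")
    return original_bpp / compressed_bpp


def image_size_bytes(width: int, height: int, bpp: float) -> float:
    """Size in bytes of a width x height image stored at the given bits per pixel."""
    return width * height * bpp / 8


# 8 bpp down to roughly 0.1 bpp is a ratio of about 80.
ratio = compression_ratio(8.0, 0.1)

# Assumed image size (hypothetical, for illustration only).
raw_bytes = image_size_bytes(4096, 4096, 8.0)
compressed_bytes = image_size_bytes(4096, 4096, 0.1)
```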
-
Multi-spectral Entropy Constrained Neural Compression of Solar Imagery
Authors:
Ali Zafari,
Atefeh Khoshkhahtinat,
Piyush M. Mehta,
Nasser M. Nasrabadi,
Barbara J. Thompson,
Michael S. F. Kirk,
Daniel da Silva
Abstract:
Missions studying the dynamic behaviour of the Sun are defined to capture multi-spectral images of the Sun and transmit them to the ground station on a daily basis. To make transmission efficient and feasible, image compression systems need to be exploited. Recently, successful end-to-end optimized neural network-based image compression systems have shown great potential to be used in an ad-hoc manner. In this work, we propose a transformer-based multi-spectral neural image compressor to efficiently capture both intra- and inter-wavelength redundancies. To unleash the locality of the window-based self-attention mechanism, we propose an inter-window aggregated token multi-head self-attention. Additionally, to make the neural compressor autoencoder shift invariant, a randomly shifted window attention mechanism is used, which makes the transformer blocks insensitive to translations in their input domain. We demonstrate that the proposed approach not only outperforms conventional compression algorithms but also better decorrelates images across the multiple wavelengths than single-spectral compression.
Submitted 10 October, 2023; v1 submitted 19 September, 2023;
originally announced September 2023.
-
Context-Aware Neural Video Compression on Solar Dynamics Observatory
Authors:
Atefeh Khoshkhahtinat,
Ali Zafari,
Piyush M. Mehta,
Nasser M. Nasrabadi,
Barbara J. Thompson,
Michael S. F. Kirk,
Daniel da Silva
Abstract:
NASA's Solar Dynamics Observatory (SDO) mission collects large data volumes of the Sun's daily activity. Data compression is crucial for space missions to reduce data storage and video bandwidth requirements by eliminating redundancies in the data. In this paper, we present a novel neural Transformer-based video compression approach specifically designed for the SDO images. Our primary objective is to efficiently exploit the temporal and spatial redundancies inherent in solar images to obtain a high compression ratio. Our proposed architecture benefits from a novel Transformer block called Fused Local-aware Window (FLaWin), which incorporates window-based self-attention modules and an efficient fused local-aware feed-forward (FLaFF) network. This architectural design allows us to simultaneously capture short-range and long-range information while facilitating the extraction of rich and diverse contextual representations. Moreover, this design choice results in reduced computational complexity. Experimental results demonstrate the significant contribution of the FLaWin Transformer block to the compression performance, outperforming conventional hand-engineered video codecs such as H.264 and H.265 in terms of rate-distortion trade-off.
Submitted 19 September, 2023;
originally announced September 2023.
-
Does pre-training on brain-related tasks results in better deep-learning-based brain age biomarkers?
Authors:
Bruno Machado Pacheco,
Victor Hugo Rocha de Oliveira,
Augusto Braga Fernandes Antunes,
Saulo Domingos de Souza Pedro,
Danilo Silva
Abstract:
Brain age prediction using neuroimaging data has shown great potential as an indicator of overall brain health and successful aging, as well as a disease biomarker. Deep learning models have been established as reliable and efficient brain age estimators, being trained to predict the chronological age of healthy subjects. In this paper, we investigate the impact of a pre-training step on deep learning models for brain age prediction. More precisely, instead of the common approach of pre-training on natural imaging classification, we propose pre-training the models on brain-related tasks, which led to state-of-the-art results in our experiments on ADNI data. Furthermore, we validate the resulting brain age biomarker on images of patients with mild cognitive impairment and Alzheimer's disease. Interestingly, our results indicate that better-performing deep learning models in terms of brain age prediction on healthy patients do not result in more reliable biomarkers.
Submitted 11 July, 2023;
originally announced July 2023.
-
Turbidity Control in Sedimentation Columns by Direction Dependent Models
Authors:
Jesus-Pablo Toledo-Zucco,
Daniel Sbarbaro,
Joao Manoel Gomes da Silva Jr
Abstract:
Sedimentation is a crucial phenomenon in recovering water from slurries by separating solids from liquid. Thickeners and sedimentation columns are equipment widely used in the process industry to reclaim water from process slurries. This contribution addresses the problem of controlling the turbidity of the recovered water in a sedimentation column by manipulating the underflow. The phenomenological model describing the turbidity is too complex to be used in a control strategy, and it is difficult to identify its parameters using plant measurements. This work proposes an empirical piece-wise time-delay model for modeling the turbidity at the top of the column to circumvent these problems. A systematic design procedure is developed to tune a Proportional Integral controller guaranteeing closed-loop stability for systems modeled as a piece-wise time-delay model. Experiments in a pilot plant validate the theoretical results and illustrate the control performance under various operational scenarios.
Submitted 1 July, 2024; v1 submitted 29 May, 2023;
originally announced May 2023.
-
eXplainable Artificial Intelligence on Medical Images: A Survey
Authors:
Matteus Vargas Simão da Silva,
Rodrigo Reis Arrais,
Jhessica Victoria Santos da Silva,
Felipe Souza Tânios,
Mateus Antonio Chinelatto,
Natalia Backhaus Pereira,
Renata De Paris,
Lucas Cesar Ferreira Domingos,
Rodrigo Dória Villaça,
Vitor Lopes Fabris,
Nayara Rossi Brito da Silva,
Ana Claudia Akemi Matsuki de Faria,
Jose Victor Nogueira Alves da Silva,
Fabiana Cristina Queiroz de Oliveira Marucci,
Francisco Alves de Souza Neto,
Danilo Xavier Silva,
Vitor Yukio Kondo,
Claudio Filipi Gonçalves dos Santos
Abstract:
Over the last few years, the number of works about deep learning applied to the medical field has increased enormously. A rigorous assessment of these models is required to explain their results to all people involved in medical exams. A recent field in the machine learning area is explainable artificial intelligence, also known as XAI, which aims to explain the results of such black box models to permit the desired assessment. This survey analyses several recent studies in the XAI field applied to medical diagnosis research, allowing some explainability of the machine learning results in several different diseases, such as cancers and COVID-19.
Submitted 12 May, 2023;
originally announced May 2023.
-
Computing Functions Over-the-Air Using Digital Modulations
Authors:
Saeed Razavikia,
Jose Mairton Barros da Silva Jr,
Carlo Fischione
Abstract:
Over-the-air computation (AirComp) is a known technique in which wireless devices transmit values by analog amplitude modulation so that a function of these values is computed over the communication channel at a common receiver. The physical reason is the superposition properties of the electromagnetic waves, which naturally return sums of analog values. Consequently, the applications of AirComp are almost entirely restricted to analog communication systems. However, the use of digital communications for over-the-air computations would have several benefits, such as error correction, synchronization, acquisition of channel state information, and easier adoption by current digital communication systems. Nevertheless, a common belief is that digital modulations are generally unfeasible for computation tasks because the overlapping of digitally modulated signals returns signals that seem to be meaningless for these tasks. This paper breaks through such a belief and proposes a fundamentally new computing method, named ChannelComp, for performing over-the-air computations by any digital modulation. In particular, we propose digital modulation formats that allow us to compute a wider class of functions than AirComp can compute, and we propose a feasibility optimization problem that ascertains the optimal digital modulation for computing functions over-the-air. The simulation results verify the superior performance of ChannelComp in comparison to AirComp, particularly for the product functions, with around 10 dB improvement of the computation error.
Submitted 20 March, 2023; v1 submitted 1 March, 2023;
originally announced March 2023.
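The difficulty the abstract above describes, that overlapping digitally modulated signals may look meaningless to the receiver, can be made concrete with a small check. The sketch below is one reading of the problem and not the paper's method: it assumes a noiseless channel with unit gain, so the receiver sees the plain sum of the transmitted constellation points, and it tests whether any two input combinations that need different function values collide on the same superimposed point. The name `is_computable` and the BPSK mapping are hypothetical choices for illustration.

```python
from itertools import product
from typing import Callable, Dict, Tuple


def is_computable(
    f: Callable[[Tuple[int, ...]], int],
    constellation: Dict[int, complex],
    num_devices: int,
) -> bool:
    """Check whether f can be read off the superimposed signal.

    Assumption: ideal noiseless channel, so the received point is the sum of
    the constellation points sent by all devices. f is recoverable only if
    inputs with different f values never produce the same received point.
    """
    seen: Dict[Tuple[float, float], int] = {}
    for inputs in product(sorted(constellation), repeat=num_devices):
        received = sum(constellation[x] for x in inputs)
        key = (round(received.real, 9), round(received.imag, 9))
        value = f(inputs)
        if key in seen and seen[key] != value:
            return False  # two different outputs collide on one received point
        seen.setdefault(key, value)
    return True


# Hypothetical BPSK mapping: bit 0 -> -1, bit 1 -> +1.
BPSK = {0: complex(-1, 0), 1: complex(1, 0)}

sum_ok = is_computable(lambda xs: sum(xs), BPSK, 2)      # symmetric: no collision
product_ok = is_computable(lambda xs: xs[0] * xs[1], BPSK, 2)
first_ok = is_computable(lambda xs: xs[0], BPSK, 2)      # (0,1) and (1,0) collide
```

With two devices, the inputs (0, 1) and (1, 0) both superimpose to the same point, so a function that distinguishes them (such as "return the first input") cannot be recovered with this mapping, whereas the sum and the product can.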
-
Environment for the Design and Automation of New CDPR Architectures
Authors:
Josue Rivera,
Julio Garrido,
Enrique Riveiro,
Diego Silva
Abstract:
This paper presents a design and automation environment to study the control trajectory for new CDPR architectures, for instance CDPRs with an unusual number of cables or different motor locations in the robot frame. In order to test the environment's capabilities, an architecture of a planar under-constrained CDPR was designed, simulated, and implemented using standard industrial hardware. Both the simulated model and the industrial prototype ran the same trajectories to determine the time delay and the position error between them. The tests have demonstrated that the simulated model of the CDPR reproduces the trajectories of the equivalent industrial prototype with a maximum deviation of 0.35% under loading and different speed conditions, despite the time delays produced by the data transmission and the non-deterministic communication protocols used to connect the industrial automation controller with the simulated model. The results have shown that the environment is suitable for trajectory control and workspace analysis of new CDPR architectures under different dynamic conditions.
Submitted 23 January, 2023;
originally announced January 2023.
-
Towards fully automated deep-learning-based brain tumor segmentation: is brain extraction still necessary?
Authors:
Bruno Machado Pacheco,
Guilherme de Souza e Cassia,
Danilo Silva
Abstract:
State-of-the-art brain tumor segmentation is based on deep learning models applied to multi-modal MRIs. Currently, these models are trained on images after a preprocessing stage that involves registration, interpolation, brain extraction (BE, also known as skull-stripping) and manual correction by an expert. However, for clinical practice, this last step is tedious and time-consuming and, therefore, not always feasible, resulting in skull-stripping faults that can negatively impact the tumor segmentation quality. Still, the extent of this impact has never been measured for any of the many different BE methods available. In this work, we propose an automatic brain tumor segmentation pipeline and evaluate its performance with multiple BE methods. Our experiments show that the choice of a BE method can compromise up to 15.7% of the tumor segmentation performance. Moreover, we propose training and testing tumor segmentation models on non-skull-stripped images, effectively discarding the BE step from the pipeline. Our results show that this approach leads to a competitive performance at a fraction of the time. We conclude that, in contrast to the current paradigm, training tumor segmentation models on non-skull-stripped images can be the best option when high performance in clinical practice is desired.
Submitted 14 December, 2022;
originally announced December 2022.
-
Interpretability Analysis of Deep Models for COVID-19 Detection
Authors:
Daniel Peixoto Pinto da Silva,
Edresson Casanova,
Lucas Rafael Stefanel Gris,
Arnaldo Candido Junior,
Marcelo Finger,
Flaviane Svartman,
Beatriz Raposo,
Marcus Vinícius Moreira Martins,
Sandra Maria Aluísio,
Larissa Cristina Berti,
João Paulo Teixeira
Abstract:
During the outbreak of the COVID-19 pandemic, several research areas joined efforts to mitigate the damage caused by SARS-CoV-2. In this paper, we present an interpretability analysis of a convolutional neural network based model for COVID-19 detection in audio. We investigate which features are important for the model's decision process, examining spectrograms, F0, F0 standard deviation, sex and age. We then analyse model decisions by generating heat maps for the trained models to capture their attention during the decision process. Focusing on an explainable artificial intelligence approach, we show that the studied models can make unbiased decisions even in the presence of spurious data in the training set, given adequate preprocessing steps. Our best model achieves 94.44% accuracy in detection, with results indicating that the models favor spectrograms for the decision process, particularly high-energy areas in the spectrogram related to prosodic domains, while F0 also leads to efficient COVID-19 detection.
Submitted 25 November, 2022;
originally announced November 2022.
-
Federated Learning Using Three-Operator ADMM
Authors:
Shashi Kant,
José Mairton B. da Silva Jr.,
Gabor Fodor,
Bo Göransson,
Mats Bengtsson,
Carlo Fischione
Abstract:
Federated learning (FL) has emerged as an instance of the distributed machine learning paradigm that avoids the transmission of data generated on the users' side. Although data are not transmitted, edge devices have to deal with limited communication bandwidths, data heterogeneity, and straggler effects due to the limited computational resources of users' devices. A prominent approach to overcome such difficulties is FedADMM, which is based on the classical two-operator consensus alternating direction method of multipliers (ADMM). The common assumption of FL algorithms, including FedADMM, is that they learn a global model using data only on the users' side and not on the edge server. However, in edge learning, the server is expected to be near the base station and have direct access to rich datasets. In this paper, we argue that leveraging the rich data on the edge server is much more beneficial than utilizing only user datasets. Specifically, we show that the mere application of FL with an additional virtual user node representing the data on the edge server is inefficient. We propose FedTOP-ADMM, which generalizes FedADMM and is based on a three-operator ADMM-type technique that exploits a smooth cost function on the edge server to learn a global model parallel to the edge devices. Our numerical experiments indicate that FedTOP-ADMM achieves a substantial gain of up to 33% in communication efficiency to reach a desired test accuracy with respect to FedADMM, including a virtual user on the edge server.
Submitted 25 March, 2024; v1 submitted 8 November, 2022;
originally announced November 2022.
-
Development of a 3D model of clinically relevant microcalcifications
Authors:
Ann-Katherine Carton,
Clément Jailin,
Raoul de Sousa Silva,
Ruben Sanchez de la Rosa,
Serge Muller
Abstract:
A realistic 3D anthropomorphic software model of microcalcifications may serve as a useful tool to assess the performance of breast imaging applications through simulations. We present a method for simulating visually realistic microcalcifications with large morphological variability. Principal component analysis (PCA) was used to analyze the shape of 281 biopsied microcalcifications imaged with a micro-CT. PCA requires the same number of shape components for each input microcalcification. Therefore, the voxel-based microcalcifications were converted to surface meshes with the same number of vertices using a marching cubes algorithm. The vertices were registered using an iterative closest point algorithm and a simulated annealing algorithm. To evaluate the approach, input microcalcifications were reconstructed by progressively adding principal components. Input and reconstructed microcalcifications were visually and quantitatively compared. New microcalcifications were simulated using randomly sampled principal components determined from the PCA applied to the input microcalcifications, and their realism was judged through visual assessment. Preliminary results have shown that input microcalcifications can be reconstructed with high visual fidelity when using 62 principal components, representing 99.5% of the variance. For that condition, the average L2 norm and Dice coefficient were respectively 10.5 $\mu$m and 0.93. Newly generated microcalcifications with 62 principal components were found to be visually similar, while not identical, to input microcalcifications. The proposed PCA model of microcalcification shapes makes it possible to successfully reconstruct input microcalcifications and to generate new visually realistic microcalcifications with various morphologies.
Submitted 8 November, 2022;
originally announced November 2022.
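The Dice coefficient reported in the abstract above (0.93 between input and reconstructed microcalcifications) measures the overlap of two shapes as twice the size of their intersection divided by the sum of their sizes. The sketch below computes it for shapes given as sets of voxel coordinates; the set representation and the function name are assumptions for illustration, and the abstract does not say how the authors computed the metric.

```python
from typing import Set, Tuple

Voxel = Tuple[int, int, int]


def dice_coefficient(a: Set[Voxel], b: Set[Voxel]) -> float:
    """Dice overlap 2|A ∩ B| / (|A| + |B|) of two voxel sets, in [0, 1]."""
    if not a and not b:
        return 1.0  # convention chosen here: two empty shapes match perfectly
    return 2 * len(a & b) / (len(a) + len(b))


# Two tiny shapes sharing two of their three voxels.
original_shape = {(0, 0, 0), (0, 0, 1), (0, 1, 0)}
reconstructed_shape = {(0, 0, 0), (0, 0, 1), (1, 1, 1)}
overlap = dice_coefficient(original_shape, reconstructed_shape)  # 2*2 / (3+3)
```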
-
Blind Asynchronous Over-the-Air Federated Edge Learning
Authors:
Saeed Razavikia,
Jaume Anguera Peris,
Jose Mairton B. da Silva Jr,
Carlo Fischione
Abstract:
Federated Edge Learning (FEEL) is a distributed machine learning technique where each device contributes to training a global inference model by independently performing local computations with its data. More recently, FEEL has been merged with over-the-air computation (OAC), where the global model is calculated over the air by leveraging the superposition of analog signals. However, when implementing FEEL with OAC, there is the challenge of how to precode the analog signals to overcome any time misalignment at the receiver. In this work, we propose a novel synchronization-free method to recover the parameters of the global model over the air without requiring any prior information about the time misalignments. For that, we construct a convex optimization based on the norm minimization problem to directly recover the global model by solving a convex semi-definite program. The performance of the proposed method is evaluated in terms of accuracy and convergence via numerical experiments. We show that our proposed algorithm is within $10\%$ of the ideal synchronized scenario, and performs $4\times$ better than the simple case where no recovery method is used.
Submitted 31 October, 2022;
originally announced October 2022.
-
Attention-Based Generative Neural Image Compression on Solar Dynamics Observatory
Authors:
Ali Zafari,
Atefeh Khoshkhahtinat,
Piyush M. Mehta,
Nasser M. Nasrabadi,
Barbara J. Thompson,
Daniel da Silva,
Michael S. F. Kirk
Abstract:
NASA's Solar Dynamics Observatory (SDO) mission gathers 1.4 terabytes of data each day from its geosynchronous orbit in space. SDO data includes images of the Sun captured at different wavelengths, with the primary scientific goal of understanding the dynamic processes governing the Sun. Recently, end-to-end optimized artificial neural networks (ANN) have shown great potential in performing image compression. ANN-based compression schemes have outperformed conventional hand-engineered algorithms for lossy and lossless image compression. We have designed an ad-hoc ANN-based image compression scheme to reduce the amount of data needed to be stored and retrieved on space missions studying solar dynamics. In this work, we propose an attention module to make use of both local and non-local attention mechanisms in an adversarially trained neural image compression network. We have also demonstrated the superior perceptual quality of this neural image compressor. Our proposed algorithm for compressing images downloaded from the SDO spacecraft performs better in rate-distortion trade-off than the popular currently-in-use image compression codecs such as JPEG and JPEG2000. In addition, we have shown that the proposed method outperforms the state-of-the-art lossy transform coding compression codec, i.e., BPG.
Submitted 4 May, 2023; v1 submitted 12 October, 2022;
originally announced October 2022.
-
Predição da Idade Cerebral a partir de Imagens de Ressonância Magnética utilizando Redes Neurais Convolucionais
Authors:
Victor H. R. Oliveira,
Augusto Antunes,
Alexandre S. Soares,
Arthur D. Reys,
Robson Z. Júnior,
Saulo D. S. Pedro,
Danilo Silva
Abstract:
In this work, deep learning techniques for brain age prediction from magnetic resonance images are investigated, aiming to assist in the identification of biomarkers of the natural aging process. The identification of biomarkers is useful for detecting an early-stage neurodegenerative process, as well as for predicting age-related or non-age-related cognitive decline. Two techniques are implemented and compared in this work: a 3D Convolutional Neural Network applied to the volumetric image and a 2D Convolutional Neural Network applied to slices from the axial plane, with subsequent fusion of individual predictions. The best result was obtained by the 2D model, which achieved a mean absolute error of 3.83 years.
Submitted 23 December, 2021;
originally announced December 2021.
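The 2D approach in the abstract above predicts an age per axial slice and then fuses the individual predictions, with quality reported as mean absolute error. The following is a minimal sketch of that evaluation step only. The abstract does not state the fusion rule, so simple averaging is an assumption here, and the function names `fuse_slice_predictions` and `mean_absolute_error` are hypothetical.

```python
from statistics import fmean


def fuse_slice_predictions(slice_ages):
    """Fuse per-slice age predictions for one subject.

    Assumption: plain averaging. The abstract only says the slice
    predictions are fused, not how.
    """
    if not slice_ages:
        raise ValueError("at least one slice prediction is required")
    return fmean(slice_ages)


def mean_absolute_error(predicted_ages, true_ages):
    """Mean absolute error, in years, over a set of subjects."""
    if not true_ages or len(predicted_ages) != len(true_ages):
        raise ValueError("inputs must be non-empty and of equal length")
    return fmean(abs(p - t) for p, t in zip(predicted_ages, true_ages))


# Illustrative numbers only, not data from the paper.
per_subject_slices = [[30.0, 32.0, 34.0], [52.0, 48.0]]
fused = [fuse_slice_predictions(s) for s in per_subject_slices]
mae = mean_absolute_error(fused, [30.0, 54.0])
```

Other fusion rules, such as a median or a learned weighting, would slot into the same place without changing the error computation.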
-
CORAA: a large corpus of spontaneous and prepared speech manually validated for speech recognition in Brazilian Portuguese
Authors:
Arnaldo Candido Junior,
Edresson Casanova,
Anderson Soares,
Frederico Santos de Oliveira,
Lucas Oliveira,
Ricardo Corso Fernandes Junior,
Daniel Peixoto Pinto da Silva,
Fernando Gorgulho Fayet,
Bruno Baldissera Carlotto,
Lucas Rafael Stefanel Gris,
Sandra Maria Aluísio
Abstract:
Automatic speech recognition (ASR) is a complex and challenging task, and the area has seen significant advances in recent years. For Brazilian Portuguese (BP) in particular, about 376 hours of audio were publicly available for the ASR task until the second half of 2020. With the release of new datasets in early 2021, this number increased to 574 hours. The existing resources, however, consist of audio containing only read and prepared speech. There is a lack of datasets that include spontaneous speech, which are essential in different ASR applications. This paper presents CORAA (Corpus of Annotated Audios) v1. with 290.77 hours, a publicly available dataset for ASR in BP containing validated pairs (audio-transcription). CORAA also contains European Portuguese audio (4.69 hours). We also present a public ASR model based on Wav2Vec 2.0 XLSR-53 and fine-tuned over CORAA. Our model achieved a Word Error Rate of 24.18% on the CORAA test set and 20.08% on the Common Voice test set. When measuring the Character Error Rate, we obtained 11.02% and 6.34% for CORAA and Common Voice, respectively. The CORAA corpora were assembled both to improve ASR models in BP with phenomena from spontaneous speech and to motivate young researchers to start their studies on ASR for Portuguese. All the corpora are publicly available at https://github.com/nilc-nlp/CORAA under the CC BY-NC-ND 4.0 license.
Submitted 18 November, 2021; v1 submitted 14 October, 2021;
originally announced October 2021.
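The CORAA abstract above reports Word Error Rate and Character Error Rate. As a reminder of what those figures measure, here is a minimal sketch of the usual definition: edit distance between reference and hypothesis, divided by the reference length. It is not the authors' evaluation script. Text normalization is left out, and the function names are hypothetical.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (unit cost per edit)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def word_error_rate(reference, hypothesis):
    """Word-level edit distance divided by the number of reference words."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def char_error_rate(reference, hypothesis):
    """Character-level edit distance divided by the reference length."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)
```

Because insertions count as errors, both rates can exceed 100% when the hypothesis is much longer than the reference.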
-
Simultaneous Wireless Information and Power Transfer for Federated Learning
Authors:
José Mairton B. da Silva Jr.,
Konstantinos Ntougias,
Ioannis Krikidis,
Gábor Fodor,
Carlo Fischione
Abstract:
In the Internet of Things, learning is one of the most prominent tasks. In this paper, we consider an Internet of Things scenario where federated learning is used with simultaneous transmission of model data and wireless power. We investigate the trade-off between the number of communication rounds and communication round time while harvesting energy to compensate for the energy expenditure. We formulate and solve an optimization problem by considering the number of local iterations on devices, the time to transmit and receive the model updates, and the time to harvest sufficient energy. Numerical results indicate that maximum ratio transmission and zero-forcing beamforming for the optimization of the local iterations on devices substantially boost the test accuracy of the learning task. Moreover, maximum ratio transmission instead of zero-forcing provides the best test accuracy and communication round time trade-off for various energy harvesting percentages. Thus, it is possible to learn a model quickly with few communication rounds without depleting the battery.
Submitted 21 July, 2021; v1 submitted 26 April, 2021;
originally announced April 2021.
-
An Artificial Intelligence Framework for Bidding Optimization with Uncertainty in Multiple Frequency Reserve Markets
Authors:
Thimal Kempitiya,
Seppo Sierla,
Daswin De Silva,
Matti Yli-Ojanpera,
Damminda Alahakoon,
Valeriy Vyatkin
Abstract:
The global ambitions of a carbon-neutral society necessitate a stable and robust smart grid that capitalises on frequency reserves of renewable energy. Frequency reserves are resources that adjust power production or consumption in real time to react to a power grid frequency deviation. Revenue generation motivates the availability of these resources for managing such deviations. However, limited research has been conducted on data-driven decisions and optimal bidding strategies for trading such capacities in multiple frequency reserves markets. We address this limitation by making the following research contributions. Firstly, a generalised model is designed based on an extensive study of critical characteristics of global frequency reserves markets. Secondly, three bidding strategies are proposed, based on this market model, to capitalise on price peaks in multi-stage markets. Two strategies are proposed for non-reschedulable loads, in which case the bidding strategy aims to select the market with the highest anticipated price, and the third bidding strategy focuses on rescheduling loads to the hours in which the highest reserve market prices are anticipated. The third research contribution is an Artificial Intelligence (AI) based bidding optimization framework that implements these three strategies, with novel uncertainty metrics that supplement data-driven price prediction. Finally, the framework is evaluated empirically using a case study of multiple frequency reserves markets in Finland. The results from this evaluation confirm the effectiveness of the proposed bidding strategies and the AI-based bidding optimization framework in terms of cumulative revenue generation, leading to an increased availability of frequency reserves.
Submitted 5 April, 2021;
originally announced April 2021.
-
Full-Duplex and Dynamic-TDD: Pushing the Limits of Spectrum Reuse in Multi-Cell Communications
Authors:
José Mairton B. da Silva Jr.,
Gustav Wikström,
Ratheesh K. Mungara,
Carlo Fischione
Abstract:
Although full-duplex and dynamic time-division duplexing promise increased spectrum efficiency in cellular networks, their potential is so far challenged by increased interference. While previous studies have shown that self-interference can be suppressed to a sufficient level, we show that the cross-link interference for both duplexing modes, especially from base station to base station, is the remaining challenge in multi-cell networks, restricting the uplink performance. Using low-complexity beamforming techniques, we show that this interference can be mitigated, and that full-duplex and dynamic time-division duplexing can substantially increase the capacity of multi-cell networks. Our results suggest that if we can control the cross-link interference in full-duplex, then we can almost double the multi-cell network capacity as well as user throughput. Therefore, the techniques in this paper have the potential to enable a smooth introduction of full-duplex into cellular systems.
Submitted 21 October, 2020;
originally announced October 2020.
-
Wireless for Machine Learning
Authors:
Henrik Hellström,
José Mairton B. da Silva Jr,
Mohammad Mohammadi Amiri,
Mingzhe Chen,
Viktoria Fodor,
H. Vincent Poor,
Carlo Fischione
Abstract:
As data generation increasingly takes place on devices without a wired connection, machine learning (ML) related traffic will be ubiquitous in wireless networks. Many studies have shown that traditional wireless protocols are highly inefficient or unsustainable to support ML, which creates the need for new wireless communication methods. In this survey, we give an exhaustive review of the state-of-the-art wireless methods that are specifically designed to support ML services over distributed datasets. Currently, there are two clear themes within the literature: analog over-the-air computation and digital radio resource management optimized for ML. This survey gives a comprehensive introduction to these methods, reviews the most important works, highlights open problems, and discusses application scenarios.
Submitted 9 June, 2022; v1 submitted 31 August, 2020;
originally announced August 2020.
-
Microphone Array Based Surveillance Audio Classification
Authors:
Dimitri Leandro de Oliveira Silva,
Tito Spadini,
Ricardo Suyama
Abstract:
This work assessed seven classical classifiers and two beamforming algorithms for detecting surveillance sound events. The tests used additive white Gaussian noise (AWGN) at SNRs from -10 dB to 30 dB. Data augmentation was also employed to improve the algorithms' performance. The results showed that the combination of SVM and Delay-and-Sum (DaS) achieved the best accuracy (up to 86.0%) but had a high computational cost (approximately 402 ms), mainly due to DaS. SGD also appears to be a good alternative, since it also achieved good accuracy (up to 85.3%) with a shorter processing time (approximately 165 ms).
Submitted 22 May, 2020;
originally announced May 2020.
-
Hardware Architecture Proposal for TEDA algorithm to Data Streaming Anomaly Detection
Authors:
Lucileide M. D. da Silva,
Maria G. F. Coutinho,
Carlos E. B. Santos,
Mailson R. Santos,
Luiz Affonso Guedes,
M. Dolores Ruiz,
Marcelo A. C. Fernandes
Abstract:
The amount of real-time data available today, such as time series and streaming data, continues to grow. Being able to analyze this data the moment it arrives can bring immense added value. However, it also requires a lot of computational effort and new acceleration techniques. As a possible solution to this problem, this paper proposes a hardware architecture for the Typicality and Eccentricity Data Analytics (TEDA) algorithm implemented on Field Programmable Gate Arrays (FPGA) for use in data streaming anomaly detection. TEDA is based on a new approach to outlier detection in the data stream context. To validate the proposal, results for the occupation and throughput of the proposed hardware are presented, along with bit-accurate simulation results. The project targets the Xilinx Virtex-6 xc6vlx240t-1ff1156 FPGA.
Submitted 8 March, 2020;
originally announced March 2020.
-
Sound Event Recognition in a Smart City Surveillance Context
Authors:
Tito Spadini,
Dimitri Leandro de Oliveira Silva,
Ricardo Suyama
Abstract:
Due to the growing demand for improving surveillance capabilities in smart cities, systems need to be developed to provide better monitoring capabilities to competent authorities, agencies responsible for strategic resource management, and emergency call centers. This work assumes that a system capable of detecting the occurrence of sound events, that is, of performing the Sound Event Recognition (SER) task, is highly convenient as a complementary monitoring solution. To contribute to the classification of such events, this paper explores several classifiers over the SESA dataset, composed of audio recordings of three hazard classes (gunshots, explosions, and sirens) and a class of casual sounds that could be misinterpreted as some of the other sounds. The best result was obtained by SGD, with an accuracy of 72.13% and a classification time of 6.81 ms, reinforcing the viability of such an approach.
Submitted 1 February, 2020; v1 submitted 27 October, 2019;
originally announced October 2019.
-
Multi-Path Learnable Wavelet Neural Network for Image Classification
Authors:
D. D. N. De Silva,
H. W. M. K. Vithanage,
K. S. D. Fernando,
I. T. S. Piyatilake
Abstract:
Despite the remarkable success of deep learning in pattern recognition, deep network models face the problem of training a large number of parameters. In this paper, we propose and evaluate a novel multi-path wavelet neural network architecture for image classification with far fewer trainable parameters. The model architecture consists of a multi-path layout with several levels of wavelet decompositions performed in parallel, followed by fully connected layers. These decomposition operations comprise wavelet neurons with learnable parameters, which are updated during the training phase using the back-propagation algorithm. We evaluate the performance of the introduced network using common image datasets without data augmentation, except for SVHN, and compare the results with influential deep learning models. Our findings support the possibility of significantly reducing the number of parameters in deep neural networks without compromising their accuracy.
Submitted 26 August, 2019;
originally announced August 2019.
-
A Multistage Method for SCMA Codebook Design Based on MDS Codes
Authors:
Bruno Fontana da Silva,
Danilo Silva,
Bartolomeu F. Uchôa-Filho,
Didier Le Ruyet
Abstract:
Sparse Code Multiple Access (SCMA) has been recently proposed for the future generation of wireless communication standards. SCMA system design involves specifying several parameters. In order to simplify the procedure, most works consider a multistage design approach. Two main stages are usually emphasized in these methods: sparse signatures design (equivalently, resource allocation) and codebook design. In this paper, we present a novel SCMA codebook design method. The proposed method considers SCMA codebooks structured with an underlying vector space obtained from classical block codes. In particular, when using maximum distance separable (MDS) codes, our proposed design provides maximum signal-space diversity with a relatively small alphabet. The use of small alphabets also helps to maintain desired properties in the codebooks, such as low peak-to-average power ratio and low-complexity detection.
Submitted 7 May, 2019;
originally announced May 2019.
-
How to Split UL/DL Antennas in Full-Duplex Cellular Networks
Authors:
José Mairton B. da Silva Jr.,
Hadi Ghauch,
Gábor Fodor,
Carlo Fischione
Abstract:
To further improve the potential of full-duplex communications, networks may employ multiple antennas at the base station or user equipment. To this end, networks that employ current radios usually deal with self-interference and multi-user interference by beamforming techniques. Although previous works investigated beamforming design to improve spectral efficiency, the fundamental question of how to split the antennas at a base station between uplink and downlink in full-duplex networks has not been investigated rigorously. This paper addresses this question by posing antenna splitting as a binary nonlinear optimization problem to minimize the sum mean squared error of the received data symbols. It is shown that this is an NP-hard problem. This combinatorial problem is dealt with by equivalent formulations, iterative convex approximations, and a binary relaxation. The proposed algorithm is guaranteed to converge to a stationary solution of the relaxed problem with much smaller complexity than exhaustive search. Numerical results indicate that the proposed solution is close to the optimal in both high and low self-interference capable scenarios, while the usually assumed antenna splitting is far from optimal. For a large number of antennas, a simple antenna splitting is close to the proposed solution. This reveals that the importance of antenna splitting is inversely proportional to the number of antennas.
Submitted 23 May, 2018; v1 submitted 11 January, 2018;
originally announced January 2018.
-
Avaliação da doença de Alzheimer pela análise multiespectral de imagens DW-MR por redes RBF como alternativa aos mapas ADC
Authors:
Wellington Pinheiro dos Santos,
Ricardo Emmanuel de Souza,
Ascendino Flávio Dias e Silva,
Plínio Batista dos Santos Filho
Abstract:
Alzheimer's disease is the most common cause of dementia, yet it is difficult to diagnose accurately without the use of invasive techniques, particularly at the beginning of the disease. This work addresses the classification and analysis of multispectral synthetic images composed of diffusion-weighted magnetic resonance brain volumes for evaluation of the area of cerebrospinal fluid and its correlation with the progression of Alzheimer's disease. A 1.5 T MR imaging system was used to acquire all the images presented. The classification methods are based on multilayer perceptrons and classifiers of radial basis function networks. It is assumed that the classes of interest can be separated by hyperquadrics. A polynomial network of degree 2 is used to classify the original volumes, generating a ground-truth volume. The classification results are used to improve the usual analysis by the map of apparent diffusion coefficients.
Submitted 3 December, 2017;
originally announced December 2017.
-
New Algorithms for Computing a Single Component of the Discrete Fourier Transform
Authors:
G. Jerônimo da Silva Jr.,
R. M. Campello de Souza,
H. M. de Oliveira
Abstract:
This paper introduces the theory and hardware implementation of two new algorithms for computing a single component of the discrete Fourier transform. In terms of multiplicative complexity, both algorithms are, in general, more efficient than the well-known Goertzel algorithm.
Submitted 9 March, 2015;
originally announced March 2015.
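For context on the baseline named in the abstract above, here is a minimal sketch of the classic Goertzel algorithm, which computes one DFT component $X[k] = \sum_{n=0}^{N-1} x[n]\, e^{-j 2\pi k n / N}$ with a second-order recurrence. This is the comparison point only, not either of the paper's two new algorithms. The function name `goertzel` is hypothetical, and `k` is assumed to be an integer bin index.

```python
import cmath
import math


def goertzel(samples, k):
    """Return DFT component X[k] of `samples` using the Goertzel recurrence.

    Assumes an integer bin index k; N is taken as len(samples).
    """
    n = len(samples)
    if n == 0:
        raise ValueError("samples must not be empty")
    w = 2.0 * math.pi * k / n
    coeff = 2.0 * math.cos(w)  # the only multiplier used inside the loop
    s_prev, s_prev2 = 0.0, 0.0
    for x in samples:
        # s[n] = x[n] + 2*cos(w)*s[n-1] - s[n-2]
        s = x + coeff * s_prev - s_prev2
        s_prev2, s_prev = s_prev, s
    # One complex step after the loop: X[k] = e^{jw} * s[N-1] - s[N-2]
    return cmath.exp(1j * w) * s_prev - s_prev2


def dft_component(samples, k):
    """Direct evaluation of the same component, used as a reference."""
    n = len(samples)
    return sum(x * cmath.exp(-2j * math.pi * k * i / n)
               for i, x in enumerate(samples))
```

For real input, the loop costs one real multiplication per sample, and that per-sample count is the multiplicative complexity the abstract compares against.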
-
The Z Transform over Finite Fields
Authors:
R. M. Campello de Souza,
H. M. de Oliveira,
D. Silva
Abstract:
Finite field transforms have many applications and, in many cases, can be implemented with a low computational complexity. In this paper, the Z Transform over a finite field is introduced and some of its properties are presented.
Submitted 11 February, 2015;
originally announced February 2015.