Search | arXiv e-print repository

Offshore Wind Plant Instance Segmentation Using Sentinel-1 Time Series, GIS, and Semantic Segmentation Models

Authors: Osmar Luiz Ferreira de Carvalho, Osmar Abilio de Carvalho Junior, Anesmar Olino de Albuquerque, Daniel Guerreiro e Silva

Abstract: Offshore wind farms represent a renewable energy source with a significant global growth trend, and their monitoring is strategic for territorial and environmental planning. This study's primary objective is to detect offshore wind plants at an instance level using semantic segmentation models and Sentinel-1 time series. The secondary objectives are: (a) to develop a database consisting of labeled data and S-1 time series; (b) to compare the performance of five deep semantic segmentation architectures (U-Net, U-Net++, Feature Pyramid Network - FPN, DeepLabv3+, and LinkNet); (c) to develop a novel augmentation strategy that shuffles the positions of the images within the time series; (d) to investigate different dimensions of time series intervals (1, 5, 10, and 15 images); and (e) to evaluate the semantic-to-instance conversion procedure. LinkNet was the top-performing model, followed by U-Net++ and U-Net, while FPN and DeepLabv3+ presented the worst results. The evaluation of semantic segmentation models reveals enhanced Intersection over Union (IoU) (25%) and F-score metrics (18%) with the augmentation of time series images. The study showcases the augmentation strategy's capability to mitigate biases and precisely detect invariant targets. Furthermore, the conversion from semantic to instance segmentation demonstrates its efficacy in accurately isolating individual instances within classified regions - simplifying training data and reducing annotation effort and complexity.

Submitted 14 December, 2023; originally announced December 2023.

Comments: 21 pages, 5 figures

MSC Class: 68T45; ACM Class: I.4.6
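
As a rough illustration of two ideas named in the abstract above, the sketch below permutes the order of the images within a time series (the augmentation described in objective (c)) and computes Intersection over Union for binary masks. It is a minimal sketch, not the authors' implementation: the function names, the list-of-frames representation, and the convention that two empty masks score 1.0 are assumptions made for this example.

```python
import random


def shuffle_time_series(frames, seed=None):
    """Return a copy of `frames` with the temporal order permuted.

    `frames` is any sequence of per-date images (hypothetical representation).
    The input sequence is left untouched.
    """
    rng = random.Random(seed)
    shuffled = list(frames)
    rng.shuffle(shuffled)
    return shuffled


def iou(pred, target):
    """Intersection over Union of two binary masks given as nested lists of 0/1."""
    intersection = 0
    union = 0
    for row_pred, row_target in zip(pred, target):
        for p, t in zip(row_pred, row_target):
            if p and t:
                intersection += 1
            if p or t:
                union += 1
    # Assumed convention: two empty masks agree perfectly.
    return intersection / union if union else 1.0
```

Because a fixed target such as a wind turbine looks the same on every acquisition date, permuting the dates changes the input tensor without changing the label, which is presumably why the abstract ties the strategy to detecting invariant targets.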

arXiv:2311.02855 [pdf, other]

doi 10.1109/TAES.2023.3332056

Neural-based Compression Scheme for Solar Image Data

Authors: Ali Zafari, Atefeh Khoshkhahtinat, Jeremy A. Grajeda, Piyush M. Mehta, Nasser M. Nasrabadi, Laura E. Boucheron, Barbara J. Thompson, Michael S. F. Kirk, Daniel da Silva

Abstract: Studying the solar system and especially the Sun relies on the data gathered daily from space missions. These missions are data-intensive and compressing this data to make them efficiently transferable to the ground station is a twofold decision to make. Stronger compression methods, by distorting the data, can increase data throughput at the cost of accuracy which could affect scientific analysis of the data. On the other hand, preserving subtle details in the compressed data requires a high amount of data to be transferred, reducing the desired gains from compression. In this work, we propose a neural network-based lossy compression method to be used in NASA's data-intensive imagery missions. We chose NASA's SDO mission which transmits 1.4 terabytes of data each day as a proof of concept for the proposed algorithm. In this work, we propose an adversarially trained neural network, equipped with local and non-local attention modules to capture both the local and global structure of the image resulting in a better trade-off in rate-distortion (RD) compared to conventional hand-engineered codecs. The RD variational autoencoder used in this work is jointly trained with a channel-dependent entropy model as a shared prior between the analysis and synthesis transforms to make the entropy coding of the latent code more effective. Our neural image compression algorithm outperforms currently-in-use and state-of-the-art codecs such as JPEG and JPEG-2000 in terms of the RD performance when compressing extreme-ultraviolet (EUV) data. As a proof of concept for use of this algorithm in SDO data analysis, we have performed coronal hole (CH) detection using our compressed images, and generated consistent segmentations, even at a compression rate of $\sim0.1$ bits per pixel (compared to 8 bits per pixel on the original data) using EUV data from SDO.

Submitted 5 November, 2023; originally announced November 2023.

Comments: Accepted for publication in IEEE Transactions on Aerospace and Electronic Systems (TAES). arXiv admin note: text overlap with arXiv:2210.06478

arXiv:2309.10791 [pdf, other]

Multi-spectral Entropy Constrained Neural Compression of Solar Imagery

Authors: Ali Zafari, Atefeh Khoshkhahtinat, Piyush M. Mehta, Nasser M. Nasrabadi, Barbara J. Thompson, Michael S. F. Kirk, Daniel da Silva

Abstract: Missions studying the dynamic behaviour of the Sun are defined to capture multi-spectral images of the Sun and transmit them to the ground station on a daily basis. To make transmission efficient and feasible, image compression systems need to be exploited. Recently successful end-to-end optimized neural network-based image compression systems have shown great potential to be used in an ad-hoc manner. In this work we have proposed a transformer-based multi-spectral neural image compressor to efficiently capture redundancies both intra/inter-wavelength. To unleash the locality of window-based self attention mechanism, we propose an inter-window aggregated token multi head self attention. Additionally, to make the neural compressor autoencoder shift invariant, a randomly shifted window attention mechanism is used which makes the transformer blocks insensitive to translations in their input domain. We demonstrate that the proposed approach not only outperforms the conventional compression algorithms but is also able to better decorrelate images along the multiple wavelengths compared to single spectral compression.

Submitted 10 October, 2023; v1 submitted 19 September, 2023; originally announced September 2023.

Comments: Accepted to IEEE 22$^{nd}$ International Conference on Machine Learning and Applications 2023 (ICMLA)

arXiv:2309.10784 [pdf, other]

Context-Aware Neural Video Compression on Solar Dynamics Observatory

Authors: Atefeh Khoshkhahtinat, Ali Zafari, Piyush M. Mehta, Nasser M. Nasrabadi, Barbara J. Thompson, Michael S. F. Kirk, Daniel da Silva

Abstract: NASA's Solar Dynamics Observatory (SDO) mission collects large data volumes of the Sun's daily activity. Data compression is crucial for space missions to reduce data storage and video bandwidth requirements by eliminating redundancies in the data. In this paper, we present a novel neural Transformer-based video compression approach specifically designed for the SDO images. Our primary objective is to efficiently exploit the temporal and spatial redundancies inherent in solar images to obtain a high compression ratio. Our proposed architecture benefits from a novel Transformer block called Fused Local-aware Window (FLaWin), which incorporates window-based self-attention modules and an efficient fused local-aware feed-forward (FLaFF) network. This architectural design allows us to simultaneously capture short-range and long-range information while facilitating the extraction of rich and diverse contextual representations. Moreover, this design choice results in reduced computational complexity. Experimental results demonstrate the significant contribution of the FLaWin Transformer block to the compression performance, outperforming conventional hand-engineered video codecs such as H.264 and H.265 in terms of rate-distortion trade-off.

Submitted 19 September, 2023; originally announced September 2023.

Comments: Accepted to IEEE 22$^{nd}$ International Conference on Machine Learning and Applications 2023 (ICMLA) - Selected for Oral Presentation

arXiv:2307.05241 [pdf, other]

doi 10.1007/978-3-031-45389-2_13

Does pre-training on brain-related tasks results in better deep-learning-based brain age biomarkers?

Authors: Bruno Machado Pacheco, Victor Hugo Rocha de Oliveira, Augusto Braga Fernandes Antunes, Saulo Domingos de Souza Pedro, Danilo Silva

Abstract: Brain age prediction using neuroimaging data has shown great potential as an indicator of overall brain health and successful aging, as well as a disease biomarker. Deep learning models have been established as reliable and efficient brain age estimators, being trained to predict the chronological age of healthy subjects. In this paper, we investigate the impact of a pre-training step on deep learning models for brain age prediction. More precisely, instead of the common approach of pre-training on natural imaging classification, we propose pre-training the models on brain-related tasks, which led to state-of-the-art results in our experiments on ADNI data. Furthermore, we validate the resulting brain age biomarker on images of patients with mild cognitive impairment and Alzheimer's disease. Interestingly, our results indicate that better-performing deep learning models in terms of brain age prediction on healthy patients do not result in more reliable biomarkers.

Submitted 11 July, 2023; originally announced July 2023.

Comments: Accepted at BRACIS 2023

arXiv:2305.18605 [pdf, other]

Turbidity Control in Sedimentation Columns by Direction Dependent Models

Authors: Jesus-Pablo Toledo-Zucco, Daniel Sbarbaro, Joao Manoel Gomes da Silva Jr

Abstract: Sedimentation is a crucial phenomenon in recovering water from slurries by separating solids from liquid. Thickeners and sedimentation columns are equipment widely used in the process industry to reclaim water from process slurries. This contribution addresses the problem of controlling the turbidity of the recovered water in a sedimentation column by manipulating the underflow. The phenomenological model describing the turbidity is too complex to be used in a control strategy, and it is difficult to identify its parameters using plant measurements. This work proposes an empirical piece-wise time-delay model for modeling the turbidity at the top of the column to circumvent these problems. A systematic design procedure is developed to tune a Proportional Integral controller guaranteeing closed-loop stability for systems modeled as a piece-wise time-delay model. Experiments in a pilot plant validate the theoretical results and illustrate the control performance under various operational scenarios.

Submitted 1 July, 2024; v1 submitted 29 May, 2023; originally announced May 2023.

arXiv:2305.07511 [pdf, ps, other]

eXplainable Artificial Intelligence on Medical Images: A Survey

Authors: Matteus Vargas Simão da Silva, Rodrigo Reis Arrais, Jhessica Victoria Santos da Silva, Felipe Souza Tânios, Mateus Antonio Chinelatto, Natalia Backhaus Pereira, Renata De Paris, Lucas Cesar Ferreira Domingos, Rodrigo Dória Villaça, Vitor Lopes Fabris, Nayara Rossi Brito da Silva, Ana Claudia Akemi Matsuki de Faria, Jose Victor Nogueira Alves da Silva, Fabiana Cristina Queiroz de Oliveira Marucci, Francisco Alves de Souza Neto, Danilo Xavier Silva, Vitor Yukio Kondo, Claudio Filipi Gonçalves dos Santos

Abstract: Over the last few years, the number of works about deep learning applied to the medical field has increased enormously. A rigorous assessment of these models is required to explain these results to all people involved in medical exams. A recent field in the machine learning area is explainable artificial intelligence, also known as XAI, which aims to explain the results of such black box models to permit the desired assessment. This survey analyses several recent studies in the XAI field applied to medical diagnosis research, allowing some explainability of the machine learning results in several different diseases, such as cancers and COVID-19.

Submitted 12 May, 2023; originally announced May 2023.

arXiv:2303.00577 [pdf, ps, other]

Computing Functions Over-the-Air Using Digital Modulations

Authors: Saeed Razavikia, Jose Mairton Barros da Silva Jr, Carlo Fischione

Abstract: Over-the-air computation (AirComp) is a known technique in which wireless devices transmit values by analog amplitude modulation so that a function of these values is computed over the communication channel at a common receiver. The physical reason is the superposition properties of the electromagnetic waves, which naturally return sums of analog values. Consequently, the applications of AirComp are almost entirely restricted to analog communication systems. However, the use of digital communications for over-the-air computations would have several benefits, such as error correction, synchronization, acquisition of channel state information, and easier adoption by current digital communication systems. Nevertheless, a common belief is that digital modulations are generally unfeasible for computation tasks because the overlapping of digitally modulated signals returns signals that seem to be meaningless for these tasks. This paper breaks through such a belief and proposes a fundamentally new computing method, named ChannelComp, for performing over-the-air computations by any digital modulation. In particular, we propose digital modulation formats that allow us to compute a wider class of functions than AirComp can compute, and we propose a feasibility optimization problem that ascertains the optimal digital modulation for computing functions over-the-air. The simulation results verify the superior performance of ChannelComp in comparison to AirComp, particularly for the product functions, with around 10 dB improvement of the computation error.

Submitted 20 March, 2023; v1 submitted 1 March, 2023; originally announced March 2023.

Comments: submitted version to the IEEE ICC conference
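
The abstract above says that overlapping digitally modulated signals can look meaningless for computation unless the modulation is chosen for the target function. The toy sketch below illustrates that idea under strong assumptions that are not taken from the paper: a noiseless channel that simply adds the transmitted symbols, real-valued symbols, and no power constraint. The name `is_computable` and both symbol mappings are hypothetical; this is not the ChannelComp optimization problem itself.

```python
from itertools import product


def is_computable(symbol_of, f, values, n_devices):
    """Check whether f can be read off the superimposed signal.

    `symbol_of` maps each input value to the symbol a device transmits.
    The (assumed noiseless) channel delivers the sum of all symbols.
    f is recoverable only if no two input tuples with different outputs
    produce the same received sum.
    """
    output_for_sum = {}
    for inputs in product(values, repeat=n_devices):
        received = sum(symbol_of[x] for x in inputs)
        output = f(inputs)
        if output_for_sum.setdefault(received, output) != output:
            return False  # two different outputs collide at the receiver
    return True


def prod(inputs):
    """Product of all input values."""
    result = 1
    for x in inputs:
        result *= x
    return result


values = [0, 1, 2, 3]
linear = {v: v for v in values}            # amplitude proportional to the value
spread = {0: 1, 1: 10, 2: 100, 3: 1000}    # hypothetical, deliberately spread-out symbols

print(is_computable(linear, sum, values, 2))   # the sum survives superposition
print(is_computable(linear, prod, values, 2))  # (1, 2) and (0, 3) both arrive as 3
print(is_computable(spread, prod, values, 2))  # every unordered pair gets its own sum
```

With the linear mapping the receiver cannot tell inputs (1, 2) from (0, 3), whose products differ, so the product is not recoverable; the spread-out mapping removes such collisions at the cost of impractical symbol amplitudes.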

arXiv:2301.09396 [pdf]

Environment for the Design and Automation of New CDPR Architectures

Authors: Josue Rivera, Julio Garrido, Enrique Riveiro, Diego Silva

Abstract: This paper presents a design and automation environment to study the control trajectory for new CDPR architectures, for instance CDPRs with an unusual number of cables or different motor locations in the robot frame. In order to test the environment capabilities, an architecture of a planar under-constrained CDPR was designed, simulated, and implemented using standard industrial hardware. Both the simulated model and the industrial prototype ran the same trajectories to determine the time delay and the position error between them. The tests have demonstrated that the simulated model of the CDPR reproduces the trajectories of the equivalent industrial prototype with a maximum deviation of 0.35% under loading and different speed conditions, despite the time delays produced by the data transmission and the non-deterministic communication protocols used to connect the industrial automation controller with the simulated model. The results have shown that the environment is suitable for trajectory control and workspace analysis of new CDPR architectures under different dynamic conditions.

Submitted 23 January, 2023; originally announced January 2023.

Comments: 8 pages, 7 figures, preprint, FAIM 2023 conference

arXiv:2212.07497 [pdf, ps, other]

doi 10.1016/j.bspc.2022.104514

Towards fully automated deep-learning-based brain tumor segmentation: is brain extraction still necessary?

Authors: Bruno Machado Pacheco, Guilherme de Souza e Cassia, Danilo Silva

Abstract: State-of-the-art brain tumor segmentation is based on deep learning models applied to multi-modal MRIs. Currently, these models are trained on images after a preprocessing stage that involves registration, interpolation, brain extraction (BE, also known as skull-stripping) and manual correction by an expert. However, for clinical practice, this last step is tedious and time-consuming and, therefore, not always feasible, resulting in skull-stripping faults that can negatively impact the tumor segmentation quality. Still, the extent of this impact has never been measured for any of the many different BE methods available. In this work, we propose an automatic brain tumor segmentation pipeline and evaluate its performance with multiple BE methods. Our experiments show that the choice of a BE method can compromise up to 15.7% of the tumor segmentation performance. Moreover, we propose training and testing tumor segmentation models on non-skull-stripped images, effectively discarding the BE step from the pipeline. Our results show that this approach leads to a competitive performance at a fraction of the time. We conclude that, in contrast to the current paradigm, training tumor segmentation models on non-skull-stripped images can be the best option when high performance in clinical practice is desired.

Submitted 14 December, 2022; originally announced December 2022.

Comments: 15 pages, 9 figures

Journal ref: Biomedical Signal Processing and Control, vol. 82, p. 104514, Apr. 2023

arXiv:2211.14372 [pdf, other]

Interpretability Analysis of Deep Models for COVID-19 Detection

Authors: Daniel Peixoto Pinto da Silva, Edresson Casanova, Lucas Rafael Stefanel Gris, Arnaldo Candido Junior, Marcelo Finger, Flaviane Svartman, Beatriz Raposo, Marcus Vinícius Moreira Martins, Sandra Maria Aluísio, Larissa Cristina Berti, João Paulo Teixeira

Abstract: During the outbreak of the COVID-19 pandemic, several research areas joined efforts to mitigate the damages caused by SARS-CoV-2. In this paper we present an interpretability analysis of a convolutional neural network based model for COVID-19 detection in audio recordings. We investigate which features are important for the model's decision process, investigating spectrograms, F0, F0 standard deviation, sex and age. We then analyse model decisions by generating heat maps for the trained models to capture their attention during the decision process. Focusing on an explainable artificial intelligence approach, we show that the studied models can make unbiased decisions even in the presence of spurious data in the training set, given the adequate preprocessing steps. Our best model has 94.44% accuracy in detection, with results indicating that the models favor spectrograms for the decision process, particularly high energy areas in the spectrogram related to prosodic domains, while F0 also leads to efficient COVID-19 detection.

Submitted 25 November, 2022; originally announced November 2022.

Comments: 14 pages, 4 figures

arXiv:2211.04152 [pdf, other]

doi 10.1109/JSTSP.2022.3221681

Federated Learning Using Three-Operator ADMM

Authors: Shashi Kant, José Mairton B. da Silva Jr., Gabor Fodor, Bo Göransson, Mats Bengtsson, Carlo Fischione

Abstract: Federated learning (FL) has emerged as an instance of the distributed machine learning paradigm that avoids the transmission of data generated on the users' side. Although data are not transmitted, edge devices have to deal with limited communication bandwidths, data heterogeneity, and straggler effects due to the limited computational resources of users' devices. A prominent approach to overcome such difficulties is FedADMM, which is based on the classical two-operator consensus alternating direction method of multipliers (ADMM). The common assumption of FL algorithms, including FedADMM, is that they learn a global model using data only on the users' side and not on the edge server. However, in edge learning, the server is expected to be near the base station and have direct access to rich datasets. In this paper, we argue that leveraging the rich data on the edge server is much more beneficial than utilizing only user datasets. Specifically, we show that the mere application of FL with an additional virtual user node representing the data on the edge server is inefficient. We propose FedTOP-ADMM, which generalizes FedADMM and is based on a three-operator ADMM-type technique that exploits a smooth cost function on the edge server to learn a global model parallel to the edge devices. Our numerical experiments indicate that FedTOP-ADMM has a substantial gain of up to 33% in communication efficiency to reach a desired test accuracy with respect to FedADMM, including a virtual user on the edge server.

Submitted 25 March, 2024; v1 submitted 8 November, 2022; originally announced November 2022.

Comments: accepted to IEEE Journal of Selected Topics in Signal Processing, 2022

arXiv:2211.04097 [pdf]

doi 10.1117/12.2625771

Development of a 3D model of clinically relevant microcalcifications

Authors: Ann-Katherine Carton, Clément Jailin, Raoul de Sousa Silva, Ruben Sanchez de la Rosa, Serge Muller

Abstract: A realistic 3D anthropomorphic software model of microcalcifications may serve as a useful tool to assess the performance of breast imaging applications through simulations. We present a method to simulate visually realistic microcalcifications with large morphological variability. Principal component analysis (PCA) was used to analyze the shape of 281 biopsied microcalcifications imaged with a micro-CT. PCA requires the same number of shape components for each input microcalcification. Therefore, the voxel-based microcalcifications were converted to a surface mesh with the same number of vertices using a marching cubes algorithm. The vertices were registered using an iterative closest point algorithm and a simulated annealing algorithm. To evaluate the approach, input microcalcifications were reconstructed by progressively adding principal components. Input and reconstructed microcalcifications were visually and quantitatively compared. New microcalcifications were simulated using randomly sampled principal components determined from the PCA applied to the input microcalcifications, and their realism was assessed visually. Preliminary results have shown that input microcalcifications can be reconstructed with high visual fidelity when using 62 principal components, representing 99.5% variance. For that condition, the average L2 norm and Dice coefficient were respectively 10.5 $\mu$m and 0.93. Newly generated microcalcifications with 62 principal components were found to be visually similar, while not identical, to input microcalcifications. The proposed PCA model of microcalcification shapes can successfully reconstruct input microcalcifications and generate new visually realistic microcalcifications with various morphologies.

Submitted 8 November, 2022; originally announced November 2022.

Journal ref: Sixteenth International Workshop on Breast Imaging, May 2022, Leuven, Belgium. pp.38
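
The abstract above reports that 62 principal components represent 99.5% of the variance. The sketch below shows how such a cut-off is commonly derived from PCA eigenvalues by accumulating explained variance; it is a generic illustration with made-up eigenvalues, and the function name `components_for_variance` is hypothetical rather than anything from the paper.

```python
def components_for_variance(eigenvalues, target):
    """Smallest number of leading components whose variance share reaches `target`.

    `eigenvalues` are the PCA eigenvalues (component variances), in any order.
    `target` is a fraction in (0, 1], e.g. 0.995 for 99.5%.
    """
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    if total <= 0:
        raise ValueError("eigenvalues must have a positive sum")
    running = 0
    for k, value in enumerate(ordered, start=1):
        running += value
        if running / total >= target:
            return k
    return len(ordered)


# Made-up spectrum: the first three components hold 95% of the variance.
toy_eigenvalues = [50, 30, 15, 4, 1]
print(components_for_variance(toy_eigenvalues, 0.90))
print(components_for_variance(toy_eigenvalues, 0.995))
```

Reconstructing a shape with only the first k components then trades fidelity for compactness, which matches the abstract's description of progressively adding principal components.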

arXiv:2210.17469 [pdf, ps, other]

Blind Asynchronous Over-the-Air Federated Edge Learning

Authors: Saeed Razavikia, Jaume Anguera Peris, Jose Mairton B. da Silva Jr, Carlo Fischione

Abstract: Federated Edge Learning (FEEL) is a distributed machine learning technique where each device contributes to training a global inference model by independently performing local computations with their data. More recently, FEEL has been merged with over-the-air computation (OAC), where the global model is calculated over the air by leveraging the superposition of analog signals. However, when implementing FEEL with OAC, there is the challenge of how to precode the analog signals to overcome any time misalignment at the receiver. In this work, we propose a novel synchronization-free method to recover the parameters of the global model over the air without requiring any prior information about the time misalignments. For that, we construct a convex optimization based on the norm minimization problem to directly recover the global model by solving a convex semi-definite program. The performance of the proposed method is evaluated in terms of accuracy and convergence via numerical experiments. We show that our proposed algorithm is within $10\%$ of the ideal synchronized scenario, and performs $4\times$ better than the simple case where no recovering method is used.

Submitted 31 October, 2022; originally announced October 2022.

arXiv:2210.06478 [pdf, other]

doi 10.1109/ICMLA55696.2022.00035

Attention-Based Generative Neural Image Compression on Solar Dynamics Observatory

Authors: Ali Zafari, Atefeh Khoshkhahtinat, Piyush M. Mehta, Nasser M. Nasrabadi, Barbara J. Thompson, Daniel da Silva, Michael S. F. Kirk

Abstract: NASA's Solar Dynamics Observatory (SDO) mission gathers 1.4 terabytes of data each day from its geosynchronous orbit in space. SDO data includes images of the Sun captured at different wavelengths, with the primary scientific goal of understanding the dynamic processes governing the Sun. Recently, end-to-end optimized artificial neural networks (ANN) have shown great potential in performing image compression. ANN-based compression schemes have outperformed conventional hand-engineered algorithms for lossy and lossless image compression. We have designed an ad-hoc ANN-based image compression scheme to reduce the amount of data needed to be stored and retrieved on space missions studying solar dynamics. In this work, we propose an attention module to make use of both local and non-local attention mechanisms in an adversarially trained neural image compression network. We have also demonstrated the superior perceptual quality of this neural image compressor. Our proposed algorithm for compressing images downloaded from the SDO spacecraft performs better in rate-distortion trade-off than the popular currently-in-use image compression codecs such as JPEG and JPEG2000. In addition, we have shown that the proposed method outperforms the state-of-the-art lossy transform coding compression codec, i.e., BPG.

Submitted 4 May, 2023; v1 submitted 12 October, 2022; originally announced October 2022.

Comments: Accepted to IEEE 21$^{st}$ International Conference on Machine Learning and Applications 2022 (ICMLA) - Selected for Oral Presentation

arXiv:2112.12609 [pdf]

Predição da Idade Cerebral a partir de Imagens de Ressonância Magnética utilizando Redes Neurais Convolucionais (Brain Age Prediction from Magnetic Resonance Images Using Convolutional Neural Networks)

Authors: Victor H. R. Oliveira, Augusto Antunes, Alexandre S. Soares, Arthur D. Reys, Robson Z. Júnior, Saulo D. S. Pedro, Danilo Silva

Abstract: In this work, deep learning techniques for brain age prediction from magnetic resonance images are investigated, aiming to assist in the identification of biomarkers of the natural aging process. The identification of biomarkers is useful for detecting an early-stage neurodegenerative process, as well as for predicting age-related or non-age-related cognitive decline. Two techniques are implemented and compared in this work: a 3D Convolutional Neural Network applied to the volumetric image and a 2D Convolutional Neural Network applied to slices from the axial plane, with subsequent fusion of individual predictions. The best result was obtained by the 2D model, which achieved a mean absolute error of 3.83 years.

Submitted 23 December, 2021; originally announced December 2021.

Comments: 3 pages, 3 figures, in Portuguese, accepted at XVIII Congresso Brasileiro de Informática em Saúde (CBIS 2021)

arXiv:2110.15731 [pdf, other]

CORAA: a large corpus of spontaneous and prepared speech manually validated for speech recognition in Brazilian Portuguese

Authors: Arnaldo Candido Junior, Edresson Casanova, Anderson Soares, Frederico Santos de Oliveira, Lucas Oliveira, Ricardo Corso Fernandes Junior, Daniel Peixoto Pinto da Silva, Fernando Gorgulho Fayet, Bruno Baldissera Carlotto, Lucas Rafael Stefanel Gris, Sandra Maria Aluísio

Abstract: Automatic speech recognition (ASR) is a complex and challenging task. In recent years, there have been significant advances in the area. In particular, for the Brazilian Portuguese (BP) language, there were about 376 hours publicly available for the ASR task until the second half of 2020. With the release of new datasets in early 2021, this number increased to 574 hours. The existing resources, however, are composed of audios containing only read and prepared speech. There is a lack of datasets including spontaneous speech, which are essential in different ASR applications. This paper presents CORAA (Corpus of Annotated Audios) v1. with 290.77 hours, a publicly available dataset for ASR in BP containing validated pairs (audio-transcription). CORAA also contains European Portuguese audios (4.69 hours). We also present a public ASR model based on Wav2Vec 2.0 XLSR-53 and fine-tuned over CORAA. Our model achieved a Word Error Rate of 24.18% on the CORAA test set and 20.08% on the Common Voice test set. When measuring the Character Error Rate, we obtained 11.02% and 6.34% for CORAA and Common Voice, respectively. CORAA corpora were assembled to both improve ASR models in BP with phenomena from spontaneous speech and motivate young researchers to start their studies on ASR for Portuguese. All the corpora are publicly available at https://github.com/nilc-nlp/CORAA under the CC BY-NC-ND 4.0 license.

Submitted 18 November, 2021; v1 submitted 14 October, 2021; originally announced October 2021.

Comments: This paper is under consideration at Language Resources and Evaluation (LREV)

arXiv:2104.12749 [pdf, other]

Simultaneous Wireless Information and Power Transfer for Federated Learning

Authors: José Mairton B. da Silva Jr., Konstantinos Ntougias, Ioannis Krikidis, Gábor Fodor, Carlo Fischione

Abstract: In the Internet of Things, learning is one of the most prominent tasks. In this paper, we consider an Internet of Things scenario where federated learning is used with simultaneous transmission of model data and wireless power. We investigate the trade-off between the number of communication rounds and communication round time while harvesting energy to compensate for the energy expenditure. We formulate and solve an optimization problem by considering the number of local iterations on devices, the time to transmit-receive the model updates, and to harvest sufficient energy. Numerical results indicate that maximum ratio transmission and zero-forcing beamforming for the optimization of the local iterations on devices substantially boost the test accuracy of the learning task. Moreover, maximum ratio transmission instead of zero-forcing provides the best test accuracy and communication round time trade-off for various energy harvesting percentages. Thus, it is possible to learn a model quickly with few communication rounds without depleting the battery.

Submitted 21 July, 2021; v1 submitted 26 April, 2021; originally announced April 2021.

Comments: Accepted to appear in the IEEE International Workshop on Signal Processing Advances in Wireless Communications (SPAWC) in Lucca, Italy, Sep. 2021

arXiv:2104.01865 [pdf, other]

doi 10.1016/j.apenergy.2020.115918

An Artificial Intelligence Framework for Bidding Optimization with Uncertainty in Multiple Frequency Reserve Markets

Authors: Thimal Kempitiya, Seppo Sierla, Daswin De Silva, Matti Yli-Ojanpera, Damminda Alahakoon, Valeriy Vyatkin

Abstract: The global ambitions of a carbon-neutral society necessitate a stable and robust smart grid that capitalises on frequency reserves of renewable energy. Frequency reserves are resources that adjust power production or consumption in real time to react to a power grid frequency deviation. Revenue generation motivates the availability of these resources for managing such deviations. However, limited research has been conducted on data-driven decisions and optimal bidding strategies for trading such capacities in multiple frequency reserves markets. We address this limitation by making the following research contributions. Firstly, a generalised model is designed based on an extensive study of critical characteristics of global frequency reserves markets. Secondly, three bidding strategies are proposed, based on this market model, to capitalise on price peaks in multi-stage markets. Two strategies are proposed for non-reschedulable loads, in which case the bidding strategy aims to select the market with the highest anticipated price, and the third bidding strategy focuses on rescheduling loads to hours on which highest reserve market prices are anticipated. The third research contribution is an Artificial Intelligence (AI) based bidding optimization framework that implements these three strategies, with novel uncertainty metrics that supplement data-driven price prediction. Finally, the framework is evaluated empirically using a case study of multiple frequency reserves markets in Finland. The results from this evaluation confirm the effectiveness of the proposed bidding strategies and the AI-based bidding optimization framework in terms of cumulative revenue generation, leading to an increased availability of frequency reserves.

Submitted 5 April, 2021; originally announced April 2021.

Journal ref: Applied Energy, Volume 280, 15 December 2020, 115918

arXiv:2010.11317 [pdf, other]

Full-Duplex and Dynamic-TDD: Pushing the Limits of Spectrum Reuse in Multi-Cell Communications

Authors: José Mairton B. da Silva Jr., Gustav Wikström, Ratheesh K. Mungara, Carlo Fischione

Abstract: Although in cellular networks full-duplex and dynamic time-division duplexing promise increased spectrum efficiency, their potential is so far challenged by increased interference. While previous studies have shown that self-interference can be suppressed to a sufficient level, we show that the cross-link interference for both duplexing modes, especially from base station to base station, is the remaining challenge in multi-cell networks, restricting the uplink performance. Using low-complexity beamforming techniques, we show that this interference can be mitigated, and that full-duplex and dynamic time-division duplexing can substantially increase the capacity of multi-cell networks. Our results suggest that if we can control the cross-link interference in full-duplex, then we can almost double the multi-cell network capacity as well as user throughput. Therefore, the techniques in this paper have the potential to enable a smooth introduction of full-duplex into cellular systems.

Submitted 21 October, 2020; originally announced October 2020.

Comments: 15 pages, 6 figures. Accepted to IEEE Wireless Communications - Special Issue on Full Duplex Communications Theory, Standardization and Practice

arXiv:2008.13492 [pdf, other]

Wireless for Machine Learning

Authors: Henrik Hellström, José Mairton B. da Silva Jr, Mohammad Mohammadi Amiri, Mingzhe Chen, Viktoria Fodor, H. Vincent Poor, Carlo Fischione

Abstract: As data generation increasingly takes place on devices without a wired connection, machine learning (ML) related traffic will be ubiquitous in wireless networks. Many studies have shown that traditional wireless protocols are highly inefficient or unsustainable to support ML, which creates the need for new wireless communication methods. In this survey, we give an exhaustive review of the state-of-the-art wireless methods that are specifically designed to support ML services over distributed datasets. Currently, there are two clear themes within the literature, analog over-the-air computation and digital radio resource management optimized for ML. This survey gives a comprehensive introduction to these methods, reviews the most important works, highlights open problems, and discusses application scenarios.

Submitted 9 June, 2022; v1 submitted 31 August, 2020; originally announced August 2020.

arXiv:2005.11348 [pdf, other]

Microphone Array Based Surveillance Audio Classification

Authors: Dimitri Leandro de Oliveira Silva, Tito Spadini, Ricardo Suyama

Abstract: The work assessed seven classical classifiers and two beamforming algorithms for detecting surveillance sound events. The tests included the use of AWGN with -10 dB to 30 dB SNR. Data Augmentation was also employed to improve algorithms' performance. The results showed that the combination of SVM and Delay-and-Sum (DaS) scored the best accuracy (up to 86.0%), but had high computational cost (≈ 402 ms), mainly due to DaS. The use of SGD also seems to be a good alternative, since it also achieved good accuracy (up to 85.3%), but with quicker processing time (≈ 165 ms).

Submitted 22 May, 2020; originally announced May 2020.

arXiv:2003.03837 [pdf, other]

doi 10.1109/ACCESS.2021.3098004

Hardware Architecture Proposal for TEDA algorithm to Data Streaming Anomaly Detection

Authors: Lucileide M. D. da Silva, Maria G. F. Coutinho, Carlos E. B. Santos, Mailson R. Santos, Luiz Affonso Guedes, M. Dolores Ruiz, Marcelo A. C. Fernandes

Abstract: The amount of data in real-time, such as time series and streaming data, available today continues to grow. Being able to analyze this data the moment it arrives can bring an immense added value. However, it also requires a lot of computational effort and new acceleration techniques. As a possible solution to this problem, this paper proposes a hardware architecture for Typicality and Eccentricity Data Analytic (TEDA) algorithm implemented on Field Programmable Gate Arrays (FPGA) for use in data streaming anomaly detection. TEDA is based on a new approach to outlier detection in the data stream context. In order to validate the proposals, results of the occupation and throughput of the proposed hardware are presented. In addition, the bit-accurate simulation results are also presented. The project targets the Xilinx Virtex-6 xc6vlx240t-1ff1156 FPGA.

Submitted 8 March, 2020; originally announced March 2020.

arXiv:1910.12369 [pdf, other]

Sound Event Recognition in a Smart City Surveillance Context

Authors: Tito Spadini, Dimitri Leandro de Oliveira Silva, Ricardo Suyama

Abstract: Due to the growing demand for improving surveillance capabilities in smart cities, systems need to be developed to provide better monitoring capabilities to competent authorities, agencies responsible for strategic resource management, and emergency call centers. This work assumes that, as a complementary monitoring solution, the use of a system capable of detecting the occurrence of sound events, performing the Sound Events Recognition (SER) task, is highly convenient. In order to contribute to the classification of such events, this paper explored several classifiers over the SESA dataset, composed of audios of three hazard classes (gunshots, explosions, and sirens) and a class of casual sounds that could be misinterpreted as some of the other sounds. The best result was obtained by SGD, with an accuracy of 72.13% with 6.81 ms classification time, reinforcing the viability of such an approach.

Submitted 1 February, 2020; v1 submitted 27 October, 2019; originally announced October 2019.

arXiv:1908.09775 [pdf, other]

Multi-Path Learnable Wavelet Neural Network for Image Classification

Authors: D. D. N. De Silva, H. W. M. K. Vithanage, K. S. D. Fernando, I. T. S. Piyatilake

Abstract: Despite the remarkable success of deep learning in pattern recognition, deep network models face the problem of training a large number of parameters. In this paper, we propose and evaluate a novel multi-path wavelet neural network architecture for image classification with far fewer trainable parameters. The model architecture consists of a multi-path layout with several levels of wavelet decompositions performed in parallel followed by fully connected layers. These decomposition operations comprise wavelet neurons with learnable parameters, which are updated during the training phase using the back-propagation algorithm. We evaluate the performance of the introduced network using common image datasets without data augmentation except for SVHN and compare the results with influential deep learning models. Our findings support the possibility of reducing the number of parameters significantly in deep neural networks without compromising their accuracy.

Submitted 26 August, 2019; originally announced August 2019.

arXiv:1905.02533 [pdf, other]

doi 10.1109/LWC.2019.2925801

A Multistage Method for SCMA Codebook Design Based on MDS Codes

Authors: Bruno Fontana da Silva, Danilo Silva, Bartolomeu F. Uchôa-Filho, Didier Le Ruyet

Abstract: Sparse Code Multiple Access (SCMA) has been recently proposed for the future generation of wireless communication standards. SCMA system design involves specifying several parameters. In order to simplify the procedure, most works consider a multistage design approach. Two main stages are usually emphasized in these methods: sparse signatures design (equivalently, resource allocation) and codebook design. In this paper, we present a novel SCMA codebook design method. The proposed method considers SCMA codebooks structured with an underlying vector space obtained from classical block codes. In particular, when using maximum distance separable (MDS) codes, our proposed design provides maximum signal-space diversity with a relatively small alphabet. The use of small alphabets also helps to maintain desired properties in the codebooks, such as low peak-to-average power ratio and low-complexity detection.

Submitted 7 May, 2019; originally announced May 2019.

Comments: Submitted to IEEE Wireless Communication Letters

arXiv:1801.03717 [pdf, ps, other]

How to Split UL/DL Antennas in Full-Duplex Cellular Networks

Authors: José Mairton B. da Silva Jr., Hadi Ghauch, Gábor Fodor, Carlo Fischione

Abstract: To further improve the potential of full-duplex communications, networks may employ multiple antennas at the base station or user equipment. To this end, networks that employ current radios usually deal with self-interference and multi-user interference by beamforming techniques. Although previous works investigated beamforming design to improve spectral efficiency, the fundamental question of how to split the antennas at a base station between uplink and downlink in full-duplex networks has not been investigated rigorously. This paper addresses this question by posing antenna splitting as a binary nonlinear optimization problem to minimize the sum mean squared error of the received data symbols. It is shown that this is an NP-hard problem. This combinatorial problem is dealt with by equivalent formulations, iterative convex approximations, and a binary relaxation. The proposed algorithm is guaranteed to converge to a stationary solution of the relaxed problem with much smaller complexity than exhaustive search. Numerical results indicate that the proposed solution is close to the optimal in both high and low self-interference capable scenarios, while the usually assumed antenna splitting is far from optimal. For a large number of antennas, a simple antenna splitting is close to the proposed solution. This reveals that the importance of antenna splitting is inversely proportional to the number of antennas.

Submitted 23 May, 2018; v1 submitted 11 January, 2018; originally announced January 2018.

Comments: 7 pages, 4 figures. Accepted to IEEE ICC 2018 Workshop on Full-Duplex Communications for Future Wireless Networks

arXiv:1712.01700 [pdf]

Avaliação da doença de Alzheimer pela análise multiespectral de imagens DW-MR por redes RBF como alternativa aos mapas ADC

Authors: Wellington Pinheiro dos Santos, Ricardo Emmanuel de Souza, Ascendino Flávio Dias e Silva, Plínio Batista dos Santos Filho

Abstract: Alzheimer's disease is the most common cause of dementia, yet difficult to accurately diagnose without the use of invasive techniques, particularly at the beginning of the disease. This work addresses the classification and analysis of multispectral synthetic images composed of diffusion-weighted magnetic resonance brain volumes for evaluation of the area of cerebrospinal fluid and its correlation with the progression of Alzheimer's disease. A 1.5 T MR imaging system was used to acquire all the images presented. The classification methods are based on multilayer perceptrons and classifiers of radial basis function networks. It is assumed that the classes of interest can be separated by hyperquadrics. A polynomial network of degree 2 is used to classify the original volumes, generating a ground-truth volume. The classification results are used to improve the usual analysis by the map of apparent diffusion coefficients.

Submitted 3 December, 2017; originally announced December 2017.

Comments: in Portuguese

Journal ref: Learning and Nonlinear Models, v. 4, p. 43-53, 2008

arXiv:1503.02577 [pdf, ps, other]

New Algorithms for Computing a Single Component of the Discrete Fourier Transform

Authors: G. Jerônimo da Silva Jr., R. M. Campello de Souza, H. M. de Oliveira

Abstract: This paper introduces the theory and hardware implementation of two new algorithms for computing a single component of the discrete Fourier transform. In terms of multiplicative complexity, both algorithms are more efficient, in general, than the well known Goertzel Algorithm.

Submitted 9 March, 2015; originally announced March 2015.

Comments: 4 pages, 3 figures, 1 table. In: 10th International Symposium on Communication Theory and Applications, Ambleside, UK

arXiv:1502.03371 [pdf]

The Z Transform over Finite Fields

Authors: R. M. Campello de Souza, H. M. de Oliveira, D. Silva

Abstract: Finite field transforms have many applications and, in many cases, can be implemented with a low computational complexity. In this paper, the Z Transform over a finite field is introduced and some of its properties are presented.

Submitted 11 February, 2015; originally announced February 2015.

Comments: 6 pages, 5 figures, Proc. IEEE/SBrT Int. Telecomm. Symp., 2002. pp.362-367

Showing 1–30 of 30 results for author: Silva, D