Search | arXiv e-print repository

Contract Usage and Evolution in Android Mobile Applications

Authors: David R. Ferreira, Alexandra Mendes, João F. Ferreira

Abstract: Formal contracts and assertions are effective methods to enhance software quality by enforcing preconditions, postconditions, and invariants. Previous research has demonstrated the value of contracts in traditional software development contexts. However, the adoption and impact of contracts in the context of mobile application development, particularly of Android applications, remain unexplored.… ▽ More Formal contracts and assertions are effective methods to enhance software quality by enforcing preconditions, postconditions, and invariants. Previous research has demonstrated the value of contracts in traditional software development contexts. However, the adoption and impact of contracts in the context of mobile application development, particularly of Android applications, remain unexplored. To address this, we present the first large-scale empirical study on the presence and use of contracts in Android applications, written in Java or Kotlin. We consider different types of contract elements divided into five categories: conditional runtime exceptions, APIs, annotations, assertions, and other. We analyzed 2,390 Android applications from the F-Droid repository and processed more than 51,749 KLOC to determine 1) how and to what extent contracts are used, 2) how contract usage evolves, and 3) whether contracts are used safely in the context of program evolution and inheritance. Our findings include: 1) although most applications do not specify contracts, annotation-based approaches are the most popular among practitioners; 2) applications that use contracts continue to use them in later versions, but the number of methods increases at a higher rate than the number of contracts; and 3) there are many potentially unsafe specification changes when applications evolve and in subty** relationships, which indicates a lack of specification stability. Our findings show that it would be desirable to have libraries that standardize contract specifications in Java and Kotlin, and tools that aid practitioners in writing stronger contracts and in detecting contract violations in the context of program evolution and inheritance. △ Less

Submitted 25 January, 2024; originally announced January 2024.

arXiv:2401.06723 [pdf, other]

Evaluation of the Energy Consumption of a Mobile Robotic Platform for Sustainable 6G Networks

Authors: Diogo Ferreira, André Coelho, Rui Campos

Abstract: The emerging 6G paradigm and the proliferation of wireless devices require flexible network infrastructures capable of meeting the increasing Quality of Service (QoS) requirements. Mobile Robotic Platforms (MRPs) acting as mobile communications cells are a promising solution to provide on-demand wireless connectivity in dynamic networking scenarios. However, the energy consumption of MRPs is a cha… ▽ More The emerging 6G paradigm and the proliferation of wireless devices require flexible network infrastructures capable of meeting the increasing Quality of Service (QoS) requirements. Mobile Robotic Platforms (MRPs) acting as mobile communications cells are a promising solution to provide on-demand wireless connectivity in dynamic networking scenarios. However, the energy consumption of MRPs is a challenge that must be considered, in order to maximize the availability of the wireless networks created. The main contribution of this paper is the experimental evaluation of the energy consumption of an MRP acting as a mobile communications cell. The evaluation considers different actions performed by a real MRP, showing that the energy consumption varies significantly with the type of action performed. The obtained results pave the way for optimizing the MRP movement in dynamic networking scenarios so that the wireless network's availability is maximized while minimizing the MRP's energy consumption. △ Less

Submitted 12 January, 2024; originally announced January 2024.

arXiv:2311.14856 [pdf, other]

Disruption Prediction in Fusion Devices through Feature Extraction and Logistic Regression

Authors: Diogo R. Ferreira

Abstract: This document describes an approach used in the Multi-Machine Disruption Prediction Challenge for Fusion Energy by ITU, a data science competition which ran from September to November 2023, on the online platform Zindi. The competition involved data from three fusion devices - C-Mod, HL-2A, and J-TEXT - with most of the training data coming from the last two, and the test data coming from the firs… ▽ More This document describes an approach used in the Multi-Machine Disruption Prediction Challenge for Fusion Energy by ITU, a data science competition which ran from September to November 2023, on the online platform Zindi. The competition involved data from three fusion devices - C-Mod, HL-2A, and J-TEXT - with most of the training data coming from the last two, and the test data coming from the first one. Each device has multiple diagnostics and signals, and it turns out that a critical issue in this competition was to identify which signals, and especially which features from those signals, were most relevant to achieve accurate predictions. The approach described here is based on extracting features from signals, and then applying logistic regression on top of those features. Each signal is treated as a separate predictor and, in the end, a combination of such predictors achieved the first place on the leaderboard. △ Less

Submitted 24 November, 2023; originally announced November 2023.

arXiv:2311.08176 [pdf, other]

A deformation-based morphometry framework for disentangling Alzheimer's disease from normal aging using learned normal aging templates

Authors: **gru Fu, Daniel Ferreira, Örjan Smedby, Rodrigo Moreno

Abstract: Alzheimer's Disease and normal aging are both characterized by brain atrophy. The question of whether AD-related brain atrophy represents accelerated aging or a neurodegeneration process distinct from that in normal aging remains unresolved. Moreover, precisely disentangling AD-related brain atrophy from normal aging in a clinical context is complex. In this study, we propose a deformation-based m… ▽ More Alzheimer's Disease and normal aging are both characterized by brain atrophy. The question of whether AD-related brain atrophy represents accelerated aging or a neurodegeneration process distinct from that in normal aging remains unresolved. Moreover, precisely disentangling AD-related brain atrophy from normal aging in a clinical context is complex. In this study, we propose a deformation-based morphometry framework to estimate normal aging and AD-specific atrophy patterns of subjects from morphological MRI scans. We first leverage deep-learning-based methods to create age-dependent templates of cognitively normal (CN) subjects. These templates model the normal aging atrophy patterns in a CN population. Then, we use the learned diffeomorphic registration to estimate the one-year normal aging pattern at the voxel level. We register the testing image to the 60-year-old CN template in the second step. Finally, normal aging and AD-specific scores are estimated by measuring the alignment of this registration with the one-year normal aging pattern. The methodology was developed and evaluated on the OASIS3 dataset with 1,014 T1-weighted MRI scans. Of these, 326 scans were from CN subjects, and 688 scans were from individuals clinically diagnosed with AD at different stages of clinical severity defined by clinical dementia rating (CDR) scores. The results show that ventricles predominantly follow an accelerated normal aging pattern in subjects with AD. In turn, hippocampi and amygdala regions were affected by both normal aging and AD-specific factors. Interestingly, hippocampi and amygdala regions showed more of an accelerated normal aging pattern for subjects during the early clinical stages of the disease, while the AD-specific score increases in later clinical stages. Our code is freely available at https://github.com/Fjr9516/DBM_with_DL. △ Less

Submitted 14 November, 2023; originally announced November 2023.

Comments: 21 pages, 8 figures

arXiv:2311.04847 [pdf]

Are foundation models efficient for medical image segmentation?

Authors: Danielle Ferreira, Rima Arnaout

Abstract: Foundation models are experiencing a surge in popularity. The Segment Anything model (SAM) asserts an ability to segment a wide spectrum of objects but required supervised training at unprecedented scale. We compared SAM's performance (against clinical ground truth) and resources (labeling time, compute) to a modality-specific, label-free self-supervised learning (SSL) method on 25 measurements fo… ▽ More Foundation models are experiencing a surge in popularity. The Segment Anything model (SAM) asserts an ability to segment a wide spectrum of objects but required supervised training at unprecedented scale. We compared SAM's performance (against clinical ground truth) and resources (labeling time, compute) to a modality-specific, label-free self-supervised learning (SSL) method on 25 measurements for 100 cardiac ultrasounds. SAM performed poorly and required significantly more labeling and computing resources, demonstrating worse efficiency than SSL. △ Less

Submitted 8 November, 2023; originally announced November 2023.

Comments: 14 pages, 2 figures, 2 tables

arXiv:2310.17646 [pdf, other]

doi 10.1088/2632-2153/ad4ba6

Learning the dynamics of a one-dimensional plasma model with graph neural networks

Authors: Diogo D Carvalho, Diogo R Ferreira, Luis O Silva

Abstract: We explore the possibility of fully replacing a plasma physics kinetic simulator with a graph neural network-based simulator. We focus on this class of surrogate models given the similarity between their message-passing update mechanism and the traditional physics solver update, and the possibility of enforcing known physical priors into the graph construction and update. We show that our model le… ▽ More We explore the possibility of fully replacing a plasma physics kinetic simulator with a graph neural network-based simulator. We focus on this class of surrogate models given the similarity between their message-passing update mechanism and the traditional physics solver update, and the possibility of enforcing known physical priors into the graph construction and update. We show that our model learns the kinetic plasma dynamics of the one-dimensional plasma model, a predecessor of contemporary kinetic plasma simulation codes, and recovers a wide range of well-known kinetic plasma processes, including plasma thermalization, electrostatic fluctuations about thermal equilibrium, and the drag on a fast sheet and Landau dam**. We compare the performance against the original plasma model in terms of run-time, conservation laws, and temporal evolution of key physical quantities. The limitations of the model are presented and possible directions for higher-dimensional surrogate models for kinetic plasmas are discussed. △ Less

Submitted 13 May, 2024; v1 submitted 26 October, 2023; originally announced October 2023.

Comments: title changed, accepted for publication in Machine Learning: Science and Technology, code available at https://github.com/diogodcarvalho/gns-sheet-model

Journal ref: Mach. Learn.: Sci. Technol. 5 025048 (2024)

arXiv:2310.14439 [pdf]

doi 10.1016/j.visinf.2023.01.003

Towards the automation of book typesetting

Authors: Sérgio M. Rebelo, Tiago Martins, Diogo Ferreira, Artur Rebelo

Abstract: This paper proposes a generative approach for the automatic typesetting of books in desktop publishing. The presented system consists in a computer script that operates inside a widely used design software tool and implements a generative process based on several typographic rules, styles and principles which have been identified in the literature. The performance of the proposed system is tested… ▽ More This paper proposes a generative approach for the automatic typesetting of books in desktop publishing. The presented system consists in a computer script that operates inside a widely used design software tool and implements a generative process based on several typographic rules, styles and principles which have been identified in the literature. The performance of the proposed system is tested through an experiment which included the evaluation of its outputs with people. The results reveal the ability of the system to consistently create varied book designs from the same input content as well as visually coherent book designs with different contents while complying with fundamental typographic principles. △ Less

Submitted 22 October, 2023; originally announced October 2023.

Comments: 26 pages, 5 figures. Revised version published at Visual Informatics, 7(2), pp. 1\textendash{}12

Journal ref: Visual Informatics, (2023) 7(2), 1--12

arXiv:2301.06193 [pdf, other]

RedBit: An End-to-End Flexible Framework for Evaluating the Accuracy of Quantized CNNs

Authors: André Santos, João Dinis Ferreira, Onur Mutlu, Gabriel Falcao

Abstract: In recent years, Convolutional Neural Networks (CNNs) have become the standard class of deep neural network for image processing, classification and segmentation tasks. However, the large strides in accuracy obtained by CNNs have been derived from increasing the complexity of network topologies, which incurs sizeable performance and energy penalties in the training and inference of CNNs. Many rece… ▽ More In recent years, Convolutional Neural Networks (CNNs) have become the standard class of deep neural network for image processing, classification and segmentation tasks. However, the large strides in accuracy obtained by CNNs have been derived from increasing the complexity of network topologies, which incurs sizeable performance and energy penalties in the training and inference of CNNs. Many recent works have validated the effectiveness of parameter quantization, which consists in reducing the bit width of the network's parameters, to enable the attainment of considerable performance and energy efficiency gains without significantly compromising accuracy. However, it is difficult to compare the relative effectiveness of different quantization methods. To address this problem, we introduce RedBit, an open-source framework that provides a transparent, extensible and easy-to-use interface to evaluate the effectiveness of different algorithms and parameter configurations on network accuracy. We use RedBit to perform a comprehensive survey of five state-of-the-art quantization methods applied to the MNIST, CIFAR-10 and ImageNet datasets. We evaluate a total of 2300 individual bit width combinations, independently tuning the width of the network's weight and input activation parameters, from 32 bits down to 1 bit (e.g., 8/8, 2/2, 1/32, 1/1, for weights/activations). Upwards of 20000 hours of computing time in a pool of state-of-the-art GPUs were used to generate all the results in this paper. For 1-bit quantization, the accuracy losses for the MNIST, CIFAR-10 and ImageNet datasets range between [0.26%, 0.79%], [9.74%, 32.96%] and [10.86%, 47.36%] top-1, respectively. We actively encourage the reader to download the source code and experiment with RedBit, and to submit their own observed results to our public repository, available at https://github.com/IT-Coimbra/RedBit. △ Less

Submitted 15 January, 2023; originally announced January 2023.

Comments: 17 pages, 4 figures, 14 tables

arXiv:2210.04979 [pdf]

Label-free segmentation from cardiac ultrasound using self-supervised learning

Authors: Danielle L. Ferreira, Zaynaf Salaymang, Rima Arnaout

Abstract: Segmentation and measurement of cardiac chambers is critical in cardiac ultrasound but is laborious and poorly reproducible. Neural networks can assist, but supervised approaches require the same laborious manual annotations. We built a pipeline for self-supervised (no manual labels) segmentation combining computer vision, clinical domain knowledge, and deep learning. We trained on 450 echocardiog… ▽ More Segmentation and measurement of cardiac chambers is critical in cardiac ultrasound but is laborious and poorly reproducible. Neural networks can assist, but supervised approaches require the same laborious manual annotations. We built a pipeline for self-supervised (no manual labels) segmentation combining computer vision, clinical domain knowledge, and deep learning. We trained on 450 echocardiograms (93,000 images) and tested on 8,393 echocardiograms (4,476,266 images; mean 61 years, 51% female), using the resulting segmentations to calculate biometrics. We also tested against external images from an additional 10,030 patients with available manual tracings of the left ventricle. r2 between clinically measured and pipeline-predicted measurements were similar to reported inter-clinician variation and comparable to supervised learning across several different measurements (r2 0.56-0.84). Average accuracy for detecting abnormal chamber size and function was 0.85 (range 0.71-0.97) compared to clinical measurements. A subset of test echocardiograms (n=553) had corresponding cardiac MRIs, where MRI is the gold standard. Correlation between pipeline and MRI measurements was similar to that between clinical echocardiogram and MRI. Finally, the pipeline accurately segments the left ventricle with an average Dice score of 0.89 (95% CI [0.89]) in the external, manually labeled dataset. Our results demonstrate a manual-label free, clinically valid, and highly scalable method for segmentation from ultrasound, a noisy but globally important imaging modality. △ Less

Submitted 24 October, 2023; v1 submitted 10 October, 2022; originally announced October 2022.

Comments: 37 pages, 3 Tables, 7 Figures

arXiv:2207.05514 [pdf, other]

doi 10.3390/s22166063

A semi-supervised methodology for fishing activity detection using the geometry behind the trajectory of multiple vessels

Authors: Martha Dais Ferreira, Gabriel Spadon, Amilcar Soares, Stan Matwin

Abstract: Automatic Identification System (AIS) messages are useful for tracking vessel activity across oceans worldwide using radio links and satellite transceivers. Such data plays a significant role in tracking vessel activity and map** mobility patterns such as those found in fishing. Accordingly, this paper proposes a geometric-driven semi-supervised approach for fishing activity detection from AIS d… ▽ More Automatic Identification System (AIS) messages are useful for tracking vessel activity across oceans worldwide using radio links and satellite transceivers. Such data plays a significant role in tracking vessel activity and map** mobility patterns such as those found in fishing. Accordingly, this paper proposes a geometric-driven semi-supervised approach for fishing activity detection from AIS data. Through the proposed methodology we show how to explore the information included in the messages to extract features describing the geometry of the vessel route. To this end, we leverage the unsupervised nature of cluster analysis to label the trajectory geometry highlighting the changes in the vessel's moving pattern which tends to indicate fishing activity. The labels obtained by the proposed unsupervised approach are used to detect fishing activities, which we approach as a time-series classification task. In this context, we propose a solution using recurrent neural networks on AIS data streams with roughly 87% of the overall $F$-score on the whole trajectories of 50 different unseen fishing vessels. Such results are accompanied by a broad benchmark study assessing the performance of different Recurrent Neural Network (RNN) architectures. In conclusion, this work contributes by proposing a thorough process that includes data preparation, labeling, data modeling, and model validation. Therefore, we present a novel solution for mobility pattern detection that relies upon unfolding the trajectory in time and observing their inherent geometry. △ Less

Submitted 22 August, 2022; v1 submitted 12 July, 2022; originally announced July 2022.

Comments: Ferreira, M.D.; Spadon, G.; Soares, A.; Matwin, S. A Semi-Supervised Methodology for Fishing Activity Detection Using the Geometry behind the Trajectory of Multiple Vessels. Sensors 2022, 22, 6063. https://doi.org/10.3390/s22166063

arXiv:2205.15607 [pdf, other]

doi 10.1002/hbm.26165

Generative Aging of Brain Images with Diffeomorphic Registration

Authors: **gru Fu, Antonios Tzortzakakis, José Barroso, Eric Westman, Daniel Ferreira, Rodrigo Moreno

Abstract: Analyzing and predicting brain aging is essential for early prognosis and accurate diagnosis of cognitive diseases. The technique of neuroimaging, such as Magnetic Resonance Imaging (MRI), provides a noninvasive means of observing the aging process within the brain. With longitudinal image data collection, data-intensive Artificial Intelligence (AI) algorithms have been used to examine brain aging… ▽ More Analyzing and predicting brain aging is essential for early prognosis and accurate diagnosis of cognitive diseases. The technique of neuroimaging, such as Magnetic Resonance Imaging (MRI), provides a noninvasive means of observing the aging process within the brain. With longitudinal image data collection, data-intensive Artificial Intelligence (AI) algorithms have been used to examine brain aging. However, existing state-of-the-art algorithms tend to be restricted to group-level predictions and suffer from unreal predictions. This paper proposes a methodology for generating longitudinal MRI scans that capture subject-specific neurodegeneration and retain anatomical plausibility in aging. The proposed methodology is developed within the framework of diffeomorphic registration and relies on three key novel technological advances to generate subject-level anatomically plausible predictions: i) a computationally efficient and individualized generative framework based on registration; ii) an aging generative module based on biological linear aging progression; iii) a quality control module to fit registration for generation task. Our methodology was evaluated on 2662 T1-weighted (T1-w) MRI scans from 796 participants from three different cohorts. First, we applied 6 commonly used criteria to demonstrate the aging simulation ability of the proposed methodology; Secondly, we evaluated the quality of the synthetic images using quantitative measurements and qualitative assessment by a neuroradiologist. Overall, the experimental results show that the proposed method can produce anatomically plausible predictions that can be used to enhance longitudinal datasets, in turn enabling data-hungry AI-driven healthcare tools. △ Less

Submitted 31 May, 2022; originally announced May 2022.

arXiv:2202.13867 [pdf, other]

Unfolding AIS transmission behavior for vessel movement modeling on noisy data leveraging machine learning

Authors: Gabriel Spadon, Martha D. Ferreira, Amilcar Soares, Stan Matwin

Abstract: The oceans are a source of an impressive mixture of complex data that could be used to uncover relationships yet to be discovered. Such data comes from the oceans and their surface, such as Automatic Identification System (AIS) messages used for tracking vessels' trajectories. AIS messages are transmitted over radio or satellite at ideally periodic time intervals but vary irregularly over time. As… ▽ More The oceans are a source of an impressive mixture of complex data that could be used to uncover relationships yet to be discovered. Such data comes from the oceans and their surface, such as Automatic Identification System (AIS) messages used for tracking vessels' trajectories. AIS messages are transmitted over radio or satellite at ideally periodic time intervals but vary irregularly over time. As such, this paper aims to model the AIS message transmission behavior through neural networks for forecasting upcoming AIS messages' content from multiple vessels, particularly in a simultaneous approach despite messages' temporal irregularities as outliers. We present a set of experiments comprising multiple algorithms for forecasting tasks with horizon sizes of varying lengths. Deep learning models (e.g., neural networks) revealed themselves to adequately preserve vessels' spatial awareness regardless of temporal irregularity. We show how convolutional layers, feed-forward networks, and recurrent neural networks can improve such tasks by working together. Experimenting with short, medium, and large-sized sequences of messages, our model achieved 36/37/38% of the Relative Percentage Difference - the lower, the better, whereas we observed 92/45/96% on the Elman's RNN, 51/52/40% on the GRU, and 129/98/61% on the LSTM. These results support our model as a driver for improving the prediction of vessel routes when analyzing multiple vessels of diverging types simultaneously under temporally noise data. △ Less

Submitted 5 July, 2022; v1 submitted 24 February, 2022; originally announced February 2022.

arXiv:2202.02432 [pdf, other]

doi 10.1162/coli_a_00462

Transformers and the representation of biomedical background knowledge

Authors: Oskar Wysocki, Zili Zhou, Paul O'Regan, Deborah Ferreira, Magdalena Wysocka, Dónal Landers, André Freitas

Abstract: Specialised transformers-based models (such as BioBERT and BioMegatron) are adapted for the biomedical domain based on publicly available biomedical corpora. As such, they have the potential to encode large-scale biological knowledge. We investigate the encoding and representation of biological knowledge in these models, and its potential utility to support inference in cancer precision medicine -… ▽ More Specialised transformers-based models (such as BioBERT and BioMegatron) are adapted for the biomedical domain based on publicly available biomedical corpora. As such, they have the potential to encode large-scale biological knowledge. We investigate the encoding and representation of biological knowledge in these models, and its potential utility to support inference in cancer precision medicine - namely, the interpretation of the clinical significance of genomic alterations. We compare the performance of different transformer baselines; we use probing to determine the consistency of encodings for distinct entities; and we use clustering methods to compare and contrast the internal properties of the embeddings for genes, variants, drugs and diseases. We show that these models do indeed encode biological knowledge, although some of this is lost in fine-tuning for specific tasks. Finally, we analyse how the models behave with regard to biases and imbalances in the dataset. △ Less

Submitted 18 August, 2022; v1 submitted 4 February, 2022; originally announced February 2022.

Comments: 25 pages, 12 figures, supplementary methods, tables and figures at the end of the manuscript

Journal ref: Computational Linguistics 2022

arXiv:2112.08289 [pdf, other]

Decomposing Natural Logic Inferences in Neural NLI

Authors: Julia Rozanova, Deborah Ferreira, Marco Valentino, Mokanrarangan Thayaparan, Andre Freitas

Abstract: In the interest of interpreting neural NLI models and their reasoning strategies, we carry out a systematic probing study which investigates whether these models capture the crucial semantic features central to natural logic: monotonicity and concept inclusion. Correctly identifying valid inferences in downward-monotone contexts is a known stumbling block for NLI performance, subsuming linguistic… ▽ More In the interest of interpreting neural NLI models and their reasoning strategies, we carry out a systematic probing study which investigates whether these models capture the crucial semantic features central to natural logic: monotonicity and concept inclusion. Correctly identifying valid inferences in downward-monotone contexts is a known stumbling block for NLI performance, subsuming linguistic phenomena such as negation scope and generalized quantifiers. To understand this difficulty, we emphasize monotonicity as a property of a context and examine the extent to which models capture monotonicity information in the contextual embeddings which are intermediate to their decision making process. Drawing on the recent advancement of the probing paradigm, we compare the presence of monotonicity features across various models. We find that monotonicity information is notably weak in the representations of popular NLI models which achieve high scores on benchmarks, and observe that previous improvements to these models based on fine-tuning strategies have introduced stronger monotonicity features together with their improved performance on challenge sets. △ Less

Submitted 8 November, 2023; v1 submitted 15 December, 2021; originally announced December 2021.

arXiv:2109.08634 [pdf, other]

Grounding Natural Language Instructions: Can Large Language Models Capture Spatial Information?

Authors: Julia Rozanova, Deborah Ferreira, Krishna Dubba, Weiwei Cheng, Dell Zhang, Andre Freitas

Abstract: Models designed for intelligent process automation are required to be capable of grounding user interface elements. This task of interface element grounding is centred on linking instructions in natural language to their target referents. Even though BERT and similar pre-trained language models have excelled in several NLP tasks, their use has not been widely explored for the UI grounding domain.… ▽ More Models designed for intelligent process automation are required to be capable of grounding user interface elements. This task of interface element grounding is centred on linking instructions in natural language to their target referents. Even though BERT and similar pre-trained language models have excelled in several NLP tasks, their use has not been widely explored for the UI grounding domain. This work concentrates on testing and probing the grounding abilities of three different transformer-based models: BERT, RoBERTa and LayoutLM. Our primary focus is on these models' spatial reasoning skills, given their importance in this domain. We observe that LayoutLM has a promising advantage for applications in this domain, even though it was created for a different original purpose (representing scanned documents): the learned spatial features appear to be transferable to the UI grounding setting, especially as they demonstrate the ability to discriminate between target directions in natural language instructions. △ Less

Submitted 17 September, 2021; originally announced September 2021.

Comments: *Equal contribution

arXiv:2107.11879 [pdf, other]

Hybrid Autoregressive Inference for Scalable Multi-hop Explanation Regeneration

Authors: Marco Valentino, Mokanarangan Thayaparan, Deborah Ferreira, André Freitas

Abstract: Regenerating natural language explanations in the scientific domain has been proposed as a benchmark to evaluate complex multi-hop and explainable inference. In this context, large language models can achieve state-of-the-art performance when employed as cross-encoder architectures and fine-tuned on human-annotated explanations. However, while much attention has been devoted to the quality of the… ▽ More Regenerating natural language explanations in the scientific domain has been proposed as a benchmark to evaluate complex multi-hop and explainable inference. In this context, large language models can achieve state-of-the-art performance when employed as cross-encoder architectures and fine-tuned on human-annotated explanations. However, while much attention has been devoted to the quality of the explanations, the problem of performing inference efficiently is largely under-studied. Cross-encoders, in fact, are intrinsically not scalable, possessing limited applicability to real-world scenarios that require inference on massive facts banks. To enable complex multi-hop reasoning at scale, this paper focuses on bi-encoder architectures, investigating the problem of scientific explanation regeneration at the intersection of dense and sparse models. Specifically, we present SCAR (for Scalable Autoregressive Inference), a hybrid framework that iteratively combines a Transformer-based bi-encoder with a sparse model of explanatory power, designed to leverage explicit inference patterns in the explanations. Our experiments demonstrate that the hybrid framework significantly outperforms previous sparse models, achieving performance comparable with that of state-of-the-art cross-encoders while being approx 50 times faster and scalable to corpora of millions of facts. Further analyses on semantic drift and multi-hop question answering reveal that the proposed hybridisation boosts the quality of the most challenging explanations, contributing to improved performance on downstream inference tasks. △ Less

Submitted 6 December, 2021; v1 submitted 25 July, 2021; originally announced July 2021.

Comments: To appear at the 36th AAAI Conference on Artificial Intelligence (AAAI-22)

arXiv:2105.08008 [pdf, other]

Supporting Context Monotonicity Abstractions in Neural NLI Models

Authors: Julia Rozanova, Deborah Ferreira, Mokanarangan Thayaparan, Marco Valentino, André Freitas

Abstract: Natural language contexts display logical regularities with respect to substitutions of related concepts: these are captured in a functional order-theoretic property called monotonicity. For a certain class of NLI problems where the resulting entailment label depends only on the context monotonicity and the relation between the substituted concepts, we build on previous techniques that aim to impr… ▽ More Natural language contexts display logical regularities with respect to substitutions of related concepts: these are captured in a functional order-theoretic property called monotonicity. For a certain class of NLI problems where the resulting entailment label depends only on the context monotonicity and the relation between the substituted concepts, we build on previous techniques that aim to improve the performance of NLI models for these problems, as consistent performance across both upward and downward monotone contexts still seems difficult to attain even for state-of-the-art models. To this end, we reframe the problem of context monotonicity classification to make it compatible with transformer-based pre-trained NLI models and add this task to the training pipeline. Furthermore, we introduce a sound and complete simplified monotonicity logic formalism which describes our treatment of contexts as abstract units. Using the notions in our formalism, we adapt targeted challenge sets to investigate whether an intermediate context monotonicity classification task can aid NLI models' performance on examples exhibiting monotonicity reasoning. △ Less

Submitted 17 May, 2021; originally announced May 2021.

Comments: NALOMA'21 (NAtural LOgic Meets MAchine Learning) @IWCS 2021

arXiv:2105.03417 [pdf, other]

Diff-Explainer: Differentiable Convex Optimization for Explainable Multi-hop Inference

Authors: Mokanarangan Thayaparan, Marco Valentino, Deborah Ferreira, Julia Rozanova, André Freitas

Abstract: This paper presents Diff-Explainer, the first hybrid framework for explainable multi-hop inference that integrates explicit constraints with neural architectures through differentiable convex optimization. Specifically, Diff-Explainer allows for the fine-tuning of neural representations within a constrained optimization framework to answer and explain multi-hop questions in natural language. To de… ▽ More This paper presents Diff-Explainer, the first hybrid framework for explainable multi-hop inference that integrates explicit constraints with neural architectures through differentiable convex optimization. Specifically, Diff-Explainer allows for the fine-tuning of neural representations within a constrained optimization framework to answer and explain multi-hop questions in natural language. To demonstrate the efficacy of the hybrid framework, we combine existing ILP-based solvers for multi-hop Question Answering (QA) with Transformer-based representations. An extensive empirical evaluation on scientific and commonsense QA tasks demonstrates that the integration of explicit constraints in an end-to-end differentiable framework can significantly improve the performance of non-differentiable ILP solvers (8.91% - 13.3%). Moreover, additional analysis reveals that Diff-Explainer is able to achieve strong performance when compared to standalone Transformers and previous multi-hop approaches while still providing structured explanations in support of its predictions. △ Less

Submitted 22 June, 2022; v1 submitted 7 May, 2021; originally announced May 2021.

arXiv:2104.07699 [pdf, other]

doi 10.1109/MICRO56248.2022.00067

pLUTo: Enabling Massively Parallel Computation in DRAM via Lookup Tables

Authors: João Dinis Ferreira, Gabriel Falcao, Juan Gómez-Luna, Mohammed Alser, Lois Orosa, Mohammad Sadrosadati, Jeremie S. Kim, Geraldo F. Oliveira, Taha Shahroodi, Anant Nori, Onur Mutlu

Abstract: Data movement between the main memory and the processor is a key contributor to execution time and energy consumption in memory-intensive applications. This data movement bottleneck can be alleviated using Processing-in-Memory (PiM). One category of PiM is Processing-using-Memory (PuM), in which computation takes place inside the memory array by exploiting intrinsic analog properties of the memory… ▽ More Data movement between the main memory and the processor is a key contributor to execution time and energy consumption in memory-intensive applications. This data movement bottleneck can be alleviated using Processing-in-Memory (PiM). One category of PiM is Processing-using-Memory (PuM), in which computation takes place inside the memory array by exploiting intrinsic analog properties of the memory device. PuM yields high performance and energy efficiency, but existing PuM techniques support a limited range of operations. As a result, current PuM architectures cannot efficiently perform some complex operations (e.g., multiplication, division, exponentiation) without large increases in chip area and design complexity. To overcome these limitations of existing PuM architectures, we introduce pLUTo (processing-using-memory with lookup table (LUT) operations), a DRAM-based PuM architecture that leverages the high storage density of DRAM to enable the massively parallel storing and querying of lookup tables (LUTs). The key idea of pLUTo is to replace complex operations with low-cost, bulk memory reads (i.e., LUT queries) instead of relying on complex extra logic. We evaluate pLUTo across 11 real-world workloads that showcase the limitations of prior PuM approaches and show that our solution outperforms optimized CPU and GPU baselines by an average of 713$\times$ and 1.2$\times$, respectively, while simultaneously reducing energy consumption by an average of 1855$\times$ and 39.5$\times$. Across these workloads, pLUTo outperforms state-of-the-art PiM architectures by an average of 18.3$\times$. We also show that different versions of pLUTo provide different levels of flexibility and performance at different additional DRAM area overheads (between 10.2% and 23.1%). pLUTo's source code is openly and fully available at https://github.com/CMU-SAFARI/pLUTo. △ Less

Submitted 3 October, 2022; v1 submitted 15 April, 2021; originally announced April 2021.

ACM Class: B.3.1; C.1.3

Journal ref: IEEE/ACM International Symposium on Microarchitecture (MICRO), 2022, 900-919

arXiv:2104.05807 [pdf, other]

Does My Representation Capture X? Probe-Ably

Authors: Deborah Ferreira, Julia Rozanova, Mokanarangan Thayaparan, Marco Valentino, André Freitas

Abstract: Probing (or diagnostic classification) has become a popular strategy for investigating whether a given set of intermediate features is present in the representations of neural models. Probing studies may have misleading results, but various recent works have suggested more reliable methodologies that compensate for the possible pitfalls of probing. However, these best practices are numerous and fa… ▽ More Probing (or diagnostic classification) has become a popular strategy for investigating whether a given set of intermediate features is present in the representations of neural models. Probing studies may have misleading results, but various recent works have suggested more reliable methodologies that compensate for the possible pitfalls of probing. However, these best practices are numerous and fast-evolving. To simplify the process of running a set of probing experiments in line with suggested methodologies, we introduce Probe-Ably: an extendable probing framework which supports and automates the application of probing methods to the user's inputs. △ Less

Submitted 30 September, 2021; v1 submitted 12 April, 2021; originally announced April 2021.

Comments: ACL 2021 (System Demonstrations)

arXiv:2104.05741 [pdf, other]

Active learning for medical code assignment

Authors: Martha Dais Ferreira, Michal Malyska, Nicola Sahar, Riccardo Miotto, Fernando Paulovich, Evangelos Milios

Abstract: Machine Learning (ML) is widely used to automatically extract meaningful information from Electronic Health Records (EHR) to support operational, clinical, and financial decision-making. However, ML models require a large number of annotated examples to provide satisfactory results, which is not possible in most healthcare scenarios due to the high cost of clinician-labeled data. Active Learning (… ▽ More Machine Learning (ML) is widely used to automatically extract meaningful information from Electronic Health Records (EHR) to support operational, clinical, and financial decision-making. However, ML models require a large number of annotated examples to provide satisfactory results, which is not possible in most healthcare scenarios due to the high cost of clinician-labeled data. Active Learning (AL) is a process of selecting the most informative instances to be labeled by an expert to further train a supervised algorithm. We demonstrate the effectiveness of AL in multi-label text classification in the clinical domain. In this context, we apply a set of well-known AL methods to help automatically assign ICD-9 codes on the MIMIC-III dataset. Our results show that the selection of informative instances provides satisfactory classification with a significantly reduced training set (8.3\% of the total instances). We conclude that AL methods can significantly reduce the manual annotation cost while preserving model performance. △ Less

Submitted 12 April, 2021; originally announced April 2021.

Comments: It was accepted in the ACM CHIL 2021 workshop track

arXiv:2101.08655 [pdf, other]

doi 10.3390/info13080368

Q4EDA: A Novel Strategy for Textual Information Retrieval Based on User Interactions with Visual Representations of Time Series

Authors: Leonardo Christino, Martha D. Ferreira, Fernando V. Paulovich

Abstract: Knowing how to construct text-based Search Queries (SQs) for use in Search Engines (SEs) such as Google or Wikipedia has become a fundamental skill. Though much data are available through such SEs, most structured datasets live outside their scope. Visualization tools aid in this limitation, but no such tools come close to the sheer amount of information available through general-purpose SEs. To f… ▽ More Knowing how to construct text-based Search Queries (SQs) for use in Search Engines (SEs) such as Google or Wikipedia has become a fundamental skill. Though much data are available through such SEs, most structured datasets live outside their scope. Visualization tools aid in this limitation, but no such tools come close to the sheer amount of information available through general-purpose SEs. To fill this gap, this paper presents Q4EDA, a novel framework that converts users' visual selection queries executed on top of time series visual representations, providing valid and stable SQs to be used in general-purpose SEs and suggestions of related information. The usefulness of Q4EDA is presented and validated by users through an application linking a Gapminder's line-chart replica with a SE populated with Wikipedia documents, showing how Q4EDA supports and enhances exploratory analysis of United Nations world indicators. Despite some limitations, Q4EDA is unique in its proposal and represents a real advance towards providing solutions for querying textual information based on user interactions with visual representations. △ Less

Submitted 2 August, 2022; v1 submitted 19 January, 2021; originally announced January 2021.

Comments: 12 Figures, 21 pages, submitted to Information 2022 special issue: Trends and Opportunities in Visualization and Visual Analytics

ACM Class: H.3.3; I.2.6; I.2.7; I.3.8

Journal ref: Information 13 (2022), no. 8: 368

arXiv:2012.14850 [pdf]

Localizacao em ambientes internos utilizando redes Wi-Fi

Authors: David Alan de Oliveira Ferreira, Celso Barbosa Carvalho, Edjair de Souza Mota

Abstract: This paper presents a localization method for indoor environments capable of improving the location accuracy that is hampered by instability in RSSI of the IEEE 802.11 networks. The method employs the k-Nearest Neighbors (kNN) algorithm and quartiles analysis in the data representation. The proposal had null error with only four APs and 10 readings per sample of each AP with just 0.69 second to lo… ▽ More This paper presents a localization method for indoor environments capable of improving the location accuracy that is hampered by instability in RSSI of the IEEE 802.11 networks. The method employs the k-Nearest Neighbors (kNN) algorithm and quartiles analysis in the data representation. The proposal had null error with only four APs and 10 readings per sample of each AP with just 0.69 second to locate. These values are important contributions, confirming that the method is promising to locate objects in indoor environments. △ Less

Submitted 29 December, 2020; originally announced December 2020.

Comments: in Portuguese. Simposio Brasileiro de telecomunicacoes e processamento de sinais, SBrT 2019. p. 1-5, Petropolis, RJ

arXiv:2012.11890 [pdf, ps, other]

SIMDRAM: A Framework for Bit-Serial SIMD Processing Using DRAM

Authors: Nastaran Ha**azar, Geraldo F. Oliveira, Sven Gregorio, João Dinis Ferreira, Nika Mansouri Ghiasi, Minesh Patel, Mohammed Alser, Saugata Ghose, Juan Gómez-Luna, Onur Mutlu

Abstract: Processing-using-DRAM has been proposed for a limited set of basic operations (i.e., logic operations, addition). However, in order to enable the full adoption of processing-using-DRAM, it is necessary to provide support for more complex operations. In this paper, we propose SIMDRAM, a flexible general-purpose processing-using-DRAM framework that enables massively-parallel computation of a wide ra… ▽ More Processing-using-DRAM has been proposed for a limited set of basic operations (i.e., logic operations, addition). However, in order to enable the full adoption of processing-using-DRAM, it is necessary to provide support for more complex operations. In this paper, we propose SIMDRAM, a flexible general-purpose processing-using-DRAM framework that enables massively-parallel computation of a wide range of operations by using each DRAM column as an independent SIMD lane to perform bit-serial operations. SIMDRAM consists of three key steps to enable a desired operation in DRAM: (1) building an efficient majority-based representation of the desired operation, (2) map** the operation input and output operands to DRAM rows and to the required DRAM commands that produce the desired operation, and (3) executing the operation. These three steps ensure efficient computation of any arbitrary and complex operation in DRAM. The first two steps give users the flexibility to efficiently implement and compute any desired operation in DRAM. The third step controls the execution flow of the in-DRAM computation, transparently from the user. We comprehensively evaluate SIMDRAM's reliability, area overhead, operation throughput, and energy efficiency using a wide range of operations and seven diverse real-world kernels to demonstrate its generality. Our results show that SIMDRAM provides up to 5.1x higher operation throughput and 2.5x higher energy efficiency than a state-of-the-art in-DRAM computing mechanism, and up to 2.5x speedup for real-world kernels while incurring less than 1% DRAM chip area overhead. Compared to a CPU and a high-end GPU, SIMDRAM is 257x and 31x more energy-efficient, while providing 93x and 6x higher operation throughput, respectively. △ Less

Submitted 22 December, 2020; originally announced December 2020.

Comments: Extended abstract of the full paper to appear in ASPLOS 2021

arXiv:2010.02825 [pdf, other]

WoLFRaM: Enhancing Wear-Leveling and Fault Tolerance in Resistive Memories using Programmable Address Decoders

Authors: Leonid Yavits, Lois Orosa, Suyash Mahar, João Dinis Ferreira, Mattan Erez, Ran Ginosar, Onur Mutlu

Abstract: Resistive memories have limited lifetime caused by limited write endurance and highly non-uniform write access patterns. Two main techniques to mitigate endurance-related memory failures are 1) wear-leveling, to evenly distribute the writes across the entire memory, and 2) fault tolerance, to correct memory cell failures. However, one of the main open challenges in extending the lifetime of existi… ▽ More Resistive memories have limited lifetime caused by limited write endurance and highly non-uniform write access patterns. Two main techniques to mitigate endurance-related memory failures are 1) wear-leveling, to evenly distribute the writes across the entire memory, and 2) fault tolerance, to correct memory cell failures. However, one of the main open challenges in extending the lifetime of existing resistive memories is to make both techniques work together seamlessly and efficiently. To address this challenge, we propose WoLFRaM, a new mechanism that combines both wear-leveling and fault tolerance techniques at low cost by using a programmable resistive address decoder (PRAD). The key idea of WoLFRaM is to use PRAD for implementing 1) a new efficient wear-leveling mechanism that remaps write accesses to random physical locations on the fly, and 2) a new efficient fault tolerance mechanism that recovers from faults by remap** failed memory blocks to available physical locations. Our evaluations show that, for a Phase Change Memory (PCM) based system with cell endurance of 108 writes, WoLFRaM increases the memory lifetime by 68% compared to a baseline that implements the best state-of-the-art wear-leveling and fault correction mechanisms. WoLFRaM's average / worst-case performance and energy overheads are 0.51% / 3.8% and 0.47% / 2.1% respectively. △ Less

Submitted 6 October, 2020; originally announced October 2020.

Comments: To appear in ICCD 2020

arXiv:2009.02708 [pdf, other]

Deep Learning for the Analysis of Disruption Precursors based on Plasma Tomography

Authors: Diogo R. Ferreira, Pedro J. Carvalho, Carlo Sozzi, Peter J. Lomas, JET Contributors

Abstract: The JET baseline scenario is being developed to achieve high fusion performance and sustained fusion power. However, with higher plasma current and higher input power, an increase in pulse disruptivity is being observed. Although there is a wide range of possible disruption causes, the present disruptions seem to be closely related to radiative phenomena such as impurity accumulation, core radiati… ▽ More The JET baseline scenario is being developed to achieve high fusion performance and sustained fusion power. However, with higher plasma current and higher input power, an increase in pulse disruptivity is being observed. Although there is a wide range of possible disruption causes, the present disruptions seem to be closely related to radiative phenomena such as impurity accumulation, core radiation, and radiative collapse. In this work, we focus on bolometer tomography to reconstruct the plasma radiation profile and, on top of it, we apply anomaly detection to identify the radiation patterns that precede major disruptions. The approach makes extensive use of machine learning. First, we train a surrogate model for plasma tomography based on matrix multiplication, which provides a fast method to compute the plasma radiation profiles across the full extent of any given pulse. Then, we train a variational autoencoder to reproduce the radiation profiles by encoding them into a latent distribution and subsequently decoding them. As an anomaly detector, the variational autoencoder struggles to reproduce unusual behaviors, which includes not only the actual disruptions but their precursors as well. These precursors are identified based on an analysis of the anomaly score across all baseline pulses in two recent campaigns at JET. △ Less

Submitted 8 September, 2020; v1 submitted 6 September, 2020; originally announced September 2020.

Comments: (to appear)

Journal ref: Fusion Science and Technology, 2020

arXiv:2007.02194 [pdf]

30 Years of Software Refactoring Research:A Systematic Literature Review

Authors: Chaima Abid, Vahid Alizadeh, Marouane Kessentini, Thiago do Nascimento Ferreira, Danny Dig

Abstract: Due to the growing complexity of software systems, there has been a dramatic increase and industry demand for tools and techniques on software refactoring in the last ten years, defined traditionally as a set of program transformations intended to improve the system design while preserving the behavior. Refactoring studies are expanded beyond code-level restructuring to be applied at different lev… ▽ More Due to the growing complexity of software systems, there has been a dramatic increase and industry demand for tools and techniques on software refactoring in the last ten years, defined traditionally as a set of program transformations intended to improve the system design while preserving the behavior. Refactoring studies are expanded beyond code-level restructuring to be applied at different levels (architecture, model, requirements, etc.), adopted in many domains beyond the object-oriented paradigm (cloud computing, mobile, web, etc.), used in industrial settings and considered objectives beyond improving the design to include other non-functional requirements (e.g., improve performance, security, etc.). Thus, challenges to be addressed by refactoring work are, nowadays, beyond code transformation to include, but not limited to, scheduling the opportune time to carry refactoring, recommendations of specific refactoring activities, detection of refactoring opportunities, and testing the correctness of applied refactorings. Therefore, the refactoring research efforts are fragmented over several research communities, various domains, and objectives. To structure the field and existing research results, this paper provides a systematic literature review and analyzes the results of 3183 research papers on refactoring covering the last three decades to offer the most scalable and comprehensive literature review of existing refactoring research studies. Based on this survey, we created a taxonomy to classify the existing research, identified research trends, and highlighted gaps in the literature and avenues for further research. △ Less

Submitted 4 July, 2020; originally announced July 2020.

Comments: 23 pages

arXiv:2004.14959 [pdf, other]

Natural Language Premise Selection: Finding Supporting Statements for Mathematical Text

Authors: Deborah Ferreira, Andre Freitas

Abstract: Mathematical text is written using a combination of words and mathematical expressions. This combination, along with a specific way of structuring sentences makes it challenging for state-of-art NLP tools to understand and reason on top of mathematical discourse. In this work, we propose a new NLP task, the natural premise selection, which is used to retrieve supporting definitions and supporting… ▽ More Mathematical text is written using a combination of words and mathematical expressions. This combination, along with a specific way of structuring sentences makes it challenging for state-of-art NLP tools to understand and reason on top of mathematical discourse. In this work, we propose a new NLP task, the natural premise selection, which is used to retrieve supporting definitions and supporting propositions that are useful for generating an informal mathematical proof for a particular statement. We also make available a dataset, NL-PS, which can be used to evaluate different approaches for the natural premise selection task. Using different baselines, we demonstrate the underlying interpretation challenges associated with the task. △ Less

Submitted 30 April, 2020; originally announced April 2020.

Comments: 12th Language Resources and Evaluation Conference (LREC), Marseille, France, 2020 (Language Resource Paper)

arXiv:2001.02639 [pdf, other]

On the Evaluation of Intelligent Process Automation

Authors: Deborah Ferreira, Julia Rozanova, Krishna Dubba, Dell Zhang, Andre Freitas

Abstract: Intelligent Process Automation (IPA) is emerging as a sub-field of AI to support the automation of long-tail processes which requires the coordination of tasks across different systems. So far, the field of IPA has been largely driven by systems and use cases, lacking a more formal definition of the task and its assessment. This paper aims to address this gap by providing a formalisation of IPA an… ▽ More Intelligent Process Automation (IPA) is emerging as a sub-field of AI to support the automation of long-tail processes which requires the coordination of tasks across different systems. So far, the field of IPA has been largely driven by systems and use cases, lacking a more formal definition of the task and its assessment. This paper aims to address this gap by providing a formalisation of IPA and by proposing specific metrics to support the empirical evaluation of IPA systems. This work also compares and contrasts IPA against related tasks such as end-user programming and program synthesis. △ Less

Submitted 4 February, 2020; v1 submitted 8 January, 2020; originally announced January 2020.

Comments: Submitted to the AAAI-20 Workshop on Intelligent Process Automation

arXiv:1911.00515 [pdf, ps, other]

doi 10.1016/j.media.2020.101714

The reliability of a deep learning model in clinical out-of-distribution MRI data: a multicohort study

Authors: Gustav Mårtensson, Daniel Ferreira, Tobias Granberg, Lena Cavallin, Ketil Oppedal, Alessandro Padovani, Irena Rektorova, Laura Bonanni, Matteo Pardini, Milica Kramberger, John-Paul Taylor, Jakub Hort, Jón Snædal, Jaime Kulisevsky, Frederic Blanc, Angelo Antonini, Patrizia Mecocci, Bruno Vellas, Magda Tsolaki, Iwona Kłoszewska, Hilkka Soininen, Simon Lovestone, Andrew Simmons, Dag Aarsland, Eric Westman

Abstract: Deep learning (DL) methods have in recent years yielded impressive results in medical imaging, with the potential to function as clinical aid to radiologists. However, DL models in medical imaging are often trained on public research cohorts with images acquired with a single scanner or with strict protocol harmonization, which is not representative of a clinical setting. The aim of this study was… ▽ More Deep learning (DL) methods have in recent years yielded impressive results in medical imaging, with the potential to function as clinical aid to radiologists. However, DL models in medical imaging are often trained on public research cohorts with images acquired with a single scanner or with strict protocol harmonization, which is not representative of a clinical setting. The aim of this study was to investigate how well a DL model performs in unseen clinical data sets---collected with different scanners, protocols and disease populations---and whether more heterogeneous training data improves generalization. In total, 3117 MRI scans of brains from multiple dementia research cohorts and memory clinics, that had been visually rated by a neuroradiologist according to Scheltens' scale of medial temporal atrophy (MTA), were included in this study. By training multiple versions of a convolutional neural network on different subsets of this data to predict MTA ratings, we assessed the impact of including images from a wider distribution during training had on performance in external memory clinic data. Our results showed that our model generalized well to data sets acquired with similar protocols as the training data, but substantially worse in clinical cohorts with visibly different tissue contrasts in the images. This implies that future DL studies investigating performance in out-of-distribution (OOD) MRI data need to assess multiple external cohorts for reliable results. Further, by including data from a wider range of scanners and protocols the performance improved in OOD data, which suggests that more heterogeneous training data makes the model generalize better. To conclude, this is the most comprehensive study to date investigating the domain shift in deep learning on MRI data, and we advocate rigorous evaluation of DL models on clinical data prior to being certified for deployment. △ Less

Submitted 1 November, 2019; originally announced November 2019.

Comments: 11 pages, 3 figures

arXiv:1910.13257 [pdf, other]

doi 10.1109/TPS.2019.2947304

Deep Learning for Plasma Tomography and Disruption Prediction from Bolometer Data

Authors: Diogo R. Ferreira, Pedro J. Carvalho, Horácio Fernandes

Abstract: The use of deep learning is facilitating a wide range of data processing tasks in many areas. The analysis of fusion data is no exception, since there is a need to process large amounts of data collected from the diagnostic systems attached to a fusion device. Fusion data involves images and time series, and are a natural candidate for the use of convolutional and recurrent neural networks. In thi… ▽ More The use of deep learning is facilitating a wide range of data processing tasks in many areas. The analysis of fusion data is no exception, since there is a need to process large amounts of data collected from the diagnostic systems attached to a fusion device. Fusion data involves images and time series, and are a natural candidate for the use of convolutional and recurrent neural networks. In this work, we describe how CNNs can be used to reconstruct the plasma radiation profile, and we discuss the potential of using RNNs for disruption prediction based on the same input data. Both approaches have been applied at JET using data from a multi-channel diagnostic system. Similar approaches can be applied to other fusion devices and diagnostics. △ Less

Submitted 27 October, 2019; originally announced October 2019.

Comments: arXiv admin note: substantial text overlap with arXiv:1811.00333

Journal ref: IEEE Transactions on Plasma Science, 2019

arXiv:1909.13404 [pdf, other]

Towards modular and programmable architecture search

Authors: Renato Negrinho, Darshan Patil, Nghia Le, Daniel Ferreira, Matthew Gormley, Geoffrey Gordon

Abstract: Neural architecture search methods are able to find high performance deep learning architectures with minimal effort from an expert. However, current systems focus on specific use-cases (e.g. convolutional image classifiers and recurrent language models), making them unsuitable for general use-cases that an expert might wish to write. Hyperparameter optimization systems are general-purpose but lac… ▽ More Neural architecture search methods are able to find high performance deep learning architectures with minimal effort from an expert. However, current systems focus on specific use-cases (e.g. convolutional image classifiers and recurrent language models), making them unsuitable for general use-cases that an expert might wish to write. Hyperparameter optimization systems are general-purpose but lack the constructs needed for easy application to architecture search. In this work, we propose a formal language for encoding search spaces over general computational graphs. The language constructs allow us to write modular, composable, and reusable search space encodings and to reason about search space design. We use our language to encode search spaces from the architecture search literature. The language allows us to decouple the implementations of the search space and the search algorithm, allowing us to expose search spaces to search algorithms through a consistent interface. Our experiments show the ease with which we can experiment with different combinations of search spaces and search algorithms without having to implement each combination from scratch. We release an implementation of our language with this paper. △ Less

Submitted 29 September, 2019; originally announced September 2019.

Comments: Published at NeurIPS 2019. Code and documentation for the language implementation can be found at https://github.com/negrinho/deep_architect

arXiv:1907.05280 [pdf, other]

City-GAN: Learning architectural styles using a custom Conditional GAN architecture

Authors: Maximilian Bachl, Daniel C. Ferreira

Abstract: Generative Adversarial Networks (GANs) are a well-known technique that is trained on samples (e.g. pictures of fruits) and which after training is able to generate realistic new samples. Conditional GANs (CGANs) additionally provide label information for subclasses (e.g. apple, orange, pear) which enables the GAN to learn more easily and increase the quality of its output samples. We use GANs to l… ▽ More Generative Adversarial Networks (GANs) are a well-known technique that is trained on samples (e.g. pictures of fruits) and which after training is able to generate realistic new samples. Conditional GANs (CGANs) additionally provide label information for subclasses (e.g. apple, orange, pear) which enables the GAN to learn more easily and increase the quality of its output samples. We use GANs to learn architectural features of major cities and to generate images of buildings which do not exist. We show that currently available GAN and CGAN architectures are unsuited for this task and propose a custom architecture and demonstrate that our architecture has superior performance for this task and verify its capabilities with extensive experiments. △ Less

Submitted 26 May, 2020; v1 submitted 3 July, 2019; originally announced July 2019.

arXiv:1901.00418 [pdf, other]

doi 10.1016/j.nicl.2019.101872

AVRA: Automatic Visual Ratings of Atrophy from MRI images using Recurrent Convolutional Neural Networks

Authors: Gustav Mårtensson, Daniel Ferreira, Lena Cavallin, J-Sebastian Muehlboeck, Lars-Olof Wahlund, Chunliang Wang, Eric Westman

Abstract: Quantifying the degree of atrophy is done clinically by neuroradiologists following established visual rating scales. For these assessments to be reliable the rater requires substantial training and experience, and even then the rating agreement between two radiologists is not perfect. We have developed a model we call AVRA (Automatic Visual Ratings of Atrophy) based on machine learning methods an… ▽ More Quantifying the degree of atrophy is done clinically by neuroradiologists following established visual rating scales. For these assessments to be reliable the rater requires substantial training and experience, and even then the rating agreement between two radiologists is not perfect. We have developed a model we call AVRA (Automatic Visual Ratings of Atrophy) based on machine learning methods and trained on 2350 visual ratings made by an experienced neuroradiologist. It provides fast and automatic ratings for Scheltens' scale of medial temporal atrophy (MTA), the frontal subscale of Pasquier's Global Cortical Atrophy (GCA-F) scale, and Koedam's scale of Posterior Atrophy (PA). We demonstrate substantial inter-rater agreement between AVRA's and a neuroradiologist ratings with Cohen's weighted kappa values of $κ_w$ = 0.74/0.72 (MTA left/right), $κ_w$ = 0.62 (GCA-F) and $κ_w$ = 0.74 (PA), with an inherent intra-rater agreement of $κ_w$ = 1. We conclude that automatic visual ratings of atrophy can potentially have great clinical and scientific value, and aim to present AVRA as a freely available toolbox. △ Less

Submitted 23 December, 2018; originally announced January 2019.

Comments: 11 pages, 6 figures

arXiv:1811.00333 [pdf, other]

Applications of Deep Learning to Nuclear Fusion Research

Authors: Diogo R. Ferreira

Abstract: Nuclear fusion is the process that powers the sun, and it is one of the best hopes to achieve a virtually unlimited energy source for the future of humanity. However, reproducing sustainable nuclear fusion reactions here on Earth is a tremendous scientific and technical challenge. Special devices -- called tokamaks -- have been built around the world, with JET (Joint European Torus, in the UK) bei… ▽ More Nuclear fusion is the process that powers the sun, and it is one of the best hopes to achieve a virtually unlimited energy source for the future of humanity. However, reproducing sustainable nuclear fusion reactions here on Earth is a tremendous scientific and technical challenge. Special devices -- called tokamaks -- have been built around the world, with JET (Joint European Torus, in the UK) being the largest tokamak currently in operation. Such devices confine matter and heat it up to extremely high temperatures, creating a plasma where fusion reactions begin to occur. JET has over one hundred diagnostic systems to monitor what happens inside the plasma, and each 30-second experiment (or pulse) generates about 50 GB of data. In this work, we show how convolutional neural networks (CNNs) can be used to reconstruct the 2D plasma profile inside the device based on data coming from those diagnostics. We also discuss how recurrent neural networks (RNNs) can be used to predict plasma disruptions, which are one of the major problems affecting tokamaks today. Training of such networks is done on NVIDIA GPUs. △ Less

Submitted 1 November, 2018; originally announced November 2018.

Comments: This paper is based on a talk presented at NVIDIA's GPU Technology Conference, which took place at the International Congress Center in Munich, Germany, October 9--11, 2018 (GTC Europe 2018)

arXiv:1808.03319 [pdf, other]

Continuous Authentication of Smartphones Based on Application Usage

Authors: Upal Mahbub, Jukka Komulainen, Denzil Ferreira, Rama Chellappa

Abstract: An empirical investigation of active/continuous authentication for smartphones is presented in this paper by exploiting users' unique application usage data, i.e., distinct patterns of use, modeled by a Markovian process. Variations of Hidden Markov Models (HMMs) are evaluated for continuous user verification, and challenges due to the sparsity of session-wise data, an explosion of states, and han… ▽ More An empirical investigation of active/continuous authentication for smartphones is presented in this paper by exploiting users' unique application usage data, i.e., distinct patterns of use, modeled by a Markovian process. Variations of Hidden Markov Models (HMMs) are evaluated for continuous user verification, and challenges due to the sparsity of session-wise data, an explosion of states, and handling unforeseen events in the test data are tackled. Unlike traditional approaches, the proposed formulation does not depend on the top N-apps, rather uses the complete app-usage information to achieve low latency. Through experimentation, empirical assessment of the impact of unforeseen events, i.e., unknown applications and unforeseen observations, on user verification is done via a modified edit-distance algorithm for simple sequence matching. It is found that for enhanced verification performance, unforeseen events should be incorporated in the models by adopting smoothing techniques with HMMs. For validation, extensive experiments on two distinct datasets are performed. The marginal smoothing technique is the most effective for user verification in terms of equal error rate (EER) and with a sampling rate of 1/30s^{-1} and 30 minutes of historical data, and the method is capable of detecting an intrusion within ~2.5 minutes of application use. △ Less

Submitted 17 July, 2018; originally announced August 2018.

Comments: 14 pages, 7 figures, 7 tables,

arXiv:1711.10292 [pdf, other]

Providing theoretical learning guarantees to Deep Learning Networks

Authors: Rodrigo Fernandes de Mello, Martha Dais Ferreira, Moacir Antonelli Ponti

Abstract: Deep Learning (DL) is one of the most common subjects when Machine Learning and Data Science approaches are considered. There are clearly two movements related to DL: the first aggregates researchers in quest to outperform other algorithms from literature, trying to win contests by considering often small decreases in the empirical risk; and the second investigates overfitting evidences, questioni… ▽ More Deep Learning (DL) is one of the most common subjects when Machine Learning and Data Science approaches are considered. There are clearly two movements related to DL: the first aggregates researchers in quest to outperform other algorithms from literature, trying to win contests by considering often small decreases in the empirical risk; and the second investigates overfitting evidences, questioning the learning capabilities of DL classifiers. Motivated by such opposed points of view, this paper employs the Statistical Learning Theory (SLT) to study the convergence of Deep Neural Networks, with particular interest in Convolutional Neural Networks. In order to draw theoretical conclusions, we propose an approach to estimate the Shattering coefficient of those classification algorithms, providing a lower bound for the complexity of their space of admissible functions, a.k.a. algorithm bias. Based on such estimator, we generalize the complexity of network biases, and, next, we study AlexNet and VGG16 architectures in the point of view of their Shattering coefficients, and number of training examples required to provide theoretical learning guarantees. From our theoretical formulation, we show the conditions which Deep Neural Networks learn as well as point out another issue: DL benchmarks may be strictly driven by empirical risks, disregarding the complexity of algorithms biases. △ Less

Submitted 28 November, 2017; originally announced November 2017.

Comments: Submitted to JMLR

arXiv:1706.09055 [pdf, other]

Acoustic Modeling Using a Shallow CNN-HTSVM Architecture

Authors: Christopher Dane Shulby, Martha Dais Ferreira, Rodrigo F. de Mello, Sandra Maria Aluisio

Abstract: High-accuracy speech recognition is especially challenging when large datasets are not available. It is possible to bridge this gap with careful and knowledge-driven parsing combined with the biologically inspired CNN and the learning guarantees of the Vapnik Chervonenkis (VC) theory. This work presents a Shallow-CNN-HTSVM (Hierarchical Tree Support Vector Machine classifier) architecture which us… ▽ More High-accuracy speech recognition is especially challenging when large datasets are not available. It is possible to bridge this gap with careful and knowledge-driven parsing combined with the biologically inspired CNN and the learning guarantees of the Vapnik Chervonenkis (VC) theory. This work presents a Shallow-CNN-HTSVM (Hierarchical Tree Support Vector Machine classifier) architecture which uses a predefined knowledge-based set of rules with statistical machine learning techniques. Here we show that gross errors present even in state-of-the-art systems can be avoided and that an accurate acoustic model can be built in a hierarchical fashion. The CNN-HTSVM acoustic model outperforms traditional GMM-HMM models and the HTSVM structure outperforms a MLP multi-class classifier. More importantly we isolate the performance of the acoustic model and provide results on both the frame and phoneme level considering the true robustness of the model. We show that even with a small amount of data accurate and robust recognition rates can be obtained. △ Less

Submitted 27 June, 2017; originally announced June 2017.

Comments: Pre-review version of Bracis 2017

arXiv:1705.05449 [pdf, other]

The complex social network of surnames: A comparison between Brazil and Portugal

Authors: G. D. Ferreira, G. M. Viswanathan, L. R. da Silva, H. J. Herrmann

Abstract: We present a study of social networks based on the analysis of Brazilian and Portuguese family names (surnames). We construct networks whose nodes are names of families and whose edges represent parental relations between two families. From these networks we extract the connectivity distribution, clustering coefficient, shortest path and centrality. We find that the connectivity distribution follo… ▽ More We present a study of social networks based on the analysis of Brazilian and Portuguese family names (surnames). We construct networks whose nodes are names of families and whose edges represent parental relations between two families. From these networks we extract the connectivity distribution, clustering coefficient, shortest path and centrality. We find that the connectivity distribution follows an approximate power law. We associate the number of hubs, centrality and entropy to the degree of miscegenation in the societies in both countries. Our results show that Portuguese society has a higher miscegenation degree than Brazilian society. All networks analyzed lead to approximate inverse square power laws in the degree distribution. We conclude that the thermodynamic limit is reached for small networks (3 or 4 thousand nodes). The assortative mixing of all networks is negative, showing that the more connected vertices are connected to vertices with lower connectivity. Finally, the network of surnames presents some small world characteristics. △ Less

Submitted 12 May, 2017; originally announced May 2017.

Comments: 13 pages, 5 figures

arXiv:1109.1265

FEBER: Feedback Based Erasure Recovery for Real-Time Multicast over 802.11 Networks

Authors: Rui A. Costa, Diogo Ferreira, João Barros

Abstract: We consider the problem of broadcasting data streams over a wireless network for multiple receivers with reliability and timely delivery guarantees. In our framework, we consider packets that need to be delivered within a given time interval, after which the packet is no longer useful at the application layer. We set the notion of critical packet and, based on periodic feedback from the receivers,… ▽ More We consider the problem of broadcasting data streams over a wireless network for multiple receivers with reliability and timely delivery guarantees. In our framework, we consider packets that need to be delivered within a given time interval, after which the packet is no longer useful at the application layer. We set the notion of critical packet and, based on periodic feedback from the receivers, we propose a retransmission scheme that will guarantee timely delivery of such packets, as well as packets that are innovative for other receivers. Our solution provides a trade-off between packet delivery ratio and bandwidth use, which contrasts with existing approaches such as FEC and ARQ, where the focus is on ensuring reliability first, offering no guarantees of timely delivery of data. We evaluate the performance of our proposal in a 802.11 wireless network testbed. △ Less

Submitted 15 February, 2012; v1 submitted 6 September, 2011; originally announced September 2011.

Comments: This paper has been withdrawn due to changes in the results obtained in a different testbed

Showing 1–40 of 40 results for author: Ferreira, D