-
Predicting Parkinson's disease evolution using deep learning
Authors:
Maria Frasca,
Davide La Torre,
Gabriella Pravettoni,
Ilaria Cutica
Abstract:
Parkinson's disease is a neurological condition that occurs in nearly 1% of the world's population. The disease is manifested by a drop in dopamine production; its symptoms are cognitive and behavioural and include a wide range of personality changes, depressive disorders, memory problems, and emotional dysregulation, which can occur as the disease progresses. Early diagnosis and accurate staging of the disease are essential to apply the appropriate therapeutic approaches to slow cognitive and motor decline.
Currently, there is not a single blood test or biomarker available to diagnose Parkinson's disease. Magnetic resonance imaging has been used for the past three decades to diagnose and distinguish between PD and other neurological conditions. However, in recent years new possibilities have arisen: several AI algorithms have been developed to increase the precision and accuracy of differential diagnosis of PD at an early stage.
To our knowledge, no AI tools have been designed to identify the stage of progression. This paper aims to fill this gap. Using the "Parkinson's Progression Markers Initiative" dataset, which provides each patient's MRI and an indication of the disease stage, we developed a model to identify the level of progression. The images and the associated scores were used for training and assessing different deep-learning models. Our analysis distinguished four distinct disease progression levels based on a standard scale (the Hoehn and Yahr scale). The final architecture consists of a cascade of a 3DCNN network, adopted to reduce and extract the spatial characteristics of the MRI for efficient training of the successive LSTM layers, which aim at modelling the temporal dependencies among the data.
Our results show that the proposed 3DCNN + LSTM model achieves state-of-the-art results, classifying the elements with a macro-averaged OVR AUC of 91.90% on four classes.
Submitted 5 January, 2024; v1 submitted 28 December, 2023;
originally announced December 2023.
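The macro-averaged one-vs-rest (OVR) AUC quoted in the abstract above can be illustrated with a minimal sketch. This is not the authors' evaluation code: the function names and the toy scores are hypothetical, and the four classes only stand in for the four progression levels.

```python
def binary_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def macro_ovr_auc(y_true, y_score, n_classes):
    """Unweighted mean of the per-class one-vs-rest AUCs."""
    per_class = []
    for c in range(n_classes):
        # Class c is the positive class; every other class is treated as negative.
        labels = [1 if y == c else 0 for y in y_true]
        scores = [row[c] for row in y_score]
        per_class.append(binary_auc(labels, scores))
    return sum(per_class) / n_classes


# Hypothetical toy data: one sample per class, with per-class scores.
y_true = [0, 1, 2, 3]
y_score = [
    [0.7, 0.1, 0.1, 0.1],
    [0.1, 0.7, 0.1, 0.1],
    [0.1, 0.1, 0.7, 0.1],
    [0.1, 0.1, 0.1, 0.7],
]
print(macro_ovr_auc(y_true, y_score, n_classes=4))
```

Macro averaging weights every class equally, so a rare progression level counts as much as a common one.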
-
Coevolution of Neural Architectures and Features for Stock Market Forecasting: A Multi-objective Decision Perspective
Authors:
Faizal Hafiz,
Jan Broekaert,
Davide La Torre,
Akshya Swain
Abstract:
In a multi-objective setting, a portfolio manager's highly consequential decisions can benefit from assessing alternative forecasting models of stock index movement. The present investigation proposes a new approach to identify a set of nondominated neural network models for further selection by the decision maker. A new coevolution approach is proposed to simultaneously select the features and topology of neural networks (collectively referred to as neural architecture), where the features are viewed from a topological perspective as input neurons. Further, the coevolution is posed as a multicriteria problem to evolve sparse and efficacious neural architectures. The well-known dominance- and decomposition-based multiobjective evolutionary algorithms are augmented with a nongeometric crossover operator to diversify and balance the search for neural architectures across conflicting criteria. Moreover, the coevolution is augmented to accommodate the data-based implications of distinct market behaviors prior to and during the ongoing COVID-19 pandemic. A detailed comparative evaluation is carried out with the conventional sequential approach of feature selection followed by neural topology design, as well as a scalarized coevolution approach. The results on the NASDAQ index in pre- and peri-COVID time windows convincingly demonstrate that the proposed coevolution approach can evolve a set of nondominated neural forecasting models with better generalization capabilities.
Submitted 23 November, 2023;
originally announced November 2023.
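The notion of a nondominated set used in the abstract above can be sketched as a plain Pareto filter. This is only an illustration of dominance, not the paper's evolutionary algorithm; the two objectives (forecast error and number of weights, both minimized) and the candidate values are hypothetical.

```python
def dominates(a, b):
    """True if a is no worse than b on every objective and strictly better on one (minimization)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def nondominated(points):
    """Keep the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical candidates: (forecast error, number of weights).
candidates = [(0.30, 12), (0.25, 40), (0.28, 20), (0.31, 25), (0.25, 55)]
front = nondominated(candidates)
print(front)
```

The decision maker would then choose among the surviving trade-offs rather than receive a single "best" model.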
-
Mixing Deep Learning and Multiple Criteria Optimization: An Application to Distributed Learning with Multiple Datasets
Authors:
Davide La Torre,
Danilo Liuzzi,
Marco Repetto,
Matteo Rocca
Abstract:
The training phase is the most important stage during the machine learning process. In the case of labeled data and supervised learning, machine training consists in minimizing the loss function subject to different constraints. In an abstract setting, it can be formulated as a multiple criteria optimization model in which each criterion measures the distance between the output associated with a specific input and its label. Therefore, the fitting term is a vector function and its minimization is intended in the Pareto sense. We provide stability results of the efficient solutions with respect to perturbations of input and output data. We then extend the same approach to the case of learning with multiple datasets. The multiple dataset environment is relevant when reducing the bias due to the choice of a specific training set. We propose a scalarization approach to implement this model and report numerical experiments on digit classification using MNIST data.
Submitted 2 December, 2021;
originally announced December 2021.
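One common way to scalarize a vector of per-dataset losses is a weighted sum. The sketch below assumes that form, which the abstract does not specify, and uses a hypothetical one-parameter model y = theta * x with a mean squared error per dataset; all names and data are illustrative.

```python
def mse(theta, xs, ys):
    """Mean squared error of the model y = theta * x on one dataset."""
    return sum((theta * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def scalarized_loss(theta, datasets, weights):
    """Weighted sum of the per-dataset losses (linear scalarization)."""
    return sum(w * mse(theta, xs, ys) for w, (xs, ys) in zip(weights, datasets))


def scalarized_minimizer(datasets, weights):
    """Closed-form minimizer of the weighted-sum loss for the one-parameter model."""
    num = sum(w * sum(x * y for x, y in zip(xs, ys)) / len(xs)
              for w, (xs, ys) in zip(weights, datasets))
    den = sum(w * sum(x * x for x in xs) / len(xs)
              for w, (xs, ys) in zip(weights, datasets))
    return num / den


# Hypothetical datasets whose individual best slopes are 2 and 4.
dataset_a = ([1.0, 2.0], [2.0, 4.0])
dataset_b = ([1.0, 2.0], [4.0, 8.0])
theta_equal = scalarized_minimizer([dataset_a, dataset_b], [0.5, 0.5])
theta_only_a = scalarized_minimizer([dataset_a, dataset_b], [1.0, 0.0])
print(theta_equal, theta_only_a)
```

Varying the weights traces out different compromises between the datasets; with strictly positive weights, a minimizer of the weighted sum is Pareto efficient.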
-
Federated Deep Learning in Electricity Forecasting: An MCDM Approach
Authors:
Marco Repetto,
Davide La Torre,
Muhammad Tariq
Abstract:
Large-scale data analysis is growing at an exponential rate as data proliferates in our societies. This abundance of data has the advantage of allowing the decision-maker to implement complex models in scenarios that were prohibitive before. At the same time, such an amount of data requires a distributed thinking approach. In fact, Deep Learning models require plenty of resources, and distributed training is needed. This paper presents a Multicriteria approach for distributed learning. Our approach uses the Weighted Goal Programming approach in its Chebyshev formulation to build an ensemble of decision rules that optimize aprioristically defined performance metrics. Such a formulation is beneficial because it is both model and metric agnostic and provides an interpretable output for the decision-maker. We test our approach by showing a practical application in electricity demand forecasting. Our results suggest that when we allow for dataset split overlapping, the performance of our methodology is consistently above that of the baseline model trained on the whole dataset.
Submitted 9 January, 2022; v1 submitted 27 November, 2021;
originally announced November 2021.
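The Chebyshev (minimax) form of goal programming minimizes the largest weighted deviation from a set of goals. The sketch below applies that idea to a hypothetical choice among three candidate ensembles; it is a simplification, not the paper's formulation, and the metric names, goals, and values are assumptions.

```python
def chebyshev_deviation(metrics, goals, weights):
    """Largest weighted shortfall against the goals (metrics are minimized, so only excess counts)."""
    return max(w * max(0.0, m - g) for m, g, w in zip(metrics, goals, weights))


def best_candidate(candidates, goals, weights):
    """Name of the candidate whose worst weighted deviation is smallest."""
    return min(candidates, key=lambda name: chebyshev_deviation(candidates[name], goals, weights))


# Hypothetical error metrics per candidate: (MAE, RMSE).
candidates = {
    "A": (1.20, 1.60),
    "B": (1.05, 1.90),
    "C": (1.10, 1.60),
}
goals = (1.00, 1.50)    # hypothetical target levels, defined a priori
weights = (1.0, 1.0)    # hypothetical importance of each metric
best = best_candidate(candidates, goals, weights)
print(best)
```

Because only the worst deviation matters, a candidate that is balanced across metrics is preferred over one that excels on a single metric and misses another goal badly.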
-
A Multi-criteria Approach to Evolve Sparse Neural Architectures for Stock Market Forecasting
Authors:
Faizal Hafiz,
Jan Broekaert,
Davide La Torre,
Akshya Swain
Abstract:
This study proposes a new framework to evolve efficacious yet parsimonious neural architectures for the movement prediction of stock market indices using technical indicators as inputs. In the light of a sparse signal-to-noise ratio under the Efficient Market hypothesis, developing machine learning methods to predict the movement of a financial market using technical indicators has proven to be a challenging problem. To this end, the neural architecture search is posed as a multi-criteria optimization problem to balance the efficacy with the complexity of architectures. In addition, the implications of different dominant trading tendencies which may be present in the pre-COVID and within-COVID time periods are investigated. An $ε$-constraint framework is proposed as a remedy to extract any concordant information underlying the possibly conflicting pre-COVID data. Further, a new search paradigm, Two-Dimensional Swarms (2DS), is proposed for the multi-criteria neural architecture search, which explicitly integrates sparsity as an additional search dimension in particle swarms. A detailed comparative evaluation of the proposed approach is carried out by considering a genetic algorithm and several combinations of empirical neural design rules with a filter-based feature selection method (mRMR) as baseline approaches. The results of this study convincingly demonstrate that the proposed approach can evolve parsimonious networks with better generalization capabilities.
Submitted 15 November, 2021;
originally announced November 2021.
-
Competing control scenarios in probabilistic SIR epidemics on social-contact networks
Authors:
Jan B. Broekaert,
Davide La Torre,
Faizal Hafiz
Abstract:
A probabilistic approach to the epidemic evolution on realistic social-contact networks allows for characteristic differences among subjects, including the individual number and structure of social contacts, and the heterogeneity of the infection and recovery rates according to age or medical preconditions. Within our probabilistic Susceptible-Infectious-Removed (SIR) model on social-contact networks, we evaluate the 'infection load' or 'activation margin' of various control scenarios: confinement, vaccination, and their combination. We compare the epidemic burden for subpopulations which apply competing or co-operative control strategies. The simulation experiments are conducted on randomised social-contact graphs that are designed to exhibit realistic person-person contact characteristics and which follow near 'homogeneous' or 'block-localised' subpopulation spreading. The scalarization method is used for the multi-objective optimization problem in which both the infection load is minimized and the extent to which each subpopulation's control strategy preference ranking is adhered to is maximized. We obtain the compounded payoff matrices for two subpopulations which impose contrasting control strategies, each according to their own ranked control strategy preferences. The Nash equilibria, according to each subpopulation's compounded objective, and according to their own ranking intensity, are discussed. Finally, the interaction effects of the control strategies are discussed and related to the type of spreading of the two subpopulations.
Submitted 10 February, 2022; v1 submitted 31 August, 2021;
originally announced August 2021.
-
AI-enabled Automation for Completeness Checking of Privacy Policies
Authors:
Orlando Amaral,
Sallam Abualhaija,
Damiano Torre,
Mehrdad Sabetzadeh,
Lionel C. Briand
Abstract:
Technological advances in information sharing have raised concerns about data protection. Privacy policies contain privacy-related requirements about how the personal data of individuals will be handled by an organization or a software system (e.g., a web service or an app). In Europe, privacy policies are subject to compliance with the General Data Protection Regulation (GDPR). A prerequisite for GDPR compliance checking is to verify whether the content of a privacy policy is complete according to the provisions of GDPR. Incomplete privacy policies might result in large fines for violating organizations as well as incomplete privacy-related software specifications. Manual completeness checking is both time-consuming and error-prone. In this paper, we propose AI-based automation for the completeness checking of privacy policies. Through systematic qualitative methods, we first build two artifacts to characterize the privacy-related provisions of GDPR, namely a conceptual model and a set of completeness criteria. Then, we develop an automated solution on top of these artifacts by leveraging a combination of natural language processing and supervised machine learning. Specifically, we identify the GDPR-relevant information content in privacy policies and subsequently check it against the completeness criteria. To evaluate our approach, we collected 234 real privacy policies from the fund industry. Over a set of 48 unseen privacy policies, our approach correctly detected 300 of the total of 334 violations of some completeness criteria, while producing 23 false positives. The approach thus has a precision of 92.9% and recall of 89.8%. Compared to a baseline that applies keyword search only, our approach results in an improvement of 24.5% in precision and 38% in recall.
Submitted 5 October, 2021; v1 submitted 10 June, 2021;
originally announced June 2021.
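The precision and recall figures in the abstract above follow directly from the reported counts (300 violations detected correctly out of 334, with 23 false positives). A short check, with a hypothetical helper name:

```python
def precision_recall(true_positives, false_positives, total_actual):
    """Precision = TP / (TP + FP); recall = TP / all actual violations."""
    precision = true_positives / (true_positives + false_positives)
    recall = true_positives / total_actual
    return precision, recall


# Counts taken from the abstract: 300 correct detections, 23 false positives, 334 violations.
precision, recall = precision_recall(300, 23, 334)
print(f"precision = {precision:.1%}, recall = {recall:.1%}")
# prints "precision = 92.9%, recall = 89.8%"
```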
-
Inferring urban social networks from publicly available data
Authors:
Stefano Guarino,
Enrico Mastrostefano,
Massimo Bernaschi,
Alessandro Celestini,
Marco Cianfriglia,
Davide Torre,
Lena Zastrow
Abstract:
The emergence of social networks and the definition of suitable generative models for synthetic yet realistic social graphs are widely studied problems in the literature. By not being tied to any real data, random graph models cannot capture all the subtleties of real networks and are inadequate for many practical contexts -- including areas of research, such as computational epidemiology, which are recently high on the agenda. At the same time, the so-called contact networks describe interactions, rather than relationships, and are strongly dependent on the application and on the size and quality of the sample data used to infer them. To fill the gap between these two approaches, we present a data-driven model for urban social networks, implemented and released as open source software. Given a territory of interest, and only based on widely available aggregated demographic and social-mixing data, we construct an age-stratified and geo-referenced synthetic population whose individuals are connected by "strong ties" of two types: intra-household (e.g., kinship) or friendship. While household links are entirely data-driven, we propose a parametric probabilistic model for friendship, based on the assumption that distances and age differences play a role, and that not all individuals are equally sociable. The demographic and geographic factors governing the structure of the obtained network, under different configurations, are thoroughly studied through extensive simulations focused on three Italian cities of different size.
Submitted 2 April, 2021; v1 submitted 11 December, 2020;
originally announced December 2020.
-
Model Driven Engineering for Data Protection and Privacy: Application and Experience with GDPR
Authors:
Damiano Torre,
Mauricio Alferez,
Ghanem Soltana,
Mehrdad Sabetzadeh,
Lionel Briand
Abstract:
In Europe and indeed worldwide, the General Data Protection Regulation (GDPR) provides protection to individuals regarding their personal data in the face of new technological developments. GDPR is widely viewed as the benchmark for data protection and privacy regulations that harmonizes data privacy laws across Europe. Although the GDPR is highly beneficial to individuals, it presents significant challenges for organizations monitoring or storing personal information. Since there is currently no automated solution with broad industrial applicability, organizations have no choice but to carry out expensive manual audits to ensure GDPR compliance. In this paper, we present a complete GDPR UML model as a first step towards designing automated methods for checking GDPR compliance. Given that the practical application of the GDPR is influenced by national laws of the EU Member States, we suggest a two-tiered description of the GDPR, generic and specialized. In this paper, we provide (1) the GDPR conceptual model we developed with complete traceability from its classes to the GDPR, (2) a glossary to help understand the model, (3) the plain-English description of 35 compliance rules derived from GDPR along with their encoding in OCL, and (4) the set of 20 variation points derived from GDPR to specialize the generic model. We further present the challenges we faced in our modeling endeavor, the lessons we learned from it, and future directions for research.
Submitted 23 July, 2020;
originally announced July 2020.
-
On Systematically Building a Controlled Natural Language for Functional Requirements
Authors:
Alvaro Veizaga,
Mauricio Alferez,
Damiano Torre,
Mehrdad Sabetzadeh,
Lionel Briand
Abstract:
[Context] Natural language (NL) is pervasive in software requirements specifications (SRSs). However, despite its popularity and widespread use, NL is highly prone to quality issues such as vagueness, ambiguity, and incompleteness. Controlled natural languages (CNLs) have been proposed as a way to prevent quality problems in requirements documents, while maintaining the flexibility to write and communicate requirements in an intuitive and universally understood manner. [Objective] In collaboration with an industrial partner from the financial domain, we systematically develop and evaluate a CNL, named Rimay, intended to help analysts write functional requirements. [Method] We rely on Grounded Theory for building Rimay and follow well-known guidelines for conducting and reporting industrial case study research. [Results] Our main contributions are: (1) a qualitative methodology to systematically define a CNL for functional requirements; this methodology is general and applicable to information systems beyond the financial domain, (2) a CNL grammar to represent functional requirements; this grammar is derived from our experience in the financial domain, but should be applicable, possibly with adaptations, to other information-system domains, and (3) an empirical evaluation of our CNL (Rimay) through an industrial case study. Our contributions draw on 15 representative SRSs, collectively containing 3215 NL requirements statements from the financial domain. [Conclusion] Our evaluation shows that Rimay is expressive enough to capture, on average, 88% (405 out of 460) of the NL requirements statements in four previously unseen SRSs from the financial domain.
Submitted 4 May, 2020;
originally announced May 2020.
-
Optimization of Structural Similarity in Mathematical Imaging
Authors:
D. Otero,
D. La Torre,
O. Michailovich,
E. R. Vrscay
Abstract:
It is now generally accepted that Euclidean-based metrics may not always adequately represent the subjective judgement of a human observer. As a result, many image processing methodologies have been recently extended to take advantage of alternative visual quality measures, the most prominent of which is the Structural Similarity Index Measure (SSIM). The superiority of the latter over Euclidean-based metrics has been demonstrated in several studies. However, being focused on specific applications, the findings of such studies often lack generality which, if otherwise acknowledged, could have provided useful guidance for further development of SSIM-based image processing algorithms. Accordingly, instead of focusing on a particular image processing task, in this paper, we introduce a general framework that encompasses a wide range of imaging applications in which the SSIM can be employed as a fidelity measure. Subsequently, we show how the framework can be used to cast some standard as well as original imaging tasks into optimization problems, followed by a discussion of a number of novel numerical strategies for their solution.
Submitted 7 February, 2020;
originally announced February 2020.
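For readers unfamiliar with the SSIM mentioned in the abstract above, the following is a minimal sketch of the standard index computed globally over two equally sized patches. It is not the paper's framework: the function name is hypothetical, the patches are flattened to lists, the sample (n - 1) variance is an assumption, and the stabilizing constants use the conventional values (0.01 L)^2 and (0.03 L)^2 for dynamic range L.

```python
def ssim(x, y, dynamic_range=255.0):
    """Global SSIM between two equally sized patches given as flat lists of intensities."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("patches must have the same length, at least 2")
    c1 = (0.01 * dynamic_range) ** 2  # stabilizes the luminance term
    c2 = (0.03 * dynamic_range) ** 2  # stabilizes the contrast/structure term
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    var_x = sum((a - mean_x) * (a - mean_x) for a in x) / (n - 1)
    var_y = sum((b - mean_y) * (b - mean_y) for b in y) / (n - 1)
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)
    numerator = (2 * mean_x * mean_y + c1) * (2 * cov_xy + c2)
    denominator = (mean_x * mean_x + mean_y * mean_y + c1) * (var_x + var_y + c2)
    return numerator / denominator


# Hypothetical 2x2 patches, flattened.
patch = [10.0, 50.0, 90.0, 130.0]
noisy = [12.0, 47.0, 95.0, 126.0]
inverted = [130.0, 90.0, 50.0, 10.0]
print(ssim(patch, patch), ssim(patch, noisy), ssim(patch, inverted))
```

Unlike a Euclidean distance, the index compares means, variances, and correlation, so a patch and its contrast-inverted copy score poorly even though their mean intensities match.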
-
On the Presence of Green and Sustainable Software Engineering in Higher Education Curricula
Authors:
Damiano Torre,
Giuseppe Procaccianti,
Davide Fucci,
Sonja Lutovac,
Giuseppe Scanniello
Abstract:
Nowadays, software is pervasive in our everyday lives. Its sustainability and environmental impact have become major factors to be considered in the development of software systems. Millennials, the newer generation of university students, are particularly keen to learn about and contribute to a more sustainable and green society. The need for training on green and sustainable topics in software engineering has been reflected in a number of recent studies. The goal of this paper is to get a first understanding of the current state of teaching sustainability in the software engineering community, the motivations behind the current state of teaching, and what can be done to improve it. To this end, we report the findings from a targeted survey of 33 academics on the presence of green and sustainable software engineering in higher education. The major findings from the collected data suggest that sustainability is under-represented in the curricula, while the current focus of teaching is on energy efficiency delivered through a fact-based approach. The reasons vary from lack of awareness, teaching material and suitable technologies, to the high effort required to teach sustainability. Finally, we provide recommendations for educators willing to teach sustainability in software engineering that can help to suit millennial students' needs.
Submitted 3 March, 2017;
originally announced March 2017.