VC dimension of Graph Neural Networks with Pfaffian activation functions

Authors: Giuseppe Alessio D'Inverno, Monica Bianchini, Franco Scarselli

Abstract: Graph Neural Networks (GNNs) have emerged in recent years as a powerful tool to learn tasks across a wide range of graph domains in a data-driven fashion; based on a message passing mechanism, GNNs have gained increasing popularity due to their intuitive formulation, closely linked with the Weisfeiler-Lehman (WL) test for graph isomorphism, to which they have proven equivalent. From a theoretical point of view, GNNs have been shown to be universal approximators, and their generalization capability (namely, bounds on the Vapnik-Chervonenkis (VC) dimension) has recently been investigated for GNNs with piecewise polynomial activation functions. The aim of our work is to extend this analysis on the VC dimension of GNNs to other commonly used activation functions, such as sigmoid and hyperbolic tangent, using the framework of Pfaffian function theory. Bounds are provided with respect to architecture parameters (depth, number of neurons, input size) as well as with respect to the number of colors resulting from the 1-WL test applied on the graph domain. The theoretical analysis is supported by a preliminary experimental study.

Submitted 2 April, 2024; v1 submitted 22 January, 2024; originally announced January 2024.

Comments: 35 pages, 9 figures

arXiv:2401.03824 [pdf, ps, other]

A topological description of loss surfaces based on Betti Numbers

Authors: Maria Sofia Bucarelli, Giuseppe Alessio D'Inverno, Monica Bianchini, Franco Scarselli, Fabrizio Silvestri

Abstract: In the context of deep learning models, attention has recently been paid to studying the surface of the loss function in order to better understand training with methods based on gradient descent. This search for an appropriate description, both analytical and topological, has led to numerous efforts to identify spurious minima and characterize gradient dynamics. Our work aims to contribute to this field by providing a topological measure to evaluate loss complexity in the case of multilayer neural networks. We compare deep and shallow architectures with common sigmoidal activation functions by deriving upper and lower bounds on the complexity of their loss function and revealing how that complexity is influenced by the number of hidden units, training models, and the activation function used. Additionally, we found that certain variations in the loss function or model architecture, such as adding an $\ell_2$ regularization term or implementing skip connections in a feedforward network, do not affect loss topology in specific cases.

Submitted 8 January, 2024; originally announced January 2024.

arXiv:2306.02838 [pdf, other]

Impact of the Covid 19 outbreaks on the italian twitter vaccination debat: a network based analysis

Authors: Veronica Lachi, Giovanna Maria Dimitri, Alessandro Di Stefano, Pietro Liò, Monica Bianchini, Chiara Mocenni

Abstract: Vaccine hesitancy, or the reluctance to be vaccinated, is a phenomenon that has recently become particularly significant, in conjunction with the vaccination campaign against COVID-19. During the lockdown period, necessary to control the spread of the virus, social networks have played an important role in the Italian debate on vaccination, generally representing the easiest and safest way to exchange opinions and maintain some form of sociability. Among social network platforms, Twitter has assumed a strategic role in driving the public opinion, creating compact groups of users sharing similar views towards the utility, uselessness or even dangerousness of vaccines. In this paper, we present a new, publicly available, dataset of Italian tweets, TwitterVax, collected in the period January 2019--May 2022. Considering monthly data, gathered into forty one retweet networks -- where nodes identify users and edges are present between users who have retweeted each other -- we performed community detection within the networks, analyzing their evolution and polarization with respect to NoVax and ProVax users through time. This allowed us to clearly discover debate trends as well as identify potential key moments and actors in opinion flows, characterizing the main features and tweeting behavior of the two communities.

Submitted 5 June, 2023; originally announced June 2023.

arXiv:2302.01018 [pdf, other]

Graph Neural Networks for temporal graphs: State of the art, open challenges, and opportunities

Authors: Antonio Longa, Veronica Lachi, Gabriele Santin, Monica Bianchini, Bruno Lepri, Pietro Lio, Franco Scarselli, Andrea Passerini

Abstract: Graph Neural Networks (GNNs) have become the leading paradigm for learning on (static) graph-structured data. However, many real-world systems are dynamic in nature, since the graph and node/edge attributes change over time. In recent years, GNN-based models for temporal graphs have emerged as a promising area of research to extend the capabilities of GNNs. In this work, we provide the first comprehensive overview of the current state-of-the-art of temporal GNN, introducing a rigorous formalization of learning settings and tasks and a novel taxonomy categorizing existing approaches in terms of how the temporal aspect is represented and processed. We conclude the survey with a discussion of the most relevant open challenges for the field, from both research and application perspectives.

Submitted 8 July, 2023; v1 submitted 2 February, 2023; originally announced February 2023.

arXiv:2211.16871 [pdf, other]

A Deep Learning Approach to the Prediction of Drug Side-Effects on Molecular Graphs

Authors: Pietro Bongini, Elisa Messori, Niccolò Pancino, Monica Bianchini

Abstract: Predicting drug side-effects before they occur is a key task in keeping the number of drug-related hospitalizations low and to improve drug discovery processes. Automatic predictors of side-effects generally are not able to process the structure of the drug, resulting in a loss of information. Graph neural networks have seen great success in recent years, thanks to their ability of exploiting the information conveyed by the graph structure and labels. These models have been used in a wide variety of biological applications, among which the prediction of drug side-effects on a large knowledge graph. Exploiting the molecular graph encoding the structure of the drug represents a novel approach, in which the problem is formulated as a multi-class multi-label graph-focused classification. We developed a methodology to carry out this task, using recurrent Graph Neural Networks, and building a dataset from freely accessible and well established data sources. The results show that our method has an improved classification capability, under many parameters and metrics, with respect to previously available predictors.

Submitted 30 November, 2022; originally announced November 2022.

Comments: 16 pages, 2 figures, under review

MSC Class: 62-06

arXiv:2202.08147 [pdf, other]

Modular multi-source prediction of drug side-effects with DruGNN

Authors: Pietro Bongini, Franco Scarselli, Monica Bianchini, Giovanna Maria Dimitri, Niccolò Pancino, Pietro Liò

Abstract: Drug Side-Effects (DSEs) have a high impact on public health, care system costs, and drug discovery processes. Predicting the probability of side-effects, before their occurrence, is fundamental to reduce this impact, in particular on drug discovery. Candidate molecules could be screened before undergoing clinical trials, reducing the costs in time, money, and health of the participants. Drug side-effects are triggered by complex biological processes involving many different entities, from drug structures to protein-protein interactions. To predict their occurrence, it is necessary to integrate data from heterogeneous sources. In this work, such heterogeneous data is integrated into a graph dataset, expressively representing the relational information between different entities, such as drug molecules and genes. The relational nature of the dataset represents an important novelty for drug side-effect predictors. Graph Neural Networks (GNNs) are exploited to predict DSEs on our dataset with very promising results. GNNs are deep learning models that can process graph-structured data, with minimal information loss, and have been applied on a wide variety of biological tasks. Our experimental results confirm the advantage of using relationships between data entities, suggesting interesting future developments in this scope. The experimentation also shows the importance of specific subsets of data in determining associations between drugs and side-effects.

Submitted 15 February, 2022; originally announced February 2022.

Comments: 19 pages, 3 figures

arXiv:2106.08992 [pdf, other]

On the approximation capability of GNNs in node classification/regression tasks

Authors: Giuseppe Alessio D'Inverno, Monica Bianchini, Maria Lucia Sampoli, Franco Scarselli

Abstract: Graph Neural Networks (GNNs) are a broad class of connectionist models for graph processing. Recent studies have shown that GNNs can approximate any function on graphs, modulo the equivalence relation on graphs defined by the Weisfeiler--Lehman (WL) test. However, these results suffer from some limitations, both because they were derived using the Stone--Weierstrass theorem -- which is existential in nature, -- and because they assume that the target function to be approximated must be continuous. Furthermore, all current results are dedicated to graph classification/regression tasks, where the GNN must produce a single output for the whole graph, while also node classification/regression problems, in which an output is returned for each node, are very common. In this paper, we propose an alternative way to demonstrate the approximation capability of GNNs that overcomes these limitations. Indeed, we show that GNNs are universal approximators in probability for node classification/regression tasks, as they can approximate any measurable function that satisfies the 1--WL equivalence on nodes. The proposed theoretical framework allows the approximation of generic discontinuous target functions and also suggests the GNN architecture that can reach a desired approximation. In addition, we provide a bound on the number of the GNN layers required to achieve the desired degree of approximation, namely $2r-1$, where $r$ is the maximum number of nodes for the graphs in the domain.

Submitted 9 November, 2023; v1 submitted 16 June, 2021; originally announced June 2021.

Comments: 25 pages, 5 figures

ACM Class: F.2.2; G.2.2; I.2.6

arXiv:2106.05132 [pdf, other]

doi 10.3390/math9222896

A multi-stage GAN for multi-organ chest X-ray image generation and segmentation

Authors: Giorgio Ciano, Paolo Andreini, Tommaso Mazzierli, Monica Bianchini, Franco Scarselli

Abstract: Multi-organ segmentation of X-ray images is of fundamental importance for computer aided diagnosis systems. However, the most advanced semantic segmentation methods rely on deep learning and require a huge amount of labeled images, which are rarely available due to both the high cost of human resources and the time required for labeling. In this paper, we present a novel multi-stage generation algorithm based on Generative Adversarial Networks (GANs) that can produce synthetic images along with their semantic labels and can be used for data augmentation. The main feature of the method is that, unlike other approaches, generation occurs in several stages, which simplifies the procedure and allows it to be used on very small datasets. The method has been evaluated on the segmentation of chest radiographic images, showing promising results. The multistage approach achieves state-of-the-art and, when very few images are used to train the GANs, outperforms the corresponding single-stage approach.

Submitted 5 October, 2021; v1 submitted 9 June, 2021; originally announced June 2021.

arXiv:2012.07397 [pdf, other]

doi 10.1016/j.neucom.2021.04.039

Molecular graph generation with Graph Neural Networks

Authors: Pietro Bongini, Monica Bianchini, Franco Scarselli

Abstract: Drug Discovery is a fundamental and ever-evolving field of research. The design of new candidate molecules requires large amounts of time and money, and computational methods are being increasingly employed to cut these costs. Machine learning methods are ideal for the design of large amounts of potential new candidate molecules, which are naturally represented as graphs. Graph generation is being revolutionized by deep learning methods, and molecular generation is one of its most promising applications. In this paper, we introduce a sequential molecular graph generator based on a set of graph neural network modules, which we call MG^2N^2. At each step, a node or a group of nodes is added to the graph, along with its connections. The modular architecture simplifies the training procedure, also allowing an independent retraining of a single module. Sequentiality and modularity make the generation process interpretable. The use of graph neural networks maximizes the information in input at each generative step, which consists of the subgraph produced during the previous steps. Experiments of unconditional generation on the QM9 and Zinc datasets show that our model is capable of generalizing molecular patterns seen during the training phase, without overfitting. The results indicate that our method is competitive, and outperforms challenging baselines for unconditional generation.

Submitted 27 May, 2021; v1 submitted 14 December, 2020; originally announced December 2020.

Comments: 20 pages, 4 figures (2 figures are composed of double images, for a total of 6 images)

Journal ref: Neurocomputing 2021

arXiv:1911.09026 [pdf, other]

doi 10.1016/j.patrec.2020.06.023

Weak Supervision for Generating Pixel-Level Annotations in Scene Text Segmentation

Authors: Simone Bonechi, Paolo Andreini, Monica Bianchini, Franco Scarselli

Abstract: Providing pixel-level supervisions for scene text segmentation is inherently difficult and costly, so that only few small datasets are available for this task. To face the scarcity of training data, previous approaches based on Convolutional Neural Networks (CNNs) rely on the use of a synthetic dataset for pre-training. However, synthetic data cannot reproduce the complexity and variability of natural images. In this work, we propose to use a weakly supervised learning approach to reduce the domain-shift between synthetic and real data. Leveraging the bounding-box supervision of the COCO-Text and the MLT datasets, we generate weak pixel-level supervisions of real images. In particular, the COCO-Text-Segmentation (COCO_TS) and the MLT-Segmentation (MLT_S) datasets are created and released. These two datasets are used to train a CNN, the Segmentation Multiscale Attention Network (SMANet), which is specifically designed to face some peculiarities of the scene text segmentation task. The SMANet is trained end-to-end on the proposed datasets, and the experiments show that COCO_TS and MLT_S are a valid alternative to synthetic images, allowing to use only a fraction of the training samples and improving significantly the performances.

Submitted 19 November, 2019; originally announced November 2019.

Comments: arXiv admin note: substantial text overlap with arXiv:1904.00818

arXiv:1907.12296 [pdf, other]

A Two Stage GAN for High Resolution Retinal Image Generation and Segmentation

Authors: Paolo Andreini, Simone Bonechi, Monica Bianchini, Alessandro Mecocci, Franco Scarselli, Andrea Sodi

Abstract: In recent years, the use of deep learning is becoming increasingly popular in computer vision. However, the effective training of deep architectures usually relies on huge sets of annotated data. This is critical in the medical field where it is difficult and expensive to obtain annotated images. In this paper, we use Generative Adversarial Networks (GANs) for synthesizing high quality retinal images, along with the corresponding semantic label-maps, to be used instead of real images during the training process. Differently from other previous proposals, we suggest a two step approach: first, a progressively growing GAN is trained to generate the semantic label-maps, which describe the blood vessel structure (i.e. vasculature); second, an image-to-image translation approach is used to obtain realistic retinal images from the generated vasculature. By using only a handful of training samples, our approach generates realistic high resolution images, that can be effectively used to enlarge small available datasets. Comparable results have been obtained employing the generated images in place of real data during training. The practical viability of the proposed approach has been demonstrated by applying it on two well established benchmark sets for retinal vessel segmentation, both containing a very small number of training samples. Our method obtained better performances with respect to state-of-the-art techniques.

Submitted 29 July, 2019; originally announced July 2019.

arXiv:1904.00818 [pdf, other]

doi 10.1007/978-3-030-30490-4_26

COCO_TS Dataset: Pixel-level Annotations Based on Weak Supervision for Scene Text Segmentation

Authors: Simone Bonechi, Paolo Andreini, Monica Bianchini, Franco Scarselli

Abstract: The absence of large scale datasets with pixel-level supervisions is a significant obstacle for the training of deep convolutional networks for scene text segmentation. For this reason, synthetic data generation is normally employed to enlarge the training dataset. Nonetheless, synthetic data cannot reproduce the complexity and variability of natural images. In this paper, a weakly supervised learning approach is used to reduce the shift between training on real and synthetic data. Pixel-level supervisions for a text detection dataset (i.e. where only bounding-box annotations are available) are generated. In particular, the COCO-Text-Segmentation (COCO_TS) dataset, which provides pixel-level supervisions for the COCO-Text dataset, is created and released. The generated annotations are used to train a deep convolutional neural network for semantic segmentation. Experiments show that the proposed dataset can be used instead of synthetic data, allowing us to use only a fraction of the training samples and significantly improving the performances.

Submitted 24 September, 2019; v1 submitted 1 April, 2019; originally announced April 2019.

arXiv:1807.08173 [pdf, other]

Modeling Taxi Drivers' Behaviour for the Next Destination Prediction

Authors: Alberto Rossi, Gianni Barlacchi, Monica Bianchini, Bruno Lepri

Abstract: In this paper, we study how to model taxi drivers' behaviour and geographical information for an interesting and challenging task: the next destination prediction in a taxi journey. Predicting the next location is a well studied problem in human mobility, which finds several applications in real-world scenarios, from optimizing the efficiency of electronic dispatching systems to predicting and reducing the traffic jam. This task is normally modeled as a multiclass classification problem, where the goal is to select, among a set of already known locations, the next taxi destination. We present a Recurrent Neural Network (RNN) approach that models the taxi drivers' behaviour and encodes the semantics of visited locations by using geographical information from Location-Based Social Networks (LBSNs). In particular, RNNs are trained to predict the exact coordinates of the next destination, overcoming the problem of producing, in output, a limited set of locations, seen during the training phase. The proposed approach was tested on the ECML/PKDD Discovery Challenge 2015 dataset - based on the city of Porto -, obtaining better results with respect to the competition winner, whilst using less information, and on Manhattan and San Francisco datasets.

Submitted 8 January, 2019; v1 submitted 21 July, 2018; originally announced July 2018.

Comments: preprint version of a paper submitted to IEEE Transactions on Intelligent Transportation Systems

Showing 1–13 of 13 results for author: Bianchini, M