Search | arXiv e-print repository

Beyond Multilayer Perceptrons: Investigating Complex Topologies in Neural Networks

Authors: Tommaso Boccato, Matteo Ferrante, Andrea Duggento, Nicola Toschi

Abstract: In this study, we explore the impact of network topology on the approximation capabilities of artificial neural networks (ANNs), with a particular focus on complex topologies. We propose a novel methodology for constructing complex ANNs based on various topologies, including Barabási-Albert, Erdős-Rényi, Watts-Strogatz, and multilayer perceptrons (MLPs). The constructed networks are evaluated on s… ▽ More In this study, we explore the impact of network topology on the approximation capabilities of artificial neural networks (ANNs), with a particular focus on complex topologies. We propose a novel methodology for constructing complex ANNs based on various topologies, including Barabási-Albert, Erdős-Rényi, Watts-Strogatz, and multilayer perceptrons (MLPs). The constructed networks are evaluated on synthetic datasets generated from manifold learning generators, with varying levels of task difficulty and noise, and on real-world datasets from the UCI suite. Our findings reveal that complex topologies lead to superior performance in high-difficulty regimes compared to traditional MLPs. This performance advantage is attributed to the ability of complex networks to exploit the compositionality of the underlying target function. However, this benefit comes at the cost of increased forward-pass computation time and reduced robustness to graph damage. Additionally, we investigate the relationship between various topological attributes and model performance. Our analysis shows that no single attribute can account for the observed performance differences, suggesting that the influence of network topology on approximation capabilities may be more intricate than a simple correlation with individual topological attributes. Our study sheds light on the potential of complex topologies for enhancing the performance of ANNs and provides a foundation for future research exploring the interplay between multiple topological attributes and their impact on model performance. △ Less

Submitted 23 October, 2023; v1 submitted 31 March, 2023; originally announced March 2023.

arXiv:2303.15963 [pdf]

doi 10.1016/j.inffus.2022.07.017

Multimodal and multicontrast image fusion via deep generative models

Authors: Giovanna Maria Dimitri, Simeon Spasov, Andrea Duggento, Luca Passamonti, Pietro Li`o, Nicola Toschi

Abstract: Recently, it has become progressively more evident that classic diagnostic labels are unable to reliably describe the complexity and variability of several clinical phenotypes. This is particularly true for a broad range of neuropsychiatric illnesses (e.g., depression, anxiety disorders, behavioral phenotypes). Patient heterogeneity can be better described by grou** individuals into novel catego… ▽ More Recently, it has become progressively more evident that classic diagnostic labels are unable to reliably describe the complexity and variability of several clinical phenotypes. This is particularly true for a broad range of neuropsychiatric illnesses (e.g., depression, anxiety disorders, behavioral phenotypes). Patient heterogeneity can be better described by grou** individuals into novel categories based on empirically derived sections of intersecting continua that span across and beyond traditional categorical borders. In this context, neuroimaging data carry a wealth of spatiotemporally resolved information about each patient's brain. However, they are usually heavily collapsed a priori through procedures which are not learned as part of model training, and consequently not optimized for the downstream prediction task. This is because every individual participant usually comes with multiple whole-brain 3D imaging modalities often accompanied by a deep genotypic and phenotypic characterization, hence posing formidable computational challenges. In this paper we design a deep learning architecture based on generative models rooted in a modular approach and separable convolutional blocks to a) fuse multiple 3D neuroimaging modalities on a voxel-wise level, b) convert them into informative latent embeddings through heavy dimensionality reduction, c) maintain good generalizability and minimal information loss. As proof of concept, we test our architecture on the well characterized Human Connectome Project database demonstrating that our latent embeddings can be clustered into easily separable subject strata which, in turn, map to different phenotypical information which was not included in the embedding creation process. This may be of aid in predicting disease evolution as well as drug response, hence supporting mechanistic disease understanding and empowering clinical trials. △ Less

Submitted 27 February, 2024; v1 submitted 28 March, 2023; originally announced March 2023.

arXiv:2209.12279 [pdf, other]

VAESim: A probabilistic approach for self-supervised prototype discovery

Authors: Matteo Ferrante, Tommaso Boccato, Simeon Spasov, Andrea Duggento, Nicola Toschi

Abstract: In medicine, curated image datasets often employ discrete labels to describe what is known to be a continuous spectrum of healthy to pathological conditions, such as e.g. the Alzheimer's Disease Continuum or other areas where the image plays a pivotal point in diagnosis. We propose an architecture for image stratification based on a conditional variational autoencoder. Our framework, VAESim, lever… ▽ More In medicine, curated image datasets often employ discrete labels to describe what is known to be a continuous spectrum of healthy to pathological conditions, such as e.g. the Alzheimer's Disease Continuum or other areas where the image plays a pivotal point in diagnosis. We propose an architecture for image stratification based on a conditional variational autoencoder. Our framework, VAESim, leverages a continuous latent space to represent the continuum of disorders and finds clusters during training, which can then be used for image/patient stratification. The core of the method learns a set of prototypical vectors, each associated with a cluster. First, we perform a soft assignment of each data sample to the clusters. Then, we reconstruct the sample based on a similarity measure between the sample embedding and the prototypical vectors of the clusters. To update the prototypical embeddings, we use an exponential moving average of the most similar representations between actual prototypes and samples in the batch size. We test our approach on the MNIST-handwritten digit dataset and on a medical benchmark dataset called PneumoniaMNIST. We demonstrate that our method outperforms baselines in terms of kNN accuracy measured on a classification task against a standard VAE (up to 15% improvement in performance) in both datasets, and also performs at par with classification models trained in a fully supervised way. We also demonstrate how our model outperforms current, end-to-end models for unsupervised stratification. △ Less

Submitted 25 September, 2022; originally announced September 2022.

arXiv:2209.12005 [pdf, other]

Contrastive learning for unsupervised medical image clustering and reconstruction

Authors: Matteo Ferrante, Tommaso Boccato, Simeon Spasov, Andrea Duggento, Nicola Toschi

Abstract: The lack of large labeled medical imaging datasets, along with significant inter-individual variability compared to clinically established disease classes, poses significant challenges in exploiting medical imaging information in a precision medicine paradigm, where in principle dense patient-specific data can be employed to formulate individual predictions and/or stratify patients into finer-grai… ▽ More The lack of large labeled medical imaging datasets, along with significant inter-individual variability compared to clinically established disease classes, poses significant challenges in exploiting medical imaging information in a precision medicine paradigm, where in principle dense patient-specific data can be employed to formulate individual predictions and/or stratify patients into finer-grained groups which may follow more homogeneous trajectories and therefore empower clinical trials. In order to efficiently explore the effective degrees of freedom underlying variability in medical images in an unsupervised manner, in this work we propose an unsupervised autoencoder framework which is augmented with a contrastive loss to encourage high separability in the latent space. The model is validated on (medical) benchmark datasets. As cluster labels are assigned to each example according to cluster assignments, we compare performance with a supervised transfer learning baseline. Our method achieves similar performance to the supervised architecture, indicating that separation in the latent space reproduces expert medical observer-assigned labels. The proposed method could be beneficial for patient stratification, exploring new subdivisions of larger classes or pathological continua or, due to its sampling abilities in a variation setting, data augmentation in medical image processing. △ Less

Submitted 24 September, 2022; originally announced September 2022.

arXiv:2209.11998 [pdf, other]

Physically constrained neural networks to solve the inverse problem for neuron models

Authors: Matteo Ferrante, Andera Duggento, Nicola Toschi

Abstract: Systems biology and systems neurophysiology in particular have recently emerged as powerful tools for a number of key applications in the biomedical sciences. Nevertheless, such models are often based on complex combinations of multiscale (and possibly multiphysics) strategies that require ad hoc computational strategies and pose extremely high computational demands. Recent developments in the fie… ▽ More Systems biology and systems neurophysiology in particular have recently emerged as powerful tools for a number of key applications in the biomedical sciences. Nevertheless, such models are often based on complex combinations of multiscale (and possibly multiphysics) strategies that require ad hoc computational strategies and pose extremely high computational demands. Recent developments in the field of deep neural networks have demonstrated the possibility of formulating nonlinear, universal approximators to estimate solutions to highly nonlinear and complex problems with significant speed and accuracy advantages in comparison with traditional models. After synthetic data validation, we use so-called physically constrained neural networks (PINN) to simultaneously solve the biologically plausible Hodgkin-Huxley model and infer its parameters and hidden time-courses from real data under both variable and constant current stimulation, demonstrating extremely low variability across spikes and faithful signal reconstruction. The parameter ranges we obtain are also compatible with prior knowledge. We demonstrate that detailed biological knowledge can be provided to a neural network, making it able to fit complex dynamics over both simulated and real data. △ Less

Submitted 24 September, 2022; originally announced September 2022.

arXiv:2209.02037 [pdf, other]

4Ward: a Relayering Strategy for Efficient Training of Arbitrarily Complex Directed Acyclic Graphs

Authors: Tommaso Boccato, Matteo Ferrante, Andrea Duggento, Nicola Toschi

Abstract: Thanks to their ease of implementation, multilayer perceptrons (MLPs) have become ubiquitous in deep learning applications. The graph underlying an MLP is indeed multipartite, i.e. each layer of neurons only connects to neurons belonging to the adjacent layer. In contrast, in vivo brain connectomes at the level of individual synapses suggest that biological neuronal networks are characterized by s… ▽ More Thanks to their ease of implementation, multilayer perceptrons (MLPs) have become ubiquitous in deep learning applications. The graph underlying an MLP is indeed multipartite, i.e. each layer of neurons only connects to neurons belonging to the adjacent layer. In contrast, in vivo brain connectomes at the level of individual synapses suggest that biological neuronal networks are characterized by scale-free degree distributions or exponentially truncated power law strength distributions, hinting at potentially novel avenues for the exploitation of evolution-derived neuronal networks. In this paper, we present ``4Ward'', a method and Python library capable of generating flexible and efficient neural networks (NNs) from arbitrarily complex directed acyclic graphs. 4Ward is inspired by layering algorithms drawn from the graph drawing discipline to implement efficient forward passes, and provides significant time gains in computational experiments with various Erdős-Rényi graphs. 4Ward not only overcomes the sequential nature of the learning matrix method, by parallelizing the computation of activations, but also addresses the scalability issues encountered in the current state-of-the-art and provides the designer with freedom to customize weight initialization and activation functions. Our algorithm can be of aid for any investigator seeking to exploit complex topologies in a NN design framework at the microscale. △ Less

Submitted 23 October, 2023; v1 submitted 5 September, 2022; originally announced September 2022.

arXiv:2208.08860 [pdf]

An intertwined neural network model for EEG classification in brain-computer interfaces

Authors: Andrea Duggento, Mario De Lorenzo, Stefano Bargione, Allegra Conti, Vincenzo Catrambone, Gaetano Valenza, Nicola Toschi

Abstract: The brain computer interface (BCI) is a nonstimulatory direct and occasionally bidirectional communication link between the brain and a computer or an external device. Classically, EEG-based BCI algorithms have relied on models such as support vector machines and linear discriminant analysis or multiclass common spatial patterns. During the last decade, however, more sophisticated machine learning… ▽ More The brain computer interface (BCI) is a nonstimulatory direct and occasionally bidirectional communication link between the brain and a computer or an external device. Classically, EEG-based BCI algorithms have relied on models such as support vector machines and linear discriminant analysis or multiclass common spatial patterns. During the last decade, however, more sophisticated machine learning architectures, such as convolutional neural networks, recurrent neural networks, long short-term memory networks and gated recurrent unit networks, have been extensively used to enhance discriminability in multiclass BCI tasks. Additionally, preprocessing and denoising of EEG signals has always been key in the successful decoding of brain activity, and the determination of an optimal and standardized EEG preprocessing activity is an active area of research. In this paper, we present a deep neural network architecture specifically engineered to a) provide state-of-the-art performance in multiclass motor imagery classification and b) remain robust to preprocessing to enable real-time processing of raw data as it streams from EEG and BCI equipment. It is based on the intertwined use of time-distributed fully connected (tdFC) and space-distributed 1D temporal convolutional layers (sdConv) and explicitly addresses the possibility that interaction of spatial and temporal features of the EEG signal occurs at all levels of complexity. Numerical experiments demonstrate that our architecture provides superior performance compared baselines based on a combination of 3D convolutions and recurrent neural networks in a six-class motor imagery network, with a subjectwise accuracy that reaches 99%. Importantly, these results remain unchanged when minimal or extensive preprocessing is applied, possibly paving the way for a more transversal and real-time use of deep learning architectures in EEG classification. △ Less

Submitted 4 August, 2022; originally announced August 2022.

Showing 1–7 of 7 results for author: Duggento, A