-
Asymptotics of Learning with Deep Structured (Random) Features
Authors:
Dominik Schröder,
Daniil Dmitriev,
Hugo Cui,
Bruno Loureiro
Abstract:
For a large class of feature maps we provide a tight asymptotic characterisation of the test error associated with learning the readout layer, in the high-dimensional limit where the input dimension, hidden layer widths, and number of training samples are proportionally large. This characterization is formulated in terms of the population covariance of the features. Our work is partially motivated by the problem of learning with Gaussian rainbow neural networks, namely deep non-linear fully-connected networks with random but structured weights, whose row-wise covariances are further allowed to depend on the weights of previous layers. For such networks we also derive a closed-form formula for the feature covariance in terms of the weight matrices. We further find that in some cases our results can capture feature maps learned by deep, finite-width neural networks trained under gradient descent.
Submitted 10 June, 2024; v1 submitted 21 February, 2024;
originally announced February 2024.
-
Harmful Conspiracies in Temporal Interaction Networks: Understanding the Dynamics of Digital Wildfires through Phase Transitions
Authors:
Kaspara Skovli Gåsvær,
Pedro G. Lind,
Johannes Langguth,
Morten Hjorth-Jensen,
Michael Kreil,
Daniel Thilo Schroeder
Abstract:
Shortly after the first COVID-19 cases became apparent in December 2019, rumors spread on social media suggesting a connection between the virus and the 5G radiation emanating from the recently deployed telecommunications network. In the course of the following weeks, this idea gained increasing popularity, and various alleged explanations for how such a connection manifests emerged. Ultimately, after being amplified by prominent conspiracy theorists, a series of arson attacks on telecommunication equipment followed, concluding with the kidnapping of telecommunication technicians in Peru. In this paper, we study the spread of content related to a conspiracy theory with harmful consequences, a so-called digital wildfire. In particular, we investigate the 5G and COVID-19 misinformation event on Twitter before, during, and after its peak in April and May 2020. For this purpose, we examine the community dynamics in complex temporal interaction networks underlying Twitter user activity. We assess the evolution of such digital wildfires by appropriately defining the temporal dynamics of communication in communities within social networks. We show that, for this specific misinformation event, the number of interactions of the users participating in a digital wildfire, as well as the size of the engaged communities, both follow a power-law distribution. Moreover, our research elucidates the possibility of quantifying the phases of a digital wildfire, as per established literature. We identify one such phase as a critical transition, marked by a shift from sporadic tweets to a global spread event, highlighting the dramatic scaling of misinformation propagation.
Submitted 9 October, 2023;
originally announced October 2023.
-
Social media in the Global South: A Network Dataset of the Malian Twittersphere
Authors:
Daniel Thilo Schroeder,
Mirjam de Bruijn,
Luca Bruls,
Mulatu Alemayehu Moges,
Samba Dialimpa Badji,
Noëmie Fritz,
Modibo Galy Cisse,
Johannes Langguth,
Bruce Mutsvairo,
Kristin Skare Orgeret
Abstract:
With the expansion of mobile communications infrastructure, social media usage in the Global South is surging. Compared to the Global North, populations of the Global South have had less prior experience with social media from stationary computers and wired Internet. Many countries are experiencing violent conflicts that have a profound effect on their societies. As a result, social networks develop under different conditions than elsewhere, and our goal is to provide data for studying this phenomenon. In this dataset paper, we present a data collection of a national Twittersphere in a West African country of conflict. While not the largest social network in terms of users, Twitter is an important platform where people engage in public discussion. The focus is on Mali, a country beset by conflict since 2012 that has recently had a relatively precarious media ecology. The dataset consists of tweets and Twitter users in Mali and was collected in June 2022, when the Malian conflict became more violent internally both towards external and international actors. In a preliminary analysis, we assume that the conflictual context influences how people access social media and, therefore, the shape of the Twittersphere and its characteristics. The aim of this paper is to primarily invite researchers from various disciplines including complex networks and social sciences scholars to explore the data at hand further. We collected the dataset using a scraping strategy of the follower network and the identification of characteristics of a Malian Twitter user. The given snapshot of the Malian Twitter follower network contains around seven million accounts, of which 56,000 are clearly identifiable as Malian. In addition, we present the tweets. The dataset is available at: https://osf.io/mj2qt/
Submitted 24 October, 2023; v1 submitted 25 April, 2023;
originally announced April 2023.
-
Deterministic equivalent and error universality of deep random features learning
Authors:
Dominik Schröder,
Hugo Cui,
Daniil Dmitriev,
Bruno Loureiro
Abstract:
This manuscript considers the problem of learning a random Gaussian network function using a fully connected network with frozen intermediate layers and trainable readout layer. This problem can be seen as a natural generalization of the widely studied random features model to deeper architectures. First, we prove Gaussian universality of the test error in a ridge regression setting where the learner and target networks share the same intermediate layers, and provide a sharp asymptotic formula for it. Establishing this result requires proving a deterministic equivalent for traces of the deep random features sample covariance matrices which can be of independent interest. Second, we conjecture the asymptotic Gaussian universality of the test error in the more general setting of arbitrary convex losses and generic learner/target architectures. We provide extensive numerical evidence for this conjecture, which requires the derivation of closed-form expressions for the layer-wise post-activation population covariances. In light of our results, we investigate the interplay between architecture design and implicit regularization.
Submitted 1 February, 2023;
originally announced February 2023.
-
Localization supervision of chest x-ray classifiers using label-specific eye-tracking annotation
Authors:
Ricardo Bigolin Lanfredi,
Joyce D. Schroeder,
Tolga Tasdizen
Abstract:
Convolutional neural networks (CNNs) have been successfully applied to chest x-ray (CXR) images. Moreover, annotated bounding boxes have been shown to improve the interpretability of a CNN in terms of localizing abnormalities. However, only a few relatively small CXR datasets containing bounding boxes are available, and collecting them is very costly. Opportunely, eye-tracking (ET) data can be collected in a non-intrusive way during the clinical workflow of a radiologist. We use ET data recorded from radiologists while dictating CXR reports to train CNNs. We extract snippets from the ET data by associating them with the dictation of keywords and use them to supervise the localization of specific abnormalities. We show that this method improves a model's interpretability without impacting its image-level classification.
Submitted 14 December, 2022; v1 submitted 20 July, 2022;
originally announced July 2022.
-
Switching between Numerical Black-box Optimization Algorithms with Warm-starting Policies
Authors:
Dominik Schröder,
Diederick Vermetten,
Hao Wang,
Carola Doerr,
Thomas Bäck
Abstract:
When solving optimization problems with black-box approaches, the algorithms gather valuable information about the problem instance during the optimization process. This information is used to adjust the distributions from which new solution candidates are sampled. In fact, a key objective in evolutionary computation is to identify the most effective ways to collect and exploit instance knowledge. However, while considerable work is devoted to adjusting hyper-parameters of black-box optimization algorithms on the fly or exchanging some of its modular components, we barely know how to effectively switch between different black-box optimization algorithms.
In this work, we build on the recent study of Vermetten et al. [GECCO 2020], who presented a data-driven approach to investigate promising switches between pairs of algorithms for numerical black-box optimization. We replicate their approach with a portfolio of five algorithms and investigate whether the predicted performance gains are realized when executing the most promising switches. Our results suggest that with a single switch between two algorithms, we outperform the best static choice among the five algorithms on 48 out of the 120 considered problem instances, the 24 BBOB functions in five different dimensions. We also show that for switching between BFGS and CMA-ES, a proper warm-starting of the parameters is crucial to realize high-performance gains. Lastly, with a sensitivity analysis, we find the actual performance gain per run is largely affected by the switching point, and in some cases, the switching point yielding the best actual performance differs from the one computed from the theoretical gain.
Submitted 12 January, 2023; v1 submitted 13 April, 2022;
originally announced April 2022.
-
Comparing radiologists' gaze and saliency maps generated by interpretability methods for chest x-rays
Authors:
Ricardo Bigolin Lanfredi,
Ambuj Arora,
Trafton Drew,
Joyce D. Schroeder,
Tolga Tasdizen
Abstract:
The interpretability of medical image analysis models is considered a key research field. We use a dataset of eye-tracking data from five radiologists to compare the outputs of interpretability methods and the heatmaps representing where radiologists looked. We conduct a class-independent analysis of the saliency maps generated by two methods selected from the literature: Grad-CAM and attention maps from an attention-gated model. For the comparison, we use shuffled metrics, which avoid biases from fixation locations. We achieve scores comparable to an interobserver baseline in one shuffled metric, highlighting the potential of saliency maps from Grad-CAM to mimic a radiologist's attention over an image. We also divide the dataset into subsets to evaluate in which cases similarities are higher.
Submitted 19 April, 2023; v1 submitted 22 December, 2021;
originally announced December 2021.
-
Single volume lung biomechanics from chest computed tomography using a mode preserving generative adversarial network
Authors:
Muhammad F. A. Chaudhary,
Sarah E. Gerard,
Di Wang,
Gary E. Christensen,
Christopher B. Cooper,
Joyce D. Schroeder,
Eric A. Hoffman,
Joseph M. Reinhardt
Abstract:
Local tissue expansion of the lungs is typically derived by registering computed tomography (CT) scans acquired at multiple lung volumes. However, acquiring multiple scans incurs increased radiation dose, time, and cost, and may not be possible in many cases, thus restricting the applicability of registration-based biomechanics. We propose a generative adversarial learning approach for estimating local tissue expansion directly from a single CT scan. The proposed framework was trained and evaluated on 2500 subjects from the SPIROMICS cohort. Once trained, the framework can be used as a registration-free method for predicting local tissue expansion. We evaluated model performance across varying degrees of disease severity and compared its performance with two image-to-image translation frameworks - UNet and Pix2Pix. Our model achieved an overall PSNR of 18.95 decibels, SSIM of 0.840, and Spearman's correlation of 0.61 at a high spatial resolution of 1 mm$^3$.
Submitted 15 October, 2021;
originally announced October 2021.
-
REFLACX, a dataset of reports and eye-tracking data for localization of abnormalities in chest x-rays
Authors:
Ricardo Bigolin Lanfredi,
Mingyuan Zhang,
William F. Auffermann,
Jessica Chan,
Phuong-Anh T. Duong,
Vivek Srikumar,
Trafton Drew,
Joyce D. Schroeder,
Tolga Tasdizen
Abstract:
Deep learning has shown recent success in classifying anomalies in chest x-rays, but datasets are still small compared to natural image datasets. Supervision of abnormality localization has been shown to improve trained models, partially compensating for dataset sizes. However, explicitly labeling these anomalies requires an expert and is very time-consuming. We propose a potentially scalable method for collecting implicit localization data using an eye tracker to capture gaze locations and a microphone to capture a dictation of a report, imitating the setup of a reading room. The resulting REFLACX (Reports and Eye-Tracking Data for Localization of Abnormalities in Chest X-rays) dataset was labeled across five radiologists and contains 3,032 synchronized sets of eye-tracking data and timestamped report transcriptions for 2,616 chest x-rays from the MIMIC-CXR dataset. We also provide auxiliary annotations, including bounding boxes around lungs and heart and validation labels consisting of ellipses localizing abnormalities and image-level labels. Furthermore, a small subset of the data contains readings from all radiologists, allowing for the calculation of inter-rater scores.
Submitted 28 June, 2022; v1 submitted 29 September, 2021;
originally announced September 2021.
-
Analysis of One-Hidden-Layer Neural Networks via the Resolvent Method
Authors:
Vanessa Piccolo,
Dominik Schröder
Abstract:
In this work, we investigate the asymptotic spectral density of the random feature matrix $M = Y Y^\ast$ with $Y = f(WX)$ generated by a single-hidden-layer neural network, where $W$ and $X$ are random rectangular matrices with i.i.d. centred entries and $f$ is a non-linear smooth function which is applied entry-wise. We prove that the Stieltjes transform of the limiting spectral distribution approximately satisfies a quartic self-consistent equation, which is exactly the equation obtained by [Pennington, Worah] and [Benigni, Péché] with the moment method. We extend the previous results to the case of additive bias $Y=f(WX+B)$ with $B$ being an independent rank-one Gaussian random matrix, closer modelling the neural network infrastructures encountered in practice. Our key finding is that in the case of additive bias it is impossible to choose an activation function preserving the layer-to-layer singular value distribution, in sharp contrast to the bias-free case where a simple integral constraint is sufficient to achieve isospectrality. To obtain the asymptotics for the empirical spectral density we follow the resolvent method from random matrix theory via the cumulant expansion. We find that this approach is more robust and less combinatorial than the moment method and expect that it will apply also for models where the combinatorics of the former become intractable. The resolvent method has been widely employed, but compared to previous works, it is applied here to non-linear random matrices.
Submitted 11 November, 2021; v1 submitted 11 May, 2021;
originally announced May 2021.
-
Quantifying the Preferential Direction of the Model Gradient in Adversarial Training With Projected Gradient Descent
Authors:
Ricardo Bigolin Lanfredi,
Joyce D. Schroeder,
Tolga Tasdizen
Abstract:
Adversarial training, especially projected gradient descent (PGD), has proven to be a successful approach for improving robustness against adversarial attacks. After adversarial training, gradients of models with respect to their inputs have a preferential direction. However, the direction of alignment is not mathematically well established, making it difficult to evaluate quantitatively. We propose a novel definition of this direction as the direction of the vector pointing toward the closest point of the support of the closest inaccurate class in decision space. To evaluate the alignment with this direction after adversarial training, we apply a metric that uses generative adversarial networks to produce the smallest residual needed to change the class present in the image. We show that PGD-trained models have a higher alignment than the baseline according to our definition, that our metric presents higher alignment values than a competing metric formulation, and that enforcing this alignment increases the robustness of models.
Submitted 19 April, 2023; v1 submitted 10 September, 2020;
originally announced September 2020.
-
Interpretation of Disease Evidence for Medical Images Using Adversarial Deformation Fields
Authors:
Ricardo Bigolin Lanfredi,
Joyce D. Schroeder,
Clement Vachet,
Tolga Tasdizen
Abstract:
The high complexity of deep learning models is associated with the difficulty of explaining what evidence they recognize as correlating with specific disease labels. This information is critical for building trust in models and finding their biases. Until now, automated deep learning visualization solutions have identified regions of images used by classifiers, but these solutions are too coarse, too noisy, or have a limited representation of the way images can change. We propose a novel method for formulating and presenting spatial explanations of disease evidence, called deformation field interpretation with generative adversarial networks (DeFI-GAN). An adversarially trained generator produces deformation fields that modify images of diseased patients to resemble images of healthy patients. We validate the method studying chronic obstructive pulmonary disease (COPD) evidence in chest x-rays (CXRs) and Alzheimer's disease (AD) evidence in brain MRIs. When extracting disease evidence in longitudinal data, we show compelling results against a baseline producing difference maps. DeFI-GAN also highlights disease biomarkers not found by previous methods and potential biases that may help in investigations of the dataset and of the adopted learning methods.
Submitted 19 April, 2023; v1 submitted 3 July, 2020;
originally announced July 2020.
-
Adversarial regression training for visualizing the progression of chronic obstructive pulmonary disease with chest x-rays
Authors:
Ricardo Bigolin Lanfredi,
Joyce D. Schroeder,
Clement Vachet,
Tolga Tasdizen
Abstract:
Knowledge of what spatial elements of medical images deep learning methods use as evidence is important for model interpretability, trustiness, and validation. There is a lack of such techniques for models in regression tasks. We propose a method, called visualization for regression with a generative adversarial network (VR-GAN), for formulating adversarial training specifically for datasets containing regression target values characterizing disease severity. We use a conditional generative adversarial network where the generator attempts to learn to shift the output of a regressor through creating disease effect maps that are added to the original images. Meanwhile, the regressor is trained to predict the original regression value for the modified images. A model trained with this technique learns to provide visualization for how the image would appear at different stages of the disease. We analyze our method in a dataset of chest x-rays associated with pulmonary function tests, used for diagnosing chronic obstructive pulmonary disease (COPD). For validation, we compute the difference of two registered x-rays of the same patient at different time points and correlate it to the generated disease effect map. The proposed method outperforms a technique based on classification and provides realistic-looking images, making modifications to images following what radiologists usually observe for this disease. Implementation code is available at https://github.com/ricbl/vrgan.
Submitted 27 August, 2019;
originally announced August 2019.
-
An Enhanced Lumped Element Electrical Model of a Double Barrier Memristive Device
Authors:
Enver Solan,
Sven Dirkmann,
Mirko Hansen,
Dietmar Schroeder,
Hermann Kohlstedt,
Martin Ziegler,
Thomas Mussenbrock,
Karlheinz Ochs
Abstract:
The massive parallel approach of neuromorphic circuits leads to effective methods for solving complex problems. It has turned out that resistive switching devices with a continuous resistance range are potential candidates for such applications. These devices are memristive systems - nonlinear resistors with memory. They are fabricated in nanotechnology and hence parameter spread during fabrication may aggravate reproducible analyses. This issue makes simulation models of memristive devices worthwhile.
Kinetic Monte-Carlo simulations based on a distributed model of the device can be used to understand the underlying physical and chemical phenomena. However, such simulations are very time-consuming and neither convenient for investigations of whole circuits nor for real-time applications, e.g. emulation purposes. Instead, a concentrated model of the device can be used for both fast simulations and real-time applications, respectively. We introduce an enhanced electrical model of a valence change mechanism (VCM) based double barrier memristive device (DBMD) with a continuous resistance range. This device consists of an ultra-thin memristive layer sandwiched between a tunnel barrier and a Schottky-contact. The introduced model leads to very fast simulations by using usual circuit simulation tools while maintaining physically meaningful parameters.
Kinetic Monte-Carlo simulations based on a distributed model and experimental data have been utilized as references to verify the concentrated model.
Submitted 19 January, 2017;
originally announced January 2017.
-
Ubic: Bridging the gap between digital cryptography and the physical world
Authors:
Mark Simkin,
Dominique Schroeder,
Andreas Bulling,
Mario Fritz
Abstract:
Advances in computing technology increasingly blur the boundary between the digital domain and the physical world. Although the research community has developed a large number of cryptographic primitives and has demonstrated their usability in all-digital communication, many of them have not yet made their way into the real world due to usability aspects. We aim to make another step towards a tighter integration of digital cryptography into real world interactions. We describe Ubic, a framework that allows users to bridge the gap between digital cryptography and the physical world. Ubic relies on head-mounted displays, like Google Glass, resource-friendly computer vision techniques as well as mathematically sound cryptographic primitives to provide users with better security and privacy guarantees. The framework covers key cryptographic primitives, such as secure identification, document verification using a novel secure physical document format, as well as content hiding. To make a contribution of practical value, we focused on making Ubic as simple, easily deployable, and user friendly as possible.
Submitted 24 July, 2014; v1 submitted 5 March, 2014;
originally announced March 2014.