Certificates of Differential Privacy and Unlearning for Gradient-Based Training

Authors: Matthew Wicker, Philip Sosnin, Adrianna Janik, Mark N. Müller, Adrian Weller, Calvin Tsay

Abstract: Proper data stewardship requires that model owners protect the privacy of individuals' data used during training. Whether through anonymization with differential privacy or the use of unlearning in non-anonymized settings, the gold-standard techniques for providing privacy guarantees can come with significant performance penalties or be too weak to provide practical assurances. In part, this is due to the fact that the guarantee provided by differential privacy represents the worst-case privacy leakage for any individual, while the true privacy leakage of releasing the prediction for a given individual might be substantially smaller or even, as we show, non-existent. This work provides a novel framework based on convex relaxations and bounds propagation that can compute formal guarantees (certificates) that releasing specific predictions satisfies $ε=0$ privacy guarantees or that those predictions do not depend on data that is subject to an unlearning request. Our framework offers a new verification-centric approach to privacy and unlearning guarantees that can be used to further engender user trust with tighter privacy guarantees, provide formal proofs of robustness to certain membership inference attacks, identify potentially vulnerable records, and enhance current unlearning approaches. We validate the effectiveness of our approach on tasks from financial services, medical imaging, and natural language processing.

Submitted 19 June, 2024; originally announced June 2024.

Comments: 15 pages, 14 figures

arXiv:2311.06112

Turbulence Scaling from Deep Learning Diffusion Generative Models

Authors: Tim Whittaker, Romuald A. Janik, Yaron Oz

Abstract: Complex spatial and temporal structures are inherent characteristics of turbulent fluid flows and comprehending them poses a major challenge. This comprehension necessitates an understanding of the space of turbulent fluid flow configurations. We employ a diffusion-based generative model to learn the distribution of turbulent vorticity profiles and generate snapshots of turbulent solutions to the incompressible Navier-Stokes equations. We consider the inverse cascade in two spatial dimensions and generate diverse turbulent solutions that differ from those in the training dataset. We analyze the statistical scaling properties of the new turbulent profiles, calculate their structure functions, energy power spectrum, velocity probability distribution function and moments of local energy dissipation. All the learnt scaling exponents are consistent with the expected Kolmogorov scaling and have lower errors than the training ones. This agreement with established turbulence characteristics provides strong evidence of the model's capability to capture essential features of real-world turbulence.

Submitted 10 November, 2023; originally announced November 2023.

arXiv:2311.03839

Aspects of human memory and Large Language Models

Authors: Romuald A. Janik

Abstract: Large Language Models (LLMs) are huge artificial neural networks which primarily serve to generate text, but also provide a very sophisticated probabilistic model of language use. Since generating a semantically consistent text requires a form of effective memory, we investigate the memory properties of LLMs and find surprising similarities with key characteristics of human memory. We argue that the human-like memory properties of the Large Language Model do not follow automatically from the LLM architecture but are rather learned from the statistics of the training textual data. These results strongly suggest that the biological features of human memory leave an imprint on the way that we structure our textual narratives.

Submitted 8 April, 2024; v1 submitted 7 November, 2023; originally announced November 2023.

Comments: 13+3 pages; v2: abstract expanded and future research directions added; v3: minor clarifications added

arXiv:2212.02651

Explaining Link Predictions in Knowledge Graph Embedding Models with Influential Examples

Authors: Adrianna Janik, Luca Costabello

Abstract: We study the problem of explaining link predictions in Knowledge Graph Embedding (KGE) models. We propose an example-based approach that exploits the latent space representation of nodes and edges in a knowledge graph to explain predictions. We evaluated the importance of the identified triples by observing the progressive degradation of model performance upon removal of influential triples. Our experiments demonstrate that this approach to generating explanations outperforms baselines on KGE models for two publicly available datasets.

Submitted 5 December, 2022; originally announced December 2022.

arXiv:2211.15382

doi 10.1140/epje/s10189-023-00321-7

Neural Network Complexity of Chaos and Turbulence

Authors: Tim Whittaker, Romuald A. Janik, Yaron Oz

Abstract: Chaos and turbulence are complex physical phenomena, yet a precise definition of the complexity measure that quantifies them is still lacking. In this work we consider the relative complexity of chaos and turbulence from the perspective of deep neural networks. We analyze a set of classification problems, where the network has to distinguish images of fluid profiles in the turbulent regime from other classes of images such as fluid profiles in the chaotic regime, various constructions of noise and real-world images. We analyze incompressible as well as weakly compressible fluid flows. We quantify the complexity of the computation performed by the network via the intrinsic dimensionality of the internal feature representations, and calculate the effective number of independent features which the network uses in order to distinguish between classes. In addition to providing a numerical estimate of the complexity of the computation, the measure also characterizes the neural network processing at intermediate and final stages. We construct adversarial examples and use them to identify the two-point correlation spectra for the chaotic and turbulent vorticity as the feature used by the network for classification.

Submitted 20 July, 2023; v1 submitted 24 November, 2022; originally announced November 2022.

Journal ref: Eur. Phys. J. E 46, 57 (2023)

arXiv:2211.09856

Machine Learning-Assisted Recurrence Prediction for Early-Stage Non-Small-Cell Lung Cancer Patients

Authors: Adrianna Janik, Maria Torrente, Luca Costabello, Virginia Calvo, Brian Walsh, Carlos Camps, Sameh K. Mohamed, Ana L. Ortega, Vít Nováček, Bartomeu Massutí, Pasquale Minervini, M. Rosario Garcia Campelo, Edel del Barco, Joaquim Bosch-Barrera, Ernestina Menasalvas, Mohan Timilsina, Mariano Provencio

Abstract: Background: Stratifying cancer patients according to risk of relapse can personalize their care. In this work, we provide an answer to the following research question: How can machine learning be used to estimate the probability of relapse in early-stage non-small-cell lung cancer patients? Methods: For predicting relapse in 1,387 early-stage (I-II), non-small-cell lung cancer (NSCLC) patients from the Spanish Lung Cancer Group data (65.7 average age, 24.8% females, 75.2% males) we train tabular and graph machine learning models. We generate automatic explanations for the predictions of such models. For models trained on tabular data, we adopt SHAP local explanations to gauge how each patient feature contributes to the predicted outcome. We explain graph machine learning predictions with an example-based method that highlights influential past patients. Results: Machine learning models trained on tabular data exhibit a 76% accuracy for the Random Forest model at predicting relapse evaluated with a 10-fold cross-validation (the model was trained 10 times with different independent sets of patients in test, train and validation sets; the reported metrics are averaged over these 10 test sets). Graph machine learning reaches 68% accuracy over a 200-patient, held-out test set, calibrated on a held-out set of 100 patients. Conclusions: Our results show that machine learning models trained on tabular and graph data can enable objective, personalised and reproducible prediction of relapse and, therefore, disease outcome in patients with early-stage NSCLC.
With further prospective and multisite validation, and additional radiological and molecular data, this prognostic model could potentially serve as a predictive decision support tool for deciding the use of adjuvant treatments in early-stage lung cancer. Keywords: Non-Small-Cell Lung Cancer, Tumor Recurrence Prediction, Machine Learning

Submitted 17 November, 2022; originally announced November 2022.

arXiv:2202.04766

Sampling Strategy for Fine-Tuning Segmentation Models to Crisis Area under Scarcity of Data

Authors: Adrianna Janik, Kris Sankaran

Abstract: The use of remote sensing in humanitarian crisis response missions is well-established and has proven relevant repeatedly. One of the problems is obtaining gold annotations, as it is costly and time-consuming, which makes it almost impossible to fine-tune models to new regions affected by the crisis. Where time is critical, resources are limited and the environment is constantly changing, models have to evolve and provide flexible ways to adapt to a new situation. The question that we want to answer is whether prioritization of samples provides better results in fine-tuning than other classical sampling methods under annotated data scarcity. We propose a method to guide data collection during fine-tuning, based on estimated model and sample properties, like predicted IOU score. We propose two formulas for calculating sample priority. Our approach blends techniques from interpretability, representation learning and active learning. We have applied our method to a deep learning model for semantic segmentation, U-Net, in a remote sensing application of building detection, one of the core use cases of remote sensing in humanitarian applications. Preliminary results show the utility of prioritizing samples for tuning semantic segmentation models under data scarcity.

Submitted 9 February, 2022; originally announced February 2022.

arXiv:2202.04753

Discovering Concepts in Learned Representations using Statistical Inference and Interactive Visualization

Authors: Adrianna Janik, Kris Sankaran

Abstract: Concept discovery is one of the open problems in the interpretability literature that is important for bridging the gap between non-deep learning experts and model end-users. Among current formulations, one defines a concept as a direction in a learned representation space. This definition makes it possible to evaluate whether a particular concept significantly influences classification decisions for classes of interest. However, finding relevant concepts is tedious, as representation spaces are high-dimensional and hard to navigate. Current approaches include hand-crafting concept datasets and then converting them to latent space directions; alternatively, the process can be automated by clustering the latent space. In this study, we offer another two approaches to guide user discovery of meaningful concepts, one based on multiple hypothesis testing, and another on interactive visualization. We explore the potential value and limitations of these approaches through simulation experiments and a demo visual interface to real data. Overall, we find that these techniques offer a promising strategy for discovering relevant concepts in settings where users do not have predefined descriptions of them, but without completely automating the process.

Submitted 9 February, 2022; originally announced February 2022.

Comments: KDD'19, Workshop Explainable AI/ML (XAI) for Accountability, Fairness, and Transparency, August 04-08, 2019, Anchorage, AK, USA

arXiv:2109.08103

Aesthetics and neural network image representations

Authors: Romuald A. Janik

Abstract: We analyze the spaces of images encoded by generative neural networks of the BigGAN architecture. We find that generic multiplicative perturbations of neural network parameters away from the photo-realistic point often lead to networks generating images which appear as "artistic renditions" of the corresponding objects. This demonstrates an emergence of aesthetic properties directly from the structure of the photo-realistic visual environment as encoded in its neural network parametrization. Moreover, modifying a deep semantic part of the neural network leads to the appearance of symbolic visual representations. None of the considered networks had any access to images of human-made art.

Submitted 12 April, 2023; v1 submitted 16 September, 2021; originally announced September 2021.

Comments: 11 pages, 6 figures; v2: expanded discussion, appendix with 2 figures added

arXiv:2103.08590

doi 10.1117/12.2582227

Interpretability of a Deep Learning Model in the Application of Cardiac MRI Segmentation with an ACDC Challenge Dataset

Authors: Adrianna Janik, Jonathan Dodd, Georgiana Ifrim, Kris Sankaran, Kathleen Curran

Abstract: Cardiac Magnetic Resonance (CMR) is the most effective tool for the assessment and diagnosis of heart conditions, which are the world's leading cause of death. Software tools leveraging Artificial Intelligence already support radiologists and cardiologists in heart condition assessment, but their lack of transparency is a problem. This project investigates whether it is possible to discover concepts representative of different cardiac conditions from a deep network trained to segment cardiac structures: Left Ventricle (LV), Right Ventricle (RV) and Myocardium (MYO), using explainability methods that enhance the classification system by providing score-based values of qualitative concepts along with the key performance metrics. With the introduction of a requirement for explanations in GDPR, explainability of AI systems is necessary. This study applies Discovering and Testing with Concept Activation Vectors (D-TCAV), an interpretability method, to extract underlying features important for cardiac disease diagnosis from MRI data. The method provides a quantitative notion of concept importance for the classified disease. In previous studies, the base method is applied to the classification of cardiac disease and provides clinically meaningful explanations for the predictions of a black-box deep learning classifier. This study applies a method extending TCAV with a Discovering phase (D-TCAV) to cardiac MRI analysis. The advantage of the D-TCAV method over the base method is that it is user-independent.
The contribution of this study is a novel application of the explainability method D-TCAV for cardiac MRI analysis. D-TCAV provides a shorter pre-processing time for clinicians than the base method.

Submitted 15 March, 2021; originally announced March 2021.

arXiv:2006.12195

Neural networks adapting to datasets: learning network size and topology

Authors: Romuald A. Janik, Aleksandra Nowak

Abstract: We introduce a flexible setup allowing for a neural network to learn both its size and topology during the course of a standard gradient-based training. The resulting network has the structure of a graph tailored to the particular learning task and dataset. The obtained networks can also be trained from scratch and achieve virtually identical performance. We explore the properties of the network architectures for a number of datasets of varying difficulty, observing systematic regularities. The obtained graphs can therefore be understood as encoding nontrivial characteristics of the particular classification tasks.

Submitted 15 July, 2020; v1 submitted 22 June, 2020; originally announced June 2020.

Comments: Fixed blank page

arXiv:2006.04791

Complexity for deep neural networks and other characteristics of deep feature representations

Authors: Romuald A. Janik, Przemek Witaszczyk

Abstract: We define a notion of complexity, which quantifies the nonlinearity of the computation of a neural network, as well as a complementary measure of the effective dimension of feature representations. We investigate these observables for networks trained on various datasets and explore their dynamics during training, uncovering in particular power law scaling. These observables can be understood in a dual way as uncovering hidden internal structure of the datasets themselves as a function of scale or depth. The entropic character of the proposed notion of complexity should allow modes of analysis to be transferred from neuroscience and statistical physics to the domain of artificial neural networks. The introduced observables can be applied without any change to the analysis of biological neuronal systems.

Submitted 17 March, 2021; v1 submitted 8 June, 2020; originally announced June 2020.

Comments: Significant extension including developments in neuroscience context and more. 36 pages

arXiv:2002.08104

Analyzing Neural Networks Based on Random Graphs

Authors: Romuald A. Janik, Aleksandra Nowak

Abstract: We perform a massive evaluation of neural networks with architectures corresponding to random graphs of various types. We investigate various structural and numerical properties of the graphs in relation to neural network test accuracy. We find that none of the classical numerical graph invariants by itself allows one to single out the best networks. Consequently, we introduce a new numerical graph characteristic that selects a set of quasi-1-dimensional graphs, which are a majority among the best performing networks. We also find that networks with primarily short-range connections perform better than networks which allow for many long-range connections. Moreover, many resolution-reducing pathways are beneficial. We provide a dataset of 1020 graphs and the test accuracies of their corresponding neural networks at https://github.com/rmldj/random-graph-nn-paper

Submitted 2 December, 2020; v1 submitted 19 February, 2020; originally announced February 2020.

Comments: Added new results and discussion

arXiv:1909.10831

Entropy from Machine Learning

Authors: Romuald A. Janik

Abstract: We translate the problem of calculating the entropy of a set of binary configurations/signals into a sequence of supervised classification tasks. Subsequently, one can use virtually any machine learning classification algorithm for computing entropy. This procedure can be used to compute entropy, and consequently the free energy, directly from a set of Monte Carlo configurations at a given temperature. As a test of the proposed method, using an off-the-shelf machine learning classifier we reproduce the entropy and free energy of the 2D Ising model from Monte Carlo configurations at various temperatures throughout its phase diagram. Other potential applications include computing the entropy of spiking neurons or any other multidimensional binary signals.

Submitted 24 October, 2019; v1 submitted 24 September, 2019; originally announced September 2019.

Comments: 10 pages, 2 figures; v2: reference added, minor notational improvement; v3: reference added, general comments in section 3

Showing 1–14 of 14 results for author: Janik, A