-
Factual Confidence of LLMs: on Reliability and Robustness of Current Estimators
Authors:
Matéo Mahaut,
Laura Aina,
Paula Czarnowska,
Momchil Hardalov,
Thomas Müller,
Lluís Màrquez
Abstract:
Large Language Models (LLMs) tend to be unreliable in the factuality of their answers. To address this problem, NLP researchers have proposed a range of techniques to estimate LLMs' confidence over facts. However, due to the lack of a systematic comparison, it is not clear how the different methods compare to one another. To fill this gap, we present a survey and empirical comparison of estimators of factual confidence. We define an experimental framework allowing for fair comparison, covering both fact-verification and question answering. Our experiments across a series of LLMs indicate that trained hidden-state probes provide the most reliable confidence estimates, albeit at the expense of requiring access to weights and training data. We also conduct a deeper assessment of factual confidence by measuring the consistency of model behavior under meaning-preserving variations in the input. We find that the confidence of LLMs is often unstable across semantically equivalent inputs, suggesting that there is much room for improvement of the stability of models' parametric knowledge. Our code is available at (https://github.com/amazon-science/factual-confidence-of-llms).
Submitted 19 June, 2024;
originally announced June 2024.
-
Joint Lemmatization and Morphological Tagging with LEMMING
Authors:
Thomas Muller,
Ryan Cotterell,
Alexander Fraser,
Hinrich Schütze
Abstract:
We present LEMMING, a modular log-linear model that jointly models lemmatization and tagging and supports the integration of arbitrary global features. It is trainable on corpora annotated with gold standard tags and lemmata and does not rely on morphological dictionaries or analyzers. LEMMING sets the new state of the art in token-based statistical lemmatization on six languages; e.g., for Czech lemmatization, we reduce the error by 60%, from 4.05 to 1.58. We also give empirical evidence that jointly modeling morphological tags and lemmata is mutually beneficial.
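The relative error reduction quoted above can be checked directly from the two error rates. The following is a minimal sketch, not part of the paper; the function name `relative_error_reduction` is hypothetical, and it assumes the 4.05 and 1.58 figures are error rates in percent.

```python
def relative_error_reduction(baseline_error: float, new_error: float) -> float:
    """Return the fraction of the baseline error removed by the new system."""
    if baseline_error <= 0:
        raise ValueError("baseline_error must be positive")
    return (baseline_error - new_error) / baseline_error


# Czech lemmatization figures from the abstract (assumed to be error rates in %).
reduction = relative_error_reduction(4.05, 1.58)
print(f"relative error reduction: {reduction:.1%}")
```

Under these assumptions the reduction comes out at roughly 61%, consistent with the rounded 60% stated in the abstract.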
Submitted 28 May, 2024;
originally announced May 2024.
-
Labeled Morphological Segmentation with Semi-Markov Models
Authors:
Ryan Cotterell,
Thomas Müller,
Alexander Fraser,
Hinrich Schütze
Abstract:
We present labeled morphological segmentation, an alternative view of morphological processing that unifies several tasks. From an annotation standpoint, we additionally introduce a new hierarchy of morphotactic tagsets. Finally, we develop CHIPMUNK, a discriminative morphological segmentation system that, contrary to previous work, explicitly models morphotactics. We show that CHIPMUNK yields improved performance on three tasks for all six languages: (i) morphological segmentation, (ii) stemming and (iii) morphological tag classification. On morphological segmentation, our method shows absolute improvements of 2-6 points $F_1$ over the baseline.
Submitted 13 April, 2024;
originally announced April 2024.
-
Diff-Def: Diffusion-Generated Deformation Fields for Conditional Atlases
Authors:
Sophie Starck,
Vasiliki Sideri-Lampretsa,
Bernhard Kainz,
Martin Menten,
Tamara Mueller,
Daniel Rueckert
Abstract:
Anatomical atlases are widely used for population analysis. Conditional atlases target a particular sub-population defined via certain conditions (e.g. demographics or pathologies) and allow for the investigation of fine-grained anatomical differences - such as morphological changes correlated with age. Existing approaches use either registration-based methods that are unable to handle large anatomical variations or generative models, which can suffer from training instabilities and hallucinations. To overcome these limitations, we use latent diffusion models to generate deformation fields, which transform a general population atlas into one representing a specific sub-population. By generating a deformation field and registering the conditional atlas to a neighbourhood of images, we ensure structural plausibility and avoid hallucinations, which can occur during direct image synthesis. We compare our method to several state-of-the-art atlas generation methods in experiments using 5000 brain as well as whole-body MR images from UK Biobank. Our method generates highly realistic atlases with smooth transformations and high anatomical fidelity, outperforming the baselines.
Submitted 25 March, 2024;
originally announced March 2024.
-
Measurement Uncertainty: Relating the uncertainties of physical and virtual measurements
Authors:
Simon Cramer,
Tobias Müller,
Robert H. Schmitt
Abstract:
In the context of industrially mass-manufactured products, quality management is based on physically inspecting a small sample from a large batch and reasoning about the batch's quality conformance. When complementing physical inspections with predictions from machine learning models, it is crucial that the uncertainty of the prediction is known. Otherwise, the application of established quality management concepts is not legitimate. Deterministic (machine learning) models lack quantification of their predictive uncertainty and are therefore unsuitable. Probabilistic (machine learning) models provide a predictive uncertainty along with the prediction. However, a concise relationship is missing between the measurement uncertainty of physical inspections and the predictive uncertainty of probabilistic models in their application in quality management. Here, we show how the predictive uncertainty of probabilistic (machine learning) models is related to the measurement uncertainty of physical inspections. This enables the use of probabilistic models for virtual inspections and integrates them into existing quality management concepts. Thus, we can provide a virtual measurement for any quality characteristic based on the process data and achieve a 100 percent inspection rate. In the field of Predictive Quality, the virtual measurement is of great interest. Based on our results, physical inspections with a low sampling rate can be accompanied by virtual measurements that allow an inspection rate of 100 percent. We add substantial value, especially to complex process chains, as faulty products/parts are identified promptly and upcoming process steps can be aborted.
Submitted 21 February, 2024;
originally announced February 2024.
-
Bounding Reconstruction Attack Success of Adversaries Without Data Priors
Authors:
Alexander Ziller,
Anneliese Riess,
Kristian Schwethelm,
Tamara T. Mueller,
Daniel Rueckert,
Georgios Kaissis
Abstract:
Reconstruction attacks on machine learning (ML) models pose a strong risk of leakage of sensitive data. In specific contexts, an adversary can (almost) perfectly reconstruct training data samples from a trained model using the model's gradients. When training ML models with differential privacy (DP), formal upper bounds on the success of such reconstruction attacks can be provided. So far, these bounds have been formulated under worst-case assumptions that might not hold in realistic, practical settings. In this work, we provide formal upper bounds on reconstruction success under realistic adversarial settings against ML models trained with DP and support these bounds with empirical results. With this, we show that in realistic scenarios, (a) the expected reconstruction success can be bounded appropriately in different contexts and by different metrics, which (b) allows for a more educated choice of a privacy parameter.
Submitted 20 February, 2024;
originally announced February 2024.
-
MLAAD: The Multi-Language Audio Anti-Spoofing Dataset
Authors:
Nicolas M. Müller,
Piotr Kawa,
Wei Herng Choong,
Edresson Casanova,
Eren Gölge,
Thorsten Müller,
Piotr Syga,
Philip Sperl,
Konstantin Böttinger
Abstract:
Text-to-Speech (TTS) technology brings significant advantages, such as giving a voice to those with speech impairments, but also enables audio deepfakes and spoofs. The former mislead individuals and may propagate misinformation, while the latter undermine voice biometric security systems. AI-based detection can help to address these challenges by automatically differentiating between genuine and fabricated voice recordings. However, these models are only as good as their training data, which currently is severely limited due to an overwhelming concentration on English and Chinese audio in anti-spoofing databases, thus restricting its worldwide effectiveness. In response, this paper presents the Multi-Language Audio Anti-Spoof Dataset (MLAAD), created using 54 TTS models, comprising 21 different architectures, to generate 163.9 hours of synthetic voice in 23 different languages. We train and evaluate three state-of-the-art deepfake detection models with MLAAD, and observe that MLAAD demonstrates superior performance over comparable datasets like InTheWild or FakeOrReal when used as a training resource. Furthermore, in comparison with the renowned ASVspoof 2019 dataset, MLAAD proves to be a complementary resource. In tests across eight datasets, MLAAD and ASVspoof 2019 alternately outperformed each other, both excelling on four datasets. By publishing MLAAD and making trained models accessible via an interactive webserver, we aim to democratize antispoofing technology, making it accessible beyond the realm of specialists, thus contributing to global efforts against audio spoofing and deepfakes.
Submitted 16 April, 2024; v1 submitted 17 January, 2024;
originally announced January 2024.
-
X-HEEP: An Open-Source, Configurable and Extendible RISC-V Microcontroller for the Exploration of Ultra-Low-Power Edge Accelerators
Authors:
Simone Machetti,
Pasquale Davide Schiavone,
Thomas Christoph Müller,
Miguel Peón-Quirós,
David Atienza
Abstract:
The field of edge computing has witnessed remarkable growth owing to the increasing demand for real-time processing of data in applications. However, challenges persist due to limitations in performance and power consumption. To overcome these challenges, heterogeneous architectures have emerged that combine host processors with specialized accelerators tailored to specific applications, leading to improved performance and reduced power consumption. However, most of the existing platforms lack the necessary configurability and extendability options for integrating custom accelerators. To overcome these limitations, we introduce in this paper the eXtendible Heterogeneous Energy-Efficient Platform (X-HEEP). X-HEEP is an open-source platform designed to natively support the integration of ultra-low-power edge accelerators. It provides customization options to match specific application requirements by exploring various core types, bus topologies, addressing modes, memory sizes, and peripherals. Moreover, the platform prioritizes energy efficiency by implementing low-power strategies, such as clock-gating and power-gating. We demonstrate the real-world applicability of X-HEEP by providing an integration example tailored for healthcare applications that includes a coarse-grained reconfigurable array (CGRA) and in-memory computing (IMC) accelerators. The resulting design, called HEEPocrates, has been implemented both in field programmable gate array (FPGA) on the Xilinx Zynq-7020 chip and in silicon with TSMC 65nm low-power CMOS technology. We run a set of healthcare applications and measure their energy consumption to demonstrate the alignment of our chip with other state-of-the-art microcontrollers commonly adopted in this domain. Moreover, we present the energy benefits of 4.9x and 4.8x gained by exploiting the integrated CGRA and IMC accelerators compared to running on the host CPU.
Submitted 8 March, 2024; v1 submitted 10 January, 2024;
originally announced January 2024.
-
Compact Neural Graphics Primitives with Learned Hash Probing
Authors:
Towaki Takikawa,
Thomas Müller,
Merlin Nimier-David,
Alex Evans,
Sanja Fidler,
Alec Jacobson,
Alexander Keller
Abstract:
Neural graphics primitives are faster and achieve higher quality when their neural networks are augmented by spatial data structures that hold trainable features arranged in a grid. However, existing feature grids either come with a large memory footprint (dense or factorized grids, trees, and hash tables) or slow performance (index learning and vector quantization). In this paper, we show that a hash table with learned probes has neither disadvantage, resulting in a favorable combination of size and speed. Inference is faster than unprobed hash tables at equal quality while training is only 1.2-2.6x slower, significantly outperforming prior index learning approaches. We arrive at this formulation by casting all feature grids into a common framework: they each correspond to a lookup function that indexes into a table of feature vectors. In this framework, the lookup functions of existing data structures can be combined by simple arithmetic combinations of their indices, resulting in Pareto optimal compression and speed.
Submitted 28 December, 2023;
originally announced December 2023.
-
Reconciling AI Performance and Data Reconstruction Resilience for Medical Imaging
Authors:
Alexander Ziller,
Tamara T. Mueller,
Simon Stieger,
Leonhard Feiner,
Johannes Brandt,
Rickmer Braren,
Daniel Rueckert,
Georgios Kaissis
Abstract:
Artificial Intelligence (AI) models are vulnerable to information leakage of their training data, which can be highly sensitive, for example in medical imaging. Privacy Enhancing Technologies (PETs), such as Differential Privacy (DP), aim to circumvent these susceptibilities. DP is the strongest possible protection for training models while bounding the risks of inferring the inclusion of training samples or reconstructing the original data. DP achieves this by setting a quantifiable privacy budget. Although a lower budget decreases the risk of information leakage, it typically also reduces the performance of such models. This imposes a trade-off between robust performance and stringent privacy. Additionally, the interpretation of a privacy budget remains abstract and challenging to contextualize. In this study, we contrast the performance of AI models at various privacy budgets against both theoretical risk bounds and the empirical success of reconstruction attacks. We show that using very large privacy budgets can render reconstruction attacks impossible, while drops in performance are negligible. We thus conclude that not using DP -- at all -- is negligent when applying AI models to sensitive data. We deem those results to lay a foundation for further debates on striking a balance between privacy risks and model performance.
Submitted 5 December, 2023;
originally announced December 2023.
-
Machine Culture
Authors:
Levin Brinkmann,
Fabian Baumann,
Jean-François Bonnefon,
Maxime Derex,
Thomas F. Müller,
Anne-Marie Nussberger,
Agnieszka Czaplicka,
Alberto Acerbi,
Thomas L. Griffiths,
Joseph Henrich,
Joel Z. Leibo,
Richard McElreath,
Pierre-Yves Oudeyer,
Jonathan Stray,
Iyad Rahwan
Abstract:
The ability of humans to create and disseminate culture is often credited as the single most important factor of our success as a species. In this Perspective, we explore the notion of machine culture, culture mediated or generated by machines. We argue that intelligent machines simultaneously transform the cultural evolutionary processes of variation, transmission, and selection. Recommender algorithms are altering social learning dynamics. Chatbots are forming a new mode of cultural transmission, serving as cultural models. Furthermore, intelligent machines are evolving as contributors in generating cultural traits--from game strategies and visual art to scientific results. We provide a conceptual framework for studying the present and anticipated future impact of machines on cultural evolution, and present a research agenda for the study of machine culture.
Submitted 22 November, 2023; v1 submitted 19 November, 2023;
originally announced November 2023.
-
Adaptive Shells for Efficient Neural Radiance Field Rendering
Authors:
Zian Wang,
Tianchang Shen,
Merlin Nimier-David,
Nicholas Sharp,
Jun Gao,
Alexander Keller,
Sanja Fidler,
Thomas Müller,
Zan Gojcic
Abstract:
Neural radiance fields achieve unprecedented quality for novel view synthesis, but their volumetric formulation remains expensive, requiring a huge number of samples to render high-resolution images. Volumetric encodings are essential to represent fuzzy geometry such as foliage and hair, and they are well-suited for stochastic optimization. Yet, many scenes ultimately consist largely of solid surfaces which can be accurately rendered by a single sample per pixel. Based on this insight, we propose a neural radiance formulation that smoothly transitions between volumetric- and surface-based rendering, greatly accelerating rendering speed and even improving visual fidelity. Our method constructs an explicit mesh envelope which spatially bounds a neural volumetric representation. In solid regions, the envelope nearly converges to a surface and can often be rendered with a single sample. To this end, we generalize the NeuS formulation with a learned spatially-varying kernel size which encodes the spread of the density, fitting a wide kernel to volume-like regions and a tight kernel to surface-like regions. We then extract an explicit mesh of a narrow band around the surface, with width determined by the kernel size, and fine-tune the radiance field within this band. At inference time, we cast rays against the mesh and evaluate the radiance field only within the enclosed region, greatly reducing the number of samples required. Experiments show that our approach enables efficient rendering at very high fidelity. We also demonstrate that the extracted envelope enables downstream applications such as animation and simulation.
Submitted 16 November, 2023;
originally announced November 2023.
-
Semantic Modelling of Organizational Knowledge as a Basis for Enterprise Data Governance 4.0 -- Application to a Unified Clinical Data Model
Authors:
Miguel AP Oliveira,
Stephane Manara,
Bruno Molé,
Thomas Muller,
Aurélien Guillouche,
Lysann Hesske,
Bruce Jordan,
Gilles Hubert,
Chinmay Kulkarni,
Pralipta Jagdev,
Cedric R. Berger
Abstract:
Individuals and organizations cope with an always-growing amount of data, which is heterogeneous in its contents and formats. An adequate data management process yielding data quality and control over its lifecycle is a prerequisite to getting value out of this data and minimizing inherent risks related to multiple usages. Common data governance frameworks rely on people, policies, and processes that fall short of the overwhelming complexity of data. Yet, harnessing this complexity is necessary to achieve high-quality standards. The latter will condition any downstream data usage outcome, including generative artificial intelligence trained on this data. In this paper, we report our concrete experience establishing a simple, cost-efficient framework that enables metadata-driven, agile and (semi-)automated data governance (i.e. Data Governance 4.0). We explain how we implement and use this framework to integrate 25 years of clinical study data at an enterprise scale in a fully productive environment. The framework encompasses both methodologies and technologies leveraging semantic web principles. We built a knowledge graph describing avatars of data assets in their business context, including governance principles. Multiple ontologies articulated by an enterprise upper ontology enable key governance actions such as FAIRification, lifecycle management, definition of roles and responsibilities, lineage across transformations and provenance from source systems. This metadata model is the keystone to data governance 4.0: a semi-automatised data management process that considers the business context in an agile manner to adapt governance constraints to each use case and dynamically tune it based on business changes.
Submitted 23 November, 2023; v1 submitted 20 October, 2023;
originally announced November 2023.
-
The Role of Reference Points in Machine-Learned Atomistic Simulation Models
Authors:
Xiangyun Lei,
Weike Ye,
Joseph Montoya,
Tim Mueller,
Linda Hung,
Jens Hummelshoej
Abstract:
This paper introduces the Chemical Environment Modeling Theory (CEMT), a novel, generalized framework designed to overcome the limitations inherent in traditional atom-centered Machine Learning Force Field (MLFF) models, widely used in atomistic simulations of chemical systems. CEMT demonstrated enhanced flexibility and adaptability by allowing reference points to exist anywhere within the modeled domain, thus enabling the study of various model architectures. Utilizing Gaussian Multipole (GMP) featurization functions, several models with different reference point sets, including finite difference grid-centered and bond-centered models, were tested to analyze the variance in capabilities intrinsic to models built on distinct reference points. The results underscore the potential of non-atom-centered reference points in force training, revealing variations in prediction accuracy, inference speed and learning efficiency. Finally, a unique connection between CEMT and real-space orbital-free finite element Density Functional Theory (FE-DFT) is established, and the implications include the enhancement of data efficiency and robustness. It allows the leveraging of spatially-resolved energy densities and charge densities from FE-DFT calculations, as well as serving as a pivotal step towards integrating known quantum-mechanical laws into the architecture of ML models.
Submitted 27 October, 2023;
originally announced October 2023.
-
A Comparative Study of Population-Graph Construction Methods and Graph Neural Networks for Brain Age Regression
Authors:
Kyriaki-Margarita Bintsi,
Tamara T. Mueller,
Sophie Starck,
Vasileios Baltatzis,
Alexander Hammers,
Daniel Rueckert
Abstract:
The difference between the chronological and biological brain age of a subject can be an important biomarker for neurodegenerative diseases, thus brain age estimation can be crucial in clinical settings. One way to incorporate multimodal information into this estimation is through population graphs, which combine various types of imaging data and capture the associations among individuals within a population. In medical imaging, population graphs have demonstrated promising results, mostly for classification tasks. In most cases, the graph structure is pre-defined and remains static during training. However, extracting population graphs is a non-trivial task and can significantly impact the performance of Graph Neural Networks (GNNs), which are sensitive to the graph structure. In this work, we highlight the importance of a meaningful graph construction and experiment with different population-graph construction methods and their effect on GNN performance on brain age estimation. We use the homophily metric and graph visualizations to gain valuable quantitative and qualitative insights on the extracted graph structures. For the experimental evaluation, we leverage the UK Biobank dataset, which offers many imaging and non-imaging phenotypes. Our results indicate that architectures highly sensitive to the graph structure, such as Graph Convolutional Network (GCN) and Graph Attention Network (GAT), struggle with low homophily graphs, while other architectures, such as GraphSage and Chebyshev, are more robust across different homophily ratios. We conclude that static graph construction approaches are potentially insufficient for the task of brain age estimation and make recommendations for alternative research directions.
Submitted 26 September, 2023;
originally announced September 2023.
-
Survey of adaptive containerization architectures for HPC
Authors:
Tiziano Müller,
Nina Mujkanovic,
Juan J. Durillo,
Nicolay Hammer
Abstract:
Containers offer an array of advantages that benefit research reproducibility and portability across groups and systems. As container tools mature, container security improves, and High-performance computing (HPC) and cloud system tools converge, supercomputing centers are increasingly integrating containers in their workflows. The technology selection process requires sufficient information on the diverse tools available, yet the majority of research into containers still focuses on cloud environments. We consider an adaptive containerization approach, with a focus on accelerating the deployment of applications and workflows on HPC systems using containers. To this end, we discuss the specific HPC requirements regarding container tools, and analyze the entire containerization stack, including container engines and registries, in-depth. Finally, we consider various orchestrator and HPC workload manager integration scenarios.
Submitted 23 August, 2023;
originally announced August 2023.
-
Body Fat Estimation from Surface Meshes using Graph Neural Networks
Authors:
Tamara T. Mueller,
Siyu Zhou,
Sophie Starck,
Friederike Jungmann,
Alexander Ziller,
Orhun Aksoy,
Danylo Movchan,
Rickmer Braren,
Georgios Kaissis,
Daniel Rueckert
Abstract:
Body fat volume and distribution can be a strong indication for a person's overall health and the risk for developing diseases like type 2 diabetes and cardiovascular diseases. Frequently used measures for fat estimation are the body mass index (BMI), waist circumference, or the waist-hip-ratio. However, those are rather imprecise measures that do not allow for a discrimination between different types of fat or between fat and muscle tissue. The estimation of visceral (VAT) and abdominal subcutaneous (ASAT) adipose tissue volume has been shown to be a more accurate measure for named risk factors. In this work, we show that triangulated body surface meshes can be used to accurately predict VAT and ASAT volumes using graph neural networks. Our methods achieve high performance while reducing training time and required resources compared to state-of-the-art convolutional neural networks in this area. We furthermore envision this method to be applicable to cheaper and easily accessible medical surface scans instead of expensive medical images.
Submitted 31 October, 2023; v1 submitted 13 July, 2023;
originally announced August 2023.
-
SoK: Assessing the State of Applied Federated Machine Learning
Authors:
Tobias Müller,
Maximilian Stäbler,
Hugo Gascón,
Frank Köster,
Florian Matthes
Abstract:
Machine Learning (ML) has shown significant potential in various applications; however, its adoption in privacy-critical domains has been limited due to concerns about data privacy. A promising solution to this issue is Federated Machine Learning (FedML), a model-to-data approach that prioritizes data privacy. By enabling ML algorithms to be applied directly to distributed data sources without sharing raw data, FedML offers enhanced privacy protections, making it suitable for privacy-critical environments. Despite its theoretical benefits, FedML has not seen widespread practical implementation. This study aims to explore the current state of applied FedML and identify the challenges hindering its practical adoption. Through a comprehensive systematic literature review, we assess 74 relevant papers to analyze the real-world applicability of FedML. Our analysis focuses on the characteristics and emerging trends of FedML implementations, as well as the motivational drivers and application domains. We also discuss the encountered challenges in integrating FedML into real-life settings. By shedding light on the existing landscape and potential obstacles, this research contributes to the further development and implementation of FedML in privacy-critical scenarios.
Submitted 3 August, 2023;
originally announced August 2023.
-
Extended Graph Assessment Metrics for Graph Neural Networks
Authors:
Tamara T. Mueller,
Sophie Starck,
Leonhard F. Feiner,
Kyriaki-Margarita Bintsi,
Daniel Rueckert,
Georgios Kaissis
Abstract:
When re-structuring patient cohorts into so-called population graphs, initially independent data points can be incorporated into one interconnected graph structure. This population graph can then be used for medical downstream tasks using graph neural networks (GNNs). The construction of a suitable graph structure is a challenging step in the learning pipeline that can have a severe impact on model performance. To this end, different graph assessment metrics have been introduced to evaluate graph structures. However, these metrics are limited to classification tasks and discrete adjacency matrices, only covering a small subset of real-world applications. In this work, we introduce extended graph assessment metrics (GAMs) for regression tasks and continuous adjacency matrices. We focus on two GAMs specifically: homophily and cross-class neighbourhood similarity (CCNS). We extend the notion of GAMs to more than one hop, define homophily for regression tasks, as well as continuous adjacency matrices, and propose a light-weight CCNS distance for discrete and continuous adjacency matrices. We show the correlation of these metrics with model performance on different medical population graphs and under different learning settings.
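As a point of reference for the homophily metric named in this abstract, the sketch below computes the standard edge homophily ratio for a classification task on a discrete graph: the fraction of edges whose endpoints share a label. The weighted variant is only one plausible way to handle continuous adjacency values and is an assumption made here for illustration; the abstract does not give the paper's own definitions, and the function names are hypothetical.

```python
def edge_homophily(edges, labels):
    """Fraction of edges that connect two nodes with the same label.

    edges  : list of (u, v) node-index pairs (discrete adjacency)
    labels : list where labels[i] is the class of node i
    """
    if not edges:
        return 0.0
    same = sum(1 for u, v in edges if labels[u] == labels[v])
    return same / len(edges)


def weighted_edge_homophily(weighted_edges, labels):
    """Illustrative assumption, not the paper's definition:
    share of total edge weight carried by same-label edges.

    weighted_edges : list of (u, v, w) with non-negative weight w
    """
    total = sum(w for _, _, w in weighted_edges)
    if total == 0:
        return 0.0
    same = sum(w for u, v, w in weighted_edges if labels[u] == labels[v])
    return same / total


if __name__ == "__main__":
    labels = [0, 0, 1, 1]
    print(edge_homophily([(0, 1), (1, 2), (2, 3)], labels))
    print(weighted_edge_homophily([(0, 1, 0.9), (1, 2, 0.1)], labels))
```

A value near 1 means neighbours mostly share a label; a value near 0 means they mostly differ.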
Submitted 19 September, 2023; v1 submitted 13 July, 2023;
originally announced July 2023.
-
Atlas-Based Interpretable Age Prediction In Whole-Body MR Images
Authors:
Sophie Starck,
Yadunandan Vivekanand Kini,
Jessica Johanna Maria Ritter,
Rickmer Braren,
Daniel Rueckert,
Tamara Mueller
Abstract:
Age prediction is an important part of medical assessments and research. It can aid in detecting diseases as well as abnormal ageing by highlighting the discrepancy between chronological and biological age. To gain a comprehensive understanding of age-related changes observed in various body parts, we investigate them on a larger scale by using whole-body 3D images. We utilise the Grad-CAM interpretability method to determine the body areas most predictive of a person's age. We expand our analysis beyond individual subjects by employing registration techniques to generate population-wide interpretability maps. Our findings reveal three primary areas of interest: the spine, the autochthonous back muscles, and the cardiac region, which exhibits the highest importance.
Submitted 2 November, 2023; v1 submitted 14 July, 2023;
originally announced July 2023.
-
Privacy-Utility Trade-offs in Neural Networks for Medical Population Graphs: Insights from Differential Privacy and Graph Structure
Authors:
Tamara T. Mueller,
Maulik Chevli,
Ameya Daigavane,
Daniel Rueckert,
Georgios Kaissis
Abstract:
We initiate an empirical investigation into differentially private graph neural networks on population graphs from the medical domain by examining privacy-utility trade-offs at different privacy levels on both real-world and synthetic datasets and performing auditing through membership inference attacks. Our findings highlight the potential and the challenges of this specific DP application area. Moreover, we find evidence that the underlying graph structure constitutes a potential factor for larger performance gaps by showing a correlation between the degree of graph homophily and the accuracy of the trained model.
Submitted 13 July, 2023;
originally announced July 2023.
-
Interpretable 2D Vision Models for 3D Medical Images
Authors:
Alexander Ziller,
Ayhan Can Erdur,
Marwa Trigui,
Alp Güvenir,
Tamara T. Mueller,
Philip Müller,
Friederike Jungmann,
Johannes Brandt,
Jan Peeken,
Rickmer Braren,
Daniel Rueckert,
Georgios Kaissis
Abstract:
Training Artificial Intelligence (AI) models on 3D images presents unique challenges compared to the 2D case: Firstly, the demand for computational resources is significantly higher, and secondly, the availability of large datasets for pre-training is often limited, impeding training success. This study proposes a simple approach of adapting 2D networks with an intermediate feature representation for processing 3D images. Our method employs attention pooling to learn to assign each slice an importance weight and, by that, obtain a weighted average of all 2D slices. These weights directly quantify the contribution of each slice to the prediction and thus make the model prediction inspectable. We show on all 3D MedMNIST datasets as benchmark and two real-world datasets consisting of several hundred high-resolution CT or MRI scans that our approach performs on par with existing methods. Furthermore, we compare the in-built interpretability of our approach to HiResCam, a state-of-the-art retrospective interpretability approach.
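The slice-weighting idea in this abstract can be illustrated with a minimal attention-pooling sketch: each slice feature vector gets a scalar score, a softmax turns the scores into importance weights, and the pooled representation is the weighted average of the slices. The dot-product scoring against a single `query` vector is an assumption made for illustration; the abstract does not specify how the paper computes its scores, and in practice the query would be learned.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(slice_features, query):
    """Weighted average of per-slice feature vectors.

    slice_features : list of equal-length feature vectors, one per 2D slice
    query          : hypothetical scoring vector (assumed; learned in practice)
    Returns (pooled_vector, weights); the weights sum to 1 and can be
    inspected as per-slice importance.
    """
    scores = [sum(f * q for f, q in zip(feat, query)) for feat in slice_features]
    weights = softmax(scores)
    dim = len(slice_features[0])
    pooled = [
        sum(w * feat[d] for w, feat in zip(weights, slice_features))
        for d in range(dim)
    ]
    return pooled, weights


if __name__ == "__main__":
    feats = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    pooled, weights = attention_pool(feats, [2.0, 0.0])
    print(weights, pooled)
```

Because the weights are non-negative and sum to 1, they can be read directly as a per-slice importance map.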
Submitted 5 December, 2023; v1 submitted 13 July, 2023;
originally announced July 2023.
-
Neuralangelo: High-Fidelity Neural Surface Reconstruction
Authors:
Zhaoshuo Li,
Thomas Müller,
Alex Evans,
Russell H. Taylor,
Mathias Unberath,
Ming-Yu Liu,
Chen-Hsuan Lin
Abstract:
Neural surface reconstruction has been shown to be powerful for recovering dense 3D surfaces via image-based neural rendering. However, current methods struggle to recover detailed structures of real-world scenes. To address the issue, we present Neuralangelo, which combines the representation power of multi-resolution 3D hash grids with neural surface rendering. Two key ingredients enable our approach: (1) numerical gradients for computing higher-order derivatives as a smoothing operation and (2) coarse-to-fine optimization on the hash grids controlling different levels of details. Even without auxiliary inputs such as depth, Neuralangelo can effectively recover dense 3D surface structures from multi-view images with fidelity significantly surpassing previous methods, enabling detailed large-scale scene reconstruction from RGB video captures.
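To make the first ingredient concrete, the sketch below shows a generic central finite-difference gradient. Because the estimate is built from samples a distance `eps` away from the query point, the step size sets the spatial scale over which the function is effectively smoothed. This is a textbook construction and not the Neuralangelo implementation; the toy sphere distance function and the value of `eps` are assumptions chosen for illustration.

```python
import math


def numerical_gradient(f, point, eps=1e-3):
    """Central finite-difference gradient of a scalar function f at `point`.

    Each partial derivative uses two samples at distance eps from the point,
    so a larger eps averages over a wider neighbourhood.
    """
    grad = []
    for i in range(len(point)):
        fwd = list(point)
        bwd = list(point)
        fwd[i] += eps
        bwd[i] -= eps
        grad.append((f(fwd) - f(bwd)) / (2.0 * eps))
    return grad


def sphere_sdf(p):
    """Toy signed distance to a unit sphere at the origin (illustrative only)."""
    return math.sqrt(sum(x * x for x in p)) - 1.0


if __name__ == "__main__":
    # The gradient of a signed distance function is the outward unit normal.
    print(numerical_gradient(sphere_sdf, [2.0, 0.0, 0.0]))
```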
Submitted 12 June, 2023; v1 submitted 5 June, 2023;
originally announced June 2023.
-
Unlocking the Potential of Collaborative AI -- On the Socio-technical Challenges of Federated Machine Learning
Authors:
Tobias Müller,
Milena Zahn,
Florian Matthes
Abstract:
The disruptive potential of AI systems roots in the emergence of big data. Yet, a significant portion is scattered and locked in data silos, leaving its potential untapped. Federated Machine Learning is a novel AI paradigm enabling the creation of AI models from decentralized, potentially siloed data. Hence, Federated Machine Learning could technically open data silos and therefore unlock economic potential. However, this requires collaboration between multiple parties owning data silos. Setting up collaborative business models is complex and often a reason for failure. Current literature lacks guidelines on which aspects must be considered to successfully realize collaborative AI projects. This research investigates the challenges of prevailing collaborative business models and distinct aspects of Federated Machine Learning. Through a systematic literature review, focus group, and expert interviews, we provide a systemized collection of socio-technical challenges and an extended Business Model Canvas for the initial viability assessment of collaborative AI projects.
Submitted 28 April, 2023; v1 submitted 26 April, 2023;
originally announced April 2023.
-
Underwater Autonomous Tank Cleaning Rover
Authors:
Aditya Sundarajan,
Jaideepnath Anand,
Kevin Timothy Muller,
Mangal Das
Abstract:
In order to keep aquatic ecosystems safe and healthy, it is imperative that cleaning be done frequently. This research suggests the use of autonomous underwater rovers for effective underwater cleaning as a novel approach to this issue. The enhanced sensing and navigational capabilities of the autonomous rovers enable them to independently navigate underwater environments and find and remove underwater garbage and uneaten fish feed which can be recycled. The suggested solution not only does away with the requirement for human divers, but also provides a more effective and affordable technique for underwater cleaning. The paper also examines the creation, testing, and potential of the autonomous underwater rovers.
Submitted 17 April, 2023;
originally announced April 2023.
-
BundleSDF: Neural 6-DoF Tracking and 3D Reconstruction of Unknown Objects
Authors:
Bowen Wen,
Jonathan Tremblay,
Valts Blukis,
Stephen Tyree,
Thomas Muller,
Alex Evans,
Dieter Fox,
Jan Kautz,
Stan Birchfield
Abstract:
We present a near real-time method for 6-DoF tracking of an unknown object from a monocular RGBD video sequence, while simultaneously performing neural 3D reconstruction of the object. Our method works for arbitrary rigid objects, even when visual texture is largely absent. The object is assumed to be segmented in the first frame only. No additional information is required, and no assumption is made about the interaction agent. Key to our method is a Neural Object Field that is learned concurrently with a pose graph optimization process in order to robustly accumulate information into a consistent 3D representation capturing both geometry and appearance. A dynamic pool of posed memory frames is automatically maintained to facilitate communication between these threads. Our approach handles challenging sequences with large pose changes, partial and full occlusion, untextured surfaces, and specular highlights. We show results on HO3D, YCBInEOAT, and BEHAVE datasets, demonstrating that our method significantly outperforms existing approaches. Project page: https://bundlesdf.github.io
Submitted 24 March, 2023;
originally announced March 2023.
-
Bridging scales with Machine Learning: From first principles statistical mechanics to continuum phase field computations to study order disorder transitions in LixCoO2
Authors:
G. H. Teichert,
S. Das,
M. Faghih Shojaei,
J. Holber,
T. Mueller,
L. Hung,
V. Gavini,
K. Garikipati
Abstract:
LixTMO2 (TM=Ni, Co, Mn) forms an important family of cathode materials for Li-ion batteries, whose performance is strongly governed by Li composition-dependent crystal structure and phase stability. Here, we use LixCoO2 (LCO) as a model system to benchmark a machine learning-enabled framework for bridging scales in materials physics. We focus on two scales: (a) assemblies of thousands of atoms described by density functional theory-informed statistical mechanics, and (b) continuum phase field models to study the dynamics of order-disorder transitions in LCO. Central to the scale bridging is the rigorous, quantitatively accurate representation of the free energy density and chemical potentials of this material system by coarse-graining formation energies for specific atomic configurations. We develop active learning workflows to train recently developed integrable deep neural networks for such high-dimensional free energy density and chemical potential functions. The resulting first principles-informed, machine learning-enabled phase-field computations allow us to study LCO cathodes' phase evolution in terms of temperature, morphology, charge cycling and particle size.
Submitted 17 February, 2023;
originally announced February 2023.
-
Towards interpretable quantum machine learning via single-photon quantum walks
Authors:
Fulvio Flamini,
Marius Krumm,
Lukas J. Fiderer,
Thomas Müller,
Hans J. Briegel
Abstract:
Variational quantum algorithms represent a promising approach to quantum machine learning where classical neural networks are replaced by parametrized quantum circuits. However, both approaches suffer from a clear limitation, that is a lack of interpretability. Here, we present a variational method to quantize projective simulation (PS), a reinforcement learning model aimed at interpretable artificial intelligence. Decision making in PS is modeled as a random walk on a graph describing the agent's memory. To implement the quantized model, we consider quantum walks of single photons in a lattice of tunable Mach-Zehnder interferometers trained via variational algorithms. Using an example from transfer learning, we show that the quantized PS model can exploit quantum interference to acquire capabilities beyond those of its classical counterpart. Finally, we discuss the role of quantum interference for training and tracing the decision making process, paving the way for realizations of interpretable quantum learning agents.
Submitted 16 October, 2023; v1 submitted 31 January, 2023;
originally announced January 2023.
-
Training one model to detect heart and lung sound events from single point auscultations
Authors:
Leander Melms,
Robert R. Ilesan,
Ulrich Köhler,
Olaf Hildebrandt,
Regina Conradt,
Jens Eckstein,
Cihan Atila,
Sami Matrood,
Bernhard Schieffer,
Jürgen R. Schaefer,
Tobias Müller,
Julius Obergassel,
Nadine Schlicker,
Martin C. Hirsch
Abstract:
Objective: This work proposes a semi-supervised training approach for detecting lung and heart sounds simultaneously with only one trained model, invariant to the auscultation point. Methods: We use open-access data from the 2016 Physionet/CinC Challenge, the 2022 George Moody Challenge, and from the lung sound database HF_V1. We first train specialist single-task models using foreground ground truth (GT) labels from different auscultation databases to identify background sound events in the respective lung and heart auscultation databases. The pseudo-labels generated in this way were combined with the ground truth labels in a new training iteration, such that a new model was subsequently trained to detect foreground and background signals. Benchmark tests ensured that the newly trained model could detect both lung and heart sound events in different auscultation sites without regressing on the original task. We also established hand-validated labels for the respective background signal in heart and lung sound auscultations to evaluate the models. Results: In this work, we report for the first time results for i) a multi-class prediction for lung sound events and ii) simultaneous detection of heart and lung sound events, and achieve competitive results using only one model. The combined multi-task model regressed slightly in heart sound detection and gained significantly in lung sound detection accuracy with an overall macro F1 score of 39.2% over six classes, representing a 6.7% improvement over the single-task baseline models. Conclusion/Significance: To the best of our knowledge, this is the first approach developed to date for measuring heart and lung sound events invariant to both the auscultation site and the capturing device. Hence, our model is capable of performing lung and heart sound detection from any auscultation location.
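The headline metric in this abstract, macro F1, is the unweighted mean of the per-class F1 scores, so rare classes count as much as frequent ones. The sketch below shows the standard computation on toy labels; the data are made up for illustration and are unrelated to the reported 39.2% figure.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over all classes seen in either list.

    Per-class F1 = 2*TP / (2*TP + FP + FN); a class with no true or
    predicted instances counted in the denominator scores 0.
    """
    classes = sorted(set(y_true) | set(y_pred))
    f1_scores = []
    for c in classes:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        denom = 2 * tp + fp + fn
        f1_scores.append(2 * tp / denom if denom else 0.0)
    return sum(f1_scores) / len(f1_scores)


if __name__ == "__main__":
    print(macro_f1([0, 0, 1, 1], [0, 1, 1, 1]))
```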
Submitted 15 January, 2023;
originally announced January 2023.
-
SEQUENT: Towards Traceable Quantum Machine Learning using Sequential Quantum Enhanced Training
Authors:
Philipp Altmann,
Leo Sünkel,
Jonas Stein,
Tobias Müller,
Christoph Roch,
Claudia Linnhoff-Popien
Abstract:
Applying new computing paradigms like quantum computing to the field of machine learning has recently gained attention. However, as high-dimensional real-world applications are not yet feasible to be solved using purely quantum hardware, hybrid methods using both classical and quantum machine learning paradigms have been proposed. For instance, transfer learning methods have been shown to be successfully applicable to hybrid image classification tasks. Nevertheless, beneficial circuit architectures still need to be explored. Therefore, tracing the impact of the chosen circuit architecture and parameterization is crucial for the development of beneficially applicable hybrid methods. However, current methods include processes where both parts are trained concurrently, therefore not allowing for a strict separability of classical and quantum impact. Thus, those architectures might produce models that yield a superior prediction accuracy whilst employing the least possible quantum impact. To tackle this issue, we propose Sequential Quantum Enhanced Training (SEQUENT) an improved architecture and training process for the traceable application of quantum computing methods to hybrid machine learning. Furthermore, we provide formal evidence for the disadvantage of current methods and preliminary experimental results as a proof-of-concept for the applicability of SEQUENT.
Submitted 26 April, 2023; v1 submitted 6 January, 2023;
originally announced January 2023.
-
An Agent-based Realisation for a continuous Model Adaption Approach in intelligent Digital Twins
Authors:
Daniel Dittler,
Peter Lierhammer,
Dominik Braun,
Timo Müller,
Nasser Jazdi,
Michael Weyrich
Abstract:
The trend in industrial automation is towards networking, intelligence and autonomy. Digital Twins, which serve as virtual representations, are becoming increasingly important in this context. The Digital Twin of a modular production system contains many different models that are mostly created for specific applications and fulfil different requirements. Especially simulation models, which are created in the development phase, can be used during the operational phase for applications such as prognosis or operation-parallel simulation. Due to the high heterogeneity of the model landscape in the context of a modular production system, the plant operator is faced with the challenge of adapting the models in order to ensure an application-oriented realism in the event of changes to the asset and its environment or the addition of applications. Therefore, this paper proposes a concept for the continuous model adaption in the Digital Twin of a modular production system during the operational phase. The benefits are then demonstrated by an application scenario and an agent-based realisation.
Submitted 7 December, 2022;
originally announced December 2022.
-
How Do Input Attributes Impact the Privacy Loss in Differential Privacy?
Authors:
Tamara T. Mueller,
Stefan Kolek,
Friederike Jungmann,
Alexander Ziller,
Dmitrii Usynin,
Moritz Knolle,
Daniel Rueckert,
Georgios Kaissis
Abstract:
Differential privacy (DP) is typically formulated as a worst-case privacy guarantee over all individuals in a database. More recently, extensions to individual subjects or their attributes have been introduced. Under the individual/per-instance DP interpretation, we study the connection between the per-subject gradient norm in DP neural networks and individual privacy loss and introduce a novel metric termed the Privacy Loss-Input Susceptibility (PLIS), which allows one to apportion the subject's privacy loss to their input attributes. We experimentally show how this enables the identification of sensitive attributes and of subjects at high risk of data reconstruction.
Submitted 18 November, 2022;
originally announced November 2022.
-
Generalizability of Functional Forms for Interatomic Potential Models Discovered by Symbolic Regression
Authors:
Alberto Hernandez,
Tim Mueller
Abstract:
In recent years there has been great progress in the use of machine learning algorithms to develop interatomic potential models. Machine-learned potential models are typically orders of magnitude faster than density functional theory but also orders of magnitude slower than physics-derived models such as the embedded atom method. In our previous work, we used symbolic regression to develop fast, accurate and transferable interatomic potential models for copper with novel functional forms that resemble those of the embedded atom method. To determine the extent to which the success of these forms was specific to copper, here we explore the generalizability of these models to other face-centered cubic transition metals and analyze their out-of-sample performance on several material properties. We found that these forms work particularly well on elements that are chemically similar to copper. When compared to optimized Sutton-Chen models, which have similar complexity, the functional forms discovered using symbolic regression perform better across all elements considered except gold, where they have a similar performance. They perform similarly to a moderately more complex embedded atom form on properties on which they were trained, and they are more accurate on average on other properties. We attribute this improved generalized accuracy to the relative simplicity of the models discovered using symbolic regression. The genetic programming models are found to outperform other models from the literature about 50% of the time in a variety of property predictions, with about 1/10th the model complexity on average. We discuss the implications of these results for the broader application of symbolic regression to the development of new potentials and highlight how models discovered for one element can be used to seed new searches for different elements.
Submitted 24 March, 2023; v1 submitted 26 October, 2022;
originally announced October 2022.
-
Parallel Inversion of Neural Radiance Fields for Robust Pose Estimation
Authors:
Yunzhi Lin,
Thomas Müller,
Jonathan Tremblay,
Bowen Wen,
Stephen Tyree,
Alex Evans,
Patricio A. Vela,
Stan Birchfield
Abstract:
We present a parallelized optimization method based on fast Neural Radiance Fields (NeRF) for estimating 6-DoF pose of a camera with respect to an object or scene. Given a single observed RGB image of the target, we can predict the translation and rotation of the camera by minimizing the residual between pixels rendered from a fast NeRF model and pixels in the observed image. We integrate a momentum-based camera extrinsic optimization procedure into Instant Neural Graphics Primitives, a recent exceptionally fast NeRF implementation. By introducing parallel Monte Carlo sampling into the pose estimation task, our method overcomes local minima and improves efficiency in a more extensive search space. We also show the importance of adopting a more robust pixel-based loss function to reduce error. Experiments demonstrate that our method can achieve improved generalization and robustness on both synthetic and real-world benchmarks.
Submitted 10 March, 2023; v1 submitted 18 October, 2022;
originally announced October 2022.
-
A graph-based knowledge representation and pattern mining supporting the Digital Twin creation of existing manufacturing systems
Authors:
Dominik Braun,
Timo Müller,
Nada Sahlab,
Nasser Jazdi,
Wolfgang Schloegl,
Michael Weyrich
Abstract:
The creation of a Digital Twin for existing manufacturing systems, so-called brownfield systems, is a challenging task due to the needed expert knowledge about the structure of brownfield systems and the effort to realize the digital models. Several approaches and methods have already been proposed that at least partially digitalize the information about a brownfield manufacturing system. A Digital Twin requires linked information from multiple sources. This paper presents a graph-based approach to merge information from heterogeneous sources. Furthermore, the approach provides a way to automatically identify templates using graph structure analysis to facilitate further work with the resulting Digital Twin and its further enhancement.
Submitted 21 September, 2022;
originally announced September 2022.
-
Stability of Weighted Majority Voting under Estimated Weights
Authors:
Shaojie Bai,
Dongxia Wang,
Tim Muller,
Peng Cheng,
Jiming Chen
Abstract:
Weighted Majority Voting (WMV) is a well-known optimal decision rule for collective decision making, given the probability that sources provide accurate information (trustworthiness). However, in reality, the trustworthiness is not a known quantity to the decision maker - they have to rely on an estimate called trust. A (machine learning) algorithm that computes trust is called unbiased when it has the property that it does not systematically overestimate or underestimate the trustworthiness. To formally analyse the uncertainty in the decision process, we introduce and analyse two important properties of such unbiased trust values: stability of correctness and stability of optimality. Stability of correctness means that the decision accuracy that the decision maker believes they achieved is equal to the actual accuracy. We prove stability of correctness holds. Stability of optimality means that the decisions made based on trust are as good as they would have been if they were based on trustworthiness. Stability of optimality does not hold. We analyse the difference between the two, and bounds thereon. We also present an overview of how sensitive decision correctness is to changes in trust and trustworthiness.
Submitted 30 June, 2024; v1 submitted 13 July, 2022;
originally announced July 2022.
-
Intelligent Exploration of Solution Spaces Exemplified by Industrial Reconfiguration Management
Authors:
Timo Müller,
Benjamin Maschler,
Daniel Dittler,
Nasser Jazdi,
Michael Weyrich
Abstract:
Many decision-making approaches rely on the exploration of solution spaces with regard to specified criteria. However, in complex environments, brute-force exploration strategies are usually not feasible. As an alternative, we propose the combination of an exploration task's vertical sub-division into layers representing different sequentially interdependent sub-problems of the paramount problem and a horizontal sub-division into self-sustained solution sub-spaces. In this paper, we present a universal methodology for the intelligent exploration of solution spaces and derive a use-case specific example from the field of reconfiguration management in Industry 4.0.
Submitted 4 July, 2022;
originally announced July 2022.
-
Variable Bitrate Neural Fields
Authors:
Towaki Takikawa,
Alex Evans,
Jonathan Tremblay,
Thomas Müller,
Morgan McGuire,
Alec Jacobson,
Sanja Fidler
Abstract:
Neural approximations of scalar and vector fields, such as signed distance functions and radiance fields, have emerged as accurate, high-quality representations. State-of-the-art results are obtained by conditioning a neural approximation with a lookup from trainable feature grids that take on part of the learning task and allow for smaller, more efficient neural networks. Unfortunately, these feature grids usually come at the cost of significantly increased memory consumption compared to stand-alone neural network models. We present a dictionary method for compressing such feature grids, reducing their memory consumption by up to 100x and permitting a multiresolution representation which can be useful for out-of-core streaming. We formulate the dictionary optimization as a vector-quantized auto-decoder problem which lets us learn end-to-end discrete neural representations in a space where no direct supervision is available and with dynamic topology and structure. Our source code will be available at https://github.com/nv-tlabs/vqad.
Submitted 15 June, 2022;
originally announced June 2022.
-
RTMV: A Ray-Traced Multi-View Synthetic Dataset for Novel View Synthesis
Authors:
Jonathan Tremblay,
Moustafa Meshry,
Alex Evans,
Jan Kautz,
Alexander Keller,
Sameh Khamis,
Thomas Müller,
Charles Loop,
Nathan Morrical,
Koki Nagano,
Towaki Takikawa,
Stan Birchfield
Abstract:
We present a large-scale synthetic dataset for novel view synthesis consisting of ~300k images rendered from nearly 2000 complex scenes using high-quality ray tracing at high resolution (1600 x 1600 pixels). The dataset is orders of magnitude larger than existing synthetic datasets for novel view synthesis, thus providing a large unified benchmark for both training and evaluation. Using 4 distinct sources of high-quality 3D meshes, the scenes of our dataset exhibit challenging variations in camera views, lighting, shape, materials, and textures. Because our dataset is too large for existing methods to process, we propose Sparse Voxel Light Field (SVLF), an efficient voxel-based light field approach for novel view synthesis that achieves comparable performance to NeRF on synthetic data, while being an order of magnitude faster to train and two orders of magnitude faster to render. SVLF achieves this speed by relying on a sparse voxel octree, careful voxel sampling (requiring only a handful of queries per ray), and reduced network structure; as well as ground truth depth maps at training time. Our dataset is generated by NViSII, a Python-based ray tracing renderer, which is designed to be simple for non-experts to use and share, flexible and powerful through its use of scripting, and able to create high-quality and physically-based rendered images. Experiments with a subset of our dataset allow us to compare standard methods like NeRF and mip-NeRF for single-scene modeling, and pixelNeRF for category-level modeling, pointing toward the need for future improvements in this area.
Submitted 24 October, 2022; v1 submitted 14 May, 2022;
originally announced May 2022.
-
Zero and Few-shot Learning for Author Profiling
Authors:
Mara Chinea-Rios,
Thomas Müller,
Gretel Liz De la Peña Sarracén,
Francisco Rangel,
Marc Franco-Salvador
Abstract:
Author profiling classifies author characteristics by analyzing how language is shared among people. In this work, we study that task from a low-resource viewpoint: using little or no training data. We explore different zero and few-shot models based on entailment and evaluate our systems on several profiling tasks in Spanish and English. In addition, we study the effect of both the entailment hypothesis and the size of the few-shot training sample. We find that entailment-based models out-perform supervised text classifiers based on roberta-XLM and that we can reach 80% of the accuracy of previous approaches using less than 50% of the training data on average.
Submitted 17 May, 2022; v1 submitted 22 April, 2022;
originally announced April 2022.
-
Active Few-Shot Learning with FASL
Authors:
Thomas Müller,
Guillermo Pérez-Torró,
Angelo Basile,
Marc Franco-Salvador
Abstract:
Recent advances in natural language processing (NLP) have led to strong text classification models for many tasks. However, thousands of examples are often still needed to train models of good quality. This makes it challenging to quickly develop and deploy new models for real-world problems and business needs. Few-shot learning and active learning are two lines of research aimed at tackling this problem. In this work, we combine both lines into FASL, a platform that allows training text classification models using an iterative and fast process. We investigate which active learning methods work best in our few-shot setup. Additionally, we develop a model to predict when to stop annotating. This is relevant because in a few-shot setup we do not have access to a large validation set.
Submitted 17 May, 2022; v1 submitted 20 April, 2022;
originally announced April 2022.
-
Few-Shot Learning with Siamese Networks and Label Tuning
Authors:
Thomas Müller,
Guillermo Pérez-Torró,
Marc Franco-Salvador
Abstract:
We study the problem of building text classifiers with little or no training data, commonly known as zero and few-shot text classification. In recent years, an approach based on neural textual entailment models has been found to give strong results on a diverse range of tasks. In this work, we show that with proper pre-training, Siamese Networks that embed texts and labels offer a competitive alternative. These models allow for a large reduction in inference cost: constant in the number of labels rather than linear. Furthermore, we introduce label tuning, a simple and computationally efficient approach that adapts the models in a few-shot setup by changing only the label embeddings. While giving lower performance than model fine-tuning, this approach has the architectural advantage that a single encoder can be shared by many different tasks.
Submitted 20 April, 2022; v1 submitted 28 March, 2022;
originally announced March 2022.
-
Privacy: An axiomatic approach
Authors:
Alexander Ziller,
Tamara Mueller,
Rickmer Braren,
Daniel Rueckert,
Georgios Kaissis
Abstract:
The increasing prevalence of large-scale data collection in modern society represents a potential threat to individual privacy. Addressing this threat, for example through privacy-enhancing technologies (PETs), requires a rigorous definition of what exactly is being protected, that is, of privacy itself. In this work, we formulate an axiomatic definition of privacy based on quantifiable and irreducible information flows. Our definition synthesizes prior work from the domain of social science with a contemporary understanding of PETs such as differential privacy (DP). Our work highlights the fact that the inevitable difficulties of protecting privacy in practice are fundamentally information-theoretic. Moreover, it enables quantitative reasoning about PETs based on what they are protecting, thus fostering objective policy discourse about their societal implementation.
Submitted 22 March, 2022;
originally announced March 2022.
-
SoK: Differential Privacy on Graph-Structured Data
Authors:
Tamara T. Mueller,
Dmitrii Usynin,
Johannes C. Paetzold,
Daniel Rueckert,
Georgios Kaissis
Abstract:
In this work, we study the applications of differential privacy (DP) in the context of graph-structured data. We discuss the formulations of DP applicable to the publication of graphs and their associated statistics as well as machine learning on graph-based data, including graph neural networks (GNNs). The formulation of DP in the context of graph-structured data is difficult, as individual data points are interconnected (often non-linearly or sparsely). This connectivity complicates the computation of individual privacy loss in differentially private learning. The problem is exacerbated by an absence of a single, well-established formulation of DP in graph settings. This issue extends to the domain of GNNs, rendering private machine learning on graph-structured data a challenging task. A lack of prior systematisation work motivated us to study graph-based learning from a privacy perspective. In this work, we systematise different formulations of DP on graphs, discuss challenges and promising applications, including the GNN domain. We compare and separate works into graph analysis tasks and graph learning tasks with GNNs. Finally, we conclude our work with a discussion of open questions and potential directions for further research in this area.
Submitted 17 March, 2022;
originally announced March 2022.
-
Differentially Private Graph Classification with GNNs
Authors:
Tamara T. Mueller,
Johannes C. Paetzold,
Chinmay Prabhakar,
Dmitrii Usynin,
Daniel Rueckert,
Georgios Kaissis
Abstract:
Graph Neural Networks (GNNs) have established themselves as the state-of-the-art models for many machine learning applications such as the analysis of social networks, protein interactions and molecules. Several among these datasets contain privacy-sensitive data. Machine learning with differential privacy is a promising technique to allow deriving insight from sensitive data while offering formal guarantees of privacy protection. However, the differentially private training of GNNs has so far remained under-explored due to the challenges presented by the intrinsic structural connectivity of graphs. In this work, we introduce differential privacy for graph-level classification, one of the key applications of machine learning on graphs. Our method is applicable to deep learning on multi-graph datasets and relies on differentially private stochastic gradient descent (DP-SGD). We show results on a variety of synthetic and public datasets and evaluate the impact of different GNN architectures and training hyperparameters on model performance for differentially private graph classification. Finally, we apply explainability techniques to assess whether similar representations are learned in the private and non-private settings and establish robust baselines for future work in this area.
Submitted 8 February, 2022; v1 submitted 5 February, 2022;
originally announced February 2022.
-
Instant Neural Graphics Primitives with a Multiresolution Hash Encoding
Authors:
Thomas Müller,
Alex Evans,
Christoph Schied,
Alexander Keller
Abstract:
Neural graphics primitives, parameterized by fully connected neural networks, can be costly to train and evaluate. We reduce this cost with a versatile new input encoding that permits the use of a smaller network without sacrificing quality, thus significantly reducing the number of floating point and memory access operations: a small neural network is augmented by a multiresolution hash table of trainable feature vectors whose values are optimized through stochastic gradient descent. The multiresolution structure allows the network to disambiguate hash collisions, making for a simple architecture that is trivial to parallelize on modern GPUs. We leverage this parallelism by implementing the whole system using fully-fused CUDA kernels with a focus on minimizing wasted bandwidth and compute operations. We achieve a combined speedup of several orders of magnitude, enabling training of high-quality neural graphics primitives in a matter of seconds, and rendering in tens of milliseconds at a resolution of $1920 \times 1080$.
Submitted 4 May, 2022; v1 submitted 16 January, 2022;
originally announced January 2022.
-
Path Guiding Using Spatio-Directional Mixture Models
Authors:
Ana Dodik,
Marios Papas,
Cengiz Öztireli,
Thomas Müller
Abstract:
We propose a learning-based method for light-path construction in path tracing algorithms, which iteratively optimizes and samples from what we refer to as spatio-directional Gaussian mixture models (SDMMs). In particular, we approximate incident radiance as an online-trained $5$D mixture that is accelerated by a $k$D-tree. Using the same framework, we approximate BSDFs as pre-trained $n$D mixtures, where $n$ is the number of BSDF parameters. Such an approach addresses two major challenges in path-guiding models. First, the $5$D radiance representation naturally captures correlation between the spatial and directional dimensions. Such correlations are present in, e.g., parallax and caustics. Second, by using a tangent-space parameterization of Gaussians, our spatio-directional mixtures can perform approximate product sampling with arbitrarily oriented BSDFs. Existing models are only able to do this by either forgoing anisotropy of the mixture components or by representing the radiance field in local (normal aligned) coordinates, both of which make the radiance field more difficult to learn. An additional benefit of the tangent-space parameterization is that each individual Gaussian is mapped to the solid sphere with low distortion near its center of mass. Our method performs especially well on scenes with small, localized luminaires that induce high spatio-directional correlation in the incident radiance.
Submitted 28 December, 2021; v1 submitted 25 November, 2021;
originally announced November 2021.
-
Extracting Triangular 3D Models, Materials, and Lighting From Images
Authors:
Jacob Munkberg,
Jon Hasselgren,
Tianchang Shen,
Jun Gao,
Wenzheng Chen,
Alex Evans,
Thomas Müller,
Sanja Fidler
Abstract:
We present an efficient method for joint optimization of topology, materials and lighting from multi-view image observations. Unlike recent multi-view reconstruction approaches, which typically produce entangled 3D representations encoded in neural networks, we output triangle meshes with spatially-varying materials and environment lighting that can be deployed in any traditional graphics engine unmodified. We leverage recent work in differentiable rendering, coordinate-based networks to compactly represent volumetric texturing, alongside differentiable marching tetrahedrons to enable gradient-based optimization directly on the surface mesh. Finally, we introduce a differentiable formulation of the split sum approximation of environment lighting to efficiently recover all-frequency lighting. Experiments show our extracted models used in advanced scene editing, material decomposition, and high quality view interpolation, all running at interactive rates in triangle-based renderers (rasterizers and path tracers). Project website: https://nvlabs.github.io/nvdiffrec/
Submitted 11 April, 2023; v1 submitted 24 November, 2021;
originally announced November 2021.
-
A photosensor employing data-driven binning for ultrafast image recognition
Authors:
Lukas Mennel,
Aday J. Molina-Mendoza,
Matthias Paur,
Dmitry K. Polyushkin,
Dohyun Kwak,
Miriam Giparakis,
Maximilian Beiser,
Aaron Maxwell Andrews,
Thomas Mueller
Abstract:
Pixel binning is a technique, widely used in optical image acquisition and spectroscopy, in which adjacent detector elements of an image sensor are combined into larger pixels. This reduces the amount of data to be processed as well as the impact of noise, but comes at the cost of a loss of information. Here, we push the concept of binning to its limit by combining a large fraction of the sensor e…
▽ More
Pixel binning is a technique, widely used in optical image acquisition and spectroscopy, in which adjacent detector elements of an image sensor are combined into larger pixels. This reduces the amount of data to be processed as well as the impact of noise, but comes at the cost of a loss of information. Here, we push the concept of binning to its limit by combining a large fraction of the sensor elements into a single superpixel that extends over the whole face of the chip. For a given pattern recognition task, its optimal shape is determined from training data using a machine learning algorithm. We demonstrate the classification of optically projected images from the MNIST dataset on a nanosecond timescale, with enhanced sensitivity and without loss of classification accuracy. Our concept is not limited to imaging alone but can also be applied in optical spectroscopy or other sensing applications.
△ Less
Submitted 20 November, 2021;
originally announced November 2021.
-
Spherical harmonic shape descriptors of nodal force demands for quantifying spatial truss connection complexity
Authors:
Keith J. Lee,
Renaud Danhaive,
Caitlin T. Mueller
Abstract:
The connections of a spatial truss structure play a critical role in the safe and efficient transfer of axial forces between members. For discrete connections, they can also improve construction efficiency by acting as registration devices that lock members in precise orientations. As more geometrically complex spatial trusses are enabled by computational workflows and the demand for material-efficient spanning systems, there is a need to understand the effects of global form on the demands at the connections. For large-scale structures with irregular geometry, customizing individual nodes to meet exact member orientations and force demands may be infeasible; conversely, standardizing all connections results in oversized nodes and a compromise in registration potential. We propose a method for quantifying the complexity of spatial truss designs by the variation in nodal force demands. By representing nodal forces as a geometric object, we leverage the spherical harmonic shape descriptor, developed for applications in computational geometry, to characterize each node by a rotation and translation-invariant fixed-length vector. We define a complexity score for spatial truss design by the variance in the positions of the feature vectors in higher-dimensional space, providing an additional performance metric during early stage design exploration. We then develop a pathway towards reducing complexity by clustering nodes with respect to their feature vectors to reduce the number of unique connectors for design while minimizing the effects of mass standardization.
Submitted 19 November, 2021;
originally announced November 2021.