Search | arXiv e-print repository

Deployment of Deep Learning Model in Real World Clinical Setting: A Case Study in Obstetric Ultrasound

Authors: Chun Kit Wong, Mary Ngo, Manxi Lin, Zahra Bashir, Amihai Heen, Morten Bo Søndergaard Svendsen, Martin Grønnebæk Tolsgaard, Anders Nymark Christensen, Aasa Feragen

Abstract: Despite the rapid development of AI models in medical image analysis, their validation in real-world clinical settings remains limited. To address this, we introduce a generic framework designed for deploying image-based AI models in such settings. Using this framework, we deployed a trained model for fetal ultrasound standard plane detection, and evaluated it in real-time sessions with both novice and expert users. Feedback from these sessions revealed that while the model offers potential benefits to medical practitioners, the need for navigational guidance was identified as a key area for improvement. These findings underscore the importance of early deployment of AI models in real-world settings, leading to insights that can guide the refinement of the model and system based on actual user feedback.

Submitted 22 March, 2024; originally announced April 2024.

Comments: 10 pages

arXiv:2403.08700 [pdf, other]

Diffusion-based Iterative Counterfactual Explanations for Fetal Ultrasound Image Quality Assessment

Authors: Paraskevas Pegios, Manxi Lin, Nina Weng, Morten Bo Søndergaard Svendsen, Zahra Bashir, Siavash Bigdeli, Anders Nymark Christensen, Martin Tolsgaard, Aasa Feragen

Abstract: Obstetric ultrasound image quality is crucial for accurate diagnosis and monitoring of fetal health. However, producing high-quality standard planes is difficult, influenced by the sonographer's expertise and factors like the maternal BMI or the fetus dynamics. In this work, we propose using diffusion-based counterfactual explainable AI to generate realistic high-quality standard planes from low-quality non-standard ones. Through quantitative and qualitative evaluation, we demonstrate the effectiveness of our method in producing plausible counterfactuals of increased quality. This shows future promise both for enhancing training of clinicians by providing visual feedback, as well as for improving image quality and, consequently, downstream diagnosis and monitoring.

Submitted 13 March, 2024; originally announced March 2024.

arXiv:2403.06748 [pdf, other]

Shortcut Learning in Medical Image Segmentation

Authors: Manxi Lin, Nina Weng, Kamil Mikolaj, Zahra Bashir, Morten Bo Søndergaard Svendsen, Martin Tolsgaard, Anders Nymark Christensen, Aasa Feragen

Abstract: Shortcut learning is a phenomenon where machine learning models prioritize learning simple, potentially misleading cues from data that do not generalize well beyond the training set. While existing research primarily investigates this in the realm of image classification, this study extends the exploration of shortcut learning into medical image segmentation. We demonstrate that clinical annotations such as calipers, and the combination of zero-padded convolutions and center-cropped training sets in the dataset can inadvertently serve as shortcuts, impacting segmentation accuracy. We identify and evaluate the shortcut learning on two different but common medical image segmentation tasks. In addition, we suggest strategies to mitigate the influence of shortcut learning and improve the generalizability of the segmentation models. By uncovering the presence and implications of shortcuts in medical image segmentation, we provide insights and methodologies for evaluating and overcoming this pervasive challenge and call for attention in the community for shortcuts in segmentation. Our code is public at https://github.com/nina-weng/shortcut_skinseg .

Submitted 27 June, 2024; v1 submitted 11 March, 2024; originally announced March 2024.

Comments: 11 pages, 6 figures, accepted at MICCAI 2024

arXiv:2402.08294 [pdf, other]

Learning semantic image quality for fetal ultrasound from noisy ranking annotation

Authors: Manxi Lin, Jakob Ambsdorf, Emilie Pi Fogtmann Sejer, Zahra Bashir, Chun Kit Wong, Paraskevas Pegios, Alberto Raheli, Morten Bo Søndergaard Svendsen, Mads Nielsen, Martin Grønnebæk Tolsgaard, Anders Nymark Christensen, Aasa Feragen

Abstract: We introduce the notion of semantic image quality for applications where image quality relies on semantic requirements. Working in fetal ultrasound, where ranking is challenging and annotations are noisy, we design a robust coarse-to-fine model that ranks images based on their semantic image quality and endow our predicted rankings with an uncertainty estimate. To annotate rankings on training data, we design an efficient ranking annotation scheme based on the merge sort algorithm. Finally, we compare our ranking algorithm to a number of state-of-the-art ranking algorithms on a challenging fetal ultrasound quality assessment task, showing the superior performance of our method on the majority of rank correlation metrics.

Submitted 13 February, 2024; originally announced February 2024.

Comments: Extended version of the accepted paper at ISBI 2024

arXiv:2402.06600 [pdf, ps, other]

First-Order Fischer Servi Logic

Authors: Ahmee Christensen

Abstract: We prove the completeness of a first-order analogue of the Fischer Servi logic $\mathsf{FS}$ with respect to its expected birelational semantics. To this end we introduce the notion of the $\textit{trace model}$ and, much like in a canonical model argument, prove a truth lemma. We conclude by examining a number of other first-order Fischer Servi logics, including the first-order analogue of $\mathsf{FSS4}$, whose completeness can be similarly proved.

Submitted 9 February, 2024; originally announced February 2024.

Comments: 19 pages, 8 figures

MSC Class: 03B45; 03B20 ACM Class: F.4.1

arXiv:2401.13103 [pdf, other]

Self-organizing Nervous Systems for Robot Swarms

Authors: W. Zhu, S. Oguz, M. K. Heinrich, M. Allwright, M. Wahby, A. Lyhne Christensen, E. Garone, M. Dorigo

Abstract: The system architecture controlling a group of robots is generally set before deployment and can be either centralized or decentralized. This dichotomy is highly constraining, because decentralized systems are typically fully self-organized and therefore difficult to design analytically, whereas centralized systems have single points of failure and limited scalability. To address this dichotomy, we present the Self-organizing Nervous System (SoNS), a novel robot swarm architecture based on self-organized hierarchy. The SoNS approach enables robots to autonomously establish, maintain, and reconfigure dynamic multi-level system architectures. For example, a robot swarm consisting of $n$ independent robots could transform into a single $n$-robot SoNS and then into several independent smaller SoNSs, where each SoNS uses a temporary and dynamic hierarchy. Leveraging the SoNS approach, we show that sensing, actuation, and decision-making can be coordinated in a locally centralized way, without sacrificing the benefits of scalability, flexibility, and fault tolerance, for which swarm robotics is usually studied. In several proof-of-concept robot missions -- including binary decision-making and search-and-rescue -- we demonstrate that the capabilities of the SoNS approach greatly advance the state of the art in swarm robotics. The missions are conducted with a real heterogeneous aerial-ground robot swarm, using a custom-developed quadrotor platform. We also demonstrate the scalability of the SoNS approach in swarms of up to 250 robots in a physics-based simulator, and demonstrate several types of system fault tolerance in simulation and reality.

Submitted 23 January, 2024; originally announced January 2024.

Comments: 135 pages, 62 figures, and 14 embedded videos

arXiv:2310.19789 [pdf, other]

DiffEnc: Variational Diffusion with a Learned Encoder

Authors: Beatrix M. G. Nielsen, Anders Christensen, Andrea Dittadi, Ole Winther

Abstract: Diffusion models may be viewed as hierarchical variational autoencoders (VAEs) with two improvements: parameter sharing for the conditional distributions in the generative process and efficient computation of the loss as independent terms over the hierarchy. We consider two changes to the diffusion model that retain these advantages while adding flexibility to the model. Firstly, we introduce a data- and depth-dependent mean function in the diffusion process, which leads to a modified diffusion loss. Our proposed framework, DiffEnc, achieves a statistically significant improvement in likelihood on CIFAR-10. Secondly, we let the ratio of the noise variance of the reverse encoder process and the generative process be a free weight parameter rather than being fixed to 1. This leads to theoretical insights: For a finite depth hierarchy, the evidence lower bound (ELBO) can be used as an objective for a weighted diffusion loss approach and for optimizing the noise schedule specifically for inference. For the infinite-depth hierarchy, on the other hand, the weight parameter has to be 1 to have a well-defined ELBO.

Submitted 8 February, 2024; v1 submitted 30 October, 2023; originally announced October 2023.

arXiv:2308.10599 [pdf, other]

Image-free Classifier Injection for Zero-Shot Classification

Authors: Anders Christensen, Massimiliano Mancini, A. Sophia Koepke, Ole Winther, Zeynep Akata

Abstract: Zero-shot learning models achieve remarkable results on image classification for samples from classes that were not seen during training. However, such models must be trained from scratch with specialised methods: therefore, access to a training dataset is required when the need for zero-shot classification arises. In this paper, we aim to equip pre-trained models with zero-shot classification capabilities without the use of image data. We achieve this with our proposed Image-free Classifier Injection with Semantics (ICIS) that injects classifiers for new, unseen classes into pre-trained classification models in a post-hoc fashion without relying on image data. Instead, the existing classifier weights and simple class-wise descriptors, such as class names or attributes, are used. ICIS has two encoder-decoder networks that learn to reconstruct classifier weights from descriptors (and vice versa), exploiting (cross-)reconstruction and cosine losses to regularise the decoding process. Notably, ICIS can be cheaply trained and applied directly on top of pre-trained classification models. Experiments on benchmark ZSL datasets show that ICIS produces unseen classifier weights that achieve strong (generalised) zero-shot classification performance. Code is available at https://github.com/ExplainableML/ImageFreeZSL .

Submitted 21 August, 2023; originally announced August 2023.

Comments: Accepted at ICCV 2023

arXiv:2307.10865 [pdf, other]

Addressing caveats of neural persistence with deep graph persistence

Authors: Leander Girrbach, Anders Christensen, Ole Winther, Zeynep Akata, A. Sophia Koepke

Abstract: Neural Persistence is a prominent measure for quantifying neural network complexity, proposed in the emerging field of topological data analysis in deep learning. In this work, however, we find both theoretically and empirically that the variance of network weights and spatial concentration of large weights are the main factors that impact neural persistence. Whilst this captures useful information for linear classifiers, we find that no relevant spatial structure is present in later layers of deep neural networks, making neural persistence roughly equivalent to the variance of weights. Additionally, the proposed averaging procedure across layers for deep neural networks does not consider interaction between layers. Based on our analysis, we propose an extension of the filtration underlying neural persistence to the whole neural network instead of single layers, which is equivalent to calculating neural persistence on one particular matrix. This yields our deep graph persistence measure, which implicitly incorporates persistent paths through the network and alleviates variance-related issues through standardisation. Code is available at https://github.com/ExplainableML/Deep-Graph-Persistence .

Submitted 20 November, 2023; v1 submitted 20 July, 2023; originally announced July 2023.

Comments: Transactions on Machine Learning Research (TMLR), 2023

arXiv:2304.05463 [pdf, other]

doi 10.1007/978-3-031-44521-7_2

An Automatic Guidance and Quality Assessment System for Doppler Imaging of Umbilical Artery

Authors: Chun Kit Wong, Manxi Lin, Alberto Raheli, Zahra Bashir, Morten Bo Søndergaard Svendsen, Martin Grønnebæk Tolsgaard, Aasa Feragen, Anders Nymark Christensen

Abstract: Examination of the umbilical artery with Doppler ultrasonography is performed to investigate blood supply to the fetus through the umbilical cord, which is vital for the monitoring of fetal health. Such examination involves several steps that must be performed correctly: identifying suitable sites on the umbilical artery for the measurement, acquiring the blood flow curve in the form of a Doppler spectrum, and ensuring compliance to a set of quality standards. These steps rely heavily on the operator's skill, and the shortage of experienced sonographers has thus created a demand for machine assistance. In this work, we propose an automatic system to fill the gap. By using a modified Faster R-CNN network, we obtain an algorithm that can suggest locations suitable for Doppler measurement. Meanwhile, we have also developed a method for assessment of the Doppler spectrum's quality. The proposed system is validated on 657 images from a national ultrasound screening database, with results demonstrating its potential as a guidance system.

Submitted 6 July, 2023; v1 submitted 11 April, 2023; originally announced April 2023.

Comments: Fetal Ultrasound, Umbilical Artery, Doppler Ultrasound

Journal ref: ASMUS 2023. Simplifying Medical Ultrasound pp 13-22. Lecture Notes in Computer Science, vol 14337

arXiv:2211.10630 [pdf, other]

I saw, I conceived, I concluded: Progressive Concepts as Bottlenecks

Authors: Manxi Lin, Aasa Feragen, Zahra Bashir, Martin Grønnebæk Tolsgaard, Anders Nymark Christensen

Abstract: Concept bottleneck models (CBMs) include a bottleneck of human-interpretable concepts providing explainability and intervention during inference by correcting the predicted, intermediate concepts. This makes CBMs attractive for high-stakes decision-making. In this paper, we take the quality assessment of fetal ultrasound scans as a real-life use case for CBM decision support in healthcare. For this case, simple binary concepts are not sufficiently reliable, as they are mapped directly from images of highly variable quality, for which variable model calibration might lead to unstable binarized concepts. Moreover, scalar concepts do not provide the intuitive spatial feedback requested by users. To address this, we design a hierarchical CBM imitating the sequential expert decision-making process of "seeing", "conceiving" and "concluding". Our model first passes through a layer of visual, segmentation-based concepts, and next a second layer of property concepts directly associated with the decision-making task. We note that experts can intervene on both the visual and property concepts during inference. Additionally, we increase the bottleneck capacity by considering task-relevant concept interaction. Our application of ultrasound scan quality assessment is challenging, as it relies on balancing the (often poor) image quality against an assessment of the visibility and geometric properties of standardized image content. Our validation shows that -- in contrast with previous CBM models -- our CBM models actually outperform equivalent concept-free models in terms of predictive performance. Moreover, we illustrate how interventions can further improve our performance over the state-of-the-art.

Submitted 19 November, 2022; originally announced November 2022.

arXiv:2211.01406 [pdf, other]

Incorporating High-Frequency Weather Data into Consumption Expenditure Predictions

Authors: Anders Christensen, Joel Ferguson, Simón Ramírez Amaya

Abstract: Recent efforts have been very successful in accurately mapping welfare in data-sparse regions of the world using satellite imagery and other non-traditional data sources. However, the literature to date has focused on predicting a particular class of welfare measures, asset indices, which are relatively insensitive to short-term fluctuations in well-being. We suggest that predicting more volatile welfare measures, such as consumption expenditure, substantially benefits from the incorporation of data sources with high temporal resolution. By incorporating daily weather data into training and prediction, we improve consumption prediction accuracy significantly compared to models that only utilize satellite imagery.

Submitted 6 October, 2022; originally announced November 2022.

arXiv:2210.13230 [pdf, other]

An Experimental Study of Dimension Reduction Methods on Machine Learning Algorithms with Applications to Psychometrics

Authors: Sean H. Merritt, Alexander P. Christensen

Abstract: Developing interpretable machine learning models has become an increasingly important issue. One way in which data scientists have been able to develop interpretable models has been to use dimension reduction techniques. In this paper, we examine several dimension reduction techniques including two recent approaches developed in the network psychometrics literature called exploratory graph analysis (EGA) and unique variable analysis (UVA). We compared EGA and UVA with two other dimension reduction techniques common in the machine learning literature (principal component analysis and independent component analysis) as well as with no reduction of the variables on real data. We show that EGA and UVA perform as well as the other reduction techniques or no reduction. Consistent with previous literature, we show that dimension reduction can decrease, increase, or provide the same accuracy as no reduction of variables. Our tentative results find that dimension reduction tends to lead to better performance when used for classification tasks.

Submitted 21 March, 2023; v1 submitted 19 October, 2022; originally announced October 2022.

arXiv:2205.11115 [pdf, other]

DTU-Net: Learning Topological Similarity for Curvilinear Structure Segmentation

Authors: Manxi Lin, Zahra Bashir, Martin Grønnebæk Tolsgaard, Anders Nymark Christensen, Aasa Feragen

Abstract: Curvilinear structure segmentation is important in medical imaging, quantifying structures such as vessels, airways, neurons, or organ boundaries in 2D slices. Segmentation via pixel-wise classification often fails to capture the small and low-contrast curvilinear structures. Prior topological information is typically used to address this problem, often at an expensive computational cost, and sometimes requiring prior knowledge of the expected topology. We present DTU-Net, a data-driven approach to topology-preserving curvilinear structure segmentation. DTU-Net consists of two sequential, lightweight U-Nets, dedicated to texture and topology, respectively. While the texture net makes a coarse prediction using image texture information, the topology net learns topological information from the coarse prediction by employing a triplet loss trained to recognize false and missed splits in the structure. We conduct experiments on a challenging multi-class ultrasound scan segmentation dataset as well as a well-known retinal imaging dataset. Results show that our model outperforms existing approaches in both pixel-wise segmentation accuracy and topological continuity, with no need for prior topological knowledge.

Submitted 4 March, 2023; v1 submitted 23 May, 2022; originally announced May 2022.

Comments: 12 pages, 4 figures

arXiv:2203.11824 [pdf, other]

Was that so hard? Estimating human classification difficulty

Authors: Morten Rieger Hannemose, Josefine Vilsbøll Sundgaard, Niels Kvorning Ternov, Rasmus R. Paulsen, Anders Nymark Christensen

Abstract: When doctors are trained to diagnose a specific disease, they learn faster when presented with cases in order of increasing difficulty. This creates the need for automatically estimating how difficult it is for doctors to classify a given case. In this paper, we introduce methods for estimating how hard it is for a doctor to diagnose a case represented by a medical image, both when ground truth difficulties are available for training, and when they are not. Our methods are based on embeddings obtained with deep metric learning. Additionally, we introduce a practical method for obtaining ground truth human difficulty for each image case in a dataset using self-assessed certainty. We apply our methods to two different medical datasets, achieving high Kendall rank correlation coefficients, showing that we outperform existing methods by a large margin on our problem and data.

Submitted 22 March, 2022; originally announced March 2022.

Comments: 10 pages, 2 figures

arXiv:2202.03434 [pdf, other]

Multi-modal data generation with a deep metric variational autoencoder

Authors: Josefine Vilsbøll Sundgaard, Morten Rieger Hannemose, Søren Laugesen, Peter Bray, James Harte, Yosuke Kamide, Chiemi Tanaka, Rasmus R. Paulsen, Anders Nymark Christensen

Abstract: We present a deep metric variational autoencoder for multi-modal data generation. The variational autoencoder employs triplet loss in the latent space, which allows for conditional data generation by sampling in the latent space within each class cluster. The approach is evaluated on a multi-modal dataset consisting of otoscopy images of the tympanic membrane with corresponding wideband tympanometry measurements. The modalities in this dataset are correlated, as they represent different aspects of the state of the middle ear, but they do not present a direct pixel-to-pixel correlation. The approach shows promising results for the conditional generation of pairs of images and tympanograms, and will allow for efficient data augmentation of data from multi-modal sources.

Submitted 7 February, 2022; originally announced February 2022.

arXiv:2105.14655 [pdf, other]

doi 10.1073/pnas.2205221119

Informing Geometric Deep Learning with Electronic Interactions to Accelerate Quantum Chemistry

Authors: Zhuoran Qiao, Anders S. Christensen, Matthew Welborn, Frederick R. Manby, Anima Anandkumar, Thomas F. Miller III

Abstract: Predicting electronic energies, densities, and related chemical properties can facilitate the discovery of novel catalysts, medicines, and battery materials. By developing a physics-inspired equivariant neural network, we introduce a method to learn molecular representations based on the electronic interactions among atomic orbitals. Our method, OrbNet-Equi, leverages efficient tight-binding simulations and learned mappings to recover high fidelity quantum chemical properties. OrbNet-Equi models a wide spectrum of target properties with an accuracy consistently better than standard machine learning methods and a speed orders of magnitude greater than density functional theory. Despite only using training samples collected from readily available small-molecule libraries, OrbNet-Equi outperforms traditional methods on comprehensive downstream benchmarks that encompass diverse main-group chemical processes. Our method also describes interactions in challenging charge-transfer complexes and open-shell systems. We anticipate that the strategy presented here will help to expand opportunities for studies in chemistry and materials science, where the acquisition of experimental or reference training data is costly.

Submitted 1 April, 2022; v1 submitted 30 May, 2021; originally announced May 2021.

Journal ref: Proceedings of the National Academy of Sciences 119.31 (2022): e2205221119

arXiv:2103.00067 [pdf, other]

Partitioned Graph Convolution Using Adversarial and Regression Networks for Road Travel Speed Prediction

Authors: Jakob Meldgaard Kjær, Lasse Kristensen, Mads Alberg Christensen

Abstract: Access to quality travel time information for roads in a road network has become increasingly important with the rising demand for real-time travel time estimation for paths within road networks. In the context of the Danish road network (DRN) dataset used in this paper, the data coverage is sparse and skewed towards arterial roads, with a coverage of 23.88% across 850,980 road segments, which makes travel time estimation difficult. Existing solutions for graph-based data processing often neglect the size of the graph, which is an apparent problem for road networks with a large amount of connected road segments. To this end, we propose a framework for predicting road segment travel speed histograms for dataless edges, based on a latent representation generated by an adversarially regularized convolutional network. We apply a partitioning algorithm to divide the graph into dense subgraphs, and then train a model for each subgraph to predict speed histograms for the nodes. The framework achieves an accuracy of 71.5% intersection and 78.5% correlation on predicting travel speed histograms using the DRN dataset. Furthermore, experiments show that partitioning the dataset into clusters increases the performance of the framework. Specifically, partitioning the road network dataset into 100 clusters, with approximately 500 road segments in each cluster, achieves a better performance than when using 10 and 20 clusters.

Submitted 26 February, 2021; originally announced March 2021.

Comments: This thesis was completed 2020-06-12 and defended 2020-06-26

MSC Class: 68T05 ACM Class: I.2.6; G.2.2

arXiv:2008.01998 [pdf, other]

Optimal Variance Control of the Score Function Gradient Estimator for Importance Weighted Bounds

Authors: Valentin Liévin, Andrea Dittadi, Anders Christensen, Ole Winther

Abstract: This paper introduces novel results for the score function gradient estimator of the importance weighted variational bound (IWAE). We prove that in the limit of large $K$ (number of importance samples) one can choose the control variate such that the Signal-to-Noise ratio (SNR) of the estimator grows as $\sqrt{K}$. This is in contrast to the standard pathwise gradient estimator where the SNR decreases as $1/\sqrt{K}$. Based on our theoretical findings we develop a novel control variate that extends on VIMCO. Empirically, for the training of both continuous and discrete generative models, the proposed method yields superior variance reduction, resulting in an SNR for IWAE that increases with $K$ without relying on the reparameterization trick. The novel estimator is competitive with state-of-the-art reparameterization-free gradient estimators such as Reweighted Wake-Sleep (RWS) and the thermodynamic variational objective (TVO) when training generative models.

Submitted 8 December, 2020; v1 submitted 5 August, 2020; originally announced August 2020.

arXiv:2006.00084 [pdf, other]

Clustering-informed Cinematic Astrophysical Data Visualization with Application to the Moon-forming Terrestrial Synestia

Authors: Patrick D. Aleo, Simon J. Lock, Donna J. Cox, Stuart A. Levy, J. P. Naiman, A. J. Christensen, Kalina Borkiewicz, Robert Patterson

Abstract: Scientific visualization tools are currently not optimized to create cinematic, production-quality representations of numerical data for the purpose of science communication. In our pipeline \texttt{Estra}, we outline a step-by-step process from a raw simulation into a finished render as a way to teach non-experts in the field of visualization how to achieve production-quality outputs on their own. We demonstrate feasibility of using the visual effects software Houdini for cinematic astrophysical data visualization, informed by machine learning clustering algorithms. To demonstrate the capabilities of this pipeline, we used a post-impact, thermally-equilibrated Moon-forming synestia from \cite{Lock18}. Our approach aims to identify "physically interpretable" clusters, where clusters identified in an appropriate phase space (e.g. here we use a temperature-entropy phase-space) correspond to physically meaningful structures within the simulation data. Clustering results can then be used to highlight these structures by informing the color-mapping process in a simplified Houdini software shading network, where dissimilar phase-space clusters are mapped to different color values for easier visual identification. Cluster information can also be used in 3D position space, via Houdini's Scene View, to aid in physical cluster finding, simulation prototyping, and data exploration. Our clustering-based renders are compared to those created by the Advanced Visualization Lab (AVL) team for the full dome show "Imagine the Moon" as proof of concept. With \texttt{Estra}, scientists have a tool to create their own production-quality, data-driven visualizations.

Submitted 29 May, 2020; originally announced June 2020.

Comments: 19 pages, 16 figures, submitted to MNRAS

arXiv:1905.12321 [pdf, other]

doi 10.1016/j.cageo.2020.104643

Complex-valued neural networks for machine learning on non-stationary physical data

Authors: Jesper Sören Dramsch, Mikael Lüthje, Anders Nymark Christensen

Abstract: Deep learning has become an area of interest in most scientific areas, including physical sciences. Modern networks apply real-valued transformations on the data. Particularly, convolutions in convolutional neural networks discard phase information entirely. Many deterministic signals, such as seismic data or electrical signals, contain significant information in the phase of the signal. We explore complex-valued deep convolutional networks to leverage non-linear feature maps. Seismic data commonly has a lowcut filter applied, to attenuate noise from ocean waves and similar long wavelength contributions. Discarding the phase information leads to low-frequency aliasing analogous to the Nyquist-Shannon theorem for high frequencies. In non-stationary data, the phase content can stabilize training and improve the generalizability of neural networks. While it has been shown that phase content can be restored in deep neural networks, we show how including phase information in feature maps improves both training and inference from deterministic physical data. Furthermore, we show that the reduction of parameters in a complex network outperforms larger real-valued networks.

Submitted 26 November, 2019; v1 submitted 29 May, 2019; originally announced May 2019.

Comments: 17 pages total, 15 pages, 2 pages references, paper, 11 figures, 28 networks

arXiv:1901.03309 [pdf, other]

doi 10.1063/1.5088393

A Universal Density Matrix Functional from Molecular Orbital-Based Machine Learning: Transferability across Organic Molecules

Authors: Lixue Cheng, Matthew Welborn, Anders S. Christensen, Thomas F. Miller III

Abstract: We address the degree to which machine learning can be used to accurately and transferably predict post-Hartree-Fock correlation energies. Refined strategies for feature design and selection are presented, and the molecular-orbital-based machine learning (MOB-ML) method is applied to several test systems. Strikingly, for the MP2, CCSD, and CCSD(T) levels of theory, it is shown that the thermally accessible (350 K) potential energy surface for a single water molecule can be described to within 1 millihartree using a model that is trained from only a single reference calculation at a randomized geometry. To explore the breadth of chemical diversity that can be described, MOB-ML is also applied to a new dataset of thermalized (350 K) geometries of 7211 organic models with up to seven heavy atoms. In comparison with the previously reported $Δ$-ML method, MOB-ML is shown to reach chemical accuracy with three-fold fewer training geometries. Finally, a transferability test in which models trained for seven-heavy-atom systems are used to predict energies for thirteen-heavy-atom systems reveals that MOB-ML reaches chemical accuracy with 36-fold fewer training calculations than $Δ$-ML (140 versus 5000 training calculations).

Submitted 4 April, 2019; v1 submitted 10 January, 2019; originally announced January 2019.

Comments: 8 pages, 3 figures

Journal ref: J. Chem. Phys. 150, 131103 (2019)

arXiv:1511.03154 [pdf, other]

doi 10.1371/journal.pone.0151834

Evolution of Collective Behaviors for a Real Swarm of Aquatic Surface Robots

Authors: Miguel Duarte, Vasco Costa, Jorge Gomes, Tiago Rodrigues, Fernando Silva, Sancho Moura Oliveira, Anders Lyhne Christensen

Abstract: Swarm robotics is a promising approach for the coordination of large numbers of robots. While previous studies have shown that evolutionary robotics techniques can be applied to obtain robust and efficient self-organized behaviors for robot swarms, most studies have been conducted in simulation, and the few that have been conducted on real robots have been confined to laboratory environments. In this paper, we demonstrate for the first time a swarm robotics system with evolved control successfully operating in a real and uncontrolled environment. We evolve neural network-based controllers in simulation for canonical swarm robotics tasks, namely homing, dispersion, clustering, and monitoring. We then assess the performance of the controllers on a real swarm of up to ten aquatic surface robots. Our results show that the evolved controllers transfer successfully to real robots and achieve a performance similar to the performance obtained in simulation. We validate that the evolved controllers display key properties of swarm intelligence-based control, namely scalability, flexibility, and robustness on the real swarm. We conclude with a proof-of-concept experiment in which the swarm performs a complete environmental monitoring task by combining multiple evolved controllers.

Submitted 2 February, 2016; v1 submitted 10 November, 2015; originally announced November 2015.

Comments: 31 pages, 15 figures, journal

Journal ref: PLoS ONE 11(3), 2016, pp. e0151834

arXiv:1505.07050 [pdf, other]

Virtual Nervous Systems for Self-Assembling Robots - A preliminary report

Authors: Nithin Mathews, Anders Lyhne Christensen, Rehan O'Grady, Marco Dorigo

Abstract: We define the nervous system of a robot as the processing unit responsible for controlling the robot body, together with the links between the processing unit and the sensorimotor hardware of the robot - i.e., the equivalent of the central nervous system in biological organisms. We present autonomous robots that can merge their nervous systems when they physically connect to each other, creating a "virtual nervous system" (VNS). We show that robots with a VNS have capabilities beyond those found in any existing robotic system or biological organism: they can merge into larger bodies with a single brain (i.e., processing unit), split into separate bodies with independent brains, and temporarily acquire sensing and actuating capabilities of specialized peer robots. VNS-based robots can also self-heal by removing or replacing malfunctioning body parts, including the brain.

Submitted 26 May, 2015; originally announced May 2015.

arXiv:1407.0577 [pdf, other]

doi 10.7551/978-0-262-32621-6-ch036

Systematic Derivation of Behaviour Characterisations in Evolutionary Robotics

Authors: Jorge Gomes, Pedro Mariano, Anders Lyhne Christensen

Abstract: Evolutionary techniques driven by behavioural diversity, such as novelty search, have shown significant potential in evolutionary robotics. These techniques rely on priorly specified behaviour characterisations to estimate the similarity between individuals. Characterisations are typically defined in an ad hoc manner based on the experimenter's intuition and knowledge about the task. Alternatively, generic characterisations based on the sensor-effector values of the agents are used. In this paper, we propose a novel approach that allows for systematic derivation of behaviour characterisations for evolutionary robotics, based on a formal description of the agents and their environment. Systematically derived behaviour characterisations (SDBCs) go beyond generic characterisations in that they can contain task-specific features related to the internal state of the agents, environmental features, and relations between them. We evaluate SDBCs with novelty search in three simulated collective robotics tasks. Our results show that SDBCs yield a performance comparable to the task-specific characterisations, in terms of both solution quality and behaviour space exploration.

Submitted 2 July, 2014; originally announced July 2014.

Comments: To appear in 14th International Conference on the Synthesis and Simulation of Living Systems (ALife 14)

Journal ref: International Conference on the Synthesis and Simulation of Living Systems (ALife). pp. 212-219. MIT Press (2014)

arXiv:1407.0576 [pdf, other]

doi 10.1007/978-3-319-10762-2_23

Novelty Search in Competitive Coevolution

Authors: Jorge Gomes, Pedro Mariano, Anders Lyhne Christensen

Abstract: One of the main motivations for the use of competitive coevolution systems is their ability to capitalise on arms races between competing species to evolve increasingly sophisticated solutions. Such arms races can, however, be hard to sustain, and it has been shown that the competing species often converge prematurely to certain classes of behaviours. In this paper, we investigate if and how novelty search, an evolutionary technique driven by behavioural novelty, can overcome convergence in coevolution. We propose three methods for applying novelty search to coevolutionary systems with two species: (i) score both populations according to behavioural novelty; (ii) score one population according to novelty, and the other according to fitness; and (iii) score both populations with a combination of novelty and fitness. We evaluate the methods in a predator-prey pursuit task. Our results show that novelty-based approaches can evolve a significantly more diverse set of solutions, when compared to traditional fitness-based coevolution.

Submitted 2 July, 2014; originally announced July 2014.

Comments: To appear in 13th International Conference on Parallel Problem Solving from Nature (PPSN 2014)

Journal ref: Parallel Problem Solving from Nature (PPSN). vol. 8672 LNCS. pp. 233-242. Springer (2014)

arXiv:1304.3393 [pdf, other]

doi 10.1145/2463372.2463398

Generic Behaviour Similarity Measures for Evolutionary Swarm Robotics

Authors: Jorge Gomes, Anders Lyhne Christensen

Abstract: Novelty search has shown to be a promising approach for the evolution of controllers for swarm robotics. In existing studies, however, the experimenter had to craft a domain dependent behaviour similarity measure to use novelty search in swarm robotics applications. The reliance on hand-crafted similarity measures places an additional burden to the experimenter and introduces a bias in the evolutionary process. In this paper, we propose and compare two task-independent, generic behaviour similarity measures: combined state count and sampled average state. The proposed measures use the values of sensors and effectors recorded for each individual robot of the swarm. The characterisation of the group-level behaviour is then obtained by combining the sensor-effector values from all the robots. We evaluate the proposed measures in an aggregation task and in a resource sharing task. We show that the generic measures match the performance of domain dependent measures in terms of solution quality. Our results indicate that the proposed generic measures operate as effective behaviour similarity measures, and that it is possible to leverage the benefits of novelty search without having to craft domain specific similarity measures.

Submitted 11 April, 2013; originally announced April 2013.

Comments: Initial submission. Final version to appear in GECCO 2013 and dl.acm.org

Journal ref: Genetic and Evolutionary Computation Conference (GECCO). pp. 199-206. ACM Press (2013)

arXiv:1304.3362 [pdf, other]

doi 10.1007/s11721-013-0081-z

Evolution of Swarm Robotics Systems with Novelty Search

Authors: Jorge Gomes, Paulo Urbano, Anders Lyhne Christensen

Abstract: Novelty search is a recent artificial evolution technique that challenges traditional evolutionary approaches. In novelty search, solutions are rewarded based on their novelty, rather than their quality with respect to a predefined objective. The lack of a predefined objective precludes premature convergence caused by a deceptive fitness function. In this paper, we apply novelty search combined with NEAT to the evolution of neural controllers for homogeneous swarms of robots. Our empirical study is conducted in simulation, and we use a common swarm robotics task - aggregation, and a more challenging task - sharing of an energy recharging station. Our results show that novelty search is unaffected by deception, is notably effective in bootstrapping the evolution, can find solutions with lower complexity than fitness-based evolution, and can find a broad diversity of solutions for the same task. Even in non-deceptive setups, novelty search achieves solution qualities similar to those obtained in traditional fitness-based evolution. Our study also encompasses variants of novelty search that work in concert with fitness-based evolution to combine the exploratory character of novelty search with the exploitatory character of objective-based evolution. We show that these variants can further improve the performance of novelty search. Overall, our study shows that novelty search is a promising alternative for the evolution of controllers for robotic swarms.

Submitted 11 April, 2013; originally announced April 2013.

Comments: To appear in Swarm Intelligence (2013), ANTS Special Issue. The final publication will be available at link.springer.com

Journal ref: Swarm Intelligence 7 (2-3). pp. 115-144 (2013)
