-
ROCOv2: Radiology Objects in COntext Version 2, an Updated Multimodal Image Dataset
Authors:
Johannes Rückert,
Louise Bloch,
Raphael Brüngel,
Ahmad Idrissi-Yaghir,
Henning Schäfer,
Cynthia S. Schmidt,
Sven Koitka,
Obioma Pelka,
Asma Ben Abacha,
Alba G. Seco de Herrera,
Henning Müller,
Peter A. Horn,
Felix Nensa,
Christoph M. Friedrich
Abstract:
Automated medical image analysis systems often require large amounts of training data with high-quality labels, which are difficult and time-consuming to generate. This paper introduces Radiology Objects in COntext version 2 (ROCOv2), a multimodal dataset consisting of radiological images and associated medical concepts and captions extracted from the PMC Open Access subset. It is an updated version of the ROCO dataset published in 2018, and adds 35,705 new images that have been added to PMC since 2018. It further provides manually curated concepts for imaging modalities with additional anatomical and directional concepts for X-rays. The dataset consists of 79,789 images and has been used, with minor modifications, in the concept detection and caption prediction tasks of ImageCLEFmedical Caption 2023. The dataset is suitable for training image annotation models based on image-caption pairs, or for multi-label image classification using Unified Medical Language System (UMLS) concepts provided with each image. In addition, it can serve for pre-training of medical domain models, and evaluation of deep learning models for multi-task learning.
Submitted 18 June, 2024; v1 submitted 16 May, 2024;
originally announced May 2024.
-
Engaging Young Learners with Testing Using the Code Critters Mutation Game
Authors:
Philipp Straubinger,
Lena Bloch,
Gordon Fraser
Abstract:
Everyone learns to code nowadays. Writing code, however, does not go without testing, which unfortunately rarely seems to be taught explicitly. Testing is often not deemed important enough or is just not perceived as sufficiently exciting. Testing can be exciting: In this paper, we introduce Code Critters, a serious game designed to teach testing concepts engagingly. In the style of popular tower defense games, players strategically position magical portals that need to distinguish creatures exhibiting the behavior described by correct code from those that are mutated, and thus faulty. When placing portals, players are implicitly testing: They choose test inputs (i.e., where to place portals), as well as test oracles (i.e., what behavior to expect), and they observe test executions as the creatures wander across the landscape passing the players' portals. An empirical study involving 40 children demonstrates that they actively engage with Code Critters. Their positive feedback provides evidence that they enjoyed playing the game, and some of the children even continued to play Code Critters at home, outside the educational setting of our study.
Submitted 14 April, 2024;
originally announced April 2024.
-
PreprintResolver: Improving Citation Quality by Resolving Published Versions of ArXiv Preprints using Literature Databases
Authors:
Louise Bloch,
Johannes Rückert,
Christoph M. Friedrich
Abstract:
The growing impact of preprint servers enables the rapid sharing of time-sensitive research. Likewise, it is becoming increasingly difficult to distinguish high-quality, peer-reviewed research from preprints. Although preprints are often later published in peer-reviewed journals, this information is often missing from preprint servers. To overcome this problem, the PreprintResolver was developed, which uses four literature databases (DBLP, SemanticScholar, OpenAlex, and CrossRef / CrossCite) to identify preprint-publication pairs for the arXiv preprint server. The target audience focuses on, but is not limited to, inexperienced researchers and students, especially from the field of computer science. The tool is based on a fuzzy matching of author surnames, titles, and DOIs. Experiments were performed on a sample of 1,000 arXiv-preprints from the research field of computer science and without any publication information. With 77.94 %, computer science is highly affected by missing publication information in arXiv. The results show that the PreprintResolver was able to resolve 603 out of 1,000 (60.3 %) arXiv-preprints from the research field of computer science and without any publication information. All four literature databases contributed to the final result. In a manual validation, a random sample of 100 resolved preprints was checked. For all preprints, at least one result is plausible. For nine preprints, more than one result was identified, three of which are partially invalid. In conclusion, the PreprintResolver is suitable for individual, manually reviewed requests, but less suitable for bulk requests. The PreprintResolver tool (https://preprintresolver.eu, Available from 2023-08-01) and source code (https://gitlab.com/ippolis_wp3/preprint-resolver, Accessed: 2023-07-19) are available online.
Submitted 4 September, 2023;
originally announced September 2023.
-
Why is the winner the best?
Authors:
Matthias Eisenmann,
Annika Reinke,
Vivienn Weru,
Minu Dietlinde Tizabi,
Fabian Isensee,
Tim J. Adler,
Sharib Ali,
Vincent Andrearczyk,
Marc Aubreville,
Ujjwal Baid,
Spyridon Bakas,
Niranjan Balu,
Sophia Bano,
Jorge Bernal,
Sebastian Bodenstedt,
Alessandro Casella,
Veronika Cheplygina,
Marie Daum,
Marleen de Bruijne,
Adrien Depeursinge,
Reuben Dorent,
Jan Egger,
David G. Ellis,
Sandy Engelhardt,
Melanie Ganz
, et al. (100 additional authors not shown)
Abstract:
International benchmarking competitions have become fundamental for the comparative performance assessment of image analysis methods. However, little attention has been given to investigating what can be learnt from these competitions. Do they really generate scientific progress? What are common and successful participation strategies? What makes a solution superior to a competing method? To address this gap in the literature, we performed a multi-center study with all 80 competitions that were conducted in the scope of IEEE ISBI 2021 and MICCAI 2021. Statistical analyses performed based on comprehensive descriptions of the submitted algorithms linked to their rank as well as the underlying participation strategies revealed common characteristics of winning solutions. These typically include the use of multi-task learning (63%) and/or multi-stage pipelines (61%), and a focus on augmentation (100%), image preprocessing (97%), data curation (79%), and postprocessing (66%). The "typical" lead of a winning team is a computer scientist with a doctoral degree, five years of experience in biomedical image analysis, and four years of experience in deep learning. Two core general development strategies stood out for highly-ranked teams: the reflection of the metrics in the method design and the focus on analyzing and handling failure cases. According to the organizers, 43% of the winning algorithms exceeded the state of the art but only 11% completely solved the respective domain problem. The insights of our study could help researchers (1) improve algorithm development strategies when approaching new problems, and (2) focus on open research questions revealed by this work.
Submitted 30 March, 2023;
originally announced March 2023.
-
Biomedical image analysis competitions: The state of current participation practice
Authors:
Matthias Eisenmann,
Annika Reinke,
Vivienn Weru,
Minu Dietlinde Tizabi,
Fabian Isensee,
Tim J. Adler,
Patrick Godau,
Veronika Cheplygina,
Michal Kozubek,
Sharib Ali,
Anubha Gupta,
Jan Kybic,
Alison Noble,
Carlos Ortiz de Solórzano,
Samiksha Pachade,
Caroline Petitjean,
Daniel Sage,
Donglai Wei,
Elizabeth Wilden,
Deepak Alapatt,
Vincent Andrearczyk,
Ujjwal Baid,
Spyridon Bakas,
Niranjan Balu,
Sophia Bano
, et al. (331 additional authors not shown)
Abstract:
The number of international benchmarking competitions is steadily increasing in various fields of machine learning (ML) research and practice. So far, however, little is known about the common practice as well as bottlenecks faced by the community in tackling the research questions posed. To shed light on the status quo of algorithm development in the specific field of biomedical imaging analysis, we designed an international survey that was issued to all participants of challenges conducted in conjunction with the IEEE ISBI 2021 and MICCAI 2021 conferences (80 competitions in total). The survey covered participants' expertise and working environments, their chosen strategies, as well as algorithm characteristics. A median of 72% of challenge participants took part in the survey. According to our results, knowledge exchange was the primary incentive (70%) for participation, while the reception of prize money played only a minor role (16%). While a median of 80 working hours was spent on method development, a large portion of participants stated that they did not have enough time for method development (32%). 25% perceived the infrastructure to be a bottleneck. Overall, 94% of all solutions were deep learning-based. Of these, 84% were based on standard architectures. 43% of the respondents reported that the data samples (e.g., images) were too large to be processed at once. This was most commonly addressed by patch-based training (69%), downsampling (37%), and solving 3D analysis tasks as a series of 2D tasks. K-fold cross-validation on the training set was performed by only 37% of the participants and only 50% of the participants performed ensembling based on multiple identical models (61%) or heterogeneous models (39%). 48% of the respondents applied postprocessing steps.
Submitted 12 September, 2023; v1 submitted 16 December, 2022;
originally announced December 2022.
-
Machine Learning Workflow to Explain Black-box Models for Early Alzheimer's Disease Classification Evaluated for Multiple Datasets
Authors:
Louise Bloch,
Christoph M. Friedrich
Abstract:
Purpose: Hard-to-interpret Black-box Machine Learning (ML) models were often used for early Alzheimer's Disease (AD) detection.
Methods: To interpret eXtreme Gradient Boosting (XGBoost), Random Forest (RF), and Support Vector Machine (SVM) black-box models a workflow based on Shapley values was developed. All models were trained on the Alzheimer's Disease Neuroimaging Initiative (ADNI) dataset and evaluated for an independent ADNI test set, as well as the external Australian Imaging and Lifestyle flagship study of Ageing (AIBL), and Open Access Series of Imaging Studies (OASIS) datasets. Shapley values were compared to intuitively interpretable Decision Trees (DTs), and Logistic Regression (LR), as well as natural and permutation feature importances. To avoid the reduction of the explanation validity caused by correlated features, forward selection and aspect consolidation were implemented.
Results: Some black-box models outperformed DTs and LR. The forward-selected features correspond to brain areas previously associated with AD. Shapley values identified biologically plausible associations with moderate to strong correlations with feature importances. The most important RF features to predict AD conversion were the volume of the amygdalae, and a cognitive test score. Good cognitive test performances and large brain volumes decreased the AD risk. The models trained using cognitive test scores significantly outperformed brain volumetric models ($p<0.05$). Cognitive Normal (CN) vs. AD models were successfully transferred to external datasets.
Conclusion: In comparison to previous work, improved performances for ADNI and AIBL were achieved for CN vs. Mild Cognitive Impairment (MCI) classification using brain volumes. The Shapley values and the feature importances showed moderate to strong correlations.
Submitted 5 November, 2022; v1 submitted 12 May, 2022;
originally announced May 2022.
-
Boosting EfficientNets Ensemble Performance via Pseudo-Labels and Synthetic Images by pix2pixHD for Infection and Ischaemia Classification in Diabetic Foot Ulcers
Authors:
Louise Bloch,
Raphael Brüngel,
Christoph M. Friedrich
Abstract:
Diabetic foot ulcers are a common manifestation of lesions on the diabetic foot, a syndrome acquired as a long-term complication of diabetes mellitus. Accompanying neuropathy and vascular damage promote acquisition of pressure injuries and tissue death due to ischaemia. Affected areas are prone to infections, hindering the healing progress. The research at hand investigates an approach on classification of infection and ischaemia, conducted as part of the Diabetic Foot Ulcer Challenge (DFUC) 2021. Different models of the EfficientNet family are utilized in ensembles. An extension strategy for the training data is applied, involving pseudo-labeling for unlabeled images, and extensive generation of synthetic images via pix2pixHD to cope with severe class imbalances. The resulting extended training dataset features $8.68$ times the size of the baseline and shows a real to synthetic image ratio of $1:3$. Performances of models and ensembles trained on the baseline and extended training dataset are compared. Synthetic images featured a broad qualitative variety. Results show that models trained on the extended training dataset as well as their ensemble benefit from the large extension. F1-Scores for rare classes receive outstanding boosts, while those for common classes are either not harmed or boosted moderately. A critical discussion concretizes benefits and identifies limitations, suggesting improvements. The work concludes that classification performance of individual models as well as that of ensembles can be boosted utilizing synthetic images. Especially performance for rare classes benefits notably.
Submitted 30 November, 2021;
originally announced December 2021.
-
Privacy-preserving methods for smart-meter-based network simulations
Authors:
Jordan Holweger,
Lionel Bloch,
Christophe Ballif,
Nicolas Wyrsch
Abstract:
Smart-meters are a key component of energy transition. The large amount of data collected in near real-time allows grid operators to observe and simulate network states. However, privacy-preserving rules forbid the use of such data for any applications other than network operation and billing. Smart-meter measurements must be anonymised to transmit these sensitive data to a third party to perform network simulation and analysis. This work proposes two methods for data anonymisation that enable the use of raw active power measurements for network simulation and analysis. The first is based on an allocation of an externally sourced load database. The second consists of grouping smart-meter data with similar electric characteristics, then performing a random permutation of the network load-bus assignment. A benchmark of these two methods highlights that both provide similar results in bus-voltage magnitude estimation concerning ground-truth voltage.
Submitted 7 October, 2021;
originally announced October 2021.
-
Code Perfumes: Reporting Good Code to Encourage Learners
Authors:
Florian Obermüller,
Lena Bloch,
Luisa Greifenstein,
Ute Heuer,
Gordon Fraser
Abstract:
Block-based programming languages like Scratch enable children to be creative while learning to program. Even though the block-based approach simplifies the creation of programs, learning to program can nevertheless be challenging. Automated tools such as linters therefore support learners by providing feedback about potential bugs or code smells in their programs. Even when this feedback is elaborate and constructive, it still represents purely negative criticism and by construction ignores what learners have done correctly in their programs. In this paper we introduce an orthogonal approach to linting: We complement the criticism produced by a linter with positive feedback. We introduce the concept of code perfumes as the counterpart to code smells, indicating the correct application of programming practices considered to be good. By analysing not only what learners did wrong but also what they did right we hope to encourage learners, to provide teachers and students a better understanding of learners' progress, and to support the adoption of automated feedback tools. Using a catalogue of 25 code perfumes for Scratch, we empirically demonstrate that these represent frequent practices in Scratch, and we find that better programs indeed contain more code perfumes.
Submitted 13 August, 2021;
originally announced August 2021.
-
A gold complex single crystal comprised of nanoporosity and curved surfaces
Authors:
Maria Koifman Khristosov,
Leonid Bloch,
Manfred Burghammer,
Paul Zaslansky,
Yaron Kauffmann,
Alex Katsman,
Boaz Pokroy
Abstract:
Complex hierarchical shapes are widely known in biogenic single crystals, but growing intricate synthetic metal single crystals is still a challenge. Here we report on a simple method for growing intricately shaped single crystals of gold, each consisting of a micron-sized crystal surrounded by a nanoporous structure, while the two parts comprise a single crystal. This is achieved by annealing thin films of gold and germanium to solidify a eutectic composition melt at a hypoeutectic concentration (Au-enriched composition). Transmission electron microscopy and synchrotron submicron scanning diffractometry and imaging confirm that the whole structure was indeed a single crystal. A kinetic model showing how this intricate single-crystal structure can be grown is presented.
Submitted 29 December, 2019;
originally announced January 2020.
-
Mitigating the impact of distributed PV in a low-voltage grid using electricity tariffs
Authors:
Jordan Holweger,
Lionel Bloch,
Christophe Ballif,
Nicolas Wyrsch
Abstract:
A high share of distributed photovoltaic (PV) generation in low-voltage networks may lead to over-voltage, and line/transformer overloading. To mitigate these issues, we investigate how advanced electricity tariffs could ensure safe grid operation while enabling building owners to recover their investment in a PV and storage system. We show that dynamic volumetric electricity prices trigger economic opportunities for large investments in PV and battery capacity but lead to more pressure on the grid, while capacity and block rate tariffs mitigate over-voltage and decrease line loading issues. However, block rate tariffs significantly decrease the optimal PV installation size.
Submitted 22 October, 2019;
originally announced October 2019.
-
Unsupervised algorithm for disaggregating low-sampling-rate electricity consumption of households
Authors:
Jordan Holweger,
Marina Dorokhova,
Lionel Bloch,
Christophe Ballif,
Nicolas Wyrsch
Abstract:
Non-intrusive load monitoring (NILM) has been extensively researched over the last decade. The objective of NILM is to identify the power consumption of individual appliances and to detect when particular devices are on or off from measuring the power consumption of an entire house. This information allows households to receive customized advice on how to better manage their electrical consumption. In this paper, we present an alternative NILM method that breaks down the aggregated power signal into categories of appliances. The ultimate goal is to use this approach for demand-side management to estimate potential flexibility within the electricity consumption of households. Our method is implemented as an algorithm combining NILM and load profile simulation. This algorithm, based on a Markov model, allocates an activity chain to each inhabitant of the household, deduces the appliance usage from the whole-house power measurement and statistical data, generates the power profile accordingly, and finally returns the share of energy consumed by each appliance category over time. To analyze its performance, the algorithm was benchmarked against several state-of-the-art NILM algorithms and tested on three public datasets. The proposed algorithm is unsupervised; hence it does not require any labeled data, which are expensive to acquire. Although better performance is shown for the supervised algorithms, our proposed unsupervised algorithm achieves a similar range of uncertainty while saving on the cost of acquiring labeled data. Additionally, our method requires lower computational power compared to most of the tested NILM algorithms. It was designed for low-sampling-rate power measurement (every 15 min), which corresponds to the frequency range of most common smart meters.
Submitted 20 August, 2019;
originally announced August 2019.
-
Sponge like nanoporous single crystals of gold
Authors:
Maria Koifman Khristosov,
Leonid Bloch,
Manfred Burghammer,
Yaron Kauffmann,
Alex Katsman,
Boaz Pokroy
Abstract:
Single crystals in nature often demonstrate fascinating intricate porous morphologies rather than classical faceted surfaces. We attempt to grow such crystals, drawing inspiration from biogenic porous single crystals. Here we show that nanoporous single crystals of gold can be grown with no need for any elaborate fabrication steps. These crystals are found to grow following solidification of a eutectic composition melt that forms as a result of the dewetting of nanometric thin films. We also present a kinetic model that shows how this nanoporous single-crystalline structure can be obtained, and which allows the potential size of the porous single crystal to be predicted. Retaining their single crystalline nature is due to the fact that the full crystallization process is faster than the average period between two subsequent nucleation events. Our findings clearly demonstrate that it is possible to form single-crystalline nanoporous metal crystals in a controlled manner.
Submitted 18 September, 2016;
originally announced September 2016.
-
Size Effect on the Short Range Order in Amorphous Materials
Authors:
Leonid Bloch,
Yaron Kauffmann,
Boaz Pokroy
Abstract:
Drawing inspiration from nature, where some organisms can control the short range order of amorphous minerals, we successfully manipulated the short range order of amorphous alumina by surface and size effects. By utilizing the Atomic Layer Deposition (ALD) method to grow amorphous nanometrically thin films, combined with state-of-the-art electron energy loss spectroscopy (EELS) and X-ray photoelectron spectroscopy (XPS), we showed experimentally that the short range order in such films is strongly influenced by size. This phenomenon is equivalent to the well-known size effect on lattice parameters and on the relative stability of different polymorphs in crystalline materials. We also show that the short range order changes while still in the amorphous phase, before the amorphous to crystalline transformation takes place.
Submitted 16 February, 2014;
originally announced February 2014.