Search | arXiv e-print repository

Assessing the Performance of Deep Learning for Automated Gleason Grading in Prostate Cancer

Authors: Dominik Müller, Philip Meyer, Lukas Rentschler, Robin Manz, Daniel Hieber, Jonas Bäcker, Samantha Cramer, Christoph Wengenmayr, Bruno Märkl, Ralf Huss, Frank Kramer, Iñaki Soto-Rey, Johannes Raffler

Abstract: Prostate cancer is a dominant health concern calling for advanced diagnostic tools. Utilizing digital pathology and artificial intelligence, this study explores the potential of 11 deep neural network architectures for automated Gleason grading in prostate carcinoma, focusing on comparing traditional and recent architectures. A standardized image classification pipeline, based on the AUCMEDI framework, facilitated robust evaluation using an in-house dataset consisting of 34,264 annotated tissue tiles. The results indicated varying sensitivity across architectures, with ConvNeXt demonstrating the strongest performance. Notably, newer architectures achieved superior performance, albeit with challenges in differentiating closely related Gleason grades. The ConvNeXt model was capable of learning a balance between complexity and generalizability. Overall, this study lays the groundwork for enhanced Gleason grading systems, potentially improving diagnostic efficiency for prostate cancer.

Submitted 25 March, 2024; originally announced March 2024.

arXiv:2403.16678 [pdf]

DeepGleason: a System for Automated Gleason Grading of Prostate Cancer using Deep Neural Networks

Authors: Dominik Müller, Philip Meyer, Lukas Rentschler, Robin Manz, Jonas Bäcker, Samantha Cramer, Christoph Wengenmayr, Bruno Märkl, Ralf Huss, Iñaki Soto-Rey, Johannes Raffler

Abstract: Advances in digital pathology and artificial intelligence (AI) offer promising opportunities for clinical decision support and enhancing diagnostic workflows. Previous studies already demonstrated AI's potential for automated Gleason grading, but lack state-of-the-art methodology and model reusability. To address this issue, we propose DeepGleason: an open-source deep neural network based image classification system for automated Gleason grading using whole-slide histopathology images from prostate tissue sections. Implemented with the standardized AUCMEDI framework, our tool employs a tile-wise classification approach utilizing fine-tuned image preprocessing techniques in combination with a ConvNeXt architecture which was compared to various state-of-the-art architectures. The neural network model was trained and validated on an in-house dataset of 34,264 annotated tiles from 369 prostate carcinoma slides. We demonstrated that DeepGleason is capable of highly accurate and reliable Gleason grading with a macro-averaged F1-score of 0.806, AUC of 0.991, and Accuracy of 0.974. The internal architecture comparison revealed that the ConvNeXt model was superior performance-wise on our dataset to established and other modern architectures like transformers. Furthermore, we were able to outperform the current state-of-the-art in tile-wise fine-classification with a sensitivity and specificity of 0.94 and 0.98 for benign vs malignant detection as well as of 0.91 and 0.75 for Gleason 3 vs Gleason 4 & 5 classification, respectively.
Our tool contributes to the wider adoption of AI-based Gleason grading within the research community and paves the way for broader clinical application of deep learning models in digital pathology. DeepGleason is open-source and publicly available for research application in the following Git repository: https://github.com/frankkramer-lab/DeepGleason.

Submitted 25 March, 2024; originally announced March 2024.
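
Illustrative note on the metrics quoted in the abstract above: the sketch below shows how sensitivity, specificity, and a macro-averaged F1-score are conventionally computed from confusion counts. It is not taken from the DeepGleason code base; the function names `binary_rates` and `macro_f1` and all counts are hypothetical, chosen only to make the definitions concrete.

```python
def binary_rates(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from binary confusion counts."""
    sensitivity = tp / (tp + fn)  # fraction of true positives that were found
    specificity = tn / (tn + fp)  # fraction of true negatives that were found
    return sensitivity, specificity


def macro_f1(confusion):
    """Macro-averaged F1 from a square confusion matrix.

    confusion[i][j] is the number of samples with true class i
    that were predicted as class j (hypothetical toy data below).
    """
    n_classes = len(confusion)
    f1_scores = []
    for k in range(n_classes):
        tp = confusion[k][k]
        fn = sum(confusion[k]) - tp
        fp = sum(confusion[i][k] for i in range(n_classes)) - tp
        denom = 2 * tp + fp + fn
        # A class with no true and no predicted samples contributes 0 here.
        f1_scores.append(2 * tp / denom if denom else 0.0)
    # Macro averaging weights every class equally, regardless of its size.
    return sum(f1_scores) / n_classes


# Toy usage with made-up counts (not the paper's data).
sens, spec = binary_rates(tp=94, fn=6, tn=98, fp=2)
toy_f1 = macro_f1([[1, 1], [0, 2]])
```

Because macro averaging gives rare classes the same weight as common ones, a macro F1 can sit well below the overall accuracy on an imbalanced tile dataset.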

arXiv:2403.04510 [pdf, other]

Where does In-context Translation Happen in Large Language Models

Authors: Suzanna Sia, David Mueller, Kevin Duh

Abstract: Self-supervised large language models have demonstrated the ability to perform Machine Translation (MT) via in-context learning, but little is known about where the model performs the task with respect to prompt instructions and demonstration examples. In this work, we attempt to characterize the region where large language models transition from in-context learners to translation models. Through a series of layer-wise context-masking experiments on GPTNeo2.7B, Bloom3B, Llama7b and Llama7b-chat, we demonstrate evidence of a "task recognition" point where the translation task is encoded into the input representations and attention to context is no longer necessary. We further observe correspondence between the low performance when masking out entire layers, and the task recognition layers. Taking advantage of this redundancy results in 45% computational savings when prompting with 5 examples, and task recognition achieved at layer 14 / 32. Our layer-wise fine-tuning experiments indicate that the most effective layers for MT fine-tuning are the layers critical to task recognition.

Submitted 7 March, 2024; originally announced March 2024.

Comments: 19 pages. Under Review

arXiv:2401.06481 [pdf, other]

Machine learning a fixed point action for SU(3) gauge theory with a gauge equivariant convolutional neural network

Authors: Kieran Holland, Andreas Ipp, David I. Müller, Urs Wenger

Abstract: Fixed point lattice actions are designed to have continuum classical properties unaffected by discretization effects and reduced lattice artifacts at the quantum level. They provide a possible way to extract continuum physics with coarser lattices, thereby making it possible to circumvent problems with critical slowing down and topological freezing toward the continuum limit. A crucial ingredient for practical applications is to find an accurate and compact parametrization of a fixed point action, since many of its properties are only implicitly defined. Here we use machine learning methods to revisit the question of how to parametrize fixed point actions. In particular, we obtain a fixed point action for four-dimensional SU(3) gauge theory using convolutional neural networks with exact gauge invariance. The large operator space allows us to find superior parametrizations compared to previous studies, a necessary first step for future Monte Carlo simulations.

Submitted 12 January, 2024; originally announced January 2024.

Comments: 22 pages, 15 figures, 6 tables

arXiv:2311.17816 [pdf, other]

Fixed point actions from convolutional neural networks

Authors: Kieran Holland, Andreas Ipp, David I. Müller, Urs Wenger

Abstract: Lattice gauge-equivariant convolutional neural networks (L-CNNs) can be used to form arbitrarily shaped Wilson loops and can approximate any gauge-covariant or gauge-invariant function on the lattice. Here we use L-CNNs to describe fixed point (FP) actions which are based on renormalization group transformations. FP actions are classically perfect, i.e., they have no lattice artifacts on classical gauge-field configurations satisfying the equations of motion, and therefore possess scale invariant instanton solutions. FP actions are tree-level Symanzik-improved to all orders in the lattice spacing and can produce physical predictions with very small lattice artifacts even on coarse lattices. We find that L-CNNs are much more accurate at parametrizing the FP action compared to older approaches. They may therefore provide a way to circumvent critical slowing down and topological freezing towards the continuum limit.

Submitted 29 November, 2023; originally announced November 2023.

Comments: 9 pages, 5 figures; Proceedings of the 40th International Symposium on Lattice Field Theory (Lattice 2023)

arXiv:2310.00562 [pdf, other]

Discrete Choice Multi-Armed Bandits

Authors: Emerson Melo, David Müller

Abstract: This paper establishes a connection between a category of discrete choice models and the realms of online learning and multiarmed bandit algorithms. Our contributions can be summarized in two key aspects. Firstly, we furnish sublinear regret bounds for a comprehensive family of algorithms, encompassing the Exp3 algorithm as a particular case. Secondly, we introduce a novel family of adversarial multiarmed bandit algorithms, drawing inspiration from the generalized nested logit models initially introduced by Wen and Koppelman (2001). These algorithms offer users the flexibility to fine-tune the model extensively, as they can be implemented efficiently due to their closed-form sampling distribution probabilities. To demonstrate the practical implementation of our algorithms, we present numerical experiments, focusing on the stochastic bandit case.

Submitted 30 September, 2023; originally announced October 2023.

MSC Class: F.2.0
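
Illustrative note on the Exp3 algorithm named in the abstract above: the sketch below is the textbook Exp3 with uniform exploration mixing, in which the sampling distribution has a closed form (normalized exponential weights blended with a uniform distribution). It is not the generalized-nested-logit family the paper introduces; the names `exp3_probabilities` and `run_exp3`, the mixing rate `gamma`, and the toy reward function are all assumptions made for illustration.

```python
import math
import random


def exp3_probabilities(weights, gamma):
    """Closed-form Exp3 sampling distribution: weights mixed with uniform."""
    total = sum(weights)
    n_arms = len(weights)
    return [(1 - gamma) * w / total + gamma / n_arms for w in weights]


def run_exp3(reward_fn, n_arms, n_rounds, gamma=0.1, seed=0):
    """Run textbook Exp3 and return the final sampling distribution.

    reward_fn(arm) must return a reward in [0, 1] (hypothetical callback).
    """
    rng = random.Random(seed)
    weights = [1.0] * n_arms
    for _ in range(n_rounds):
        probs = exp3_probabilities(weights, gamma)
        arm = rng.choices(range(n_arms), weights=probs)[0]
        reward = reward_fn(arm)
        # Importance-weighted estimate: unbiased for the pulled arm's reward.
        estimate = reward / probs[arm]
        weights[arm] *= math.exp(gamma * estimate / n_arms)
        # Renormalize so the weights cannot overflow on long runs;
        # this leaves the sampling distribution unchanged.
        total = sum(weights)
        weights = [w / total for w in weights]
    return exp3_probabilities(weights, gamma)


# Toy usage: arm 1 always pays 1, the other arms pay 0.
final_probs = run_exp3(lambda arm: 1.0 if arm == 1 else 0.0, n_arms=3, n_rounds=2000)
```

The uniform mixing term keeps every arm's probability at least `gamma / n_arms`, which bounds the importance-weighted estimate and keeps each multiplicative update at most `e`.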

arXiv:2308.13373 [pdf]

Enhanced Mortality Prediction In Patients With Subarachnoid Haemorrhage Using A Deep Learning Model Based On The Initial CT Scan

Authors: Sergio Garcia-Garcia, Santiago Cepeda, Dominik Muller, Alejandra Mosteiro, Ramon Torne, Silvia Agudo, Natalia de la Torre, Ignacio Arrese, Rosario Sarabia

Abstract: PURPOSE: Subarachnoid hemorrhage (SAH) entails high morbidity and mortality rates. Convolutional neural networks (CNN), a form of deep learning, are capable of generating highly accurate predictions from imaging data. Our objective was to predict mortality in SAH patients by processing the initial CT scan on a CNN based algorithm. METHODS: Retrospective multicentric study of a consecutive cohort of patients with SAH between 2011-2022. Demographic, clinical and radiological variables were analyzed. Pre-processed baseline CT scan images were used as the input for training a CNN using AUCMEDI Framework. Our model's architecture leverages the DenseNet-121 structure, employing transfer learning principles. The output variable was mortality in the first three months. Performance of the model was evaluated by statistical parameters conventionally used in studies involving artificial intelligence methods. RESULTS: Images from 219 patients were processed, 175 for training and validation of the CNN and 44 for its evaluation. 52% (115/219) of patients were female, and the median age was 58 (SD=13.06) years. 18.5% (39/219) were idiopathic SAH. Mortality rate was 28.5% (63/219). The model showed good accuracy at predicting mortality in SAH patients exclusively using the images of the initial CT scan (Accuracy=74%, F1=75% and AUC=82%). CONCLUSION: Modern image processing techniques based on AI and CNN make it possible to predict mortality in SAH patients with high accuracy using CT scan images as the only input.
These models might be optimized by including more data and patients resulting in better training, development and performance on tasks which are beyond the skills of conventional clinical knowledge.

Submitted 25 August, 2023; originally announced August 2023.

arXiv:2308.04871 [pdf, other]

Mathematical Artifacts Have Politics: The Journey from Examples to Embedded Ethics

Authors: Dennis Müller, Maurice Chiodo

Abstract: We extend Langdon Winner's idea that artifacts have politics into the realm of mathematics. To do so, we first provide a list of examples showing the existence of mathematical artifacts that have politics. In the second step, we provide an argument that shows that all mathematical artifacts have politics. We conclude by showing the implications for embedding ethics into mathematical curricula. We show how acknowledging that mathematical artifacts have politics can help mathematicians design better exercises for their mathematics students.

Submitted 9 August, 2023; originally announced August 2023.

Comments: 23 pages; keywords: artifacts have politics, embedded ethics, mathematics in society, ethics in mathematics, mathematics education

MSC Class: 01A80; 00A30; 97A40; 97D20

arXiv:2306.10484 [pdf, other]

The STOIC2021 COVID-19 AI challenge: applying reusable training methodologies to private data

Authors: Luuk H. Boulogne, Julian Lorenz, Daniel Kienzle, Robin Schon, Katja Ludwig, Rainer Lienhart, Simon Jegou, Guang Li, Cong Chen, Qi Wang, Derik Shi, Mayug Maniparambil, Dominik Muller, Silvan Mertes, Niklas Schroter, Fabio Hellmann, Miriam Elia, Ine Dirks, Matias Nicolas Bossa, Abel Diaz Berenguer, Tanmoy Mukherjee, Jef Vandemeulebroucke, Hichem Sahli, Nikos Deligiannis, Panagiotis Gonidakis, et al. (13 additional authors not shown)

Abstract: Challenges drive the state-of-the-art of automated medical image analysis. The quantity of public training data that they provide can limit the performance of their solutions. Public access to the training methodology for these solutions remains absent. This study implements the Type Three (T3) challenge format, which allows for training solutions on private data and guarantees reusable training methodologies. With T3, challenge organizers train a codebase provided by the participants on sequestered training data. T3 was implemented in the STOIC2021 challenge, with the goal of predicting from a computed tomography (CT) scan whether subjects had a severe COVID-19 infection, defined as intubation or death within one month. STOIC2021 consisted of a Qualification phase, where participants developed challenge solutions using 2000 publicly available CT scans, and a Final phase, where participants submitted their training methodologies with which solutions were trained on CT scans of 9724 subjects. The organizers successfully trained six of the eight Final phase submissions. The submitted codebases for training and running inference were released publicly. The winning solution obtained an area under the receiver operating characteristic curve for discerning between severe and non-severe COVID-19 of 0.815. The Final phase solutions of all finalists improved upon their Qualification phase solutions.

Submitted 25 June, 2023; v1 submitted 18 June, 2023; originally announced June 2023.

arXiv:2306.09131 [pdf, other]

Manifesto for the Responsible Development of Mathematical Works -- A Tool for Practitioners and for Management

Authors: Maurice Chiodo, Dennis Müller

Abstract: This manifesto has been written as a practical tool and aid for anyone carrying out, managing or influencing mathematical work. It provides insight into how to undertake and develop mathematically-powered products and services in a safe and responsible way. Rather than give a framework of objectives to achieve, we instead introduce a process that can be integrated into the common ways in which mathematical products or services are created, from start to finish. This process helps address the various issues and problems that can arise for the product, the developers, the institution, and for wider society. To do this, we break down the typical procedure of mathematical development into 10 key stages; our "10 pillars for responsible development" which follow a somewhat chronological ordering of the steps, and associated challenges, that frequently occur in mathematical work. Together these 10 pillars cover issues of the entire lifecycle of a mathematical product or service, including the preparatory work required to responsibly start a project, central questions of good technical mathematics and data science, and issues of communication, deployment and follow-up maintenance specifically related to mathematical systems. This manifesto, and the pillars within it, are the culmination of 7 years of work done by us as part of the Cambridge University Ethics in Mathematics Project. These are all tried-and-tested ideas, that we have presented and used in both academic and industrial environments.
In our work, we have directly seen that mathematics can be an incredible tool for good in society, but also that without careful consideration it can cause immense harm. We hope that following this manifesto will empower its readers to reduce the risk of undesirable and unwanted consequences of their mathematical work.

Submitted 15 June, 2023; originally announced June 2023.

Comments: 33 pages, 1 figure. This is the first version, and we intend to make revisions. Comments and feedback are welcome - please get in touch with us

MSC Class: 01A80; 00A30

arXiv:2306.02507 [pdf, other]

Deep learning powered real-time identification of insects using citizen science data

Authors: Shivani Chiranjeevi, Mojdeh Sadaati, Zi K Deng, Jayanth Koushik, Talukder Z Jubery, Daren Mueller, Matthew E O Neal, Nirav Merchant, Aarti Singh, Asheesh K Singh, Soumik Sarkar, Arti Singh, Baskar Ganapathysubramanian

Abstract: Insect-pests significantly impact global agricultural productivity and quality. Effective management involves identifying the full insect community, including beneficial insects and harmful pests, to develop and implement integrated pest management strategies. Automated identification of insects under real-world conditions presents several challenges, including differentiating similar-looking species, intra-species dissimilarity and inter-species similarity, several life cycle stages, camouflage, diverse imaging conditions, and variability in insect orientation. A deep-learning model, InsectNet, is proposed to address these challenges. InsectNet is endowed with five key features: (a) utilization of a large dataset of insect images collected through citizen science; (b) label-free self-supervised learning for large models; (c) improving prediction accuracy for species with a small sample size; (d) enhancing model trustworthiness; and (e) democratizing access through streamlined MLOps. This approach allows accurate identification (>96% accuracy) of over 2500 insect species, including pollinator (e.g., butterflies, bees), parasitoid (e.g., some wasps and flies), predator species (e.g., lady beetles, mantises, dragonflies) and harmful pest species (e.g., armyworms, cutworms, grasshoppers, stink bugs). InsectNet can identify invasive species, provide fine-grained insect species identification, and work effectively in challenging backgrounds.
It also can abstain from making predictions when uncertain, facilitating seamless human intervention and making it a practical and trustworthy tool. InsectNet can guide citizen science data collection, especially for invasive species where early detection is crucial. Similar approaches may transform other agricultural challenges like disease detection and underscore the importance of data collection, particularly through citizen science efforts.

Submitted 4 June, 2023; originally announced June 2023.

arXiv:2305.08660 [pdf]

Towards Automated COVID-19 Presence and Severity Classification

Authors: Dominik Müller, Niklas Schröter, Silvan Mertes, Fabio Hellmann, Miriam Elia, Wolfgang Reif, Bernhard Bauer, Elisabeth André, Frank Kramer

Abstract: COVID-19 presence classification and severity prediction via (3D) thorax computed tomography scans have become important tasks in recent times. Especially for capacity planning of intensive care units, predicting the future severity of a COVID-19 patient is crucial. The presented approach follows state-of-the-art techniques to aid medical professionals in these situations. It comprises an ensemble learning strategy via 5-fold cross-validation that includes transfer learning and combines pre-trained 3D-versions of ResNet34 and DenseNet121 for COVID-19 classification and severity prediction respectively. Further, domain-specific preprocessing was applied to optimize model performance. In addition, medical information like the infection-lung-ratio, patient age, and sex were included. The presented model achieves an AUC of 79.0% to predict COVID-19 severity, and 83.7% AUC to classify the presence of an infection, which is comparable with other currently popular methods. This approach is implemented using the AUCMEDI framework and relies on well-known network architectures to ensure robustness and reproducibility.

Submitted 15 May, 2023; originally announced May 2023.
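
Illustrative note on the "ensemble learning strategy via 5-fold cross-validation" mentioned in the abstract above: the sketch below shows one common reading of that phrase, where one model is trained per fold and the per-fold predictions are averaged. It does not use the AUCMEDI API and is not the authors' pipeline; `k_fold_indices`, `ensemble_mean`, and the example probabilities are hypothetical.

```python
def k_fold_indices(n_samples, k=5):
    """Yield (train_indices, val_indices) for k contiguous, near-equal folds."""
    # The first (n_samples % k) folds get one extra sample each.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        val = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, val
        start += size


def ensemble_mean(predictions):
    """Average class probabilities element-wise across models.

    predictions is a list with one probability vector per fold model.
    """
    n_models = len(predictions)
    return [sum(column) / n_models for column in zip(*predictions)]


# Toy usage: 10 samples split into 5 folds, then two made-up model outputs averaged.
folds = list(k_fold_indices(10, k=5))
averaged = ensemble_mean([[0.2, 0.8], [0.4, 0.6]])
```

In practice the samples would usually be shuffled (and often stratified by class) before splitting; the contiguous split here is only meant to keep the example short.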

arXiv:2304.08881 [pdf, other]

Segmentation of glioblastomas in early post-operative multi-modal MRI with deep neural networks

Authors: Ragnhild Holden Helland, Alexandros Ferles, André Pedersen, Ivar Kommers, Hilko Ardon, Frederik Barkhof, Lorenzo Bello, Mitchel S. Berger, Tora Dunås, Marco Conti Nibali, Julia Furtner, Shawn Hervey-Jumper, Albert J. S. Idema, Barbara Kiesel, Rishi Nandoe Tewari, Emmanuel Mandonnet, Domenique M. J. Müller, Pierre A. Robe, Marco Rossi, Lisa M. Sagberg, Tommaso Sciortino, Tom Aalders, Michiel Wagemakers, Georg Widhalm, Marnix G. Witte, et al. (8 additional authors not shown)

Abstract: Extent of resection after surgery is one of the main prognostic factors for patients diagnosed with glioblastoma. To achieve this, accurate segmentation and classification of residual tumor from post-operative MR images is essential. The current standard method for estimating it is subject to high inter- and intra-rater variability, and an automated method for segmentation of residual tumor in early post-operative MRI could lead to a more accurate estimation of extent of resection. In this study, two state-of-the-art neural network architectures for pre-operative segmentation were trained for the task. The models were extensively validated on a multicenter dataset with nearly 1000 patients, from 12 hospitals in Europe and the United States. The best performance achieved was a 61% Dice score, and the best classification performance was about 80% balanced accuracy, with a demonstrated ability to generalize across hospitals. In addition, the segmentation performance of the best models was on par with human expert raters. The predicted segmentations can be used to accurately classify the patients into those with residual tumor, and those with gross total resection.

Submitted 18 April, 2023; originally announced April 2023.

Comments: 13 pages, 4 figures, 4 tables

ACM Class: I.4.6; J.3
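
Illustrative note on the Dice score reported in the abstract above: the sketch below computes the standard Dice coefficient, 2|A ∩ B| / (|A| + |B|), on flattened binary masks. The function name `dice_score` and the toy masks are hypothetical, and returning 1.0 when both masks are empty is an assumed convention; the paper's own evaluation code may handle that case differently.

```python
def dice_score(pred, target):
    """Dice coefficient between two flattened binary masks of equal length."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of voxels")
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    size_sum = sum(1 for p in pred if p) + sum(1 for t in target if t)
    if size_sum == 0:
        # Assumed convention: two empty masks count as perfect agreement.
        return 1.0
    return 2.0 * intersection / size_sum


# Toy usage: the prediction marks 2 voxels, the reference 1, and they share 1.
score = dice_score([1, 1, 0, 0], [1, 0, 0, 0])
```

The empty-mask case matters for this kind of task, because patients with gross total resection have no residual tumor to segment, so the chosen convention directly affects the averaged score.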

arXiv:2303.11448 [pdf, ps, other]

Geometrical aspects of lattice gauge equivariant convolutional neural networks

Authors: Jimmy Aronsson, David I. Müller, Daniel Schuh

Abstract: Lattice gauge equivariant convolutional neural networks (L-CNNs) are a framework for convolutional neural networks that can be applied to non-Abelian lattice gauge theories without violating gauge symmetry. We demonstrate how L-CNNs can be equipped with global group equivariance. This allows us to extend the formulation to be equivariant not just under translations but under global lattice symmetries such as rotations and reflections. Additionally, we provide a geometric formulation of L-CNNs and show how convolutions in L-CNNs arise as a special case of gauge equivariant neural networks on SU(N) principal bundles.

Submitted 20 March, 2023; originally announced March 2023.

Comments: 22 pages

arXiv:2302.10179 [pdf]

A Dynamic Feedforward Control Strategy for Energy-efficient Building System Operation

Authors: Xia Chen, Xiaoye Cai, Alexander Kümpel, Dirk Müller, Philipp Geyer

Abstract: The development of current building energy system operation has benefited from: 1. Informational support from the optimal design through simulation or first-principles models; 2. System load and energy prediction through machine learning (ML). Through the literature review, we note that most current control strategies and optimization algorithms rely on receiving information from real-time feedback or use only predictive signals based on ML data fitting. They do not fully utilize dynamic building information. In other words, embedding dynamic prior knowledge from building system characteristics simultaneously for system control draws less attention. In this context, we propose an engineer-friendly control strategy framework. The framework is integrated with a feedforward loop that embeds a dynamic building environment with leading and lagging system information involved: the simulation combined with system characteristic information is imported to the ML predictive algorithms. ML generates step-ahead information by rolling-window feed-in of simulation output to minimize the errors of its forecasting predecessor in a loop and achieve an overall optimum. We tested it in a case for heating system control with typical control strategies, which shows that our framework offers a further energy-saving potential of 15%.

Submitted 23 January, 2023; originally announced February 2023.

Comments: 6 pages, 7 figures. Accepted by PLEA2022. Presentation is available at: https://github.com/chenxiachan/chenxiachan.github.io/blob/master/files/PLEA-Xia_1166_PG.pdf

arXiv:2212.08568 [pdf, other]

Biomedical image analysis competitions: The state of current participation practice

Authors: Matthias Eisenmann, Annika Reinke, Vivienn Weru, Minu Dietlinde Tizabi, Fabian Isensee, Tim J. Adler, Patrick Godau, Veronika Cheplygina, Michal Kozubek, Sharib Ali, Anubha Gupta, Jan Kybic, Alison Noble, Carlos Ortiz de Solórzano, Samiksha Pachade, Caroline Petitjean, Daniel Sage, Donglai Wei, Elizabeth Wilden, Deepak Alapatt, Vincent Andrearczyk, Ujjwal Baid, Spyridon Bakas, Niranjan Balu, Sophia Bano, et al. (331 additional authors not shown)

Abstract: The number of international benchmarking competitions is steadily increasing in various fields of machine learning (ML) research and practice. So far, however, little is known about the common practice as well as bottlenecks faced by the community in tackling the research questions posed. To shed light on the status quo of algorithm development in the specific field of biomedical imaging analysis, we designed an international survey that was issued to all participants of challenges conducted in conjunction with the IEEE ISBI 2021 and MICCAI 2021 conferences (80 competitions in total). The survey covered participants' expertise and working environments, their chosen strategies, as well as algorithm characteristics. A median of 72% challenge participants took part in the survey. According to our results, knowledge exchange was the primary incentive (70%) for participation, while the reception of prize money played only a minor role (16%). While a median of 80 working hours was spent on method development, a large portion of participants stated that they did not have enough time for method development (32%). 25% perceived the infrastructure to be a bottleneck. Overall, 94% of all solutions were deep learning-based. Of these, 84% were based on standard architectures. 43% of the respondents reported that the data samples (e.g., images) were too large to be processed at once. This was most commonly addressed by patch-based training (69%), downsampling (37%), and solving 3D analysis tasks as a series of 2D tasks.
K-fold cross-validation on the training set was performed by only 37% of the participants and only 50% of the participants performed ensembling based on multiple identical models (61%) or heterogeneous models (39%). 48% of the respondents applied postprocessing steps.

Submitted 12 September, 2023; v1 submitted 16 December, 2022; originally announced December 2022.

arXiv:2212.06645 [pdf, other]

Do Text-to-Text Multi-Task Learners Suffer from Task Conflict?

Authors: David Mueller, Nicholas Andrews, Mark Dredze

Abstract: Traditional multi-task learning architectures train a single model across multiple tasks through a shared encoder followed by task-specific decoders. Learning these models often requires specialized training algorithms that address task-conflict in the shared parameter updates, which otherwise can lead to negative transfer. A new type of multi-task learning within NLP homogenizes multi-task architectures as a shared encoder and language model decoder, which does surprisingly well across a range of diverse tasks. Does this new architecture suffer from task-conflicts that require specialized training algorithms? We study how certain factors in the shift towards text-to-text models affect multi-task conflict and negative transfer, finding that both directional conflict and transfer are surprisingly constant across architectures.

Submitted 13 December, 2022; originally announced December 2022.

Comments: Findings of EMNLP 2022

arXiv:2212.00832 [pdf, other]

doi 10.1051/epjconf/202227409001

Applications of Lattice Gauge Equivariant Neural Networks

Authors: Matteo Favoni, Andreas Ipp, David I. Müller

Abstract: The introduction of relevant physical information into neural network architectures has become a widely used and successful strategy for improving their performance. In lattice gauge theories, such information can be identified with gauge symmetries, which are incorporated into the network layers of our recently proposed Lattice Gauge Equivariant Convolutional Neural Networks (L-CNNs). L-CNNs can generalize better to differently sized lattices than traditional neural networks and are by construction equivariant under lattice gauge transformations. In these proceedings, we present our progress on possible applications of L-CNNs to Wilson flow or continuous normalizing flow. Our methods are based on neural ordinary differential equations which allow us to modify link configurations in a gauge equivariant manner. For simplicity, we focus on simple toy models to test these ideas in practice.

Submitted 1 December, 2022; originally announced December 2022.

Comments: 8 pages, 4 figures, proceedings of XVth Quark Confinement and the Hadron Spectrum conference

arXiv:2210.13642 [pdf]

MISm: A Medical Image Segmentation Metric for Evaluation of weak labeled Data

Authors: Dennis Hartmann, Verena Schmid, Philip Meyer, Iñaki Soto-Rey, Dominik Müller, Frank Kramer

Abstract: Performance measures are an important tool for assessing and comparing different medical image segmentation algorithms. Unfortunately, the current measures have their weaknesses when it comes to assessing certain edge cases. These limitations arise when images with a very small region of interest or without a region of interest at all are assessed. As a solution for these limitations, we propose a new medical image segmentation metric: MISm. To evaluate MISm, popular metrics in medical image segmentation and MISm were compared using magnetic resonance tomography images from several scenarios. In order to allow application in the community and reproducibility of experimental results, we included MISm in the publicly available evaluation framework MISeval: https://github.com/frankkramer-lab/miseval/tree/master/miseval

Submitted 24 October, 2022; originally announced October 2022.

Comments: GitHub: https://github.com/frankkramer-lab/miseval/tree/master/miseval

arXiv:2210.11091 [pdf]

Standardized Medical Image Classification across Medical Disciplines

Authors: Simone Mayer, Dominik Müller, Frank Kramer

Abstract: AUCMEDI is a Python-based framework for medical image classification. In this paper, we evaluate the capabilities of AUCMEDI, by applying it to multiple datasets. Datasets were specifically chosen to cover a variety of medical disciplines and imaging modalities. We designed a simple pipeline using Jupyter notebooks and applied it to all datasets. Results show that AUCMEDI was able to train a model with accurate classification capabilities for each dataset: Averaged AUC per dataset range between 0.82 and 1.0, averaged F1 scores range between 0.61 and 1.0. With its high adaptability and strong performance, AUCMEDI proves to be a powerful instrument to build widely applicable neural networks. The notebooks serve as application examples for AUCMEDI.

Submitted 20 October, 2022; originally announced October 2022.

Comments: https://frankkramer-lab.github.io/aucmedi/

arXiv:2206.08182 [pdf, other]

Nucleus Segmentation and Analysis in Breast Cancer with the MIScnn Framework

Authors: Adrian Pfleiderer, Dominik Müller, Frank Kramer

Abstract: The NuCLS dataset contains over 220,000 annotations of cell nuclei in breast cancers. We show how to use these data to create a multi-rater model with the MIScnn Framework to automate the analysis of cell nuclei. For the model creation, we use the widespread U-Net approach embedded in a pipeline. Besides the high-performance convolutional neural network, this pipeline provides several preprocessing techniques and an extended data exploration. The final model is tested in the evaluation phase using a wide variety of metrics with a subsequent visualization. Finally, the results are compared and interpreted with the results of the NuCLS study. As an outlook, indications are given which are important for the future development of models in the context of cell nuclei.

Submitted 1 February, 2023; v1 submitted 16 June, 2022; originally announced June 2022.

arXiv:2205.00889 [pdf, other]

Vehicle Routing with Time-Dependent Travel Times: Theory, Practice, and Benchmarks

Authors: Jannis Blauth, Stephan Held, Dirk Müller, Niklas Schlomberg, Vera Traub, Thorben Tröbst, Jens Vygen

Abstract: We develop theoretical foundations and practical algorithms for vehicle routing with time-dependent travel times. We also provide new benchmark instances and experimental results. First, we study basic operations on piecewise linear arrival time functions. In particular, we devise a faster algorithm to compute the pointwise minimum of a set of piecewise linear functions and a monotonicity-preserving variant of the Imai-Iri algorithm to approximate an arrival time function with fewer breakpoints. Next, we show how to evaluate insertion and deletion operations in tours efficiently and update the underlying data structure faster than previously known when a tour changes. Evaluating a tour also requires a scheduling step which is non-trivial in the presence of time windows and time-dependent travel times. We show how to perform this in linear time. Based on these results, we develop a local search heuristic to solve real-world vehicle routing problems with various constraints efficiently and report experimental results on classical benchmarks. Since most of these do not have time-dependent travel times, we generate and publish new benchmark instances that are based on real-world data. This data also demonstrates the importance of considering time-dependent travel times in instances with tight time windows.

Submitted 25 March, 2024; v1 submitted 2 May, 2022; originally announced May 2022.

arXiv:2204.14199 [pdf, other]

doi 10.3389/fneur.2022.932219

Preoperative brain tumor imaging: models and software for segmentation and standardized reporting

Authors: D. Bouget, A. Pedersen, A. S. Jakola, V. Kavouridis, K. E. Emblem, R. S. Eijgelaar, I. Kommers, H. Ardon, F. Barkhof, L. Bello, M. S. Berger, M. C. Nibali, J. Furtner, S. Hervey-Jumper, A. J. S. Idema, B. Kiesel, A. Kloet, E. Mandonnet, D. M. J. Müller, P. A. Robe, M. Rossi, T. Sciortino, W. Van den Brink, M. Wagemakers, G. Widhalm, et al. (5 additional authors not shown)

Abstract: For patients suffering from brain tumor, prognosis estimation and treatment decisions are made by a multidisciplinary team based on a set of preoperative MR scans. Currently, the lack of standardized and automatic methods for tumor detection and generation of clinical reports represents a major hurdle. In this study, we investigate glioblastomas, lower grade gliomas, meningiomas, and metastases, through four cohorts of up to 4000 patients. Tumor segmentation models were trained using the AGU-Net architecture with different preprocessing steps and protocols. Segmentation performances were assessed in-depth using a wide-range of voxel and patient-wise metrics covering volume, distance, and probabilistic aspects. Finally, two software solutions have been developed, enabling an easy use of the trained models and standardized generation of clinical reports: Raidionics and Raidionics-Slicer. Segmentation performances were quite homogeneous across the four different brain tumor types, with an average true positive Dice ranging between 80% and 90%, patient-wise recall between 88% and 98%, and patient-wise precision around 95%. With our Raidionics software, running on a desktop computer with CPU support, tumor segmentation can be performed in 16 to 54 seconds depending on the dimensions of the MRI volume. For the generation of a standardized clinical report, including the tumor segmentation and features computation, 5 to 15 minutes are necessary.
All trained models have been made open-access together with the source code for both software solutions and validation metrics computation. In the future, an automatic classification of the brain tumor type would be necessary to replace manual user input. Finally, the inclusion of post-operative segmentation in both software solutions will be key for generating complete post-operative standardized clinical reports.

Submitted 29 April, 2022; originally announced April 2022.

Comments: 20 pages, 5 figures, 10 tables

ACM Class: I.4.6; J.3

Journal ref: Frontiers in Neurology, Sec. Applied Neuroimaging, Volume 13, 2022

arXiv:2203.09181 [pdf, other]

An Interactive Explanatory AI System for Industrial Quality Control

Authors: Dennis Müller, Michael März, Stephan Scheele, Ute Schmid

Abstract: Machine learning based image classification algorithms, such as deep neural network approaches, will be increasingly employed in critical settings such as quality control in industry, where transparency and comprehensibility of decisions are crucial. Therefore, we aim to extend the defect detection task towards an interactive human-in-the-loop approach that allows us to integrate rich background knowledge and the inference of complex relationships going beyond traditional purely data-driven approaches. We propose an approach for an interactive support system for classifications in an industrial quality control setting that combines the advantages of both (explainable) knowledge-driven and data-driven machine learning methods, in particular inductive logic programming and convolutional neural networks, with human expertise and control. The resulting system can assist domain experts with decisions, provide transparent explanations for results, and integrate feedback from users; thus reducing workload for humans while both respecting their expertise and without removing their agency or accountability.

Submitted 17 March, 2022; originally announced March 2022.

Comments: to be published in AAAI 2022

arXiv:2202.05273 [pdf]

Towards a Guideline for Evaluation Metrics in Medical Image Segmentation

Authors: Dominik Müller, Iñaki Soto-Rey, Frank Kramer

Abstract: In the last decade, research on artificial intelligence has seen rapid growth with deep learning models, especially in the field of medical image segmentation. Various studies demonstrated that these models have powerful prediction capabilities and achieved similar results as clinicians. However, recent studies revealed that the evaluation in image segmentation studies lacks reliable model performance assessment and showed statistical bias by incorrect metric implementation or usage. Thus, this work provides an overview and interpretation guide on the following metrics for medical image segmentation evaluation in binary as well as multi-class problems: Dice similarity coefficient, Jaccard, Sensitivity, Specificity, Rand index, ROC curves, Cohen's Kappa, and Hausdorff distance. As a summary, we propose a guideline for standardized medical image segmentation evaluation to improve evaluation quality, reproducibility, and comparability in the research field.

Submitted 10 February, 2022; originally announced February 2022.

Comments: Source Code: https://github.com/frankkramer-lab/miseval.analysis

arXiv:2202.01712 [pdf, other]

doi 10.1002/widm.1475

Review of automated time series forecasting pipelines

Authors: Stefan Meisenbacher, Marian Turowski, Kaleb Phipps, Martin Rätz, Dirk Müller, Veit Hagenmeyer, Ralf Mikut

Abstract: Time series forecasting is fundamental for various use cases in different domains such as energy systems and economics. Creating a forecasting model for a specific use case requires an iterative and complex design process. The typical design process includes the five sections (1) data pre-processing, (2) feature engineering, (3) hyperparameter optimization, (4) forecasting method selection, and (5) forecast ensembling, which are commonly organized in a pipeline structure. One promising approach to handle the ever-growing demand for time series forecasts is automating this design process. The present paper, thus, analyzes the existing literature on automated time series forecasting pipelines to investigate how to automate the design process of forecasting models. Thereby, we consider both Automated Machine Learning (AutoML) and automated statistical forecasting methods in a single forecasting pipeline. For this purpose, we firstly present and compare the proposed automation methods for each pipeline section. Secondly, we analyze the automation methods regarding their interaction, combination, and coverage of the five pipeline sections. For both, we discuss the literature, identify problems, give recommendations, and suggest future research. This review reveals that the majority of papers only cover two or three of the five pipeline sections. We conclude that future research has to holistically consider the automation of the forecasting pipeline to enable the large-scale application of time series forecasting.

Submitted 3 February, 2022; originally announced February 2022.

Journal ref: WIREs Data Mining and Knowledge Discovery (2022) e1475

arXiv:2201.13222 [pdf]

Perspective on Code Submission and Automated Evaluation Platforms for University Teaching

Authors: Florian Auer, Johann Frei, Dominik Müller, Frank Kramer

Abstract: We present a perspective on platforms for code submission and automated evaluation in the context of university teaching. Due to the COVID-19 pandemic, such platforms have become an essential asset for remote courses and a reasonable standard for structured code submission concerning increasing numbers of students in computer sciences. Utilizing automated code evaluation techniques exhibits notable positive impacts for both students and teachers in terms of quality and scalability. We identified relevant technical and non-technical requirements for such platforms in terms of practical applicability and secure code submission environments. Furthermore, a survey among students was conducted to obtain empirical data on general perception. We conclude that submission and automated evaluation involves continuous maintenance yet lowers the required workload for teachers and provides better evaluation transparency for students.

Submitted 25 January, 2022; originally announced January 2022.

Comments: Source code: https://github.com/frankkramer-lab/MISITcms-app. Manuscript accepted for publication in the following Conference - Journal: MedInfo 2021, Virtual Conference, October 2-4, 2021 - IOS Press

arXiv:2201.11440 [pdf]

An Analysis on Ensemble Learning optimized Medical Image Classification with Deep Convolutional Neural Networks

Authors: Dominik Müller, Iñaki Soto-Rey, Frank Kramer

Abstract: Novel and high-performance medical image classification pipelines are heavily utilizing ensemble learning strategies. The idea of ensemble learning is to assemble diverse models or multiple predictions and, thus, boost prediction performance. However, it is still an open question to what extent as well as which ensemble learning strategies are beneficial in deep learning based medical image classification pipelines. In this work, we proposed a reproducible medical image classification pipeline for analyzing the performance impact of the following ensemble learning techniques: Augmenting, Stacking, and Bagging. The pipeline consists of state-of-the-art preprocessing and image augmentation methods as well as 9 deep convolution neural network architectures. It was applied on four popular medical imaging datasets with varying complexity. Furthermore, 12 pooling functions for combining multiple predictions were analyzed, ranging from simple statistical functions like unweighted averaging up to more complex learning-based functions like support vector machines. Our results revealed that Stacking achieved the largest performance gain of up to 13% F1-score increase. Augmenting showed consistent improvement capabilities by up to 4% and is also applicable to single model based pipelines. Cross-validation based Bagging demonstrated significant performance gain close to Stacking, which resulted in an F1-score increase up to +11%. Furthermore, we demonstrated that simple statistical pooling functions are equal or often even better than more complex pooling functions.
We concluded that the integration of ensemble learning techniques is a powerful method for any medical image classification pipeline to improve robustness and boost performance.

Submitted 13 April, 2022; v1 submitted 27 January, 2022; originally announced January 2022.

Comments: Code: https://github.com/frankkramer-lab/ensmic ; Supplementary Material: https://doi.org/10.5281/zenodo.6457912

arXiv:2201.09395 [pdf]

MISeval: a Metric Library for Medical Image Segmentation Evaluation

Authors: Dominik Müller, Dennis Hartmann, Philip Meyer, Florian Auer, Iñaki Soto-Rey, Frank Kramer

Abstract: Correct performance assessment is crucial for evaluating modern artificial intelligence algorithms in medicine like deep-learning based medical image segmentation models. However, there is no universal metric library in Python for standardized and reproducible evaluation. Thus, we propose our open-source publicly available Python package MISeval: a metric library for Medical Image Segmentation Evaluation. The implemented metrics can be intuitively used and easily integrated into any performance assessment pipeline. The package utilizes modern CI/CD strategies to ensure functionality and stability. MISeval is available from PyPI (miseval) and GitHub: https://github.com/frankkramer-lab/miseval.

Submitted 23 January, 2022; originally announced January 2022.

arXiv:2112.12493 [pdf, other]

doi 10.1051/epjconf/202225809001

Equivariance and generalization in neural networks

Authors: Srinath Bulusu, Matteo Favoni, Andreas Ipp, David I. Müller, Daniel Schuh

Abstract: The crucial role played by the underlying symmetries of high energy physics and lattice field theories calls for the implementation of such symmetries in the neural network architectures that are applied to the physical system under consideration. In these proceedings, we focus on the consequences of incorporating translational equivariance among the network properties, particularly in terms of performance and generalization. The benefits of equivariant networks are exemplified by studying a complex scalar field theory, on which various regression and classification tasks are examined. For a meaningful comparison, promising equivariant and non-equivariant architectures are identified by means of a systematic search. The results indicate that in most of the tasks our best equivariant architectures can perform and generalize significantly better than their non-equivariant counterparts, which applies not only to physical parameters beyond those represented in the training set, but also to different lattice sizes.

Submitted 23 December, 2021; originally announced December 2021.

Comments: 8 pages, 7 figures, proceedings for the 14th Quark Confinement and the Hadron Spectrum Conference (vConf2021)

arXiv:2112.12474 [pdf, other]

Generalization capabilities of neural networks in lattice applications

Authors: Srinath Bulusu, Matteo Favoni, Andreas Ipp, David I. Müller, Daniel Schuh

Abstract: In recent years, the use of machine learning has become increasingly popular in the context of lattice field theories. An essential element of such theories is represented by symmetries, whose inclusion in the neural network properties can lead to high reward in terms of performance and generalizability. A fundamental symmetry that usually characterizes physical systems on a lattice with periodic boundary conditions is equivariance under spacetime translations. Here we investigate the advantages of adopting translationally equivariant neural networks in favor of non-equivariant ones. The system we consider is a complex scalar field with quartic interaction on a two-dimensional lattice in the flux representation, on which the networks carry out various regression and classification tasks. Promising equivariant and non-equivariant architectures are identified with a systematic search. We demonstrate that in most of these tasks our best equivariant architectures can perform and generalize significantly better than their non-equivariant counterparts, which applies not only to physical parameters beyond those represented in the training set, but also to different lattice sizes.

Submitted 23 December, 2021; originally announced December 2021.

Comments: 10 pages, 7 figures, proceedings for the 38th International Symposium on Lattice Field Theory (LATTICE21)

arXiv:2112.11239 [pdf, other]

doi 10.1051/epjconf/202225809004

Preserving gauge invariance in neural networks

Authors: Matteo Favoni, Andreas Ipp, David I. Müller, Daniel Schuh

Abstract: In these proceedings we present lattice gauge equivariant convolutional neural networks (L-CNNs) which are able to process data from lattice gauge theory simulations while exactly preserving gauge symmetry. We review aspects of the architecture and show how L-CNNs can represent a large class of gauge invariant and equivariant functions on the lattice. We compare the performance of L-CNNs and non-equivariant networks using a non-linear regression problem and demonstrate how gauge invariance is broken for non-equivariant models.

Submitted 21 December, 2021; originally announced December 2021.

Comments: 8 pages, 3 figures, proceedings for vConf 2021

arXiv:2111.04389 [pdf, other]

Lattice gauge symmetry in neural networks

Authors: Matteo Favoni, Andreas Ipp, David I. Müller, Daniel Schuh

Abstract: We review a novel neural network architecture called lattice gauge equivariant convolutional neural networks (L-CNNs), which can be applied to generic machine learning problems in lattice gauge theory while exactly preserving gauge symmetry. We discuss the concept of gauge equivariance which we use to explicitly construct a gauge equivariant convolutional layer and a bilinear layer. The performance of L-CNNs and non-equivariant CNNs is compared using seemingly simple non-linear regression tasks, where L-CNNs demonstrate generalizability and achieve a high degree of accuracy in their predictions compared to their non-equivariant counterparts.

Submitted 8 November, 2021; originally announced November 2021.

Comments: 10 pages, 3 figures, proceedings for the 38th International Symposium on Lattice Field Theory (LATTICE21)

arXiv:2110.01017 [pdf]

Classification of Viral Pneumonia X-ray Images with the Aucmedi Framework

Authors: Pia Schneider, Dominik Müller, Frank Kramer

Abstract: In this work we use the AUCMEDI-Framework to train a deep neural network to classify chest X-ray images as either normal or viral pneumonia. Stratified k-fold cross-validation with k=3 is used to generate the validation-set and 15% of the data are set aside for the evaluation of the models of the different folds and ensembles each. A random-forest ensemble as well as a Soft-Majority-Vote ensemble are built from the predictions of the different folds. Evaluation metrics (Classification-Report, macro f1-scores, Confusion-Matrices, ROC-Curves) of the individual folds and the ensembles show that the classifier works well. Finally Grad-CAM and LIME explainable artificial intelligence (XAI) algorithms are applied to visualize the image features that are most important for the prediction. For Grad-CAM the heatmaps of the three folds are furthermore averaged for all images in order to calculate a mean XAI-heatmap. As the heatmaps of the different folds for most images differ only slightly this averaging procedure works well. However, only medical professionals can evaluate the quality of the features marked by the XAI. A comparison of the evaluation metrics with metrics of standard procedures such as PCR would also be important. Further limitations are discussed.

Submitted 3 October, 2021; originally announced October 2021.

Comments: 7 pages, open manuscript, student submission

arXiv:2108.13912 [pdf, other]

Automatic digital twin data model generation of building energy systems from piping and instrumentation diagrams

Authors: Florian Stinner, Martin Wiecek, Marc Baranski, Alexander Kümpel, Dirk Müller

Abstract: Buildings directly and indirectly emit a large share of current CO2 emissions. There is a high potential for CO2 savings through modern control methods in building automation systems (BAS) like model predictive control (MPC). For a proper control, MPC needs mathematical models to predict the future behavior of the controlled system. For this purpose, digital twins of the building can be used. However, with current methods in existing buildings, a digital twin set up is usually labor-intensive. Especially connecting the different components of the technical system to an overall digital twin of the building is time-consuming. Piping and instrument diagrams (P&ID) can provide the needed information, but it is necessary to extract the information and provide it in a standardized format to process it further. In this work, we present an approach to recognize symbols and connections of P&ID from buildings in a completely automated way. There are various standards for graphical representation of symbols in P&ID of building energy systems. Therefore, we use different data sources and standards to generate a holistic training data set. We apply algorithms for symbol recognition, line recognition and derivation of connections to the data sets. Furthermore, the result is exported to a format that provides semantics of building energy systems.
The symbol recognition, line recognition and connection recognition show good results with an average precision of 93.7%, which can be used in further processes like control generation, (distributed) model predictive control or fault detection. Nevertheless, the approach needs further research.

Submitted 31 August, 2021; originally announced August 2021.

Journal ref: Proceedings of ECOS 2021 - The 34th International Conference On Efficiency, Cost, Optimization, Simulation and Environmental Impact of Energy Systems, June 27-July 2, 2021, At: Taormina, Italy

arXiv:2103.16492 [pdf]

Assessing the Role of Random Forests in Medical Image Segmentation

Authors: Dennis Hartmann, Dominik Müller, Iñaki Soto-Rey, Frank Kramer

Abstract: Neural networks represent a field of research that can quickly achieve very good results in the field of medical image segmentation using a GPU. A possible way to achieve good results without GPUs are random forests. For this purpose, two random forest approaches were compared with a state-of-the-art deep convolutional neural network. To make the comparison the PhC-C2DH-U373 and the retinal imaging datasets were used. The evaluation showed that the deep convolutional neural network achieved the best results. However, one of the random forest approaches also achieved a similar high performance. Our results indicate that random forest approaches are a good alternative to deep convolutional neural networks and, thus, allow the usage of medical image segmentation without a GPU.

Submitted 30 March, 2021; originally announced March 2021.

arXiv:2103.14686 [pdf, other]

doi 10.1103/PhysRevD.104.074504

Generalization capabilities of translationally equivariant neural networks

Authors: Srinath Bulusu, Matteo Favoni, Andreas Ipp, David I. Müller, Daniel Schuh

Abstract: The rising adoption of machine learning in high energy physics and lattice field theory necessitates the re-evaluation of common methods that are widely used in computer vision, which, when applied to problems in physics, can lead to significant drawbacks in terms of performance and generalizability. One particular example for this is the use of neural network architectures that do not reflect the underlying symmetries of the given physical problem. In this work, we focus on complex scalar field theory on a two-dimensional lattice and investigate the benefits of using group equivariant convolutional neural network architectures based on the translation group. For a meaningful comparison, we conduct a systematic search for equivariant and non-equivariant neural network architectures and apply them to various regression and classification tasks. We demonstrate that in most of these tasks our best equivariant architectures can perform and generalize significantly better than their non-equivariant counterparts, which applies not only to physical parameters beyond those represented in the training set, but also to different lattice sizes.

Submitted 11 October, 2021; v1 submitted 26 March, 2021; originally announced March 2021.

Comments: 28 pages, 20 figures, v3: equivalent to the version published in PRD

Journal ref: Phys. Rev. D 104, 074504 (2021)

arXiv:2103.14660 [pdf]

Multi-Disease Detection in Retinal Imaging based on Ensembling Heterogeneous Deep Learning Models

Authors: Dominik Müller, Iñaki Soto-Rey, Frank Kramer

Abstract: Preventable or undiagnosed visual impairment and blindness affect billions of people worldwide. Automated multi-disease detection models offer great potential to address this problem via clinical decision support in diagnosis. In this work, we proposed an innovative multi-disease detection pipeline for retinal imaging which utilizes ensemble learning to combine the predictive capabilities of several heterogeneous deep convolutional neural network models. Our pipeline includes state-of-the-art strategies like transfer learning, class weighting, real-time image augmentation and Focal loss utilization. Furthermore, we integrated ensemble learning techniques like heterogeneous deep learning models, bagging via 5-fold cross-validation and stacked logistic regression models. Through internal and external evaluation, we were able to validate and demonstrate high accuracy and reliability of our pipeline, as well as the comparability with other state-of-the-art pipelines for retinal disease prediction.

Submitted 26 March, 2021; originally announced March 2021.

Comments: Code repository: https://github.com/frankkramer-lab/riadd.aucmedi Appendix: https://doi.org/10.5281/zenodo.4573990

arXiv:2103.07678 [pdf, other]

doi 10.1016/j.rse.2021.112750

A review of machine learning in processing remote sensing data for mineral exploration

Authors: Hojat Shirmard, Ehsan Farahbakhsh, R. Dietmar Muller, Rohitash Chandra

Abstract: The decline of the number of newly discovered mineral deposits and increase in demand for different minerals in recent years has led exploration geologists to look for more efficient and innovative methods for processing different data types at each stage of mineral exploration. As a primary step, various features, such as lithological units, alteration types, structures, and indicator minerals, are mapped to aid decision-making in targeting ore deposits. Different types of remote sensing datasets, such as satellite and airborne data, make it possible to overcome common problems associated with mapping geological features. The rapid increase in the volume of remote sensing data obtained from different platforms has encouraged scientists to develop advanced, innovative, and robust data processing methodologies. Machine learning methods can help process a wide range of remote sensing datasets and determine the relationship between components such as the reflectance continuum and features of interest. These methods are robust in processing spectral and ground truth measurements against noise and uncertainties. In recent years, many studies have been carried out by supplementing geological surveys with remote sensing datasets, which is now prominent in geoscience research. This paper provides a comprehensive review of the implementation and adaptation of some popular and recently established machine learning methods for processing different types of remote sensing data and investigates their applications for detecting various ore deposit types.
We demonstrate the high capability of combining remote sensing data and machine learning methods for mapping different geological features that are critical for providing potential maps. Moreover, we find there is scope for advanced methods to process the new generation of remote sensing data for creating improved mineral prospectivity maps.

Submitted 4 December, 2021; v1 submitted 13 March, 2021; originally announced March 2021.

Comments: 26 pages, 4 figures, 2 tables

Journal ref: Remote Sensing of Environment, 268, 112750 (2022)

arXiv:2101.11716 [pdf, ps, other]

Disambiguating Symbolic Expressions in Informal Documents

Authors: Dennis Müller, Cezary Kaliszyk

Abstract: We propose the task of disambiguating symbolic expressions in informal STEM documents in the form of LaTeX files - that is, determining their precise semantics and abstract syntax tree - as a neural machine translation task. We discuss the distinct challenges involved and present a dataset with roughly 33,000 entries. We evaluated several baseline models on this dataset, which failed to yield even syntactically valid LaTeX before overfitting. Consequently, we describe a methodology using a transformer language model pre-trained on sources obtained from arxiv.org, which yields promising results despite the small size of the dataset. We evaluate our model using a plurality of dedicated techniques, taking the syntax and semantics of symbolic expressions into account.

Submitted 25 January, 2021; originally announced January 2021.

Comments: ICLR 2021 conference paper

arXiv:2101.00884 [pdf, ps, other]

Coreference Resolution in Research Papers from Multiple Domains

Authors: Arthur Brack, Daniel Uwe Müller, Anett Hoppe, Ralph Ewerth

Abstract: Coreference resolution is essential for automatic text understanding to facilitate high-level information retrieval tasks such as text summarisation or question answering. Previous work indicates that the performance of state-of-the-art approaches (e.g. based on BERT) noticeably declines when applied to scientific papers. In this paper, we investigate the task of coreference resolution in research papers and subsequent knowledge graph population. We present the following contributions: (1) We annotate a corpus for coreference resolution that comprises 10 different scientific disciplines from Science, Technology, and Medicine (STM); (2) We propose transfer learning for automatic coreference resolution in research papers; (3) We analyse the impact of coreference resolution on knowledge graph (KG) population; (4) We release a research KG that is automatically populated from 55,485 papers in 10 STM domains. Comprehensive experiments show the usefulness of the proposed approach. Our transfer learning approach considerably outperforms state-of-the-art baselines on our corpus with an F1 score of 61.4 (+11.0), while the evaluation against a gold standard KG shows that coreference resolution improves the quality of the populated KG significantly with an F1 score of 63.5 (+21.8).

Submitted 4 January, 2021; originally announced January 2021.

Comments: Accepted for publication in 43rd European Conference on Information Retrieval (ECIR), 2021

arXiv:2012.12901 [pdf, other]

doi 10.1103/PhysRevLett.128.032003

Lattice gauge equivariant convolutional neural networks

Authors: Matteo Favoni, Andreas Ipp, David I. Müller, Daniel Schuh

Abstract: We propose Lattice gauge equivariant Convolutional Neural Networks (L-CNNs) for generic machine learning applications on lattice gauge theoretical problems. At the heart of this network structure is a novel convolutional layer that preserves gauge equivariance while forming arbitrarily shaped Wilson loops in successive bilinear layers. Together with topological information, for example from Polyakov loops, such a network can in principle approximate any gauge covariant function on the lattice. We demonstrate that L-CNNs can learn and generalize gauge invariant quantities that traditional convolutional neural networks are incapable of finding.

Submitted 22 November, 2021; v1 submitted 23 December, 2020; originally announced December 2020.

Comments: letter: 6 pages, 5 figures; supplementary material: 14 pages, 4 figures; replaced some figures, added supplementary material

Journal ref: Phys.Rev.Lett. 128 (2022) 3, 032003

arXiv:2011.01050 [pdf, other]

doi 10.1016/j.jsc.2020.07.011

Stronger bounds on the cost of computing Groebner bases for HFE systems

Authors: Elisa Gorla, Daniela Mueller, Christophe Petit

Abstract: We give upper bounds for the solving degree and the last fall degree of the polynomial system associated to the HFE (Hidden Field Equations) cryptosystem. Our bounds improve the known bounds for this type of systems. We also present new results on the connection between the solving degree and the last fall degree and prove that, in some cases, the solving degree is independent of coordinate changes.

Submitted 2 November, 2020; originally announced November 2020.

Comments: 15 pages

arXiv:2010.06721 [pdf, other]

Ensemble Distillation for Structured Prediction: Calibrated, Accurate, Fast-Choose Three

Authors: Steven Reich, David Mueller, Nicholas Andrews

Abstract: Modern neural networks do not always produce well-calibrated predictions, even when trained with a proper scoring function such as cross-entropy. In classification settings, simple methods such as isotonic regression or temperature scaling may be used in conjunction with a held-out dataset to calibrate model outputs. However, extending these methods to structured prediction is not always straightforward or effective; furthermore, a held-out calibration set may not always be available. In this paper, we study ensemble distillation as a general framework for producing well-calibrated structured prediction models while avoiding the prohibitive inference-time cost of ensembles. We validate this framework on two tasks: named-entity recognition and machine translation. We find that, across both tasks, ensemble distillation produces models which retain much of, and occasionally improve upon, the performance and calibration benefits of ensembles, while only requiring a single model during test-time.

Submitted 25 March, 2021; v1 submitted 13 October, 2020; originally announced October 2020.

Comments: EMNLP 2020. v2: Changed formatting of title in metadata; no other changes

arXiv:2010.05386 [pdf, other]

Multi-Objective Bayesian Optimisation and Joint Inversion for Active Sensor Fusion

Authors: Sebastian Haan, Fabio Ramos, Dietmar Müller

Abstract: A critical decision process in data acquisition for mineral and energy resource exploration is how to efficiently combine a variety of sensor types and to minimize total cost. We propose a probabilistic framework for multi-objective optimisation and inverse problems given an expensive cost function for allocating new measurements. This new method is devised to jointly solve multi-linear forward models of 2D-sensor data and 3D-geophysical properties using sparse Gaussian Process kernels while taking into account the cross-variances of different parameters. Multiple optimisation strategies are tested and evaluated on a set of synthetic and real geophysical data. We demonstrate the advantages on a specific example of a joint inverse problem, recommending where to place new drill-core measurements given 2D gravity and magnetic sensor data, the same approach can be applied to a variety of remote sensing problems with linear forward models - ranging from constraints limiting surface access for data acquisition to adaptive multi-sensor positioning.

Submitted 11 October, 2020; originally announced October 2020.

Comments: Accepted for publication in Geophysics

arXiv:2008.08045 [pdf]

Algorithm Based on One Monocular Video Delivers Highly Valid and Reliable Gait Parameters

Authors: Dr. Arash Azhand, Dr. Sophie Rabe, Dr. Swantje Müller, Igor Sattler, Dr. Anika Steinert

Abstract: Despite its paramount importance for manifold use cases (e.g., in the health care industry, sports, rehabilitation and fitness assessment), sufficiently valid and reliable gait parameter measurement is still limited to high-tech gait laboratories mostly. Here, we demonstrate the excellent validity and test-retest repeatability of a novel gait assessment system which is built upon modern convolutional neural networks to extract three-dimensional skeleton joints from monocular frontal-view videos of walking humans. The validity study is based on a comparison to the GAITRite pressure-sensitive walkway system. All measured gait parameters (gait speed, cadence, step length and step time) showed excellent concurrent validity for multiple walk trials at normal and fast gait speeds. The test-retest-repeatability is on the same level as the GAITRite system. In conclusion, we are convinced that our results can pave the way for cost, space and operationally effective gait analysis in broad mainstream applications. Most sensor-based systems are costly, must be operated by extensively trained personnel (e.g., motion capture systems) or - even if not quite as costly - still possess considerable complexity (e.g., wearable sensors). In contrast, a video sufficient for the assessment method presented here can be obtained by anyone, without much training, via a smartphone camera.

Submitted 23 June, 2021; v1 submitted 5 August, 2020; originally announced August 2020.

arXiv:2007.04774 [pdf]

doi 10.1016/j.imu.2021.100681

Automated Chest CT Image Segmentation of COVID-19 Lung Infection based on 3D U-Net

Authors: Dominik Müller, Iñaki Soto Rey, Frank Kramer

Abstract: The coronavirus disease 2019 (COVID-19) affects billions of lives around the world and has a significant impact on public healthcare. Due to rising skepticism towards the sensitivity of RT-PCR as screening method, medical imaging like computed tomography offers great potential as alternative. For this reason, automated image segmentation is highly desired as clinical decision support for quantitative assessment and disease monitoring. However, publicly available COVID-19 imaging data is limited which leads to overfitting of traditional approaches. To address this problem, we propose an innovative automated segmentation pipeline for COVID-19 infected regions, which is able to handle small datasets by utilization as variant databases. Our method focuses on on-the-fly generation of unique and random image patches for training by performing several preprocessing methods and exploiting extensive data augmentation. For further reduction of the overfitting risk, we implemented a standard 3D U-Net architecture instead of new or computationally complex neural network architectures. Through a 5-fold cross-validation on 20 CT scans of COVID-19 patients, we were able to develop a highly accurate as well as robust segmentation model for lungs and COVID-19 infected regions without overfitting on the limited data. Our method achieved Dice similarity coefficients of 0.956 for lungs and 0.761 for infection.
We demonstrated that the proposed method outperforms related approaches, advances the state-of-the-art for COVID-19 segmentation and improves medical image analysis with limited data. The code and model are available under the following link: https://github.com/frankkramer-lab/covid19.MIScnn

Submitted 24 June, 2020; originally announced July 2020.

Comments: Code repository: https://github.com/frankkramer-lab/covid19.MIScnn

Journal ref: Robust chest CT image segmentation of COVID-19 lung infection based on limited data. Informatics in Medicine Unlocked. Volume 25. 2021. https://www.sciencedirect.com/science/article/pii/S2352914821001660

arXiv:2005.00847 [pdf, other]

Sources of Transfer in Multilingual Named Entity Recognition

Authors: David Mueller, Nicholas Andrews, Mark Dredze

Abstract: Named-entities are inherently multilingual, and annotations in any given language may be limited. This motivates us to consider polyglot named-entity recognition (NER), where one model is trained using annotated data drawn from more than one language. However, a straightforward implementation of this simple idea does not always work in practice: naive training of NER models using annotated data drawn from multiple languages consistently underperforms models trained on monolingual data alone, despite having access to more training data. The starting point of this paper is a simple solution to this problem, in which polyglot models are fine-tuned on monolingual data to consistently and significantly outperform their monolingual counterparts. To explain this phenomenon, we explore the sources of multilingual transfer in polyglot NER models and examine the weight structure of polyglot models compared to their monolingual counterparts. We find that polyglot models efficiently share many parameters across languages and that fine-tuning may utilize a large number of those parameters.

Submitted 2 May, 2020; originally announced May 2020.

Comments: ACL 2020

arXiv:2002.04955 [pdf, other]

The Space of Mathematical Software Systems -- A Survey of Paradigmatic Systems

Authors: Katja Bercic, Jacques Carette, William M. Farmer, Michael Kohlhase, Dennis Müller, Florian Rabe, Yasmine Sharoda

Abstract: Mathematical software systems are becoming more and more important in pure and applied mathematics in order to deal with the complexity and scalability issues inherent in mathematics. In the last decades we have seen a Cambrian explosion of increasingly powerful but also diverging systems. To give researchers a guide to this space of systems, we devise a novel conceptualization of mathematical software that focuses on five aspects: inference covers formal logic and reasoning about mathematical statements via proofs and models, typically with strong emphasis on correctness; computation covers algorithms and software libraries for representing and manipulating mathematical objects, typically with strong emphasis on efficiency; concretization covers generating and maintaining collections of mathematical objects conforming to a certain pattern, typically with strong emphasis on complete enumeration; narration covers describing mathematical contexts and relations, typically with strong emphasis on human readability; finally, organization covers representing mathematical contexts and objects in machine-actionable formal languages, typically with strong emphasis on expressivity and system interoperability. Despite broad agreement that an ideal system would seamlessly integrate all these aspects, research has diversified into families of highly specialized systems focusing on a single aspect and possibly partially integrating others, each with their own communities, challenges, and successes.
In this survey, we focus on the commonalities and differences of these systems from the perspective of a future multi-aspect system.

Submitted 12 February, 2020; originally announced February 2020.

arXiv:1910.10850 [pdf, other]

doi 10.4204/EPTCS.307.5

Rapid Prototyping Formal Systems in MMT: 5 Case Studies

Authors: Dennis Müller, Florian Rabe

Abstract: Logical frameworks are meta-formalisms in which the syntax and semantics of object logics and related formal systems can be defined. This allows object logics to inherit implementations from the framework including, e.g., parser, type checker, or module system. But if the desired object logic falls outside the comfort zone of the logical framework, these definitions may become cumbersome or infeasible. Therefore, the MMT system abstracts even further than previous frameworks: it assumes no type system or logic at all and allows its kernel algorithms to be customized by almost arbitrary sets of rules. In particular, this allows implementing standard logical frameworks like LF in MMT. But it does so without chaining users to one particular meta-formalism: users can flexibly adapt MMT whenever the object logic demands it. In this paper, we present a series of case studies that do just that, defining increasingly complex object logics in MMT. We use elegant declarative logic definitions wherever possible, but inject entirely new rules into the kernel when necessary. Our experience shows that the MMT approach allows deriving prototype implementations of very diverse formal systems very easily and quickly.

Submitted 23 October, 2019; originally announced October 2019.

Comments: In Proceedings LFMTP 2019, arXiv:1910.08712

Journal ref: EPTCS 307, 2019, pp. 40-54

Showing 1–50 of 65 results for author: Müller, D