Search | arXiv e-print repository

A Data-Centric Perspective on Evaluating Machine Learning Models for Tabular Data

Authors: Andrej Tschalzev, Sascha Marton, Stefan Lüdtke, Christian Bartelt, Heiner Stuckenschmidt

Abstract: Tabular data is prevalent in real-world machine learning applications, and new models for supervised learning of tabular data are frequently proposed. Comparative studies assessing the performance of models typically consist of model-centric evaluation setups with overly standardized data preprocessing. This paper demonstrates that such model-centric evaluations are biased, as real-world modeling pipelines often require dataset-specific preprocessing and feature engineering. Therefore, we propose a data-centric evaluation framework. We select 10 relevant datasets from Kaggle competitions and implement expert-level preprocessing pipelines for each dataset. We conduct experiments with different preprocessing pipelines and hyperparameter optimization (HPO) regimes to quantify the impact of model selection, HPO, feature engineering, and test-time adaptation. Our main findings are: 1. After dataset-specific feature engineering, model rankings change considerably, performance differences decrease, and the importance of model selection reduces. 2. Recent models, despite their measurable progress, still significantly benefit from manual feature engineering. This holds true for both tree-based models and neural networks. 3. While tabular data is typically considered static, samples are often collected over time, and adapting to distribution shifts can be important even in supposedly static data. These insights suggest that research efforts should be directed toward a data-centric perspective, acknowledging that tabular data requires feature engineering and often exhibits temporal characteristics.

Submitted 2 July, 2024; originally announced July 2024.

arXiv:2407.01115 [pdf, other]

Enabling Mixed Effects Neural Networks for Diverse, Clustered Data Using Monte Carlo Methods

Authors: Andrej Tschalzev, Paul Nitschke, Lukas Kirchdorfer, Stefan Lüdtke, Christian Bartelt, Heiner Stuckenschmidt

Abstract: Neural networks often assume independence among input data samples, disregarding correlations arising from inherent clustering patterns in real-world datasets (e.g., due to different sites or repeated measurements). Recently, mixed effects neural networks (MENNs) which separate cluster-specific 'random effects' from cluster-invariant 'fixed effects' have been proposed to improve generalization and interpretability for clustered data. However, existing methods only allow for approximate quantification of cluster effects and are limited to regression and binary targets with only one clustering feature. We present MC-GMENN, a novel approach employing Monte Carlo methods to train Generalized Mixed Effects Neural Networks. We empirically demonstrate that MC-GMENN outperforms existing mixed effects deep learning models in terms of generalization performance, time complexity, and quantification of inter-cluster variance. Additionally, MC-GMENN is applicable to a wide range of datasets, including multi-class classification tasks with multiple high-cardinality categorical features. For these datasets, we show that MC-GMENN outperforms conventional encoding and embedding methods, simultaneously offering a principled methodology for interpreting the effects of clustering patterns.

Submitted 1 July, 2024; originally announced July 2024.

arXiv:2311.17640 [pdf]

doi 10.1107/S2052252524001246

Community recommendations on cryoEM data archiving and validation

Authors: Gerard J. Kleywegt, Paul D. Adams, Sarah J. Butcher, Cathy Lawson, Alexis Rohou, Peter B. Rosenthal, Sriram Subramaniam, Maya Topf, Sanja Abbott, Philip R. Baldwin, John M. Berrisford, Gérard Bricogne, Preeti Choudhary, Tristan I. Croll, Radostin Danev, Sai J. Ganesan, Timothy Grant, Aleksandras Gutmanas, Richard Henderson, J. Bernard Heymann, Juha T. Huiskonen, Andrei Istrate, Takayuki Kato, Gabriel C. Lander, Shee-Mei Lok, et al. (22 additional authors not shown)

Abstract: In January 2020, a workshop was held at EMBL-EBI (Hinxton, UK) to discuss data requirements for deposition and validation of cryoEM structures, with a focus on single-particle analysis. The meeting was attended by 45 experts in data processing, model building and refinement, validation, and archiving of such structures. This report describes the workshop's motivation and history, the topics discussed, and consensus recommendations resulting from the workshop. Some challenges for future methods-development efforts in this area are also highlighted, as is the implementation to date of some of the recommendations.

Submitted 2 February, 2024; v1 submitted 29 November, 2023; originally announced November 2023.

Comments: Outcomes of a wwPDB/EMDB workshop on cryoEM data management, deposition and validation

arXiv:2309.17130 [pdf, other]

GRANDE: Gradient-Based Decision Tree Ensembles for Tabular Data

Authors: Sascha Marton, Stefan Lüdtke, Christian Bartelt, Heiner Stuckenschmidt

Abstract: Despite the success of deep learning for text and image data, tree-based ensemble models are still state-of-the-art for machine learning with heterogeneous tabular data. However, there is a significant need for tabular-specific gradient-based methods due to their high flexibility. In this paper, we propose GRANDE, GRAdieNt-Based Decision Tree Ensembles, a novel approach for learning hard, axis-aligned decision tree ensembles using end-to-end gradient descent. GRANDE is based on a dense representation of tree ensembles, which makes it possible to use backpropagation with a straight-through operator to jointly optimize all model parameters. Our method combines axis-aligned splits, which are a useful inductive bias for tabular data, with the flexibility of gradient-based optimization. Furthermore, we introduce an advanced instance-wise weighting that facilitates learning representations for both simple and complex relations within a single model. We conducted an extensive evaluation on a predefined benchmark with 19 classification datasets and demonstrate that our method outperforms existing gradient-boosting and deep learning frameworks on most datasets. The method is available under: https://github.com/s-marton/GRANDE

Submitted 12 March, 2024; v1 submitted 29 September, 2023; originally announced September 2023.

arXiv:2309.00306 [pdf, ps, other]

On the Aggregation of Rules for Knowledge Graph Completion

Authors: Patrick Betz, Stefan Lüdtke, Christian Meilicke, Heiner Stuckenschmidt

Abstract: Rule learning approaches for knowledge graph completion are efficient, interpretable and competitive with purely neural models. The rule aggregation problem is concerned with finding one plausibility score for a candidate fact which was simultaneously predicted by multiple rules. Although the problem is ubiquitous, as data-driven rule learning can result in noisy and large rulesets, it is underrepresented in the literature and its theoretical foundations have not been studied before in this context. In this work, we demonstrate that existing aggregation approaches can be expressed as marginal inference operations over the predicting rules. In particular, we show that the common Max-aggregation strategy, which scores candidates based on the rule with the highest confidence, has a probabilistic interpretation. Finally, we propose an efficient and overlooked baseline which combines the previous strategies and is competitive with computationally more expensive approaches.

Submitted 1 September, 2023; originally announced September 2023.

Comments: KLR Workshop@ICML2023
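
As a minimal sketch of the rule aggregation problem described in the abstract above: Max-aggregation scores a candidate fact by the most confident rule that predicts it. The noisy-or variant shown next to it is a commonly used alternative included only for comparison; it is not claimed to be the baseline the paper proposes. All function names and confidence values are hypothetical.

```python
from math import prod

def max_aggregation(confidences):
    """Score a candidate by the single most confident rule that predicts it."""
    return max(confidences)

def noisy_or_aggregation(confidences):
    """Score a candidate as 1 - prod(1 - c), treating the rules as independent."""
    return 1.0 - prod(1.0 - c for c in confidences)

# Hypothetical confidences of three rules that all predict the same candidate fact.
rule_confidences = [0.5, 0.5, 0.2]
max_score = max_aggregation(rule_confidences)
noisy_or_score = noisy_or_aggregation(rule_confidences)
```

With these illustrative values, Max-aggregation ignores the two weaker rules, whereas noisy-or lets every predicting rule raise the score.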

arXiv:2308.03403 [pdf, other]

Towards Machine Learning-based Fish Stock Assessment

Authors: Stefan Lüdtke, Maria E. Pierce

Abstract: The accurate assessment of fish stocks is crucial for sustainable fisheries management. However, existing statistical stock assessment models can have low forecast performance of relevant stock parameters like recruitment or spawning stock biomass, especially in ecosystems that are changing due to global warming and other anthropogenic stressors. In this paper, we investigate the use of machine learning models to improve the estimation and forecast of such stock parameters. We propose a hybrid model that combines classical statistical stock assessment models with supervised ML, specifically gradient boosted trees. Our hybrid model leverages the initial estimate provided by the classical model and uses the ML model to make a post-hoc correction to improve accuracy. We experiment with five different stocks and find that the forecast accuracy of recruitment and spawning stock biomass improves considerably in most cases.

Submitted 7 August, 2023; originally announced August 2023.

Comments: Accepted at Fragile Earth Workshop 2023

arXiv:2305.03515 [pdf, other]

GradTree: Learning Axis-Aligned Decision Trees with Gradient Descent

Authors: Sascha Marton, Stefan Lüdtke, Christian Bartelt, Heiner Stuckenschmidt

Abstract: Decision Trees (DTs) are commonly used for many machine learning tasks due to their high degree of interpretability. However, learning a DT from data is a difficult optimization problem, as it is non-convex and non-differentiable. Therefore, common approaches learn DTs using a greedy growth algorithm that minimizes the impurity locally at each internal node. Unfortunately, this greedy procedure can lead to inaccurate trees. In this paper, we present a novel approach for learning hard, axis-aligned DTs with gradient descent. The proposed method uses backpropagation with a straight-through operator on a dense DT representation, to jointly optimize all tree parameters. Our approach outperforms existing methods on binary classification benchmarks and achieves competitive results for multi-class tasks. The method is available under: https://github.com/s-marton/GradTree

Submitted 12 March, 2024; v1 submitted 5 May, 2023; originally announced May 2023.
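
The straight-through idea mentioned in the abstract above can be illustrated on a single split. This is a minimal sketch under stated assumptions, not the GradTree implementation: the forward pass uses a hard 0/1 routing decision, while the backward pass substitutes the gradient of a sigmoid surrogate. The choice of surrogate, the hand-written gradient, and all names and numbers are hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def split_forward(x, threshold):
    """Forward pass: hard routing decision of an axis-aligned split (1.0 = go right)."""
    return 1.0 if x > threshold else 0.0

def split_backward(x, threshold):
    """Backward pass: gradient of the surrogate sigmoid(x - threshold) w.r.t. threshold."""
    s = sigmoid(x - threshold)
    return -s * (1.0 - s)

# Hypothetical one-split tree with two leaf values and a single training sample.
x, y = 0.4, 1.0
threshold, leaf_left, leaf_right = 0.5, 0.0, 1.0
learning_rate = 0.5

decision = split_forward(x, threshold)                        # hard decision in the forward pass
prediction = decision * leaf_right + (1.0 - decision) * leaf_left
# Chain rule for the squared error, with the surrogate gradient standing in
# for the (zero almost everywhere) gradient of the hard step function.
grad_threshold = 2.0 * (prediction - y) * (leaf_right - leaf_left) * split_backward(x, threshold)
new_threshold = threshold - learning_rate * grad_threshold
```

In this toy setting, one gradient step moves the threshold below the sample, so the hard split then routes it to the leaf that matches its label.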

arXiv:2301.10571 [pdf, other]

Leveraging Planning Landmarks for Hybrid Online Goal Recognition

Authors: Nils Wilken, Lea Cohausz, Johannes Schaum, Stefan Lüdtke, Christian Bartelt, Heiner Stuckenschmidt

Abstract: Goal recognition is an important problem in many application domains (e.g., pervasive computing, intrusion detection, computer games, etc.). In many application scenarios it is important that goal recognition algorithms can recognize goals of an observed agent as fast as possible and with minimal domain knowledge. Hence, in this paper, we propose a hybrid method for online goal recognition that combines a symbolic planning landmark based approach and a data-driven goal recognition approach and evaluate it in a real-world cooking scenario. The empirical results show that the proposed method is not only significantly more efficient in terms of computation time than the state-of-the-art but also improves goal recognition performance. Furthermore, we show that the utilized planning landmark based approach, which was so far only evaluated on artificial benchmark domains, also achieves good recognition performance when applied to a real-world cooking scenario.

Submitted 25 January, 2023; originally announced January 2023.

Comments: 9 pages. Presented at SPARK 2022 (https://icaps22.icaps-conference.org/workshops/SPARK/)

arXiv:2301.05608 [pdf, other]

Investigating the Combination of Planning-Based and Data-Driven Methods for Goal Recognition

Authors: Nils Wilken, Lea Cohausz, Johannes Schaum, Stefan Lüdtke, Heiner Stuckenschmidt

Abstract: An important feature of pervasive, intelligent assistance systems is the ability to dynamically adapt to the current needs of their users. Hence, it is critical for such systems to be able to recognize those goals and needs based on observations of the user's actions and state of the environment. In this work, we investigate the application of two state-of-the-art, planning-based plan recognition approaches in a real-world setting. So far, these approaches were only evaluated in artificial settings in combination with agents that act perfectly rationally. We show that such approaches have difficulties when used to recognize the goals of human subjects, because human behaviour is typically not perfectly rational. To overcome this issue, we propose an extension to the existing approaches through a classification-based method trained on observed behaviour data. We empirically show that the proposed extension not only outperforms the purely planning-based and purely data-driven goal recognition methods but is also able to recognize the correct goal more reliably, especially when only a small number of observations were seen. This substantially improves the usefulness of hybrid goal recognition approaches for intelligent assistance systems, as recognizing a goal early opens up many more possibilities for supportive reactions of the system.

Submitted 13 January, 2023; originally announced January 2023.

arXiv:2210.16425 [pdf, other]

Feasibility study for the hard x-ray free electron laser based on synergistic use of conventional and plasma accelerator technologies

Authors: Nikolai Yampolsky, Sandra Biedron, Bjorn Manuel Hegelich, Scott Luedtke, Evgenya Simakov, Stephen Milton

Abstract: We assess the possibility of using the conventional RF accelerator as an injector for the plasma driven wakefield accelerator. Conventional accelerators deliver high quality beams with low emittance and low energy spread. Once injected into the plasma wake, the emittance may be preserved upon proper beam matching, while the energy spread may not be, due to the long beam duration delivered by the conventional accelerators. Parameters of the overall accelerator system and the free electron laser based on such an accelerator are estimated.

Submitted 28 October, 2022; originally announced October 2022.

Comments: Internal report from Los Alamos National Laboratory. LA-UR-22-30406

arXiv:2207.11492 [pdf]

doi 10.1063/5.0161687

High-charge 10 GeV electron acceleration in a 10 cm nanoparticle-assisted hybrid wakefield accelerator

Authors: Constantin Aniculaesei, Thanh Ha, Samuel Yoffe, Edward McCary, Michael M Spinks, Hernan J. Quevedo, Lance Labun, Ou Z. Labun, Ritwik Sain, Andrea Hannasch, Rafal Zgadzaj, Isabella Pagano, Jose A. Franco-Altamirano, Martin L. Ringuette, Erhart Gaul, Scott V. Luedtke, Ganesh Tiwari, Bernhard Ersfeld, Enrico Brunetti, Hartmut Ruhl, Todd Ditmire, Sandra Bruce, Michael E. Donovan, Dino A. Jaroszynski, Michael C. Downer, et al. (1 additional author not shown)

Abstract: In an electron wakefield accelerator, an intense laser pulse or charged particle beam excites plasma waves. Under proper conditions, electrons from the background plasma are trapped in the plasma wave and accelerated to ultra-relativistic velocities. We present recent results from a proof-of-principle wakefield acceleration experiment that reveal a unique synergy between a laser-driven and particle-driven accelerator: a high-charge laser-wakefield accelerated electron bunch can drive its own wakefield while simultaneously drawing energy from the laser pulse via direct laser acceleration. This process continues to accelerate electrons beyond the usual decelerating phase of the wakefield, thus reaching much higher energies. We find that the 10-centimeter-long nanoparticle-assisted wakefield accelerator can generate 340 pC, 10.4 ± 0.6 GeV electron bunches with 3.4 GeV RMS convolved energy spread and 0.9 mrad RMS divergence. It can also produce bunches with lower energy, a few percent energy spread, and a higher charge. This synergistic mechanism and the simplicity of the experimental setup represent a step closer to compact tabletop particle accelerators suitable for applications requiring high charge at high energies, such as free electron lasers or radiation sources producing muon beams.

Submitted 18 August, 2023; v1 submitted 23 July, 2022; originally announced July 2022.

Journal ref: Matter Radiat. Extremes 9, 014001 (2024)

arXiv:2207.08816 [pdf, other]

Discovering Behavioral Predispositions in Data to Improve Human Activity Recognition

Authors: Maximilian Popko, Sebastian Bader, Stefan Lüdtke, Thomas Kirste

Abstract: The automatic, sensor-based assessment of challenging behavior of persons with dementia is an important task to support the selection of interventions. However, predicting behaviors like apathy and agitation is challenging due to the large inter- and intra-patient variability. The goal of this paper is to improve the recognition performance by making use of the observation that patients tend to show specific behaviors at certain times of the day or week. We propose to identify such segments of similar behavior via clustering the distributions of annotations of the time segments. All time segments within a cluster then consist of similar behaviors and thus indicate a behavioral predisposition (BPD). We utilize BPDs by training a classifier for each BPD. Empirically, we demonstrate that when the BPD per time segment is known, activity recognition performance can be substantially improved.

Submitted 18 July, 2022; originally announced July 2022.

Comments: Submitted to iWOAR 2022 - 7th international Workshop on Sensor-Based Activity Recognition and Artificial Intelligence

arXiv:2207.08414 [pdf, other]

Outlier Explanation via Sum-Product Networks

Authors: Stefan Lüdtke, Christian Bartelt, Heiner Stuckenschmidt

Abstract: Outlier explanation is the task of identifying a set of features that distinguish a sample from normal data, which is important for downstream (human) decision-making. Existing methods are based on beam search in the space of feature subsets. They quickly become computationally expensive, as they require running an outlier detection algorithm from scratch for each feature subset. To alleviate this problem, we propose a novel outlier explanation algorithm based on Sum-Product Networks (SPNs), a class of probabilistic circuits. Our approach leverages the tractability of marginal inference in SPNs to compute outlier scores in feature subsets. By using SPNs, it becomes feasible to perform backwards elimination instead of the usual forward beam search, which is less susceptible to missing relevant features in an explanation, especially when the number of features is large. We empirically show that our approach achieves state-of-the-art results for outlier explanation, outperforming recent search-based as well as deep learning-based explanation methods.

Submitted 18 July, 2022; originally announced July 2022.
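
The tractable marginal inference that the abstract above relies on can be sketched with a toy SPN over two binary variables. Marginalizing a variable amounts to letting its leaves evaluate to 1, so the likelihood of any feature subset costs a single bottom-up pass. The structure, weights, and function names here are hypothetical, and the sketch does not reproduce the paper's outlier score or search procedure.

```python
def bernoulli_leaf(p_true, value):
    """Leaf likelihood; value=None marginalizes the variable out (leaf returns 1)."""
    if value is None:
        return 1.0
    return p_true if value == 1 else 1.0 - p_true

def spn_likelihood(x1, x2):
    """Toy SPN: a sum node over two product nodes of independent Bernoulli leaves."""
    component_a = bernoulli_leaf(0.9, x1) * bernoulli_leaf(0.2, x2)
    component_b = bernoulli_leaf(0.1, x1) * bernoulli_leaf(0.7, x2)
    return 0.6 * component_a + 0.4 * component_b

joint = spn_likelihood(1, 1)                                # P(X1=1, X2=1)
marginal_x1 = spn_likelihood(1, None)                       # P(X1=1) in one pass
brute_force = spn_likelihood(1, 0) + spn_likelihood(1, 1)   # same value by explicit summation
```

The single-pass marginal agrees with explicit summation over the marginalized variable, which is what makes scoring many feature subsets with one model inexpensive.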

arXiv:2206.04891 [pdf, other]

doi 10.1007/s10994-023-06428-4

Explaining Neural Networks without Access to Training Data

Authors: Sascha Marton, Stefan Lüdtke, Christian Bartelt, Andrej Tschalzev, Heiner Stuckenschmidt

Abstract: We consider generating explanations for neural networks in cases where the network's training data is not accessible, for instance due to privacy or safety issues. Recently, $\mathcal{I}$-Nets have been proposed as a sample-free approach to post-hoc, global model interpretability that does not require access to training data. They formulate interpretation as a machine learning task that maps network representations (parameters) to a representation of an interpretable function. In this paper, we extend the $\mathcal{I}$-Net framework to the cases of standard and soft decision trees as surrogate models. We propose a suitable decision tree representation and design of the corresponding $\mathcal{I}$-Net output layers. Furthermore, we make $\mathcal{I}$-Nets applicable to real-world tasks by considering more realistic distributions when generating the $\mathcal{I}$-Net's training data. We empirically evaluate our approach against traditional global, post-hoc interpretability approaches and show that it achieves superior results when the training data is not accessible.

Submitted 10 June, 2022; originally announced June 2022.

Journal ref: Machine Learning (2024)

arXiv:2202.00332 [pdf, other]

Activity Recognition in Assembly Tasks by Bayesian Filtering in Multi-Hypergraphs

Authors: Timon Felske, Stefan Lüdtke, Sebastian Bader, Thomas Kirste

Abstract: We study sensor-based human activity recognition in manual work processes like assembly tasks. In such processes, the system states often have a rich structure, involving object properties and relations. Thus, estimating the hidden system state from sensor observations by recursive Bayesian filtering can be very challenging, due to the combinatorial explosion in the number of system states. To alleviate this problem, we propose an efficient Bayesian filtering model for such processes. In our approach, system states are represented by multi-hypergraphs, and the system dynamics is modeled by graph rewriting rules. We show a preliminary concept that allows distributions over multi-hypergraphs to be represented more compactly than by full enumeration, and present an inference algorithm that works directly on this compact representation. We demonstrate the applicability of the algorithm on a real dataset.

Submitted 1 February, 2022; originally announced February 2022.

Comments: Accepted for presentation at the 2nd GCLR workshop in conjunction with AAAI 2022
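
For orientation, recursive Bayesian filtering as referred to in the abstract above alternates a predict step and an update step. The sketch below runs one such step over a fully enumerated two-state space, which is exactly the representation that stops scaling when states are structured objects; it does not implement the multi-hypergraph representation. State names and probabilities are hypothetical.

```python
def bayes_filter_step(belief, transition, likelihood):
    """One predict/update step of a discrete recursive Bayesian filter."""
    states = list(belief)
    # Predict: push the current belief through the transition model.
    predicted = {
        s_next: sum(belief[s] * transition[s][s_next] for s in states)
        for s_next in states
    }
    # Update: weight by the observation likelihood and renormalize.
    unnormalized = {s: predicted[s] * likelihood[s] for s in states}
    total = sum(unnormalized.values())
    return {s: v / total for s, v in unnormalized.items()}

belief = {"idle": 0.5, "assembling": 0.5}
transition = {
    "idle": {"idle": 0.8, "assembling": 0.2},
    "assembling": {"idle": 0.2, "assembling": 0.8},
}
likelihood = {"idle": 0.1, "assembling": 0.9}  # P(observation | state)
posterior = bayes_filter_step(belief, transition, likelihood)
```

Each step touches every pair of states, so the cost grows quadratically with the number of enumerated states, which motivates more compact representations.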

arXiv:2111.04564 [pdf, other]

Human Activity Recognition using Attribute-Based Neural Networks and Context Information

Authors: Stefan Lüdtke, Fernando Moya Rueda, Waqas Ahmed, Gernot A. Fink, Thomas Kirste

Abstract: We consider human activity recognition (HAR) from wearable sensor data in manual-work processes, like warehouse order-picking. Such structured domains can often be partitioned into distinct process steps, e.g., packaging or transporting. Each process step can have a different prior distribution over activity classes, e.g., standing or walking, and different system dynamics. Here, we show how such context information can be integrated systematically into a deep neural network-based HAR system. Specifically, we propose a hybrid architecture that combines a deep neural network, which estimates high-level movement descriptors (attributes) from the raw sensor data, and a shallow classifier, which predicts activity classes from the estimated attributes and (optional) context information, like the currently executed process step. We empirically show that our proposed architecture increases HAR performance, compared to state-of-the-art methods. Additionally, we show that HAR performance can be further increased when information about process steps is incorporated, even when that information is only partially correct.

Submitted 28 October, 2021; originally announced November 2021.

Comments: 3rd International Workshop on Deep Learning for Human Activity Recognition

arXiv:2110.05165 [pdf, other]

Exchangeability-Aware Sum-Product Networks

Authors: Stefan Lüdtke, Christian Bartelt, Heiner Stuckenschmidt

Abstract: Sum-Product Networks (SPNs) are expressive probabilistic models that provide exact, tractable inference. They achieve this efficiency by making use of local independence. On the other hand, mixtures of exchangeable variable models (MEVMs) are a class of tractable probabilistic models that make use of exchangeability of discrete random variables to render inference tractable. Exchangeability, which arises naturally in relational domains, has not been considered for efficient representation and inference in SPNs yet. The contribution of this paper is a novel probabilistic model which we call Exchangeability-Aware Sum-Product Networks (XSPNs). It contains both SPNs and MEVMs as special cases, and combines the ability of SPNs to efficiently learn deep probabilistic models with the ability of MEVMs to efficiently handle exchangeable random variables. We introduce a structure learning algorithm for XSPNs and empirically show that they can be more accurate than conventional SPNs when the data contains repeated, interchangeable parts.

Submitted 28 April, 2022; v1 submitted 11 October, 2021; originally announced October 2021.

Comments: accepted at IJCAI 2022

arXiv:2102.13133 [pdf, other]

doi 10.1109/TPDS.2021.3084795

VPIC 2.0: Next Generation Particle-in-Cell Simulations

Authors: Robert Bird, Nigel Tan, Scott V. Luedtke, Stephen Lien Harrell, Michela Taufer, Brian Albright

Abstract: VPIC is a general purpose Particle-in-Cell simulation code for modeling plasma phenomena such as magnetic reconnection, fusion, solar weather, and laser-plasma interaction in three dimensions using large numbers of particles. VPIC's capacity in both fidelity and scale makes it particularly well-suited for plasma research on pre-exascale and exascale platforms. In this paper we demonstrate the unique challenges involved in preparing the VPIC code for operation at exascale, outlining important optimizations to make VPIC efficient on accelerators. Specifically, we show the work undertaken in adapting VPIC to exploit the portability-enabling framework Kokkos and highlight the enhancements to VPIC's modeling capabilities to achieve performance at exascale. We assess the achieved performance-portability trade-off through a suite of studies on nine different varieties of modern pre-exascale hardware. Our performance-portability study includes weak-scaling runs on three of the top ten TOP500 supercomputers, as well as a comparison of low-level system performance of hardware from four different vendors.

Submitted 25 February, 2021; originally announced February 2021.

arXiv:2101.10356 [pdf]

doi 10.1038/s41592-021-01220-5

Deep learning based mixed-dimensional GMM for characterizing variability in CryoEM

Authors: Muyuan Chen, Steven Ludtke

Abstract: Structural flexibility and/or dynamic interactions with other molecules is a critical aspect of protein function. CryoEM provides direct visualization of individual macromolecules sampling different conformational and compositional states. While numerous methods are available for computational classification of discrete states, characterization of continuous conformational changes or large numbers of discrete states without human supervision remains challenging. Here we present e2gmm, a machine learning algorithm to determine a conformational landscape for proteins or complexes using a 3-D Gaussian mixture model mapped onto 2-D particle images in known orientations. Using a deep neural network architecture, e2gmm can automatically resolve the structural heterogeneity within the protein complex and map particles onto a small latent space describing conformational and compositional changes. This system presents a more intuitive and flexible representation than other manifold methods currently in use. We demonstrate this method on both simulated data as well as three biological systems, to explore compositional and conformational changes at a range of scales. The software is distributed as part of EMAN2.

Submitted 23 May, 2021; v1 submitted 25 January, 2021; originally announced January 2021.

Comments: 31 pages, 5 main figures and 8 supplementary figures

Journal ref: Nature Methods 18, 930-936 (2021)

arXiv:2006.14114 [pdf, other]

doi 10.1103/PhysRevResearch.3.L032061

Creating QED Photon Jets with Present-Day Lasers

Authors: Scott V. Luedtke, Lin Yin, Lance A. Labun, Ou Z. Labun, B. J. Albright, Robert F. Bird, W. D. Nystrom, Björn Manuel Hegelich

Abstract: Large-scale, relativistic particle-in-cell simulations with quantum electrodynamics (QED) models show that high energy ($1 < E_\gamma \lesssim 75$ MeV) QED photon jets with a flux of $10^{12}$ sr$^{-1}$ can be created with present-day lasers and planar, unstructured targets. This process involves a self-forming channel in the target in response to a laser pulse focused tightly ($f$ number unity) onto the target surface. We show the self-formation of a channel to be robust to experimentally motivated variations in preplasma, angle of incidence, and laser stability, and present in simulations using historical shot data from the Texas Petawatt. We estimate that a detectable photon flux in the 10s of MeV range will require about 60 J in a 150 fs pulse.

Submitted 15 September, 2021; v1 submitted 24 June, 2020; originally announced June 2020.

Comments: 5 pages, 4 figures

Journal ref: Phys. Rev. Research 3, 032061 (2021)

arXiv:1902.03978 [pdf]

doi 10.1038/s41592-019-0591-8

A complete data processing workflow for CryoET and subtomogram averaging

Authors: Muyuan Chen, James M. Bell, Xiaodong Shi, Stella Y. Sun, Zhao Wang, Steven J. Ludtke

Abstract: Electron cryotomography (CryoET) is currently the only method capable of visualizing cells in 3D at nanometer resolutions. While modern instruments produce massive amounts of tomography data containing extremely rich structural information, the data processing is very labor intensive and results are often limited by the skills of the personnel rather than the data. We present an integrated workflow that covers the entire tomography data processing pipeline, from automated tilt series alignment to subnanometer resolution subtomogram averaging. This workflow greatly reduces human effort and increases throughput, and is capable of determining protein structures at state-of-the-art resolutions for both purified macromolecules and cells.

Submitted 11 February, 2019; originally announced February 2019.

Comments: 21 pages, 4+2 figures

Journal ref: Nature Methods 16 (2019) 1161-1168

arXiv:1808.07067 [pdf, other]

Jet Observable for Photons from High-Intensity Laser-Plasma Interactions

Authors: Scott V. Luedtke, Lance A. Labun, Ou Z. Labun, Karl-Ulrich Bamberg, Hartmut Ruhl, Björn Manuel Hegelich

Abstract: The goals of discovering quantum radiation dynamics in high-intensity laser-plasma interactions and engineering new laser-driven high-energy particle sources both require accurate and robust predictions. Experiments rely on particle-in-cell simulations to predict and interpret outcomes, but unknowns in modeling the interaction limit the simulations to qualitative predictions, too uncertain to test the quantum theory. To establish a basis for quantitative prediction, we introduce a `jet' observable that parameterizes the emitted photon distribution and quantifies a highly directional flux of high-energy photon emission. Jets are identified by the observable under a variety of physical conditions and shown to be most prominent when the laser pulse forms a wavelength-scale channel through the target. The highest energy photons are generally emitted in the direction of the jet. The observable is compatible with characteristics of photon emission from quantum theory. This work offers quantitative guidance for the design of experiments and detectors, offering a foundation to use photon emission to interpret dynamics during high-intensity laser-plasma experiments and validate quantum radiation theory in strong fields.

Submitted 21 August, 2018; originally announced August 2018.

arXiv:1804.06748 [pdf, other]

doi 10.1613/jair.1.11261

State-Space Abstractions for Probabilistic Inference: A Systematic Review

Authors: Stefan Lüdtke, Max Schröder, Frank Krüger, Sebastian Bader, Thomas Kirste

Abstract: Tasks such as social network analysis, human behavior recognition, or modeling biochemical reactions, can be solved elegantly by using the probabilistic inference framework. However, standard probabilistic inference algorithms work at a propositional level, and thus cannot capture the symmetries and redundancies that are present in these tasks. Algorithms that exploit those symmetries have been devised in different research fields, for example by the lifted inference-, multiple object tracking-, and modeling and simulation-communities. The common idea, that we call state space abstraction, is to perform inference over compact representations of sets of symmetric states. Although they are concerned with a similar topic, the relationship between these approaches has not been investigated systematically. This survey provides the following contributions. We perform a systematic literature review to outline the state of the art in probabilistic inference methods exploiting symmetries. From an initial set of more than 4,000 papers, we identify 116 relevant papers. Furthermore, we provide new high-level categories that classify the approaches, based on common properties of the approaches. The research areas underlying each of the categories are introduced concisely. Researchers from different fields that are confronted with a state space explosion problem in a probabilistic system can use this classification to identify possible solutions.
Finally, based on this conceptualization, we identify potentials for future research, as some relevant application domains are not addressed by current approaches.

Submitted 4 December, 2018; v1 submitted 18 April, 2018; originally announced April 2018.

arXiv:1801.10495 [pdf, other]

Lifted Filtering via Exchangeable Decomposition

Authors: Stefan Lüdtke, Max Schröder, Sebastian Bader, Kristian Kersting, Thomas Kirste

Abstract: We present a model for exact recursive Bayesian filtering based on lifted multiset states. Combining multisets with lifting makes it possible to simultaneously exploit multiple strategies for reducing inference complexity when compared to list-based grounded state representations. The core idea is to borrow the concept of Maximally Parallel Multiset Rewriting Systems and to enhance it by concepts from Rao-Blackwellization and Lifted Inference, giving a representation of state distributions that enables efficient inference. In worlds where the random variables that define the system state are exchangeable -- where the identity of entities does not matter -- it automatically uses a representation that abstracts from ordering (achieving an exponential reduction in complexity) -- and it automatically adapts when observations or system dynamics destroy exchangeability by breaking symmetry.

Submitted 7 May, 2018; v1 submitted 31 January, 2018; originally announced January 2018.

arXiv:1707.06446 [pdf, other]

Sequential Lifted Bayesian Filtering in Multiset Rewriting Systems

Authors: Max Schröder, Stefan Lüdtke, Sebastian Bader, Frank Krüger, Thomas Kirste

Abstract: Bayesian Filtering for plan and activity recognition is challenging for scenarios that contain many observation equivalent entities (i.e. entities that produce the same observations). This is due to the combinatorial explosion in the number of hypotheses that need to be tracked. However, this class of problems exhibits a certain symmetry that can be exploited for state space representation and inference. We analyze current state of the art methods and find that none of them completely fits the requirements arising in this problem class. We sketch a novel inference algorithm that provides a solution by incorporating concepts from Lifted Inference algorithms, Probabilistic Multiset Rewriting Systems, and Computational State Space Models. Two experiments confirm that this novel algorithm has the potential to perform efficient probabilistic inference on this problem class.

Submitted 14 August, 2017; v1 submitted 20 July, 2017; originally announced July 2017.

Comments: 7 pages, 3 figures, accepted at UAI-17 Statistical Relational AI (StarAI) workshop

arXiv:1701.05567 [pdf]

doi 10.1038/nmeth.4405

Convolutional Neural Networks for Automated Annotation of Cellular Cryo-Electron Tomograms

Authors: Muyuan Chen, Wei Dai, Ying Sun, Darius Jonasch, Cynthia Y He, Michael F. Schmid, Wah Chiu, Steven J Ludtke

Abstract: Cellular Electron Cryotomography (CryoET) offers the ability to look inside cells and observe macromolecules frozen in action. A primary challenge for this technique is identifying and extracting the molecular components within the crowded cellular environment. We introduce a method using neural networks to dramatically reduce the time and human effort required for subcellular annotation and feature extraction. Subsequent subtomogram classification and averaging yields in-situ structures of molecular components of interest.

Submitted 11 June, 2017; v1 submitted 19 January, 2017; originally announced January 2017.

Comments: 21 pages, 8 figures

Journal ref: Nature Methods volume 14, 983-985 (2017)

Showing 1–26 of 26 results for author: Luedtke, S