-
Nuclear Pleomorphism in Canine Cutaneous Mast Cell Tumors: Comparison of Reproducibility and Prognostic Relevance between Estimates, Manual Morphometry and Algorithmic Morphometry
Authors:
Andreas Haghofer,
Eda Parlak,
Alexander Bartel,
Taryn A. Donovan,
Charles-Antoine Assenmacher,
Pompei Bolfa,
Michael J. Dark,
Andrea Fuchs-Baumgartinger,
Andrea Klang,
Kathrin Jäger,
Robert Klopfleisch,
Sophie Merz,
Barbara Richter,
F. Yvonne Schulman,
Hannah Janout,
Jonathan Ganz,
Josef Scharinger,
Marc Aubreville,
Stephan M. Winkler,
Matti Kiupel,
Christof A. Bertram
Abstract:
Variation in nuclear size and shape is an important criterion of malignancy for many tumor types; however, categorical estimates by pathologists have poor reproducibility. Measurements of nuclear characteristics (morphometry) can improve reproducibility, but manual methods are time consuming. The aim of this study was to explore the limitations of estimates and develop alternative morphometric sol…
▽ More
Variation in nuclear size and shape is an important criterion of malignancy for many tumor types; however, categorical estimates by pathologists have poor reproducibility. Measurements of nuclear characteristics (morphometry) can improve reproducibility, but manual methods are time consuming. The aim of this study was to explore the limitations of estimates and develop alternative morphometric solutions for canine cutaneous mast cell tumors (ccMCT). We assessed the following nuclear evaluation methods for measurement accuracy, reproducibility, and prognostic utility: 1) anisokaryosis (karyomegaly) estimates by 11 pathologists; 2) gold standard manual morphometry of at least 100 nuclei; 3) practicable manual morphometry with stratified sampling of 12 nuclei by 9 pathologists; and 4) automated morphometry using a deep learning-based segmentation algorithm. The study dataset comprised 96 ccMCT with available outcome information. The study dataset comprised 96 ccMCT with available outcome information. Inter-rater reproducibility of karyomegaly estimates was low ($κ$ = 0.226), while it was good (ICC = 0.654) for practicable morphometry of the standard deviation (SD) of nuclear size. As compared to gold standard manual morphometry (AUC = 0.839, 95% CI: 0.701 - 0.977), the prognostic value (tumor-specific survival) of SDs of nuclear area for practicable manual morphometry (12 nuclei) and automated morphometry were high with an area under the ROC curve (AUC) of 0.868 (95% CI: 0.737 - 0.991) and 0.943 (95% CI: 0.889 - 0.996), respectively. This study supports the use of manual morphometry with stratified sampling of 12 nuclei and algorithmic morphometry to overcome the poor reproducibility of estimates.
△ Less
Submitted 23 May, 2024; v1 submitted 26 September, 2023;
originally announced September 2023.
-
Cluster Analysis of a Symbolic Regression Search Space
Authors:
Gabriel Kronberger,
Lukas Kammerer,
Bogdan Burlacu,
Stephan M. Winkler,
Michael Kommenda,
Michael Affenzeller
Abstract:
In this chapter we take a closer look at the distribution of symbolic regression models generated by genetic programming in the search space. The motivation for this work is to improve the search for well-fitting symbolic regression models by using information about the similarity of models that can be precomputed independently from the target function. For our analysis, we use a restricted gramma…
▽ More
In this chapter we take a closer look at the distribution of symbolic regression models generated by genetic programming in the search space. The motivation for this work is to improve the search for well-fitting symbolic regression models by using information about the similarity of models that can be precomputed independently from the target function. For our analysis, we use a restricted grammar for uni-variate symbolic regression models and generate all possible models up to a fixed length limit. We identify unique models and cluster them based on phenotypic as well as genotypic similarity. We find that phenotypic similarity leads to well-defined clusters while genotypic similarity does not produce a clear clustering. By map** solution candidates visited by GP to the enumerated search space we find that GP initially explores the whole search space and later converges to the subspace of highest quality expressions in a run for a simple benchmark problem.
△ Less
Submitted 28 September, 2021;
originally announced September 2021.
-
Symbolic Regression by Exhaustive Search: Reducing the Search Space Using Syntactical Constraints and Efficient Semantic Structure Deduplication
Authors:
Lukas Kammerer,
Gabriel Kronberger,
Bogdan Burlacu,
Stephan M. Winkler,
Michael Kommenda,
Michael Affenzeller
Abstract:
Symbolic regression is a powerful system identification technique in industrial scenarios where no prior knowledge on model structure is available. Such scenarios often require specific model properties such as interpretability, robustness, trustworthiness and plausibility, that are not easily achievable using standard approaches like genetic programming for symbolic regression. In this chapter we…
▽ More
Symbolic regression is a powerful system identification technique in industrial scenarios where no prior knowledge on model structure is available. Such scenarios often require specific model properties such as interpretability, robustness, trustworthiness and plausibility, that are not easily achievable using standard approaches like genetic programming for symbolic regression. In this chapter we introduce a deterministic symbolic regression algorithm specifically designed to address these issues. The algorithm uses a context-free grammar to produce models that are parameterized by a non-linear least squares local optimization procedure. A finite enumeration of all possible models is guaranteed by structural restrictions as well as a caching mechanism for detecting semantically equivalent solutions. Enumeration order is established via heuristics designed to improve search efficiency. Empirical tests on a comprehensive benchmark suite show that our approach is competitive with genetic programming in many noiseless problems while maintaining desirable properties such as simple, reliable models and reproducibility.
△ Less
Submitted 28 September, 2021;
originally announced September 2021.