
PMBO: Enhancing Black-Box Optimization through Multivariate Polynomial Surrogates

Authors: Janina Schreiber, Pau Batlle, Damar Wicaksono, Michael Hecht

Abstract: We introduce a surrogate-based black-box optimization method, termed Polynomial-model-based optimization (PMBO). The algorithm alternates polynomial approximation with Bayesian optimization steps, using Gaussian processes to model the error between the objective and its polynomial fit. We describe the algorithmic design of PMBO and compare the results of the performance of PMBO with several optimization methods for a set of analytic test functions. The results show that PMBO outperforms the classic Bayesian optimization and is robust with respect to the choice of its correlation function family and its hyper-parameter setting, which, on the contrary, need to be carefully tuned in classic Bayesian optimization. Remarkably, PMBO performs comparably with state-of-the-art evolutionary algorithms such as the Covariance Matrix Adaptation -- Evolution Strategy (CMA-ES). This finding suggests that PMBO emerges as the pivotal choice among surrogate-based optimization methods when addressing low-dimensional optimization problems. Hereby, the simple nature of polynomials opens the opportunity for interpretation and analysis of the inferred surrogate model, providing a macroscopic perspective on the landscape of the objective function.

Submitted 12 March, 2024; originally announced March 2024.

Comments: 16 pages, 6 figures

arXiv:2309.08228

Ensuring Topological Data-Structure Preservation under Autoencoder Compression due to Latent Space Regularization in Gauss--Legendre nodes

Authors: Chethan Krishnamurthy Ramanaik, Juan-Esteban Suarez Cardona, Anna Willmann, Pia Hanfeld, Nico Hoffmann, Michael Hecht

Abstract: We formulate a data independent latent space regularisation constraint for general unsupervised autoencoders. The regularisation rests on sampling the autoencoder Jacobian in Legendre nodes, being the centre of the Gauss-Legendre quadrature. Revisiting this classic enables us to prove that regularised autoencoders ensure a one-to-one re-embedding of the initial data manifold to its latent representation. Demonstrations show that prior proposed regularisation strategies, such as contractive autoencoding, cause topological defects already for simple examples, and so do convolutional based (variational) autoencoders. In contrast, topological preservation is ensured already by standard multilayer perceptron neural networks when being regularised due to our contribution. This observation extends through the classic FashionMNIST dataset up to real world encoding problems for MRI brain scans, suggesting that, across disciplines, reliable low dimensional representations of complex high-dimensional datasets can be delivered due to this regularisation technique.

Submitted 21 September, 2023; v1 submitted 15 September, 2023; originally announced September 2023.

arXiv:2309.00663

Polynomial-Model-Based Optimization for Blackbox Objectives

Authors: Janina Schreiber, Damar Wicaksono, Michael Hecht

Abstract: For a wide range of applications the structure of systems like Neural Networks or complex simulations is unknown and approximation is costly or even impossible. Black-box optimization seeks to find optimal (hyper-) parameters for these systems such that a pre-defined objective function is minimized. Polynomial-Model-Based Optimization (PMBO) is a novel blackbox optimizer that finds the minimum by fitting a polynomial surrogate to the objective function. Motivated by Bayesian optimization the model is iteratively updated according to the acquisition function Expected Improvement, thus balancing the exploitation and exploration rate and providing an uncertainty estimate of the model. PMBO is benchmarked against other state-of-the-art algorithms for a given set of artificial, analytical functions. PMBO competes successfully with those algorithms and even outperforms all of them in some cases. As the results suggest, we believe PMBO is the pivotal choice for solving blackbox optimization tasks occurring in a wide range of disciplines.

Submitted 1 September, 2023; originally announced September 2023.

arXiv:2301.04887

Learning Partial Differential Equations by Spectral Approximates of General Sobolev Spaces

Authors: Juan-Esteban Suarez Cardona, Phil-Alexander Hofmann, Michael Hecht

Abstract: We introduce a novel spectral, finite-dimensional approximation of general Sobolev spaces in terms of Chebyshev polynomials. Based on this polynomial surrogate model (PSM), we realise a variational formulation, solving a vast class of linear and non-linear partial differential equations (PDEs). The PSMs are as flexible as the physics-informed neural nets (PINNs) and provide an alternative for addressing inverse PDE problems, such as PDE-parameter inference. In contrast to PINNs, the PSMs result in a convex optimisation problem for a vast class of PDEs, including all linear ones, in which case the PSM-approximate is efficiently computable due to the exponential convergence rate of the underlying variational gradient descent. As a practical consequence prominent PDE problems were resolved by the PSMs without High Performance Computing (HPC) on a local machine. This gain in efficiency is complemented by an increase of approximation power, outperforming PINN alternatives in both accuracy and runtime. Beyond the empirical evidence we give here, the translation of classic PDE theory in terms of the Sobolev space approximates suggests the PSMs to be universally applicable to well-posed, regular forward and inverse PDE problems.

Submitted 12 January, 2023; originally announced January 2023.

arXiv:2212.11536

Global Polynomial Level Sets for Numerical Differential Geometry of Smooth Closed Surfaces

Authors: Sachin K. Thekke Veettil, Gentian Zavalani, Uwe Hernandez Acosta, Ivo F. Sbalzarini, Michael Hecht

Abstract: We present a computational scheme that derives a global polynomial level set parametrisation for smooth closed surfaces from a regular surface-point set and prove its uniqueness. This enables us to approximate a broad class of smooth surfaces by affine algebraic varieties. From such a global polynomial level set parametrisation, differential-geometric quantities like mean and Gauss curvature can be efficiently and accurately computed. Even 4$^{\text{th}}$-order terms such as the Laplacian of mean curvature are approximated with high precision. The accuracy performance results in a gain of computational efficiency, significantly reducing the number of surface points required compared to classic alternatives that rely on surface meshes or embedding grids. We mathematically derive and empirically demonstrate the strengths and the limitations of the present approach, suggesting it to be applicable to a large number of computational tasks in numerical differential geometry.

Submitted 22 December, 2022; originally announced December 2022.

MSC Class: 53Z50; 65D18

arXiv:2211.15443

Replacing Automatic Differentiation by Sobolev Cubatures fastens Physics Informed Neural Nets and strengthens their Approximation Power

Authors: Juan Esteban Suarez Cardona, Michael Hecht

Abstract: We present a novel class of approximations for variational losses, being applicable for the training of physics-informed neural nets (PINNs). The loss formulation reflects classic Sobolev space theory for partial differential equations and their weak formulations. The loss computation rests on an extension of Gauss-Legendre cubatures, which we term Sobolev cubatures, replacing automatic differentiation (A.D.). We prove the runtime complexity of training the resulting Sobolev-PINNs (SC-PINNs) to be less than required by PINNs relying on A.D. On top of one-to-two order of magnitude speed-up the SC-PINNs are demonstrated to achieve closer solution approximations for prominent forward and inverse PDE problems than established PINNs achieve.

Submitted 23 November, 2022; originally announced November 2022.

arXiv:2208.13224

doi 10.3389/fonc.2023.1115258

Deep learning for automatic head and neck lymph node level delineation provides expert-level accuracy

Authors: Thomas Weissmann, Yixing Huang, Stefan Fischer, Johannes Roesch, Sina Mansoorian, Horacio Ayala Gaona, Antoniu-Oreste Gostian, Markus Hecht, Sebastian Lettmaier, Lisa Deloch, Benjamin Frey, Udo S. Gaipl, Luitpold V. Distel, Andreas Maier, Heinrich Iro, Sabine Semrau, Christoph Bert, Rainer Fietkau, Florian Putz

Abstract: Background: Deep learning (DL)-based head and neck lymph node level (HN_LNL) autodelineation is of high relevance to radiotherapy research and clinical treatment planning but still underinvestigated in academic literature. Methods: An expert-delineated cohort of 35 planning CTs was used for training of an nnU-net 3D-fullres/2D-ensemble model for autosegmentation of 20 different HN_LNL. A second cohort acquired at the same institution later in time served as the test set (n=20). In a completely blinded evaluation, 3 clinical experts rated the quality of DL autosegmentations in a head-to-head comparison with expert-created contours. For a subgroup of 10 cases, intraobserver variability was compared to the average DL autosegmentation accuracy on the original and recontoured set of expert segmentations. A postprocessing step to adjust craniocaudal boundaries of level autosegmentations to the CT slice plane was introduced and the effect on geometric accuracy and expert rating was investigated. Results: Blinded expert ratings for DL segmentations and expert-created contours were not significantly different. DL segmentations with slice plane adjustment were rated numerically higher (mean, 81.0 vs. 79.6, p=0.185) and DL segmentations without slice plane adjustment were rated numerically lower (77.2 vs. 79.6, p=0.167) than manually drawn contours. DL segmentations with CT slice plane adjustment were rated significantly better than DL contours without slice plane adjustment (81.0 vs. 77.2, p=0.004). Geometric accuracy of DL segmentations was not different from intraobserver variability (mean, 0.76 vs. 0.77, p=0.307). Conclusions: We show that a nnU-net 3D-fullres/2D-ensemble model can be used for highly accurate autodelineation of HN_LNL using only a limited training dataset that is ideally suited for large-scale standardized autodelineation of HN_LNL in the research setting.

Submitted 1 March, 2023; v1 submitted 28 August, 2022; originally announced August 2022.

Comments: 14 pages, 6 figures, published in Frontiers in Oncology

Journal ref: Front. Oncol. 13:1115258

arXiv:2106.12894

InFlow: Robust outlier detection utilizing Normalizing Flows

Authors: Nishant Kumar, Pia Hanfeld, Michael Hecht, Michael Bussmann, Stefan Gumhold, Nico Hoffmann

Abstract: Normalizing flows are prominent deep generative models that provide tractable probability distributions and efficient density estimation. However, they are well known to fail while detecting Out-of-Distribution (OOD) inputs as they directly encode the local features of the input representations in their latent space. In this paper, we solve this overconfidence issue of normalizing flows by demonstrating that flows, if extended by an attention mechanism, can reliably detect outliers including adversarial attacks. Our approach does not require outlier data for training and we showcase the efficiency of our method for OOD detection by reporting state-of-the-art performance in diverse experimental settings. Code available at https://github.com/ComputationalRadiationPhysics/InFlow .

Submitted 16 November, 2021; v1 submitted 10 June, 2021; originally announced June 2021.

arXiv:2106.04908

Automatic Sexism Detection with Multilingual Transformer Models

Authors: Mina Schütz, Jaqueline Boeck, Daria Liakhovets, Djordje Slijepčević, Armin Kirchknopf, Manuel Hecht, Johannes Bogensperger, Sven Schlarb, Alexander Schindler, Matthias Zeppelzauer

Abstract: Sexism has become an increasingly major problem on social networks during the last years. The first shared task on sEXism Identification in Social neTworks (EXIST) at IberLEF 2021 is an international competition in the field of Natural Language Processing (NLP) with the aim to automatically identify sexism in social media content by applying machine learning methods. Thereby sexism detection is fo… […], and objectification). This paper presents the contribution of the AIT_FHSTP team at the EXIST2021 benchmark for both tasks. To solve the tasks we applied two multilingual transformer models, one based on multilingual BERT and one based on XLM-R. Our approach uses two different strategies to adapt the transformers to the detection of sexist content: first, unsupervised pre-training with additional data and second, supervised fine-tuning with additional and augmented data. For both tasks our best model is XLM-R with unsupervised pre-training on the EXIST data and additional datasets and fine-tuning on the provided dataset. The best run for the binary classification (task 1) achieves a macro F1-score of 0.7752 and scores 5th rank in the benchmark; for the multiclass classification (task 2) our best submission scores 6th rank with a macro F1-score of 0.5589.

Submitted 8 February, 2022; v1 submitted 9 June, 2021; originally announced June 2021.

Comments: Technical Report to the AIT_FHSTP EXIST 2021 Challenge contribution (under review) http://nlp.uned.es/exist2021/

arXiv:2101.06702

doi 10.1016/j.ymssp.2021.108482

Deep Learning based Virtual Point Tracking for Real-Time Target-less Dynamic Displacement Measurement in Railway Applications

Authors: Dachuan Shi, Eldar Sabanovic, Luca Rizzetto, Viktor Skrickij, Roberto Oliverio, Nadia Kaviani, Yunguang Ye, Gintautas Bureika, Stefano Ricci, Markus Hecht

Abstract: In the application of computer-vision based displacement measurement, an optical target is usually required to prove the reference. In the case that the optical target cannot be attached to the measuring objective, edge detection, feature matching and template matching are the most common approaches in target-less photogrammetry. However, their performance significantly relies on parameter settings. This becomes problematic in dynamic scenes where complicated background texture exists and varies over time. To tackle this issue, we propose virtual point tracking for real-time target-less dynamic displacement measurement, incorporating deep learning techniques and domain knowledge. Our approach consists of three steps: 1) automatic calibration for detection of region of interest; 2) virtual point detection for each video frame using deep convolutional neural network; 3) domain-knowledge based rule engine for point tracking in adjacent frames. The proposed approach can be executed on an edge computer in a real-time manner (i.e. over 30 frames per second). We demonstrate our approach for a railway application, where the lateral displacement of the wheel on the rail is measured during operation. We also implement an algorithm using template matching and line detection as the baseline for comparison. The numerical experiments have been performed to evaluate the performance and the latency of our approach in the harsh railway environment with noisy and varying backgrounds.

Submitted 31 July, 2021; v1 submitted 17 January, 2021; originally announced January 2021.

arXiv:2001.01440

Tight Localizations of Feedback Sets

Authors: Michael Hecht, Krzysztof Gonciarz, Szabolcs Horvát

Abstract: The classical NP-hard feedback arc set problem (FASP) and feedback vertex set problem (FVSP) ask for a minimum set of arcs $\varepsilon \subseteq E$ or vertices $\nu \subseteq V$ whose removal $G\setminus \varepsilon$, $G\setminus \nu$ makes a given multi-digraph $G=(V,E)$ acyclic, respectively. Though both problems are known to be APX-hard, approximation algorithms or proofs of inapproximability are unknown. We propose a new $\mathcal{O}(|V||E|^4)$-heuristic for the directed FASP. While a ratio of $r \approx 1.3606$ is known to be a lower bound for the APX-hardness, at least by empirical validation we achieve an approximation of $r \leq 2$. The most relevant applications, such as circuit testing, ask for solving the FASP on large sparse graphs, which can be done efficiently within tight error bounds due to our approach.

Submitted 16 June, 2020; v1 submitted 6 January, 2020; originally announced January 2020.

Comments: manuscript submitted to ACM

arXiv:1702.07612

Exact Localisations of Feedback Sets

Authors: Michael Hecht

Abstract: The feedback arc (vertex) set problem, shortened FASP (FVSP), is to transform a given multi digraph $G=(V,E)$ into an acyclic graph by deleting as few arcs (vertices) as possible. Due to the results of Richard M. Karp in 1972 it is one of the classic NP-complete problems. An important contribution of this paper is that the subgraphs $G_{\mathrm{el}}(e)$, $G_{\mathrm{si}}(e)$ of all elementary cycles or simple cycles running through some arc $e \in E$, can be computed in $\mathcal{O}\big(|E|^2\big)$ and $\mathcal{O}(|E|^4)$, respectively. We use this fact and introduce the notion of the essential minor and isolated cycles, which yield a priori problem size reductions and in the special case of so called resolvable graphs an exact solution in $\mathcal{O}(|V||E|^3)$. We show that weighted versions of the FASP and FVSP possess a Bellman decomposition, which yields exact solutions using a dynamic programming technique in times $\mathcal{O}\big(2^{m}|E|^4\log(|V|)\big)$ and $\mathcal{O}\big(2^{n}\Delta(G)^4|V|^4\log(|E|)\big)$, where $m \leq |E|-|V| +1$, $n \leq (\Delta(G)-1)|V|-|E| +1$, respectively. The parameters $m,n$ can be computed in $\mathcal{O}(|E|^3)$, $\mathcal{O}(\Delta(G)^3|V|^3)$, respectively and denote the maximal dimension of the cycle space of all appearing meta graphs, decoding the intersection behavior of the cycles. Consequently, $m,n$ equal zero if all meta graphs are trees. Moreover, we deliver several heuristics and discuss how to control their variation from the optimum. Summarizing, the presented results allow us to suggest a strategy for an implementation of a fast and accurate FASP/FVSP-SOLVER.

Submitted 24 February, 2017; originally announced February 2017.

Showing 1–12 of 12 results for author: Hecht, M