Search | arXiv e-print repository

Projective hypersurfaces in tropical scheme theory I: the Macaulay ideal

Authors: Alex Fink, Jeffrey Giansiracusa, Noah Giansiracusa, Joshua Mundinger

Abstract: A "tropical ideal" is an ideal in the idempotent semiring of tropical polynomials that is also, degree by degree, a tropical linear space. We introduce a construction based on transversal matroids that canonically extends any principal ideal to a tropical ideal. We call this the Macaulay tropical ideal. It has a universal property: any other extension of the given principal ideal to a tropical ideal with the expected Hilbert function is a weak image of the Macaulay tropical ideal. For each $n\geq 2$ and $d\geq 1$ our construction yields a non-realizable degree $d$ hypersurface scheme in $\mathbb{P}^n$. Maclagan-Rincón produced a non-realizable line in $\mathbb{P}^n$ for each $n$, and for $(d,n)=(1,2)$ the two constructions agree. An appendix by Mundinger compares the Macaulay construction with another method for canonically extending ideals to tropical ideals.

Submitted 25 May, 2024; originally announced May 2024.

Comments: Appendix by Joshua Mundinger. 21pp

MSC Class: 14T10

arXiv:2401.03746 [pdf, other]

Physics-based vs. data-driven 24-hour probabilistic forecasts of precipitation for northern tropical Africa

Authors: Eva-Maria Walz, Peter Knippertz, Andreas H. Fink, Gregor Köhler, Tilmann Gneiting

Abstract: Numerical weather prediction (NWP) models struggle to skillfully predict tropical precipitation occurrence and amount, calling for alternative approaches. For instance, it has been shown that fairly simple, purely data-driven logistic regression models for 24-hour precipitation occurrence outperform both climatological and NWP forecasts for the West African summer monsoon. More complex neural network based approaches, however, remain underdeveloped due to the non-Gaussian character of precipitation. In this study, we develop, apply and evaluate a novel two-stage approach, where we train a U-Net convolutional neural network (CNN) model on gridded rainfall data to obtain a deterministic forecast and then apply the recently developed, nonparametric Easy Uncertainty Quantification (EasyUQ) approach to convert it into a probabilistic forecast. We evaluate CNN+EasyUQ for one-day ahead 24-hour accumulated precipitation forecasts over northern tropical Africa for 2011--2019, with the Integrated Multi-satellitE Retrievals for GPM (IMERG) data serving as ground truth. In the most comprehensive assessment to date we compare CNN+EasyUQ to state-of-the-art physics-based and data-driven approaches such as a monthly probabilistic climatology, raw and postprocessed ensemble forecasts from the European Centre for Medium-Range Weather Forecasts (ECMWF), and traditional statistical approaches that use up to 25 predictor variables from IMERG and the ERA5 reanalysis. Generally, statistical approaches perform about on par with post-processed ECMWF ensemble forecasts. The CNN+EasyUQ approach, however, clearly outperforms all competitors for both occurrence and amount. Hybrid methods that merge CNN+EasyUQ and physics-based forecasts show slight further improvement. Thus, the CNN+EasyUQ approach can likely improve operational probabilistic forecasts of rainfall in the tropics, and potentially even beyond.

Submitted 8 January, 2024; originally announced January 2024.

arXiv:2308.16139 [pdf, other]

MedShapeNet -- A Large-Scale Dataset of 3D Medical Shapes for Computer Vision

Authors: Jianning Li, Zongwei Zhou, Jiancheng Yang, Antonio Pepe, Christina Gsaxner, Gijs Luijten, Chongyu Qu, Tiezheng Zhang, Xiaoxi Chen, Wenxuan Li, Marek Wodzinski, Paul Friedrich, Kangxian Xie, Yuan **, Narmada Ambigapathy, Enrico Nasca, Naida Solak, Gian Marco Melito, Viet Duc Vu, Afaque R. Memon, Christopher Schlachta, Sandrine De Ribaupierre, Rajnikant Patel, Roy Eagleson, Xiaojun Chen, et al. (132 additional authors not shown)

Abstract: Prior to the deep learning era, shape was commonly used to describe the objects. Nowadays, state-of-the-art (SOTA) algorithms in medical imaging are predominantly diverging from computer vision, where voxel grids, meshes, point clouds, and implicit surface models are used. This is seen from numerous shape-related publications in premier vision conferences as well as the growing popularity of ShapeNet (about 51,300 models) and Princeton ModelNet (127,915 models). For the medical domain, we present a large collection of anatomical shapes (e.g., bones, organs, vessels) and 3D models of surgical instruments, called MedShapeNet, created to facilitate the translation of data-driven vision algorithms to medical applications and to adapt SOTA vision algorithms to medical problems. As a unique feature, we directly model the majority of shapes on the imaging data of real patients. As of today, MedShapeNet includes 23 datasets with more than 100,000 shapes that are paired with annotations (ground truth). Our data is freely accessible via a web interface and a Python application programming interface (API) and can be used for discriminative, reconstructive, and variational benchmarks as well as various applications in virtual, augmented, or mixed reality, and 3D printing. As examples, we present use cases in the fields of classification of brain tumors, facial and skull reconstructions, multi-class anatomy completion, education, and 3D printing. In the future, we will extend the data and improve the interfaces.
The project pages are: https://medshapenet.ikim.nrw/ and https://github.com/Jianningli/medshapenet-feedback

Submitted 12 December, 2023; v1 submitted 30 August, 2023; originally announced August 2023.

Comments: 16 pages

MSC Class: 68T01

arXiv:2308.05556 [pdf, ps, other]

Extensions of transversal valuated matroids

Authors: Alex Fink, Jorge Alberto Olarte

Abstract: Following up on our previous work, we study single-element extensions of transversal valuated matroids. We show that tropical presentations of valuated matroids with a minimal set of finite entries enjoy counterparts of the properties proved by Bonin and de Mier of minimal non-valuated transversal presentations.

Submitted 10 August, 2023; originally announced August 2023.

Comments: 9 pages

MSC Class: 05B35 (primary); 14T15; 52B40 (secondary)

arXiv:2307.16691 [pdf, other]

Number of ordered factorizations and recursive divisors

Authors: T. M. A. Fink

Abstract: The number of ordered factorizations and the number of recursive divisors are two related arithmetic functions that are recursively defined. But it is hard to construct explicit representations of these functions. Taking advantage of their recursive definition and a geometric interpretation, we derive three closed-form expressions for them both. These expressions shed light on the structure of these functions and their number-theoretic properties. Surprisingly, both functions can be expressed as simple generalized hypergeometric functions.

Submitted 31 July, 2023; originally announced July 2023.

arXiv:2307.09140 [pdf, ps, other]

Properties of the recursive divisor function and the number of ordered factorizations

Authors: T. M. A. Fink

Abstract: We recently introduced the recursive divisor function $κ_x(n)$, a recursive analogue of the usual divisor function. Here we calculate its Dirichlet series, which is ${ζ(s-x)}/(2 - ζ(s))$. We show that $κ_x(n)$ is related to the ordinary divisor function by $κ_x * σ_y = κ_y * σ_x$, where * denotes the Dirichlet convolution. Using this, we derive several identities relating $κ_x$ and some standard arithmetic functions. We also clarify the relation between $κ_0$ and the much-studied number of ordered factorizations $K(n)$, namely, $κ_0 = {\bf 1} * K$.

Submitted 18 July, 2023; originally announced July 2023.
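
The two identities quoted in this abstract can be spot-checked numerically for small $n$. The sketch below is only an illustration and makes one assumption that the abstract does not state: it takes $κ_x(n) = n^x + \sum_{d \mid n,\, d<n} κ_x(d)$, a recursion chosen because it is consistent with the quoted Dirichlet series ${ζ(s-x)}/(2 - ζ(s))$. The helper names (`divisors`, `kappa`, `K`, `sigma`, `dirichlet`, `check`) are hypothetical and not taken from the paper.

```python
from functools import lru_cache


def divisors(n):
    """All positive divisors of n (naive scan, fine for small n)."""
    return [d for d in range(1, n + 1) if n % d == 0]


@lru_cache(maxsize=None)
def kappa(n, x):
    """ASSUMED definition: kappa_x(n) = n^x + sum of kappa_x(d) over proper divisors d of n."""
    return n**x + sum(kappa(d, x) for d in divisors(n) if d < n)


@lru_cache(maxsize=None)
def K(n):
    """Number of ordered factorizations of n into factors > 1, with K(1) = 1."""
    if n == 1:
        return 1
    return sum(K(d) for d in divisors(n) if d < n)


def sigma(n, y):
    """Ordinary divisor function: sum of d^y over the divisors d of n."""
    return sum(d**y for d in divisors(n))


def dirichlet(f, g, n):
    """Dirichlet convolution (f * g)(n) = sum over d | n of f(d) * g(n / d)."""
    return sum(f(d) * g(n // d) for d in divisors(n))


def check(limit=60, x=1, y=2):
    """Spot-check both identities from the abstract for 1 <= n <= limit."""
    for n in range(1, limit + 1):
        # kappa_0 = 1 * K
        assert kappa(n, 0) == dirichlet(lambda d: 1, K, n)
        # kappa_x * sigma_y = kappa_y * sigma_x
        lhs = dirichlet(lambda d: kappa(d, x), lambda d: sigma(d, y), n)
        rhs = dirichlet(lambda d: kappa(d, y), lambda d: sigma(d, x), n)
        assert lhs == rhs
    return True
```

Calling `check()` raises an `AssertionError` if either identity fails in the tested range. Passing is a sanity check under the assumed recursion, not a proof.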

arXiv:2306.03934 [pdf, other]

Accurate Fine-Grained Segmentation of Human Anatomy in Radiographs via Volumetric Pseudo-Labeling

Authors: Constantin Seibold, Alexander Jaus, Matthias A. Fink, Moon Kim, Simon Reiß, Ken Herrmann, Jens Kleesiek, Rainer Stiefelhagen

Abstract: Purpose: Interpreting chest radiographs (CXR) remains challenging due to the ambiguity of overlapping structures such as the lungs, heart, and bones. To address this issue, we propose a novel method for extracting fine-grained anatomical structures in CXR using pseudo-labeling of three-dimensional computed tomography (CT) scans. Methods: We created a large-scale dataset of 10,021 thoracic CTs with 157 labels and applied an ensemble of 3D anatomy segmentation models to extract anatomical pseudo-labels. These labels were projected onto a two-dimensional plane, similar to the CXR, allowing the training of detailed semantic segmentation models for CXR without any manual annotation effort. Results: Our resulting segmentation models demonstrated remarkable performance on CXR, with a high average model-annotator agreement between two radiologists with mIoU scores of 0.93 and 0.85 for frontal and lateral anatomy, while inter-annotator agreement remained at 0.95 and 0.83 mIoU. Our anatomical segmentations allowed for the accurate extraction of relevant explainable medical features such as the cardio-thoracic-ratio. Conclusion: Our method of volumetric pseudo-labeling paired with CT projection offers a promising approach for detailed anatomical segmentation of CXR with a high agreement with human annotators. This technique may have important clinical implications, particularly in the analysis of various thoracic pathologies.

Submitted 6 June, 2023; originally announced June 2023.

Comments: 28 pages, 1 table, 10 figures

ACM Class: I.4.6; I.4.7; I.4.8

arXiv:2306.01629 [pdf, other]

Number of attractors in the critical Kauffman model is exponential

Authors: T. M. A. Fink, F. C. Sheldon

Abstract: The Kauffman model is the archetypal model of genetic computation. It highlights the importance of criticality, at which many biological systems seem poised. In a series of advances, researchers have honed in on how the number of attractors in the critical regime grows with network size. But a definitive answer has proved elusive. We prove that, for the critical Kauffman model with connectivity one, the number of attractors grows at least, and at most, as $(2/\!\sqrt{e})^N$. This is the first proof that the number of attractors in a critical Kauffman model grows exponentially.

Submitted 2 June, 2023; originally announced June 2023.

Comments: 5 pages, 3 figures

arXiv:2304.01585 [pdf, other]

Multi-Channel Time-Series Person and Soft-Biometric Identification

Authors: Nilah Ravi Nair, Fernando Moya Rueda, Christopher Reining, Gernot A. Fink

Abstract: Multi-channel time-series datasets are popular in the context of human activity recognition (HAR). On-body device (OBD) recordings of human movements are often preferred for HAR applications not only for their reliability but as an approach for identity protection, e.g., in industrial settings. Paradoxically, the gait activity is a biometric, as the cyclic movement is distinctive and collectable. In addition, the gait cycle has proven to contain soft-biometric information of human groups, such as age and height. Though general human movements have not been considered a biometric, they might contain identity information. This work investigates person and soft-biometrics identification from OBD recordings of humans performing different activities using deep architectures. Furthermore, we propose the use of attribute representation for soft-biometric identification. We evaluate the method on four datasets of multi-channel time-series HAR, measuring the performance of person and soft-biometrics identification and its relation to the performed activities. We find that person identification is not limited to gait activity. The impact of activities on the identification performance was found to be training and dataset specific. Soft-biometric based attribute representation shows promising results and emphasizes the necessity of larger datasets.

Submitted 4 April, 2023; originally announced April 2023.

Comments: Accepted at the ICPR 2022 workshop: 12th International Workshop on Human Behavior Understanding

arXiv:2303.02079 [pdf, other]

Insights from number theory into the critical Kauffman model with connectivity one

Authors: F. C. Sheldon, T. M. A. Fink

Abstract: The Kauffman model of genetic computation highlights the importance of criticality at the border of order and chaos. The model with connectivity one is of special interest because it is exactly solvable. But our understanding of its behavior is incomplete, and much of what we do know relies on heuristic arguments. Here, we show that the key quantities in the model are intimately related to aspects of number theory. Using these links, we derive improved bounds for the number of attractors as well as the mean attractor length, which is harder to compute. Our work suggests that number theory is the natural language for deducing many properties of the critical Kauffman model with connectivity one, and opens the door to further insight into this deceptively simple model.

Submitted 24 April, 2024; v1 submitted 3 March, 2023; originally announced March 2023.

Comments: 15 pages, 3 figures

arXiv:2302.05314 [pdf, other]

Exact dynamics of the critical Kauffman model with connectivity one

Authors: T. M. A. Fink

Abstract: The critical Kauffman model with connectivity one is the simplest class of critical Boolean networks. Nevertheless, it exhibits intricate behavior at the boundary of order and chaos. We introduce a formalism for expressing the dynamics of multiple loops as a product of the dynamics of individual loops. Using it, we prove that the number of attractors scales as $2^m$, where $m$ is the number of nodes in loops - as fast as possible, and much faster than previously believed.

Submitted 31 March, 2023; v1 submitted 10 February, 2023; originally announced February 2023.
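
For readers who want to experiment with the objects this abstract describes, attractors of a very small connectivity-one network can be counted by brute force. The sketch below is not the paper's formalism. It assumes the usual reading of the critical connectivity-one model, in which every node copies or inverts exactly one input node, and the names `count_attractors`, `inputs` and `invert` are hypothetical.

```python
from itertools import product


def count_attractors(inputs, invert):
    """Count attractors of a Boolean network by exhaustive search.

    ASSUMPTION: node i takes the value of node inputs[i] at the previous
    time step, XOR-ed with invert[i] (0 = copy, 1 = invert).
    Cost is exponential in the number of nodes, so keep networks tiny.
    """
    n = len(inputs)

    def step(state):
        return tuple(state[inputs[i]] ^ invert[i] for i in range(n))

    representatives = set()
    for start in product((0, 1), repeat=n):
        # Follow the trajectory until some state repeats.
        seen = set()
        s = start
        while s not in seen:
            seen.add(s)
            s = step(s)
        # s now lies on a cycle; walk the cycle once to collect its states.
        cycle = [s]
        t = step(s)
        while t != s:
            cycle.append(t)
            t = step(t)
        # The smallest state serves as a canonical label for the attractor.
        representatives.add(min(cycle))
    return len(representatives)
```

For example, `count_attractors([2, 0, 1], [0, 0, 0])` describes a single three-node loop of copy functions, and `count_attractors([2, 0, 1, 0], [0, 0, 0, 0])` adds a fourth node that hangs off the loop. Comparing such cases is one way to see why the abstract counts only the nodes in loops.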

arXiv:2301.10161 [pdf, other]

Dataset Bias in Human Activity Recognition

Authors: Nilah Ravi Nair, Lena Schmid, Fernando Moya Rueda, Markus Pauly, Gernot A. Fink, Christopher Reining

Abstract: When creating multi-channel time-series datasets for Human Activity Recognition (HAR), researchers are faced with the issue of subject selection criteria. It is unknown what physical characteristics and/or soft-biometrics, such as age, height, and weight, need to be taken into account to train a classifier to achieve robustness towards heterogeneous populations in the training and testing data. This contribution statistically curates the training data to assess to what degree the physical characteristics of humans influence HAR performance. We evaluate the performance of a state-of-the-art convolutional neural network on two HAR datasets that vary in the sensors, activities, and recording for time-series HAR. The training data is intentionally biased with respect to human characteristics to determine the features that impact motion behaviour. The evaluations brought forth the impact of the subjects' characteristics on HAR, thus providing insights regarding the robustness of the classifier with respect to heterogeneous populations. The study is a step forward in the direction of fair and trustworthy artificial intelligence by attempting to quantify representation bias in multi-channel time series HAR data.

Submitted 19 January, 2023; originally announced January 2023.

Comments: Submitted for review to THE 32nd INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE (IJCAI-23)

arXiv:2212.01353 [pdf, other]

doi 10.1109/ICPR56361.2022.9956405

Video-based Pose-Estimation Data as Source for Transfer Learning in Human Activity Recognition

Authors: Shrutarv Awasthi, Fernando Moya Rueda, Gernot A. Fink

Abstract: Human Activity Recognition (HAR) using on-body devices identifies specific human actions in unconstrained environments. HAR is challenging due to the inter and intra-variance of human movements; moreover, annotated datasets from on-body devices are scarce. This problem is mainly due to the difficulty of data creation, i.e., recording, expensive annotation, and lack of standard definitions of human activities. Previous works demonstrated that transfer learning is a good strategy for addressing scenarios with scarce data. However, the scarcity of annotated on-body device datasets remains. This paper proposes using datasets intended for human-pose estimation as a source for transfer learning; specifically, it deploys sequences of annotated pixel coordinates of human joints from video datasets for HAR and human pose estimation. We pre-train a deep architecture on four benchmark video-based source datasets. Finally, an evaluation is carried out on three on-body device datasets improving HAR performance.

Submitted 2 December, 2022; originally announced December 2022.

Comments: Accepted for ICPR 2022

arXiv:2210.03416 [pdf, other]

Detailed Annotations of Chest X-Rays via CT Projection for Report Understanding

Authors: Constantin Seibold, Simon Reiß, Saquib Sarfraz, Matthias A. Fink, Victoria Mayer, Jan Sellner, Moon Sung Kim, Klaus H. Maier-Hein, Jens Kleesiek, Rainer Stiefelhagen

Abstract: In clinical radiology reports, doctors capture important information about the patient's health status. They convey their observations from raw medical imaging data about the inner structures of a patient. As such, formulating reports requires medical experts to possess wide-ranging knowledge about anatomical regions with their normal, healthy appearance as well as the ability to recognize abnormalities. This explicit grasp on both the patient's anatomy and their appearance is missing in current medical image-processing systems as annotations are especially difficult to gather. This renders the models narrow experts, e.g., for identifying specific diseases. In this work, we recover this missing link by adding human anatomy into the mix and enable the association of content in medical reports to their occurrence in associated imagery (medical phrase grounding). To exploit anatomical structures in this scenario, we present a sophisticated automatic pipeline to gather and integrate human bodily structures from computed tomography datasets, which we incorporate in our PAXRay: A Projected dataset for the segmentation of Anatomical structures in X-Ray data. Our evaluation shows that methods that take advantage of anatomical information benefit heavily in visually grounding radiologists' findings, as our anatomical segmentations allow for up to absolute 50% better grounding results on the OpenI dataset as compared to commonly used region proposals. The PAXRay dataset is available at https://constantinseibold.github.io/paxray/.

Submitted 7 October, 2022; originally announced October 2022.

Comments: 33rd British Machine Vision Conference (BMVC 2022)

ACM Class: I.4.6; I.4.8; I.4.9

arXiv:2209.06752 [pdf, other]

Signed permutohedra, delta-matroids, and beyond

Authors: Christopher Eur, Alex Fink, Matt Larson, Hunter Spink

Abstract: We establish a connection between the algebraic geometry of the type B permutohedral toric variety and the combinatorics of delta-matroids. Using this connection, we compute the volume and lattice point counts of type B generalized permutohedra. Applying tropical Hodge theory to a new framework of "tautological classes of delta-matroids," modeled after certain vector bundles associated to realizable delta-matroids, we establish the log-concavity of a Tutte-like invariant for a broad family of delta-matroids that includes all realizable delta-matroids. Our results include new log-concavity statements for all (ordinary) matroids as special cases.

Submitted 16 February, 2024; v1 submitted 14 September, 2022; originally announced September 2022.

Comments: To appear in Proc. Lon. Math. Soc

arXiv:2208.14996 [pdf, other]

Regulatory motifs: structural and functional building blocks of genetic computation

Authors: Thomas M. A. Fink

Abstract: Developing and maintaining life requires a lot of computation. This is done by gene regulatory networks. But we have little understanding of how this computation is organized. I show that there is a direct correspondence between the structural and functional building blocks of regulatory networks, which I call regulatory motifs. I derive a simple bound on the range of function that these motifs can perform, in terms of the local network structure. I prove that this range is a small fraction of all possible functions, which severely constrains global network behavior. Part of this restriction is due to redundancy in the function that regulatory motifs can achieve - there are many ways to perform the same task. Regulatory motifs help us understand how genetic computation is organized and what it can achieve.

Submitted 31 August, 2022; originally announced August 2022.

arXiv:2206.03149 [pdf, other]

Self-Training of Handwritten Word Recognition for Synthetic-to-Real Adaptation

Authors: Fabian Wolf, Gernot A. Fink

Abstract: Performances of Handwritten Text Recognition (HTR) models are largely determined by the availability of labeled and representative training samples. However, in many application scenarios labeled samples are scarce or costly to obtain. In this work, we propose a self-training approach to train an HTR model solely on synthetic samples and unlabeled data. The proposed training scheme uses an initial model trained on synthetic data to make predictions for the unlabeled target dataset. Starting from this initial model with rather poor performance, we show that a considerable adaptation is possible by training against the predicted pseudo-labels. Moreover, the investigated self-training strategy does not require any manually annotated training samples. We evaluate the proposed method on four widely used benchmark datasets and show its effectiveness on closing the gap to a model trained in a fully-supervised manner.

Submitted 30 June, 2022; v1 submitted 7 June, 2022; originally announced June 2022.

Comments: Accepted for publication in International Conference on Pattern Recognition (ICPR) 2022

arXiv:2202.06080 [pdf, other]

Recognition-free Question Answering on Handwritten Document Collections

Authors: Oliver Tüselmann, Friedrich Müller, Fabian Wolf, Gernot A. Fink

Abstract: In recent years, considerable progress has been made in the research area of Question Answering (QA) on document images. Current QA approaches from the Document Image Analysis community are mainly focusing on machine-printed documents and perform rather poorly on handwriting. This is mainly due to the reduced recognition performance on handwritten documents. To tackle this problem, we propose a recognition-free QA approach, especially designed for handwritten document image collections. We present a robust document retrieval method, as well as two QA models. Our approaches outperform the state-of-the-art recognition-free models on the challenging BenthamQA and HW-SQuAD datasets.

Submitted 12 February, 2022; originally announced February 2022.

arXiv:2201.13279 [pdf, other]

UQGAN: A Unified Model for Uncertainty Quantification of Deep Classifiers trained via Conditional GANs

Authors: Philipp Oberdiek, Gernot A. Fink, Matthias Rottmann

Abstract: We present an approach to quantifying both aleatoric and epistemic uncertainty for deep neural networks in image classification, based on generative adversarial networks (GANs). While most works in the literature that use GANs to generate out-of-distribution (OoD) examples only focus on the evaluation of OoD detection, we present a GAN based approach to learn a classifier that produces proper uncertainties for OoD examples as well as for false positives (FPs). Instead of shielding the entire in-distribution data with GAN generated OoD examples which is state-of-the-art, we shield each class separately with out-of-class examples generated by a conditional GAN and complement this with a one-vs-all image classifier. In our experiments, in particular on CIFAR10, CIFAR100 and Tiny ImageNet, we improve over the OoD detection and FP detection performance of state-of-the-art GAN-training based classifiers. Furthermore, we also find that the generated GAN examples do not significantly affect the calibration error of our classifier and result in a significant gain in model accuracy.

Submitted 9 January, 2023; v1 submitted 31 January, 2022; originally announced January 2022.

arXiv:2112.15334 [pdf, other]

Matroids and the space of torus-invariant subvarieties of the Grassmannian with given homology class

Authors: E. Javier Elizondo, Alex Fink, Cristhian Garay López

Abstract: Let $\mathbb{G}(d,n)$ be the complex Grassmannian of affine $d$-planes in $n$-space. We study the problem of characterizing the set of algebraic subvarieties of $\mathbb{G}(d,n)$ invariant under the action of the maximal torus $T$ and having given homology class $\lambda$. We give a complete answer for the case where $\lambda$ is the class of a $T$-orbit, and partial results for other cases, using techniques inspired by matroid theory. This problem has applications to the computation of the Euler-Chow series for Grassmannians of projective lines: we calculate the series for 3-cycles in $\mathbb{G}(2,4)$ and carry out partial calculations for $\mathbb{G}(2,5)$.

Submitted 21 November, 2023; v1 submitted 31 December, 2021; originally announced December 2021.

Comments: This preprint represents a second version of this article

MSC Class: Primary: 14C25; Secondary: 05B35

arXiv:2111.04564 [pdf, other]

Human Activity Recognition using Attribute-Based Neural Networks and Context Information

Authors: Stefan Lüdtke, Fernando Moya Rueda, Waqas Ahmed, Gernot A. Fink, Thomas Kirste

Abstract: We consider human activity recognition (HAR) from wearable sensor data in manual-work processes, like warehouse order-picking. Such structured domains can often be partitioned into distinct process steps, e.g., packaging or transporting. Each process step can have a different prior distribution over activity classes, e.g., standing or walking, and different system dynamics. Here, we show how such context information can be integrated systematically into a deep neural network-based HAR system. Specifically, we propose a hybrid architecture that combines a deep neural network, which estimates high-level movement descriptors (attributes) from the raw sensor data, and a shallow classifier, which predicts activity classes from the estimated attributes and (optional) context information, like the currently executed process step. We empirically show that our proposed architecture increases HAR performance, compared to state-of-the-art methods. Additionally, we show that HAR performance can be further increased when information about process steps is incorporated, even when that information is only partially correct.

Submitted 28 October, 2021; originally announced November 2021.

Comments: 3rd International Workshop on Deep Learning for Human Activity Recognition

arXiv:2110.11466 [pdf, other]

MLPerf HPC: A Holistic Benchmark Suite for Scientific Machine Learning on HPC Systems

Authors: Steven Farrell, Murali Emani, Jacob Balma, Lukas Drescher, Aleksandr Drozd, Andreas Fink, Geoffrey Fox, David Kanter, Thorsten Kurth, Peter Mattson, Dawei Mu, Amit Ruhela, Kento Sato, Koichi Shirahata, Tsuguchika Tabaru, Aristeidis Tsaris, Jan Balewski, Ben Cumming, Takumi Danjo, Jens Domke, Takaaki Fukai, Naoto Fukumoto, Tatsuya Fukushi, Balazs Gerofi, Takumi Honda, et al. (18 additional authors not shown)

Abstract: Scientific communities are increasingly adopting machine learning and deep learning models in their applications to accelerate scientific insights. High performance computing systems are pushing the frontiers of performance with a rich diversity of hardware resources and massive scale-out capabilities. There is a critical need to understand fair and effective benchmarking of machine learning applications that are representative of real-world scientific use cases. MLPerf is a community-driven standard to benchmark machine learning workloads, focusing on end-to-end performance metrics. In this paper, we introduce MLPerf HPC, a benchmark suite of large-scale scientific machine learning training applications driven by the MLCommons Association. We present the results from the first submission round, including a diverse set of some of the world's largest HPC systems. We develop a systematic framework for their joint analysis and compare them in terms of data staging, algorithmic convergence, and compute performance. As a result, we gain a quantitative understanding of optimizations on different subsystems such as staging and on-node loading of data, compute-unit utilization, and communication scheduling, enabling overall $>10 \times$ (end-to-end) performance improvements through system scaling. Notably, our analysis shows a scale-dependent interplay between the dataset size, a system's memory hierarchy, and training convergence that underlines the importance of near-compute storage.
To overcome the data-parallel scalability challenge at large batch sizes, we discuss specific learning techniques and hybrid data-and-model parallelism that are effective on large systems. We conclude by characterizing each benchmark with respect to low-level memory, I/O, and network behavior to parameterize extended roofline performance models in future rounds.

Submitted 26 October, 2021; v1 submitted 21 October, 2021; originally announced October 2021.

arXiv:2109.12551 [pdf, other]

Biological logics are restricted

Authors: Thomas M. A. Fink, Ryan Hannam

Abstract: Networks of gene regulation govern morphogenesis, determine cell identity and regulate cell function. But we have little understanding, at the local level, of which logics are biologically preferred or even permitted. To solve this puzzle, we studied the consequences of a fundamental aspect of gene regulatory networks: genes and transcription factors talk to each other but not themselves. Remarkably, this bipartite structure severely restricts the number of logical dependencies that a gene can have on other genes. We developed a theory for the number of permitted logics for different regulatory building blocks of genes and transcription factors. We tested our predictions against a simulation of the 19 simplest building blocks, and found complete agreement. The restricted range of biological logics is a key insight into how information is processed at the genetic level. It constrains global network function and makes it easier to reverse engineer regulatory networks from observed behavior.

Submitted 31 August, 2022; v1 submitted 26 September, 2021; originally announced September 2021.

Comments: 6 pages

arXiv:2106.01014 [pdf, other]

Geometry adaptation of protrusion and polarity dynamics in confined cell migration

Authors: David B. Brückner, Matthew Schmitt, Alexandra Fink, Georg Ladurner, Johannes Flommersfeld, Nicolas Arlt, Edouard Hannezo, Joachim O. Rädler, Chase P. Broedersz

Abstract: Cell migration in confining physiological environments relies on the concerted dynamics of several cellular components, including protrusions, adhesions with the environment, and the cell nucleus. However, it remains poorly understood how the dynamic interplay of these components and the cell polarity determine the emergent migration behavior at the cellular scale. Here, we combine data-driven inference with a mechanistic bottom-up approach to develop a model for protrusion and polarity dynamics in confined cell migration, revealing how the cellular dynamics adapt to confining geometries. Specifically, we use experimental data of joint protrusion-nucleus migration trajectories of cells on confining micropatterns to systematically determine a mechanistic model linking the stochastic dynamics of cell polarity, protrusions, and nucleus. This model indicates that the cellular dynamics adapt to confining constrictions through a switch in the polarity dynamics from a negative to a positive, self-reinforcing feedback loop. Our model further reveals how this feedback loop leads to stereotypical cycles of protrusion-nucleus dynamics that drive the migration of the cell through constrictions. These cycles are disrupted upon perturbation of cytoskeletal components, indicating that the positive feedback is controlled by cellular migration mechanisms. Our data-driven theoretical approach therefore identifies polarity feedback adaptation as a key mechanism in confined cell migration.

Submitted 9 August, 2022; v1 submitted 2 June, 2021; originally announced June 2021.

arXiv:2104.09589 [pdf, ps, other]

doi 10.1112/jlms.12856

Gröbner bases, symmetric matrices, and type C Kazhdan-Lusztig varieties

Authors: Laura Escobar, Alex Fink, Jenna Rajchgot, Alexander Woo

Abstract: We study a class of combinatorially-defined polynomial ideals which are generated by minors of a generic symmetric matrix. Included within this class are the symmetric determinantal ideals, the symmetric ladder determinantal ideals, and the symmetric Schubert determinantal ideals of A. Fink, J. Rajchgot, and S. Sullivant. Each ideal in our class is a type C analog of a Kazhdan-Lusztig ideal of A. Woo and A. Yong; that is, it is the scheme-theoretic defining ideal of the intersection of a type C Schubert variety with a type C opposite Schubert cell, appropriately coordinatized. The Kazhdan-Lusztig ideals that arise are exactly those where the opposite cell is $123$-avoiding. Our main results include Gröbner bases for these ideals, prime decompositions of their initial ideals (which are Stanley-Reisner ideals of subword complexes) and combinatorial formulas for their multigraded Hilbert series in terms of pipe dreams.

Submitted 14 July, 2022; v1 submitted 19 April, 2021; originally announced April 2021.

Comments: 36 pages, 11 figures; v2: minor improvements, Section 8 expanded

MSC Class: 05E40; 13P10; 14M15; 14N15

arXiv:2103.05262 [pdf, other]

doi 10.1007/s10182-023-00471-1

Left-Truncated Health Insurance Claims Data: Theoretical Review and Empirical Application

Authors: Rafael Weißbach, Achim Dörre, Dominik Wied, Gabriele Doblhammer, Anne Fink

Abstract: At the beginning of 2004, we drew a sample of 0.25 million people from the inventory of the health insurer AOK. We followed their health claims until 2013. Our aim is to estimate the effect of a stroke on the dementia onset probability, for Germans born in the first half of the 20th century. People deceased before 2004 are randomly left-truncated. Filtrations, modelling the missing data, make it possible to circumvent the unknown number of truncated persons by using a conditional instead of the full likelihood. Dementia onset after 2013 is a conditionally fixed right-censoring event. For each observed health history, Jacod's formula yields the conditional likelihood contribution. Asymptotic normality of the estimated intensities is derived, relative to a sample size definition that includes the truncated people. Yet, the standard error is observable. The claims data reveal that after a stroke, with time measured in years, the intensity of dementia onset increases from 0.02 to 0.07. Using the independence of the two estimated intensities, a 95% confidence interval for their difference is [0.050, 0.056]. The effect halves when we extend the analysis to an age-inhomogeneous model, but does not change further when we additionally adjust for multi-morbidity.

Submitted 14 November, 2022; v1 submitted 9 March, 2021; originally announced March 2021.

Comments: 56 pages, 8 figures

MSC Class: 62N99

arXiv:2102.01445 [pdf, other]

doi 10.1109/ISBI48211.2021.9433966

Prediction of low-keV monochromatic images from polyenergetic CT scans for improved automatic detection of pulmonary embolism

Authors: Constantin Seibold, Matthias A. Fink, Charlotte Goos, Hans-Ulrich Kauczor, Heinz-Peter Schlemmer, Rainer Stiefelhagen, Jens Kleesiek

Abstract: Detector-based spectral computed tomography is a recent dual-energy CT (DECT) technology that offers the possibility of obtaining spectral information. From this spectral data, different types of images can be derived, amongst others virtual monoenergetic (monoE) images. MonoE images potentially exhibit decreased artifacts, improve contrast, and overall contain lower noise values, making them ideal candidates for better delineation and thus improved diagnostic accuracy of vascular abnormalities. In this paper, we train convolutional neural networks (CNNs) that can emulate the generation of monoE images from conventional single energy CT acquisitions. For this task, we investigate several commonly used image-translation methods. We demonstrate that these methods, while creating visually similar outputs, lead to a poorer performance when used for automatic classification of pulmonary embolism (PE). We expand on these methods through the use of a multi-task optimization approach, under which the networks achieve improved classification as well as generation results, as reflected by PSNR and SSIM scores. Further, evaluating our proposed framework on a subset of the RSNA-PE challenge data set shows that we are able to improve the Area under the Receiver Operating Characteristic curve (AuROC) in comparison to a naïve classification approach from 0.8142 to 0.8420.

Submitted 23 February, 2021; v1 submitted 2 February, 2021; originally announced February 2021.

Comments: 4 pages, ISBI 2021

MSC Class: 92C55 68T07

arXiv:2012.03318 [pdf]

doi 10.1109/igarss.2017.8127602

Estimation of vegetation loss coefficients and canopy penetration depths from SMAP radiometer and IceSAT lidar data

Authors: M. Baur, T. Jagdhuber, M. Link, M. Piles, D. Entekhabi, A. Fink

Abstract: In this study the framework of the $\tau$-$\omega$ model is used to derive vegetation loss coefficients and canopy penetration depths from SMAP multi-temporal retrievals of vegetation optical depth, single scattering albedo and ICESat lidar vegetation heights. The vegetation loss coefficients serve as a global indicator of how strongly absorption and scattering processes attenuate L-band microwave radiation. By inverting the vegetation loss coefficients, penetration depths into the canopy can be obtained, which are displayed for the global forest reservoirs. A simple penetration index is formed combining vegetation heights and penetration depth estimates. The distribution and level of this index reveal that for densely forested areas the soil signal is attenuated considerably, which can affect the accuracy of soil moisture retrievals.

Submitted 6 December, 2020; originally announced December 2020.

Journal ref: 2017 IEEE International Geoscience and Remote Sensing Symposium (IGARSS)

arXiv:2008.03978 [pdf, other]

doi 10.1073/pnas.2016602118

Learning the dynamics of cell-cell interactions in confined cell migration

Authors: David B. Brückner, Nicolas Arlt, Alexandra Fink, Pierre Ronceray, Joachim O. Rädler, Chase P. Broedersz

Abstract: The migratory dynamics of cells in physiological processes, ranging from wound healing to cancer metastasis, rely on contact-mediated cell-cell interactions. These interactions play a key role in shaping the stochastic trajectories of migrating cells. While data-driven physical formalisms for the stochastic migration dynamics of single cells have been developed, such a framework for the behavioral dynamics of interacting cells still remains elusive. Here, we monitor stochastic cell trajectories in a minimal experimental cell collider: a dumbbell-shaped micropattern on which pairs of cells perform repeated cellular collisions. We observe different characteristic behaviors, including cells reversing, following and sliding past each other upon collision. Capitalizing on this large experimental data set of coupled cell trajectories, we infer an interacting stochastic equation of motion that accurately predicts the observed interaction behaviors. Our approach reveals that interacting non-cancerous MCF10A cells can be described by repulsion and friction interactions. In contrast, cancerous MDA-MB-231 cells exhibit attraction and anti-friction interactions, promoting the predominant relative sliding behavior observed for these cells. Based on these experimentally inferred interactions, we show how this framework may generalize to provide a unifying theoretical description of the diverse cellular interaction behaviors of distinct cell types.

Submitted 13 November, 2020; v1 submitted 10 August, 2020; originally announced August 2020.

Journal ref: Proc. Natl. Acad. Sci. USA 118 (2021) e2016602118

arXiv:2007.12908 [pdf, other]

doi 10.1007/s42514-021-00069-6

Large scale simulation of pressure induced phase-field fracture propagation using Utopia

Authors: Patrick Zulian, Alena Kopaničáková, Maria Giuseppina Chiara Nestola, Andreas Fink, Nur Aiman Fadel, Joost Vandevondele, Rolf Krause

Abstract: Non-linear phase field models are increasingly used for the simulation of fracture propagation models. The numerical simulation of fracture networks of realistic size requires the efficient parallel solution of large coupled non-linear systems. Although in principle efficient iterative multi-level methods for these types of problems are available, they are not widely used in practice due to the complexity of their parallel implementation. Here, we present Utopia, which is an open-source C++ library for parallel non-linear multilevel solution strategies. Utopia provides the advantages of high-level programming interfaces while at the same time offering a framework to access low-level data-structures without breaking code encapsulation. Complex numerical procedures can be expressed with few lines of code, and evaluated by different implementations, libraries, or computing hardware. In this paper, we investigate the parallel performance of our implementation of the recursive multilevel trust-region (RMTR) method based on the Utopia library. RMTR is a globally convergent multilevel solution strategy designed to solve non-convex constrained minimization problems. In particular, we solve pressure-induced phase-field fracture propagation in large and complex fracture networks. Solving such problems is deemed challenging even for a few fractures; here, however, we consider networks of realistic size with up to 1000 fractures.

Submitted 25 July, 2020; originally announced July 2020.

Comments: CCF Trans. HPC (2021)

arXiv:2005.06831 [pdf, other]

Detection and Retrieval of Out-of-Distribution Objects in Semantic Segmentation

Authors: Philipp Oberdiek, Matthias Rottmann, Gernot A. Fink

Abstract: When deploying deep learning technology in self-driving cars, deep neural networks are constantly exposed to domain shifts. These include, e.g., changes in weather conditions, time of day, and long-term temporal shift. In this work we utilize a deep neural network trained on the Cityscapes dataset containing urban street scenes and infer images from a different dataset, the A2D2 dataset, containing also countryside and highway images. We present a novel pipeline for semantic segmentation that detects out-of-distribution (OOD) segments by means of the deep neural network's prediction and performs image retrieval after feature extraction and dimensionality reduction on image patches. In our experiments we demonstrate that the deployed OOD approach is suitable for detecting out-of-distribution concepts. Furthermore, we evaluate the image patch retrieval qualitatively as well as quantitatively by means of the semi-compatible A2D2 ground truth and obtain mAP values of up to 52.2%.

Submitted 14 May, 2020; originally announced May 2020.

arXiv:2004.08258 [pdf, ps, other]

doi 10.2140/pjm.2022.318.453

Initial forms and a notion of basis for tropical differential equations

Authors: Alex Fink, Zeinab Toghani

Abstract: We show that solution sets of systems of tropical differential equations can be characterised in terms of monomial-freeness of an initial ideal. We discuss a candidate definition of tropical differential basis and give a nonexistence result for such bases in an example.

Submitted 17 April, 2020; originally announced April 2020.

Comments: 16 pages

MSC Class: 13N99; 14T05

Journal ref: Pacific J. Math. 318 (2022) 453-468

arXiv:2003.01989 [pdf, other]

Annotation-free Learning of Deep Representations for Word Spotting using Synthetic Data and Self Labeling

Authors: Fabian Wolf, Gernot A. Fink

Abstract: Word spotting is a popular tool for supporting the first exploration of historic, handwritten document collections. Today, the best performing methods rely on machine learning techniques, which require a high amount of annotated training material. As training data is usually not available in the application scenario, annotation-free methods aim at solving the retrieval task without representative training samples. In this work, we present an annotation-free method that still employs machine learning techniques and therefore outperforms other learning-free approaches. The weakly supervised training scheme relies on a lexicon that does not need to precisely fit the dataset. In combination with a confidence based selection of pseudo-labeled training samples, we achieve state-of-the-art query-by-example performances. Furthermore, our method allows query-by-string retrieval, which is usually not the case for other annotation-free methods.

Submitted 25 May, 2020; v1 submitted 4 March, 2020; originally announced March 2020.

Comments: Accepted to Workshop on Document Analysis Systems (DAS) 2020

arXiv:1912.04051 [pdf]

Building Executable Secure Design Models for Smart Contracts with Formal Methods

Authors: Weifeng Xu, Glenn A. Fink

Abstract: Smart contracts are appealing because they are self-executing business agreements between parties with predefined and immutable obligations and rights. However, as with all software, smart contracts may contain vulnerabilities because of design flaws, which may be exploited by one of the parties to defraud the others. In this paper, we demonstrate a systematic approach to building secure design models for smart contracts using formal methods. To build the secure models, we first model the behaviors of participating parties as state machines, and then we model the predefined obligations and rights of contracts, which specify the interactions among state machines for achieving the business goal. After that, we illustrate executable secure model design patterns in TLA+ (Temporal Logic of Actions) that guard against well-known smart contract vulnerabilities in terms of state machines and obligations and rights at the design level. These vulnerabilities are found in Ethereum contracts, including Call to the unknown, Gasless send, Reentrancy, Lost in the transfer, and Unpredictable state. The resultant TLA+ specifications are called secure models. We illustrate our approach to detecting the vulnerabilities at the design level using a real-estate contract example.

Submitted 9 December, 2019; originally announced December 2019.

Comments: 6 pages

Journal ref: The 3rd Workshop on Trusted Smart Contracts In Association with Financial Cryptography 2019, St. Kitts, Feb. 2019

arXiv:1912.03281 [pdf, other]

The mathematical structure of innovation

Authors: Thomas M. A. Fink, Ali Teimouri

Abstract: Despite our familiarity with specific technologies, the origin of new technologies remains mysterious. Are new technologies made from scratch, or are they built up recursively from new combinations of existing technologies? To answer this, we introduce a simple model of recursive innovation in which technologies are made up of components and combinations of components can be turned into new components, a process we call technological recursion. We derive a formula for the extent to which technological recursion increases or decreases the likelihood of making new technologies. We test our predictions on historical data from three domains and find that technologies are not built up from scratch, but are the result of new combinations of existing technologies. This suggests a dynamical process by which known technologies were made and a strategy for accelerating the discovery of new ones.

Submitted 6 December, 2019; originally announced December 2019.

Comments: 6 pages, 4 figures

arXiv:1910.02815 [pdf]

doi 10.1098/rsif.2019.0689

Disentangling the Behavioural Variability of Confined Cell Migration

Authors: David B. Brückner, Alexandra Fink, Joachim O. Rädler, Chase P. Broedersz

Abstract: Cell-to-cell variability is inherent to numerous biological processes, including cell migration. Quantifying and characterizing the variability of migrating cells is challenging, as it requires monitoring many cells for long time windows under identical conditions. Here, we observe the migration of single human breast cancer cells (MDA-MB-231) in confining two-state micropatterns. To describe the stochastic dynamics of this confined migration, we employ a dynamical systems approach. We identify statistics to measure the behavioural variance of the migration, which significantly exceed those predicted by a population-averaged stochastic model. This additional variance can be explained by the combination of an 'aging' process and population heterogeneity. To quantify population heterogeneity, we decompose the cells into subpopulations of slow and fast cells, revealing the presence of distinct classes of dynamical systems describing the migration, ranging from bistable to limit cycle behaviour. Our findings highlight the breadth of migration behaviours present in cell populations.

Submitted 7 October, 2019; originally announced October 2019.

Journal ref: J. R. Soc. Interface 17 (2020) 20190689

arXiv:1904.10047 [pdf, ps, other]

Equivariant K-theory classes of matrix orbit closures

Authors: Andrew Berget, Alex Fink

Abstract: The group $G = GL_r(k) \times (k^\times)^n$ acts on $\mathbf{A}^{r \times n}$, the space of $r$-by-$n$ matrices: $GL_r(k)$ acts by row operations and $(k^\times)^n$ scales columns. A matrix orbit closure is the Zariski closure of a point orbit for this action. We prove that the class of such an orbit closure in $G$ equivariant $K$-theory of $\mathbf{A}^{r \times n}$ is determined by the matroid of a generic point. We present two formulas for this class. The key to the proof is to show that matrix orbit closures have rational singularities.

Submitted 31 March, 2021; v1 submitted 22 April, 2019; originally announced April 2019.

Comments: 27pp. Expanded introduction. New section with positivity conjectures for all matroids. Comments welcome!

arXiv:1904.09868 [pdf, other]

doi 10.1103/PhysRevE.100.022308

A phase transition creates the geometry of the continuum from discrete space

Authors: Robert Stanley Farr, Thomas M. A. Fink

Abstract: Models of discrete space and space-time that exhibit continuum-like behavior at large lengths could have profound implications for physics. They may tame the infinities that arise from quantizing gravity, and dispense with the machinery of the real numbers, which has no direct observational support. Yet despite sophisticated attempts at formulating discrete space, researchers have failed to construct even the simplest geometries. We investigate graphs as the most elementary discrete models of two-dimensional space. We show that if space is discrete, it must be disordered, by proving that all planar lattice graphs exhibit the same taxicab metric as square grids. We give an explicit recipe for growing disordered discrete space by sampling a Boltzmann distribution of graphs at low temperature. We then propose three conditions which any discrete model of Euclidean space must meet: have a Hausdorff dimension of two, support unique straight lines and obey Pythagoras' theorem. Our model satisfies all three, making it the first discrete model in which continuum-like behavior is recovered at large lengths.

Submitted 27 October, 2021; v1 submitted 18 April, 2019; originally announced April 2019.

Comments: 6 pages. 7 figures

Journal ref: Phys. Rev. E 100, 022308 (2019)

arXiv:1904.07374 [pdf]

Helping IT and OT Defenders Collaborate

Authors: Glenn A. Fink, Penny McKenzie

Abstract: Cyber-physical systems, especially in critical infrastructures, have become primary hacking targets in international conflicts and diplomacy. However, cyber-physical systems present unique challenges to defenders, starting with an inability to communicate. This paper outlines the results of our interviews with information technology (IT) defenders and operational technology (OT) operators and seeks to address lessons learned from them in the structure of our notional solutions. We present two problems in this paper: (1) the difficulty of coordinating detection and response between defenders who work on the cyber/IT and physical/OT sides of cyber-physical infrastructures, and (2) the difficulty of estimating the safety state of a cyber-physical system while an intrusion is underway but before damage can be effected by the attacker. To meet these challenges, we propose two solutions: (1) a visualization that will enable communication between IT defenders and OT operators, and (2) a machine-learning approach that will estimate the distance from normal the physical system is operating and send information to the visualization.

Submitted 15 April, 2019; originally announced April 2019.

Comments: 7 pages, 6 figures, 1 table, In proceedings of The Third IEEE International Workshop on Security and Privacy for Internet of Things and Cyber-Physical Systems (IoT/CPS-Security 2018), Seattle, WA, USA, October 22, 2018

Report number: PNNL-SA-138585

arXiv:1903.10930 [pdf, other]

Exploring Confidence Measures for Word Spotting in Heterogeneous Datasets

Authors: Fabian Wolf, Philipp Oberdiek, Gernot A. Fink

Abstract: In recent years, convolutional neural networks (CNNs) took over the field of document analysis and they became the predominant model for word spotting. Especially attribute CNNs, which learn the mapping between a word image and an attribute representation, showed exceptional performances. The drawback of this approach is the overconfidence of neural networks when used out of their training distribution. In this paper, we explore different metrics for quantifying the confidence of a CNN in its predictions, specifically on the retrieval problem of word spotting. With these confidence measures, we limit the inability of a retrieval list to reject certain candidates. We investigate four different approaches that are either based on the network's attribute estimations or make use of a surrogate model. Our approach also aims at answering the question for which part of a dataset the retrieval system gives reliable results. We further show that there exists a direct relation between the proposed confidence measures and the quality of an estimated attribute representation.

Submitted 26 March, 2019; originally announced March 2019.

arXiv:1903.10332 [pdf, other]

Zero-one Schubert polynomials

Authors: Alex Fink, Karola Mészáros, Avery St. Dizier

Abstract: We prove that if $σ\in S_m$ is a pattern of $w \in S_n$, then we can express the Schubert polynomial $\mathfrak{S}_w$ as a monomial times $\mathfrak{S}_σ$ (in reindexed variables) plus a polynomial with nonnegative coefficients. This implies that the set of permutations whose Schubert polynomials have all their coefficients equal to either 0 or 1 is closed under pattern containment. Using Magyar's orthodontia, we characterize this class by a list of twelve avoided patterns. We also give other equivalent conditions on $\mathfrak{S}_w$ being zero-one. In this case, the Schubert polynomial $\mathfrak{S}_w$ is equal to the integer point transform of a generalized permutahedron.

Submitted 15 November, 2020; v1 submitted 25 March, 2019; originally announced March 2019.

Comments: 17 pages, 2 figures; graphics updated and various typos corrected in v2. More minor fixes in v3

arXiv:1903.08288 [pdf, other]

doi 10.1112/jlms.12505

Presentations of Transversal Valuated Matroids

Authors: Alex Fink, Jorge Alberto Olarte

Abstract: Given $d$ row vectors of $n$ tropical numbers, $d<n$, the tropical Stiefel map constructs a version of their row space, whose Plücker coordinates are tropical determinants. We explicitly describe the fibers of this map. From the viewpoint of matroid theory, the tropical Stiefel map defines a generalization of transversal matroids in the valuated context, and our results are the valuated generalizations of theorems of Brualdi and Dinolt, Mason and others on the set of all set families that present a given transversal matroid. We show that a connected valuated matroid is transversal if and only if all of its connected initial matroids are. The duals of our results describe complete stable intersections via valuated strict gammoids.

Submitted 29 November, 2020; v1 submitted 19 March, 2019; originally announced March 2019.

Comments: 47 pages. v2: Many examples added. Introduction and preliminaries significantly trimmed. Three new sections (5, 7, 8); this is mostly reorganization but some discussion of valuated gammoids in section 7 is new, and valuations are handled more explicitly in section 5. v3: Renovated section 6

MSC Class: 05B35; 14T05; 52B20; 52C35

arXiv:1812.06513 [pdf, other]

doi 10.1002/qj.3489

Assessing the predictability of Medicanes in ECMWF ensemble forecasts using an object-based approach

Authors: Enrico Di Muzio, Michael Riemer, Andreas H. Fink, Michael Maier-Gerber

Abstract: The predictability of eight southern European tropical-like cyclones, seven of which Medicanes, is studied evaluating ECMWF operational ensemble forecasts against operational analysis data. Forecast cyclone trajectories are compared to the cyclone trajectory in the analysis by means of a dynamic time warping technique, which allows to find a match in terms of their overall spatio-temporal similarity. Each storm is treated as an object and its forecasts are analysed using metrics that describe intensity, symmetry, compactness, and upper-level thermal structure. This object-based approach allows to focus on specific storm features, while tolerating their shifts in time and space to some extent. The compactness and symmetry of the storms are generally underpredicted, especially at long lead times. However, forecast accuracy tends to strongly improve at short lead times, indicating that the ECMWF ensemble forecast model can adequately reproduce Medicanes, albeit only few days in advance. In particular, late forecasts which have been initialised when the cyclone has already developed are distinctly more accurate than earlier forecasts in predicting its kinematic and thermal structure, confirming previous findings of high sensitivity of Medicane simulations to initial conditions. Findings reveal a markedly non-gradual evolution of ensemble forecasts with lead time. Specifically, a rapid increase in the probability of cyclone occurrence ("forecast jump") is seen in most cases, generally between 5 and 7 days lead time. Jumps are also found for ensemble median and/or spread for storm thermal structure forecasts. This behaviour is compatible with the existence of predictability barriers. On the other hand, storm position forecasts often exhibit a consistent spatial distribution of storm position uncertainty and bias between consecutive forecasts.

Submitted 16 December, 2018; originally announced December 2018.

arXiv:1811.12513 [pdf, other]

doi 10.1175/MWR-D-18-0188.1

Tropical transition of Hurricane Chris (2012) over the North Atlantic Ocean: A multi-scale investigation of predictability

Authors: Michael Maier-Gerber, Michael Riemer, Andreas H. Fink, Peter Knippertz, Enrico Di Muzio, Ron McTaggart-Cowan

Abstract: Tropical cyclones that evolve from a non-tropical origin may pose a special challenge for predictions, as they often emerge at the end of a multi-scale cascade of atmospheric processes. Climatological studies have shown that the 'tropical transition' (TT) pathway plays a prominent role in cyclogenesis, in particular over the North Atlantic Ocean. Here we use operational European Centre for Medium-Range Weather Forecasts ensemble predictions to investigate the TT of North Atlantic Hurricane Chris (2012), whose formation was preceded by the merger of two potential vorticity (PV) maxima, eventually resulting in the storm-inducing PV streamer. The principal goal is to elucidate the dynamic and thermodynamic processes governing the predictability of cyclogenesis and subsequent TT. Dynamic time warping is applied to identify ensemble tracks that are similar to the analysis track. This technique permits small temporal and spatial shifts in the development. The formation of the pre-Chris cyclone is predicted by those members that also predict the merging of the two PV maxima. The position of the storm relative to the PV streamer determines whether the pre-Chris cyclone follows the TT pathway. The transitioning storms are located inside a favorable region of high equivalent potential temperatures that result from a warm seclusion underneath the cyclonic roll-up of the PV streamer. A systematic investigation of consecutive ensemble forecasts indicates that forecast improvements are linked to specific events, such as the PV merging. The present case exemplifies how a novel combination of Eulerian and Lagrangian ensemble forecast analysis tool allows to infer physical causes of abrupt changes in predictability.

Submitted 29 November, 2018; originally announced November 2018.

Comments: 27 pages, 15 figures, supplementary material; submitted to Monthly Weather Review

arXiv:1811.11626 [pdf, other]

doi 10.1175/JCLI-D-18-0173.1

A systematic comparison of tropical waves over northern Africa. Part I: Influence on rainfall

Authors: Andreas Schlueter, Andreas H. Fink, Peter Knippertz, Peter Vogel

Abstract: Low-latitude rainfall variability on the daily to intraseasonal timescale is often related to tropical waves, including convectively coupled equatorial waves, the Madden-Julian Oscillation (MJO), and tropical disturbances. Despite the importance of rainfall variability for vulnerable societies in tropical Africa, the relative influence of tropical waves for this region is largely unknown. This article presents the first systematic comparison of the impact of six wave types on precipitation over northern tropical Africa during the transition and full monsoon seasons, using two satellite products and a dense rain gauge network. Composites of rainfall anomalies in the different datasets show comparable modulation intensities in the West Sahel and at the Guinea Coast, varying from less than 2 to above 7 mm/d depending on the wave type. African Easterly Waves (AEWs) and Kelvin waves dominate the 3-hourly to daily timescale and explain 10-30% locally. On longer timescales (7-20d), only the MJO and equatorial Rossby (ER) waves remain as modulating factors and explain about up to one third of rainfall variability. Eastward inertio-gravity waves and mixed Rossby-gravity (MRG) waves are comparatively unimportant. An analysis of wave superposition shows that low-frequency waves (MJO, ER) in their wet phase amplify the activity of high-frequency waves (TD, MRG) and suppress them in the dry phase. The results stress that more attention should be paid to tropical waves when forecasting rainfall over northern tropical Africa.

Submitted 28 November, 2018; originally announced November 2018.

Comments: 34 pages, 12 figures, supplementary material; submitted to Journal of Climate

arXiv:1811.11625 [pdf, other]

doi 10.1175/JCLI-D-18-0651.1

A systematic comparison of tropical waves over northern Africa. Part II: Dynamics and thermodynamics

Authors: Andreas Schlueter, Andreas H. Fink, Peter Knippertz

Abstract: This study presents the first systematic comparison of the (thermo-)dynamics associated with all major tropical wave types causing rainfall modulation over northern tropical Africa: Madden Julian Oscillation (MJO), Equatorial Rossby waves (ERs), mixed Rossby-gravity waves (MRGs), Kelvin waves, tropical disturbances (TDs, including African Easterly Waves), and eastward inertio-gravity waves (EIGs). Reanalysis and radiosonde data were analyzed for the period 1981--2013 based on space-time filtering of outgoing longwave radiation. The identified circulation patterns are largely consistent with theory. The slow modes, MJO and ER, mainly impact precipitable water, whereas the faster Kelvin waves, MRGs, and TDs primarily modulate moisture convergence. Monsoonal inflow intensifies during wet phases of the MJO, ERs, and MRGs, associated with a northward shift of the intertropical discontinuity for MJO and ERs. During passages of vertically tilted imbalanced wave modes, such as MJO, Kelvin waves, and TDs, and partly MRGs, increased vertical wind shear and improved conditions for up- and downdrafts facilitate the organization of convection. The balanced ERs are not tilted and rainfall is triggered by large-scale moistening and stratiform lifting. The MJO and ERs interact with intraseasonal variations of the Indian monsoon and extratropical Rossby wave trains. The latter causes a trough over the Atlas Mountains associated with a tropical plume and rainfall over the Sahara. Positive North Atlantic and Arctic Oscillation signals precede tropical plumes in case of the MJO. The results unveil which dynamical processes need to be modeled realistically to represent the coupling between tropical waves and rainfall in northern tropical Africa.

Submitted 28 November, 2018; originally announced November 2018.

Comments: 33 pages, 11 figures, supplementary material; submitted to Journal of Climate

arXiv:1806.10866 [pdf, other]

Exploring Architectures for CNN-Based Word Spotting

Authors: Eugen Rusakov, Sebastian Sudholt, Fabian Wolf, Gernot A. Fink

Abstract: The goal in word spotting is to retrieve parts of document images which are relevant with respect to a certain user-defined query. The recent past has seen attribute-based Convolutional Neural Networks take over this field of research. As is common for other fields of computer vision, the CNNs used for this task are already considerably deep. The question that arises, however, is: How complex does a CNN have to be for word spotting? Are increasingly deeper models giving increasingly better results or does performance behave asymptotically for these architectures? On the other hand, can similar results be obtained with a much smaller CNN? The goal of this paper is to give an answer to these questions. Therefore, the recently successful TPP-PHOCNet will be compared to a Residual Network, a Densely Connected Convolutional Network and a LeNet architecture empirically. As will be seen in the evaluation, a complex model can be beneficial for word spotting on harder tasks such as the IAM Offline Database but gives no advantage for easier benchmarks such as the George Washington Database.

Submitted 12 March, 2024; v1 submitted 28 June, 2018; originally announced June 2018.

arXiv:1802.09859 [pdf, other]

The Tutte polynomial via lattice point counting

Authors: Amanda Cameron, Alex Fink

Abstract: We recover the Tutte polynomial of a matroid, up to change of coordinates, from an Ehrhart-style polynomial counting lattice points in the Minkowski sum of its base polytope and scalings of simplices. Our polynomial has coefficients of alternating sign with a combinatorial interpretation closely tied to the Dawson partition. Our definition extends in a straightforward way to polymatroids, and in this setting our polynomial has Kálmán's internal and external activity polynomials as its univariate specialisations.

Submitted 27 February, 2018; originally announced February 2018.

MSC Class: 52B40

arXiv:1802.00761 [pdf, other]

Learning Attribute Representation for Human Activity Recognition

Authors: Fernando Moya Rueda, Gernot A. Fink

Abstract: Attribute representations became relevant in image recognition and word spotting, providing support under the presence of unbalance and disjoint datasets. However, for human activity recognition using sequential data from on-body sensors, human-labeled attributes are lacking. This paper introduces a search for attributes that represent favorably signal segments for recognizing human activities. It presents three deep architectures, including temporal-convolutions and an IMU centered design, for predicting attributes. An empiric evaluation of random and learned attribute representations, and as well as the networks is carried out on two datasets, outperforming the state-of-the art.

Submitted 2 February, 2018; originally announced February 2018.

Comments: 6 pages, submitted to ICPR 2018

arXiv:1801.08747 [pdf, other]

Weakly Supervised Object Detection with Pointwise Mutual Information

Authors: Rene Grzeszick, Sebastian Sudholt, Gernot A. Fink

Abstract: In this work a novel approach for weakly supervised object detection that incorporates pointwise mutual information is presented. A fully convolutional neural network architecture is applied in which the network learns one filter per object class. The resulting feature map indicates the location of objects in an image, yielding an intuitive representation of a class activation map. While traditionally such networks are learned by a softmax or binary logistic regression (sigmoid cross-entropy loss), a learning approach based on a cosine loss is introduced. A pointwise mutual information layer is incorporated in the network in order to project predictions and ground truth presence labels in a non-categorical embedding space. Thus, the cosine loss can be employed in this non-categorical representation. Besides integrating image level annotations, it is shown how to integrate point-wise annotations using a Spatial Pyramid Pooling layer. The approach is evaluated on the VOC2012 dataset for classification, point localization and weakly supervised bounding box localization. It is shown that the combination of pointwise mutual information and a cosine loss eases the learning process and thus improves the accuracy. The integration of coarse point-wise localizations further improves the results at minimal annotation costs.

Submitted 26 January, 2018; originally announced January 2018.

Showing 1–50 of 91 results for author: Fink, A