Search | arXiv e-print repository

Local Search k-means++ with Foresight

Authors: Theo Conrads, Lukas Drexler, Joshua Könen, Daniel R. Schmidt, Melanie Schmidt

Abstract: Since its introduction in 1957, Lloyd's algorithm for $k$-means clustering has been extensively studied and has undergone several improvements. While in its original form it does not guarantee any approximation factor at all, Arthur and Vassilvitskii (SODA 2007) proposed $k$-means++ which enhances Lloyd's algorithm by a seeding method which guarantees a $\mathcal{O}(\log k)$-approximation in expec… ▽ More Since its introduction in 1957, Lloyd's algorithm for $k$-means clustering has been extensively studied and has undergone several improvements. While in its original form it does not guarantee any approximation factor at all, Arthur and Vassilvitskii (SODA 2007) proposed $k$-means++ which enhances Lloyd's algorithm by a seeding method which guarantees a $\mathcal{O}(\log k)$-approximation in expectation. More recently, Lattanzi and Sohler (ICML 2019) proposed LS++ which further improves the solution quality of $k$-means++ by local search techniques to obtain a $\mathcal{O}(1)$-approximation. On the practical side, the greedy variant of $k$-means++ is often used although its worst-case behaviour is provably worse than for the standard $k$-means++ variant. We investigate how to improve LS++ further in practice. We study two options for improving the practical performance: (a) Combining LS++ with greedy $k$-means++ instead of $k$-means++, and (b) Improving LS++ by better entangling it with Lloyd's algorithm. Option (a) worsens the theoretical guarantees of $k$-means++ but improves the practical quality also in combination with LS++ as we confirm in our experiments. Option (b) is our new algorithm, Foresight LS++. We experimentally show that FLS++ improves upon the solution quality of LS++. It retains its asymptotic runtime and its worst-case approximation bounds. △ Less

Submitted 4 June, 2024; originally announced June 2024.

arXiv:2404.10433 [pdf, other]

Explainable concept map**s of MRI: Revealing the mechanisms underlying deep learning-based brain disease classification

Authors: Christian Tinauer, Anna Damulina, Maximilian Sackl, Martin Soellradl, Reduan Achtibat, Maximilian Dreyer, Frederik Pahde, Sebastian Lapuschkin, Reinhold Schmidt, Stefan Ropele, Wojciech Samek, Christian Langkammer

Abstract: Motivation. While recent studies show high accuracy in the classification of Alzheimer's disease using deep neural networks, the underlying learned concepts have not been investigated. Goals. To systematically identify changes in brain regions through concepts learned by the deep neural network for model validation. Approach. Using quantitative R2* maps we separated Alzheimer's patients (n=117… ▽ More Motivation. While recent studies show high accuracy in the classification of Alzheimer's disease using deep neural networks, the underlying learned concepts have not been investigated. Goals. To systematically identify changes in brain regions through concepts learned by the deep neural network for model validation. Approach. Using quantitative R2* maps we separated Alzheimer's patients (n=117) from normal controls (n=219) by using a convolutional neural network and systematically investigated the learned concepts using Concept Relevance Propagation and compared these results to a conventional region of interest-based analysis. Results. In line with established histological findings and the region of interest-based analyses, highly relevant concepts were primarily found in and adjacent to the basal ganglia. Impact. The identification of concepts learned by deep neural networks for disease classification enables validation of the models and could potentially improve reliability. △ Less

Submitted 16 April, 2024; originally announced April 2024.

arXiv:2402.09683 [pdf, other]

Exploring a Behavioral Model of "Positive Friction" in Human-AI Interaction

Authors: Zeya Chen, Ruth Schmidt

Abstract: Designing seamless, frictionless user experiences has long been a dominant trend in both applied behavioral science and artificial intelligence (AI), in which the goal of making desirable actions easy and efficient informs efforts to minimize friction in user experiences. However, in some settings, friction can be genuinely beneficial, such as the insertion of deliberate delays to increase reflect… ▽ More Designing seamless, frictionless user experiences has long been a dominant trend in both applied behavioral science and artificial intelligence (AI), in which the goal of making desirable actions easy and efficient informs efforts to minimize friction in user experiences. However, in some settings, friction can be genuinely beneficial, such as the insertion of deliberate delays to increase reflection, preventing individuals from resorting to automatic or biased behaviors, and enhancing opportunities for unexpected discoveries. More recently, the popularization and availability of AI on a widespread scale has only increased the need to examine how friction can help or hinder users of AI; it also suggests a need to consider how positive friction can benefit AI practitioners, both during development processes (e.g., working with diverse teams) and to inform how AI is designed into offerings. This paper first proposes a "positive friction" model that can help characterize how friction is currently beneficial in user and developer experiences with AI, diagnose the potential need for friction where it may not yet exist in these contexts, and inform how positive friction can be used to generate solutions, especially as advances in AI continue to be progress and new opportunities emerge. It then explores this model in the context of AI users and developers by proposing the value of taking a hybrid "AI+human" lens, and concludes by suggesting questions for further exploration. △ Less

Submitted 14 February, 2024; originally announced February 2024.

Comments: This preprint has not undergone peer review or any post-submission corrections. The Version of Record of this contribution will be published in Springer Nature Computer Science book series in Volume HCI International 2024

Journal ref: DESIGN, USER EXPERIENCE AND USABILITY. HCII 2024

arXiv:2307.13078 [pdf, other]

Adaptive Certified Training: Towards Better Accuracy-Robustness Tradeoffs

Authors: Zhakshylyk Nurlanov, Frank R. Schmidt, Florian Bernard

Abstract: As deep learning models continue to advance and are increasingly utilized in real-world systems, the issue of robustness remains a major challenge. Existing certified training methods produce models that achieve high provable robustness guarantees at certain perturbation levels. However, the main problem of such models is a dramatically low standard accuracy, i.e. accuracy on clean unperturbed dat… ▽ More As deep learning models continue to advance and are increasingly utilized in real-world systems, the issue of robustness remains a major challenge. Existing certified training methods produce models that achieve high provable robustness guarantees at certain perturbation levels. However, the main problem of such models is a dramatically low standard accuracy, i.e. accuracy on clean unperturbed data, that makes them impractical. In this work, we consider a more realistic perspective of maximizing the robustness of a model at certain levels of (high) standard accuracy. To this end, we propose a novel certified training method based on a key insight that training with adaptive certified radii helps to improve both the accuracy and robustness of the model, advancing state-of-the-art accuracy-robustness tradeoffs. We demonstrate the effectiveness of the proposed method on MNIST, CIFAR-10, and TinyImageNet datasets. Particularly, on CIFAR-10 and TinyImageNet, our method yields models with up to two times higher robustness, measured as an average certified radius of a test set, at the same levels of standard accuracy compared to baseline approaches. △ Less

Submitted 24 July, 2023; originally announced July 2023.

Comments: Presented at ICML 2023 workshop "New Frontiers in Adversarial Machine Learning"

arXiv:2306.05242 [pdf, other]

Efficient Multi-Task Scene Analysis with RGB-D Transformers

Authors: Söhnke Benedikt Fischedick, Daniel Seichter, Robin Schmidt, Leonard Rabes, Horst-Michael Gross

Abstract: Scene analysis is essential for enabling autonomous systems, such as mobile robots, to operate in real-world environments. However, obtaining a comprehensive understanding of the scene requires solving multiple tasks, such as panoptic segmentation, instance orientation estimation, and scene classification. Solving these tasks given limited computing and battery capabilities on mobile platforms is… ▽ More Scene analysis is essential for enabling autonomous systems, such as mobile robots, to operate in real-world environments. However, obtaining a comprehensive understanding of the scene requires solving multiple tasks, such as panoptic segmentation, instance orientation estimation, and scene classification. Solving these tasks given limited computing and battery capabilities on mobile platforms is challenging. To address this challenge, we introduce an efficient multi-task scene analysis approach, called EMSAFormer, that uses an RGB-D Transformer-based encoder to simultaneously perform the aforementioned tasks. Our approach builds upon the previously published EMSANet. However, we show that the dual CNN-based encoder of EMSANet can be replaced with a single Transformer-based encoder. To achieve this, we investigate how information from both RGB and depth data can be effectively incorporated in a single encoder. To accelerate inference on robotic hardware, we provide a custom NVIDIA TensorRT extension enabling highly optimization for our EMSAFormer approach. Through extensive experiments on the commonly used indoor datasets NYUv2, SUNRGB-D, and ScanNet, we show that our approach achieves state-of-the-art performance while still enabling inference with up to 39.1 FPS on an NVIDIA Jetson AGX Orin 32 GB. △ Less

Submitted 8 June, 2023; originally announced June 2023.

Comments: To be published in IEEE International Joint Conference on Neural Networks (IJCNN) 2023

arXiv:2305.02665 [pdf, other]

Learning Language-Specific Layers for Multilingual Machine Translation

Authors: Telmo Pessoa Pires, Robin M. Schmidt, Yi-Hsiu Liao, Stephan Peitz

Abstract: Multilingual Machine Translation promises to improve translation quality between non-English languages. This is advantageous for several reasons, namely lower latency (no need to translate twice), and reduced error cascades (e.g., avoiding losing gender and formality information when translating through English). On the downside, adding more languages reduces model capacity per language, which is… ▽ More Multilingual Machine Translation promises to improve translation quality between non-English languages. This is advantageous for several reasons, namely lower latency (no need to translate twice), and reduced error cascades (e.g., avoiding losing gender and formality information when translating through English). On the downside, adding more languages reduces model capacity per language, which is usually countered by increasing the overall model size, making training harder and inference slower. In this work, we introduce Language-Specific Transformer Layers (LSLs), which allow us to increase model capacity, while kee** the amount of computation and the number of parameters used in the forward pass constant. The key idea is to have some layers of the encoder be source or target language-specific, while kee** the remaining layers shared. We study the best way to place these layers using a neural architecture search inspired approach, and achieve an improvement of 1.3 chrF (1.5 spBLEU) points over not using LSLs on a separate decoder architecture, and 1.9 chrF (2.2 spBLEU) on a shared decoder one. △ Less

Submitted 4 May, 2023; originally announced May 2023.

Comments: Accepted at ACL 2023

arXiv:2304.12776 [pdf, other]

State Spaces Aren't Enough: Machine Translation Needs Attention

Authors: Ali Vardasbi, Telmo Pessoa Pires, Robin M. Schmidt, Stephan Peitz

Abstract: Structured State Spaces for Sequences (S4) is a recently proposed sequence model with successful applications in various tasks, e.g. vision, language modeling, and audio. Thanks to its mathematical formulation, it compresses its input to a single hidden state, and is able to capture long range dependencies while avoiding the need for an attention mechanism. In this work, we apply S4 to Machine Tra… ▽ More Structured State Spaces for Sequences (S4) is a recently proposed sequence model with successful applications in various tasks, e.g. vision, language modeling, and audio. Thanks to its mathematical formulation, it compresses its input to a single hidden state, and is able to capture long range dependencies while avoiding the need for an attention mechanism. In this work, we apply S4 to Machine Translation (MT), and evaluate several encoder-decoder variants on WMT'14 and WMT'16. In contrast with the success in language modeling, we find that S4 lags behind the Transformer by approximately 4 BLEU points, and that it counter-intuitively struggles with long sentences. Finally, we show that this gap is caused by S4's inability to summarize the full source sentence in a single hidden state, and show that we can close the gap by introducing an attention mechanism. △ Less

Submitted 25 April, 2023; originally announced April 2023.

arXiv:2212.00780 [pdf]

Universe Points Representation Learning for Partial Multi-Graph Matching

Authors: Zhakshylyk Nurlanov, Frank R. Schmidt, Florian Bernard

Abstract: Many challenges from natural world can be formulated as a graph matching problem. Previous deep learning-based methods mainly consider a full two-graph matching setting. In this work, we study the more general partial matching problem with multi-graph cycle consistency guarantees. Building on a recent progress in deep learning on graphs, we propose a novel data-driven method (URL) for partial mult… ▽ More Many challenges from natural world can be formulated as a graph matching problem. Previous deep learning-based methods mainly consider a full two-graph matching setting. In this work, we study the more general partial matching problem with multi-graph cycle consistency guarantees. Building on a recent progress in deep learning on graphs, we propose a novel data-driven method (URL) for partial multi-graph matching, which uses an object-to-universe formulation and learns latent representations of abstract universe points. The proposed approach advances the state of the art in semantic keypoint matching problem, evaluated on Pascal VOC, CUB, and Willow datasets. Moreover, the set of controlled experiments on a synthetic graph matching dataset demonstrates the scalability of our method to graphs with large number of nodes and its robustness to high partiality. △ Less

Submitted 7 December, 2022; v1 submitted 1 December, 2022; originally announced December 2022.

Comments: To appear in AAAI 2023

arXiv:2210.14484 [pdf, other]

doi 10.1016/j.inffus.2024.102524

Imputation of missing values in multi-view data

Authors: Wouter van Loon, Marjolein Fokkema, Frank de Vos, Marisa Koini, Reinhold Schmidt, Mark de Rooij

Abstract: Data for which a set of objects is described by multiple distinct feature sets (called views) is known as multi-view data. When missing values occur in multi-view data, all features in a view are likely to be missing simultaneously. This may lead to very large quantities of missing data which, especially when combined with high-dimensionality, can make the application of conditional imputation met… ▽ More Data for which a set of objects is described by multiple distinct feature sets (called views) is known as multi-view data. When missing values occur in multi-view data, all features in a view are likely to be missing simultaneously. This may lead to very large quantities of missing data which, especially when combined with high-dimensionality, can make the application of conditional imputation methods computationally infeasible. However, the multi-view structure could be leveraged to reduce the complexity and computational load of imputation. We introduce a new imputation method based on the existing stacked penalized logistic regression (StaPLR) algorithm for multi-view learning. It performs imputation in a dimension-reduced space to address computational challenges inherent to the multi-view context. We compare the performance of the new imputation method with several existing imputation algorithms in simulated data sets and a real data application. The results show that the new imputation method leads to competitive results at a much lower computational cost, and makes the use of advanced imputation algorithms such as missForest and predictive mean matching possible in settings where they would otherwise be computationally infeasible. △ Less

Submitted 20 June, 2024; v1 submitted 26 October, 2022; originally announced October 2022.

Comments: 49 pages, 15 figures. Accepted manuscript

Journal ref: Information Fusion 111 (2024) 102524

arXiv:2208.05365 [pdf, other]

Saturation-based Boolean conjunctive query answering and rewriting for the guarded quantification fragments

Authors: Sen Zheng, Renate A. Schmidt

Abstract: Query answering is an important problem in AI, database and knowledge representation. In this paper, we develop saturation-based Boolean conjunctive query answering and rewriting procedures for the guarded, the loosely guarded and the clique-guarded fragments. Our query answering procedure improves existing resolution-based decision procedures for the guarded and the loosely guarded fragments and… ▽ More Query answering is an important problem in AI, database and knowledge representation. In this paper, we develop saturation-based Boolean conjunctive query answering and rewriting procedures for the guarded, the loosely guarded and the clique-guarded fragments. Our query answering procedure improves existing resolution-based decision procedures for the guarded and the loosely guarded fragments and this procedure solves Boolean conjunctive query answering problems for the guarded, the loosely guarded and the clique-guarded fragments. Based on this query answering procedure, we also introduce a novel saturation-based query rewriting procedure for these guarded fragments. Unlike mainstream query answering and rewriting methods, our procedures derive a compact and reusable saturation, namely a closure of formulas, to handle the challenge of querying for distributed datasets. This paper lays the theoretical foundations for the first automated deduction decision procedures for Boolean conjunctive query answering and the first saturation-based Boolean conjunctive query rewriting in the guarded, the loosely guarded and the clique-guarded fragments. △ Less

Submitted 16 October, 2023; v1 submitted 10 August, 2022; originally announced August 2022.

arXiv:2205.10577 [pdf, other]

Non-Autoregressive Neural Machine Translation: A Call for Clarity

Authors: Robin M. Schmidt, Telmo Pires, Stephan Peitz, Jonas Lööf

Abstract: Non-autoregressive approaches aim to improve the inference speed of translation models by only requiring a single forward pass to generate the output sequence instead of iteratively producing each predicted token. Consequently, their translation quality still tends to be inferior to their autoregressive counterparts due to several issues involving output token interdependence. In this work, we tak… ▽ More Non-autoregressive approaches aim to improve the inference speed of translation models by only requiring a single forward pass to generate the output sequence instead of iteratively producing each predicted token. Consequently, their translation quality still tends to be inferior to their autoregressive counterparts due to several issues involving output token interdependence. In this work, we take a step back and revisit several techniques that have been proposed for improving non-autoregressive translation models and compare their combined translation quality and speed implications under third-party testing environments. We provide novel insights for establishing strong baselines using length prediction or CTC-based architecture variants and contribute standardized BLEU, chrF++, and TER scores using sacreBLEU on four translation tasks, which crucially have been missing as inconsistencies in the use of tokenized BLEU lead to deviations of up to 1.7 BLEU points. Our open-sourced code is integrated into fairseq for reproducibility. △ Less

Submitted 21 October, 2022; v1 submitted 21 May, 2022; originally announced May 2022.

Comments: Accepted at EMNLP 2022

arXiv:2201.04816 [pdf, other]

doi 10.1109/CogMI52975.2021.00023

Towards a trustworthy, secure and reliable enclave for machine learning in a hospital setting: The Essen Medical Computing Platform (EMCP)

Authors: Hendrik F. R. Schmidt, Jörg Schlötterer, Marcel Bargull, Enrico Nasca, Ryan Aydelott, Christin Seifert, Folker Meyer

Abstract: AI/Computing at scale is a difficult problem, especially in a health care setting. We outline the requirements, planning and implementation choices as well as the guiding principles that led to the implementation of our secure research computing enclave, the Essen Medical Computing Platform (EMCP), affiliated with a major German hospital. Compliance, data privacy and usability were the immutable r… ▽ More AI/Computing at scale is a difficult problem, especially in a health care setting. We outline the requirements, planning and implementation choices as well as the guiding principles that led to the implementation of our secure research computing enclave, the Essen Medical Computing Platform (EMCP), affiliated with a major German hospital. Compliance, data privacy and usability were the immutable requirements of the system. We will discuss the features of our computing enclave and we will provide our recipe for groups wishing to adopt a similar setup. △ Less

Submitted 13 January, 2022; originally announced January 2022.

Comments: 9 pages, 5 figures, to be published in the proceedings of the 2021 IEEE CogMI conference. Christin Seifert and Folker Meyer are co-senior authors

arXiv:2112.02721 [pdf, other]

NL-Augmenter: A Framework for Task-Sensitive Natural Language Augmentation

Authors: Kaustubh D. Dhole, Varun Gangal, Sebastian Gehrmann, Aadesh Gupta, Zhenhao Li, Saad Mahamood, Abinaya Mahendiran, Simon Mille, Ashish Shrivastava, Samson Tan, Tongshuang Wu, Jascha Sohl-Dickstein, **ho D. Choi, Eduard Hovy, Ondrej Dusek, Sebastian Ruder, Sajant Anand, Nagender Aneja, Rabin Banjade, Lisa Barthe, Hanna Behnke, Ian Berlot-Attwell, Connor Boyle, Caroline Brun, Marco Antonio Sobrevilla Cabezudo , et al. (101 additional authors not shown)

Abstract: Data augmentation is an important component in the robustness evaluation of models in natural language processing (NLP) and in enhancing the diversity of the data they are trained on. In this paper, we present NL-Augmenter, a new participatory Python-based natural language augmentation framework which supports the creation of both transformations (modifications to the data) and filters (data split… ▽ More Data augmentation is an important component in the robustness evaluation of models in natural language processing (NLP) and in enhancing the diversity of the data they are trained on. In this paper, we present NL-Augmenter, a new participatory Python-based natural language augmentation framework which supports the creation of both transformations (modifications to the data) and filters (data splits according to specific features). We describe the framework and an initial set of 117 transformations and 23 filters for a variety of natural language tasks. We demonstrate the efficacy of NL-Augmenter by using several of its transformations to analyze the robustness of popular natural language models. The infrastructure, datacards and robustness analysis results are available publicly on the NL-Augmenter repository (https://github.com/GEM-benchmark/NL-Augmenter). △ Less

Submitted 11 October, 2022; v1 submitted 5 December, 2021; originally announced December 2021.

Comments: 39 pages, repository at https://github.com/GEM-benchmark/NL-Augmenter

arXiv:2108.05761 [pdf, other]

doi 10.3389/fnins.2022.830630

Analyzing hierarchical multi-view MRI data with StaPLR: An application to Alzheimer's disease classification

Authors: Wouter van Loon, Frank de Vos, Marjolein Fokkema, Botond Szabo, Marisa Koini, Reinhold Schmidt, Mark de Rooij

Abstract: Multi-view data refers to a setting where features are divided into feature sets, for example because they correspond to different sources. Stacked penalized logistic regression (StaPLR) is a recently introduced method that can be used for classification and automatically selecting the views that are most important for prediction. We introduce an extension of this method to a setting where the dat… ▽ More Multi-view data refers to a setting where features are divided into feature sets, for example because they correspond to different sources. Stacked penalized logistic regression (StaPLR) is a recently introduced method that can be used for classification and automatically selecting the views that are most important for prediction. We introduce an extension of this method to a setting where the data has a hierarchical multi-view structure. We also introduce a new view importance measure for StaPLR, which allows us to compare the importance of views at any level of the hierarchy. We apply our extended StaPLR algorithm to Alzheimer's disease classification where different MRI measures have been calculated from three scan types: structural MRI, diffusion-weighted MRI, and resting-state fMRI. StaPLR can identify which scan types and which derived MRI measures are most important for classification, and it outperforms elastic net regression in classification performance. △ Less

Submitted 26 April, 2022; v1 submitted 12 August, 2021; originally announced August 2021.

Comments: 36 pages, 9 figures. Accepted manuscript

Journal ref: Frontiers in Neuroscience 16:830630 (2022) 1-15

arXiv:2107.07714 [pdf, other]

doi 10.1016/j.nima.2021.165932

Optical Inspection of the Silicon Micro-strip Sensors for the CBM Experiment employing Artificial Intelligence

Authors: E. Lavrik, M. Shiroya, H. R. Schmidt, A. Toia, J. M. Heuser

Abstract: Optical inspection of 1191 silicon micro-strip sensors was performed using a custom made optical inspection setup, employing a machine-learning based approach for the defect analysis and subsequent quality assurance. Furthermore, metrological control of the sensor's surface was performed. In this manuscript, we present the analysis of various sensor surface defects. Among these are implant breaks,… ▽ More Optical inspection of 1191 silicon micro-strip sensors was performed using a custom made optical inspection setup, employing a machine-learning based approach for the defect analysis and subsequent quality assurance. Furthermore, metrological control of the sensor's surface was performed. In this manuscript, we present the analysis of various sensor surface defects. Among these are implant breaks, p-stop breaks, aluminium strip opens, aluminium strip shorts, surface scratches, double metallization layer defects, passivation layer defects, bias resistor defects as well as dust particle identification. The defect detection was done using the application of Convolutional Deep Neural Networks (CDNNs). From this, defective strips and defect clusters were identified, as well as a 2D map of the defects using their geometrical positions on the sensor was performed. Based on the total number of defects found on the sensor's surface, a method for the estimation of sensor's overall quality grade and quality score was proposed. △ Less

Submitted 27 September, 2021; v1 submitted 16 July, 2021; originally announced July 2021.

arXiv:2105.10225 [pdf, ps, other]

Finding all minimum cost flows and a faster algorithm for the K best flow problem

Authors: David Könen, Daniel R. Schmidt, Christiane Spisla

Abstract: This paper addresses the problem of determining all optimal integer solutions of a linear integer network flow problem, which we call the all optimal integer flow (AOF) problem. We derive an O(F (m + n) + mn + M ) time algorithm to determine all F many optimal integer flows in a directed network with n nodes and m arcs, where M is the best time needed to find one minimum cost flow. We remark that… ▽ More This paper addresses the problem of determining all optimal integer solutions of a linear integer network flow problem, which we call the all optimal integer flow (AOF) problem. We derive an O(F (m + n) + mn + M ) time algorithm to determine all F many optimal integer flows in a directed network with n nodes and m arcs, where M is the best time needed to find one minimum cost flow. We remark that stop** Hamacher's well-known method for the determination of the K best integer flows at the first sub-optimal flow results in an algorithm with a running time of O(F m(n log n + m) + M ) for solving the AOF problem. Our improvement is essentially made possible by replacing the shortest path sub-problem with a more efficient way to determine a so called proper zero cost cycle using a modified depth-first search technique. As a byproduct, our analysis yields an enhanced algorithm to determine the K best integer flows that runs in O(Kn3 + M ). Besides, we give lower and upper bounds for the number of all optimal integer and feasible integer solutions. Our bounds are based on the fact that any optimal solution can be obtained by an initial optimal tree solution plus a conical combination of incidence vectors of all induced cycles with bounded coefficients. △ Less

Submitted 27 January, 2022; v1 submitted 21 May, 2021; originally announced May 2021.

arXiv:2104.01742 [pdf, other]

Explainability-aided Domain Generalization for Image Classification

Authors: Robin M. Schmidt

Abstract: Traditionally, for most machine learning settings, gaining some degree of explainability that tries to give users more insights into how and why the network arrives at its predictions, restricts the underlying model and hinders performance to a certain degree. For example, decision trees are thought of as being more explainable than deep neural networks but they lack performance on visual tasks. I… ▽ More Traditionally, for most machine learning settings, gaining some degree of explainability that tries to give users more insights into how and why the network arrives at its predictions, restricts the underlying model and hinders performance to a certain degree. For example, decision trees are thought of as being more explainable than deep neural networks but they lack performance on visual tasks. In this work, we empirically demonstrate that applying methods and architectures from the explainability literature can, in fact, achieve state-of-the-art performance for the challenging task of domain generalization while offering a framework for more insights into the prediction and training process. For that, we develop a set of novel algorithms including DivCAM, an approach where the network receives guidance during training via gradient based class activation maps to focus on a diverse set of discriminative features, as well as ProDrop and D-Transformers which apply prototypical networks to the domain generalization task, either with self-challenging or attention alignment. Since these methods offer competitive performance on top of explainability, we argue that the proposed methods can be used as a tool to improve the robustness of deep neural network architectures. △ Less

Submitted 4 April, 2021; originally announced April 2021.

arXiv:2008.10117 [pdf, other]

Collaborative Filtering under Model Uncertainty

Authors: Robin M. Schmidt, Moritz Hahn

Abstract: In their work, Dean, Rich, and Recht create a model to research recourse and availability of items in a recommender system. We used the definition of predictive multiplicity by Marx, Pin Calmon, and Ustun to examine different variations of this model, using different values for two model parameters. Pairwise comparison of their models show, that most of these models produce very similar results in… ▽ More In their work, Dean, Rich, and Recht create a model to research recourse and availability of items in a recommender system. We used the definition of predictive multiplicity by Marx, Pin Calmon, and Ustun to examine different variations of this model, using different values for two model parameters. Pairwise comparison of their models show, that most of these models produce very similar results in terms of discrepancy and ambiguity for the availability and only in some cases the availability sets differ significantly. △ Less

Submitted 24 August, 2020; v1 submitted 23 August, 2020; originally announced August 2020.

Comments: v2: small display fix in affiliation

arXiv:2007.01547 [pdf, other]

Descending through a Crowded Valley - Benchmarking Deep Learning Optimizers

Authors: Robin M. Schmidt, Frank Schneider, Philipp Hennig

Abstract: Choosing the optimizer is considered to be among the most crucial design decisions in deep learning, and it is not an easy one. The growing literature now lists hundreds of optimization methods. In the absence of clear theoretical guidance and conclusive empirical evidence, the decision is often made based on anecdotes. In this work, we aim to replace these anecdotes, if not with a conclusive rank… ▽ More Choosing the optimizer is considered to be among the most crucial design decisions in deep learning, and it is not an easy one. The growing literature now lists hundreds of optimization methods. In the absence of clear theoretical guidance and conclusive empirical evidence, the decision is often made based on anecdotes. In this work, we aim to replace these anecdotes, if not with a conclusive ranking, then at least with evidence-backed heuristics. To do so, we perform an extensive, standardized benchmark of fifteen particularly popular deep learning optimizers while giving a concise overview of the wide range of possible choices. Analyzing more than $50,000$ individual runs, we contribute the following three points: (i) Optimizer performance varies greatly across tasks. (ii) We observe that evaluating multiple optimizers with default parameters works approximately as well as tuning the hyperparameters of a single, fixed optimizer. (iii) While we cannot discern an optimization method clearly dominating across all tested tasks, we identify a significantly reduced subset of specific optimizers and parameter choices that generally lead to competitive results in our experiments: Adam remains a strong contender, with newer methods failing to significantly and consistently outperform it. Our open-sourced results are available as challenging and well-tuned baselines for more meaningful evaluations of novel optimization methods without requiring any further computational efforts. △ Less

Submitted 10 August, 2021; v1 submitted 3 July, 2020; originally announced July 2020.

Comments: Raw results: https://github.com/SirRob1997/Crowded-Valley---Results

arXiv:2007.00757 [pdf, ps, other]

Signature-Based Abduction for Expressive Description Logics -- Technical Report

Authors: Patrick Koopmann, Warren Del-Pinto, Sophie Tourret, Renate A. Schmidt

Abstract: Signature-based abduction aims at building hypotheses over a specified set of names, the signature, that explain an observation relative to some background knowledge. This type of abduction is useful for tasks such as diagnosis, where the vocabulary used for observed symptoms differs from the vocabulary expected to explain those symptoms. We present the first complete method solving signature-base… ▽ More Signature-based abduction aims at building hypotheses over a specified set of names, the signature, that explain an observation relative to some background knowledge. This type of abduction is useful for tasks such as diagnosis, where the vocabulary used for observed symptoms differs from the vocabulary expected to explain those symptoms. We present the first complete method solving signature-based abduction for observations expressed in the expressive description logic ALC, which can include TBox and ABox axioms, thereby solving the knowledge base abduction problem. The method is guaranteed to compute a finite and complete set of hypotheses, and is evaluated on a set of realistic knowledge bases. △ Less

Submitted 8 July, 2020; v1 submitted 1 July, 2020; originally announced July 2020.

Comments: 13 pages, 1 figure

arXiv:2007.00147 [pdf, other]

Neural Network Virtual Sensors for Fuel Injection Quantities with Provable Performance Specifications

Authors: Eric Wong, Tim Schneider, Joerg Schmitt, Frank R. Schmidt, J. Zico Kolter

Abstract: Recent work has shown that it is possible to learn neural networks with provable guarantees on the output of the model when subject to input perturbations, however these works have focused primarily on defending against adversarial examples for image classifiers. In this paper, we study how these provable guarantees can be naturally applied to other real world settings, namely getting performance… ▽ More Recent work has shown that it is possible to learn neural networks with provable guarantees on the output of the model when subject to input perturbations, however these works have focused primarily on defending against adversarial examples for image classifiers. In this paper, we study how these provable guarantees can be naturally applied to other real world settings, namely getting performance specifications for robust virtual sensors measuring fuel injection quantities within an engine. We first demonstrate that, in this setting, even simple neural network models are highly susceptible to reasonable levels of adversarial sensor noise, which are capable of increasing the mean relative error of a standard neural network from 6.6% to 43.8%. We then leverage methods for learning provably robust networks and verifying robustness properties, resulting in a robust model which we can provably guarantee has at most 16.5% mean relative error under any sensor noise. Additionally, we show how specific intervals of fuel injection quantities can be targeted to maximize robustness for certain ranges, allowing us to train a virtual sensor for fuel injection which is provably guaranteed to have at most 10.69% relative error under noise while maintaining 3% relative error on non-adversarial data within normalized fuel injection ranges of 0.6 to 1.0. △ Less

Submitted 30 June, 2020; originally announced July 2020.

arXiv:2002.02228 [pdf, other]

Querying Guarded Fragments via Resolution

Authors: Sen Zheng, Renate A. Schmidt

Abstract: Answering Boolean conjunctive queries over the guarded fragment is decidable, however, as yet no practical decision procedure exists. Meanwhile, ordered resolution, as a practically oriented algorithm, is widely used in state-of-art modern theorem provers. In this paper, we devise a resolution decision procedure, which not only proves decidability of querying of the guarded fragment, but is implem… ▽ More Answering Boolean conjunctive queries over the guarded fragment is decidable, however, as yet no practical decision procedure exists. Meanwhile, ordered resolution, as a practically oriented algorithm, is widely used in state-of-art modern theorem provers. In this paper, we devise a resolution decision procedure, which not only proves decidability of querying of the guarded fragment, but is implementable in any `off-the-shelf' resolution theorem prover with modest effort. Further, we extend the procedure to querying the loosely guarded fragment. The difficulty in querying a knowledge base of (loosely) guarded clauses is that query clauses are not guarded. We show however there are ways to reformulate query clauses into (loosely) guarded clauses either directly via the separation and splitting rules, or via performing inferences using our top-variable inference system combining with a form of dynamic renaming. Therefore, the problem of querying the (loosely) guarded fragment can be reduced to deciding the (loosely) guarded fragment and possibly irreducible query clauses, i.e., a clause that cannot derive any new conclusion. Meanwhile, our procedure yields a goal-oriented query rewriting algorithm: Before introducing datasets, one can produce a saturated clausal set $\mathcal{S}$ using given BCQs and (loosely) guarded theories. Clauses in $\mathcal{S}$ can be easily transformed first-order queries, therefore query answering over the (loosely) guarded fragment is reduced to evaluating a union of first-order queries over datasets. As far as we know, this is the first practical decision procedure for answering and rewriting Boolean conjunctive queries over the guarded fragment and the loosely guarded fragment. △ Less

Submitted 21 July, 2020; v1 submitted 6 February, 2020; originally announced February 2020.

arXiv:2001.03829 [pdf, ps, other]

Deciding the Loosely Guarded Fragment and Querying Its Horn Fragment Using Resolution

Authors: Sen Zheng, Renate A. Schmidt

Abstract: We consider the following query answering problem: Given a Boolean conjunctive query and a theory in the Horn loosely guarded fragment, the aim is to determine whether the query is entailed by the theory. In this paper, we present a resolution decision procedure for the loosely guarded fragment, and use such a procedure to answer Boolean conjunctive queries against the Horn loosely guarded fragmen… ▽ More We consider the following query answering problem: Given a Boolean conjunctive query and a theory in the Horn loosely guarded fragment, the aim is to determine whether the query is entailed by the theory. In this paper, we present a resolution decision procedure for the loosely guarded fragment, and use such a procedure to answer Boolean conjunctive queries against the Horn loosely guarded fragment. The Horn loosely guarded fragment subsumes classes of rules that are prevalent in ontology-based query answering, such as Horn ALCHOI and guarded existential rules. Additionally, we identify star queries and cloud queries, which using our procedure, can be answered against the loosely guarded fragment. △ Less

Submitted 11 January, 2020; originally announced January 2020.

arXiv:1912.05911 [pdf, other]

Recurrent Neural Networks (RNNs): A gentle Introduction and Overview

Authors: Robin M. Schmidt

Abstract: State-of-the-art solutions in the areas of "Language Modelling & Generating Text", "Speech Recognition", "Generating Image Descriptions" or "Video Tagging" have been using Recurrent Neural Networks as the foundation for their approaches. Understanding the underlying concepts is therefore of tremendous importance if we want to keep up with recent or upcoming publications in those areas. In this wor… ▽ More State-of-the-art solutions in the areas of "Language Modelling & Generating Text", "Speech Recognition", "Generating Image Descriptions" or "Video Tagging" have been using Recurrent Neural Networks as the foundation for their approaches. Understanding the underlying concepts is therefore of tremendous importance if we want to keep up with recent or upcoming publications in those areas. In this work we give a short overview over some of the most important concepts in the realm of Recurrent Neural Networks which enables readers to easily understand the fundamentals such as but not limited to "Backpropagation through Time" or "Long Short-Term Memory Units" as well as some of the more recent advances like the "Attention Mechanism" or "Pointer Networks". We also give recommendations for further reading regarding more complex topics where it is necessary. △ Less

Submitted 23 November, 2019; originally announced December 2019.

arXiv:1904.00759 [pdf, other]

Adversarial camera stickers: A physical camera-based attack on deep learning systems

Authors: Juncheng Li, Frank R. Schmidt, J. Zico Kolter

Abstract: Recent work has documented the susceptibility of deep learning systems to adversarial examples, but most such attacks directly manipulate the digital input to a classifier. Although a smaller line of work considers physical adversarial attacks, in all cases these involve manipulating the object of interest, e.g., putting a physical sticker on an object to misclassify it, or manufacturing an object… ▽ More Recent work has documented the susceptibility of deep learning systems to adversarial examples, but most such attacks directly manipulate the digital input to a classifier. Although a smaller line of work considers physical adversarial attacks, in all cases these involve manipulating the object of interest, e.g., putting a physical sticker on an object to misclassify it, or manufacturing an object specifically intended to be misclassified. In this work, we consider an alternative question: is it possible to fool deep classifiers, over all perceived objects of a certain type, by physically manipulating the camera itself? We show that by placing a carefully crafted and mainly-translucent sticker over the lens of a camera, one can create universal perturbations of the observed images that are inconspicuous, yet misclassify target objects as a different (targeted) class. To accomplish this, we propose an iterative procedure for both updating the attack perturbation (to make it adversarial for a given classifier), and the threat model itself (to ensure it is physically realizable). For example, we show that we can achieve physically-realizable attacks that fool ImageNet classifiers in a targeted fashion 49.6% of the time. This presents a new class of physically-realizable threat models to consider in the context of adversarially robust machine learning. Our demo video can be viewed at: https://youtu.be/wUVmL33Fx54 △ Less

Submitted 8 June, 2019; v1 submitted 21 March, 2019; originally announced April 2019.

Journal ref: Proceedings of the 36th International Conference on Machine Learning, PMLR 97:3896-3904, 2019

arXiv:1902.07906 [pdf, other]

Wasserstein Adversarial Examples via Projected Sinkhorn Iterations

Authors: Eric Wong, Frank R. Schmidt, J. Zico Kolter

Abstract: A rapidly growing area of work has studied the existence of adversarial examples, datapoints which have been perturbed to fool a classifier, but the vast majority of these works have focused primarily on threat models defined by $\ell_p$ norm-bounded perturbations. In this paper, we propose a new threat model for adversarial attacks based on the Wasserstein distance. In the image classification se… ▽ More A rapidly growing area of work has studied the existence of adversarial examples, datapoints which have been perturbed to fool a classifier, but the vast majority of these works have focused primarily on threat models defined by $\ell_p$ norm-bounded perturbations. In this paper, we propose a new threat model for adversarial attacks based on the Wasserstein distance. In the image classification setting, such distances measure the cost of moving pixel mass, which naturally cover "standard" image manipulations such as scaling, rotation, translation, and distortion (and can potentially be applied to other settings as well). To generate Wasserstein adversarial examples, we develop a procedure for projecting onto the Wasserstein ball, based upon a modified version of the Sinkhorn iteration. The resulting algorithm can successfully attack image classification models, bringing traditional CIFAR10 models down to 3% accuracy within a Wasserstein ball with radius 0.1 (i.e., moving 10% of the image mass 1 pixel), and we demonstrate that PGD-based adversarial training can improve this adversarial accuracy to 76%. In total, this work opens up a new direction of study in adversarial robustness, more formally considering convex metrics that accurately capture the invariances that we typically believe should exist in classifiers. Code for all experiments in the paper is available at https://github.com/locuslab/projected_sinkhorn. △ Less

Submitted 18 January, 2020; v1 submitted 21 February, 2019; originally announced February 2019.

arXiv:1812.04994 [pdf, ps, other]

Bayesian deep neural networks for low-cost neurophysiological markers of Alzheimer's disease severity

Authors: Wolfgang Fruehwirt, Adam D. Cobb, Martin Mairhofer, Leonard Weydemann, Heinrich Garn, Reinhold Schmidt, Thomas Benke, Peter Dal-Bianco, Gerhard Ransmayr, Markus Waser, Dieter Grossegger, Pengfei Zhang, Georg Dorffner, Stephen Roberts

Abstract: As societies around the world are ageing, the number of Alzheimer's disease (AD) patients is rapidly increasing. To date, no low-cost, non-invasive biomarkers have been established to advance the objectivization of AD diagnosis and progression assessment. Here, we utilize Bayesian neural networks to develop a multivariate predictor for AD severity using a wide range of quantitative EEG (QEEG) mark… ▽ More As societies around the world are ageing, the number of Alzheimer's disease (AD) patients is rapidly increasing. To date, no low-cost, non-invasive biomarkers have been established to advance the objectivization of AD diagnosis and progression assessment. Here, we utilize Bayesian neural networks to develop a multivariate predictor for AD severity using a wide range of quantitative EEG (QEEG) markers. The Bayesian treatment of neural networks both automatically controls model complexity and provides a predictive distribution over the target function, giving uncertainty bounds for our regression task. It is therefore well suited to clinical neuroscience, where data sets are typically sparse and practitioners require a precise assessment of the predictive uncertainty. We use data of one of the largest prospective AD EEG trials ever conducted to demonstrate the potential of Bayesian deep learning in this domain, while comparing two distinct Bayesian neural network approaches, i.e., Monte Carlo dropout and Hamiltonian Monte Carlo. △ Less

Submitted 13 December, 2018; v1 submitted 12 December, 2018; originally announced December 2018.

Comments: Machine Learning for Health (ML4H) Workshop at NeurIPS 2018 arXiv:1811.07216

arXiv:1811.10319 [pdf, other]

On the cost of essentially fair clusterings

Authors: Ioana O. Bercea, Martin Groß, Samir Khuller, Aounon Kumar, Clemens Rösner, Daniel R. Schmidt, Melanie Schmidt

Abstract: Clustering is a fundamental tool in data mining. It partitions points into groups (clusters) and may be used to make decisions for each point based on its group. However, this process may harm protected (minority) classes if the clustering algorithm does not adequately represent them in desirable clusters -- especially if the data is already biased. At NIPS 2017, Chierichetti et al. proposed a m… ▽ More Clustering is a fundamental tool in data mining. It partitions points into groups (clusters) and may be used to make decisions for each point based on its group. However, this process may harm protected (minority) classes if the clustering algorithm does not adequately represent them in desirable clusters -- especially if the data is already biased. At NIPS 2017, Chierichetti et al. proposed a model for fair clustering requiring the representation in each cluster to (approximately) preserve the global fraction of each protected class. Restricting to two protected classes, they developed both a 4-approximation for the fair $k$-center problem and a $O(t)$-approximation for the fair $k$-median problem, where $t$ is a parameter for the fairness model. For multiple protected classes, the best known result is a 14-approximation for fair $k$-center. We extend and improve the known results. Firstly, we give a 5-approximation for the fair $k$-center problem with multiple protected classes. Secondly, we propose a relaxed fairness notion under which we can give bicriteria constant-factor approximations for all of the classical clustering objectives $k$-center, $k$-supplier, $k$-median, $k$-means and facility location. The latter approximations are achieved by a framework that takes an arbitrary existing unfair (integral) solution and a fair (fractional) LP solution and combines them into an essentially fair clustering with a weakly supervised rounding scheme. In this way, a fair clustering can be established belatedly, in a situation where the centers are already fixed. △ Less

Submitted 26 November, 2018; originally announced November 2018.

arXiv:1811.05420 [pdf, other]

ABox Abduction via Forgetting in ALC (Long Version)

Authors: Warren Del-Pinto, Renate A. Schmidt

Abstract: Abductive reasoning generates explanatory hypotheses for new observations using prior knowledge. This paper investigates the use of forgetting, also known as uniform interpolation, to perform ABox abduction in description logic (ALC) ontologies. Non-abducibles are specified by a forgetting signature which can contain concept, but not role, symbols. The resulting hypotheses are semantically minimal… ▽ More Abductive reasoning generates explanatory hypotheses for new observations using prior knowledge. This paper investigates the use of forgetting, also known as uniform interpolation, to perform ABox abduction in description logic (ALC) ontologies. Non-abducibles are specified by a forgetting signature which can contain concept, but not role, symbols. The resulting hypotheses are semantically minimal and each consist of a set of disjuncts. These disjuncts are each independent explanations, and are not redundant with respect to the background ontology or the other disjuncts, representing a form of hypothesis space. The observations and hypotheses handled by the method can contain both atomic or complex ALC concepts, excluding role assertions, and are not restricted to Horn clauses. Two approaches to redundancy elimination are explored for practical use: full and approximate. Using a prototype implementation, experiments were performed over a corpus of real world ontologies to investigate the practicality of both approaches across several settings. △ Less

Submitted 13 November, 2018; originally announced November 2018.

Comments: Long version of a paper accepted for publication in the proceedings of AAAI 2019

arXiv:1807.00211 [pdf, other]

doi 10.1016/j.nima.2018.10.210

Advanced Methods for the Optical Quality Assurance of Silicon Sensors

Authors: E. Lavrik, I. Panasenko, H. R. Schmidt

Abstract: We describe a setup for optical quality assurance of silicon microstrip sensors. Pattern recognition algorithms were developed to analyze microscopic scans of the sensors for defects. It is shown that the software has a recognition and classification rate of $>$~90\% for defects like scratches, shorts, broken metal lines etc. We have demonstrated that advanced image processing based on neural netw… ▽ More We describe a setup for optical quality assurance of silicon microstrip sensors. Pattern recognition algorithms were developed to analyze microscopic scans of the sensors for defects. It is shown that the software has a recognition and classification rate of $>$~90\% for defects like scratches, shorts, broken metal lines etc. We have demonstrated that advanced image processing based on neural network techniques is able to further improve the recognition and defect classification rate. △ Less

Submitted 30 October, 2018; v1 submitted 30 June, 2018; originally announced July 2018.

arXiv:1805.12514 [pdf, other]

Scaling provable adversarial defenses

Authors: Eric Wong, Frank R. Schmidt, Jan Hendrik Metzen, J. Zico Kolter

Abstract: Recent work has developed methods for learning deep network classifiers that are provably robust to norm-bounded adversarial perturbation; however, these methods are currently only possible for relatively small feedforward networks. In this paper, in an effort to scale these approaches to substantially larger models, we extend previous work in three main directions. First, we present a technique f… ▽ More Recent work has developed methods for learning deep network classifiers that are provably robust to norm-bounded adversarial perturbation; however, these methods are currently only possible for relatively small feedforward networks. In this paper, in an effort to scale these approaches to substantially larger models, we extend previous work in three main directions. First, we present a technique for extending these training procedures to much more general networks, with skip connections (such as ResNets) and general nonlinearities; the approach is fully modular, and can be implemented automatically (analogous to automatic differentiation). Second, in the specific case of $\ell_\infty$ adversarial perturbations and networks with ReLU nonlinearities, we adopt a nonlinear random projection for training, which scales linearly in the number of hidden units (previous approaches scaled quadratically). Third, we show how to further improve robust error through cascade models. On both MNIST and CIFAR data sets, we train classifiers that improve substantially on the state of the art in provable robust adversarial error bounds: from 5.8% to 3.1% on MNIST (with $\ell_\infty$ perturbations of $ε=0.1$), and from 80% to 36.4% on CIFAR (with $\ell_\infty$ perturbations of $ε=2/255$). Code for all experiments in the paper is available at https://github.com/locuslab/convex_adversarial/. △ Less

Submitted 21 November, 2018; v1 submitted 31 May, 2018; originally announced May 2018.

arXiv:1711.10824 [pdf, other]

Compression for Smooth Shape Analysis

Authors: V. Estellers, F. R. Schmidt, D. Cremers

Abstract: Most 3D shape analysis methods use triangular meshes to discretize both the shape and functions on it as piecewise linear functions. With this representation, shape analysis requires fine meshes to represent smooth shapes and geometric operators like normals, curvatures, or Laplace-Beltrami eigenfunctions at large computational and memory costs. We avoid this bottleneck with a compression techni… ▽ More Most 3D shape analysis methods use triangular meshes to discretize both the shape and functions on it as piecewise linear functions. With this representation, shape analysis requires fine meshes to represent smooth shapes and geometric operators like normals, curvatures, or Laplace-Beltrami eigenfunctions at large computational and memory costs. We avoid this bottleneck with a compression technique that represents a smooth shape as subdivision surfaces and exploits the subdivision scheme to parametrize smooth functions on that shape with a few control parameters. This compression does not affect the accuracy of the Laplace-Beltrami operator and its eigenfunctions and allow us to compute shape descriptors and shape matchings at an accuracy comparable to triangular meshes but a fraction of the computational cost. Our framework can also compress surfaces represented by point clouds to do shape analysis of 3D scanning data. △ Less

Submitted 29 November, 2017; originally announced November 2017.

arXiv:1707.02753 [pdf, ps, other]

A Local-Search Algorithm for Steiner Forest

Authors: Martin Groß, Anupam Gupta, Amit Kumar, Jannik Matuschke, Daniel R. Schmidt, Melanie Schmidt, José Verschae

Abstract: In the Steiner Forest problem, we are given a graph and a collection of source-sink pairs, and the goal is to find a subgraph of minimum total length such that all pairs are connected. The problem is APX-Hard and can be 2-approximated by, e.g., the elegant primal-dual algorithm of Agrawal, Klein, and Ravi from 1995. We give a local-search-based constant-factor approximation for the problem. Loca… ▽ More In the Steiner Forest problem, we are given a graph and a collection of source-sink pairs, and the goal is to find a subgraph of minimum total length such that all pairs are connected. The problem is APX-Hard and can be 2-approximated by, e.g., the elegant primal-dual algorithm of Agrawal, Klein, and Ravi from 1995. We give a local-search-based constant-factor approximation for the problem. Local search brings in new techniques to an area that has for long not seen any improvements and might be a step towards a combinatorial algorithm for the more general survivable network design problem. Moreover, local search was an essential tool to tackle the dynamic MST/Steiner Tree problem, whereas dynamic Steiner Forest is still wide open. It is easy to see that any constant factor local search algorithm requires steps that add/drop many edges together. We propose natural local moves which, at each step, either (a) add a shortest path in the current graph and then drop a bunch of inessential edges, or (b) add a set of edges to the current solution. This second type of moves is motivated by the potential function we use to measure progress, combining the cost of the solution with a penalty for each connected component. Our carefully-chosen local moves and potential function work in tandem to eliminate bad local minima that arise when using more traditional local moves. △ Less

Submitted 10 July, 2017; originally announced July 2017.

Comments: 46 pages, 22 figures

arXiv:1705.05020 [pdf, other]

Discrete-Continuous ADMM for Transductive Inference in Higher-Order MRFs

Authors: Emanuel Laude, Jan-Hendrik Lange, Jonas Schüpfer, Csaba Domokos, Laura Leal-Taixé, Frank R. Schmidt, Bjoern Andres, Daniel Cremers

Abstract: This paper introduces a novel algorithm for transductive inference in higher-order MRFs, where the unary energies are parameterized by a variable classifier. The considered task is posed as a joint optimization problem in the continuous classifier parameters and the discrete label variables. In contrast to prior approaches such as convex relaxations, we propose an advantageous decoupling of the ob… ▽ More This paper introduces a novel algorithm for transductive inference in higher-order MRFs, where the unary energies are parameterized by a variable classifier. The considered task is posed as a joint optimization problem in the continuous classifier parameters and the discrete label variables. In contrast to prior approaches such as convex relaxations, we propose an advantageous decoupling of the objective function into discrete and continuous subproblems and a novel, efficient optimization method related to ADMM. This approach preserves integrality of the discrete label variables and guarantees global convergence to a critical point. We demonstrate the advantages of our approach in several experiments including video object segmentation on the DAVIS data set and interactive image segmentation. △ Less

Submitted 28 April, 2018; v1 submitted 14 May, 2017; originally announced May 2017.

arXiv:1611.09014 [pdf, ps, other]

Blocking and Other Enhancements for Bottom-Up Model Generation Methods

Authors: Peter Baumgartner, Renate A. Schmidt

Abstract: Model generation is a problem complementary to theorem proving and is important for fault analysis and debugging of formal specifications of security protocols, programs and terminological definitions. This paper discusses several ways of enhancing the paradigm of bottom-up model generation. The two main contributions are new, generalized blocking techniques and a new range-restriction transformat… ▽ More Model generation is a problem complementary to theorem proving and is important for fault analysis and debugging of formal specifications of security protocols, programs and terminological definitions. This paper discusses several ways of enhancing the paradigm of bottom-up model generation. The two main contributions are new, generalized blocking techniques and a new range-restriction transformation. The blocking techniques are based on simple transformations of the input set together with standard equality reasoning and redundancy elimination techniques. These provide general methods for finding small, finite models. The range-restriction transformation refines existing transformations to range-restricted clauses by carefully limiting the creation of domain terms. All possible combinations of the introduced techniques and classical range-restriction were tested on the clausal problems of the TPTP Version 6.0.0 with an implementation based on the SPASS theorem prover using a hyperresolution-like refinement. Unrestricted domain blocking gave best results for satisfiable problems showing it is a powerful technique indispensable for bottom-up model generation methods. Both in combination with the new range-restricting transformation, and the classical range-restricting transformation, good results have been obtained. Limiting the creation of terms during the inference process by using the new range restricting transformation has paid off, especially when using it together with a shifting transformation. The experimental results also show that classical range restriction with unrestricted blocking provides a useful complementary method. Overall, the results showed bottom-up model generation methods were good for disproving theorems and generating models for satisfiable problems, but less efficient than SPASS in auto mode for unsatisfiable problems. △ Less

Submitted 29 November, 2016; v1 submitted 28 November, 2016; originally announced November 2016.

arXiv:1611.05241 [pdf, other]

A Combinatorial Solution to Non-Rigid 3D Shape-to-Image Matching

Authors: Florian Bernard, Frank R. Schmidt, Johan Thunberg, Daniel Cremers

Abstract: We propose a combinatorial solution for the problem of non-rigidly matching a 3D shape to 3D image data. To this end, we model the shape as a triangular mesh and allow each triangle of this mesh to be rigidly transformed to achieve a suitable matching to the image. By penalising the distance and the relative rotation between neighbouring triangles our matching compromises between image and shape i… ▽ More We propose a combinatorial solution for the problem of non-rigidly matching a 3D shape to 3D image data. To this end, we model the shape as a triangular mesh and allow each triangle of this mesh to be rigidly transformed to achieve a suitable matching to the image. By penalising the distance and the relative rotation between neighbouring triangles our matching compromises between image and shape information. In this paper, we resolve two major challenges: Firstly, we address the resulting large and NP-hard combinatorial problem with a suitable graph-theoretic approach. Secondly, we propose an efficient discretisation of the unbounded 6-dimensional Lie group SE(3). To our knowledge this is the first combinatorial formulation for non-rigid 3D shape-to-image matching. In contrast to existing local (gradient descent) optimisation methods, we obtain solutions that do not require a good initialisation and that are within a bound of the optimal solution. We evaluate the proposed method on the two problems of non-rigid 3D shape-to-shape and non-rigid 3D shape-to-image registration and demonstrate that it provides promising results. △ Less

Submitted 18 May, 2017; v1 submitted 16 November, 2016; originally announced November 2016.

Comments: 10 pages, 7 figures

Journal ref: CVPR 2017

arXiv:1606.09540 [pdf, other]

SurfCuit: Surface Mounted Circuits on 3D Prints

Authors: Nobuyuki Umetani, Ryan Schmidt

Abstract: We present, SurfCuit, a novel approach to design and construction of electric circuits on the surface of 3D prints. Our surface mounting technique allows durable construction of circuits on the surface of 3D prints. SurfCuit does not require tedious circuit casing design or expensive set-ups, thus we can expedite the process of circuit construction for 3D models. Our technique allows the user to c… ▽ More We present, SurfCuit, a novel approach to design and construction of electric circuits on the surface of 3D prints. Our surface mounting technique allows durable construction of circuits on the surface of 3D prints. SurfCuit does not require tedious circuit casing design or expensive set-ups, thus we can expedite the process of circuit construction for 3D models. Our technique allows the user to construct complex circuits for consumer-level desktop fused decomposition modeling (FDM) 3D printers. The key idea behind our technique is that FDM plastic forms a strong bond with metal when it is melted. This observation enables construction of a robust circuit traces using copper tape and soldering. We also present an interactive tool to design such circuits on arbitrary 3D geometry. We demonstrate the effectiveness of our approach through various actual construction examples. △ Less

Submitted 30 June, 2016; originally announced June 2016.

arXiv:1605.01760 [pdf, other]

Adaptive Mesh Booleans

Authors: Ryan Schmidt, Tyson Brochu

Abstract: We present a new method for performing Boolean operations on volumes represented as triangle meshes. In contrast to existing methods which treat meshes as 3D polyhedra and try to partition the faces at their exact intersection curves, we treat meshes as adaptive surfaces which can be arbitrarily refined. Rather than depending on computing precise face intersections, our approach refines the input… ▽ More We present a new method for performing Boolean operations on volumes represented as triangle meshes. In contrast to existing methods which treat meshes as 3D polyhedra and try to partition the faces at their exact intersection curves, we treat meshes as adaptive surfaces which can be arbitrarily refined. Rather than depending on computing precise face intersections, our approach refines the input meshes in the intersection regions, then discards intersecting triangles and fills the resulting holes with high-quality triangles. The original intersection curves are approximated to a user-definable precision, and our method can identify and preserve creases and sharp features. Advantages of our approach include the ability to trade speed for accuracy, support for open meshes, and the ability to incorporate tolerances to handle cases where large numbers of faces are slightly inter-penetrating or near-coincident. △ Less

Submitted 5 May, 2016; originally announced May 2016.

MSC Class: 68U05; 68U07

arXiv:1604.08058 [pdf, ps, other]

Lifting QBF Resolution Calculi to DQBF

Authors: Olaf Beyersdorff, Leroy Chew, Renate Schmidt, Martin Suda

Abstract: We examine the existing Resolution systems for quantified Boolean formulas (QBF) and answer the question which of these calculi can be lifted to the more powerful Dependency QBFs (DQBF). An interesting picture emerges: While for QBF we have the strict chain of proof systems Q-Resolution < IR-calc < IRM-calc, the situation is quite different in DQBF. Q-Resolution and likewise universal Resolution a… ▽ More We examine the existing Resolution systems for quantified Boolean formulas (QBF) and answer the question which of these calculi can be lifted to the more powerful Dependency QBFs (DQBF). An interesting picture emerges: While for QBF we have the strict chain of proof systems Q-Resolution < IR-calc < IRM-calc, the situation is quite different in DQBF. Q-Resolution and likewise universal Resolution are too weak: they are not complete. IR-calc has the right strength: it is sound and complete. IRM-calc is too strong: it is not sound any more, and the same applies to long-distance Resolution. Conceptually, we use the relation of DQBF to EPR and explain our new DQBF calculus based on IR-calc as a subsystem of FO-Resolution. △ Less

Submitted 27 April, 2016; originally announced April 2016.

Comments: SAT 2016

arXiv:1601.06070 [pdf, other]

Efficient Globally Optimal 2D-to-3D Deformable Shape Matching

Authors: Zorah Lähner, Emanuele Rodolà, Frank R. Schmidt, Michael M. Bronstein, Daniel Cremers

Abstract: We propose the first algorithm for non-rigid 2D-to-3D shape matching, where the input is a 2D shape represented as a planar curve and a 3D shape represented as a surface; the output is a continuous curve on the surface. We cast the problem as finding the shortest circular path on the product 3-manifold of the surface and the curve. We prove that the optimal matching can be computed in polynomial t… ▽ More We propose the first algorithm for non-rigid 2D-to-3D shape matching, where the input is a 2D shape represented as a planar curve and a 3D shape represented as a surface; the output is a continuous curve on the surface. We cast the problem as finding the shortest circular path on the product 3-manifold of the surface and the curve. We prove that the optimal matching can be computed in polynomial time with a (worst-case) complexity of $O(mn^2\log(n))$, where $m$ and $n$ denote the number of vertices on the template curve and the 3D shape respectively. We also demonstrate that in practice the runtime is essentially linear in $m\!\cdot\! n$ making it an efficient method for shape analysis and shape retrieval. Quantitative evaluation confirms that the method provides excellent results for sketch-based deformable 3D shape retrieval. △ Less

Submitted 21 January, 2022; v1 submitted 22 January, 2016; originally announced January 2016.

Comments: Extended chapter of conference paper in IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2016 to be published in Shape Analysis: Euclidean, Discrete and Algebraic Geometric Methods by Springer

arXiv:1502.04265 [pdf, other]

Solving $k$-means on High-dimensional Big Data

Authors: Jan-Philipp W. Kappmeier, Daniel R. Schmidt, Melanie Schmidt

Abstract: In recent years, there have been major efforts to develop data stream algorithms that process inputs in one pass over the data with little memory requirement. For the $k$-means problem, this has led to the development of several $(1+\varepsilon)$-approximations (under the assumption that $k$ is a constant), but also to the design of algorithms that are extremely fast in practice and compute soluti… ▽ More In recent years, there have been major efforts to develop data stream algorithms that process inputs in one pass over the data with little memory requirement. For the $k$-means problem, this has led to the development of several $(1+\varepsilon)$-approximations (under the assumption that $k$ is a constant), but also to the design of algorithms that are extremely fast in practice and compute solutions of high accuracy. However, when not only the length of the stream is high but also the dimensionality of the input points, then current methods reach their limits. We propose two algorithms, piecy and piecy-mr that are based on the recently developed data stream algorithm BICO that can process high dimensional data in one pass and output a solution of high quality. While piecy is suited for high dimensional data with a medium number of points, piecy-mr is meant for high dimensional data that comes in a very long stream. We provide an extensive experimental study to evaluate piecy and piecy-mr that shows the strength of the new algorithms. △ Less

Submitted 28 May, 2015; v1 submitted 14 February, 2015; originally announced February 2015.

Comments: 23 pages, 9 figures, published at the 14th International Symposium on Experimental Algorithms - SEA 2015

arXiv:1502.02852 [pdf, other]

Transaction Level Analysis for a Clustered and Hardware-Enhanced Task Manager on Homogeneous Many-Core Systems

Authors: Daniel Gregorek, Robert Schmidt, Alberto Garcia-Ortiz

Abstract: The increasing parallelism of many-core systems demands for efficient strategies for the run-time system management. Due to the large number of cores the management overhead has a rising impact to the overall system performance. This work analyzes a clustered infrastructure of dedicated hardware nodes to manage a homogeneous many-core system. The hardware nodes implement a message passing protocol… ▽ More The increasing parallelism of many-core systems demands for efficient strategies for the run-time system management. Due to the large number of cores the management overhead has a rising impact to the overall system performance. This work analyzes a clustered infrastructure of dedicated hardware nodes to manage a homogeneous many-core system. The hardware nodes implement a message passing protocol and perform the task map** and synchronization at run-time. To make meaningful map** decisions, the global management nodes employ a workload status communication mechanism. This paper discusses the design-space of the dedicated infrastructure by means of task map** use-cases and a parallel benchmark including application-interference. We evaluate the architecture in terms of application speedup and analyze the mechanism for the status communication. A comparison versus centralized and fully-distributed configurations demonstrates the reduction of the computation and communication management overhead for our approach. △ Less

Submitted 10 February, 2015; originally announced February 2015.

Comments: Presented at HIP3ES, 2015 (arXiv: 1501.03064)

Report number: HIP3ES/2015/03

arXiv:1501.03064

Proceedings of the Workshop on High Performance Energy Efficient Embedded Systems (HIP3ES) 2015

Authors: Francisco Corbera, Andrés Rodríguez, Rafael Asenjo, Angeles Navarro, Antonio Vilches, Maria Garzaran, Ismat Chaib Draa, Jamel Tayeb, Smail Niar, Mikael Desertot, Daniel Gregorek, Robert Schmidt, Alberto Garcia-Ortiz, Pedro Lopez-Garcia, Rémy Haemmerlé, Maximiliano Klemen, Umer Liqat, Manuel V. Hermenegildo, Radim Vavřík, Albert Saà-Garriga, David Castells-Rufas, Jordi Carrabina

Abstract: Proceedings of the Workshop on High Performance Energy Efficient Embedded Systems (HIP3ES) 2015. Amsterdam, January 21st. Collocated with HIPEAC 2015 Conference. Proceedings of the Workshop on High Performance Energy Efficient Embedded Systems (HIP3ES) 2015. Amsterdam, January 21st. Collocated with HIPEAC 2015 Conference. △ Less

Submitted 13 January, 2015; originally announced January 2015.

arXiv:1408.0391 [pdf, other]

doi 10.1145/2834050.2834102

RiPKI: The Tragic Story of RPKI Deployment in the Web Ecosystem

Authors: Matthias Wählisch, Robert Schmidt, Thomas C. Schmidt, Olaf Maennel, Steve Uhlig, Gareth Tyson

Abstract: Web content delivery is one of the most important services on the Internet. Access to websites is typically secured via TLS. However, this security model does not account for prefix hijacking on the network layer, which may lead to traffic blackholing or transparent interception. Thus, to achieve comprehensive security and service availability, additional protective mechanisms are necessary such a… ▽ More Web content delivery is one of the most important services on the Internet. Access to websites is typically secured via TLS. However, this security model does not account for prefix hijacking on the network layer, which may lead to traffic blackholing or transparent interception. Thus, to achieve comprehensive security and service availability, additional protective mechanisms are necessary such as the RPKI, a recently deployed Resource Public Key Infrastructure to prevent hijacking of traffic by networks. This paper argues two positions. First, that modern web hosting practices make route protection challenging due to the propensity to spread servers across many different networks, often with unpredictable client redirection strategies, and, second, that we need a better understanding why protection mechanisms are not deployed. To initiate this, we empirically explore the relationship between web hosting infrastructure and RPKI deployment. Perversely, we find that less popular websites are more likely to be secured than the prominent sites. Worryingly, we find many large-scale CDNs do not support RPKI, thus making their customers vulnerable. This leads us to explore business reasons why operators are hesitant to deploy RPKI, which may help to guide future research on improving Internet security. △ Less

Submitted 2 November, 2015; v1 submitted 2 August, 2014; originally announced August 2014.

Comments: Previous arXiv version of this paper has been published under the title "When BGP Security Meets Content Deployment: Measuring and Analysing RPKI-Protection of Websites", Proc. of Fourteenth ACM Workshop on Hot Topics in Networks (HotNets), New York:ACM, 2015

ACM Class: C.2.2, C.2.5

arXiv:1311.2102 [pdf, other]

An Experimental Comparison of Trust Region and Level Sets

Authors: Lena Gorelick, Ismail BenAyed, Frank R. Schmidt, Yuri Boykov

Abstract: High-order (non-linear) functionals have become very popular in segmentation, stereo and other computer vision problems. Level sets is a well established general gradient descent framework, which is directly applicable to optimization of such functionals and widely used in practice. Recently, another general optimization approach based on trust region methodology was proposed for regional non-line… ▽ More High-order (non-linear) functionals have become very popular in segmentation, stereo and other computer vision problems. Level sets is a well established general gradient descent framework, which is directly applicable to optimization of such functionals and widely used in practice. Recently, another general optimization approach based on trust region methodology was proposed for regional non-linear functionals. Our goal is a comprehensive experimental comparison of these two frameworks in regard to practical efficiency, robustness to parameters, and optimality. We experiment on a wide range of problems with non-linear constraints on segment volume, appearance and shape. △ Less

Submitted 8 November, 2013; originally announced November 2013.

Comments: 8 pages, 6 figures

arXiv:1305.3131 [pdf, ps, other]

Refinement in the Tableau Synthesis Framework

Authors: Dmitry Tishkovsky, Renate A. Schmidt

Abstract: This paper is concerned with the possibilities of refining and improving calculi generated in the tableau synthesis framework. A general method in the tableau synthesis framework allows to reduce the branching factor of tableau rules and preserves completeness if a general rule refinement condition holds. In this paper we consider two approaches to satisfy this general rule refinement condition. This paper is concerned with the possibilities of refining and improving calculi generated in the tableau synthesis framework. A general method in the tableau synthesis framework allows to reduce the branching factor of tableau rules and preserves completeness if a general rule refinement condition holds. In this paper we consider two approaches to satisfy this general rule refinement condition. △ Less

Submitted 14 May, 2013; originally announced May 2013.

Comments: 19 pages

arXiv:1208.1476 [pdf, ps, other]

doi 10.1145/2559947

Using Tableau to Decide Description Logics with Full Role Negation and Identity

Authors: Renate A. Schmidt, Dmitry Tishkovsky

Abstract: This paper presents a tableau approach for deciding expressive description logics with full role negation and role identity. We consider the description logic ALBOid, which is the extension of ALC with the Boolean role operators, inverse of roles, the identity role, and includes full support for individuals and singleton concepts. ALBOid is expressively equivalent to the two-variable fragment of f… ▽ More This paper presents a tableau approach for deciding expressive description logics with full role negation and role identity. We consider the description logic ALBOid, which is the extension of ALC with the Boolean role operators, inverse of roles, the identity role, and includes full support for individuals and singleton concepts. ALBOid is expressively equivalent to the two-variable fragment of first-order logic with equality and subsumes Boolean modal logic. In this paper we define a sound and complete tableau calculus for the ALBOid that provides a basis for decision procedures for this logic and all its sublogics. An important novelty of our approach is the use of a generic unrestricted blocking mechanism. Being based on a conceptually simple rule, unrestricted blocking performs case distinctions over whether two individuals are equal or not and equality reasoning to find finite models. The blocking mechanism ties the proof of termination of tableau derivations to the finite model property of ALBOid. △ Less

Submitted 28 November, 2016; v1 submitted 7 August, 2012; originally announced August 2012.

Journal ref: ACM Transactions on Computational Logic 15 (1), Article 7 (February 2014)

arXiv:1104.4131 [pdf, ps, other]

doi 10.2168/LMCS-7(2:6)2011

Automated Synthesis of Tableau Calculi

Authors: Renate A. Schmidt, Dmitry Tishkovsky

Abstract: This paper presents a method for synthesising sound and complete tableau calculi. Given a specification of the formal semantics of a logic, the method generates a set of tableau inference rules that can then be used to reason within the logic. The method guarantees that the generated rules form a calculus which is sound and constructively complete. If the logic can be shown to admit finite filtra… ▽ More This paper presents a method for synthesising sound and complete tableau calculi. Given a specification of the formal semantics of a logic, the method generates a set of tableau inference rules that can then be used to reason within the logic. The method guarantees that the generated rules form a calculus which is sound and constructively complete. If the logic can be shown to admit finite filtration with respect to a well-defined first-order semantics then adding a general blocking mechanism provides a terminating tableau calculus. The process of generating tableau rules can be completely automated and produces, together with the blocking mechanism, an automated procedure for generating tableau decision procedures. For illustration we show the workability of the approach for a description logic with transitive roles and propositional intuitionistic logic. △ Less

Submitted 23 April, 2012; v1 submitted 20 April, 2011; originally announced April 2011.

Comments: 32 pages

ACM Class: cs.SC

Journal ref: Logical Methods in Computer Science, Volume 7, Issue 2 (May 7, 2011) lmcs:970

Showing 1–48 of 48 results for author: Schmidt, R