Planning from Pixels in Environments with Combinatorially Hard Search Spaces

Authors: Marco Bagatella, Mirek Olšák, Michal Rolínek, Georg Martius

Abstract: The ability to form complex plans based on raw visual input is a litmus test for current capabilities of artificial intelligence, as it requires a seamless combination of visual processing and abstract algorithmic execution, two traditionally separate areas of computer science. A recent surge of interest in this field brought advances that yield good performance in tasks ranging from arcade games to continuous control; these methods, however, do not come without significant issues, such as limited generalization capabilities and difficulties when dealing with combinatorially hard planning instances. Our contribution is two-fold: (i) we present a method that learns to represent its environment as a latent graph and leverages state reidentification to reduce the complexity of finding a good policy from exponential to linear; (ii) we introduce a set of lightweight environments with an underlying discrete combinatorial structure in which planning is challenging even for humans. Moreover, we show that our method achieves strong empirical generalization to variations in the environment, even across highly disadvantaged regimes, such as "one-shot" planning, or in an offline RL paradigm which only provides low-quality trajectories.

Submitted 18 March, 2022; v1 submitted 12 October, 2021; originally announced October 2021.

arXiv:2108.10606 [pdf, other]

Making Higher Order MOT Scalable: An Efficient Approximate Solver for Lifted Disjoint Paths

Authors: Andrea Hornakova, Timo Kaiser, Paul Swoboda, Michal Rolinek, Bodo Rosenhahn, Roberto Henschel

Abstract: We present an efficient approximate message passing solver for the lifted disjoint paths problem (LDP), a natural but NP-hard model for multiple object tracking (MOT). Our tracker scales to very large instances that come from long and crowded MOT sequences. Our approximate solver enables us to process the MOT15/16/17 benchmarks without sacrificing solution quality and allows for solving MOT20, which has been out of reach up to now for LDP solvers due to its size and complexity. On all four of these standard MOT benchmarks we achieve performance comparable to or better than current state-of-the-art methods, including a tracker based on an optimal LDP solver.

Submitted 24 August, 2021; originally announced August 2021.

Comments: ICCV 2021. Short version published at CVPR 2021 RVSU workshop https://omnomnom.vision.rwth-aachen.de/data/RobMOTS/workshop/papers/9/CameraReady/paper_V3.pdf . Implementation available at https://github.com/LPMP/LPMP and https://github.com/TimoK93/ApLift

arXiv:2105.02343 [pdf, other]

CombOptNet: Fit the Right NP-Hard Problem by Learning Integer Programming Constraints

Authors: Anselm Paulus, Michal Rolínek, Vít Musil, Brandon Amos, Georg Martius

Abstract: Bridging logical and algorithmic reasoning with modern machine learning techniques is a fundamental challenge with potentially transformative impact. On the algorithmic side, many NP-hard problems can be expressed as integer programs, in which the constraints play the role of their "combinatorial specification." In this work, we aim to integrate integer programming solvers into neural network architectures as layers capable of learning both the cost terms and the constraints. The resulting end-to-end trainable architectures jointly extract features from raw data and solve a suitable (learned) combinatorial problem with state-of-the-art integer programming solvers. We demonstrate the potential of such layers with an extensive performance analysis on synthetic data and with a demonstration on a competitive computer vision keypoint matching benchmark.

Submitted 11 April, 2022; v1 submitted 5 May, 2021; originally announced May 2021.

Comments: ICML 2021 conference paper

arXiv:2102.07456 [pdf, other]

Neuro-algorithmic Policies enable Fast Combinatorial Generalization

Authors: Marin Vlastelica, Michal Rolínek, Georg Martius

Abstract: Although model-based and model-free approaches to learning the control of systems have achieved impressive results on standard benchmarks, generalization to task variations is still lacking. Recent results suggest that generalization for standard architectures improves only after obtaining exhaustive amounts of data. We give evidence that generalization capabilities are in many cases bottlenecked by the inability to generalize on the combinatorial aspects of the problem. Furthermore, we show that for a certain subclass of the MDP framework, this can be alleviated by neuro-algorithmic architectures. Many control problems require long-term planning that is hard to solve generically with neural networks alone. We introduce a neuro-algorithmic policy architecture consisting of a neural network and an embedded time-dependent shortest path solver. These policies can be trained end-to-end by blackbox differentiation. We show that this type of architecture generalizes well to unseen variations in the environment already after seeing a few examples.

Submitted 15 February, 2021; originally announced February 2021.

Comments: 15 pages

arXiv:2102.06822 [pdf, other]

Demystifying Inductive Biases for $β$-VAE Based Architectures

Authors: Dominik Zietlow, Michal Rolinek, Georg Martius

Abstract: The performance of $β$-Variational-Autoencoders ($β$-VAEs) and their variants on learning semantically meaningful, disentangled representations is unparalleled. On the other hand, there are theoretical arguments suggesting the impossibility of unsupervised disentanglement. In this work, we shed light on the inductive bias responsible for the success of VAE-based architectures. We show that in classical datasets the structure of variance, induced by the generating factors, is conveniently aligned with the latent directions fostered by the VAE objective. This builds the pivotal bias on which the disentangling abilities of VAEs rely. By small, elaborate perturbations of existing datasets, we hide the convenient correlation structure that is easily exploited by a variety of architectures. To demonstrate this, we construct modified versions of standard datasets in which (i) the generative factors are perfectly preserved; (ii) each image undergoes a mild transformation causing a small change of variance; (iii) the leading VAE-based disentanglement architectures fail to produce disentangled representations whilst the performance of a non-variational method remains unchanged. The construction of our modifications is nontrivial and relies on recent progress on mechanistic understanding of $β$-VAEs and their connection to PCA. We strengthen that connection by providing additional insights that are of stand-alone interest.

Submitted 12 February, 2021; originally announced February 2021.

arXiv:2008.06389 [pdf, other]

Sample-efficient Cross-Entropy Method for Real-time Planning

Authors: Cristina Pinneri, Shambhuraj Sawant, Sebastian Blaes, Jan Achterhold, Joerg Stueckler, Michal Rolinek, Georg Martius

Abstract: Trajectory optimizers for model-based reinforcement learning, such as the Cross-Entropy Method (CEM), can yield compelling results even in high-dimensional control tasks and sparse-reward environments. However, their sampling inefficiency prevents them from being used for real-time planning and control. We propose an improved version of the CEM algorithm for fast planning, with novel additions including temporally-correlated actions and memory, requiring 2.7-22x fewer samples and yielding a performance increase of 1.2-10x in high-dimensional control problems.

Submitted 14 August, 2020; originally announced August 2020.

arXiv:2003.11657 [pdf, other]

Deep Graph Matching via Blackbox Differentiation of Combinatorial Solvers

Authors: Michal Rolínek, Paul Swoboda, Dominik Zietlow, Anselm Paulus, Vít Musil, Georg Martius

Abstract: Building on recent progress at the intersection of combinatorial optimization and deep learning, we propose an end-to-end trainable architecture for deep graph matching that contains unmodified combinatorial solvers. Using the presence of heavily optimized combinatorial solvers together with some improvements in architecture design, we advance state-of-the-art on deep graph matching benchmarks for keypoint correspondence. In addition, we highlight the conceptual advantages of incorporating solvers into deep learning architectures, such as the possibility of post-processing with a strong multi-graph matching solver or the indifference to changes in the training setting. Finally, we propose two new challenging experimental setups. The code is available at https://github.com/martius-lab/blackbox-deep-graph-matching

Submitted 5 August, 2020; v1 submitted 25 March, 2020; originally announced March 2020.

Comments: ECCV 2020 conference paper

arXiv:1912.03500 [pdf, other]

Optimizing Rank-based Metrics with Blackbox Differentiation

Authors: Michal Rolínek, Vít Musil, Anselm Paulus, Marin Vlastelica, Claudio Michaelis, Georg Martius

Abstract: Rank-based metrics are some of the most widely used criteria for performance evaluation of computer vision models. Despite years of effort, direct optimization for these metrics remains a challenge due to their non-differentiable and non-decomposable nature. We present an efficient, theoretically sound, and general method for differentiating rank-based metrics with mini-batch gradient descent. In addition, we address optimization instability and sparsity of the supervision signal that both arise from using rank-based metrics as optimization targets. Resulting losses based on recall and Average Precision are applied to image retrieval and object detection tasks. We obtain performance that is competitive with state-of-the-art on standard image retrieval datasets and consistently improve performance of near state-of-the-art object detectors. The code is available at https://github.com/martius-lab/blackbox-backprop

Submitted 18 March, 2020; v1 submitted 7 December, 2019; originally announced December 2019.

Comments: CVPR 2020 conference paper (oral). The first two authors contributed equally

arXiv:1912.02175 [pdf, other]

Differentiation of Blackbox Combinatorial Solvers

Authors: Marin Vlastelica, Anselm Paulus, Vít Musil, Georg Martius, Michal Rolínek

Abstract: Achieving fusion of deep learning with combinatorial algorithms promises transformative changes to artificial intelligence. One possible approach is to introduce combinatorial building blocks into neural networks. Such end-to-end architectures have the potential to tackle combinatorial problems on raw input data such as ensuring global consistency in multi-object tracking or route planning on maps in robotics. In this work, we present a method that implements an efficient backward pass through blackbox implementations of combinatorial solvers with linear objective functions. We provide both theoretical and experimental backing. In particular, we incorporate the Gurobi MIP solver, Blossom V algorithm, and Dijkstra's algorithm into architectures that extract suitable features from raw inputs for the traveling salesman problem, the min-cost perfect matching problem and the shortest path problem. The code is available at https://github.com/martius-lab/blackbox-backprop.

Submitted 16 February, 2020; v1 submitted 4 December, 2019; originally announced December 2019.

Comments: ICLR 2020 conference paper (spotlight). The first two authors contributed equally

arXiv:1812.06775 [pdf, other]

Variational Autoencoders Pursue PCA Directions (by Accident)

Authors: Michal Rolinek, Dominik Zietlow, Georg Martius

Abstract: The Variational Autoencoder (VAE) is a powerful architecture capable of representation learning and generative modeling. When it comes to learning interpretable (disentangled) representations, VAE and its variants show unparalleled performance. However, the reasons for this are unclear, since a very particular alignment of the latent embedding is needed but the design of the VAE does not encourage it in any explicit way. We address this matter and offer the following explanation: the diagonal approximation in the encoder together with the inherent stochasticity force local orthogonality of the decoder. The local behavior of promoting both reconstruction and orthogonality matches closely how the PCA embedding is chosen. Alongside providing an intuitive understanding, we justify the statement with full theoretical analysis as well as with experiments.

Submitted 16 April, 2019; v1 submitted 17 December, 2018; originally announced December 2018.

arXiv:1802.05074 [pdf, other]

L4: Practical loss-based stepsize adaptation for deep learning

Authors: Michal Rolinek, Georg Martius

Abstract: We propose a stepsize adaptation scheme for stochastic gradient descent. It operates directly with the loss function and rescales the gradient in order to make fixed predicted progress on the loss. We demonstrate its capabilities by conclusively improving the performance of Adam and Momentum optimizers. The enhanced optimizers with default hyperparameters consistently outperform their constant stepsize counterparts, even the best ones, without a measurable increase in computational cost. The performance is validated on multiple architectures including dense nets, CNNs, ResNets, and the recurrent Differential Neural Computer on classical datasets MNIST, fashion MNIST, CIFAR10 and others.

Submitted 30 November, 2018; v1 submitted 14 February, 2018; originally announced February 2018.

Comments: NeurIPS, 2018

arXiv:1604.08269 [pdf, ps, other]

Efficient Optimization for Rank-based Loss Functions

Authors: Pritish Mohapatra, Michal Rolinek, C. V. Jawahar, Vladimir Kolmogorov, M. Pawan Kumar

Abstract: The accuracy of information retrieval systems is often measured using complex loss functions such as the average precision (AP) or the normalized discounted cumulative gain (NDCG). Given a set of positive and negative samples, the parameters of a retrieval system can be estimated by minimizing these loss functions. However, the non-differentiability and non-decomposability of these loss functions do not allow for simple gradient based optimization algorithms. This issue is generally circumvented by either optimizing a structured hinge-loss upper bound to the loss function or by using asymptotic methods like the direct-loss minimization framework. Yet, the high computational complexity of loss-augmented inference, which is necessary for both frameworks, prohibits its use in large training data sets. To alleviate this deficiency, we present a novel quicksort flavored algorithm for a large class of non-decomposable loss functions. We provide a complete characterization of the loss functions that are amenable to our algorithm, and show that it includes both AP and NDCG based loss functions. Furthermore, we prove that no comparison based algorithm can improve upon the computational complexity of our approach asymptotically. We demonstrate the effectiveness of our approach in the context of optimizing the structured hinge loss upper bound of AP and NDCG loss for learning models for a variety of vision tasks. We show that our approach provides significantly better results than simpler decomposable loss functions, while requiring a comparable training time.

Submitted 28 February, 2018; v1 submitted 27 April, 2016; originally announced April 2016.

Comments: 15 pages, 2 figures

arXiv:1602.03124 [pdf, other]

Even Delta-Matroids and the Complexity of Planar Boolean CSPs

Authors: Alexandr Kazda, Vladimir Kolmogorov, Michal Rolínek

Abstract: The main result of this paper is a generalization of the classical blossom algorithm for finding perfect matchings. Our algorithm can efficiently solve Boolean CSPs where each variable appears in exactly two constraints (we call it edge CSP) and all constraints are even $Δ$-matroid relations (represented by lists of tuples). As a consequence of this, we settle the complexity classification of planar Boolean CSPs started by Dvorak and Kupec. Using a reduction to even $Δ$-matroids, we then extend the tractability result to larger classes of $Δ$-matroids that we call efficiently coverable. It properly includes classes that were known to be tractable before, namely co-independent, compact, local, linear and binary, with the following caveat: we represent $Δ$-matroids by lists of tuples, while the last two use a representation by matrices. Since an $n\times n$ matrix can represent exponentially many tuples, our tractability result is not strictly stronger than the known algorithm for linear and binary $Δ$-matroids.

Submitted 14 June, 2018; v1 submitted 9 February, 2016; originally announced February 2016.

Comments: 33 pages, 9 figures

MSC Class: 68Q25

ACM Class: F.2.2; G.2.2

arXiv:1504.07067 [pdf, ps, other]

Effectiveness of Structural Restrictions for Hybrid CSPs

Authors: Vladimir Kolmogorov, Michal Rolinek, Rustem Takhanov

Abstract: Constraint Satisfaction Problem (CSP) is a fundamental algorithmic problem that appears in many areas of Computer Science. It can be equivalently stated as computing a homomorphism $\mathbf{R} \rightarrow \mathbf{\Gamma}$ between two relational structures, e.g. between two directed graphs. Analyzing its complexity has been a prominent research direction, especially for fixed template CSPs in which the right side $\mathbf{\Gamma}$ is fixed and the left side $\mathbf{R}$ is unconstrained. Far fewer results are known for the hybrid setting that restricts both sides simultaneously. It assumes that $\mathbf{R}$ belongs to a certain class of relational structures (called a structural restriction in this paper). We study which structural restrictions are effective, i.e. there exists a fixed template $\mathbf{\Gamma}$ (from a certain class of languages) for which the problem is tractable when $\mathbf{R}$ is restricted, and NP-hard otherwise. We provide a characterization for structural restrictions that are closed under inverse homomorphisms. The criterion is based on the chromatic number of a relational structure defined in this paper; it generalizes the standard chromatic number of a graph. As our main tool, we use the algebraic machinery developed for fixed template CSPs. To apply it to our case, we introduce a new construction called a "lifted language."
We also give a characterization for structural restrictions corresponding to minor-closed families of graphs, extend results to certain Valued CSPs (namely conservative valued languages), and state implications for CSPs with ordered variables, (valued) CSPs on structures with large girth, and for the maximum weight independent set problem on some restricted families of graphs including graphs with large girth.

Submitted 24 October, 2015; v1 submitted 27 April, 2015; originally announced April 2015.

Comments: 20 pages

arXiv:1502.07770 [pdf, other]

Total variation on a tree

Authors: Vladimir Kolmogorov, Thomas Pock, Michal Rolinek

Abstract: We consider the problem of minimizing the continuous valued total variation subject to different unary terms on trees and propose fast direct algorithms based on dynamic programming to solve these problems. We treat both the convex and the non-convex case and derive worst case complexities that are equal to or better than those of existing methods. We show applications to total variation based 2D image processing and computer vision problems based on a Lagrangian decomposition approach. The resulting algorithms are very efficient, offer a high degree of parallelism and come with memory requirements that are only on the order of the number of image pixels.

Submitted 25 April, 2016; v1 submitted 26 February, 2015; originally announced February 2015.

Comments: accepted to SIAM Journal on Imaging Sciences (SIIMS)

arXiv:1502.07327 [pdf, ps, other]

The Complexity of General-Valued CSPs

Authors: Vladimir Kolmogorov, Andrei Krokhin, Michal Rolinek

Abstract: An instance of the Valued Constraint Satisfaction Problem (VCSP) is given by a finite set of variables, a finite domain of labels, and a sum of functions, each function depending on a subset of the variables. Each function can take finite values specifying costs of assignments of labels to its variables or the infinite value, which indicates infeasible assignments. The goal is to find an assignment of labels to the variables that minimizes the sum. We study (assuming that P $\ne$ NP) how the complexity of this very general problem depends on the set of functions allowed in the instances, the so-called constraint language. The case when all allowed functions take values in $\{0,\infty\}$ corresponds to ordinary CSPs, where one deals only with the feasibility issue and there is no optimization. This case is the subject of the Algebraic CSP Dichotomy Conjecture predicting for which constraint languages CSPs are tractable and for which NP-hard. The case when all allowed functions take only finite values corresponds to finite-valued CSP, where the feasibility aspect is trivial and one deals only with the optimization issue. The complexity of finite-valued CSPs was fully classified by Thapper and Živný. An algebraic necessary condition for tractability of a general-valued CSP with a fixed constraint language was recently given by Kozik and Ochremiak.
As our main result, we prove that if a constraint language satisfies this algebraic necessary condition, and the feasibility CSP corresponding to the VCSP with this language is tractable, then the VCSP is tractable. The algorithm is a simple combination of the assumed algorithm for the feasibility CSP and the standard LP relaxation. As a corollary, we obtain that a dichotomy for ordinary CSPs would imply a dichotomy for general-valued CSPs.

Submitted 13 February, 2017; v1 submitted 25 February, 2015; originally announced February 2015.

Comments: accepted to SIAM Journal on Computing (SICOMP). An extended abstract of this work (without proofs) has appeared in FOCS 2015

arXiv:1405.7828 [pdf, other]

Superconcentrators of Density 25.3

Authors: Vladimir Kolmogorov, Michal Rolinek

Abstract: An $N$-superconcentrator is a directed, acyclic graph with $N$ input nodes and $N$ output nodes such that every subset of the inputs and every subset of the outputs of the same cardinality can be connected by node-disjoint paths. It is known that linear-size and bounded-degree superconcentrators exist. We prove the existence of such superconcentrators with asymptotic density $25.3$ (where the density is the number of edges divided by $N$). The previously best known densities were $28$ \cite{Scho2006} and $27.4136$ \cite{YuanK12}.

Submitted 4 May, 2016; v1 submitted 30 May, 2014; originally announced May 2014.

Comments: (to appear in Ars Combinatorica)

Showing 1–17 of 17 results for author: Rolinek, M