-
Utilizing Grounded SAM for self-supervised frugal camouflaged human detection
Authors:
Matthias Pijarowski,
Alexander Wolpert,
Martin Heckmann,
Michael Teutsch
Abstract:
Visually detecting camouflaged objects is a hard problem for both humans and computer vision algorithms. Strong similarities between object and background appearance make the task significantly more challenging than traditional object detection or segmentation tasks. Current state-of-the-art models use either convolutional neural networks or vision transformers as feature extractors. They are trained in a fully supervised manner and thus need a large amount of labeled training data. In this paper, both self-supervised and frugal learning methods are introduced to the task of Camouflaged Object Detection (COD). The overall goal is to fine-tune two COD reference methods, namely SINet-V2 and HitNet, which are pre-trained for camouflaged animal detection, for the task of camouflaged human detection. For this purpose, we use the public dataset CPD1K, which contains camouflaged humans in a forest environment. We create a strong baseline using supervised frugal transfer learning for the fine-tuning task. Then, we analyze three pseudo-labeling approaches to perform the fine-tuning task in a self-supervised manner. Our experiments show that pure self-supervision achieves performance similar to that of fully supervised frugal learning.
Submitted 9 June, 2024;
originally announced June 2024.
-
An AlphaZero-Inspired Approach to Solving Search Problems
Authors:
Evgeny Dantsin,
Vladik Kreinovich,
Alexander Wolpert
Abstract:
AlphaZero and its extension MuZero are computer programs that use machine-learning techniques to play at a superhuman level in chess, Go, and a few other games. They achieved this level of play solely with reinforcement learning from self-play, without any domain knowledge except the game rules. It is a natural idea to adapt the methods and techniques used in AlphaZero for solving search problems such as the Boolean satisfiability problem (in its search version). Given a search problem, how should it be represented for an AlphaZero-inspired solver? What are the "rules of solving" for this search problem? We describe possible representations in terms of easy-instance solvers and self-reductions, and we give examples of such representations for the satisfiability problem. We also describe a version of Monte Carlo tree search adapted for search problems.
Submitted 2 July, 2022;
originally announced July 2022.
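Note on the entry above: the abstract mentions self-reductions for the satisfiability problem without spelling one out. The following is a minimal sketch of the generic textbook self-reduction (search reduced to decision by fixing one variable at a time), not the representation used in the paper. The function names, the signed-integer clause encoding, and the brute-force decision oracle are all assumptions made for illustration.

```python
from itertools import product


def satisfies(clauses, assignment):
    """Check a full assignment; literal k > 0 means variable k is True,
    k < 0 means variable |k| is False (an assumed, DIMACS-like encoding)."""
    return all(
        any(assignment[abs(lit)] == (lit > 0) for lit in clause)
        for clause in clauses
    )


def is_satisfiable(clauses, variables, fixed):
    """Hypothetical decision oracle: brute force over the variables not yet fixed."""
    free = [v for v in variables if v not in fixed]
    for values in product([False, True], repeat=len(free)):
        assignment = {**fixed, **dict(zip(free, values))}
        if satisfies(clauses, assignment):
            return True
    return False


def find_assignment(clauses):
    """Self-reduction: build a satisfying assignment using only yes/no oracle answers."""
    variables = sorted({abs(lit) for clause in clauses for lit in clause})
    fixed = {}
    if not is_satisfiable(clauses, variables, fixed):
        return None
    for v in variables:
        fixed[v] = True  # tentatively set v to True
        if not is_satisfiable(clauses, variables, fixed):
            fixed[v] = False  # the formula was satisfiable before, so False must work
    return fixed


# (x1 or x2) and (not x1 or x3) and (not x2 or not x3)
example = [[1, 2], [-1, 3], [-2, -3]]
solution = find_assignment(example)
```

In this sketch each variable costs one or two oracle calls, so the search version is solved with a linear number of decision queries; any other decision procedure could be substituted for the brute-force oracle.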
-
Similarity Between Points in Metric Measure Spaces
Authors:
Evgeny Dantsin,
Alexander Wolpert
Abstract:
This paper is about similarity between objects that can be represented as points in metric measure spaces. A metric measure space is a metric space that is also equipped with a measure. For example, a network with distances between its nodes and weights assigned to its nodes is a metric measure space. Given points x and y in different metric measure spaces or in the same space, how similar are they? A well-known approach is to consider x and y similar if their neighborhoods are similar. For metric measure spaces, similarity between neighborhoods is well captured by the Gromov-Hausdorff-Prokhorov distance, but it is NP-hard to compute this distance even in quite simple cases. We propose a tractable alternative: the radial distribution distance between the neighborhoods of x and y. The similarity measure based on the radial distribution distance is coarser than the similarity based on the Gromov-Hausdorff-Prokhorov distance but much easier to compute.
Submitted 1 November, 2020;
originally announced November 2020.
-
Anchor-free Small-scale Multispectral Pedestrian Detection
Authors:
Alexander Wolpert,
Michael Teutsch,
M. Saquib Sarfraz,
Rainer Stiefelhagen
Abstract:
Multispectral images consisting of aligned visual-optical (VIS) and thermal infrared (IR) image pairs are well-suited for practical applications like autonomous driving or visual surveillance. Such data can be used to increase the performance of pedestrian detection, especially for weakly illuminated, small-scale, or partially occluded instances. The current state of the art is based on variants of Faster R-CNN and thus passes through two stages: a proposal generator network with handcrafted anchor boxes for object localization and a classification network for verifying the object category. In this paper, we propose a method for effective and efficient multispectral fusion of the two modalities in an adapted single-stage anchor-free base architecture. We aim at learning pedestrian representations based on object center and scale rather than direct bounding box predictions. In this way, we can both simplify the network architecture and achieve higher detection performance, especially for pedestrians under occlusion or at low object resolution. In addition, we provide a study on well-suited multispectral data augmentation techniques that improve on the commonly used augmentations. The results show our method's effectiveness in detecting small-scale pedestrians. We achieve a 5.68% log-average miss rate, compared with the best current state-of-the-art result of 7.49% (25% improvement), on the challenging KAIST Multispectral Pedestrian Detection Benchmark.
Code: https://github.com/HensoldtOptronicsCV/MultispectralPedestrianDetection
Submitted 20 August, 2020; v1 submitted 19 August, 2020;
originally announced August 2020.
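Note on the entry above: the abstract reports log-average miss rates of 5.68% and 7.49% and summarizes the gap as a 25% improvement. The short sketch below shows one common way to express such a gap, as a relative reduction of the baseline miss rate. The function name is hypothetical, and the abstract does not state which formula the authors used, so this is an assumption rather than their calculation.

```python
def relative_reduction(baseline: float, ours: float) -> float:
    """Fraction by which `ours` lowers `baseline` (lower miss rate is better)."""
    return (baseline - ours) / baseline


# Log-average miss rates quoted in the abstract, in percent.
reduction = relative_reduction(baseline=7.49, ours=5.68)
print(f"{reduction:.1%}")  # prints "24.2%"
```

Under this assumed formula the relative reduction is about 24%, which is close to the rounded 25% figure quoted in the abstract.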