
H-NeXt: The next step towards roto-translation invariant networks

Authors: Tomas Karella, Filip Sroubek, Jan Flusser, Jan Blazek, Vasek Kosik

Abstract: The widespread popularity of equivariant networks underscores the significance of parameter efficient models and effective use of training data. At a time when robustness to unseen deformations is becoming increasingly important, we present H-NeXt, which bridges the gap between equivariance and invariance. H-NeXt is a parameter-efficient roto-translation invariant network that is trained without a single augmented image in the training set. Our network comprises three components: an equivariant backbone for learning roto-translation independent features, an invariant pooling layer for discarding roto-translation information, and a classification layer. H-NeXt outperforms the state of the art in classification on unaugmented training sets and augmented test sets of MNIST and CIFAR-10.

Submitted 2 November, 2023; originally announced November 2023.

Comments: Appears in British Machine Vision Conference 2023 (BMVC 2023)

arXiv:2304.06566

NeRD: Neural field-based Demosaicking

Authors: Tomas Kerepecky, Filip Sroubek, Adam Novozamsky, Jan Flusser

Abstract: We introduce NeRD, a new demosaicking method for generating full-color images from Bayer patterns. Our approach leverages advancements in neural fields to perform demosaicking by representing an image as a coordinate-based neural network with sine activation functions. The inputs to the network are spatial coordinates and a low-resolution Bayer pattern, while the outputs are the corresponding RGB values. An encoder network, which is a blend of ResNet and U-net, enhances the implicit neural representation of the image to improve its quality and ensure spatial consistency through prior learning. Our experimental results demonstrate that NeRD outperforms traditional and state-of-the-art CNN-based methods and significantly closes the gap to transformer-based methods.

Submitted 13 April, 2023; originally announced April 2023.

Comments: 5 pages, 4 figures, 1 table

arXiv:2304.06560

Real-Time Wheel Detection and Rim Classification in Automotive Production

Authors: Roman Stanek, Tomas Kerepecky, Adam Novozamsky, Filip Sroubek, Barbara Zitova, Jan Flusser

Abstract: This paper proposes a novel approach to real-time automatic rim detection, classification, and inspection by combining traditional computer vision and deep learning techniques. At the end of every automotive assembly line, a quality control process is carried out to identify any potential defects in the produced cars. Common yet hazardous defects are related, for example, to incorrectly mounted rims. Routine inspections are mostly conducted by human workers that are negatively affected by factors such as fatigue or distraction. We have designed a new prototype to validate whether all four wheels on a single car match in size and type. Additionally, we present three comprehensive open-source databases, CWD1500, WHEEL22, and RB600, for wheel, rim, and bolt detection, as well as rim classification, which are free-to-use for scientific purposes.

Submitted 13 April, 2023; originally announced April 2023.

Comments: 5 pages, 7 figures, 3 tables

arXiv:2301.07581

doi: 10.1007/s11263-023-01798-7

Blur Invariants for Image Recognition

Authors: Jan Flusser, Matej Lebl, Matteo Pedone, Filip Sroubek, Jitka Kostkova

Abstract: Blur is an image degradation that is difficult to remove. Invariants with respect to blur offer an alternative way of a description and recognition of blurred images without any deblurring. In this paper, we present an original unified theory of blur invariants. Unlike all previous attempts, the new theory does not require any prior knowledge of the blur type. The invariants are constructed in the Fourier domain by means of orthogonal projection operators and moment expansion is used for efficient and stable computation. It is shown that all blur invariants published earlier are just particular cases of this approach. Experimental comparison to concurrent approaches shows the advantages of the proposed theory.

Submitted 18 January, 2023; originally announced January 2023.

Comments: 15 pages

Showing 1–4 of 4 results for author: Flusser, J