Showing 1–47 of 47 results for author: Mikolajczyk, K

  1. arXiv:2404.16538  [pdf, other]

    cs.CV

    OpenDlign: Enhancing Open-World 3D Learning with Depth-Aligned Images

    Authors: Ye Mao, Junpeng Jing, Krystian Mikolajczyk

    Abstract: Recent open-world 3D representation learning methods using Vision-Language Models (VLMs) to align 3D data with image-text information have shown superior 3D zero-shot performance. However, CAD-rendered images for this alignment often lack realism and texture variation, compromising alignment robustness. Moreover, the volume discrepancy between 3D and 2D pretraining datasets highlights the need for…

    Submitted 24 June, 2024; v1 submitted 25 April, 2024; originally announced April 2024.

    Comments: 12 pages

  2. arXiv:2404.15194  [pdf, other]

    cs.RO cs.CV

    Closed Loop Interactive Embodied Reasoning for Robot Manipulation

    Authors: Michal Nazarczuk, Jan Kristof Behrens, Karla Stepanova, Matej Hoffmann, Krystian Mikolajczyk

    Abstract: Embodied reasoning systems integrate robotic hardware and cognitive processes to perform complex tasks typically in response to a natural language query about a specific physical environment. This usually involves changing the belief about the scene or physically interacting and changing the scene (e.g. 'Sort the objects from lightest to heaviest'). In order to facilitate the development of such s…

    Submitted 23 April, 2024; originally announced April 2024.

  3. arXiv:2404.07344  [pdf, other]

    cs.RO cs.AI cs.IT

    Interactive Learning of Physical Object Properties Through Robot Manipulation and Database of Object Measurements

    Authors: Andrej Kruzliak, Jiri Hartvich, Shubhan P. Patni, Lukas Rustler, Jan Kristof Behrens, Fares J. Abu-Dakka, Krystian Mikolajczyk, Ville Kyrki, Matej Hoffmann

    Abstract: This work presents a framework for automatically extracting physical object properties, such as material composition, mass, volume, and stiffness, through robot manipulation and a database of object measurements. The framework involves exploratory action selection to maximize learning about objects on a table. A Bayesian network models conditional dependencies between object properties, incorporat…

    Submitted 10 April, 2024; originally announced April 2024.

    Comments: 8 pages, 8 figures

    ACM Class: I.2.9

  4. arXiv:2403.15551  [pdf, other]

    cs.CV cs.AI

    Language-Based Depth Hints for Monocular Depth Estimation

    Authors: Dylan Auty, Krystian Mikolajczyk

    Abstract: Monocular depth estimation (MDE) is inherently ambiguous, as a given image may result from many different 3D scenes and vice versa. To resolve this ambiguity, an MDE system must make assumptions about the most likely 3D scenes for a given input. These assumptions can be either explicit or implicit. In this work, we demonstrate the use of natural language as a source of an explicit prior about the…

    Submitted 22 March, 2024; originally announced March 2024.

    Comments: 8 pages, 1 figure. Work originally done in June 2022

  5. arXiv:2403.14494  [pdf, other]

    cs.CV cs.AI

    Learning to Project for Cross-Task Knowledge Distillation

    Authors: Dylan Auty, Roy Miles, Benedikt Kolbeinsson, Krystian Mikolajczyk

    Abstract: Traditional knowledge distillation (KD) relies on a proficient teacher trained on the target task, which is not always available. In this setting, cross-task distillation can be used, enabling the use of any teacher model trained on a different task. However, many KD methods prove ineffective when applied to this cross-task setting. To address this limitation, we propose a simple modification: the…

    Submitted 21 March, 2024; originally announced March 2024.

  6. arXiv:2403.13615  [pdf, other]

    cs.IT eess.SP

    MIMO Channel as a Neural Function: Implicit Neural Representations for Extreme CSI Compression in Massive MIMO Systems

    Authors: Haotian Wu, Maojun Zhang, Yulin Shao, Krystian Mikolajczyk, Deniz Gündüz

    Abstract: Acquiring and utilizing accurate channel state information (CSI) can significantly improve transmission performance, thereby holding a crucial role in realizing the potential advantages of massive multiple-input multiple-output (MIMO) technology. Current prevailing CSI feedback approaches improve precision by employing advanced deep-learning methods to learn representative CSI features for a subse…

    Submitted 20 March, 2024; originally announced March 2024.

    MSC Class: 94A24 ACM Class: E.4

  7. arXiv:2403.10755  [pdf, other]

    cs.CV

    Match-Stereo-Videos: Bidirectional Alignment for Consistent Dynamic Stereo Matching

    Authors: Junpeng Jing, Ye Mao, Krystian Mikolajczyk

    Abstract: Dynamic stereo matching is the task of estimating consistent disparities from stereo videos with dynamic objects. Recent learning-based methods prioritize optimal performance on a single stereo pair, resulting in temporal inconsistencies. Existing video methods apply per-frame matching and window-based cost aggregation across the time dimension, leading to low-frequency oscillations at the scale o…

    Submitted 15 March, 2024; originally announced March 2024.

  8. arXiv:2312.12494  [pdf, other]

    cs.CV

    DDOS: The Drone Depth and Obstacle Segmentation Dataset

    Authors: Benedikt Kolbeinsson, Krystian Mikolajczyk

    Abstract: Accurate depth and semantic segmentation are crucial for various computer vision tasks. However, the scarcity of annotated real-world aerial datasets poses a significant challenge for training and evaluating robust models. Additionally, the detection and segmentation of thin objects, such as wires, cables, and fences, present a critical concern for ensuring the safe operation of drones. To address…

    Submitted 19 December, 2023; originally announced December 2023.

  9. arXiv:2311.18098  [pdf, other]

    cs.LG cs.IT cs.NI

    Adaptive Early Exiting for Collaborative Inference over Noisy Wireless Channels

    Authors: Mikolaj Jankowski, Deniz Gunduz, Krystian Mikolajczyk

    Abstract: Collaborative inference systems are one of the emerging solutions for deploying deep neural networks (DNNs) at the wireless network edge. Their main idea is to divide a DNN into two parts, where the first is shallow enough to be reliably executed at edge devices of limited computational power, while the second part is executed at an edge server with higher computational capabilities. The main adva…

    Submitted 29 November, 2023; originally announced November 2023.

  10. arXiv:2309.00470  [pdf, other]

    cs.IT eess.IV

    Deep Joint Source-Channel Coding for Adaptive Image Transmission over MIMO Channels

    Authors: Haotian Wu, Yulin Shao, Chenghong Bian, Krystian Mikolajczyk, Deniz Gündüz

    Abstract: This paper introduces a vision transformer (ViT)-based deep joint source and channel coding (DeepJSCC) scheme for wireless image transmission over multiple-input multiple-output (MIMO) channels, denoted as DeepJSCC-MIMO. We consider DeepJSCC-MIMO for adaptive image transmission in both open-loop and closed-loop MIMO systems. The novel DeepJSCC-MIMO architecture surpasses the classical separation-b…

    Submitted 7 May, 2024; v1 submitted 1 September, 2023; originally announced September 2023.

    Comments: arXiv admin note: text overlap with arXiv:2210.15347

    MSC Class: 94A24 ACM Class: E.4

  11. arXiv:2306.09101  [pdf, other]

    cs.IT eess.SP

    Transformer-aided Wireless Image Transmission with Channel Feedback

    Authors: Haotian Wu, Yulin Shao, Emre Ozfatura, Krystian Mikolajczyk, Deniz Gündüz

    Abstract: This paper presents a novel wireless image transmission paradigm that can exploit feedback from the receiver, called DeepJSCC-ViT-f. We consider a block feedback channel model, where the transmitter receives noiseless/noisy channel output feedback after each block. The proposed scheme employs a single encoder to facilitate transmission over multiple blocks, refining the receiver's estimation at ea…

    Submitted 14 February, 2024; v1 submitted 15 June, 2023; originally announced June 2023.

    MSC Class: 94A24 ACM Class: E.4

  12. arXiv:2303.11098  [pdf, other]

    cs.CV cs.AI

    Understanding the Role of the Projector in Knowledge Distillation

    Authors: Roy Miles, Krystian Mikolajczyk

    Abstract: In this paper we revisit the efficacy of knowledge distillation as a function matching and metric learning problem. In doing so we verify three important design decisions, namely the normalisation, soft maximum function, and projection layers as key ingredients. We theoretically show that the projector implicitly encodes information on past examples, enabling relational gradients for the student.…

    Submitted 1 February, 2024; v1 submitted 20 March, 2023; originally announced March 2023.

    Comments: AAAI 2024. Code available at https://github.com/roymiles/Simple-Recipe-Distillation

  13. arXiv:2212.00787  [pdf, other]

    cs.CV

    Multi-Class Segmentation from Aerial Views using Recursive Noise Diffusion

    Authors: Benedikt Kolbeinsson, Krystian Mikolajczyk

    Abstract: Semantic segmentation from aerial views is a crucial task for autonomous drones, as they rely on precise and accurate segmentation to navigate safely and efficiently. However, aerial images present unique challenges such as diverse viewpoints, extreme scale variations, and high scene complexity. In this paper, we propose an end-to-end multi-class semantic segmentation diffusion model that addresse…

    Submitted 22 May, 2024; v1 submitted 1 December, 2022; originally announced December 2022.

    Comments: Accepted at WACV 2024. Code available at https://github.com/benediktkol/recursive-noise-diffusion

  14. arXiv:2211.17232  [pdf, other]

    cs.CV cs.LG

    ObjCAViT: Improving Monocular Depth Estimation Using Natural Language Models And Image-Object Cross-Attention

    Authors: Dylan Auty, Krystian Mikolajczyk

    Abstract: While monocular depth estimation (MDE) is an important problem in computer vision, it is difficult due to the ambiguity that results from the compression of a 3D scene into only 2 dimensions. It is common practice in the field to treat it as simple image-to-image translation, without consideration for the semantics of the scene and the objects within it. In contrast, humans and animals have been s…

    Submitted 30 November, 2022; originally announced November 2022.

    Comments: 9 pages, 4 figures. Code is released at https://github.com/DylanAuty/ObjCAViT

  15. arXiv:2210.15347  [pdf, other]

    cs.IT eess.IV

    Vision Transformer for Adaptive Image Transmission over MIMO Channels

    Authors: Haotian Wu, Yulin Shao, Chenghong Bian, Krystian Mikolajczyk, Deniz Gündüz

    Abstract: This paper presents a vision transformer (ViT) based joint source and channel coding (JSCC) scheme for wireless image transmission over multiple-input multiple-output (MIMO) systems, called ViT-MIMO. The proposed ViT-MIMO architecture, in addition to outperforming separation-based benchmarks, can flexibly adapt to different channel conditions without requiring retraining. Specifically, exploiting…

    Submitted 27 October, 2022; originally announced October 2022.

    MSC Class: 94A24 ACM Class: E.4

  16. arXiv:2206.10312  [pdf, other]

    cs.RO cs.AI cs.LG

    SAMPLE-HD: Simultaneous Action and Motion Planning Learning Environment

    Authors: Michal Nazarczuk, Tony Ng, Krystian Mikolajczyk

    Abstract: Humans exhibit incredibly high levels of multi-modal understanding - combining visual cues with read, or heard knowledge comes easy to us and allows for very accurate interaction with the surrounding environment. Various simulation environments focus on providing data for tasks related to scene understanding, question answering, space exploration, visual navigation. In this work, we are providing…

    Submitted 1 June, 2022; originally announced June 2022.

    Comments: CVPRW, 2 pages

  17. Channel-Adaptive Wireless Image Transmission with OFDM

    Authors: Haotian Wu, Yulin Shao, Krystian Mikolajczyk, Deniz Gündüz

    Abstract: We present a learning-based channel-adaptive joint source and channel coding (CA-JSCC) scheme for wireless image transmission over multipath fading channels. The proposed method is an end-to-end autoencoder architecture with a dual-attention mechanism employing orthogonal frequency division multiplexing (OFDM) transmission. Unlike the previous works, our approach is adaptive to channel-gain and no…

    Submitted 8 September, 2022; v1 submitted 4 May, 2022; originally announced May 2022.

    Comments: IEEE Wireless Communications Letters

    MSC Class: 94A24 ACM Class: E.4

  18. arXiv:2204.10384  [pdf, other]

    cs.CV

    Monocular Depth Estimation Using Cues Inspired by Biological Vision Systems

    Authors: Dylan Auty, Krystian Mikolajczyk

    Abstract: Monocular depth estimation (MDE) aims to transform an RGB image of a scene into a pixelwise depth map from the same camera view. It is fundamentally ill-posed due to missing information: any single image can have been taken from many possible 3D scenes. Part of the MDE task is, therefore, to learn which visual cues in the image can be used for depth estimation, and how. With training data limited…

    Submitted 12 May, 2022; v1 submitted 21 April, 2022; originally announced April 2022.

    Comments: 7 pages, 2 figures. Accepted to International Conference on Pattern Recognition (ICPR) 2022. Code available at https://github.com/DylanAuty/MDE-biological-vision-systems

  19. arXiv:2112.12785  [pdf, other]

    cs.CV

    NinjaDesc: Content-Concealing Visual Descriptors via Adversarial Learning

    Authors: Tony Ng, Hyo Jin Kim, Vincent Lee, Daniel DeTone, Tsun-Yi Yang, Tianwei Shen, Eddy Ilg, Vassileios Balntas, Krystian Mikolajczyk, Chris Sweeney

    Abstract: In the light of recent analyses on privacy-concerning scene revelation from visual descriptors, we develop descriptors that conceal the input image content. In particular, we propose an adversarial learning framework for training visual descriptors that prevent image reconstruction, while maintaining the matching accuracy. We let a feature encoding network and image reconstruction network compete…

    Submitted 29 March, 2022; v1 submitted 23 December, 2021; originally announced December 2021.

    Comments: Accepted at CVPR 2022. Supplementary material included after references. 15 pages, 14 figures, 6 tables

  20. arXiv:2112.04846  [pdf, other]

    cs.CV

    ScaleNet: A Shallow Architecture for Scale Estimation

    Authors: Axel Barroso-Laguna, Yurun Tian, Krystian Mikolajczyk

    Abstract: In this paper, we address the problem of estimating scale factors between images. We formulate the scale estimation problem as a prediction of a probability distribution over scale factors. We design a new architecture, ScaleNet, that exploits dilated convolutions as well as self and cross-correlation layers to predict the scale between images. We demonstrate that rectifying images with estimated…

    Submitted 5 July, 2022; v1 submitted 9 December, 2021; originally announced December 2021.

    Journal ref: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition 2022

  21. arXiv:2112.00459  [pdf, other]

    cs.CV

    Information Theoretic Representation Distillation

    Authors: Roy Miles, Adrian Lopez Rodriguez, Krystian Mikolajczyk

    Abstract: Despite the empirical success of knowledge distillation, current state-of-the-art methods are computationally expensive to train, which makes them difficult to adopt in practice. To address this problem, we introduce two distinct complementary losses inspired by a cheap entropy-like estimator. These losses aim to maximise the correlation and mutual information between the student and teacher repre…

    Submitted 7 October, 2022; v1 submitted 1 December, 2021; originally announced December 2021.

    Comments: BMVC 2022

  22. arXiv:2110.12844  [pdf, other]

    cs.CV

    Reconstructing Pruned Filters using Cheap Spatial Transformations

    Authors: Roy Miles, Krystian Mikolajczyk

    Abstract: We present an efficient alternative to the convolutional layer using cheap spatial transformations. This construction exploits an inherent spatial redundancy of the learned convolutional filters to enable a much greater parameter efficiency, while maintaining the top-end accuracy of their dense counter-parts. Training these networks is modelled as a generalised pruning problem, whereby the pruned…

    Submitted 24 August, 2023; v1 submitted 25 October, 2021; originally announced October 2021.

    Comments: ICCV 2023 Workshop on Resource Efficient Deep Learning for Computer Vision

  23. arXiv:2110.02903  [pdf, other]

    cs.CV

    Grasp-Oriented Fine-grained Cloth Segmentation without Real Supervision

    Authors: Ruijie Ren, Mohit Gurnani Rajesh, Jordi Sanchez-Riera, Fan Zhang, Yurun Tian, Antonio Agudo, Yiannis Demiris, Krystian Mikolajczyk, Francesc Moreno-Noguer

    Abstract: Automatically detecting graspable regions from a single depth image is a key ingredient in cloth manipulation. The large variability of cloth deformations has motivated most of the current approaches to focus on identifying specific grasping points rather than semantic parts, as the appearance and depth variations of local regions are smaller and easier to model than the larger ones. However, task…

    Submitted 6 October, 2021; originally announced October 2021.

    Comments: 6 pages, 4 figures. Submitted to International Conference on Robotics and Automation (ICRA)

  24. arXiv:2108.07260  [pdf, other]

    cs.CV

    Reassessing the Limitations of CNN Methods for Camera Pose Regression

    Authors: Tony Ng, Adrian Lopez-Rodriguez, Vassileios Balntas, Krystian Mikolajczyk

    Abstract: In this paper, we address the problem of camera pose estimation in outdoor and indoor scenarios. In comparison to the currently top-performing methods that rely on 2D to 3D matching, we propose a model that can directly regress the camera pose from images with significantly higher accuracy than existing methods of the same class. We first analyse why regression methods are still behind the state-o…

    Submitted 16 August, 2021; originally announced August 2021.

  25. arXiv:2105.11166  [pdf, other]

    cs.NI cs.CV cs.LG

    AirNet: Neural Network Transmission over the Air

    Authors: Mikolaj Jankowski, Deniz Gunduz, Krystian Mikolajczyk

    Abstract: State-of-the-art performance for many edge applications is achieved by deep neural networks (DNNs). Often, these DNNs are location- and time-sensitive, and must be delivered over a wireless channel rapidly and efficiently. In this paper, we introduce AirNet, a family of novel training and transmission methods that allow DNNs to be efficiently delivered over wireless channels under stringent transm…

    Submitted 19 July, 2023; v1 submitted 24 May, 2021; originally announced May 2021.

  26. arXiv:2009.01579  [pdf, other]

    cs.CV

    DESC: Domain Adaptation for Depth Estimation via Semantic Consistency

    Authors: Adrian Lopez-Rodriguez, Krystian Mikolajczyk

    Abstract: Accurate real depth annotations are difficult to acquire, needing the use of special devices such as a LiDAR sensor. Self-supervised methods try to overcome this problem by processing video or stereo sequences, which may not always be available. Instead, in this paper, we propose a domain adaptation approach to train a monocular depth estimation model using a fully-annotated source dataset and a n…

    Submitted 3 September, 2020; originally announced September 2020.

    Comments: BMVC20 (Oral). Code: https://github.com/alopezgit/DESC

  27. arXiv:2008.06814  [pdf, other]

    cs.CV

    Cascaded channel pruning using hierarchical self-distillation

    Authors: Roy Miles, Krystian Mikolajczyk

    Abstract: In this paper, we propose an approach for filter-level pruning with hierarchical knowledge distillation based on the teacher, teaching-assistant, and student framework. Our method makes use of teaching assistants at intermediate pruning levels that share the same architecture and weights as the target student. We propose to prune each model independently using the gradient information from its cor…

    Submitted 15 August, 2020; originally announced August 2020.

    Comments: BMVC 2020

  28. arXiv:2008.01034  [pdf, other]

    cs.CV

    Project to Adapt: Domain Adaptation for Depth Completion from Noisy and Sparse Sensor Data

    Authors: Adrian Lopez-Rodriguez, Benjamin Busam, Krystian Mikolajczyk

    Abstract: Depth completion aims to predict a dense depth map from a sparse depth input. The acquisition of dense ground truth annotations for depth completion settings can be difficult and, at the same time, a significant domain gap between real LiDAR measurements and synthetic data has prevented from successful training of models in virtual settings. We propose a domain adaptation approach for sparse-to-de…

    Submitted 5 August, 2020; v1 submitted 3 August, 2020; originally announced August 2020.

  29. arXiv:2007.10915  [pdf, other]

    cs.IT cs.LG

    Wireless Image Retrieval at the Edge

    Authors: Mikolaj Jankowski, Deniz Gunduz, Krystian Mikolajczyk

    Abstract: We study the image retrieval problem at the wireless edge, where an edge device captures an image, which is then used to retrieve similar images from an edge server. These can be images of the same person or a vehicle taken from other cameras at different times and locations. Our goal is to maximize the accuracy of the retrieval task under power and bandwidth constraints over the wireless link. Du…

    Submitted 15 July, 2021; v1 submitted 21 July, 2020; originally announced July 2020.

  30. arXiv:2006.10202  [pdf, other]

    cs.CV

    HyNet: Learning Local Descriptor with Hybrid Similarity Measure and Triplet Loss

    Authors: Yurun Tian, Axel Barroso-Laguna, Tony Ng, Vassileios Balntas, Krystian Mikolajczyk

    Abstract: Recent works show that local descriptor learning benefits from the use of L2 normalisation, however, an in-depth analysis of this effect lacks in the literature. In this paper, we investigate how L2 normalisation affects the back-propagated descriptor gradients during training. Based on our observations, we propose HyNet, a new local descriptor that leads to state-of-the-art results in matching. H…

    Submitted 9 November, 2020; v1 submitted 17 June, 2020; originally announced June 2020.

  31. arXiv:2005.13605  [pdf, other]

    cs.CV cs.LG eess.IV

    D2D: Keypoint Extraction with Describe to Detect Approach

    Authors: Yurun Tian, Vassileios Balntas, Tony Ng, Axel Barroso-Laguna, Yiannis Demiris, Krystian Mikolajczyk

    Abstract: In this paper, we present a novel approach that exploits the information within the descriptor space to propose keypoint locations. Detect then describe, or detect and describe jointly are two typical strategies for extracting local descriptors. In contrast, we propose an approach that inverts this process by first describing and then detecting the keypoint locations. Describe-to-Detect (D2D) le…

    Submitted 27 May, 2020; originally announced May 2020.

  32. arXiv:2005.05777  [pdf, other]

    cs.CV

    HDD-Net: Hybrid Detector Descriptor with Mutual Interactive Learning

    Authors: Axel Barroso-Laguna, Yannick Verdie, Benjamin Busam, Krystian Mikolajczyk

    Abstract: Local feature extraction remains an active research area due to the advances in fields such as SLAM, 3D reconstructions, or AR applications. The success in these applications relies on the performance of the feature detector and descriptor. While the detector-descriptor interaction of most methods is based on unifying in single network detections and descriptors, we propose a method that treats bo…

    Submitted 26 November, 2020; v1 submitted 12 May, 2020; originally announced May 2020.

    Journal ref: Asian Conference on Computer Vision (ACCV), 2020

  33. arXiv:2004.02673  [pdf, other]

    cs.CV cs.AI cs.LG cs.RO

    SHOP-VRB: A Visual Reasoning Benchmark for Object Perception

    Authors: Michal Nazarczuk, Krystian Mikolajczyk

    Abstract: In this paper we present an approach and a benchmark for visual reasoning in robotics applications, in particular small object grasping and manipulation. The approach and benchmark are focused on inferring object properties from visual and text data. It concerns small household objects with their properties, functionality, natural language descriptions as well as question-answer pairs for visual r…

    Submitted 6 April, 2020; originally announced April 2020.

    Comments: International Conference on Robotics and Automation (ICRA) 2020

  34. arXiv:2003.04191  [pdf, other]

    cs.CV cs.LG

    Domain Adversarial Training for Infrared-colour Person Re-Identification

    Authors: Nima Mohammadi Meshky, Sara Iodice, Krystian Mikolajczyk

    Abstract: Person re-identification (re-ID) is a very active area of research in computer vision, due to the role it plays in video surveillance. Currently, most methods only address the task of matching between colour images. However, in poorly-lit environments CCTV cameras switch to infrared imaging, hence developing a system which can correctly perform matching between infrared and colour images is a nece…

    Submitted 9 March, 2020; originally announced March 2020.

    Journal ref: ICDP 2019

  35. Joint Device-Edge Inference over Wireless Links with Pruning

    Authors: Mikolaj Jankowski, Deniz Gunduz, Krystian Mikolajczyk

    Abstract: We propose a joint feature compression and transmission scheme for efficient inference at the wireless network edge. Our goal is to enable efficient and reliable inference at the edge server assuming limited computational resources at the edge device. Previous work focused mainly on feature compression, ignoring the computational cost of channel coding. We incorporate the recently proposed deep jo…

    Submitted 20 October, 2020; v1 submitted 4 March, 2020; originally announced March 2020.

  36. arXiv:2001.08972  [pdf, other]

    cs.CV cs.LG

    SOLAR: Second-Order Loss and Attention for Image Retrieval

    Authors: Tony Ng, Vassileios Balntas, Yurun Tian, Krystian Mikolajczyk

    Abstract: Recent works in deep-learning have shown that second-order information is beneficial in many computer-vision tasks. Second-order information can be enforced both in the spatial context and the abstract feature dimensions. In this work, we explore two second-order components. One is focused on second-order spatial information to increase the performance of image descriptors, both local and global.…

    Submitted 4 August, 2020; v1 submitted 24 January, 2020; originally announced January 2020.

    Comments: ECCV 2020

  37. arXiv:2001.03102  [pdf, other]

    cs.CV

    Compression of descriptor models for mobile applications

    Authors: Roy Miles, Krystian Mikolajczyk

    Abstract: Deep neural networks have demonstrated state-of-the-art performance for feature-based image matching through the advent of new large and diverse datasets. However, there has been little work on evaluating the computational cost, model size, and matching accuracy tradeoffs for these models. This paper explicitly addresses these practical metrics by considering the state-of-the-art HardNet model. We…

    Submitted 5 February, 2021; v1 submitted 9 January, 2020; originally announced January 2020.

    Comments: ICASSP 2021

  38. arXiv:1911.10033  [pdf, other]

    cs.CV

    Domain Adaptation for Object Detection via Style Consistency

    Authors: Adrian Lopez Rodriguez, Krystian Mikolajczyk

    Abstract: We propose a domain adaptation approach for object detection. We introduce a two-step method: the first step makes the detector robust to low-level differences and the second step adapts the classifiers to changes in the high-level features. For the first step, we use a style transfer method for pixel-adaptation of source images to the target domain. We find that enforcing low distance in the high…

    Submitted 22 November, 2019; originally announced November 2019.

    Comments: BMVC 2019

  39. Deep Joint Source-Channel Coding for Wireless Image Retrieval

    Authors: Mikolaj Jankowski, Deniz Gunduz, Krystian Mikolajczyk

    Abstract: Motivated by surveillance applications with wireless cameras or drones, we consider the problem of image retrieval over a wireless channel. Conventional systems apply lossy compression on query images to reduce the data that must be transmitted over the bandwidth and power limited wireless link. We first note that reconstructing the original image is not needed for retrieval tasks; hence, we intro…

    Submitted 28 October, 2019; originally announced October 2019.

  40. arXiv:1904.00889  [pdf, other]

    cs.CV

    Key.Net: Keypoint Detection by Handcrafted and Learned CNN Filters

    Authors: Axel Barroso-Laguna, Edgar Riba, Daniel Ponsa, Krystian Mikolajczyk

    Abstract: We introduce a novel approach for keypoint detection task that combines handcrafted and learned CNN filters within a shallow multi-scale architecture. Handcrafted filters provide anchor structures for learned filters, which localize, score and rank repeatable features. Scale-space representation is used within the network to extract keypoints at different levels. We design a loss function to detec…

    Submitted 12 October, 2019; v1 submitted 1 April, 2019; originally announced April 2019.

    Journal ref: International Conference on Computer Vision (ICCV) 2019

  41. arXiv:1904.00244  [pdf, other]

    cs.CV

    Person Re-identification with Bias-controlled Adversarial Training

    Authors: Sara Iodice, Krystian Mikolajczyk

    Abstract: Inspired by the effectiveness of adversarial training in the area of Generative Adversarial Networks we present a new approach for learning feature representations in person re-identification. We investigate different types of bias that typically occur in re-ID scenarios, i.e., pose, body part and camera view, and propose a general approach to address them. We introduce an adversarial strategy for…

    Submitted 30 March, 2019; originally announced April 2019.

  42. arXiv:1807.09162  [pdf, other]

    cs.CV

    Partial Person Re-identification with Alignment and Hallucination

    Authors: Sara Iodice, Krystian Mikolajczyk

    Abstract: Partial person re-identification involves matching pedestrian frames where only a part of a body is visible in corresponding images. This reflects practical CCTV surveillance scenario, where full person views are often not available. Missing body parts make the comparison very challenging due to significant misalignment and varying scale of the views. We propose Partial Matching Net (PMN) that det…

    Submitted 24 July, 2018; originally announced July 2018.

  43. arXiv:1805.06406  [pdf, other]

    cs.CV

    Deep Segmentation and Registration in X-Ray Angiography Video

    Authors: Athanasios Vlontzos, Krystian Mikolajczyk

    Abstract: In interventional radiology, short video sequences of vein structure in motion are captured in order to help medical personnel identify vascular issues or plan intervention. Semantic segmentation can greatly improve the usefulness of these videos by indicating exact position of vessels and instruments, thus reducing the ambiguity. We propose a real-time segmentation method for these tasks, based o…

    Submitted 3 August, 2018; v1 submitted 16 May, 2018; originally announced May 2018.

    Comments: To appear in BMVC 2018

  44. arXiv:1710.01202  [pdf, other]

    cs.CV

    Person Re-Identification with Vision and Language

    Authors: Fei Yan, Krystian Mikolajczyk, Josef Kittler

    Abstract: In this paper we propose a new approach to person re-identification using images and natural language descriptions. We propose a joint vision and language model based on CCA and CNN architectures to match across the two modalities as well as to enrich visual examples for which there are no language descriptions. We also introduce new annotations in the form of natural language descriptions for two…

    Submitted 3 October, 2017; originally announced October 2017.

  45. arXiv:1704.05939  [pdf, other]

    cs.CV

    HPatches: A benchmark and evaluation of handcrafted and learned local descriptors

    Authors: Vassileios Balntas, Karel Lenc, Andrea Vedaldi, Krystian Mikolajczyk

    Abstract: In this paper, we propose a novel benchmark for evaluating local image descriptors. We demonstrate that the existing datasets and evaluation protocols do not specify unambiguously all aspects of evaluation, leading to ambiguities and inconsistencies in results reported in the literature. Furthermore, these datasets are nearly saturated due to the recent improvements in local descriptors obtained b…

    Submitted 19 April, 2017; originally announced April 2017.

  46. arXiv:1603.07141  [pdf, other]

    cs.CV

    BreakingNews: Article Annotation by Image and Text Processing

    Authors: Arnau Ramisa, Fei Yan, Francesc Moreno-Noguer, Krystian Mikolajczyk

    Abstract: Building upon recent Deep Neural Network architectures, current approaches lying in the intersection of computer vision and natural language processing have achieved unprecedented breakthroughs in tasks like automatic captioning or image retrieval. Most of these learning methods, though, rely on large training sets of images associated with human annotations that specifically describe the visual c…

    Submitted 23 March, 2016; originally announced March 2016.

  47. arXiv:1601.05030  [pdf, other]

    cs.CV

    PN-Net: Conjoined Triple Deep Network for Learning Local Image Descriptors

    Authors: Vassileios Balntas, Edward Johns, Lilian Tang, Krystian Mikolajczyk

    Abstract: In this paper we propose a new approach for learning local descriptors for matching image patches. It has recently been demonstrated that descriptors based on convolutional neural networks (CNN) can significantly improve the matching performance. Unfortunately their computational complexity is prohibitive for any practical application. We address this problem and propose a CNN based descriptor wit…

    Submitted 19 January, 2016; originally announced January 2016.