Improved Hard Example Mining Approach for Single Shot Object Detectors

Authors: Aybora Koksal, Onder Tuzcuoglu, Kutalmis Gokalp Ince, Yoldas Ataseven, A. Aydin Alatan

Abstract: Hard example mining methods generally improve the performance of object detectors, which suffer from imbalanced training sets. In this work, two existing hard example mining approaches (LRM and focal loss, FL) are adapted and combined in a state-of-the-art real-time object detector, YOLOv5. The effectiveness of the proposed approach for improving the performance on hard examples is extensively evaluated. The proposed method increases mAP by 3% compared to using the original loss function and by around 1-2% compared to using the hard-mining methods (LRM or FL) individually on the 2021 Anti-UAV Challenge Dataset.

Submitted 12 July, 2022; v1 submitted 26 February, 2022; originally announced February 2022.

Comments: ICIP 2022. 5 pages, 2 figures, 7 tables. The codes are available at https://github.com/aybora/yolov5Loss

arXiv:2108.08179

Effect of Parameter Optimization on Classical and Learning-based Image Matching Methods

Authors: Ufuk Efe, Kutalmis Gokalp Ince, A. Aydin Alatan

Abstract: Deep learning-based image matching methods have improved significantly in recent years. Although these methods are reported to outperform the classical techniques, the performance of the classical methods has not been examined in detail. In this study, we compare classical and learning-based methods by employing mutual nearest neighbor search with ratio test and optimizing the ratio test threshold to achieve the best performance on two different performance metrics. After a fair comparison, the experimental results on the HPatches dataset reveal that the performance gap between classical and learning-based methods is not that significant. Throughout the experiments, we demonstrate that SuperGlue is the state-of-the-art technique for the image matching problem on the HPatches dataset. However, if a single parameter, namely the ratio test threshold, is carefully optimized, the well-known traditional method SIFT performs quite close to SuperGlue and even outperforms it in terms of mean matching accuracy (MMA) under 1 and 2 pixel thresholds. Moreover, a recent approach, DFM, which uses only pre-trained VGG features as descriptors and a ratio test, is shown to outperform most of the well-trained learning-based methods. Therefore, we conclude that the parameters of any classical method should be analyzed carefully before comparing it against a learning-based technique.

Submitted 28 August, 2021; v1 submitted 18 August, 2021; originally announced August 2021.

Comments: 8 pages, 2 figures, 3 tables, ICCV 2021 TradiCV Workshop

arXiv:2106.07791

DFM: A Performance Baseline for Deep Feature Matching

Authors: Ufuk Efe, Kutalmis Gokalp Ince, A. Aydin Alatan

Abstract: A novel image matching method is proposed that utilizes learned features extracted by an off-the-shelf deep neural network to obtain a promising performance. The proposed method uses a pre-trained VGG architecture as a feature extractor and does not require any additional training specific to improving matching. Inspired by well-established concepts in psychology, such as the Mental Rotation paradigm, an initial warping is performed as a result of a preliminary geometric transformation estimate. These estimates are simply based on dense matching of nearest neighbors at the terminal layer of the VGG network outputs of the images to be matched. After this initial alignment, the same approach is repeated between the reference and aligned images in a hierarchical manner to reach a good localization and matching performance. Our algorithm achieves overall scores of 0.57 and 0.80 in terms of Mean Matching Accuracy (MMA) for 1 pixel and 2 pixel thresholds, respectively, on the HPatches dataset, which indicates a better performance than the state-of-the-art.

Submitted 14 June, 2021; originally announced June 2021.

Comments: CVPR 2021 Image Matching Workshop Camera Ready Version

ACM Class: I.5.0; I.4.7

arXiv:2101.06977

doi 10.1109/ICCVW54120.2021.00143

Semi-Automatic Annotation For Visual Object Tracking

Authors: Kutalmis Gokalp Ince, Aybora Koksal, Arda Fazla, A. Aydin Alatan

Abstract: We propose a semi-automatic bounding box annotation method for visual object tracking by utilizing temporal information with a tracking-by-detection approach. For detection, we use an off-the-shelf object detector which is trained iteratively with the annotations generated by the proposed method, and we perform object detection on each frame independently. We employ Multiple Hypothesis Tracking (MHT) to exploit temporal information and to reduce the number of false positives, which makes it possible to use lower objectness thresholds for detection to increase recall. The tracklets formed by MHT are evaluated by human operators to enlarge the training set. This novel incremental learning approach helps to perform annotation iteratively. The experiments performed on the AUTH Multidrone Dataset reveal that the annotation workload can be reduced by up to 96% with the proposed approach.

Submitted 19 August, 2021; v1 submitted 18 January, 2021; originally announced January 2021.

Comments: Accepted to The 2nd Anti-UAV Workshop & Challenge - ICCV Workshops, 2021. Resulting uav_detection_2 annotations and our codes are publicly available at https://github.com/aybora/Semi-Automatic-Video-Annotation-OGAM

arXiv:2004.01059

doi 10.1109/CVPRW50498.2020.00523

Effect of Annotation Errors on Drone Detection with YOLOv3

Authors: Aybora Koksal, Kutalmis Gokalp Ince, A. Aydin Alatan

Abstract: Following the recent advances in deep networks, object detection and tracking algorithms with deep learning backbones have been improved significantly; however, this rapid development resulted in the necessity of large amounts of annotated labels. Even if the details of such semi-automatic annotation processes for most of these datasets are not known precisely, especially for the video annotations, some automated labeling processes are usually employed. Unfortunately, such approaches might result in erroneous annotations. In this work, different types of annotation errors for the object detection problem are simulated, and the performance of a popular state-of-the-art object detector, YOLOv3, with erroneous annotations during the training and testing stages is examined. Moreover, some inevitable annotation errors in the CVPR-2020 Anti-UAV Challenge dataset are also examined in this manner, while proposing a solution to correct such annotation errors in this valuable dataset.

Submitted 12 January, 2021; v1 submitted 2 April, 2020; originally announced April 2020.

Comments: Best Paper Award at The 1st Anti-UAV Workshop & Challenge - CVPR Workshops, 2020

Showing 1–5 of 5 results for author: Ince, K G