-
Enhancing Object Detection Performance for Small Objects through Synthetic Data Generation and Proportional Class-Balancing Technique: A Comparative Study in Industrial Scenarios
Authors:
Jibinraj Antony,
Vinit Hegiste,
Ali Nazeri,
Hooman Tavakoli,
Snehal Walunj,
Christiane Plociennik,
Martin Ruskowski
Abstract:
Object Detection (OD) has proven to be a significant computer vision method in extracting localized class information and has multiple applications in the industry. Although many of the state-of-the-art (SOTA) OD models perform well on medium and large sized objects, they seem to under perform on small objects. In most of the industrial use cases, it is difficult to collect and annotate data for s…
▽ More
Object Detection (OD) has proven to be a significant computer vision method in extracting localized class information and has multiple applications in the industry. Although many of the state-of-the-art (SOTA) OD models perform well on medium and large sized objects, they seem to under perform on small objects. In most of the industrial use cases, it is difficult to collect and annotate data for small objects, as it is time-consuming and prone to human errors. Additionally, those datasets are likely to be unbalanced and often result in an inefficient model convergence. To tackle this challenge, this study presents a novel approach that injects additional data points to improve the performance of the OD models. Using synthetic data generation, the difficulties in data collection and annotations for small object data points can be minimized and to create a dataset with balanced distribution. This paper discusses the effects of a simple proportional class-balancing technique, to enable better anchor matching of the OD models. A comparison was carried out on the performances of the SOTA OD models: YOLOv5, YOLOv7 and SSD, for combinations of real and synthetic datasets within an industrial use case.
△ Less
Submitted 29 January, 2024; v1 submitted 23 January, 2024;
originally announced January 2024.
-
Toward Fault Detection in Industrial Welding Processes with Deep Learning and Data Augmentation
Authors:
Jibinraj Antony,
Dr. Florian Schlather,
Georgij Safronov,
Markus Schmitz,
Prof. Dr. Kristof Van Laerhoven
Abstract:
With the rise of deep learning models in the field of computer vision, new possibilities for their application in industrial processes proves to return great benefits. Nevertheless, the actual fit of machine learning for highly standardised industrial processes is still under debate. This paper addresses the challenges on the industrial realization of the AI tools, considering the use case of Lase…
▽ More
With the rise of deep learning models in the field of computer vision, new possibilities for their application in industrial processes proves to return great benefits. Nevertheless, the actual fit of machine learning for highly standardised industrial processes is still under debate. This paper addresses the challenges on the industrial realization of the AI tools, considering the use case of Laser Beam Welding quality control as an example. We use object detection algorithms from the TensorFlow object detection API and adapt them to our use case using transfer learning. The baseline models we develop are used as benchmarks and evaluated and compared to models that undergo dataset scaling and hyperparameter tuning. We find that moderate scaling of the dataset via image augmentation leads to improvements in intersection over union (IoU) and recall, whereas high levels of augmentation and scaling may lead to deterioration of results. Finally, we put our results into perspective of the underlying use case and evaluate their fit.
△ Less
Submitted 18 June, 2021;
originally announced June 2021.
-
How important are faces for person re-identification?
Authors:
Julia Dietlmeier,
Joseph Antony,
Kevin McGuinness,
Noel E. O'Connor
Abstract:
This paper investigates the dependence of existing state-of-the-art person re-identification models on the presence and visibility of human faces. We apply a face detection and blurring algorithm to create anonymized versions of several popular person re-identification datasets including Market1501, DukeMTMC-reID, CUHK03, Viper, and Airport. Using a cross-section of existing state-of-the-art model…
▽ More
This paper investigates the dependence of existing state-of-the-art person re-identification models on the presence and visibility of human faces. We apply a face detection and blurring algorithm to create anonymized versions of several popular person re-identification datasets including Market1501, DukeMTMC-reID, CUHK03, Viper, and Airport. Using a cross-section of existing state-of-the-art models that range in accuracy and computational efficiency, we evaluate the effect of this anonymization on re-identification performance using standard metrics. Perhaps surprisingly, the effect on mAP is very small, and accuracy is recovered by simply training on the anonymized versions of the data rather than the original data. These findings are consistent across multiple models and datasets. These results indicate that datasets can be safely anonymized by blurring faces without significantly impacting the performance of person reidentification systems, and may allow for the release of new richer re-identification datasets where previously there were privacy or data protection concerns.
△ Less
Submitted 13 October, 2020;
originally announced October 2020.
-
Dueling Deep Q-Network for Unsupervised Inter-frame Eye Movement Correction in Optical Coherence Tomography Volumes
Authors:
Yasmeen M. George,
Suman Sedai,
Bhavna J. Antony,
Hiroshi Ishikawa,
Gadi Wollstein,
Joel S. Schuman,
Rahil Garnavi
Abstract:
In optical coherence tomography (OCT) volumes of retina, the sequential acquisition of the individual slices makes this modality prone to motion artifacts, misalignments between adjacent slices being the most noticeable. Any distortion in OCT volumes can bias structural analysis and influence the outcome of longitudinal studies. On the other hand, presence of speckle noise that is characteristic o…
▽ More
In optical coherence tomography (OCT) volumes of retina, the sequential acquisition of the individual slices makes this modality prone to motion artifacts, misalignments between adjacent slices being the most noticeable. Any distortion in OCT volumes can bias structural analysis and influence the outcome of longitudinal studies. On the other hand, presence of speckle noise that is characteristic of this imaging modality, leads to inaccuracies when traditional registration techniques are employed. Also, the lack of a well-defined ground truth makes supervised deep-learning techniques ill-posed to tackle the problem. In this paper, we tackle these issues by using deep reinforcement learning to correct inter-frame movements in an unsupervised manner. Specifically, we use dueling deep Q-network to train an artificial agent to find the optimal policy, i.e. a sequence of actions, that best improves the alignment by maximizing the sum of reward signals. Instead of relying on the ground-truth of transformation parameters to guide the rewarding system, for the first time, we use a combination of intensity based image similarity metrics. Further, to avoid the agent bias towards speckle noise, we ensure the agent can see retinal layers as part of the interacting environment. For quantitative evaluation, we simulate the eye movement artifacts by applying 2D rigid transformations on individual B-scans. The proposed model achieves an average of 0.985 and 0.914 for normalized mutual information and correlation coefficient, respectively. We also compare our model with elastix intensity based medical image registration approach, where significant improvement is achieved by our model for both noisy and denoised volumes.
△ Less
Submitted 3 July, 2020;
originally announced July 2020.
-
Investigating Class-level Difficulty Factors in Multi-label Classification Problems
Authors:
Mark Marsden,
Kevin McGuinness,
Joseph Antony,
Haolin Wei,
Milan Redzic,
Jian Tang,
Zhilan Hu,
Alan Smeaton,
Noel E O'Connor
Abstract:
This work investigates the use of class-level difficulty factors in multi-label classification problems for the first time. Four class-level difficulty factors are proposed: frequency, visual variation, semantic abstraction, and class co-occurrence. Once computed for a given multi-label classification dataset, these difficulty factors are shown to have several potential applications including the…
▽ More
This work investigates the use of class-level difficulty factors in multi-label classification problems for the first time. Four class-level difficulty factors are proposed: frequency, visual variation, semantic abstraction, and class co-occurrence. Once computed for a given multi-label classification dataset, these difficulty factors are shown to have several potential applications including the prediction of class-level performance across datasets and the improvement of predictive performance through difficulty weighted optimisation. Significant improvements to mAP and AUC performance are observed for two challenging multi-label datasets (WWW Crowd and Visual Genome) with the inclusion of difficulty weighted optimisation. The proposed technique does not require any additional computational complexity during training or inference and can be extended over time with inclusion of other class-level difficulty factors.
△ Less
Submitted 1 May, 2020;
originally announced May 2020.
-
Dynamic Approach for Lane Detection using Google Street View and CNN
Authors:
Rama Sai Mamidala,
Uday Uthkota,
Mahamkali Bhavani Shankar,
A. Joseph Antony,
A. V. Narasimhadhan
Abstract:
Lane detection algorithms have been the key enablers for a fully-assistive and autonomous navigation systems. In this paper, a novel and pragmatic approach for lane detection is proposed using a convolutional neural network (CNN) model based on SegNet encoder-decoder architecture. The encoder block renders low-resolution feature maps of the input and the decoder block provides pixel-wise classific…
▽ More
Lane detection algorithms have been the key enablers for a fully-assistive and autonomous navigation systems. In this paper, a novel and pragmatic approach for lane detection is proposed using a convolutional neural network (CNN) model based on SegNet encoder-decoder architecture. The encoder block renders low-resolution feature maps of the input and the decoder block provides pixel-wise classification from the feature maps. The proposed model has been trained over 2000 image data-set and tested against their corresponding ground-truth provided in the data-set for evaluation. To enable real-time navigation, we extend our model's predictions interfacing it with the existing Google APIs evaluating the metrics of the model tuning the hyper-parameters. The novelty of this approach lies in the integration of existing segNet architecture with google APIs. This interface makes it handy for assistive robotic systems. The observed results show that the proposed method is robust under challenging occlusion conditions due to pre-processing involved and gives superior performance when compared to the existing methods.
△ Less
Submitted 2 September, 2019;
originally announced September 2019.
-
Predicting knee osteoarthritis severity: comparative modeling based on patient's data and plain X-ray images
Authors:
Jaynal Abedin,
Joseph Antony,
Kevin McGuinness,
Kieran Moran,
Noel E O'Connor,
Dietrich Rebholz-Schuhmann,
John Newell
Abstract:
Knee osteoarthritis (KOA) is a disease that impairs knee function and causes pain. A radiologist reviews knee X-ray images and grades the severity level of the impairments according to the Kellgren and Lawrence grading scheme; a five-point ordinal scale (0--4). In this study, we used Elastic Net (EN) and Random Forests (RF) to build predictive models using patient assessment data (i.e. signs and s…
▽ More
Knee osteoarthritis (KOA) is a disease that impairs knee function and causes pain. A radiologist reviews knee X-ray images and grades the severity level of the impairments according to the Kellgren and Lawrence grading scheme; a five-point ordinal scale (0--4). In this study, we used Elastic Net (EN) and Random Forests (RF) to build predictive models using patient assessment data (i.e. signs and symptoms of both knees and medication use) and a convolution neural network (CNN) trained using X-ray images only. Linear mixed effect models (LMM) were used to model the within subject correlation between the two knees. The root mean squared error for the CNN, EN, and RF models was 0.77, 0.97, and 0.94 respectively. The LMM shows similar overall prediction accuracy as the EN regression but correctly accounted for the hierarchical structure of the data resulting in more reliable inference. Useful explanatory variables were identified that could be used for patient monitoring before X-ray imaging. Our analyses suggest that the models trained for predicting the KOA severity levels achieve comparable results when modeling X-ray images and patient data. The subjectivity in the KL grade is still a primary concern.
△ Less
Submitted 23 August, 2019;
originally announced August 2019.
-
Assessing Knee OA Severity with CNN attention-based end-to-end architectures
Authors:
Marc Górriz,
Joseph Antony,
Kevin McGuinness,
Xavier Giró-i-Nieto,
Noel E. O'Connor
Abstract:
This work proposes a novel end-to-end convolutional neural network (CNN) architecture to automatically quantify the severity of knee osteoarthritis (OA) using X-Ray images, which incorporates trainable attention modules acting as unsupervised fine-grained detectors of the region of interest (ROI). The proposed attention modules can be applied at different levels and scales across any CNN pipeline…
▽ More
This work proposes a novel end-to-end convolutional neural network (CNN) architecture to automatically quantify the severity of knee osteoarthritis (OA) using X-Ray images, which incorporates trainable attention modules acting as unsupervised fine-grained detectors of the region of interest (ROI). The proposed attention modules can be applied at different levels and scales across any CNN pipeline hel** the network to learn relevant attention patterns over the most informative parts of the image at different resolutions. We test the proposed attention mechanism on existing state-of-the-art CNN architectures as our base models, achieving promising results on the benchmark knee OA datasets from the osteoarthritis initiative (OAI) and multicenter osteoarthritis study (MOST). All code from our experiments will be publicly available on the github repository: https://github.com/marc-gorriz/KneeOA-CNNAttention
△ Less
Submitted 23 August, 2019;
originally announced August 2019.
-
Feature Learning to Automatically Assess Radiographic Knee Osteoarthritis Severity
Authors:
Joseph Antony,
Kevin McGuinness,
Kieran Moran,
Noel E O' Connor
Abstract:
This chapter presents the investigations and the results of feature learning using convolutional neural networks to automatically assess knee osteoarthritis (OA) severity and the associated clinical and diagnostic features of knee OA from X-ray images. Also, this chapter demonstrates that feature learning in a supervised manner is more effective than using conventional handcrafted features for aut…
▽ More
This chapter presents the investigations and the results of feature learning using convolutional neural networks to automatically assess knee osteoarthritis (OA) severity and the associated clinical and diagnostic features of knee OA from X-ray images. Also, this chapter demonstrates that feature learning in a supervised manner is more effective than using conventional handcrafted features for automatic detection of knee joints and fine-grained knee OA image classification. In the general machine learning approach to automatically assess knee OA severity, the first step is to localize the region of interest that is to detect and extract the knee joint regions from the radiographs, and the next step is to classify the localized knee joints based on a radiographic classification scheme such as Kellgren and Lawrence grades. First, the existing approaches for detecting (or localizing) the knee joint regions based on handcrafted features are reviewed and outlined. Next, three new approaches are introduced: 1) to automatically detect the knee joint region using a fully convolutional network, 2) to automatically assess the radiographic knee OA using CNNs trained from scratch for classification and regression of knee joint images to predict KL grades in ordinal and continuous scales, and 3) to quantify the knee OA severity optimizing a weighted ratio of two loss functions: categorical cross entropy and mean-squared error using multi-objective convolutional learning and ordinal regression. Two public datasets: the OAI and the MOST are used to evaluate the approaches with promising results that outperform existing approaches. In summary, this work primarily contributes to the field of automated methods for localization (automatic detection) and quantification (image classification) of radiographic knee OA.
△ Less
Submitted 23 August, 2019;
originally announced August 2019.
-
Automatic Detection of Knee Joints and Quantification of Knee Osteoarthritis Severity using Convolutional Neural Networks
Authors:
Joseph Antony,
Kevin McGuinness,
Kieran Moran,
Noel E O'Connor
Abstract:
This paper introduces a new approach to automatically quantify the severity of knee OA using X-ray images. Automatically quantifying knee OA severity involves two steps: first, automatically localizing the knee joints; next, classifying the localized knee joint images. We introduce a new approach to automatically detect the knee joints using a fully convolutional neural network (FCN). We train con…
▽ More
This paper introduces a new approach to automatically quantify the severity of knee OA using X-ray images. Automatically quantifying knee OA severity involves two steps: first, automatically localizing the knee joints; next, classifying the localized knee joint images. We introduce a new approach to automatically detect the knee joints using a fully convolutional neural network (FCN). We train convolutional neural networks (CNN) from scratch to automatically quantify the knee OA severity optimizing a weighted ratio of two loss functions: categorical cross-entropy and mean-squared loss. This joint training further improves the overall quantification of knee OA severity, with the added benefit of naturally producing simultaneous multi-class classification and regression outputs. Two public datasets are used to evaluate our approach, the Osteoarthritis Initiative (OAI) and the Multicenter Osteoarthritis Study (MOST), with extremely promising results that outperform existing approaches.
△ Less
Submitted 28 March, 2017;
originally announced March 2017.
-
An Interactive Segmentation Tool for Quantifying Fat in Lumbar Muscles using Axial Lumbar-Spine MRI
Authors:
Joseph Antony,
Kevin McGuinness,
Neil Welch,
Joe Coyle,
Andy Franklyn-Miller,
Noel E. O'Connor,
Kieran Moran
Abstract:
In this paper we present an interactive tool that can be used to quantify fat infiltration in lumbar muscles, which is useful in studying fat infiltration and lower back pain (LBP) in adults. Currently, a qualitative assessment by visual grading via a 5-point scale is used to study fat infiltration in lumbar muscles from an axial view of lumbar-spine MR Images. However, a quantitative approach (on…
▽ More
In this paper we present an interactive tool that can be used to quantify fat infiltration in lumbar muscles, which is useful in studying fat infiltration and lower back pain (LBP) in adults. Currently, a qualitative assessment by visual grading via a 5-point scale is used to study fat infiltration in lumbar muscles from an axial view of lumbar-spine MR Images. However, a quantitative approach (on a continuous scale of 0-100\%) may provide a greater insight. In this paper, we propose a method to precisely quantify the fat deposition / infiltration in a user-defined region of the lumbar muscles, which may aid better diagnosis and analysis. The key steps are interactively segmenting the region of interest (ROI) from the lumbar muscles using the well known livewire technique, identifying fatty regions in the segmented region based on variable-selection of threshold and softness levels, automatically detecting the center of the spinal column and fragmenting the lumbar muscles into smaller regions with reference to the center of the spinal column, computing key parameters [such as total and region-wise fat content percentage, total-cross sectional area (TCSA) and functional cross-sectional area (FCSA)] and exporting the computations and associated patient information from the MRI, into a database. A standalone application using MATLAB R2014a was developed to perform the required computations along with an intuitive graphical user interface (GUI).
△ Less
Submitted 9 September, 2016;
originally announced September 2016.
-
Quantifying Radiographic Knee Osteoarthritis Severity using Deep Convolutional Neural Networks
Authors:
Joseph Antony,
Kevin McGuinness,
Noel E O Connor,
Kieran Moran
Abstract:
This paper proposes a new approach to automatically quantify the severity of knee osteoarthritis (OA) from radiographs using deep convolutional neural networks (CNN). Clinically, knee OA severity is assessed using Kellgren \& Lawrence (KL) grades, a five point scale. Previous work on automatically predicting KL grades from radiograph images were based on training shallow classifiers using a variet…
▽ More
This paper proposes a new approach to automatically quantify the severity of knee osteoarthritis (OA) from radiographs using deep convolutional neural networks (CNN). Clinically, knee OA severity is assessed using Kellgren \& Lawrence (KL) grades, a five point scale. Previous work on automatically predicting KL grades from radiograph images were based on training shallow classifiers using a variety of hand engineered features. We demonstrate that classification accuracy can be significantly improved using deep convolutional neural network models pre-trained on ImageNet and fine-tuned on knee OA images. Furthermore, we argue that it is more appropriate to assess the accuracy of automatic knee OA severity predictions using a continuous distance-based evaluation metric like mean squared error than it is to use classification accuracy. This leads to the formulation of the prediction of KL grades as a regression problem and further improves accuracy. Results on a dataset of X-ray images and KL grades from the Osteoarthritis Initiative (OAI) show a sizable improvement over the current state-of-the-art.
△ Less
Submitted 8 September, 2016;
originally announced September 2016.