Search | arXiv e-print repository

Image-level Regression for Uncertainty-aware Retinal Image Segmentation

Authors: Trung Dang, Huy Hoang Nguyen, Aleksei Tiulpin

Abstract: Accurate retinal vessel segmentation is a crucial step in the quantitative assessment of retinal vasculature, which is needed for the early detection of retinal diseases and other conditions. Numerous studies have been conducted to tackle the problem of segmenting vessels automatically using a pixel-wise classification approach. The common practice of creating ground truth labels is to categorize pixels as foreground and background. This approach is, however, biased, and it ignores the uncertainty of a human annotator when it comes to annotating e.g. thin vessels. In this work, we propose a simple and effective method that casts the retinal image segmentation task as an image-level regression. For this purpose, we first introduce a novel Segmentation Annotation Uncertainty-Aware (SAUNA) transform, which adds pixel uncertainty to the ground truth using the pixel's closeness to the annotation boundary and vessel thickness. To train our model with soft labels, we generalize the earlier proposed Jaccard metric loss to arbitrary hypercubes, which is a second contribution of this work. The proposed SAUNA transform and the new theoretical results allow us to directly train a standard U-Net-like architecture at the image level, outperforming all recently published methods. We conduct thorough experiments and compare our method to a diverse set of baselines across 5 retinal image datasets. Our implementation is available at \url{https://github.com/Oulu-IMEDS/SAUNA}.

Submitted 27 May, 2024; originally announced May 2024.

Comments: 13 pages

arXiv:2405.16813 [pdf, other]

SiNGR: Brain Tumor Segmentation via Signed Normalized Geodesic Transform Regression

Authors: Trung Dang, Huy Hoang Nguyen, Aleksei Tiulpin

Abstract: One of the primary challenges in brain tumor segmentation arises from the uncertainty of voxels close to tumor boundaries. However, the conventional process of generating ground truth segmentation masks fails to treat such uncertainties properly. Those ``hard labels'' with 0s and 1s conceptually influenced the majority of prior studies on brain image segmentation. As a result, tumor segmentation is often solved through voxel classification. In this work, we instead view this problem as a voxel-level regression, where the ground truth represents a certainty mapping from any pixel based on the distance to tumor border. We propose a novel ground truth label transformation, which is based on a signed geodesic transform, to capture the uncertainty in brain tumors' vicinity, while maintaining a margin between positive and negative samples. We combine this idea with a Focal-like regression L1-loss that enables effective regression learning in high-dimensional output space by appropriately weighting voxels according to their difficulty. We thoroughly conduct an experimental evaluation to validate the components of our proposed method, compare it to a diverse array of state-of-the-art segmentation models, and show that it is architecture-agnostic. The code of our method is made publicly available (\url{https://github.com/Oulu-IMEDS/SiNGR/}).

Submitted 27 May, 2024; originally announced May 2024.

Comments: Accepted as a conference paper at MICCAI 2024

arXiv:2312.08589 [pdf, other]

Consistent and Asymptotically Unbiased Estimation of Proper Calibration Errors

Authors: Teodora Popordanoska, Sebastian G. Gruber, Aleksei Tiulpin, Florian Buettner, Matthew B. Blaschko

Abstract: Proper scoring rules evaluate the quality of probabilistic predictions, playing an essential role in the pursuit of accurate and well-calibrated models. Every proper score decomposes into two fundamental components -- proper calibration error and refinement -- utilizing a Bregman divergence. While uncertainty calibration has gained significant attention, current literature lacks a general estimator for these quantities with known statistical properties. To address this gap, we propose a method that allows consistent, and asymptotically unbiased estimation of all proper calibration errors and refinement terms. In particular, we introduce Kullback--Leibler calibration error, induced by the commonly used cross-entropy loss. As part of our results, we prove the relation between refinement and f-divergences, which implies information monotonicity in neural networks, regardless of which proper scoring rule is optimized. Our experiments validate empirically the claimed properties of the proposed estimator and suggest that the selection of a post-hoc calibration method should be determined by the particular calibration error of interest.

Submitted 13 December, 2023; originally announced December 2023.

Comments: Preprint

arXiv:2312.06645 [pdf, other]

Beyond Classification: Definition and Density-based Estimation of Calibration in Object Detection

Authors: Teodora Popordanoska, Aleksei Tiulpin, Matthew B. Blaschko

Abstract: Despite their impressive predictive performance in various computer vision tasks, deep neural networks (DNNs) tend to make overly confident predictions, which hinders their widespread use in safety-critical applications. While there have been recent attempts to calibrate DNNs, most of these efforts have primarily been focused on classification tasks, thus neglecting DNN-based object detectors. Although several recent works addressed calibration for object detection and proposed differentiable penalties, none of them are consistent estimators of established concepts in calibration. In this work, we tackle the challenge of defining and estimating calibration error specifically for this task. In particular, we adapt the definition of classification calibration error to handle the nuances associated with object detection, and predictions in structured output spaces more generally. Furthermore, we propose a consistent and differentiable estimator of the detection calibration error, utilizing kernel density estimation. Our experiments demonstrate the effectiveness of our estimator against competing train-time and post-hoc calibration methods, while maintaining similar detection performance.

Submitted 11 December, 2023; originally announced December 2023.

Comments: To appear at WACV 2024

arXiv:2307.00873 [pdf, other]

End-To-End Prediction of Knee Osteoarthritis Progression With Multi-Modal Transformers

Authors: Egor Panfilov, Simo Saarakkala, Miika T. Nieminen, Aleksei Tiulpin

Abstract: Knee Osteoarthritis (KOA) is a highly prevalent chronic musculoskeletal condition with no currently available treatment. The manifestation of KOA is heterogeneous and prediction of its progression is challenging. Current literature suggests that the use of multi-modal data and advanced modeling methods, such as the ones based on Deep Learning, has promise in tackling this challenge. To date, however, the evidence on the efficacy of this approach is limited. In this study, we leveraged recent advances in Deep Learning and, using a Transformer approach, developed a unified framework for the multi-modal fusion of knee imaging data. Subsequently, we analyzed its performance across a range of scenarios by investigating multiple progression horizons -- from short-term to long-term. We report our findings using a large cohort (n=2421-3967) derived from the Osteoarthritis Initiative dataset. We show that structural knee MRI allows identifying radiographic KOA progressors on par with multi-modal fusion approaches, achieving an area under the ROC curve (ROC AUC) of 0.70-0.76 and Average Precision (AP) of 0.15-0.54 in 2-8 year horizons. Progression within 1 year was better predicted with a multi-modal method using X-ray, structural, and compositional MR images -- ROC AUC of 0.76(0.04), AP of 0.13(0.04) -- or via clinical data. Our follow-up analysis generally shows that prediction from the imaging data is more accurate for post-traumatic subjects, and we further investigate which subject subgroups may benefit the most.
The present study provides novel insights into multi-modal imaging of KOA and brings a unified data-driven framework for studying its progression in an end-to-end manner, providing new tools for the design of more efficient clinical trials. The source code of our framework and the pre-trained models are made publicly available.

Submitted 3 July, 2023; originally announced July 2023.

arXiv:2302.01790 [pdf, other]

doi 10.1038/s41592-023-02150-0

Understanding metric-related pitfalls in image analysis validation

Authors: Annika Reinke, Minu D. Tizabi, Michael Baumgartner, Matthias Eisenmann, Doreen Heckmann-Nötzel, A. Emre Kavur, Tim Rädsch, Carole H. Sudre, Laura Acion, Michela Antonelli, Tal Arbel, Spyridon Bakas, Arriel Benis, Matthew Blaschko, Florian Buettner, M. Jorge Cardoso, Veronika Cheplygina, Jianxu Chen, Evangelia Christodoulou, Beth A. Cimini, Gary S. Collins, Keyvan Farahani, Luciana Ferrer, Adrian Galdran, Bram van Ginneken , et al. (53 additional authors not shown)

Abstract: Validation metrics are key for the reliable tracking of scientific progress and for bridging the current chasm between artificial intelligence (AI) research and its translation into practice. However, increasing evidence shows that particularly in image analysis, metrics are often chosen inadequately in relation to the underlying research problem. This could be attributed to a lack of accessibility of metric-related knowledge: While taking into account the individual strengths, weaknesses, and limitations of validation metrics is a critical prerequisite to making educated choices, the relevant knowledge is currently scattered and poorly accessible to individual researchers. Based on a multi-stage Delphi process conducted by a multidisciplinary expert consortium as well as extensive community feedback, the present work provides the first reliable and comprehensive common point of access to information on pitfalls related to validation metrics in image analysis. Focusing on biomedical image analysis but with the potential of transfer to other fields, the addressed pitfalls generalize across application domains and are categorized according to a newly created, domain-agnostic taxonomy. To facilitate comprehension, illustrations and specific examples accompany each pitfall. As a structured body of information accessible to researchers of all levels of expertise, this work enhances global comprehension of a key topic in image analysis validation.

Submitted 23 February, 2024; v1 submitted 3 February, 2023; originally announced February 2023.

Comments: Shared first authors: Annika Reinke and Minu D. Tizabi; shared senior authors: Lena Maier-Hein and Paul F. Jäger. Published in Nature Methods. arXiv admin note: text overlap with arXiv:2206.01653

Journal ref: Nature methods, 1-13 (2024)

arXiv:2210.14597 [pdf, other]

A Stronger Baseline For Automatic Pfirrmann Grading Of Lumbar Spine MRI Using Deep Learning

Authors: Narasimharao Kowlagi, Huy Hoang Nguyen, Terence McSweeney, Simo Saarakkala, Juhani Määttä, Jaro Karppinen, Aleksei Tiulpin

Abstract: This paper addresses the challenge of grading visual features in lumbar spine MRI using Deep Learning. Such a method is essential for the automatic quantification of structural changes in the spine, which is valuable for understanding low back pain. Multiple recent studies investigated different architecture designs, and the most recent success has been attributed to the use of transformer architectures. In this work, we argue that with a well-tuned three-stage pipeline comprising semantic segmentation, localization, and classification, convolutional networks outperform the state-of-the-art approaches. We conducted an ablation study of the existing methods in a population cohort, and report performance generalization across various subgroups. Our code is publicly available to advance research on disc degeneration and low back pain.

Submitted 26 October, 2022; originally announced October 2022.

Comments: 5 pages, under review

arXiv:2210.13889 [pdf, other]

doi 10.1109/TMI.2023.3312524

Clinically-Inspired Multi-Agent Transformers for Disease Trajectory Forecasting from Multimodal Data

Authors: Huy Hoang Nguyen, Matthew B. Blaschko, Simo Saarakkala, Aleksei Tiulpin

Abstract: Deep neural networks are often applied to medical images to automate the problem of medical diagnosis. However, a more clinically relevant question that practitioners usually face is how to predict the future trajectory of a disease. Current methods for prognosis or disease trajectory forecasting often require domain knowledge and are complicated to apply. In this paper, we formulate the prognosis prediction problem as a one-to-many prediction problem. Inspired by a clinical decision-making process with two agents -- a radiologist and a general practitioner -- we predict prognosis with two transformer-based components that share information with each other. The first transformer in this framework aims to analyze the imaging data, and the second one leverages its internal states as inputs, also fusing them with auxiliary clinical data. The temporal nature of the problem is modeled within the transformer states, allowing us to treat the forecasting problem as a multi-task classification, for which we propose a novel loss. We show the effectiveness of our approach in predicting the development of structural knee osteoarthritis changes and forecasting Alzheimer's disease clinical status directly from raw multi-modal data. The proposed method outperforms multiple state-of-the-art baselines with respect to performance and calibration, both of which are needed for real-world applications. An open-source implementation of our method is made publicly available at \url{https://github.com/Oulu-IMEDS/CLIMATv2}.

Submitted 19 September, 2023; v1 submitted 25 October, 2022; originally announced October 2022.

Comments: Accepted in IEEE Transactions on Medical Imaging 2023

arXiv:2208.11977 [pdf, other]

On confidence intervals for precision matrices and the eigendecomposition of covariance matrices

Authors: Teodora Popordanoska, Aleksei Tiulpin, Wacha Bounliphone, Matthew B. Blaschko

Abstract: The eigendecomposition of a matrix is the central procedure in probabilistic models based on matrix factorization, for instance principal component analysis and topic models. Quantifying the uncertainty of such a decomposition based on a finite sample estimate is essential to reasoning under uncertainty when employing such models. This paper tackles the challenge of computing confidence bounds on the individual entries of eigenvectors of a covariance matrix of fixed dimension. Moreover, we derive a method to bound the entries of the inverse covariance matrix, the so-called precision matrix. The assumptions behind our method are minimal and require that the covariance matrix exists, and its empirical estimator converges to the true covariance. We make use of the theory of U-statistics to bound the $L_2$ perturbation of the empirical covariance matrix. From this result, we obtain bounds on the eigenvectors using Weyl's theorem and the eigenvalue-eigenvector identity and we derive confidence intervals on the entries of the precision matrix using matrix inversion perturbation bounds. As an application of these results, we demonstrate a new statistical test, which allows us to test for non-zero values of the precision matrix. We compare this test to the well-known Fisher-z test for partial correlations, and demonstrate the soundness and scalability of the proposed statistical test, as well as its application to real-world data from medical and physics domains.

Submitted 25 August, 2022; originally announced August 2022.

Comments: arXiv admin note: text overlap with arXiv:1604.01733

arXiv:2206.01653 [pdf, other]

doi 10.1038/s41592-023-02151-z

Metrics reloaded: Recommendations for image analysis validation

Authors: Lena Maier-Hein, Annika Reinke, Patrick Godau, Minu D. Tizabi, Florian Buettner, Evangelia Christodoulou, Ben Glocker, Fabian Isensee, Jens Kleesiek, Michal Kozubek, Mauricio Reyes, Michael A. Riegler, Manuel Wiesenfarth, A. Emre Kavur, Carole H. Sudre, Michael Baumgartner, Matthias Eisenmann, Doreen Heckmann-Nötzel, Tim Rädsch, Laura Acion, Michela Antonelli, Tal Arbel, Spyridon Bakas, Arriel Benis, Matthew Blaschko , et al. (49 additional authors not shown)

Abstract: Increasing evidence shows that flaws in machine learning (ML) algorithm validation are an underestimated global problem. Particularly in automatic biomedical image analysis, chosen performance metrics often do not reflect the domain interest, thus failing to adequately measure scientific progress and hindering translation of ML techniques into practice. To overcome this, our large international expert consortium created Metrics Reloaded, a comprehensive framework guiding researchers in the problem-aware selection of metrics. Following the convergence of ML methodology across application domains, Metrics Reloaded fosters the convergence of validation methodology. The framework was developed in a multi-stage Delphi process and is based on the novel concept of a problem fingerprint - a structured representation of the given problem that captures all aspects that are relevant for metric selection, from the domain interest to the properties of the target structure(s), data set and algorithm output. Based on the problem fingerprint, users are guided through the process of choosing and applying appropriate validation metrics while being made aware of potential pitfalls. Metrics Reloaded targets image analysis problems that can be interpreted as a classification task at image, object or pixel level, namely image-level classification, object detection, semantic segmentation, and instance segmentation tasks.
To improve the user experience, we implemented the framework in the Metrics Reloaded online tool, which also provides a point of access to explore weaknesses, strengths and specific recommendations for the most common validation metrics. The broad applicability of our framework across domains is demonstrated by an instantiation for various biological and medical image analysis use cases.

Submitted 23 February, 2024; v1 submitted 3 June, 2022; originally announced June 2022.

Comments: Shared first authors: Lena Maier-Hein, Annika Reinke. arXiv admin note: substantial text overlap with arXiv:2104.05642. Published in Nature Methods

Journal ref: Nature methods, 1-18 (2024)

arXiv:2205.02849 [pdf, other]

AdaTriplet: Adaptive Gradient Triplet Loss with Automatic Margin Learning for Forensic Medical Image Matching

Authors: Khanh Nguyen, Huy Hoang Nguyen, Aleksei Tiulpin

Abstract: This paper tackles the challenge of forensic medical image matching (FMIM) using deep neural networks (DNNs). FMIM is a particular case of content-based image retrieval (CBIR). The main challenge in FMIM, compared to the general case of CBIR, is that the subject to whom a query image belongs may be affected by aging and progressive degenerative disorders, making it difficult to match data on a subject level. CBIR with DNNs is generally solved by minimizing a ranking loss, such as Triplet loss (TL), computed on image representations extracted by a DNN from the original data. TL, in particular, operates on triplets: anchor, positive (similar to anchor) and negative (dissimilar to anchor). Although TL has been shown to perform well in many CBIR tasks, it still has limitations, which we identify and analyze in this work. In this paper, we introduce (i) the AdaTriplet loss -- an extension of TL whose gradients adapt to different difficulty levels of negative samples, and (ii) the AutoMargin method -- a technique to adjust hyperparameters of margin-based losses such as TL and our proposed loss dynamically. Our results are evaluated on two large-scale benchmarks for FMIM based on the Osteoarthritis Initiative and Chest X-ray-14 datasets. The code allowing replication of this study has been made publicly available at \url{https://github.com/Oulu-IMEDS/AdaTriplet}.

Submitted 10 May, 2022; v1 submitted 5 May, 2022; originally announced May 2022.

Comments: 15 pages, 6 figures, accepted as a conference paper at MICCAI 2022
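
For orientation, the plain margin-based triplet loss that the AdaTriplet abstract above takes as its starting point can be sketched as follows. This is a minimal illustration of the standard triplet loss only, not the AdaTriplet loss or the AutoMargin method from the paper; the function names, the choice of Euclidean distance, and the default margin of 1.0 are assumptions made for this example.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Standard triplet loss: max(0, d(a, p) - d(a, n) + margin).

    The loss is zero once the negative is farther from the anchor
    than the positive by at least `margin` (an assumed hyperparameter).
    """
    d_pos = euclidean(anchor, positive)
    d_neg = euclidean(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)


# Easy triplet: the negative is already far away, so the loss vanishes.
easy = triplet_loss([0.0, 0.0], [0.0, 1.0], [3.0, 4.0])

# Hard triplet: the "negative" is closer than the positive, so the loss is positive.
hard = triplet_loss([0.0, 0.0], [3.0, 4.0], [0.0, 1.0])
```

In practice the vectors would be embeddings produced by a DNN, and the loss would be averaged over a batch of triplets.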

arXiv:2201.10849 [pdf, other]

Predicting Knee Osteoarthritis Progression from Structural MRI using Deep Learning

Authors: Egor Panfilov, Simo Saarakkala, Miika T. Nieminen, Aleksei Tiulpin

Abstract: Accurate prediction of knee osteoarthritis (KOA) progression from structural MRI has the potential to enhance disease understanding and support clinical trials. Prior art focused on manually designed imaging biomarkers, which may not fully exploit all disease-related information present in an MRI scan. In contrast, our method learns relevant representations from raw data end-to-end using Deep Learning, and uses them for progression prediction. The method employs a 2D CNN to process the data slice-wise and aggregate the extracted features using a Transformer. Evaluated on a large cohort (n=4,866), the proposed method outperforms conventional 2D and 3D CNN-based models and achieves average precision of $0.58\pm0.03$ and ROC AUC of $0.78\pm0.01$. This paper sets a baseline on end-to-end KOA progression prediction from structural MRI. Our code is publicly available at https://github.com/MIPT-Oulu/OAProgressionMR.

Submitted 26 January, 2022; originally announced January 2022.

Comments: © 2022 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works

arXiv:2106.04193 [pdf, other]

Targeted Active Learning for Bayesian Decision-Making

Authors: Louis Filstroff, Iiris Sundin, Petrus Mikkola, Aleksei Tiulpin, Juuso Kylmäoja, Samuel Kaski

Abstract: Active learning is usually applied to acquire labels of informative data points in supervised learning, to maximize accuracy in a sample-efficient way. However, maximizing the accuracy is not the end goal when the results are used for decision-making, for example in personalized medicine or economics. We argue that when acquiring samples sequentially, separating learning and decision-making is sub-optimal, and we introduce an active learning strategy which takes the down-the-line decision problem into account. Specifically, we introduce a novel active learning criterion which maximizes the expected information gain on the posterior distribution of the optimal decision. We compare our targeted active learning strategy to existing alternatives on both simulated and real data, and show improved performance in decision-making accuracy.

Submitted 20 October, 2021; v1 submitted 8 June, 2021; originally announced June 2021.

arXiv:2105.14275 [pdf, other]

Greedy Bayesian Posterior Approximation with Deep Ensembles

Authors: Aleksei Tiulpin, Matthew B. Blaschko

Abstract: Ensembles of independently trained neural networks are a state-of-the-art approach to estimate predictive uncertainty in Deep Learning, and can be interpreted as an approximation of the posterior distribution via a mixture of delta functions. The training of ensembles relies on non-convexity of the loss landscape and random initialization of their individual members, making the resulting posterior approximation uncontrolled. This paper proposes a novel and principled method to tackle this limitation, minimizing an $f$-divergence between the true posterior and a kernel density estimator (KDE) in a function space. We analyze this objective from a combinatorial point of view, and show that it is submodular with respect to mixture components for any $f$. Subsequently, we consider the problem of greedy ensemble construction. From the marginal gain on the negative $f$-divergence, which quantifies an improvement in posterior approximation yielded by adding a new component into the KDE, we derive a novel diversity term for ensemble methods. The performance of our approach is demonstrated on computer vision out-of-distribution detection benchmarks in a range of architectures trained on multiple datasets. The source code of our method is made publicly available at https://github.com/Oulu-IMEDS/greedy_ensembles_training.

Submitted 8 July, 2022; v1 submitted 29 May, 2021; originally announced May 2021.

Comments: Published in the Transactions on Machine Learning Research: https://openreview.net/forum?id=P1DuPJzVTN

Journal ref: Transactions on Machine Learning Research, 2022

arXiv:2104.05642 [pdf, other]

Common Limitations of Image Processing Metrics: A Picture Story

Authors: Annika Reinke, Minu D. Tizabi, Carole H. Sudre, Matthias Eisenmann, Tim Rädsch, Michael Baumgartner, Laura Acion, Michela Antonelli, Tal Arbel, Spyridon Bakas, Peter Bankhead, Arriel Benis, Matthew Blaschko, Florian Buettner, M. Jorge Cardoso, Jianxu Chen, Veronika Cheplygina, Evangelia Christodoulou, Beth Cimini, Gary S. Collins, Sandy Engelhardt, Keyvan Farahani, Luciana Ferrer, Adrian Galdran, Bram van Ginneken , et al. (68 additional authors not shown)

Abstract: While the importance of automatic image analysis is continuously increasing, recent meta-research revealed major flaws with respect to algorithm validation. Performance metrics are particularly key for meaningful, objective, and transparent performance assessment and validation of the used automatic algorithms, but relatively little attention has been given to the practical pitfalls when using specific metrics for a given image analysis task. These are typically related to (1) the disregard of inherent metric properties, such as the behaviour in the presence of class imbalance or small target structures, (2) the disregard of inherent data set properties, such as the non-independence of the test cases, and (3) the disregard of the actual biomedical domain interest that the metrics should reflect. This living, dynamically updated document aims to illustrate important limitations of performance metrics commonly applied in the field of image analysis. In this context, it focuses on biomedical image analysis problems that can be phrased as image-level classification, semantic segmentation, instance segmentation, or object detection tasks. The current version is based on a Delphi process on metrics conducted by an international consortium of image analysis experts from more than 60 institutions worldwide.

Submitted 6 December, 2023; v1 submitted 12 April, 2021; originally announced April 2021.

Comments: Shared first authors: Annika Reinke and Minu D. Tizabi. This is a dynamic paper on limitations of commonly used metrics. It discusses metrics for image-level classification, semantic and instance segmentation, and object detection. For missing use cases, comments or questions, please contact [email protected]. Substantial contributions to this document will be acknowledged with a co-authorship

arXiv:2104.03642 [pdf, other]

CLIMAT: Clinically-Inspired Multi-Agent Transformers for Knee Osteoarthritis Trajectory Forecasting

Authors: Huy Hoang Nguyen, Simo Saarakkala, Matthew B. Blaschko, Aleksei Tiulpin

Abstract: In medical applications, deep learning methods are built to automate diagnostic tasks. However, a clinically relevant question that practitioners usually face is how to predict the future trajectory of a disease (prognosis). Current methods for such a problem often require domain knowledge, and are complicated to apply. In this paper, we formulate the prognosis prediction problem as a one-to-many forecasting problem from multimodal data. Inspired by a clinical decision-making process with two agents -- a radiologist and a general practitioner, we model a prognosis prediction problem with two transformer-based components that share information with each other. The first block in this model aims to analyze the imaging data, and the second block leverages the internal representations of the first one as inputs, also fusing them with auxiliary patient data. We show the effectiveness of our method in predicting the development of structural knee osteoarthritis changes over time. Our results show that the proposed method outperforms the state-of-the-art baselines in terms of various performance metrics. In addition, we empirically show that the existence of the multi-agent transformers with depths of 2 is sufficient to achieve good performances. Our code is publicly available at \url{https://github.com/MIPT-Oulu/CLIMAT}.

Submitted 6 December, 2021; v1 submitted 8 April, 2021; originally announced April 2021.

Comments: 10 pages

arXiv:2012.02577 [pdf, other]

Critical Evaluation of Deep Neural Networks for Wrist Fracture Detection

Authors: Abu Mohammed Raisuddin, Elias Vaattovaara, Mika Nevalainen, Marko Nikki, Elina Järvenpää, Kaisa Makkonen, Pekka Pinola, Tuula Palsio, Arttu Niemensivu, Osmo Tervonen, Aleksei Tiulpin

Abstract: Wrist fracture is the most common type of fracture with a high incidence rate. Conventional radiography (i.e. X-ray imaging) is used for wrist fracture detection routinely, but occasionally fracture delineation poses issues and an additional confirmation by computed tomography (CT) is needed for diagnosis. Recent advances in the field of Deep Learning (DL), a subfield of Artificial Intelligence (AI), have shown that wrist fracture detection can be automated using Convolutional Neural Networks. However, previous studies did not pay close attention to the difficult cases which can only be confirmed via CT imaging. In this study, we have developed and analyzed a state-of-the-art DL-based pipeline for wrist (distal radius) fracture detection -- DeepWrist, and evaluated it against one general population test set, and one challenging test set comprising only cases requiring confirmation by CT. Our results reveal that a typical state-of-the-art approach, such as DeepWrist, while having a near-perfect performance on the general independent test set, has a substantially lower performance on the challenging test set -- average precision of 0.99 (0.99-0.99) vs 0.64 (0.46-0.83), respectively. Similarly, the area under the ROC curve was 0.99 (0.98-0.99) vs 0.84 (0.72-0.93), respectively. Our findings highlight the importance of a meticulous analysis of DL-based models before clinical use, and unearth the need for more challenging settings for testing medical AI systems.

Submitted 5 March, 2021; v1 submitted 4 December, 2020; originally announced December 2020.

arXiv:2003.01944 [pdf, other]

Semixup: In- and Out-of-Manifold Regularization for Deep Semi-Supervised Knee Osteoarthritis Severity Grading from Plain Radiographs

Authors: Huy Hoang Nguyen, Simo Saarakkala, Matthew Blaschko, Aleksei Tiulpin

Abstract: Knee osteoarthritis (OA) is one of the highest disability factors in the world. This musculoskeletal disorder is assessed from clinical symptoms, and typically confirmed via radiographic assessment. This visual assessment done by a radiologist requires experience, and suffers from moderate to high inter-observer variability. The recent literature has shown that deep learning methods can reliably perform the OA severity assessment according to the gold standard Kellgren-Lawrence (KL) grading system. However, these methods require large amounts of labeled data, which are costly to obtain. In this study, we propose the Semixup algorithm, a semi-supervised learning (SSL) approach to leverage unlabeled data. Semixup relies on consistency regularization using in- and out-of-manifold samples, together with interpolated consistency. On an independent test set, our method significantly outperformed other state-of-the-art SSL methods in most cases. Finally, when compared to a well-tuned fully supervised baseline that yielded a balanced accuracy (BA) of $70.9\pm0.8%$ on the test set, Semixup had comparable performance -- BA of $71\pm0.8%$ $(p=0.368)$ while requiring $6$ times less labeled data. These results show that our proposed SSL method allows building fully automatic OA severity assessment tools with datasets that are available outside research settings.

Submitted 12 August, 2020; v1 submitted 4 March, 2020; originally announced March 2020.

Comments: 11 main, 03 supplementary pages. The manuscript was accepted to IEEE Transactions on Medical Imaging in August 2020

arXiv:1908.07736 [pdf, other]

doi 10.1016/j.joca.2020.03.006

Adaptive Segmentation of Knee Radiographs for Selecting the Optimal ROI in Texture Analysis

Authors: Neslihan Bayramoglu, Aleksei Tiulpin, Jukka Hirvasniemi, Miika T. Nieminen, Simo Saarakkala

Abstract: The purposes of this study were to investigate: 1) the effect of placement of region-of-interest (ROI) for texture analysis of subchondral bone in knee radiographs, and 2) the ability of several texture descriptors to distinguish between the knees with and without radiographic osteoarthritis (OA). Bilateral posterior-anterior knee radiographs were analyzed from the baseline of OAI and MOST datasets. A fully automatic method to locate the most informative region from subchondral bone using adaptive segmentation was developed. We used an oversegmentation strategy for partitioning knee images into the compact regions that follow natural texture boundaries. LBP, Fractal Dimension (FD), Haralick features, Shannon entropy, and HOG methods were computed within the standard ROI and within the proposed adaptive ROIs. Subsequently, we built logistic regression models to identify and compare the performances of each texture descriptor and each ROI placement method using a 5-fold cross-validation setting. Importantly, we also investigated the generalizability of our approach by training the models on OAI and testing them on the MOST dataset. We used area under the receiver operating characteristic (ROC) curve (AUC) and average precision (AP) obtained from the precision-recall (PR) curve to compare the results. We found that the adaptive ROI improves the classification performance (OA vs. non-OA) over the commonly used standard ROI (up to 9% increase in AUC). We also observed that, from all texture parameters, LBP yielded the best performance in all settings with the best AUC of 0.840 [0.825, 0.852] and associated AP of 0.804 [0.786, 0.820]. Compared to the current state-of-the-art approaches, our results suggest that the proposed adaptive ROI approach in texture analysis of subchondral bone can increase the diagnostic performance for detecting the presence of radiographic OA.

Submitted 21 August, 2019; originally announced August 2019.

arXiv:1908.04126 [pdf, other]

Improving Robustness of Deep Learning Based Knee MRI Segmentation: Mixup and Adversarial Domain Adaptation

Authors: Egor Panfilov, Aleksei Tiulpin, Stefan Klein, Miika T. Nieminen, Simo Saarakkala

Abstract: Degeneration of articular cartilage (AC) is actively studied in knee osteoarthritis (OA) research via magnetic resonance imaging (MRI). Segmentation of AC tissues from MRI data is an essential step in quantification of their damage. Deep learning (DL) based methods have shown potential in this realm and are the current state-of-the-art; however, their robustness to heterogeneity of MRI acquisition settings remains an open problem. In this study, we investigated two modern regularization techniques -- mixup and adversarial unsupervised domain adaptation (UDA) -- to improve the robustness of DL-based knee cartilage segmentation to new MRI acquisition settings. Our validation setup included two datasets produced by different MRI scanners and using distinct data acquisition protocols. We assessed the robustness of automatic segmentation by comparing mixup and UDA approaches to a strong baseline method at different OA severity stages and, additionally, in relation to anatomical locations. Our results showed that for moderate changes in knee MRI data acquisition settings both approaches may provide notable improvements in the robustness, which are consistent for all stages of the disease and affect the clinically important areas of the knee joint. However, mixup may be considered as a recommended approach, since it is more computationally efficient and does not require additional data from the target acquisition setup.

Submitted 27 October, 2019; v1 submitted 12 August, 2019; originally announced August 2019.

arXiv:1908.02924 [pdf, other]

Bayesian Feature Pyramid Networks for Automatic Multi-Label Segmentation of Chest X-rays and Assessment of Cardio-Thoratic Ratio

Authors: Roman Solovyev, Iaroslav Melekhov, Timo Lesonen, Elias Vaattovaara, Osmo Tervonen, Aleksei Tiulpin

Abstract: Cardiothoracic ratio (CTR) estimated from chest radiographs is a marker indicative of cardiomegaly, the presence of which is in the criteria for heart failure diagnosis. Existing methods for automatic assessment of CTR are driven by Deep Learning-based segmentation. However, these techniques produce only point estimates of CTR but clinical decision making typically assumes the uncertainty. In this paper, we propose a novel method for chest X-ray segmentation and CTR assessment in an automatic manner. In contrast to the previous art, we, for the first time, propose to estimate CTR with uncertainty bounds. Our method is based on Deep Convolutional Neural Network with Feature Pyramid Network (FPN) decoder. We propose two modifications of FPN: replacing the batch normalization with instance normalization, and injecting dropout, which allows obtaining Monte-Carlo estimates of the segmentation maps at test time. Finally, using the predicted segmentation mask samples, we estimate CTR with uncertainty. In our experiments we demonstrate that the proposed method generalizes well to three different test sets. Finally, we make the annotations produced by two radiologists for all our datasets publicly available.

Submitted 8 August, 2019; originally announced August 2019.

Comments: Roman Solovyev and Iaroslav Melekhov contributed equally. Timo Lesonen and Elias Vaattovaara contributed equally

arXiv:1907.12237 [pdf, other]

KNEEL: Knee Anatomical Landmark Localization Using Hourglass Networks

Authors: Aleksei Tiulpin, Iaroslav Melekhov, Simo Saarakkala

Abstract: This paper addresses the challenge of localization of anatomical landmarks in knee X-ray images at different stages of osteoarthritis (OA). Landmark localization can be viewed as a regression problem, where the landmark position is directly predicted by using the region of interest or even full-size images, leading to a large memory footprint, especially in the case of high-resolution medical images. In this work, we propose an efficient deep neural networks framework with an hourglass architecture utilizing a soft-argmax layer to directly predict normalized coordinates of the landmark points. We provide an extensive evaluation of different regularization techniques and various loss functions to understand their influence on the localization performance. Furthermore, we introduce the concept of transfer learning from low-budget annotations, and experimentally demonstrate that such an approach improves the accuracy of landmark localization. Compared to the prior methods, we validate our model on two datasets that are independent from the train data and assess the performance of the method for different stages of OA severity. The proposed approach demonstrates better generalization performance compared to the current state-of-the-art.

Submitted 6 September, 2019; v1 submitted 29 July, 2019; originally announced July 2019.

Comments: Accepted for Publication at ICCV 2019 VRMI Workshop

arXiv:1907.08020 [pdf, other]

Automatic Grading of Individual Knee Osteoarthritis Features in Plain Radiographs using Deep Convolutional Neural Networks

Authors: Aleksei Tiulpin, Simo Saarakkala

Abstract: Knee osteoarthritis (OA) is the most common musculoskeletal disease in the world. In primary healthcare, knee OA is diagnosed using clinical examination and radiographic assessment. Osteoarthritis Research Society International (OARSI) atlas of OA radiographic features allows independent assessment of knee osteophytes, joint space narrowing and other knee features. This provides a fine-grained OA severity assessment of the knee, compared to the gold standard and most commonly used Kellgren-Lawrence (KL) composite score. However, both OARSI and KL grading systems suffer from moderate inter-rater agreement, and therefore, the use of computer-aided methods could help to improve the reliability of the process. In this study, we developed a robust, automatic method to simultaneously predict KL and OARSI grades in knee radiographs. Our method is based on Deep Learning and leverages an ensemble of deep residual networks with 50 layers, squeeze-excitation and ResNeXt blocks. Here, we used transfer learning from ImageNet with a fine-tuning on the whole Osteoarthritis Initiative (OAI) dataset. An independent testing of our model was performed on the whole Multicenter Osteoarthritis Study (MOST) dataset. Our multi-task method yielded Cohen's kappa coefficients of 0.82 for KL-grade and 0.79, 0.84, 0.94, 0.83, 0.84, 0.90 for femoral osteophytes, tibial osteophytes and joint space narrowing for lateral and medial compartments respectively. Furthermore, our method yielded area under the ROC curve of 0.98 and average precision of 0.98 for detecting the presence of radiographic OA (KL $\geq 2$), which is better than the current state-of-the-art.

Submitted 18 July, 2019; originally announced July 2019.

arXiv:1907.05089 [pdf, other]

Deep-Learning for Tidemark Segmentation in Human Osteochondral Tissues Imaged with Micro-computed Tomography

Authors: Aleksei Tiulpin, Mikko Finnilä, Petri Lehenkari, Heikki J. Nieminen, Simo Saarakkala

Abstract: Three-dimensional (3D) semi-quantitative grading of pathological features in articular cartilage (AC) offers significant improvements in basic research of osteoarthritis (OA). We have earlier developed the 3D protocol for imaging of AC and its structures which includes staining of the sample with a contrast agent (phosphotungstic acid, PTA) and subsequent scanning with micro-computed tomography. Such a protocol was designed to provide X-ray attenuation contrast to visualize AC structure. However, at the same time, this protocol has one major disadvantage: the loss of contrast at the tidemark (calcified cartilage interface, CCI). An accurate segmentation of CCI can be very important for understanding the etiology of OA and ex-vivo evaluation of tidemark condition at early OA stages. In this paper, we present the first application of Deep Learning to PTA-stained osteochondral samples that allows tidemark segmentation to be performed in a fully-automatic manner. Our method is based on U-Net trained using a combination of binary cross-entropy and soft Jaccard loss. On cross-validation, this approach yielded intersection over the union of 0.59, 0.70, 0.79, 0.83 and 0.86 within 15 μm, 30 μm, 45 μm, 60 μm and 75 μm padded zones around the tidemark, respectively. Our codes and the dataset that consisted of 35 PTA-stained human AC samples are made publicly available together with the segmentation masks to facilitate the development of biomedical image segmentation methods.

Submitted 11 July, 2019; originally announced July 2019.

arXiv:1905.01743 [pdf, other]

Breast Tumor Cellularity Assessment using Deep Neural Networks

Authors: Alexander Rakhlin, Aleksei Tiulpin, Alexey A. Shvets, Alexandr A. Kalinin, Vladimir I. Iglovikov, Sergey Nikolenko

Abstract: Breast cancer is one of the main causes of death worldwide. Histopathological cellularity assessment of residual tumors in post-surgical tissues is used to analyze a tumor's response to a therapy. Correct cellularity assessment increases the chances of getting an appropriate treatment and facilitates the patient's survival. In current clinical practice, tumor cellularity is manually estimated by pathologists; this process is tedious and prone to errors or low agreement rates between assessors. In this work, we evaluated three strong novel Deep Learning-based approaches for automatic assessment of tumor cellularity from post-treated breast surgical specimens stained with hematoxylin and eosin. We validated the proposed methods on the BreastPathQ SPIE challenge dataset that consisted of 2395 image patches selected from whole slide images acquired from 64 patients. Compared to expert pathologist scoring, our best performing method yielded the Cohen's kappa coefficient of 0.70 (vs. 0.42 previously known in literature) and the intra-class correlation coefficient of 0.89 (vs. 0.83). Our results suggest that Deep Learning-based methods have a significant potential to alleviate the burden on pathologists, enhance the diagnostic workflow, and, thereby, facilitate better clinical outcomes in breast cancer treatment.

Submitted 3 September, 2019; v1 submitted 5 May, 2019; originally announced May 2019.

arXiv:1904.06236 [pdf, other]

Multimodal Machine Learning-based Knee Osteoarthritis Progression Prediction from Plain Radiographs and Clinical Data

Authors: Aleksei Tiulpin, Stefan Klein, Sita M. A. Bierma-Zeinstra, Jérôme Thevenot, Esa Rahtu, Joyce van Meurs, Edwin H. G. Oei, Simo Saarakkala

Abstract: Knee osteoarthritis (OA) is the most common musculoskeletal disease without a cure, and current treatment options are limited to symptomatic relief. Prediction of OA progression is a very challenging and timely issue, and it could, if resolved, accelerate the disease modifying drug development and ultimately help to prevent millions of total joint replacement surgeries performed annually. Here, we present a multi-modal machine learning-based OA progression prediction model that utilizes raw radiographic data, clinical examination results and previous medical history of the patient. We validated this approach on an independent test set of 3,918 knee images from 2,129 subjects. Our method yielded area under the ROC curve (AUC) of 0.79 (0.78-0.81) and Average Precision (AP) of 0.68 (0.66-0.70). In contrast, a reference approach, based on logistic regression, yielded AUC of 0.75 (0.74-0.77) and AP of 0.62 (0.60-0.64). The proposed method could significantly improve the subject selection process for OA drug-development trials and help the development of personalized therapeutic plans.

Submitted 6 May, 2019; v1 submitted 12 April, 2019; originally announced April 2019.

arXiv:1810.08393 [pdf, other]

DGC-Net: Dense Geometric Correspondence Network

Authors: Iaroslav Melekhov, Aleksei Tiulpin, Torsten Sattler, Marc Pollefeys, Esa Rahtu, Juho Kannala

Abstract: This paper addresses the challenge of dense pixel correspondence estimation between two images. This problem is closely related to the optical flow estimation task, where ConvNets (CNNs) have recently achieved significant progress. While optical flow methods produce very accurate results for the small pixel translation and limited appearance variation scenarios, they can hardly deal with the strong geometric transformations that we consider in this work. In this paper, we propose a coarse-to-fine CNN-based framework that can leverage the advantages of optical flow approaches and extend them to the case of large transformations, providing dense and subpixel accurate estimates. It is trained on synthetic transformations and demonstrates very good performance on unseen, realistic data. Further, we apply our method to the problem of relative camera pose estimation and demonstrate that the model outperforms existing dense approaches.

Submitted 22 October, 2018; v1 submitted 19 October, 2018; originally announced October 2018.

Comments: Supplementary material included; Affiliation section has been changed

arXiv:1710.10589 [pdf, other]

Automatic Knee Osteoarthritis Diagnosis from Plain Radiographs: A Deep Learning-Based Approach

Authors: Aleksei Tiulpin, Jérôme Thevenot, Esa Rahtu, Petri Lehenkari, Simo Saarakkala

Abstract: Knee osteoarthritis (OA) is the most common musculoskeletal disorder. OA diagnosis is currently conducted by assessing symptoms and evaluating plain radiographs, but this process suffers from subjectivity. In this study, we present a new transparent computer-aided diagnosis method based on the Deep Siamese Convolutional Neural Network to automatically score knee OA severity according to the Kellgren-Lawrence grading scale. We trained our method using the data solely from the Multicenter Osteoarthritis Study and validated it on randomly selected 3,000 subjects (5,960 knees) from Osteoarthritis Initiative dataset. Our method yielded a quadratic Kappa coefficient of 0.83 and average multiclass accuracy of 66.71% compared to the annotations given by a committee of clinical experts. Here, we also report a radiological OA diagnosis area under the ROC curve of 0.93. We also present attention maps -- given as a class probability distribution -- highlighting the radiological features affecting the network decision. This information makes the decision process transparent for the practitioner, which builds better trust toward automatic methods. We believe that our model is useful for clinical decision making and for OA research; therefore, we openly release our training codes and the data set created in this study.

Submitted 29 October, 2017; originally announced October 2017.

arXiv:1701.08991 [pdf, other]

A novel method for automatic localization of joint area on knee plain radiographs

Authors: Aleksei Tiulpin, Jérôme Thevenot, Esa Rahtu, Simo Saarakkala

Abstract: Osteoarthritis (OA) is a common musculoskeletal condition typically diagnosed from radiographic assessment after clinical examination. However, a visual evaluation made by a practitioner suffers from subjectivity and is highly dependent on the experience. Computer-aided diagnostics (CAD) could improve the objectivity of knee radiographic examination. The first essential step of knee OA CAD is to automatically localize the joint area. However, according to the literature this task itself remains challenging. The aim of this study was to develop a novel and computationally efficient method to tackle the issue. Here, three different datasets of knee radiographs were used (n = 473/93/77) to validate the overall performance of the method. Our pipeline consists of two parts: anatomically-based joint area proposals and their evaluation using Histogram of Oriented Gradients and the pre-trained Support Vector Machine classifier scores. The obtained results for the used datasets show the mean intersection over the union equal to: 0.84, 0.79 and 0.78. Using a high-end computer, the method can automatically annotate conventional knee radiographs within 14-16 ms and high-resolution ones within 170 ms. Our results demonstrate that the developed method is suitable for large-scale analyses.

Submitted 5 April, 2017; v1 submitted 31 January, 2017; originally announced January 2017.

Comments: Accepted to Scandinavian Conference on Image Analysis (SCIA) 2017

Showing 1–29 of 29 results for author: Tiulpin, A