Showing 1–42 of 42 results for author: Cheplygina, V

  1. arXiv:2403.04484  [pdf, other]

    cs.CV cs.LG

    Source Matters: Source Dataset Impact on Model Robustness in Medical Imaging

    Authors: Dovile Juodelyte, Yucheng Lu, Amelia Jiménez-Sánchez, Sabrina Bottazzi, Enzo Ferrante, Veronika Cheplygina

    Abstract: Transfer learning has become an essential part of medical imaging classification algorithms, often leveraging ImageNet weights. However, the domain shift from natural to medical images has prompted alternatives such as RadImageNet, often demonstrating comparable classification performance. However, it remains unclear whether the performance gains from transfer learning stem from improved generaliz…

    Submitted 7 March, 2024; originally announced March 2024.

    Comments: Submitted to MICCAI 2024

  2. arXiv:2402.06353  [pdf, other]

    cs.CV

    Copycats: the many lives of a publicly available medical imaging dataset

    Authors: Amelia Jiménez-Sánchez, Natalia-Rozalia Avlona, Dovile Juodelyte, Théo Sourget, Caroline Vang-Larsen, Anna Rogers, Hubert Dariusz Zając, Veronika Cheplygina

    Abstract: Medical Imaging (MI) datasets are fundamental to artificial intelligence in healthcare. The accuracy, robustness, and fairness of diagnostic algorithms depend on the data (and its quality) used to train and evaluate the models. MI datasets used to be proprietary, but have become increasingly available to the public, including on community-contributed platforms (CCPs) like Kaggle or HuggingFace. Wh…

    Submitted 10 June, 2024; v1 submitted 9 February, 2024; originally announced February 2024.

    Comments: Manuscript under review

  3. arXiv:2402.03003  [pdf, other]

    cs.CV cs.DL

    [Citation needed] Data usage and citation practices in medical imaging conferences

    Authors: Théo Sourget, Ahmet Akkoç, Stinna Winther, Christine Lyngbye Galsgaard, Amelia Jiménez-Sánchez, Dovile Juodelyte, Caroline Petitjean, Veronika Cheplygina

    Abstract: Medical imaging papers often focus on methodology, but the quality of the algorithms and the validity of the conclusions are highly dependent on the datasets used. As creating datasets requires a lot of effort, researchers often use publicly available datasets; there is, however, no adopted standard for citing the datasets used in scientific papers, leading to difficulty in tracking dataset usage. I…

    Submitted 5 February, 2024; originally announced February 2024.

    Comments: Submitted to MIDL conference

  4. arXiv:2309.02244  [pdf, other]

    cs.CV

    Augmenting Chest X-ray Datasets with Non-Expert Annotations

    Authors: Cathrine Damgaard, Trine Naja Eriksen, Dovile Juodelyte, Veronika Cheplygina, Amelia Jiménez-Sánchez

    Abstract: The advancement of machine learning algorithms in medical image analysis requires the expansion of training datasets. A popular and cost-effective approach is automated annotation extraction from free-text medical reports, primarily due to the high costs associated with expert clinicians annotating chest X-ray images. However, it has been shown that the resulting datasets are susceptible to biases…

    Submitted 5 September, 2023; originally announced September 2023.

  5. arXiv:2303.17719  [pdf, other]

    cs.CV cs.LG

    Why is the winner the best?

    Authors: Matthias Eisenmann, Annika Reinke, Vivienn Weru, Minu Dietlinde Tizabi, Fabian Isensee, Tim J. Adler, Sharib Ali, Vincent Andrearczyk, Marc Aubreville, Ujjwal Baid, Spyridon Bakas, Niranjan Balu, Sophia Bano, Jorge Bernal, Sebastian Bodenstedt, Alessandro Casella, Veronika Cheplygina, Marie Daum, Marleen de Bruijne, Adrien Depeursinge, Reuben Dorent, Jan Egger, David G. Ellis, Sandy Engelhardt, Melanie Ganz, et al. (100 additional authors not shown)

    Abstract: International benchmarking competitions have become fundamental for the comparative performance assessment of image analysis methods. However, little attention has been given to investigating what can be learnt from these competitions. Do they really generate scientific progress? What are common and successful participation strategies? What makes a solution superior to a competing method? To addre…

    Submitted 30 March, 2023; originally announced March 2023.

    Comments: Accepted to CVPR 2023

  6. arXiv:2302.08272  [pdf, other]

    cs.CV cs.LG

    Revisiting Hidden Representations in Transfer Learning for Medical Imaging

    Authors: Dovile Juodelyte, Amelia Jiménez-Sánchez, Veronika Cheplygina

    Abstract: While a key component to the success of deep learning is the availability of massive amounts of training data, medical image datasets are often limited in diversity and size. Transfer learning has the potential to bridge the gap between related yet different domains. For medical applications, however, it remains unclear whether it is more beneficial to pre-train on natural or medical images. We ai…

    Submitted 5 December, 2023; v1 submitted 16 February, 2023; originally announced February 2023.

    Comments: Published in TMLR

  7. Understanding metric-related pitfalls in image analysis validation

    Authors: Annika Reinke, Minu D. Tizabi, Michael Baumgartner, Matthias Eisenmann, Doreen Heckmann-Nötzel, A. Emre Kavur, Tim Rädsch, Carole H. Sudre, Laura Acion, Michela Antonelli, Tal Arbel, Spyridon Bakas, Arriel Benis, Matthew Blaschko, Florian Buettner, M. Jorge Cardoso, Veronika Cheplygina, Jianxu Chen, Evangelia Christodoulou, Beth A. Cimini, Gary S. Collins, Keyvan Farahani, Luciana Ferrer, Adrian Galdran, Bram van Ginneken, et al. (53 additional authors not shown)

    Abstract: Validation metrics are key for the reliable tracking of scientific progress and for bridging the current chasm between artificial intelligence (AI) research and its translation into practice. However, increasing evidence shows that particularly in image analysis, metrics are often chosen inadequately in relation to the underlying research problem. This could be attributed to a lack of accessibilit…

    Submitted 23 February, 2024; v1 submitted 3 February, 2023; originally announced February 2023.

    Comments: Shared first authors: Annika Reinke and Minu D. Tizabi; shared senior authors: Lena Maier-Hein and Paul F. Jäger. Published in Nature Methods. arXiv admin note: text overlap with arXiv:2206.01653

    Journal ref: Nature methods, 1-13 (2024)

  8. arXiv:2212.08568  [pdf, other]

    cs.CV cs.LG

    Biomedical image analysis competitions: The state of current participation practice

    Authors: Matthias Eisenmann, Annika Reinke, Vivienn Weru, Minu Dietlinde Tizabi, Fabian Isensee, Tim J. Adler, Patrick Godau, Veronika Cheplygina, Michal Kozubek, Sharib Ali, Anubha Gupta, Jan Kybic, Alison Noble, Carlos Ortiz de Solórzano, Samiksha Pachade, Caroline Petitjean, Daniel Sage, Donglai Wei, Elizabeth Wilden, Deepak Alapatt, Vincent Andrearczyk, Ujjwal Baid, Spyridon Bakas, Niranjan Balu, Sophia Bano, et al. (331 additional authors not shown)

    Abstract: The number of international benchmarking competitions is steadily increasing in various fields of machine learning (ML) research and practice. So far, however, little is known about the common practice as well as bottlenecks faced by the community in tackling the research questions posed. To shed light on the status quo of algorithm development in the specific field of biomedical imaging analysis,…

    Submitted 12 September, 2023; v1 submitted 16 December, 2022; originally announced December 2022.

  9. arXiv:2211.04279  [pdf, other]

    cs.CV

    Detecting Shortcuts in Medical Images -- A Case Study in Chest X-rays

    Authors: Amelia Jiménez-Sánchez, Dovile Juodelyte, Bethany Chamberlain, Veronika Cheplygina

    Abstract: The availability of large public datasets and the increased amount of computing power have shifted the interest of the medical community to high-performance algorithms. However, little attention is paid to the quality of the data and their annotations. High performance on benchmark datasets may be reported without considering possible shortcuts or artifacts in the data; besides, models are not tes…

    Submitted 9 November, 2022; v1 submitted 8 November, 2022; originally announced November 2022.

    Comments: Submitted to ISBI 2023

  10. arXiv:2207.03960  [pdf, other]

    cs.CV

    Detection of Furigana Text in Images

    Authors: Nikolaj Kjøller Bjerregaard, Veronika Cheplygina, Stefan Heinrich

    Abstract: Furigana are pronunciation notes used in Japanese writing. Being able to detect these can help improve optical character recognition (OCR) performance or make more accurate digital copies of Japanese written media by correctly displaying furigana. This project focuses on detecting furigana in Japanese books and comics. While there has been research into the detection of Japanese text in general, t…

    Submitted 8 July, 2022; originally announced July 2022.

    Comments: This project was originally submitted by NKB in fulfillment of the 30 ECTS MSc thesis at the IT University of Copenhagen

  11. Metrics reloaded: Recommendations for image analysis validation

    Authors: Lena Maier-Hein, Annika Reinke, Patrick Godau, Minu D. Tizabi, Florian Buettner, Evangelia Christodoulou, Ben Glocker, Fabian Isensee, Jens Kleesiek, Michal Kozubek, Mauricio Reyes, Michael A. Riegler, Manuel Wiesenfarth, A. Emre Kavur, Carole H. Sudre, Michael Baumgartner, Matthias Eisenmann, Doreen Heckmann-Nötzel, Tim Rädsch, Laura Acion, Michela Antonelli, Tal Arbel, Spyridon Bakas, Arriel Benis, Matthew Blaschko, et al. (49 additional authors not shown)

    Abstract: Increasing evidence shows that flaws in machine learning (ML) algorithm validation are an underestimated global problem. Particularly in automatic biomedical image analysis, chosen performance metrics often do not reflect the domain interest, thus failing to adequately measure scientific progress and hindering translation of ML techniques into practice. To overcome this, our large international ex…

    Submitted 23 February, 2024; v1 submitted 3 June, 2022; originally announced June 2022.

    Comments: Shared first authors: Lena Maier-Hein, Annika Reinke. arXiv admin note: substantial text overlap with arXiv:2104.05642. Published in Nature Methods

    Journal ref: Nature methods, 1-18 (2024)

  12. arXiv:2203.03259  [pdf, other]

    stat.ML cs.CV cs.LG

    Predicting Bearings' Degradation Stages for Predictive Maintenance in the Pharmaceutical Industry

    Authors: Dovile Juodelyte, Veronika Cheplygina, Therese Graversen, Philippe Bonnet

    Abstract: In the pharmaceutical industry, the maintenance of production machines must be audited by the regulator. In this context, the problem of predictive maintenance is not when to maintain a machine, but what parts to maintain at a given point in time. The focus shifts from the entire machine to its component parts and prediction becomes a classification problem. In this paper, we focus on rolling-elem…

    Submitted 7 March, 2022; originally announced March 2022.

    Comments: Submitted to the KDD Applied Data Science track

  13. arXiv:2201.02428  [pdf, other]

    eess.IV cs.AI cs.CV

    Effect of Prior-based Losses on Segmentation Performance: A Benchmark

    Authors: Rosana El Jurdi, Caroline Petitjean, Veronika Cheplygina, Paul Honeine, Fahed Abdallah

    Abstract: Today, deep convolutional neural networks (CNNs) have demonstrated state-of-the-art performance for medical image segmentation, on various imaging modalities and tasks. Despite early success, segmentation networks may still generate anatomically aberrant segmentations, with holes or inaccuracies near the object boundaries. To enforce anatomical plausibility, recent research studies have focused on…

    Submitted 12 January, 2022; v1 submitted 7 January, 2022; originally announced January 2022.

    Comments: To be submitted to SPIE: Journal of Medical Imaging

  14. arXiv:2107.12734  [pdf, ps, other]

    cs.CV cs.HC cs.LG

    ENHANCE (ENriching Health data by ANnotations of Crowd and Experts): A case study for skin lesion classification

    Authors: Ralf Raumanns, Gerard Schouten, Max Joosten, Josien P. W. Pluim, Veronika Cheplygina

    Abstract: We present ENHANCE, an open dataset with multiple annotations to complement the existing ISIC and PH2 skin lesion classification datasets. This dataset contains annotations of visual ABC (asymmetry, border, colour) features from non-expert annotation sources: undergraduate students, crowd workers from Amazon MTurk and classic image processing algorithms. In this paper we first analyse the correlat…

    Submitted 24 December, 2021; v1 submitted 27 July, 2021; originally announced July 2021.

    Comments: Accepted for publication at the Journal of Machine Learning for Biomedical Imaging (MELBA) https://www.melba-journal.org

  15. arXiv:2107.05940  [pdf, other]

    cs.CV

    Cats, not CAT scans: a study of dataset similarity in transfer learning for 2D medical image classification

    Authors: Irma van den Brandt, Floris Fok, Bas Mulders, Joaquin Vanschoren, Veronika Cheplygina

    Abstract: Transfer learning is a commonly used strategy for medical image classification, especially via pretraining on source data and fine-tuning on target data. There is currently no consensus on how to choose appropriate source data, and in the literature we can find both evidence of favoring large natural image datasets such as ImageNet, and evidence of favoring more specialized medical datasets. In th…

    Submitted 13 July, 2021; originally announced July 2021.

  16. arXiv:2104.05642  [pdf, other]

    eess.IV cs.CV

    Common Limitations of Image Processing Metrics: A Picture Story

    Authors: Annika Reinke, Minu D. Tizabi, Carole H. Sudre, Matthias Eisenmann, Tim Rädsch, Michael Baumgartner, Laura Acion, Michela Antonelli, Tal Arbel, Spyridon Bakas, Peter Bankhead, Arriel Benis, Matthew Blaschko, Florian Buettner, M. Jorge Cardoso, Jianxu Chen, Veronika Cheplygina, Evangelia Christodoulou, Beth Cimini, Gary S. Collins, Sandy Engelhardt, Keyvan Farahani, Luciana Ferrer, Adrian Galdran, Bram van Ginneken, et al. (68 additional authors not shown)

    Abstract: While the importance of automatic image analysis is continuously increasing, recent meta-research revealed major flaws with respect to algorithm validation. Performance metrics are particularly key for meaningful, objective, and transparent performance assessment and validation of the used automatic algorithms, but relatively little attention has been given to the practical pitfalls when using spe…

    Submitted 6 December, 2023; v1 submitted 12 April, 2021; originally announced April 2021.

    Comments: Shared first authors: Annika Reinke and Minu D. Tizabi. This is a dynamic paper on limitations of commonly used metrics. It discusses metrics for image-level classification, semantic and instance segmentation, and object detection. For missing use cases, comments or questions, please contact [email protected]. Substantial contributions to this document will be acknowledged with a co-authorship

  17. arXiv:2103.10292  [pdf, ps, other]

    eess.IV cs.CV cs.LG stat.ML

    How I failed machine learning in medical imaging -- shortcomings and recommendations

    Authors: Gaël Varoquaux, Veronika Cheplygina

    Abstract: Medical imaging is an important research field with many opportunities for improving patients' health. However, there are a number of challenges that are slowing down the progress of the field as a whole, such as optimizing for publication. In this paper we reviewed several problems related to choosing datasets, methods, evaluation metrics, and publication strategies. With a review of literature and…

    Submitted 12 May, 2022; v1 submitted 18 March, 2021; originally announced March 2021.

    Journal ref: npj Digit. Med. 5, 48 (2022). https://doi.org/10.1038/s41746-022-00592-y

  18. arXiv:2101.04386  [pdf, other]

    eess.IV cs.CV cs.LG

    Using uncertainty estimation to reduce false positives in liver lesion detection

    Authors: Ishaan Bhat, Hugo J. Kuijf, Veronika Cheplygina, Josien P. W. Pluim

    Abstract: Despite the successes of deep learning techniques at detecting objects in medical images, false positive detections occur which may hinder an accurate diagnosis. We propose a technique to reduce false positive detections made by a neural network using an SVM classifier trained with features derived from the uncertainty map of the neural network prediction. We demonstrate the effectiveness of this…

    Submitted 26 January, 2021; v1 submitted 12 January, 2021; originally announced January 2021.

    Comments: Accepted at IEEE ISBI 2021

  19. Crowdsourcing Airway Annotations in Chest Computed Tomography Images

    Authors: Veronika Cheplygina, Adria Perez-Rovira, Wieying Kuo, Harm A. W. M. Tiddens, Marleen de Bruijne

    Abstract: Measuring airways in chest computed tomography (CT) scans is important for characterizing diseases such as cystic fibrosis, yet very time-consuming to perform manually. Machine learning algorithms offer an alternative, but need large sets of annotated scans for good performance. We investigate whether crowdsourcing can be used to gather airway annotations. We generate image slices at known locatio…

    Submitted 20 November, 2020; originally announced November 2020.

  20. arXiv:2011.08018  [pdf, other]

    cs.CV cs.LG

    High-level Prior-based Loss Functions for Medical Image Segmentation: A Survey

    Authors: Rosana El Jurdi, Caroline Petitjean, Paul Honeine, Veronika Cheplygina, Fahed Abdallah

    Abstract: Today, deep convolutional neural networks (CNNs) have demonstrated state-of-the-art performance for supervised medical image segmentation, across various imaging modalities and tasks. Despite early success, segmentation networks may still generate anatomically aberrant segmentations, with holes or inaccuracies near the object boundaries. To mitigate this effect, recent research works have focused…

    Submitted 22 November, 2020; v1 submitted 16 November, 2020; originally announced November 2020.

  21. arXiv:2006.16633  [pdf, other]

    cs.CV cs.LG eess.IV

    Primary Tumor Origin Classification of Lung Nodules in Spectral CT using Transfer Learning

    Authors: Linde S. Hesse, Pim A. de Jong, Josien P. W. Pluim, Veronika Cheplygina

    Abstract: Early detection of lung cancer has been proven to decrease mortality significantly. A recent development in computed tomography (CT), spectral CT, can potentially improve diagnostic accuracy, as it yields more information per scan than regular CT. However, the sheer workload involved with analyzing a large number of scans drives the need for automated diagnosis methods. Therefore, we propose a det…

    Submitted 30 June, 2020; originally announced June 2020.

    Comments: MSc thesis Linde Hesse

  22. arXiv:2005.10050  [pdf, other]

    cs.LG cs.AI stat.ML

    Risk of Training Diagnostic Algorithms on Data with Demographic Bias

    Authors: Samaneh Abbasi-Sureshjani, Ralf Raumanns, Britt E. J. Michels, Gerard Schouten, Veronika Cheplygina

    Abstract: One of the critical challenges in machine learning applications is to have fair predictions. There are numerous recent examples in various domains that convincingly show that algorithms trained with biased datasets can easily lead to erroneous or discriminatory conclusions. This is even more crucial in clinical applications where the predictive algorithms are designed mainly based on a limited or…

    Submitted 17 June, 2020; v1 submitted 20 May, 2020; originally announced May 2020.

  23. arXiv:2005.08869  [pdf, other]

    eess.IV cs.LG

    Predicting Scores of Medical Imaging Segmentation Methods with Meta-Learning

    Authors: Tom van Sonsbeek, Veronika Cheplygina

    Abstract: Deep learning has led to state-of-the-art results for many medical imaging tasks, such as segmentation of different anatomical structures. With the increased numbers of deep learning publications and openly available code, the approach to choosing a model for a new task becomes more complicated, while time and (computational) resources are limited. A possible solution to choosing a model efficient…

    Submitted 8 May, 2020; originally announced May 2020.

    MSC Class: 68T07

  24. arXiv:2004.14745  [pdf, other]

    cs.HC cs.CV cs.LG eess.IV

    Multi-task Ensembles with Crowdsourced Features Improve Skin Lesion Diagnosis

    Authors: Ralf Raumanns, Elif K Contar, Gerard Schouten, Veronika Cheplygina

    Abstract: Machine learning has a recognised need for large amounts of annotated data. Due to the high cost of expert annotations, crowdsourcing, where non-experts are asked to label or outline images, has been proposed as an alternative. Although many promising results are reported, the quality of diagnostic crowdsourced labels is still unclear. We propose to address this by instead asking the crowd about v…

    Submitted 6 July, 2020; v1 submitted 28 April, 2020; originally announced April 2020.

  25. arXiv:1902.09159  [pdf, other]

    cs.CV cs.HC

    A Survey of Crowdsourcing in Medical Image Analysis

    Authors: Silas Ørting, Andrew Doyle, Arno van Hilten, Matthias Hirth, Oana Inel, Christopher R. Madan, Panagiotis Mavridis, Helen Spiers, Veronika Cheplygina

    Abstract: Rapid advances in image processing capabilities have been seen across many domains, fostered by the application of machine learning algorithms to "big-data". However, within the realm of medical image analysis, advances have been curtailed, in part, due to the limited availability of large-scale, well-annotated datasets. One of the main reasons for this is the high cost often associated with produ…

    Submitted 4 September, 2019; v1 submitted 25 February, 2019; originally announced February 2019.

    Comments: Submitted to Human Computation

  26. Cats or CAT scans: transfer learning from natural or medical image source datasets?

    Authors: Veronika Cheplygina

    Abstract: Transfer learning is a widely used strategy in medical image analysis. Instead of only training a network with a limited amount of data from the target task of interest, we can first train the network with other, potentially larger source datasets, creating a more robust model. The source datasets do not have to be related to the target task. For a classification task in lung CT images, we could u…

    Submitted 10 January, 2019; v1 submitted 12 October, 2018; originally announced October 2018.

    Comments: Accepted to Current Opinion in Biomedical Engineering

  27. Characterizing multiple instance datasets

    Authors: Veronika Cheplygina, David M. J. Tax

    Abstract: In many pattern recognition problems, a single feature vector is not sufficient to describe an object. In multiple instance learning (MIL), objects are represented by sets (bags) of feature vectors (instances). This requires an adaptation of standard supervised classifiers in order to train and evaluate on these bags of instances. Like for supervised classification, several benchmark…

    Submitted 21 June, 2018; originally announced June 2018.

    Comments: Published at SIMBAD 2015 workshop

  28. arXiv:1806.08174  [pdf, other]

    cs.CV

    Crowd disagreement about medical images is informative

    Authors: Veronika Cheplygina, Josien P. W. Pluim

    Abstract: Classifiers for medical image analysis are often trained with a single consensus label, based on combining labels given by experts or crowds. However, disagreement between annotators may be informative, and thus removing it may not be the best strategy. As a proof of concept, we predict whether a skin lesion from the ISIC 2017 dataset is a melanoma or not, based on crowd annotations of visual char…

    Submitted 17 August, 2018; v1 submitted 21 June, 2018; originally announced June 2018.

    Comments: Accepted for publication at MICCAI LABELS 2018

  29. arXiv:1806.07131  [pdf, other]

    cs.CV

    Feature learning based on visual similarity triplets in medical image analysis: A case study of emphysema in chest CT scans

    Authors: Silas Nyboe Ørting, Jens Petersen, Veronika Cheplygina, Laura H. Thomsen, Mathilde M W Wille, Marleen de Bruijne

    Abstract: Supervised feature learning using convolutional neural networks (CNNs) can provide concise and disease relevant representations of medical images. However, training CNNs requires annotated image data. Annotating medical images can be a time-consuming task and even expert annotations are subject to substantial inter- and intra-rater variability. Assessing visual similarity of images instead of indi…

    Submitted 19 June, 2018; originally announced June 2018.

    Comments: 10 pages. Submitted to LABELS2018 - MICCAI Workshop on Large-scale Annotation of Biomedical data and Expert Label Synthesis

  30. arXiv:1804.06353  [pdf, other]

    cs.CV

    Not-so-supervised: a survey of semi-supervised, multi-instance, and transfer learning in medical image analysis

    Authors: Veronika Cheplygina, Marleen de Bruijne, Josien P. W. Pluim

    Abstract: Machine learning (ML) algorithms have made a tremendous impact in the field of medical imaging. While medical imaging datasets have been growing in size, a challenge for supervised ML algorithms that is frequently mentioned is the lack of annotated data. As a result, various methods which can learn with less/other types of supervision, have been proposed. We review semi-supervised, multiple instan…

    Submitted 14 September, 2018; v1 submitted 17 April, 2018; originally announced April 2018.

    Comments: Submitted to Medical Image Analysis

  31. arXiv:1706.03509  [pdf, other]

    cs.CV

    Exploring the similarity of medical imaging classification problems

    Authors: Veronika Cheplygina, Pim Moeskops, Mitko Veta, Behdad Dasht Bozorg, Josien Pluim

    Abstract: Supervised learning is ubiquitous in medical image analysis. In this paper we consider the problem of meta-learning -- predicting which methods will perform well in an unseen classification problem, given previous experience with other classification problems. We investigate the first step of such an approach: how to quantify the similarity of different classification problems. We characterize dat…

    Submitted 12 June, 2017; originally announced June 2017.

  32. Early Experiences with Crowdsourcing Airway Annotations in Chest CT

    Authors: Veronika Cheplygina, Adria Perez-Rovira, Wieying Kuo, Harm A. W. M. Tiddens, Marleen de Bruijne

    Abstract: Measuring airways in chest computed tomography (CT) images is important for characterizing diseases such as cystic fibrosis, yet very time-consuming to perform manually. Machine learning algorithms offer an alternative, but need large sets of annotated data to perform well. We investigate whether crowdsourcing can be used to gather airway annotations which can serve directly for measuring the airw…

    Submitted 7 June, 2017; originally announced June 2017.

    Journal ref: LABELS 2016, DLMIA 2016: Deep Learning and Data Labeling for Medical Applications pp 209-218

  33. Automatic Emphysema Detection using Weakly Labeled HRCT Lung Images

    Authors: Isabel Pino Peña, Veronika Cheplygina, Sofia Paschaloudi, Morten Vuust, Jesper Carl, Ulla Møller Weinreich, Lasse Riis Østergaard, Marleen de Bruijne

    Abstract: A method for automatically quantifying emphysema regions using High-Resolution Computed Tomography (HRCT) scans of patients with chronic obstructive pulmonary disease (COPD) that does not require manually annotated scans for training is presented. HRCT scans of controls and of COPD patients with diverse disease severity are acquired at two different centers. Textural features from co-occurrence ma…

    Submitted 1 October, 2018; v1 submitted 7 June, 2017; originally announced June 2017.

    Comments: Accepted at PLoS ONE

  34. Label Stability in Multiple Instance Learning

    Authors: Veronika Cheplygina, Lauge Sørensen, David M. J. Tax, Marleen de Bruijne, Marco Loog

    Abstract: We address the problem of instance label stability in multiple instance learning (MIL) classifiers. These classifiers are trained only on globally annotated images (bags), but often can provide fine-grained annotations for image pixels or patches (instances). This is interesting for computer aided diagnosis (CAD) and other medical image analysis tasks for which only a coarse labeling is pro…

    Submitted 15 March, 2017; originally announced March 2017.

    Comments: Published at MICCAI 2015

  35. arXiv:1703.04981  [pdf, other]

    cs.CV stat.ML

    Transfer Learning by Asymmetric Image Weighting for Segmentation across Scanners

    Authors: Veronika Cheplygina, Annegreet van Opbroek, M. Arfan Ikram, Meike W. Vernooij, Marleen de Bruijne

    Abstract: Supervised learning has been very successful for automatic segmentation of images from a single scanner. However, several papers report deteriorated performances when using classifiers trained on images from one scanner to segment images from other scanners. We propose a transfer learning classifier that adapts to differences between training and test images. This method uses a weighted ensemble o…

    Submitted 15 March, 2017; originally announced March 2017.

  36. Classification of COPD with Multiple Instance Learning

    Authors: Veronika Cheplygina, Lauge Sørensen, David M. J. Tax, Jesper Holst Pedersen, Marco Loog, Marleen de Bruijne

    Abstract: Chronic obstructive pulmonary disease (COPD) is a lung disease where early detection benefits the survival rate. COPD can be quantified by classifying patches of computed tomography images, and combining patch labels into an overall diagnosis for the image. As labeled patches are often not available, image labels are propagated to the patches, incorrectly labeling healthy patches in COPD patients…

    Submitted 15 March, 2017; originally announced March 2017.

    Comments: Published at International Conference on Pattern Recognition (ICPR) 2014

  37. Transfer learning for multi-center classification of chronic obstructive pulmonary disease

    Authors: Veronika Cheplygina, Isabel Pino Peña, Jesper Holst Pedersen, David A. Lynch, Lauge Sørensen, Marleen de Bruijne

    Abstract: Chronic obstructive pulmonary disease (COPD) is a lung disease which can be quantified using chest computed tomography (CT) scans. Recent studies have shown that COPD can be automatically diagnosed using weakly supervised learning of intensity and texture distributions. However, up till now such classifiers have only been evaluated on scans from a single domain, and it is unclear whether they woul…

    Submitted 23 November, 2017; v1 submitted 18 January, 2017; originally announced January 2017.

    Comments: Accepted at Journal of Biomedical and Health Informatics

  38. Multiple Instance Learning: A Survey of Problem Characteristics and Applications

    Authors: Marc-André Carbonneau, Veronika Cheplygina, Eric Granger, Ghyslain Gagnon

    Abstract: Multiple instance learning (MIL) is a form of weakly supervised learning where training instances are arranged in sets, called bags, and a label is provided for the entire bag. This formulation is gaining interest because it naturally fits various problems and allows one to leverage weakly labeled data. Consequently, it has been used in diverse application fields such as computer vision and document c…

    Submitted 10 December, 2016; originally announced December 2016.

  39. arXiv:1406.0281  [pdf, other]

    stat.ML cs.CV cs.LG

    On Classification with Bags, Groups and Sets

    Authors: Veronika Cheplygina, David M. J. Tax, Marco Loog

    Abstract: Many classification problems can be difficult to formulate directly in terms of the traditional supervised setting, where both training and test samples are individual feature vectors. There are cases in which samples are better described by sets of feature vectors, that labels are only available for sets rather than individual samples, or, if individual labels are available, that these are not in…

    Submitted 7 October, 2014; v1 submitted 2 June, 2014; originally announced June 2014.

    Journal ref: Pattern Recognition Letters Volume 59, 2015, Pages 11 - 17

  40. arXiv:1402.1371  [pdf, ps, other]

    cs.CV

    Quantile Representation for Indirect Immunofluorescence Image Classification

    Authors: David M. J. Tax, Veronika Cheplygina, Marco Loog

    Abstract: In the diagnosis of autoimmune diseases, an important task is to classify images of slides containing several HEp-2 cells. All cells from one slide share the same label, and by classifying cells from one slide independently, some information on the global image quality and intensity is lost. Considering one whole slide as a collection (a bag) of feature vectors, however, poses the problem of how t…

    Submitted 6 February, 2014; originally announced February 2014.

  41. Dissimilarity-based Ensembles for Multiple Instance Learning

    Authors: Veronika Cheplygina, David M. J. Tax, Marco Loog

    Abstract: In multiple instance learning, objects are sets (bags) of feature vectors (instances) rather than individual feature vectors. In this paper we address the problem of how these bags can best be represented. Two standard approaches are to use (dis)similarities between bags and prototype bags, or between bags and prototype instances. The first approach results in a relatively low-dimensional represen…

    Submitted 6 February, 2014; originally announced February 2014.

    Comments: Submitted to IEEE Transactions on Neural Networks and Learning Systems, Special Issue on Learning in Non-(geo)metric Spaces

    Journal ref: IEEE Transactions on Neural Networks and Learning Systems, Volume 27, Issue 6, 2016, pages 1379 - 1391

  42. Multiple Instance Learning with Bag Dissimilarities

    Authors: Veronika Cheplygina, David M. J. Tax, Marco Loog

    Abstract: Multiple instance learning (MIL) is concerned with learning from sets (bags) of objects (instances), where the individual instance labels are ambiguous. In this setting, supervised learning cannot be applied directly. Often, specialized MIL methods learn by making additional assumptions about the relationship of the bag labels and instance labels. Such assumptions may fit a particular dataset, but…

    Submitted 12 August, 2014; v1 submitted 22 September, 2013; originally announced September 2013.

    Comments: Pattern Recognition, in press

    Journal ref: Pattern Recognition 48.1 (2015): 264-275