-
MAMA-MIA: A Large-Scale Multi-Center Breast Cancer DCE-MRI Benchmark Dataset with Expert Segmentations
Authors:
Lidia Garrucho,
Claire-Anne Reidel,
Kaisar Kushibar,
Smriti Joshi,
Richard Osuala,
Apostolia Tsirikoglou,
Maciej Bobowicz,
Javier del Riego,
Alessandro Catanese,
Katarzyna Gwoździewicz,
Maria-Laura Cosaka,
Pasant M. Abo-Elhoda,
Sara W. Tantawy,
Shorouq S. Sakrana,
Norhan O. Shawky-Abdelfatah,
Amr Muhammad Abdo-Salem,
Androniki Kozana,
Eugen Divjak,
Gordana Ivanac,
Katerina Nikiforaki,
Michail E. Klontzas,
Rosa García-Dosdá,
Meltem Gulsun-Akpinar,
Oğuz Lafcı,
Ritse Mann, et al. (8 additional authors not shown)
Abstract:
Current research in breast cancer Magnetic Resonance Imaging (MRI), especially with Artificial Intelligence (AI), faces challenges due to the lack of expert segmentations. To address this, we introduce the MAMA-MIA dataset, comprising 1506 multi-center dynamic contrast-enhanced MRI cases with expert segmentations of primary tumors and non-mass enhancement areas. These cases were sourced from four publicly available collections in The Cancer Imaging Archive (TCIA). Initially, we trained a deep learning model to automatically segment the cases, generating preliminary segmentations that significantly reduced expert segmentation time. Sixteen experts, averaging 9 years of experience in breast cancer, then corrected these segmentations, resulting in the final expert segmentations. Additionally, two radiologists conducted a visual inspection of the automatic segmentations to support future quality control studies. Alongside the expert segmentations, we provide 49 harmonized demographic and clinical variables and the pretrained weights of the well-known nnUNet architecture trained using the DCE-MRI full-images and expert segmentations. This dataset aims to accelerate the development and benchmarking of deep learning models and foster innovation in breast cancer diagnostics and treatment planning.
Submitted 19 June, 2024;
originally announced June 2024.
-
Mitigating annotation shift in cancer classification using single image generative models
Authors:
Marta Buetas Arcas,
Richard Osuala,
Karim Lekadir,
Oliver Díaz
Abstract:
Artificial Intelligence (AI) has emerged as a valuable tool for assisting radiologists in breast cancer detection and diagnosis. However, the success of AI applications in this domain is restricted by the quantity and quality of available data, posing challenges due to limited and costly data annotation procedures that often lead to annotation shifts. This study simulates, analyses and mitigates annotation shifts in cancer classification in the breast mammography domain. First, a high-accuracy cancer risk prediction model is developed, which effectively distinguishes benign from malignant lesions. Next, model performance is used to quantify the impact of annotation shift. We uncover a substantial impact of annotation shift on multiclass classification performance particularly for malignant lesions. We thus propose a training data augmentation approach based on single-image generative models for the affected class, requiring as few as four in-domain annotations to considerably mitigate annotation shift, while also addressing dataset imbalance. Lastly, we further increase performance by proposing and validating an ensemble architecture based on multiple models trained under different data augmentation regimes. Our study offers key insights into annotation shift in deep learning breast cancer classification and explores the potential of single-image generative models to overcome domain shift challenges.
Submitted 30 May, 2024;
originally announced May 2024.
-
Towards Learning Contrast Kinetics with Multi-Condition Latent Diffusion Models
Authors:
Richard Osuala,
Daniel Lang,
Preeti Verma,
Smriti Joshi,
Apostolia Tsirikoglou,
Grzegorz Skorupko,
Kaisar Kushibar,
Lidia Garrucho,
Walter H. L. Pinaya,
Oliver Diaz,
Julia Schnabel,
Karim Lekadir
Abstract:
Contrast agents in dynamic contrast-enhanced magnetic resonance imaging make it possible to localize tumors and observe their contrast kinetics, which is essential for cancer characterization and respective treatment decision-making. However, contrast agent administration is not only associated with adverse health risks, but also restricted for patients during pregnancy, and for those with kidney malfunction, or other adverse reactions. With contrast uptake as a key biomarker for lesion malignancy, cancer recurrence risk, and treatment response, it becomes pivotal to reduce the dependency on intravenous contrast agent administration. To this end, we propose a multi-conditional latent diffusion model capable of acquisition time-conditioned image synthesis of DCE-MRI temporal sequences. To evaluate medical image synthesis, we additionally propose and validate the Fréchet radiomics distance as an image quality measure based on biomarker variability between synthetic and real imaging data. Our results demonstrate our method's ability to generate realistic multi-sequence fat-saturated breast DCE-MRI and uncover the emerging potential of deep learning based contrast kinetics simulation. We publicly share our accessible codebase at https://github.com/RichardObi/ccnet and provide a user-friendly library for Fréchet radiomics distance calculation at https://pypi.org/project/frd-score.
Submitted 1 May, 2024; v1 submitted 20 March, 2024;
originally announced March 2024.
-
On the integrality of étale extensions of polynomial rings
Authors:
Lázaro O. Rodríguez Díaz
Abstract:
Motivated by a valuation theorem, recently obtained by Rangachev, we study the étale extensions $A\subset B$ of polynomial rings over an algebraically closed field of characteristic zero, such that the integral closure $\overline{A}$ is a primary $\overline{A}$-submodule of $B$. We prove that in this case $\overline{A}$ has infinite cyclic divisor class group, where the generator is a prime divisor equal to the complement of $\textrm{Spec}(B)$ in $\textrm{Spec}(\overline{A})$. Moreover, this prime divisor coincides with the ramification divisor of the finite extension $A\subset \overline{A}$. In this situation we carry out Wright's geometric approach for two-dimensional non-integral étale extensions. It follows from the work of Miyanishi that $\textrm{Spec}(\overline{A})$ is a smooth affine surface. We show that $\textrm{Spec}(\overline{A})$ is an $\mathbb{A}^{1}$-bundle over $\mathbb{P}^{1}$, more precisely a Danilov-Gizatullin surface of index three. Based on Wright's analysis of which of these affine surfaces can factorize an étale morphism of the complex affine plane and his description of its affine coordinate rings, we prove that under the strong assumption that $\overline{A}$ is always a primary $\overline{A}$-submodule of $B$, any two-dimensional complex étale extension is integral.
Submitted 10 April, 2024; v1 submitted 4 March, 2024;
originally announced March 2024.
-
Pre- to Post-Contrast Breast MRI Synthesis for Enhanced Tumour Segmentation
Authors:
Richard Osuala,
Smriti Joshi,
Apostolia Tsirikoglou,
Lidia Garrucho,
Walter H. L. Pinaya,
Oliver Diaz,
Karim Lekadir
Abstract:
Despite its benefits for tumour detection and treatment, the administration of contrast agents in dynamic contrast-enhanced MRI (DCE-MRI) is associated with a range of issues, including their invasiveness, bioaccumulation, and a risk of nephrogenic systemic fibrosis. This study explores the feasibility of producing synthetic contrast enhancements by translating pre-contrast T1-weighted fat-saturated breast MRI to their corresponding first DCE-MRI sequence leveraging the capabilities of a generative adversarial network (GAN). Additionally, we introduce a Scaled Aggregate Measure (SAMe) designed for quantitatively evaluating the quality of synthetic data in a principled manner and serving as a basis for selecting the optimal generative model. We assess the generated DCE-MRI data using quantitative image quality metrics and apply them to the downstream task of 3D breast tumour segmentation. Our results highlight the potential of post-contrast DCE-MRI synthesis in enhancing the robustness of breast tumour segmentation models via data augmentation. Our code is available at https://github.com/RichardObi/pre_post_synthesis.
Submitted 31 May, 2024; v1 submitted 17 November, 2023;
originally announced November 2023.
-
FUTURE-AI: International consensus guideline for trustworthy and deployable artificial intelligence in healthcare
Authors:
Karim Lekadir,
Aasa Feragen,
Abdul Joseph Fofanah,
Alejandro F Frangi,
Alena Buyx,
Anais Emelie,
Andrea Lara,
Antonio R Porras,
An-Wen Chan,
Arcadi Navarro,
Ben Glocker,
Benard O Botwe,
Bishesh Khanal,
Brigit Beger,
Carol C Wu,
Celia Cintas,
Curtis P Langlotz,
Daniel Rueckert,
Deogratias Mzurikwao,
Dimitrios I Fotiadis,
Doszhan Zhussupov,
Enzo Ferrante,
Erik Meijering,
Eva Weicken,
Fabio A González, et al. (95 additional authors not shown)
Abstract:
Despite major advances in artificial intelligence (AI) for medicine and healthcare, the deployment and adoption of AI technologies remain limited in real-world clinical practice. In recent years, concerns have been raised about the technical, clinical, ethical and legal risks associated with medical AI. To increase real world adoption, it is essential that medical AI tools are trusted and accepted by patients, clinicians, health organisations and authorities. This work describes the FUTURE-AI guideline as the first international consensus framework for guiding the development and deployment of trustworthy AI tools in healthcare. The FUTURE-AI consortium was founded in 2021 and currently comprises 118 inter-disciplinary experts from 51 countries representing all continents, including AI scientists, clinicians, ethicists, and social scientists. Over a two-year period, the consortium defined guiding principles and best practices for trustworthy AI through an iterative process comprising an in-depth literature review, a modified Delphi survey, and online consensus meetings. The FUTURE-AI framework was established based on 6 guiding principles for trustworthy AI in healthcare, i.e. Fairness, Universality, Traceability, Usability, Robustness and Explainability. Through consensus, a set of 28 best practices were defined, addressing technical, clinical, legal and socio-ethical dimensions. The recommendations cover the entire lifecycle of medical AI, from design, development and validation to regulation, deployment, and monitoring. FUTURE-AI is a risk-informed, assumption-free guideline which provides a structured approach for constructing medical AI tools that will be trusted, deployed and adopted in real-world practice. Researchers are encouraged to take the recommendations into account in proof-of-concept stages to facilitate future translation towards clinical practice of medical AI.
Submitted 8 July, 2024; v1 submitted 11 August, 2023;
originally announced September 2023.
-
How can feature usage be tracked across product variants? Implicit Feedback in Software Product Lines
Authors:
Oscar Díaz,
Raul Medeiros,
Mustafa Al-Hajjaji
Abstract:
Implicit feedback is collecting information about software usage to understand how and when the software is used. This research tackles implicit feedback in Software Product Lines (SPLs). The need for platform-centric feedback makes SPL feedback depart from one-off-application feedback in both the artefact to be tracked (the platform vs the variant) and the tracking approach (indirect coding vs direct coding). Traditionally, product feedback is achieved by embedding 'usage trackers' into the software's code. Yet, products are now members of the SPL portfolio, and hence, this approach conflicts with one of the main SPL tenets: reducing, if not eliminating, coding directly into the variant's code. Thus, we advocate for Product Derivation to be subject to a second transformation that precedes the construction of the variant based on the configuration model. This approach is tested through FEACKER, an extension to pure::variants. We resorted to a TAM evaluation on pure-systems GmbH employees (n=8). Observed divergences were next tackled through a focus group (n=3). The results reveal agreement in the interest in conducting feedback analysis at the platform level (perceived usefulness) while regarding FEACKER as a seamless
Submitted 15 September, 2023; v1 submitted 8 September, 2023;
originally announced September 2023.
-
Revisiting Skin Tone Fairness in Dermatological Lesion Classification
Authors:
Thorsten Kalb,
Kaisar Kushibar,
Celia Cintas,
Karim Lekadir,
Oliver Diaz,
Richard Osuala
Abstract:
Addressing fairness in lesion classification from dermatological images is crucial due to variations in how skin diseases manifest across skin tones. However, the absence of skin tone labels in public datasets hinders building a fair classifier. To date, such skin tone labels have been estimated prior to fairness analysis in independent studies using the Individual Typology Angle (ITA). Briefly, ITA calculates an angle based on pixels extracted from skin images taking into account the lightness and yellow-blue tints. These angles are then categorised into skin tones that are subsequently used to analyse fairness in skin cancer classification. In this work, we review and compare four ITA-based approaches of skin tone classification on the ISIC18 dataset, a common benchmark for assessing skin cancer classification fairness in the literature. Our analyses reveal a high disagreement among previously published studies demonstrating the risks of ITA-based skin tone estimation methods. Moreover, we investigate the causes of such large discrepancy among these approaches and find that the lack of diversity in the ISIC18 dataset limits its use as a testbed for fairness analysis. Finally, we recommend further research on robust ITA estimation and diverse dataset acquisition with skin tone annotation to facilitate conclusive fairness assessments of artificial intelligence tools in dermatology. Our code is available at https://github.com/tkalbl/RevisitingSkinToneFairness.
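As a rough illustration of the ITA computation described in this abstract, the following minimal sketch uses the standard definition ITA = arctan((L* - 50) / b*) × 180/π on CIELAB lightness (L*) and yellow-blue (b*) values. The function names and the simple per-pixel averaging are assumptions made for illustration; how the reviewed studies select pixels or bin angles into skin tone categories is not specified here, so none of that is reproduced.

```python
import math
from typing import Iterable, Tuple


def individual_typology_angle(l_star: float, b_star: float) -> float:
    """Return the ITA in degrees for one CIELAB (L*, b*) pair.

    atan2 is used instead of atan so that b* == 0 does not divide by zero;
    for b* > 0 it matches arctan((L* - 50) / b*) exactly.
    """
    return math.degrees(math.atan2(l_star - 50.0, b_star))


def mean_ita(pixels: Iterable[Tuple[float, float]]) -> float:
    """Average the per-pixel ITA over (L*, b*) pairs (hypothetical aggregation)."""
    angles = [individual_typology_angle(l, b) for l, b in pixels]
    if not angles:
        raise ValueError("at least one pixel is required")
    return sum(angles) / len(angles)
```

Because the angle depends on which pixels are sampled and how they are aggregated, two pipelines built on the same formula can still assign different skin tones to the same image.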
Submitted 18 August, 2023;
originally announced August 2023.
-
Home-to-school pedestrian mobility GPS data from a citizen science experiment in the Barcelona area
Authors:
Ferran Larroya,
Ofelia Díaz,
Oleguer Segarra,
Pol Colomer Simón,
Salva Ferré,
Esteban Moro,
Josep Perelló
Abstract:
The analysis of pedestrian GPS datasets is fundamental to further advance the study and the design of walkable cities. The highest resolution GPS data can characterize micro-mobility patterns and pedestrians' micro-motives in relation to a small-scale urban context. Purpose-based recurrent mobility data inside people's neighborhoods is an important source in these sorts of studies. However, micro-mobility data around people's homes is generally unavailable, and if data exists, it is generally not shareable, often due to privacy issues. Citizen science and its public involvement practices in scientific research are valid options to circumvent these challenges and provide meaningful datasets for walkable cities. The study presents GPS records from single-day home-to-school pedestrian mobility of 10 schools in the Barcelona Metropolitan area (Spain). The research provides pedestrian mobility from an age-homogeneous group of people. The study shares processed records with specific filtering, cleaning, and interpolation procedures that can facilitate and accelerate data usage. Citizen science practices during the whole research process are reported to offer a complete perspective of the data collected.
Submitted 23 May, 2023; v1 submitted 15 February, 2023;
originally announced February 2023.
-
Optimizing Floors in First Price Auctions: an Empirical Study of Yahoo Advertising
Authors:
Miguel Alcobendas,
Jonathan Ji,
Hemakumar Gokulakannan,
Dawit Wami,
Boris Kapchits,
Emilien Pouradier Duteil,
Korby Satow,
Maria Rosario Levy Roman,
Oriol Diaz,
Amado A. Diaz Jr.,
Rabi Kavoori
Abstract:
Floors (also known as reserve prices) help publishers to increase the expected revenue of their ad space, which is usually sold via auctions. Floors are defined as the minimum bid that a seller (it can be a publisher or an ad exchange) is willing to accept for the inventory opportunity. In this paper, we present a model to set floors in first price auctions, and discuss the impact of its implementation on Yahoo sites. The model captures important characteristics of the online advertising industry. For instance, some bidders impose restrictions on how ad exchanges can handle data from bidders, conditioning the model choice to set reserve prices. Our solution induces bidders to change their bidding behavior as a response to the floors enclosed in the bid request, helping online publishers to increase their ad revenue.
The outlined methodology has been implemented at Yahoo with remarkable results. The annualized incremental revenue is estimated at +1.3% on Yahoo display inventory, and +2.5% on video ad inventory. These are non-negligible numbers in the multi-million Yahoo ad business.
Submitted 9 February, 2024; v1 submitted 12 February, 2023;
originally announced February 2023.
-
medigan: a Python library of pretrained generative models for medical image synthesis
Authors:
Richard Osuala,
Grzegorz Skorupko,
Noussair Lazrak,
Lidia Garrucho,
Eloy García,
Smriti Joshi,
Socayna Jouide,
Michael Rutherford,
Fred Prior,
Kaisar Kushibar,
Oliver Diaz,
Karim Lekadir
Abstract:
Synthetic data generated by generative models can enhance the performance and capabilities of data-hungry deep learning models in medical imaging. However, there is (1) limited availability of (synthetic) datasets and (2) generative models are complex to train, which hinders their adoption in research and clinical applications. To reduce this entry barrier, we propose medigan, a one-stop shop for pretrained generative models implemented as an open-source framework-agnostic Python library. medigan allows researchers and developers to create, increase, and domain-adapt their training data in just a few lines of code. Guided by design decisions based on gathered end-user requirements, we implement medigan based on modular components for generative model (i) execution, (ii) visualisation, (iii) search & ranking, and (iv) contribution. The library's scalability and design is demonstrated by its growing number of integrated and readily-usable pretrained generative models consisting of 21 models utilising 9 different Generative Adversarial Network architectures trained on 11 datasets from 4 domains, namely, mammography, endoscopy, x-ray, and MRI. Furthermore, 3 applications of medigan are analysed in this work, which include (a) enabling community-wide sharing of restricted data, (b) investigating generative model evaluation metrics, and (c) improving clinical downstream tasks. In (b), extending on common medical image synthesis assessment and reporting standards, we show Fréchet Inception Distance variability based on image normalisation and radiology-specific feature extraction.
Submitted 23 February, 2023; v1 submitted 28 September, 2022;
originally announced September 2022.
-
High-resolution synthesis of high-density breast mammograms: Application to improved fairness in deep learning based mass detection
Authors:
Lidia Garrucho,
Kaisar Kushibar,
Richard Osuala,
Oliver Diaz,
Alessandro Catanese,
Javier del Riego,
Maciej Bobowicz,
Fredrik Strand,
Laura Igual,
Karim Lekadir
Abstract:
Computer-aided detection systems based on deep learning have shown good performance in breast cancer detection. However, high-density breasts show poorer detection performance since dense tissues can mask or even simulate masses. Therefore, the sensitivity of mammography for breast cancer detection can be reduced by more than 20% in dense breasts. Additionally, extremely dense cases reported an increased risk of cancer compared to low-density breasts. This study aims to improve the mass detection performance in high-density breasts using synthetic high-density full-field digital mammograms (FFDM) as data augmentation during breast mass detection model training. To this end, a total of five cycle-consistent GAN (CycleGAN) models using three FFDM datasets were trained for low-to-high-density image translation in high-resolution mammograms. The training images were split by breast density BI-RADS categories, with BI-RADS A being almost entirely fatty and BI-RADS D extremely dense breasts. Our results showed that the proposed data augmentation technique improved the sensitivity and precision of mass detection in models trained with small datasets and improved the domain generalization of the models trained with large databases. In addition, the clinical realism of the synthetic images was evaluated in a reader study involving two expert radiologists and one surgical oncologist.
Submitted 24 January, 2023; v1 submitted 20 September, 2022;
originally announced September 2022.
-
Étale extensions of polynomial rings are faithfully flat
Authors:
Lázaro O. Rodríguez Díaz
Abstract:
We apply Ohi's criterion for faithful flatness of extensions of commutative rings to prove that any étale extension $k[Y_1, \ldots, Y_n]\subseteq k[X_1, \ldots, X_n]$ of polynomial rings (each in $n$ indeterminates) over a commutative ring $k$ is faithfully flat. In particular, if $k$ is an algebraically closed field then any étale polynomial map $k^{n} \to k^{n}$ is surjective.
Submitted 29 February, 2024; v1 submitted 2 May, 2022;
originally announced May 2022.
-
Sharing Generative Models Instead of Private Data: A Simulation Study on Mammography Patch Classification
Authors:
Zuzanna Szafranowska,
Richard Osuala,
Bennet Breier,
Kaisar Kushibar,
Karim Lekadir,
Oliver Diaz
Abstract:
Early detection of breast cancer in mammography screening via deep-learning based computer-aided detection systems shows promising potential in improving the curability and mortality rates of breast cancer. However, many clinical centres are restricted in the amount and heterogeneity of available data to train such models to (i) achieve promising performance and to (ii) generalise well across acquisition protocols and domains. As sharing data between centres is restricted due to patient privacy concerns, we propose a potential solution: sharing trained generative models between centres as a substitute for real patient data. In this work, we use three well-known mammography datasets to simulate three different centres, where one centre receives the trained generator of Generative Adversarial Networks (GANs) from the two remaining centres in order to augment the size and heterogeneity of its training dataset. We evaluate the utility of this approach on mammography patch classification on the test set of the GAN-receiving centre using two different classification models, (a) a convolutional neural network and (b) a transformer neural network. Our experiments demonstrate that shared GANs notably increase the performance of both transformer and convolutional classification models and highlight this approach as a viable alternative to inter-centre data sharing.
Submitted 15 April, 2022; v1 submitted 8 March, 2022;
originally announced March 2022.
-
Domain generalization in deep learning-based mass detection in mammography: A large-scale multi-center study
Authors:
Lidia Garrucho,
Kaisar Kushibar,
Socayna Jouide,
Oliver Diaz,
Laura Igual,
Karim Lekadir
Abstract:
Computer-aided detection systems based on deep learning have shown great potential in breast cancer detection. However, the lack of domain generalization of artificial neural networks is an important obstacle to their deployment in changing clinical environments. In this work, we explore the domain generalization of deep learning methods for mass detection in digital mammography and analyze in-depth the sources of domain shift in a large-scale multi-center setting. To this end, we compare the performance of eight state-of-the-art detection methods, including Transformer-based models, trained in a single domain and tested in five unseen domains. Moreover, a single-source mass detection training pipeline is designed to improve the domain generalization without requiring images from the new domain. The results show that our workflow generalizes better than state-of-the-art transfer learning-based approaches in four out of five domains while reducing the domain shift caused by the different acquisition protocols and scanner manufacturers. Subsequently, an extensive analysis is performed to identify the covariate shifts with bigger effects on the detection performance, such as due to differences in patient age, breast density, mass size, and mass malignancy. Ultimately, this comprehensive study provides key insights and best practices for future research on domain generalization in deep learning-based breast cancer detection.
Submitted 24 January, 2023; v1 submitted 27 January, 2022;
originally announced January 2022.
-
Data synthesis and adversarial networks: A review and meta-analysis in cancer imaging
Authors:
Richard Osuala,
Kaisar Kushibar,
Lidia Garrucho,
Akis Linardos,
Zuzanna Szafranowska,
Stefan Klein,
Ben Glocker,
Oliver Diaz,
Karim Lekadir
Abstract:
Despite technological and medical advances, the detection, interpretation, and treatment of cancer based on imaging data continue to pose significant challenges. These include inter-observer variability, class imbalance, dataset shifts, inter- and intra-tumour heterogeneity, malignancy determination, and treatment effect uncertainty. Given the recent advancements in Generative Adversarial Networks (GANs), data synthesis, and adversarial training, we assess the potential of these technologies to address a number of key challenges of cancer imaging. We categorise these challenges into (a) data scarcity and imbalance, (b) data access and privacy, (c) data annotation and segmentation, (d) cancer detection and diagnosis, and (e) tumour profiling, treatment planning and monitoring. Based on our analysis of 164 publications that apply adversarial training techniques in the context of cancer imaging, we highlight multiple underexplored solutions with research potential. We further contribute the Synthesis Study Trustworthiness Test (SynTRUST), a meta-analysis framework for assessing the validation rigour of medical image synthesis studies. SynTRUST is based on 26 concrete measures of thoroughness, reproducibility, usefulness, scalability, and tenability. Based on SynTRUST, we analyse 16 of the most promising cancer imaging challenge solutions and observe a high validation rigour in general, but also several desirable improvements. With this work, we strive to bridge the gap between the needs of the clinical cancer imaging community and the current and prospective research on data synthesis and adversarial networks in the artificial intelligence community.
Submitted 27 November, 2022; v1 submitted 20 July, 2021;
originally announced July 2021.
-
Onboarding in Software Product Lines: ConceptMaps as Welcome Guides
Authors:
Maider Azanza,
Arantza Irastorza,
Raul Medeiros,
Oscar Díaz
Abstract:
With a volatile labour and technological market, onboarding is becoming increasingly important. The process of incorporating a new developer, a.k.a. the newcomer, into a software development team is reckoned to be lengthy, frustrating and expensive. Newcomers face personal, interpersonal, process and technical barriers during their incorporation, which, in turn, affects the overall productivity of the whole team. This problem is exacerbated for Software Product Lines (SPLs), where their size and variability combine to make onboarding even more challenging, even more so for developers that are transferred from the Application Engineering team into the Domain Engineering team, who will be our target newcomers. This work presents concept maps in the role of sensemaking scaffolds to help introduce these newcomers into the SPL domain. Concept maps, used as knowledge visualisation tools, have been proven to be helpful for meaningful learning. Our main insight is to capture concepts of the SPL domain and their interrelationships in a concept map, and then present them incrementally, helping newcomers grasp the SPL and aiding them in exploring it in a guided manner while avoiding information overload. This work's contributions are four-fold. First, concept maps are proposed as a representation to introduce newcomers into the SPL domain. Second, concept maps are presented as the means for a guided exploration of the SPL core assets. Third, a feature-driven concept map construction process is introduced. Last, the usefulness of concept maps as guides for SPL onboarding is tested through a formative evaluation.
Link to the online demo: https://rebrand.ly/wacline-cmap
Submitted 14 April, 2021; v1 submitted 5 March, 2021;
originally announced March 2021.
-
Deep learning reconstruction of digital breast tomosynthesis images for accurate breast density and patient-specific radiation dose estimation
Authors:
Jonas Teuwen,
Nikita Moriakov,
Christian Fedon,
Marco Caballo,
Ingrid Reiser,
Pedrag Bakic,
Eloy García,
Oliver Diaz,
Koen Michielsen,
Ioannis Sechopoulos
Abstract:
The two-dimensional nature of mammography makes estimation of the overall breast density challenging, and estimation of the true patient-specific radiation dose impossible. Digital breast tomosynthesis (DBT), a pseudo-3D technique, is now commonly used in breast cancer screening and diagnostics. Still, the severely limited 3rd dimension information in DBT has not been used, until now, to estimate the true breast density or the patient-specific dose. This study proposes a reconstruction algorithm for DBT based on deep learning specifically optimized for these tasks. The algorithm, which we name DBToR, is based on unrolling a proximal-dual optimization method. The proximal operators are replaced with convolutional neural networks and prior knowledge is included in the model. This extends previous work on a deep learning-based reconstruction model by providing both the primal and the dual blocks with breast thickness information, which is available in DBT. Training and testing of the model were performed using virtual patient phantoms from two different sources. Reconstruction performance, and accuracy in estimation of breast density and radiation dose, were estimated, showing high accuracy (density < ±3%; dose < ±20%) without bias, significantly improving on the current state-of-the-art. This work also lays the groundwork for developing a deep learning-based reconstruction algorithm for the task of image interpretation by radiologists.
Submitted 29 March, 2021; v1 submitted 11 June, 2020;
originally announced June 2020.
-
Quality analysis of DCGAN-generated mammography lesions
Authors:
Basel Alyafi,
Oliver Diaz,
Joan C Vilanova,
Javier del Riego,
Robert Marti
Abstract:
Medical image synthesis has gained a great focus recently, especially after the introduction of Generative Adversarial Networks (GANs). GANs have been used widely to provide anatomically-plausible and diverse samples for augmentation and other applications, including segmentation and super resolution. In our previous work, Deep Convolutional GANs were used to generate synthetic mammogram lesions, masses mainly, that could enhance the classification performance in imbalanced datasets. In this new work, a deeper investigation was carried out to explore other aspects of the evaluation of the generated images, i.e., realism, feature space distribution, and observer studies. t-Stochastic Neighbor Embedding (t-SNE) was used to reduce the dimensionality of real and fake images to enable 2D visualisations. Additionally, two expert radiologists performed a realism-evaluation study. Visualisations showed that the generated images have a feature distribution similar to that of the real ones, avoiding outliers. Moreover, the Receiver Operating Characteristic (ROC) curve showed that the radiologists could not, in many cases, distinguish between synthetic and real lesions, giving 48% and 61% accuracies in a balanced sample set.
Submitted 6 February, 2020; v1 submitted 28 November, 2019;
originally announced November 2019.
-
DCGANs for Realistic Breast Mass Augmentation in X-ray Mammography
Authors:
Basel Alyafi,
Oliver Diaz,
Robert Marti
Abstract:
Early detection of breast cancer has a major contribution to curability, and using mammographic images, this can be achieved non-invasively. Supervised deep learning, the dominant CADe tool currently, has played a great role in object detection in computer vision, but it suffers from a limiting property: the need for a large amount of labelled data. This becomes stricter when it comes to medical datasets which require high-cost and time-consuming annotations. Furthermore, medical datasets are usually imbalanced, a condition that often hinders classifier performance. The aim of this paper is to learn the distribution of the minority class to synthesise new samples in order to improve lesion detection in mammography. Deep Convolutional Generative Adversarial Networks (DCGANs) can efficiently generate breast masses. They are trained on increasing-size subsets of one mammographic dataset and used to generate diverse and realistic breast masses. The effect of including the generated images and/or applying horizontal and vertical flipping is tested in an environment where a 1:10 imbalanced dataset of masses and normal tissue patches is classified by a fully-convolutional network. A maximum improvement of ~0.09 in F1 score is reported by using DCGANs along with flipping augmentation over using the original images. We show that DCGANs can be used for synthesising photo-realistic breast mass patches with considerable diversity. It is demonstrated that appending synthetic images in this environment, along with flipping, outperforms the traditional augmentation method of flipping alone, offering faster improvements as a function of the training set size.
Submitted 4 September, 2019;
originally announced September 2019.
-
Light control through a nonlinear lensing effect in a colloid of biosynthesized Gold Nanoparticles
Authors:
A. Balbuena Ortega,
E. Brambilia,
V. López Gayou,
R. Delgado Macuil,
A. Orduña Diaz,
A. Zamilpa Alvarez,
A. V. Arzola,
K. Volke-Sepúlveda
Abstract:
Biosynthesis of four samples of colloidal suspensions of gold nanoparticles is achieved using hydroalcoholic extract and three different separated compounds of the plant Bacopa procumbens. The nonlinear optical properties of each sample are characterized with the Z-scan technique. In all cases, the Z-scan curves indicate a negative or self-defocusing response, which is mainly attributed to thermal effects. Among the four samples, the hydroalcoholic extract was noted to have the highest nonlinear optical response and was selected to demonstrate the formation of self-collimated beams (SCBs). Beams of this kind are obtained when a convergent CW laser, with only a few tens of milliwatts of optical power, is introduced into the sample and induces a negative-lens effect that shifts the focal spot forward. As a result, the otherwise highly focused beam propagates with little divergence over lengths of up to 10 mm. Moreover, an SCB is capable of controlling and steering a weak probe beam of a different wavelength, since the probe experiences the lensing induced by the pump. Notably, the response time of the material was found to be less than 0.07 s, which makes it a plausible candidate for photonic applications.
Submitted 21 August, 2018;
originally announced August 2018.
-
A note on Kirchhoff's theorem for almost complex spheres I
Authors:
Lázaro O. Rodríguez Díaz
Abstract:
By a theorem of Kirchhoff, if the six sphere admits an almost complex structure then the seven sphere is parallelizable; more crucially, he exhibited an explicit global frame constructed out of the given almost complex structure. This result implicitly equips the seven sphere with a definite H-space multiplication. We propose to address the existence problem of complex structures on the six sphere by studying the associated parallelism-multiplications on the seven sphere. We ask to what extent the integrability condition of the almost complex structure amounts to the constancy of the structure functions of the global frame defining the parallelism, i.e., if this parallelism comes from a Lie group structure. At a more fundamental level we inquire if the integrability condition of the almost complex structure entails the homotopy associativity of the induced multiplication. A positive answer to these questions would rule out the six sphere being a complex manifold, since the seven sphere is not a Lie group, not even a homotopy associative H-space.
Submitted 16 April, 2018;
originally announced April 2018.
-
The $SW(3/2,2)$ superconformal algebra via a Quantum Hamiltonian Reduction of $osp(3|2)$
Authors:
Lázaro O. Rodríguez Díaz
Abstract:
We prove that the family of non-linear $W$-algebras $SW(3/2,2)$ which are extensions of the $N=1$ superconformal algebra by a primary supercurrent of conformal weight $2$ can be realized as a quantum Hamiltonian reduction of the Lie superalgebra $osp(3|2)$. In consequence we obtain an explicit free field realization of the algebra in terms of the screening operators. At central charge $c=12$ the $SW(3/2,2)$ superconformal algebra corresponds to the superconformal algebra associated to sigma models based on eight-dimensional manifolds with special holonomy $Spin(7)$, i.e., the Shatashvili-Vafa $Spin(7)$ superconformal algebra.
Submitted 10 November, 2016;
originally announced November 2016.
-
$\rm G_2$ holonomy manifolds are superconformal
Authors:
Lázaro O. Rodríguez Díaz
Abstract:
We study the chiral de Rham complex (CDR) over a manifold $M$ with holonomy $\rm G_2$. We prove that the vertex algebra of global sections of the CDR associated to $M$ contains two commuting copies of the Shatashvili-Vafa $\rm G_2$ superconformal algebra. Our proof is a tour de force, based on explicit computations.
Submitted 30 June, 2016;
originally announced June 2016.
-
Gauge theory and G2-geometry on Calabi-Yau links
Authors:
Omegar Calvo-Andrade,
Lázaro O. Rodríguez Díaz,
Henrique N. Sá Earp
Abstract:
The $7$-dimensional link $K$ of a weighted homogeneous hypersurface on the round $9$-sphere in $\mathbb{C}^5$ has a nontrivial null Sasakian structure which is contact Calabi-Yau, in many cases. It admits a canonical co-closed $\rm G_2$-structure $\varphi$ induced by the Calabi-Yau $3$-orbifold basic geometry. We distinguish these pairs $(K,\varphi)$ by the Crowley-Nordström $\mathbb{Z}_{48}$-valued $ν$ invariant, for which we prove odd parity and provide an algorithmic formula. We describe moreover a natural Yang-Mills theory on such spaces, with many important features of the torsion-free case, such as a Chern-Simons formalism and topological energy bounds. In fact compatible $\rm G_2$-instantons on holomorphic Sasakian bundles over $K$ are exactly the transversely Hermitian Yang-Mills connections. As a proof of principle, we obtain $\rm G_2$-instantons over the Fermat quintic link from stable bundles over the smooth projective Fermat quintic, thus relating in a concrete example the Donaldson-Thomas theory of the quintic threefold with a conjectural $\rm G_2$-instanton count.
Submitted 2 October, 2018; v1 submitted 29 June, 2016;
originally announced June 2016.
-
Uniaxially stressed germanium with fundamental direct band gap
Authors:
R. Geiger,
T. Zabel,
E. Marin,
A. Gassenq,
J. -M. Hartmann,
J. Widiez,
J. Escalante,
K. Guilloy,
N. Pauc,
D. Rouchon,
G. Osvaldo Diaz,
S. Tardif,
F. Rieutord,
I. Duchemin,
Y. -M. Niquet,
V. Reboud,
V. Calvo,
A. Chelnokov,
J. Faist,
H. Sigg
Abstract:
We demonstrate the crossover from indirect- to direct band gap in tensile-strained germanium by temperature-dependent photoluminescence. The samples are strained microbridges that enhance a biaxial strain of 0.16% up to 3.6% uniaxial tensile strain. Cooling the bridges to 20 K increases the uniaxial strain up to a maximum of 5.4%. Temperature-dependent photoluminescence reveals the crossover to a fundamental direct band gap to occur between 4.0% and 4.5%. Our data are in good agreement with new theoretical computations that predict a strong bowing of the band parameters with strain.
Submitted 10 December, 2015;
originally announced March 2016.
-
Performance of eddy-viscosity turbulence models for predicting swirling pipe-flow: Simulations and laser-Doppler velocimetry
Authors:
Diego del Olmo Díaz,
Denis F. Hinz
Abstract:
We use laser-Doppler velocimetry (LDV) experiments and Reynolds-averaged Navier--Stokes (RANS) simulations to study the characteristic flow patterns downstream of a standardized clockwise swirl disturbance generator. After quantifying the impact of the mesh size, we evaluate the potential of various eddy-viscosity turbulence models in providing reasonable approximations with respect to the experimental reference. The choice of turbulent models reflects current industry practice. Our results suggest that models from the $k$-$ε$ family are more accurate in predicting swirling flows than models from the $k$-$ω$ family. For sufficiently resolved meshes, the realizable $k$-$ε$ model provides the most accurate approximation of the velocity magnitudes, although it fails to capture small-scale flow structures which are accurately predicted by the standard $k$-$ε$ model and the RNG $k$-$ε$ model. Throughout the article, we highlight practical guidance for the choice of RANS turbulence models for swirling flow.
Submitted 26 October, 2015; v1 submitted 16 July, 2015;
originally announced July 2015.
-
The Shatashvili-Vafa $G_{2}$ superconformal algebra as a Quantum Hamiltonian Reduction of $D(2,1;α)$
Authors:
Reimundo Heluani,
Lázaro O. Rodríguez Díaz
Abstract:
We obtain the superconformal algebra associated to a sigma model with target a manifold with $G_{2}$ holonomy, i.e., the Shatashvili-Vafa $G_{2}$ algebra as a quantum Hamiltonian reduction of the exceptional Lie superalgebra $D(2,1;α)$ for $α=1$. We produce the complete family of $W$-algebras $SW(\frac{3}{2},\frac{3}{2}, 2)$ (extensions of the $N=1$ superconformal algebra by two primary supercurrents of conformal weight $\frac{3}{2}$ and $2$ respectively) as a quantum Hamiltonian reduction of $D(2,1;α)$. As a corollary we find a free field realization of the Shatashvili-Vafa $G_{2}$ algebra, and an explicit description of the screening operators.
Submitted 18 June, 2014;
originally announced June 2014.