-
Brain tumor segmentation using synthetic MR images -- A comparison of GANs and diffusion models
Authors:
Muhammad Usman Akbar,
Måns Larsson,
Anders Eklund
Abstract:
Large annotated datasets are required for training deep learning models, but in medical imaging data sharing is often complicated due to ethics, anonymization and data protection legislation. Generative AI models, such as generative adversarial networks (GANs) and diffusion models, can today produce very realistic synthetic images, and can potentially facilitate data sharing. However, in order to share synthetic medical images it must first be demonstrated that they can be used for training different networks with acceptable performance. Here, we therefore comprehensively evaluate four GANs (progressive GAN, StyleGAN 1-3) and a diffusion model for the task of brain tumor segmentation (using two segmentation networks, U-Net and a Swin transformer). Our results show that segmentation networks trained on synthetic images reach Dice scores that are 80% to 90% of the Dice scores obtained when training with real images, but that memorization of the training images can be a problem for diffusion models if the original dataset is too small. Our conclusion is that sharing synthetic medical images is a viable alternative to sharing real images, but that further work is required. The trained generative models and the generated synthetic images are shared on the AIDA data hub.
Submitted 5 January, 2024; v1 submitted 5 June, 2023;
originally announced June 2023.
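The abstract above reports results as Dice scores. As a reminder of what that metric measures, here is a minimal sketch for binary masks. The function name `dice_score` and the convention of returning 1.0 for two empty masks are assumptions made for illustration; they are not taken from the paper, which may compute the score per class or per volume.

```python
import numpy as np

def dice_score(pred, target):
    """Dice coefficient 2|A ∩ B| / (|A| + |B|) for two binary masks."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    intersection = np.logical_and(pred, target).sum()
    denom = pred.sum() + target.sum()
    if denom == 0:
        # Assumed convention: two empty masks count as a perfect match.
        return 1.0
    return 2.0 * intersection / denom

# One overlapping pixel, mask sizes 2 and 1: 2 * 1 / (2 + 1)
score = dice_score([1, 1, 0, 0], [1, 0, 0, 0])
```

A relative figure such as "80% to 90% of the real-image Dice score" would then be the ratio of two such scores, one from a network trained on synthetic images and one from a network trained on real images.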
-
The LuViRA Dataset: Synchronized Vision, Radio, and Audio Sensors for Indoor Localization
Authors:
Ilayda Yaman,
Guoda Tian,
Martin Larsson,
Patrik Persson,
Michiel Sandra,
Alexander Dürr,
Erik Tegler,
Nikhil Challa,
Henrik Garde,
Fredrik Tufvesson,
Kalle Åström,
Ove Edfors,
Steffen Malkowsky,
Liang Liu
Abstract:
We present a synchronized multisensory dataset for accurate and robust indoor localization: the Lund University Vision, Radio, and Audio (LuViRA) Dataset. The dataset includes color images, corresponding depth maps, inertial measurement unit (IMU) readings, channel response between a 5G massive multiple-input and multiple-output (MIMO) testbed and user equipment, audio recorded by 12 microphones, and accurate six degrees of freedom (6DOF) pose ground truth of 0.5 mm. We synchronize these sensors to ensure that all data is recorded simultaneously. A camera, speaker, and transmit antenna are placed on top of a slowly moving service robot, and 89 trajectories are recorded. Each trajectory includes 20 to 50 seconds of recorded sensor data and ground truth labels. Data from different sensors can be used separately or jointly to perform localization tasks, and data from the motion capture (mocap) system is used to verify the results obtained by the localization algorithms. The main aim of this dataset is to enable research on sensor fusion with the most commonly used sensors for localization tasks. Moreover, the full dataset or some parts of it can also be used for other research areas such as channel estimation, image classification, etc. Our dataset is available at: https://github.com/ilaydayaman/LuViRA_Dataset
Submitted 26 April, 2024; v1 submitted 10 February, 2023;
originally announced February 2023.
-
Does an ensemble of GANs lead to better performance when training segmentation networks with synthetic images?
Authors:
Måns Larsson,
Muhammad Usman Akbar,
Anders Eklund
Abstract:
Large annotated datasets are required to train segmentation networks. In medical imaging, it is often difficult, time consuming and expensive to create such datasets, and it may also be difficult to share these datasets with other researchers. Different AI models can today generate very realistic synthetic images, which can potentially be openly shared as they do not belong to specific persons. However, recent work has shown that using synthetic images for training deep networks often leads to worse performance compared to using real images. Here we demonstrate that using synthetic images and annotations from an ensemble of 20 GANs, instead of from a single GAN, increases the Dice score on real test images by 4.7% to 14.0% on specific classes.
Submitted 12 March, 2023; v1 submitted 8 November, 2022;
originally announced November 2022.
-
A composite generalization of Ville's martingale theorem
Authors:
Johannes Ruf,
Martin Larsson,
Wouter M. Koolen,
Aaditya Ramdas
Abstract:
We provide a composite version of Ville's theorem that an event has zero measure if and only if there exists a nonnegative martingale which explodes to infinity when that event occurs. This is a classic result connecting measure-theoretic probability to the sequence-by-sequence game-theoretic probability, recently developed by Shafer and Vovk. Our extension of Ville's result involves appropriate composite generalizations of nonnegative martingales and measure-zero events: these are respectively provided by ``e-processes'', and a new inverse capital outer measure. We then develop a novel line-crossing inequality for sums of random variables which are only required to have a finite first moment, which we use to prove a composite version of the strong law of large numbers (SLLN). This allows us to show that violation of the SLLN is an event of outer measure zero and that our e-process explodes to infinity on every such violating sequence, while this is provably not achievable with a nonnegative (super)martingale.
Submitted 3 May, 2023; v1 submitted 8 March, 2022;
originally announced March 2022.
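For orientation, the classical (non-composite) result that the abstract above generalizes can be written out as follows. The notation ($P$ for the probability measure, $A$ for the event, $(M_n)$ for the martingale, $\alpha$ for a level) is chosen here for illustration and is not taken from the paper. The second display is the closely related Ville inequality, included as a standard companion fact.

```latex
% Ville's theorem (classical, single measure P):
\[
  P(A) = 0
  \iff
  \exists\, \text{a nonnegative $P$-martingale } (M_n)_{n \ge 0}
  \text{ with } M_0 = 1
  \text{ and } \lim_{n \to \infty} M_n = \infty \text{ on } A .
\]

% Ville's inequality, for any nonnegative $P$-supermartingale with M_0 = 1
% and any \alpha \in (0, 1):
\[
  P\Bigl( \sup_{n \ge 0} M_n \ge \tfrac{1}{\alpha} \Bigr) \le \alpha .
\]
```

According to the abstract, the composite version replaces the single measure $P$ by a family of measures, the martingale by an e-process, and the measure-zero event by an event of zero inverse capital outer measure.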
-
Back to the Feature: Learning Robust Camera Localization from Pixels to Pose
Authors:
Paul-Edouard Sarlin,
Ajaykumar Unagar,
Måns Larsson,
Hugo Germain,
Carl Toft,
Viktor Larsson,
Marc Pollefeys,
Vincent Lepetit,
Lars Hammarstrand,
Fredrik Kahl,
Torsten Sattler
Abstract:
Camera pose estimation in known scenes is a 3D geometry task recently tackled by multiple learning algorithms. Many regress precise geometric quantities, like poses or 3D points, from an input image. This either fails to generalize to new viewpoints or ties the model parameters to a specific scene. In this paper, we go Back to the Feature: we argue that deep networks should focus on learning robust and invariant visual features, while the geometric estimation should be left to principled algorithms. We introduce PixLoc, a scene-agnostic neural network that estimates an accurate 6-DoF pose from an image and a 3D model. Our approach is based on the direct alignment of multiscale deep features, casting camera localization as metric learning. PixLoc learns strong data priors by end-to-end training from pixels to pose and exhibits exceptional generalization to new scenes by separating model parameters and scene geometry. The system can localize in large environments given coarse pose priors but also improve the accuracy of sparse feature matching by jointly refining keypoints and poses with little overhead. The code will be publicly available at https://github.com/cvg/pixloc.
Submitted 7 April, 2021; v1 submitted 16 March, 2021;
originally announced March 2021.
-
Testing exchangeability: fork-convexity, supermartingales, and e-processes
Authors:
Aaditya Ramdas,
Johannes Ruf,
Martin Larsson,
Wouter Koolen
Abstract:
Suppose we observe an infinite series of coin flips $X_1,X_2,\ldots$, and wish to sequentially test the null that these binary random variables are exchangeable. Nonnegative supermartingales (NSMs) are a workhorse of sequential inference, but we prove that they are powerless for this problem. First, utilizing a geometric concept called fork-convexity (a sequential analog of convexity), we show that any process that is an NSM under a set of distributions, is also necessarily an NSM under their "fork-convex hull". Second, we demonstrate that the fork-convex hull of the exchangeable null consists of all possible laws over binary sequences; this implies that any NSM under exchangeability is necessarily nonincreasing, hence always yields a powerless test for any alternative. Since testing arbitrary deviations from exchangeability is information theoretically impossible, we focus on Markovian alternatives. We combine ideas from universal inference and the method of mixtures to derive a "safe e-process", which is a nonnegative process with expectation at most one under the null at any stopping time, and is upper bounded by a martingale, but is not itself an NSM. This in turn yields a level $α$ sequential test that is consistent; regret bounds from universal coding also demonstrate rate-optimal power. We present ways to extend these results to any finite alphabet and to Markovian alternatives of any order using a "double mixture" approach. We provide an array of simulations, and give general approaches based on betting for unstructured or ill-specified alternatives. Finally, inspired by Shafer, Vovk, and Ville, we provide game-theoretic interpretations of our e-processes and pathwise results.
Submitted 23 July, 2021; v1 submitted 31 January, 2021;
originally announced February 2021.
-
Model-based Automated Testing of Mobile Applications: An Industrial Case Study
Authors:
Stefan Karlsson,
Adnan Čaušević,
Daniel Sundmark,
Mårten Larsson
Abstract:
Automatic testing of mobile applications has been a well-researched area in recent years. However, testing in industry is still a very manual practice, as research results have not been fully transferred and adopted. Considering mobile applications, manual testing has the additional burden of adequate testing posed by a large number of available devices and different configurations, as well as the maintenance and setup of such devices.
In this paper, we propose and evaluate the use of a model-based test generation approach, where generated tests are executed on a set of cloud-hosted real mobile devices. By using a model-based approach we generate dynamic, less brittle, and simple-to-implement test cases. The test execution on multiple real devices with different configurations increases the confidence in the implementation of the system under test. Our evaluation shows that the approach produces a high coverage of the parts of the application related to user interactions. Nevertheless, the inclusion of external services in test generation is required in order to further increase the coverage of the complete application. Furthermore, we present the lessons learned while transferring and implementing this approach in an industrial context and applying it to a real product.
Submitted 20 August, 2020;
originally announced August 2020.
-
Towards an Electronic Health Record System in Vietnam: A Core Readiness Assessment
Authors:
Stefan Hochwarter,
Do Duy Cuong,
Nguyen Thi Kim Chuc,
Mattias Larsson
Abstract:
Previous studies have shown that health information technologies have a positive impact on health systems. Electronic health record (EHR) systems are one of the most promising applications, demonstrating a positive effect in high income countries. On the other hand, robust evidence for low and middle income countries is still sparse. The aim of this study is to initiate a carefully planned nationwide EHR system in Vietnam by assessing the core readiness. The assessment structure is mainly based on previous research, which recommends a readiness assessment prior to an EHR system implementation. To collect data, participant observation, document analysis and an in-depth interview were used. This study has revealed new insights into the current situation on EHR in Vietnam. The Ministry of Health is currently working on improving the conditions for future implementation of a Vietnamese EHR system. There are issues with the current way of handling health records. These issues are encouraging the Ministry of Health to work on identifying the next steps for an EHR system implementation. The integration of an EHR system with current systems seems to be challenging as most systems are commercial, closed source and very likely have no standardised interface. In conclusion, this study identifies points which need to be further investigated prior to an implementation. Generally, health care workers show good awareness of new technologies. As Vietnam's health care system is centrally organised, there is the possibility for a nationwide implementation. This could have a positive impact on the health care system; however, besides rigorous planning, standards also need to be followed and common interfaces implemented. Finally, this assessment has focused on only one level of readiness assessment. Further research is needed to complete the assessment.
Submitted 28 May, 2020;
originally announced May 2020.
-
Spectral Characterization of functional MRI data on voxel-resolution cortical graphs
Authors:
Hamid Behjat,
Martin Larsson
Abstract:
The human cortical layer exhibits a convoluted morphology that is unique to each individual. Conventional volumetric fMRI processing schemes take for granted the rich information provided by the underlying anatomy. We present a method to study fMRI data on subject-specific cerebral hemisphere cortex (CHC) graphs, which encode the cortical morphology at the resolution of voxels in 3-D. We study graph spectral energy metrics associated to fMRI data of 100 subjects from the Human Connectome Project database, across seven tasks. Experimental results signify the strength of CHC graphs' Laplacian eigenvector bases in capturing subtle spatial patterns specific to different functional loads as well as experimental conditions within each task.
Submitted 10 May, 2020; v1 submitted 21 October, 2019;
originally announced October 2019.
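The abstract above refers to graph spectral energy metrics computed in a Laplacian eigenvector basis. A minimal sketch of that general idea follows. It assumes the combinatorial Laplacian L = D - A and a tiny hand-made path graph. The function name `graph_spectral_energy`, the toy graph and the toy signal are all hypothetical. The abstract does not specify how the CHC graphs are built, which Laplacian variant is used, or how the energy metrics are defined.

```python
import numpy as np

def graph_spectral_energy(adjacency, signal):
    """Energy of a graph signal along each Laplacian eigenvector.

    Assumes the combinatorial Laplacian L = D - A of an undirected graph.
    Returns eigenvalues in ascending order and the squared graph Fourier
    coefficients of the signal.
    """
    A = np.asarray(adjacency, dtype=float)
    f = np.asarray(signal, dtype=float)
    L = np.diag(A.sum(axis=1)) - A
    # eigh is appropriate because L is symmetric; eigenvectors are orthonormal.
    eigvals, eigvecs = np.linalg.eigh(L)
    coeffs = eigvecs.T @ f  # graph Fourier transform of the signal
    return eigvals, coeffs ** 2

# Toy example: a 4-node path graph standing in for a (much larger) voxel graph.
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]])
f = np.array([1.0, 2.0, 0.5, -1.0])  # hypothetical signal values per node
eigvals, energy = graph_spectral_energy(A, f)
```

Because the eigenvectors form an orthonormal basis, the per-eigenvector energies sum to the squared norm of the signal, so they can be read as a distribution of signal energy over graph frequencies.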
-
Fine-Grained Segmentation Networks: Self-Supervised Segmentation for Improved Long-Term Visual Localization
Authors:
Måns Larsson,
Erik Stenborg,
Carl Toft,
Lars Hammarstrand,
Torsten Sattler,
Fredrik Kahl
Abstract:
Long-term visual localization is the problem of estimating the camera pose of a given query image in a scene whose appearance changes over time. It is an important problem in practice, for example, encountered in autonomous driving. In order to gain robustness to such changes, long-term localization approaches often use semantic segmentations as an invariant scene representation, as the semantic meaning of each scene part should not be affected by seasonal and other changes. However, these representations are typically not very discriminative due to the limited number of available classes. In this paper, we propose a new neural network, the Fine-Grained Segmentation Network (FGSN), that can be used to provide image segmentations with a larger number of labels and can be trained in a self-supervised fashion. In addition, we show how FGSNs can be trained to output consistent labels across seasonal changes. We demonstrate through extensive experiments that integrating the fine-grained segmentations produced by our FGSNs into existing localization algorithms leads to substantial improvements in localization performance.
Submitted 18 August, 2019;
originally announced August 2019.
-
Boundary Objects and their Use in Agile Systems Engineering
Authors:
Rebekka Wohlrab,
Patrizio Pelliccione,
Eric Knauss,
Mats Larsson
Abstract:
Agile methods are increasingly introduced in automotive companies in the attempt to become more efficient and flexible in the system development. The adoption of agile practices influences communication between stakeholders, but also makes companies rethink the management of artifacts and documentation like requirements, safety compliance documents, and architecture models. Practitioners aim to reduce irrelevant documentation, but face a lack of guidance to determine what artifacts are needed and how they should be managed. This paper presents artifacts, challenges, guidelines, and practices for the continuous management of systems engineering artifacts in automotive based on a theoretical and empirical understanding of the topic. In collaboration with 53 practitioners from six automotive companies, we conducted a design-science study involving interviews, a questionnaire, focus groups, and practical data analysis of a systems engineering tool. The guidelines suggest the distinction between artifacts that are shared among different actors in a company (boundary objects) and those that are used within a team (locally relevant artifacts). We propose an analysis approach to identify boundary objects and three practices to manage systems engineering artifacts in industry.
Submitted 27 April, 2019;
originally announced April 2019.
-
A Cross-Season Correspondence Dataset for Robust Semantic Segmentation
Authors:
Måns Larsson,
Erik Stenborg,
Lars Hammarstrand,
Torsten Sattler,
Marc Pollefeys,
Fredrik Kahl
Abstract:
In this paper, we present a method to utilize 2D-2D point matches between images taken during different image conditions to train a convolutional neural network for semantic segmentation. Enforcing label consistency across the matches makes the final segmentation algorithm robust to seasonal changes. We describe how these 2D-2D matches can be generated with little human interaction by geometrically matching points from 3D models built from images. Two cross-season correspondence datasets are created providing 2D-2D matches across seasonal changes as well as from day to night. The datasets are made publicly available to facilitate further research. We show that adding the correspondences as extra supervision during training improves the segmentation performance of the convolutional neural network, making it more robust to seasonal changes and weather conditions.
Submitted 16 August, 2019; v1 submitted 16 March, 2019;
originally announced March 2019.
-
Disentangled Representations for Manipulation of Sentiment in Text
Authors:
Maria Larsson,
Amanda Nilsson,
Mikael Kågebäck
Abstract:
The ability to change arbitrary aspects of a text while leaving the core message intact could have a strong impact in fields like marketing and politics by enabling e.g. automatic optimization of message impact and personalized language adapted to the receiver's profile. In this paper we take a first step towards such a system by presenting an algorithm that can manipulate the sentiment of a text while preserving its semantics using disentangled representations. Validation is performed by examining trajectories in embedding space and analyzing transformed sentences for semantic preservation and for expression of the desired sentiment shift.
Submitted 22 December, 2017;
originally announced December 2017.
-
A Projected Gradient Descent Method for CRF Inference allowing End-To-End Training of Arbitrary Pairwise Potentials
Authors:
Måns Larsson,
Anurag Arnab,
Fredrik Kahl,
Shuai Zheng,
Philip Torr
Abstract:
Are we using the right potential functions in the Conditional Random Field models that are popular in the Vision community? Semantic segmentation and other pixel-level labelling tasks have made significant progress recently due to the deep learning paradigm. However, most state-of-the-art structured prediction methods also include a random field model with a hand-crafted Gaussian potential to model spatial priors, label consistencies and feature-based image conditioning.
In this paper, we challenge this view by developing a new inference and learning framework which can learn pairwise CRF potentials restricted only by their dependence on the image pixel values and the size of the support. Both standard spatial and high-dimensional bilateral kernels are considered. Our framework is based on the observation that CRF inference can be achieved via projected gradient descent and consequently, can easily be integrated in deep neural networks to allow for end-to-end training. It is empirically demonstrated that such learned potentials can improve segmentation accuracy and that certain label class interactions are indeed better modelled by a non-Gaussian potential. In addition, we compare our inference method to the commonly used mean-field algorithm. Our framework is evaluated on several public benchmarks for semantic segmentation with improved performance compared to previous state-of-the-art CNN+CRF models.
Submitted 2 January, 2018; v1 submitted 24 January, 2017;
originally announced January 2017.
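The abstract above rests on the observation that CRF inference can be carried out by projected gradient descent. As a rough illustration of that generic technique, the sketch below takes one gradient step on a single pixel's label distribution and projects the result back onto the probability simplex. This is not the paper's implementation: the sort-based Euclidean projection is a standard algorithm swapped in here, and the names `project_to_simplex` and `pgd_step`, the step size and the toy unary costs are all hypothetical.

```python
import numpy as np

def project_to_simplex(v):
    """Euclidean projection of a vector onto the probability simplex."""
    v = np.asarray(v, dtype=float)
    u = np.sort(v)[::-1]                  # entries in descending order
    css = np.cumsum(u) - 1.0              # cumulative sums minus target mass
    k = np.arange(1, v.size + 1)
    rho = np.nonzero(u - css / k > 0)[0][-1]
    tau = css[rho] / (rho + 1)            # shift that makes the result sum to 1
    return np.maximum(v - tau, 0.0)

def pgd_step(q, grad, step_size):
    """One projected gradient step on a label distribution q."""
    return project_to_simplex(np.asarray(q, dtype=float) - step_size * np.asarray(grad, dtype=float))

# Toy example: two labels, uniform start, energy E(q) = <unary, q>,
# so the gradient is simply the unary cost vector (hypothetical values).
unary = np.array([0.0, 1.0])
q_next = pgd_step([0.5, 0.5], unary, step_size=1.0)
```

Since each step consists of a gradient update followed by a simple projection, iterations of this kind can be unrolled as network layers, which is consistent with the abstract's point about end-to-end training. In a full CRF the gradient would also include the pairwise terms, which this single-pixel sketch omits.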