-
BaSeNet: A Learning-based Mobile Manipulator Base Pose Sequence Planning for Pickup Tasks
Authors:
Lakshadeep Naik,
Sinan Kalkan,
Sune L. Sørensen,
Mikkel B. Kjærgaard,
Norbert Krüger
Abstract:
In many applications, a mobile manipulator robot is required to grasp a set of objects distributed in space. This may not be feasible from a single base pose and the robot must plan the sequence of base poses for gras** all objects, minimizing the total navigation and gras** time. This is a Combinatorial Optimization problem that can be solved using exact methods, which provide optimal solutio…
▽ More
In many applications, a mobile manipulator robot is required to grasp a set of objects distributed in space. This may not be feasible from a single base pose and the robot must plan the sequence of base poses for gras** all objects, minimizing the total navigation and gras** time. This is a Combinatorial Optimization problem that can be solved using exact methods, which provide optimal solutions but are computationally expensive, or approximate methods, which offer computationally efficient but sub-optimal solutions. Recent studies have shown that learning-based methods can solve Combinatorial Optimization problems, providing near-optimal and computationally efficient solutions.
In this work, we present BASENET - a learning-based approach to plan the sequence of base poses for the robot to grasp all the objects in the scene. We propose a Reinforcement Learning based solution that learns the base poses for gras** individual objects and the sequence in which the objects should be grasped to minimize the total navigation and gras** costs using Layered Learning. As the problem has a varying number of states and actions, we represent states and actions as a graph and use Graph Neural Networks for learning. We show that the proposed method can produce comparable solutions to exact and approximate methods with significantly less computation time.
△ Less
Submitted 12 June, 2024;
originally announced June 2024.
-
Encoding Binary Events from Continuous Time Series in Rooted Trees using Contrastive Learning
Authors:
Tobias Engelhardt Rasmussen,
Siv Sørensen
Abstract:
Broadband infrastructure owners do not always know how their customers are connected in the local networks, which are structured as rooted trees. A recent study is able to infer the topology of a local network using discrete time series data from the leaves of the tree (customers). In this study we propose a contrastive approach for learning a binary event encoder from continuous time series data.…
▽ More
Broadband infrastructure owners do not always know how their customers are connected in the local networks, which are structured as rooted trees. A recent study is able to infer the topology of a local network using discrete time series data from the leaves of the tree (customers). In this study we propose a contrastive approach for learning a binary event encoder from continuous time series data. As a preliminary result, we show that our approach has some potential in learning a valuable encoder.
△ Less
Submitted 2 January, 2024;
originally announced January 2024.
-
Learning to Taste: A Multimodal Wine Dataset
Authors:
Thoranna Bender,
Simon Moe Sørensen,
Alireza Kashani,
K. Eldjarn Hjorleifsson,
Grethe Hyldig,
Søren Hauberg,
Serge Belongie,
Frederik Warburg
Abstract:
We present WineSensed, a large multimodal wine dataset for studying the relations between visual perception, language, and flavor. The dataset encompasses 897k images of wine labels and 824k reviews of wines curated from the Vivino platform. It has over 350k unique bottlings, annotated with year, region, rating, alcohol percentage, price, and grape composition. We obtained fine-grained flavor anno…
▽ More
We present WineSensed, a large multimodal wine dataset for studying the relations between visual perception, language, and flavor. The dataset encompasses 897k images of wine labels and 824k reviews of wines curated from the Vivino platform. It has over 350k unique bottlings, annotated with year, region, rating, alcohol percentage, price, and grape composition. We obtained fine-grained flavor annotations on a subset by conducting a wine-tasting experiment with 256 participants who were asked to rank wines based on their similarity in flavor, resulting in more than 5k pairwise flavor distances. We propose a low-dimensional concept embedding algorithm that combines human experience with automatic machine similarity kernels. We demonstrate that this shared concept embedding space improves upon separate embedding spaces for coarse flavor classification (alcohol percentage, country, grape, price, rating) and aligns with the intricate human perception of flavor.
△ Less
Submitted 15 January, 2024; v1 submitted 31 August, 2023;
originally announced August 2023.
-
Medical needle tip tracking based on Optical Imaging and AI
Authors:
Zhuoqi Cheng,
Simon Lyck Bjært Sørensen,
Mikkel Werge Olsen,
René Lynge Eriksen,
Thiusius Rajeeth Savarimuthu
Abstract:
Deep needle insertion to a target often poses a huge challenge, requiring a combination of specialized skills, assistive technology, and extensive training. One of the frequently encountered medical scenarios demanding such expertise includes the needle insertion into a femoral vessel in the groin. After the access to the femoral vessel, various medical procedures, such as cardiac catheterization…
▽ More
Deep needle insertion to a target often poses a huge challenge, requiring a combination of specialized skills, assistive technology, and extensive training. One of the frequently encountered medical scenarios demanding such expertise includes the needle insertion into a femoral vessel in the groin. After the access to the femoral vessel, various medical procedures, such as cardiac catheterization and extracorporeal membrane oxygenation (ECMO) can be performed. However, even with the aid of Ultrasound imaging, achieving successful insertion can necessitate multiple attempts due to the complexities of anatomy and tissue deformation. To address this challenge, this paper presents an innovative technology for needle tip real-time tracking, aiming for enhanced needle insertion guidance. Specifically, our approach revolves around the creation of scattering imaging using an optical fiber-equipped needle, and uses Convolutional Neural Network (CNN) based algorithms to enable real-time estimation of the needle tip's position and orientation during insertion procedures. The efficacy of the proposed technology was rigorously evaluated through three experiments. The first two experiments involved rubber and bacon phantoms to simulate groin anatomy. The positional errors averaging 2.3+1.5mm and 2.0+1.2mm, and the orientation errors averaging 0.2+0.11rad and 0.16+0.1rad. Furthermore, the system's capabilities were validated through experiments conducted on fresh porcine phantom mimicking more complex anatomical structures, yielding positional accuracy results of 3.2+3.1mm and orientational accuracy of 0.19+0.1rad. Given the average femoral arterial radius of 4 to 5mm, the proposed system is demonstrated with a great potential for precise needle guidance in femoral artery insertion procedures. In addition, the findings highlight the broader potential applications of the system in the medical field.
△ Less
Submitted 28 September, 2023; v1 submitted 28 August, 2023;
originally announced August 2023.
-
Generative time series models using Neural ODE in Variational Autoencoders
Authors:
M. L. Garsdal,
V. Søgaard,
S. M. Sørensen
Abstract:
In this paper, we implement Neural Ordinary Differential Equations in a Variational Autoencoder setting for generative time series modeling. An object-oriented approach to the code was taken to allow for easier development and research and all code used in the paper can be found here: https://github.com/simonmoesorensen/neural-ode-project
The results were initially recreated and the reconstructi…
▽ More
In this paper, we implement Neural Ordinary Differential Equations in a Variational Autoencoder setting for generative time series modeling. An object-oriented approach to the code was taken to allow for easier development and research and all code used in the paper can be found here: https://github.com/simonmoesorensen/neural-ode-project
The results were initially recreated and the reconstructions compared to a baseline Long-Short Term Memory AutoEncoder. The model was then extended with a LSTM encoder and challenged by more complex data consisting of time series in the form of spring oscillations. The model showed promise, and was able to reconstruct true trajectories for all complexities of data with a smaller RMSE than the baseline model. However, it was able to capture the dynamic behavior of the time series for known data in the decoder but was not able to produce extrapolations following the true trajectory very well for any of the complexities of spring data. A final experiment was carried out where the model was also presented with 68 days of solar power production data, and was able to reconstruct just as well as the baseline, even when very little data is available.
Finally, the models training time was compared to the baseline. It was found that for small amounts of data the NODE method was significantly slower at training than the baseline, while for larger amounts of data the NODE method would be equal or faster at training.
The paper is ended with a future work section which describes the many natural extensions to the work presented in this paper, with examples being investigating further the importance of input data, including extrapolation in the baseline model or testing more specific model setups.
△ Less
Submitted 12 January, 2022;
originally announced January 2022.
-
Bridging the gap between prostate radiology and pathology through machine learning
Authors:
Indrani Bhattacharya,
David S. Lim,
Han Lin Aung,
Xingchen Liu,
Arun Seetharaman,
Christian A. Kunder,
Wei Shao,
Simon J. C. Soerensen,
Richard E. Fan,
Pejman Ghanouni,
Katherine J. To'o,
James D. Brooks,
Geoffrey A. Sonn,
Mirabela Rusu
Abstract:
Prostate cancer is the second deadliest cancer for American men. While Magnetic Resonance Imaging (MRI) is increasingly used to guide targeted biopsies for prostate cancer diagnosis, its utility remains limited due to high rates of false positives and false negatives as well as low inter-reader agreements. Machine learning methods to detect and localize cancer on prostate MRI can help standardize…
▽ More
Prostate cancer is the second deadliest cancer for American men. While Magnetic Resonance Imaging (MRI) is increasingly used to guide targeted biopsies for prostate cancer diagnosis, its utility remains limited due to high rates of false positives and false negatives as well as low inter-reader agreements. Machine learning methods to detect and localize cancer on prostate MRI can help standardize radiologist interpretations. However, existing machine learning methods vary not only in model architecture, but also in the ground truth labeling strategies used for model training. In this study, we compare different labeling strategies, namely, pathology-confirmed radiologist labels, pathologist labels on whole-mount histopathology images, and lesion-level and pixel-level digital pathologist labels (previously validated deep learning algorithm on histopathology images to predict pixel-level Gleason patterns) on whole-mount histopathology images. We analyse the effects these labels have on the performance of the trained machine learning models. Our experiments show that (1) radiologist labels and models trained with them can miss cancers, or underestimate cancer extent, (2) digital pathologist labels and models trained with them have high concordance with pathologist labels, and (3) models trained with digital pathologist labels achieve the best performance in prostate cancer detection in two different cohorts with different disease distributions, irrespective of the model architecture used. Digital pathologist labels can reduce challenges associated with human annotations, including labor, time, inter- and intra-reader variability, and can help bridge the gap between prostate radiology and pathology by enabling the training of reliable machine learning models to detect and localize prostate cancer on MRI.
△ Less
Submitted 3 December, 2021;
originally announced December 2021.
-
Instance Segmentation of Microscopic Foraminifera
Authors:
Thomas Haugland Johansen,
Steffen Aagaard Sørensen,
Kajsa Møllersen,
Fred Godtliebsen
Abstract:
Foraminifera are single-celled marine organisms that construct shells that remain as fossils in the marine sediments. Classifying and counting these fossils are important in e.g. paleo-oceanographic and -climatological research. However, the identification and counting process has been performed manually since the 1800s and is laborious and time-consuming. In this work, we present a deep learning-…
▽ More
Foraminifera are single-celled marine organisms that construct shells that remain as fossils in the marine sediments. Classifying and counting these fossils are important in e.g. paleo-oceanographic and -climatological research. However, the identification and counting process has been performed manually since the 1800s and is laborious and time-consuming. In this work, we present a deep learning-based instance segmentation model for classifying, detecting, and segmenting microscopic foraminifera. Our model is based on the Mask R-CNN architecture, using model weight parameters that have learned on the COCO detection dataset. We use a fine-tuning approach to adapt the parameters on a novel object detection dataset of more than 7000 microscopic foraminifera and sediment grains. The model achieves a (COCO-style) average precision of $0.78 \pm 0.00$ on the classification and detection task, and $0.80 \pm 0.00$ on the segmentation task. When the model is evaluated without challenging sediment grain images, the average precision for both tasks increases to $0.84 \pm 0.00$ and $0.86 \pm 0.00$, respectively. Prediction results are analyzed both quantitatively and qualitatively and discussed. Based on our findings we propose several directions for future work, and conclude that our proposed model is an important step towards automating the identification and counting of microscopic foraminifera.
△ Less
Submitted 15 May, 2021;
originally announced May 2021.
-
Fitbeat: COVID-19 Estimation based on Wristband Heart Rate
Authors:
Shuo Liu,
**g Han,
Estela Laporta Puyal,
Spyridon Kontaxis,
Shaoxiong Sun,
Patrick Locatelli,
Judith Dineley,
Florian B. Pokorny,
Gloria Dalla Costa,
Letizia Leocan,
Ana Isabel Guerrero,
Carlos Nos,
Ana Zabalza,
Per Soelberg Sørensen,
Mathias Buron,
Melinda Magyari,
Yatharth Ranjan,
Zulqarnain Rashid,
Pauline Conde,
Callum Stewart,
Amos A Folarin,
Richard JB Dobson,
Raquel Bailón,
Srinivasan Vairavan,
Nicholas Cummins
, et al. (4 additional authors not shown)
Abstract:
This study investigates the potential of deep learning methods to identify individuals with suspected COVID-19 infection using remotely collected heart-rate data. The study utilises data from the ongoing EU IMI RADAR-CNS research project that is investigating the feasibility of wearable devices and smart phones to monitor individuals with multiple sclerosis (MS), depression or epilepsy. Aspart of…
▽ More
This study investigates the potential of deep learning methods to identify individuals with suspected COVID-19 infection using remotely collected heart-rate data. The study utilises data from the ongoing EU IMI RADAR-CNS research project that is investigating the feasibility of wearable devices and smart phones to monitor individuals with multiple sclerosis (MS), depression or epilepsy. Aspart of the project protocol, heart-rate data was collected from participants using a Fitbit wristband. The presence of COVID-19 in the cohort in this work was either confirmed through a positive swab test, or inferred through the self-reporting of a combination of symptoms including fever, respiratory symptoms, loss of smell or taste, tiredness and gastrointestinal symptoms. Experimental results indicate that our proposed contrastive convolutional auto-encoder (contrastive CAE), i. e., a combined architecture of an auto-encoder and contrastive loss, outperforms a conventional convolutional neural network (CNN), as well as a convolutional auto-encoder (CAE) without using contrastive loss. Our final contrastive CAE achieves 95.3% unweighted average recall, 86.4% precision, anF1 measure of 88.2%, a sensitivity of 100% and a specificity of 90.6% on a testset of 19 participants with MS who reported symptoms of COVID-19. Each of these participants was paired with a participant with MS with no COVID-19 symptoms.
△ Less
Submitted 19 April, 2021;
originally announced April 2021.
-
A reproduction of Apple's bi-directional LSTM models for language identification in short strings
Authors:
Mads Toftrup,
Søren Asger Sørensen,
Manuel R. Ciosici,
Ira Assent
Abstract:
Language Identification is the task of identifying a document's language. For applications like automatic spell checker selection, language identification must use very short strings such as text message fragments. In this work, we reproduce a language identification architecture that Apple briefly sketched in a blog post. We confirm the bi-LSTM model's performance and find that it outperforms cur…
▽ More
Language Identification is the task of identifying a document's language. For applications like automatic spell checker selection, language identification must use very short strings such as text message fragments. In this work, we reproduce a language identification architecture that Apple briefly sketched in a blog post. We confirm the bi-LSTM model's performance and find that it outperforms current open-source language identifiers. We further find that its language identification mistakes are due to confusion between related languages.
△ Less
Submitted 11 February, 2021;
originally announced February 2021.
-
CorrSigNet: Learning CORRelated Prostate Cancer SIGnatures from Radiology and Pathology Images for Improved Computer Aided Diagnosis
Authors:
Indrani Bhattacharya,
Arun Seetharaman,
Wei Shao,
Rewa Sood,
Christian A. Kunder,
Richard E. Fan,
Simon John Christoph Soerensen,
Jeffrey B. Wang,
Pejman Ghanouni,
Nikola C. Teslovich,
James D. Brooks,
Geoffrey A. Sonn,
Mirabela Rusu
Abstract:
Magnetic Resonance Imaging (MRI) is widely used for screening and staging prostate cancer. However, many prostate cancers have subtle features which are not easily identifiable on MRI, resulting in missed diagnoses and alarming variability in radiologist interpretation. Machine learning models have been developed in an effort to improve cancer identification, but current models localize cancer usi…
▽ More
Magnetic Resonance Imaging (MRI) is widely used for screening and staging prostate cancer. However, many prostate cancers have subtle features which are not easily identifiable on MRI, resulting in missed diagnoses and alarming variability in radiologist interpretation. Machine learning models have been developed in an effort to improve cancer identification, but current models localize cancer using MRI-derived features, while failing to consider the disease pathology characteristics observed on resected tissue. In this paper, we propose CorrSigNet, an automated two-step model that localizes prostate cancer on MRI by capturing the pathology features of cancer. First, the model learns MRI signatures of cancer that are correlated with corresponding histopathology features using Common Representation Learning. Second, the model uses the learned correlated MRI features to train a Convolutional Neural Network to localize prostate cancer. The histopathology images are used only in the first step to learn the correlated features. Once learned, these correlated features can be extracted from MRI of new patients (without histopathology or surgery) to localize cancer. We trained and validated our framework on a unique dataset of 75 patients with 806 slices who underwent MRI followed by prostatectomy surgery. We tested our method on an independent test set of 20 prostatectomy patients (139 slices, 24 cancerous lesions, 1.12M pixels) and achieved a per-pixel sensitivity of 0.81, specificity of 0.71, AUC of 0.86 and a per-lesion AUC of $0.96 \pm 0.07$, outperforming the current state-of-the-art accuracy in predicting prostate cancer using MRI.
△ Less
Submitted 31 July, 2020;
originally announced August 2020.
-
Using smartphones and wearable devices to monitor behavioural changes during COVID-19
Authors:
Shaoxiong Sun,
Amos Folarin,
Yatharth Ranjan,
Zulqarnain Rashid,
Pauline Conde,
Callum Stewart,
Nicholas Cummins,
Faith Matcham,
Gloria Dalla Costa,
Sara Simblett,
Letizia Leocani,
Per Soelberg Sørensen,
Mathias Buron,
Ana Isabel Guerrero,
Ana Zabalza,
Brenda WJH Penninx,
Femke Lamers,
Sara Siddi,
Josep Maria Haro,
Inez Myin-Germeys,
Aki Rintala,
Til Wykes,
Vaibhav A. Narayan,
Giancarlo Comi,
Matthew Hotopf
, et al. (1 additional authors not shown)
Abstract:
We aimed to explore the utility of the recently developed open-source mobile health platform RADAR-base as a toolbox to rapidly test the effect and response to NPIs aimed at limiting the spread of COVID-19. We analysed data extracted from smartphone and wearable devices and managed by the RADAR-base from 1062 participants recruited in Italy, Spain, Denmark, the UK, and the Netherlands. We derived…
▽ More
We aimed to explore the utility of the recently developed open-source mobile health platform RADAR-base as a toolbox to rapidly test the effect and response to NPIs aimed at limiting the spread of COVID-19. We analysed data extracted from smartphone and wearable devices and managed by the RADAR-base from 1062 participants recruited in Italy, Spain, Denmark, the UK, and the Netherlands. We derived nine features on a daily basis including time spent at home, maximum distance travelled from home, maximum number of Bluetooth-enabled nearby devices (as a proxy for physical distancing), step count, average heart rate, sleep duration, bedtime, phone unlock duration, and social app use duration. We performed Kruskal-Wallis tests followed by post-hoc Dunns tests to assess differences in these features among baseline, pre-, and during-lockdown periods. We also studied behavioural differences by age, gender, body mass index (BMI), and educational background. We were able to quantify expected changes in time spent at home, distance travelled, and the number of nearby Bluetooth-enabled devices between pre- and during-lockdown periods. We saw reduced sociality as measured through mobility features, and increased virtual sociality through phone usage. People were more active on their phones, spending more time using social media apps, particularly around major news events. Furthermore, participants had lower heart rate, went to bed later, and slept more. We also found that young people had longer homestay than older people during lockdown and fewer daily steps. Although there was no significant difference between the high and low BMI groups in time spent at home, the low BMI group walked more. RADAR-base can be used to rapidly quantify and provide a holistic view of behavioural changes in response to public health interventions as a result of infectious outbreaks such as COVID-19.
△ Less
Submitted 22 July, 2020; v1 submitted 29 April, 2020;
originally announced April 2020.
-
Towards detection and classification of microscopic foraminifera using transfer learning
Authors:
Thomas Haugland Johansen,
Steffen Aagaard Sørensen
Abstract:
Foraminifera are single-celled marine organisms, which may have a planktic or benthic lifestyle. During their life cycle they construct shells consisting of one or more chambers, and these shells remain as fossils in marine sediments. Classifying and counting these fossils have become an important tool in e.g. oceanography and climatology. Currently the process of identifying and counting microfos…
▽ More
Foraminifera are single-celled marine organisms, which may have a planktic or benthic lifestyle. During their life cycle they construct shells consisting of one or more chambers, and these shells remain as fossils in marine sediments. Classifying and counting these fossils have become an important tool in e.g. oceanography and climatology. Currently the process of identifying and counting microfossils is performed manually using a microscope and is very time consuming. Develo** methods to automate this process is therefore considered important across a range of research fields. The first steps towards develo** a deep learning model that can detect and classify microscopic foraminifera are proposed. The proposed model is based on a VGG16 model that has been pretrained on the ImageNet dataset, and adapted to the foraminifera task using transfer learning. Additionally, a novel image dataset consisting of microscopic foraminifera and sediments from the Barents Sea region is introduced.
△ Less
Submitted 14 January, 2020;
originally announced January 2020.
-
Learning Dense Stereo Matching for Digital Surface Models from Satellite Imagery
Authors:
Wayne Treible,
Scott Sorensen,
Andrew D. Gilliam,
Chandra Kambhamettu,
Joseph L. Mundy
Abstract:
Digital Surface Model generation from satellite imagery is a difficult task that has been largely overlooked by the deep learning community. Stereo reconstruction techniques developed for terrestrial systems including self driving cars do not translate well to satellite imagery where image pairs vary considerably. In this work we present neural network tailored for Digital Surface Model generation…
▽ More
Digital Surface Model generation from satellite imagery is a difficult task that has been largely overlooked by the deep learning community. Stereo reconstruction techniques developed for terrestrial systems including self driving cars do not translate well to satellite imagery where image pairs vary considerably. In this work we present neural network tailored for Digital Surface Model generation, a ground truthing and training scheme which maximizes available hardware, and we present a comparison to existing methods. The resulting models are smooth, preserve boundaries, and enable further processing. This represents one of the first attempts at leveraging deep learning in this domain.
△ Less
Submitted 11 December, 2018; v1 submitted 8 November, 2018;
originally announced November 2018.
-
Robust Shape Registration using Fuzzy Correspondences
Authors:
Abhishek Kolagunda,
Scott Sorensen,
Philip Saponaro,
Wayne Treible,
Chandra Kambhamettu
Abstract:
Shape registration is the process of aligning one 3D model to another. Most previous methods to align shapes with no known correspondences attempt to solve for both the transformation and correspondences iteratively. We present a shape registration approach that solves for the transformation using fuzzy correspondences to maximize the overlap between the given shape and the target shape. A coarse…
▽ More
Shape registration is the process of aligning one 3D model to another. Most previous methods to align shapes with no known correspondences attempt to solve for both the transformation and correspondences iteratively. We present a shape registration approach that solves for the transformation using fuzzy correspondences to maximize the overlap between the given shape and the target shape. A coarse to fine approach with Levenberg-Marquardt method is used for optimization. Real and synthetic experiments show our approach is robust and outperforms other state of the art methods when point clouds are noisy, sparse, and have non-uniform density. Experiments show our method is more robust to initialization and can handle larger scale changes and rotation than other methods. We also show that the approach can be used for 2D-3D alignment via ray-point alignment.
△ Less
Submitted 18 February, 2017;
originally announced February 2017.
-
Towards Automatic Migration of ROS Components from Software to Hardware
Authors:
Anders Blaabjerg Lange,
Ulrik Pagh Schultz,
Anders Stengaard Soerensen
Abstract:
The use of the ROS middleware is a growing trend in robotics in general, ROS and hard real-time embedded systems have however not been easily uniteable while retaining the same overall communication and processing methodology at all levels. In this paper we present an approach aimed at tackling the schism between high-level, flexible software and low-level, real-time software. The key idea of our…
▽ More
The use of the ROS middleware is a growing trend in robotics in general, ROS and hard real-time embedded systems have however not been easily uniteable while retaining the same overall communication and processing methodology at all levels. In this paper we present an approach aimed at tackling the schism between high-level, flexible software and low-level, real-time software. The key idea of our approach is to enable software components written for a high-level publish-subscribe software architecture to be automatically migrated to a dedicated hardware architecture implemented using programmable logic. Our approach is based on the Unity framework, a unified software/hardware framework based on FPGAs for quickly interfacing high-level software to low-level robotics hardware.
△ Less
Submitted 28 July, 2014;
originally announced July 2014.