Search | arXiv e-print repository

BaSeNet: A Learning-based Mobile Manipulator Base Pose Sequence Planning for Pickup Tasks

Authors: Lakshadeep Naik, Sinan Kalkan, Sune L. Sørensen, Mikkel B. Kjærgaard, Norbert Krüger

Abstract: In many applications, a mobile manipulator robot is required to grasp a set of objects distributed in space. This may not be feasible from a single base pose and the robot must plan the sequence of base poses for grasping all objects, minimizing the total navigation and grasping time. This is a Combinatorial Optimization problem that can be solved using exact methods, which provide optimal solutions but are computationally expensive, or approximate methods, which offer computationally efficient but sub-optimal solutions. Recent studies have shown that learning-based methods can solve Combinatorial Optimization problems, providing near-optimal and computationally efficient solutions. In this work, we present BASENET - a learning-based approach to plan the sequence of base poses for the robot to grasp all the objects in the scene. We propose a Reinforcement Learning based solution that learns the base poses for grasping individual objects and the sequence in which the objects should be grasped to minimize the total navigation and grasping costs using Layered Learning. As the problem has a varying number of states and actions, we represent states and actions as a graph and use Graph Neural Networks for learning. We show that the proposed method can produce comparable solutions to exact and approximate methods with significantly less computation time.

Submitted 12 June, 2024; originally announced June 2024.

Comments: Submitted to IROS 2024

arXiv:2403.10874 [pdf, other]

Robotic Task Success Evaluation Under Multi-modal Non-Parametric Object Pose Uncertainty

Authors: Lakshadeep Naik, Thorbjørn Mosekjær Iversen, Aljaz Kramberger, Norbert Krüger

Abstract: Accurate 6D object pose estimation is essential for various robotic tasks. Uncertain pose estimates can lead to task failures; however, a certain degree of error in the pose estimates is often acceptable. Hence, by quantifying errors in the object pose estimate and acceptable errors for task success, robots can make informed decisions. This is a challenging problem as both the object pose uncertainty and acceptable error for the robotic task are often multi-modal and cannot be parameterized with commonly used uni-modal distributions. In this paper, we introduce a framework for evaluating robotic task success under object pose uncertainty, representing both the estimated error space of the object pose and the acceptable error space for task success using multi-modal non-parametric probability distributions. The proposed framework pre-computes the acceptable error space for task success using dynamic simulations and subsequently integrates the pre-computed acceptable error space over the estimated error space of the object pose to predict the likelihood of the task success. We evaluated the proposed framework on two mobile manipulation tasks. Our results show that by representing the estimated and the acceptable error space using multi-modal non-parametric distributions, we achieve higher task success rates and fewer failures.

Submitted 16 March, 2024; originally announced March 2024.

Comments: Submitted to IROS 2024
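
The integration step described in the abstract above (weighing an acceptable error space against an estimated error space) can be illustrated with a minimal Monte Carlo sketch. This is not the authors' implementation: the one-dimensional error, the bimodal sample generator, the 0.02 threshold, and the names `success_likelihood` and `is_acceptable` are all assumptions chosen for illustration. In the paper the acceptable error space is pre-computed from dynamic simulations rather than given by a fixed threshold.

```python
import random


def success_likelihood(pose_error_samples, is_acceptable):
    """Estimate P(task success) as the fraction of sampled pose errors
    that fall inside the acceptable error space."""
    if not pose_error_samples:
        raise ValueError("need at least one pose error sample")
    hits = sum(1 for error in pose_error_samples if is_acceptable(error))
    return hits / len(pose_error_samples)


# Hypothetical estimated error space: a bimodal, sample-based distribution
# over a single translation error (in metres). 70% of the mass sits near 0,
# 30% near 0.04.
rng = random.Random(0)
samples = [
    rng.gauss(0.0, 0.005) if rng.random() < 0.7 else rng.gauss(0.04, 0.005)
    for _ in range(10_000)
]


def is_acceptable(error):
    """Hypothetical acceptable error space: the task is assumed to succeed
    when the absolute error is at most 0.02 m."""
    return abs(error) <= 0.02


likelihood = success_likelihood(samples, is_acceptable)
print(f"estimated task success likelihood: {likelihood:.3f}")
```

Because both spaces are handled through samples and a membership test, neither has to be uni-modal or parametric, which is the property the abstract emphasizes.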

arXiv:2308.05563 [pdf]

Recent Advancements In The Field Of Deepfake Detection

Authors: Natalie Krueger, Dr. Mounika Vanamala, Dr. Rushit Dave

Abstract: A deepfake is a photo or video of a person whose image has been digitally altered or partially replaced with an image of someone else. Deepfakes have the potential to cause a variety of problems and are often used maliciously. A common usage is altering videos of prominent political figures and celebrities. These deepfakes can portray them making offensive, problematic, and/or untrue statements. Current deepfakes can be very realistic, and when used in this way, can spread panic and even influence elections and political opinions. There are many deepfake detection strategies currently in use but finding the most comprehensive and universal method is critical. So, in this survey we will address the problems of malicious deepfake creation and the lack of universal deepfake detection methods. Our objective is to survey and analyze a variety of current methods and advances in the field of deepfake detection.

Submitted 10 August, 2023; originally announced August 2023.

arXiv:2304.14504 [pdf]

Hybrid Deepfake Detection Utilizing MLP and LSTM

Authors: Jacob Mallet, Natalie Krueger, Mounika Vanamala, Rushit Dave

Abstract: The growing reliance of society on social media for authentic information has done nothing but increase over the past years. This has only raised the potential consequences of the spread of misinformation. One of the growing methods in popularity is to deceive users using a deepfake. A deepfake is an invention that has come with the latest technological advancements, which enables nefarious online users to replace their face with a computer generated, synthetic face of numerous powerful members of society. Deepfake images and videos now provide the means to mimic important political and cultural figures to spread massive amounts of false information. Models that can detect these deepfakes to prevent the spread of misinformation are now of tremendous necessity. In this paper, we propose a new deepfake detection schema utilizing two deep learning algorithms: long short term memory and multilayer perceptron. We evaluate our model using a publicly available dataset named 140k Real and Fake Faces to detect images altered by a deepfake with accuracies achieved as high as 74.7%.

Submitted 21 April, 2023; originally announced April 2023.

Comments: 5 Pages

arXiv:2301.03982 [pdf, other]

doi 10.1145/3572848.3577436

Exploring the Use of WebAssembly in HPC

Authors: Mohak Chadha, Nils Krueger, Jophin John, Anshul Jindal, Michael Gerndt, Shajulin Benedict

Abstract: Containerization approaches based on namespaces offered by the Linux kernel have seen an increasing popularity in the HPC community both as a means to isolate applications and as a format to package and distribute them. However, their adoption and usage in HPC systems faces several challenges. These include difficulties in unprivileged running and building of scientific application container images directly on HPC resources, increasing heterogeneity of HPC architectures, and access to specialized networking libraries available only on HPC systems. These challenges of container-based HPC application development closely align with the several advantages that a new universal intermediate binary format called WebAssembly (Wasm) has to offer. These include a lightweight userspace isolation mechanism and portability across operating systems and processor architectures. In this paper, we explore the usage of Wasm as a distribution format for MPI-based HPC applications. To this end, we present MPIWasm, a novel Wasm embedder for MPI-based HPC applications that enables high-performance execution of Wasm code, has low-overhead for MPI calls, and supports high-performance networking interconnects present on HPC systems. We evaluate the performance and overhead of MPIWasm on a production HPC system and AWS Graviton2 nodes using standardized HPC benchmarks. Results from our experiments demonstrate that MPIWasm delivers competitive native application performance across all scenarios. Moreover, we observe that Wasm binaries are 139.5x smaller on average as compared to the statically-linked binaries for the different standardized benchmarks.

Submitted 10 January, 2023; originally announced January 2023.

Comments: ACM SIGPLAN PPoPP 2023

arXiv:2011.01696 [pdf, ps, other]

Towards Automated Anamnesis Summarization: BERT-based Models for Symptom Extraction

Authors: Anton Schäfer, Nils Blach, Oliver Rausch, Maximilian Warm, Nils Krüger

Abstract: Professionals in modern healthcare systems are increasingly burdened by documentation workloads. Documentation of the initial patient anamnesis is particularly relevant, forming the basis of successful further diagnostic measures. However, manually prepared notes are inherently unstructured and often incomplete. In this paper, we investigate the potential of modern NLP techniques to support doctors in this matter. We present a dataset of German patient monologues, and formulate a well-defined information extraction task under the constraints of real-world utility and practicality. In addition, we propose BERT-based models in order to solve said task. We can demonstrate promising performance of the models in both symptom identification and symptom attribute extraction, significantly outperforming simpler baselines.

Submitted 3 November, 2020; originally announced November 2020.

Comments: Machine Learning for Health (ML4H) at NeurIPS 2020 - Extended Abstract

arXiv:2007.10938 [pdf, other]

doi 10.1111/tgis.12710

Reference study of CityGML software support: the GeoBIM benchmark 2019 -- Part II

Authors: Francesca Noardo, Ken Arroyo Ohori, Filip Biljecki, Claire Ellul, Lars Harrie, Thomas Krijnen, Helen Eriksson, Jordi van Liempt, Maria Pla, Antonio Ruiz, Dean Hintz, Nina Krueger, Cristina Leoni, Leire Leoz, Diana Moraru, Stelios Vitalis, Philipp Willkomm, Jantien Stoter

Abstract: OGC CityGML is an open standard for 3D city models intended to foster interoperability and support various applications. However, through our practical experience and discussions with practitioners, we have noticed several problems related to the implementation of the standard and the use of standardized data. Nevertheless, a systematic investigation of these issues has never been performed, and there is thus insufficient evidence that can be used for tackling the problems. The GeoBIM benchmark project is aimed at finding such evidence by involving external volunteers, reporting on tools behaviour about relevant aspects (geometry, semantics, georeferencing, functionalities), analysed and described in this paper. This study explicitly pointed out the critical points embedded in the format as an evidence base for future development. This paper is in tandem with Part I, describing the results of the benchmark related to IFC, counterpart of CityGML within building information modelling.

Submitted 7 January, 2021; v1 submitted 21 July, 2020; originally announced July 2020.

Comments: preprint of the paper

Journal ref: Transactions in GIS, ISSN: 1361-1682, 2020

arXiv:1708.06966 [pdf, other]

doi 10.1109/CVPR.2014.266

In search of inliers: 3d correspondence by local and global voting

Authors: Anders Glent Buch, Yang Yang, Norbert Krüger, Henrik Gordon Petersen

Abstract: We present a method for finding correspondence between 3D models. From an initial set of feature correspondences, our method uses a fast voting scheme to separate the inliers from the outliers. The novelty of our method lies in the use of a combination of local and global constraints to determine if a vote should be cast. On a local scale, we use simple, low-level geometric invariants. On a global scale, we apply covariant constraints for finding compatible correspondences. We guide the sampling for collecting voters by downward dependencies on previous voting stages. All of this together results in an accurate matching procedure. We evaluate our algorithm by controlled and comparative testing on different datasets, giving superior performance compared to state of the art methods. In a final experiment, we apply our method for 3D object detection, showing potential use of our method within higher-level vision.

Submitted 23 August, 2017; originally announced August 2017.

Journal ref: 2014 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)

arXiv:1708.06963 [pdf, other]

doi 10.1109/ICRA.2013.6630856

Pose Estimation using Local Structure-Specific Shape and Appearance Context

Authors: Anders Glent Buch, Dirk Kraft, Joni-Kristian Kamarainen, Henrik Gordon Petersen, Norbert Krüger

Abstract: We address the problem of estimating the alignment pose between two models using structure-specific local descriptors. Our descriptors are generated using a combination of 2D image data and 3D contextual shape data, resulting in a set of semi-local descriptors containing rich appearance and shape information for both edge and texture structures. This is achieved by defining feature space relations which describe the neighborhood of a descriptor. By quantitative evaluations, we show that our descriptors provide high discriminative power compared to state of the art approaches. In addition, we show how to utilize this for the estimation of the alignment pose between two point sets. We present experiments both in controlled and real-life scenarios to validate our approach.

Submitted 23 August, 2017; originally announced August 2017.

Journal ref: 2013 IEEE International Conference on Robotics and Automation (ICRA)

Showing 1–9 of 9 results for author: Krueger, N