Showing 1–20 of 20 results for author: Paulus, D

Searching in archive cs.
  1. arXiv:2403.05530  [pdf, other]

    cs.CL cs.AI

    Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context

    Authors: Gemini Team, Petko Georgiev, Ving Ian Lei, Ryan Burnell, Libin Bai, Anmol Gulati, Garrett Tanzer, Damien Vincent, Zhufeng Pan, Shibo Wang, Soroosh Mariooryad, Yifan Ding, Xinyang Geng, Fred Alcober, Roy Frostig, Mark Omernick, Lexi Walker, Cosmin Paduraru, Christina Sorokin, Andrea Tacchetti, Colin Gaffney, Samira Daruki, Olcan Sercinoglu, Zach Gleicher, Juliette Love, et al. (1092 additional authors not shown)

    Abstract: In this report, we introduce the Gemini 1.5 family of models, representing the next generation of highly compute-efficient multimodal models capable of recalling and reasoning over fine-grained information from millions of tokens of context, including multiple long documents and hours of video and audio. The family includes two new models: (1) an updated Gemini 1.5 Pro, which exceeds the February…

    Submitted 14 June, 2024; v1 submitted 8 March, 2024; originally announced March 2024.

  2. arXiv:2402.01462  [pdf, other]

    cs.CV

    3D Vertebrae Measurements: Assessing Vertebral Dimensions in Human Spine Mesh Models Using Local Anatomical Vertebral Axes

    Authors: Ivanna Kramer, Vinzent Rittel, Lara Blomenkamp, Sabine Bauer, Dietrich Paulus

    Abstract: Vertebral morphological measurements are important across various disciplines, including spinal biomechanics and clinical applications, pre- and post-operatively. These measurements also play a crucial role in anthropological longitudinal studies, where spinal metrics are repeatedly documented over extended periods. Traditionally, such measurements have been manually conducted, a process that is t…

    Submitted 2 February, 2024; originally announced February 2024.

  3. arXiv:2401.07370  [pdf, other]

    cs.CV

    Generation of Synthetic Images for Pedestrian Detection Using a Sequence of GANs

    Authors: Viktor Seib, Malte Roosen, Ida Germann, Stefan Wirtz, Dietrich Paulus

    Abstract: Creating annotated datasets demands a substantial amount of manual effort. In this proof-of-concept work, we address this issue by proposing a novel image generation pipeline. The pipeline consists of three distinct generative adversarial networks (previously published), combined in a novel way to augment a dataset for pedestrian detection. Despite the fact that the generated images are not always…

    Submitted 14 January, 2024; originally announced January 2024.

  4. arXiv:2312.11805  [pdf, other]

    cs.CL cs.AI cs.CV

    Gemini: A Family of Highly Capable Multimodal Models

    Authors: Gemini Team, Rohan Anil, Sebastian Borgeaud, Jean-Baptiste Alayrac, Jiahui Yu, Radu Soricut, Johan Schalkwyk, Andrew M. Dai, Anja Hauth, Katie Millican, David Silver, Melvin Johnson, Ioannis Antonoglou, Julian Schrittwieser, Amelia Glaese, Jilin Chen, Emily Pitler, Timothy Lillicrap, Angeliki Lazaridou, Orhan Firat, James Molloy, Michael Isard, Paul R. Barham, Tom Hennigan, Benjamin Lee, et al. (1325 additional authors not shown)

    Abstract: This report introduces a new family of multimodal models, Gemini, that exhibit remarkable capabilities across image, audio, video, and text understanding. The Gemini family consists of Ultra, Pro, and Nano sizes, suitable for applications ranging from complex reasoning tasks to on-device memory-constrained use-cases. Evaluation on a broad range of benchmarks shows that our most-capable Gemini Ultr…

    Submitted 17 June, 2024; v1 submitted 18 December, 2023; originally announced December 2023.

  5. arXiv:2201.03508  [pdf, other]

    cs.CY

    On the interplay of data and cognitive bias in crisis information management -- An exploratory study on epidemic response

    Authors: David Paulus, Ramian Fathi, Frank Fiedrich, Bartel Van de Walle, Tina Comes

    Abstract: Humanitarian crises, such as the 2014 West Africa Ebola epidemic, challenge information management and thereby threaten the digital resilience of the responding organizations. Crisis information management (CIM) is characterised by the urgency to respond despite the uncertainty of the situation. Coupled with high stakes, limited resources and a high cognitive load, crises are prone to induce biase…

    Submitted 10 January, 2022; originally announced January 2022.

    Comments: 44 pages, 6 figures, 8 tables

  6. arXiv:2110.06766  [pdf, other]

    cs.RO cs.LG

    Next-Best-View Estimation based on Deep Reinforcement Learning for Active Object Classification

    Authors: Christian Korbach, Markus D. Solbach, Raphael Memmesheimer, Dietrich Paulus, John K. Tsotsos

    Abstract: The presentation and analysis of image data from a single viewpoint are often not sufficient to solve a task. Several viewpoints are necessary to obtain more information. The next-best-view problem attempts to find the optimal viewpoint with the greatest information gain for the underlying task. In this work, a robot arm holds an object in its end-effector and searches for a sequence of next-best-…

    Submitted 14 October, 2021; v1 submitted 13 October, 2021; originally announced October 2021.

    Comments: 9 pages, 11 figures, 4 tables, preprint, Github repo: https://github.com/ckorbach/nbv_rl

  7. arXiv:2109.12946  [pdf, other]

    cs.CV

    Fusion-GCN: Multimodal Action Recognition using Graph Convolutional Networks

    Authors: Michael Duhme, Raphael Memmesheimer, Dietrich Paulus

    Abstract: In this paper, we present Fusion-GCN, an approach for multimodal action recognition using Graph Convolutional Networks (GCNs). Action recognition methods based around GCNs recently yielded state-of-the-art performance for skeleton-based action recognition. With Fusion-GCN, we propose to integrate various sensor data modalities into a graph that is trained using a GCN model for multi-modal action r…

    Submitted 27 September, 2021; originally announced September 2021.

    Comments: 18 pages, 6 figures, 3 tables, GCPR 2021

  8. arXiv:2012.13823  [pdf, other]

    cs.CV cs.AI cs.RO

    Skeleton-DML: Deep Metric Learning for Skeleton-Based One-Shot Action Recognition

    Authors: Raphael Memmesheimer, Simon Häring, Nick Theisen, Dietrich Paulus

    Abstract: One-shot action recognition allows the recognition of human-performed actions with only a single training example. This can influence human-robot-interaction positively by enabling the robot to react to previously unseen behaviour. We formulate the one-shot action recognition problem as a deep metric learning problem and propose a novel image-based skeleton representation that performs well in a m…

    Submitted 8 March, 2021; v1 submitted 26 December, 2020; originally announced December 2020.

    Comments: 8 pages, 8 figures, 4 tables

  9. arXiv:2004.11085  [pdf, other]

    cs.CV

    SL-DML: Signal Level Deep Metric Learning for Multimodal One-Shot Action Recognition

    Authors: Raphael Memmesheimer, Nick Theisen, Dietrich Paulus

    Abstract: Recognizing an activity with a single reference sample using metric learning approaches is a promising research field. The majority of few-shot methods focus on object recognition or face-identification. We propose a metric learning approach to reduce the action recognition problem to a nearest neighbor search in embedding space. We encode signals into images and extract features using a deep resi…

    Submitted 19 October, 2020; v1 submitted 23 April, 2020; originally announced April 2020.

    Comments: 8 pages, 6 figures, 7 tables

  10. arXiv:2003.06156  [pdf, other]

    cs.CV cs.RO

    Gimme Signals: Discriminative signal encoding for multimodal activity recognition

    Authors: Raphael Memmesheimer, Nick Theisen, Dietrich Paulus

    Abstract: We present a simple, yet effective and flexible method for action recognition supporting multiple sensor modalities. Multivariate signal sequences are encoded in an image and are then classified using a recently proposed EfficientNet CNN architecture. Our focus was to find an approach that generalizes well across different sensor modalities without specific adaptions while still achieving good res…

    Submitted 9 April, 2020; v1 submitted 13 March, 2020; originally announced March 2020.

    Comments: 8 pages, 4 figures, 4 tables

  11. arXiv:1911.04787  [pdf]

    cs.LG cs.HC stat.ML

    Effects of data ambiguity and cognitive biases on the interpretability of machine learning models in humanitarian decision making

    Authors: David Paulus, Gerdien de Vries, Bartel Van de Walle

    Abstract: The effectiveness of machine learning algorithms depends on the quality and amount of data and the operationalization and interpretation by the human analyst. In humanitarian response, data is often lacking or overburdening, thus ambiguous, and the time-scarce, volatile, insecure environments of humanitarian activities are likely to inflict cognitive biases. This paper proposes to research the eff…

    Submitted 12 November, 2019; originally announced November 2019.

    Comments: 3 pager, 1 figure, AAAI Fall Symposium - AI for Social Good, November 7-9, 2019, Arlington, USA

  12. arXiv:1906.12171  [pdf, other]

    cs.CV cs.LG cs.RO

    Gesture Recognition in RGB Videos Using Human Body Keypoints and Dynamic Time Warping

    Authors: Pascal Schneider, Raphael Memmesheimer, Ivanna Kramer, Dietrich Paulus

    Abstract: Gesture recognition opens up new ways for humans to intuitively interact with machines. Especially for service robots, gestures can be a valuable addition to the means of communication to, for example, draw the robot's attention to someone or something. Extracting a gesture from video data and classifying it is a challenging task and a variety of approaches have been proposed throughout the years.…

    Submitted 25 June, 2019; originally announced June 2019.

    Comments: 13 pages, 4 figures, 2 tables, RoboCup 2019 Symposium

  13. arXiv:1905.06002  [pdf, other]

    cs.LG cs.RO stat.ML

    Simitate: A Hybrid Imitation Learning Benchmark

    Authors: Raphael Memmesheimer, Ivanna Mykhalchyshyna, Viktor Seib, Dietrich Paulus

    Abstract: We present Simitate --- a hybrid benchmarking suite targeting the evaluation of approaches for imitation learning. A dataset containing 1938 sequences where humans perform daily activities in a realistic environment is presented. The dataset is strongly coupled with an integration into a simulator. RGB and depth streams with a resolution of 960$\times$540 at 30Hz and accurate ground truth…

    Submitted 15 May, 2019; originally announced May 2019.

    Comments: 6 figures, 2 tables, submitted to IROS 2019

  14. arXiv:1905.05642  [pdf, other]

    cs.RO

    Scratchy: A Lightweight Modular Autonomous Robot for Robotic Competitions

    Authors: Raphael Memmesheimer, Isabelle Kuhlmann, Mark Mints, Patrik Schmidt, Christian Korbach, Ida Germann, Dietrich Paulus

    Abstract: We present Scratchy---a modular, lightweight robot built for low budget competition attendances. Its base is mainly built with standard 4040 aluminium profiles and the robot is driven by four mecanum wheels on brushless DC motors. In combination with a laser range finder we use estimated odometry -- which is calculated by encoders -- for creating maps using a particle filter. A RGB-D camera is uti…

    Submitted 14 May, 2019; originally announced May 2019.

    Comments: 6 pages, 6 figures, 4 tables, ICARSC2019

  15. arXiv:1903.10882  [pdf, other]

    cs.RO

    Trends, Challenges and Adopted Strategies in RoboCup@Home (2019 version)

    Authors: Mauricio Matamoros, Viktor Seib, Dietrich Paulus

    Abstract: Scientific competitions are crucial in the field of service robotics. They foster knowledge exchange and benchmarking, allowing teams to test their research in unstandardized scenarios. In this paper, we summarize the trending solutions and approaches used in RoboCup@Home. Further on, we discuss the attained achievements and challenges to overcome in relation with the progress required to fulfill…

    Submitted 25 March, 2019; originally announced March 2019.

    Comments: 7 pages, 4 figures, 3 tables. Accepted paper to be presented and published in the 2019 IEEE International Conference on Autonomous Robot Systems and Competitions. arXiv admin note: substantial text overlap with arXiv:1903.02516

    MSC Class: 68T40; 97U40

  16. arXiv:1903.02516  [pdf, other]

    cs.RO

    Trends, Challenges and Adopted Strategies in RoboCup@Home

    Authors: Mauricio Matamoros, Viktor Seib, Dietrich Paulus

    Abstract: Scientific competitions are crucial in the field of service robotics. They foster knowledge exchange and allow teams to test their research in unstandardized scenarios and compare result. Such is the case of RoboCup@Home. However, keeping track of all the technologies and solution approaches used by teams to solve the tests can be a challenge in itself. Moreover, after eleven years of competitions…

    Submitted 6 March, 2019; originally announced March 2019.

    Comments: 18 pages, 7 figures, 3 tables

    MSC Class: 68T40; 97U40

  17. RoboCup@Home: Summarizing achievements in over eleven years of competition

    Authors: Mauricio Matamoros, Viktor Seib, Raphael Memmesheimer, Dietrich Paulus

    Abstract: Scientific competitions are important in robotics because they foster knowledge exchange and allow teams to test their research in unstandardized scenarios and compare result. In the field of service robotics its role becomes crucial. Competitions like RoboCup@Home bring robots to people, a fundamental step to integrate them into society. In this paper we summarize and discuss the differences be…

    Submitted 2 February, 2019; originally announced February 2019.

    Comments: 6 pages, 4 images, 3 tables Published in: 2018 IEEE International Conference on Autonomous Robot Systems and Competitions (ICARSC)

    MSC Class: 68T40

    Journal ref: 2018 IEEE International Conference on Autonomous Robot Systems and Competitions (ICARSC)

  18. arXiv:1902.00754  [pdf, ps, other]

    cs.RO

    From Commands to Goal-based Dialogs: A Roadmap to Achieve Natural Language Interaction in RoboCup@Home

    Authors: Mauricio Matamoros, Karin Harbusch, Dietrich Paulus

    Abstract: On the one hand, speech is a key aspect to people's communication. On the other, it is widely acknowledged that language proficiency is related to intelligence. Therefore, intelligent robots should be able to understand, at least, people's orders within their application domain. These insights are not new in RoboCup@Home, but we lack of a long-term plan to evaluate this approach. In this paper we…

    Submitted 2 February, 2019; originally announced February 2019.

    Comments: 12 pages, 2 tables, 1 figure. Accepted and presented (poster) in the RoboCup 2018 Symposium. In press

    MSC Class: 68T40

  19. arXiv:1807.11541  [pdf, other]

    cs.CV cs.RO

    Markerless Visual Robot Programming by Demonstration

    Authors: Raphael Memmesheimer, Ivanna Mykhalchyshyna, Viktor Seib, Nick Theisen, Dietrich Paulus

    Abstract: In this paper we present an approach for learning to imitate human behavior on a semantic level by markerless visual observation. We analyze a set of spatial constraints on human pose data extracted using convolutional pose machines and object informations extracted from 2D image sequences. A scene analysis, based on an ontology of objects and affordances, is combined with continuous human pose es…

    Submitted 30 July, 2018; originally announced July 2018.

    Comments: 6 pages, 5 figures, 3rd BAILAR workshop

  20. arXiv:1703.07402  [pdf, other]

    cs.CV

    Simple Online and Realtime Tracking with a Deep Association Metric

    Authors: Nicolai Wojke, Alex Bewley, Dietrich Paulus

    Abstract: Simple Online and Realtime Tracking (SORT) is a pragmatic approach to multiple object tracking with a focus on simple, effective algorithms. In this paper, we integrate appearance information to improve the performance of SORT. Due to this extension we are able to track objects through longer periods of occlusions, effectively reducing the number of identity switches. In spirit of the original fra…

    Submitted 21 March, 2017; originally announced March 2017.

    Comments: 5 pages, 1 figure