Search | arXiv e-print repository

FlexKalmanNet: A Modular AI-Enhanced Kalman Filter Framework Applied to Spacecraft Motion Estimation

Authors: Moritz D. Pinheiro-Torres Vogt, Markus Huwald, M. Khalil Ben-Larbi, Enrico Stoll

Abstract: The estimation of relative motion between spacecraft increasingly relies on feature-matching computer vision, which feeds data into a recursive filtering algorithm. Kalman filters, although efficient in noise compensation, demand extensive tuning of system and noise models. This paper introduces FlexKalmanNet, a novel modular framework that bridges this gap by integrating a deep fully connected ne… ▽ More The estimation of relative motion between spacecraft increasingly relies on feature-matching computer vision, which feeds data into a recursive filtering algorithm. Kalman filters, although efficient in noise compensation, demand extensive tuning of system and noise models. This paper introduces FlexKalmanNet, a novel modular framework that bridges this gap by integrating a deep fully connected neural network with Kalman filter-based motion estimation algorithms. FlexKalmanNet's core innovation is its ability to learn any Kalman filter parameter directly from measurement data, coupled with the flexibility to utilize various Kalman filter variants. This is achieved through a notable design decision to outsource the sequential computation from the neural network to the Kalman filter variant, enabling a purely feedforward neural network architecture. This architecture, proficient at handling complex, nonlinear features without the dependency on recurrent network modules, captures global data patterns more effectively. Empirical evaluation using data from NASA's Astrobee simulation environment focuses on learning unknown parameters of an Extended Kalman filter for spacecraft pose and twist estimation. The results demonstrate FlexKalmanNet's rapid training convergence, high accuracy, and superior performance against manually tuned Extended Kalman filters. △ Less

Submitted 5 May, 2024; originally announced May 2024.

arXiv:2308.05737 [pdf, other]

Follow Anything: Open-set detection, tracking, and following in real-time

Authors: Alaa Maalouf, Ninad Jadhav, Krishna Murthy Jatavallabhula, Makram Chahine, Daniel M. Vogt, Robert J. Wood, Antonio Torralba, Daniela Rus

Abstract: Tracking and following objects of interest is critical to several robotics use cases, ranging from industrial automation to logistics and warehousing, to healthcare and security. In this paper, we present a robotic system to detect, track, and follow any object in real-time. Our approach, dubbed ``follow anything'' (FAn), is an open-vocabulary and multimodal model -- it is not restricted to concep… ▽ More Tracking and following objects of interest is critical to several robotics use cases, ranging from industrial automation to logistics and warehousing, to healthcare and security. In this paper, we present a robotic system to detect, track, and follow any object in real-time. Our approach, dubbed ``follow anything'' (FAn), is an open-vocabulary and multimodal model -- it is not restricted to concepts seen at training time and can be applied to novel classes at inference time using text, images, or click queries. Leveraging rich visual descriptors from large-scale pre-trained models (foundation models), FAn can detect and segment objects by matching multimodal queries (text, images, clicks) against an input image sequence. These detected and segmented objects are tracked across image frames, all while accounting for occlusion and object re-emergence. We demonstrate FAn on a real-world robotic system (a micro aerial vehicle) and report its ability to seamlessly follow the objects of interest in a real-time control loop. FAn can be deployed on a laptop with a lightweight (6-8 GB) graphics card, achieving a throughput of 6-20 frames per second. To enable rapid adoption, deployment, and extensibility, we open-source all our code on our project webpage at https://github.com/alaamaalouf/FollowAnything . We also encourage the reader to watch our 5-minutes explainer video in this https://www.youtube.com/watch?v=6Mgt3EPytrw . △ Less

Submitted 9 February, 2024; v1 submitted 10 August, 2023; originally announced August 2023.

Comments: Project webpage: https://github.com/alaamaalouf/FollowAnything Explainer video: https://www.youtube.com/watch?v=6Mgt3EPytrw

arXiv:2104.08614 [pdf]

Cetacean Translation Initiative: a roadmap to deciphering the communication of sperm whales

Authors: Jacob Andreas, Gašper Beguš, Michael M. Bronstein, Roee Diamant, Denley Delaney, Shane Gero, Shafi Goldwasser, David F. Gruber, Sarah de Haas, Peter Malkin, Roger Payne, Giovanni Petri, Daniela Rus, Pratyusha Sharma, Dan Tchernov, Pernille Tønnesen, Antonio Torralba, Daniel Vogt, Robert J. Wood

Abstract: The past decade has witnessed a groundbreaking rise of machine learning for human language analysis, with current methods capable of automatically accurately recovering various aspects of syntax and semantics - including sentence structure and grounded word meaning - from large data collections. Recent research showed the promise of such tools for analyzing acoustic communication in nonhuman speci… ▽ More The past decade has witnessed a groundbreaking rise of machine learning for human language analysis, with current methods capable of automatically accurately recovering various aspects of syntax and semantics - including sentence structure and grounded word meaning - from large data collections. Recent research showed the promise of such tools for analyzing acoustic communication in nonhuman species. We posit that machine learning will be the cornerstone of future collection, processing, and analysis of multimodal streams of data in animal communication studies, including bioacoustic, behavioral, biological, and environmental data. Cetaceans are unique non-human model species as they possess sophisticated acoustic communications, but utilize a very different encoding system that evolved in an aquatic rather than terrestrial medium. Sperm whales, in particular, with their highly-developed neuroanatomical features, cognitive abilities, social structures, and discrete click-based encoding make for an excellent starting point for advanced machine learning tools that can be applied to other animals in the future. This paper details a roadmap toward this goal based on currently existing technology and multidisciplinary scientific community effort. We outline the key elements required for the collection and processing of massive bioacoustic data of sperm whales, detecting their basic communication units and language-like higher-level structures, and validating these models through interactive playback experiments. The technological capabilities developed by such an undertaking are likely to yield cross-applications and advancements in broader communities investigating non-human communication and animal behavioral research. △ Less

Submitted 17 April, 2021; originally announced April 2021.

Showing 1–3 of 3 results for author: Vogt, D