-
Band engineering and study of disorder using topology in compact high kinetic inductance cavity arrays
Authors:
Vincent Jouanny,
Simone Frasca,
Vera Jo Weibel,
Leo Peyruchat,
Marco Scigliuzzo,
Fabian Oppliger,
Franco De Palma,
Davide Sbroggio,
Guillaume Beaulieu,
Oded Zilberberg,
Pasquale Scarlino
Abstract:
Superconducting microwave metamaterials offer enormous potential for quantum optics and information science, enabling the development of advanced quantum technologies for sensing and amplification. In the context of circuit quantum electrodynamics, such metamaterials can be implemented as coupled cavity arrays (CCAs). In the continuous effort to miniaturize quantum devices for increasing scalability, minimizing the footprint of CCAs while preserving low disorder becomes paramount. In this work, we present a compact CCA architecture leveraging superconducting NbN thin films presenting high kinetic inductance, which enables high-impedance CCAs ($\sim 1.5$ k$\Omega$), while reducing the resonator footprint. We demonstrate its versatility and scalability by engineering one-dimensional CCAs with up to 100 resonators and exhibiting multiple bandgaps. Additionally, we quantitatively investigate disorder in the CCAs using symmetry-protected topological SSH modes, from which we extract a resonator frequency scattering of $0.22^{+0.04}_{-0.03}\%$. Our platform opens up exciting new prospects for analog quantum simulations of many-body physics with ultrastrongly coupled emitters.
Submitted 26 March, 2024;
originally announced March 2024.
-
Ergodic theorem for branching Markov chains indexed by trees with arbitrary shape
Authors:
Julien Weibel
Abstract:
We prove an ergodic theorem for Markov chains indexed by the Ulam-Harris-Neveu tree over large subsets with arbitrary shape under two assumptions: with high probability, two vertices in the large subset are far from each other and have their common ancestor close to the root. The assumption on the common ancestor can be replaced by some regularity assumption on the Markov transition kernel. We verify that those assumptions are satisfied for some usual trees. Finally, with Markov-Chain Monte-Carlo considerations in mind, we prove that, when the underlying Markov chain is stationary and reversible, the Markov chain (that is, the line graph) yields minimal variance for the empirical average estimator among trees with a given number of nodes.
Submitted 25 March, 2024;
originally announced March 2024.
-
STAR: Shape-focused Texture Agnostic Representations for Improved Object Detection and 6D Pose Estimation
Authors:
Peter Hönig,
Stefan Thalhammer,
Jean-Baptiste Weibel,
Matthias Hirschmanner,
Markus Vincze
Abstract:
Recent advances in machine learning have greatly benefited object detection and 6D pose estimation for robotic grasping. However, textureless and metallic objects still pose a significant challenge due to fewer visual cues and the texture bias of CNNs. To address this issue, we propose a texture-agnostic approach that focuses on learning from CAD models and emphasizes object shape features. To achieve a focus on learning shape features, the textures are randomized during the rendering of the training data. By treating the texture as noise, the need for real-world object instances or their final appearance during training data generation is eliminated. The TLESS and ITODD datasets, specifically created for industrial settings in robotics and featuring textureless and metallic objects, were used for evaluation. Texture agnosticity also increases the robustness against image perturbations such as imaging noise, motion blur, and brightness changes, which are common in robotics applications. Code and datasets are publicly available at github.com/hoenigpeter/randomized_texturing.
Submitted 7 February, 2024;
originally announced February 2024.
-
Probability-graphons: Limits of large dense weighted graphs
Authors:
Romain Abraham,
Jean-François Delmas,
Julien Weibel
Abstract:
We introduce probability-graphons, which are probability kernels that generalize graphons to the case of weighted graphs. Probability-graphons appear as the limit objects to study sequences of large weighted graphs whose distribution of subgraph sampling converges. The edge-weights are taken from a general Polish space, which also covers the case of decorated graphs. Here, graphs can be either directed or undirected. Starting from a distance $d_m$ inducing the weak topology on measures, we define a cut distance on probability-graphons, making it a Polish space, and study the properties of this cut distance. In particular, we exhibit a tightness criterion for probability-graphons related to relative compactness in the cut distance. We also prove that under some conditions on the distance $d_m$, which are satisfied for some well-known distances like the Prohorov distance, and the Fortet-Mourier and Kantorovitch-Rubinstein norms, the topology induced by the cut distance on the space of probability-graphons is independent of the choice of $d_m$. Finally, we prove that this topology coincides with the topology induced by the convergence in distribution of the sampled subgraphs.
Submitted 26 December, 2023;
originally announced December 2023.
-
ZS6D: Zero-shot 6D Object Pose Estimation using Vision Transformers
Authors:
Philipp Ausserlechner,
David Haberger,
Stefan Thalhammer,
Jean-Baptiste Weibel,
Markus Vincze
Abstract:
As robotic systems increasingly encounter complex and unconstrained real-world scenarios, there is a demand to recognize diverse objects. The state-of-the-art 6D object pose estimation methods rely on object-specific training and therefore do not generalize to unseen objects. Recent novel object pose estimation methods are solving this issue using task-specific fine-tuned CNNs for deep template matching. This adaptation for pose estimation still requires expensive data rendering and training procedures. MegaPose, for example, is trained on a dataset consisting of two million images showing 20,000 different objects to reach such generalization capabilities. To overcome this shortcoming, we introduce ZS6D, for zero-shot novel object 6D pose estimation. Visual descriptors, extracted using pre-trained Vision Transformers (ViT), are used for matching rendered templates against query images of objects and for establishing local correspondences. These local correspondences enable deriving geometric correspondences and are used for estimating the object's 6D pose with RANSAC-based PnP. This approach showcases that the image descriptors extracted by pre-trained ViTs are well-suited to achieve a notable improvement over two state-of-the-art novel object 6D pose estimation methods, without the need for task-specific fine-tuning. Experiments are performed on LMO, YCBV, and TLESS. Compared to the first of the two methods, we improve the Average Recall on all three datasets; compared to the second, we improve on two datasets.
Submitted 21 September, 2023;
originally announced September 2023.
-
Challenges for Monocular 6D Object Pose Estimation in Robotics
Authors:
Stefan Thalhammer,
Dominik Bauer,
Peter Hönig,
Jean-Baptiste Weibel,
José García-Rodríguez,
Markus Vincze
Abstract:
Object pose estimation is a core perception task that enables, for example, object grasping and scene understanding. The widely available, inexpensive and high-resolution RGB sensors and CNNs that allow for fast inference based on this modality make monocular approaches especially well suited for robotics applications. We observe that previous surveys on object pose estimation establish the state of the art for varying modalities, single- and multi-view settings, and datasets and metrics that consider a multitude of applications. We argue, however, that those works' broad scope hinders the identification of open challenges that are specific to monocular approaches and the derivation of promising future challenges for their application in robotics. By providing a unified view on recent publications from both robotics and computer vision, we find that occlusion handling, novel pose representations, and formalizing and improving category-level pose estimation are still fundamental challenges that are highly relevant for robotics. Moreover, to further improve robotic performance, large object sets, novel objects, refractive materials, and uncertainty estimates are central, largely unsolved open challenges. In order to address them, ontological reasoning, deformability handling, scene-level reasoning, realistic datasets, and the ecological footprint of algorithms need to be improved.
Submitted 22 July, 2023;
originally announced July 2023.
-
Self-supervised Vision Transformers for 3D Pose Estimation of Novel Objects
Authors:
Stefan Thalhammer,
Jean-Baptiste Weibel,
Markus Vincze,
Jose Garcia-Rodriguez
Abstract:
Object pose estimation is important for object manipulation and scene understanding. In order to improve the general applicability of pose estimators, recent research focuses on providing estimates for novel objects, that is objects unseen during training. Such works use deep template matching strategies to retrieve the closest template connected to a query image. This template retrieval implicitly provides object class and pose. Despite the recent success and improvements of Vision Transformers over CNNs for many vision tasks, the state of the art uses CNN-based approaches for novel object pose estimation. This work evaluates and demonstrates the differences between self-supervised CNNs and Vision Transformers for deep template matching. In detail, both types of approaches are trained using contrastive learning to match training images against rendered templates of isolated objects. At test time, such templates are matched against query images of known and novel objects under challenging settings, such as clutter, occlusion and object symmetries, using masked cosine similarity. The presented results not only demonstrate that Vision Transformers improve in matching accuracy over CNNs, but also that for some cases pre-trained Vision Transformers do not need fine-tuning to do so. Furthermore, we highlight the differences in optimization and network architecture when comparing these two types of network for deep template matching.
Submitted 31 May, 2023;
originally announced June 2023.
-
Open Challenges for Monocular Single-shot 6D Object Pose Estimation
Authors:
Stefan Thalhammer,
Peter Hönig,
Jean-Baptiste Weibel,
Markus Vincze
Abstract:
Object pose estimation is a non-trivial task that enables robotic manipulation, bin picking, augmented reality, and scene understanding, to name a few use cases. Monocular object pose estimation gained considerable momentum with the rise of high-performing deep learning-based solutions and is particularly interesting for the community since sensors are inexpensive and inference is fast. Prior works establish the comprehensive state of the art for diverse pose estimation problems. Their broad scopes make it difficult to identify promising future directions. We narrow down the scope to the problem of single-shot monocular 6D object pose estimation, which is commonly used in robotics, and thus are able to identify such trends. By reviewing recent publications in robotics and computer vision, the state of the art is established at the union of both fields. Following that, we identify promising research directions in order to help researchers to formulate relevant research ideas and effectively advance the state of the art. Findings include that methods are sophisticated enough to overcome the domain shift and that occlusion handling is a fundamental challenge. We also highlight problems such as novel object pose estimation and challenging materials handling as central challenges to advance robotics.
Submitted 20 July, 2023; v1 submitted 23 February, 2023;
originally announced February 2023.
-
Sim2Real 3D Object Classification using Spherical Kernel Point Convolution and a Deep Center Voting Scheme
Authors:
Jean-Baptiste Weibel,
Timothy Patten,
Markus Vincze
Abstract:
While object semantic understanding is essential for most service robotic tasks, 3D object classification is still an open problem. Learning from artificial 3D models alleviates the cost of annotation necessary to approach this problem, but most methods still struggle with the differences existing between artificial and real 3D data. We conjecture that the cause of this issue is the fact that many methods learn directly from point coordinates, instead of the shape, as the former is hard to center and to scale under variable occlusions reliably. We introduce spherical kernel point convolutions that directly exploit the object surface, represented as a graph, and a voting scheme to limit the impact of poor segmentation on the classification results. Our proposed approach improves upon state-of-the-art methods by up to 36% when transferring from artificial objects to real objects.
Submitted 10 March, 2021;
originally announced March 2021.
-
Addressing the Sim2Real Gap in Robotic 3D Object Classification
Authors:
Jean-Baptiste Weibel,
Timothy Patten,
Markus Vincze
Abstract:
Object classification with 3D data is an essential component of any scene understanding method. It has gained significant interest in a variety of communities, most notably in robotics and computer graphics. While the advent of deep learning has advanced the field of 3D object classification, most works using this data type are evaluated solely on CAD model datasets. Consequently, current work does not address the discrepancies existing between real and artificial data. In this work, we examine this gap in a robotic context by specifically addressing the problem of classification when transferring from artificial CAD models to real reconstructed objects. This is performed by training on ModelNet (CAD models) and evaluating on ScanNet (reconstructed objects). We show that standard methods do not perform well in this task. We thus introduce a method that carefully samples object parts that are reproducible under various transformations and hence robust. Using graph convolution to classify the composed graph of parts, our method significantly improves upon the baseline.
Submitted 28 October, 2019;
originally announced October 2019.