-
Weak Multi-View Supervision for Surface Map** Estimation
Authors:
Nishant Rai,
Aidas Liaudanskas,
Srinivas Rao,
Rodrigo Ortiz Cayon,
Matteo Munaro,
Stefan Holzer
Abstract:
We propose a weakly-supervised multi-view learning approach to learn category-specific surface map** without dense annotations. We learn the underlying surface geometry of common categories, such as human faces, cars, and airplanes, given instances from those categories. While traditional approaches solve this problem using extensive supervision in the form of pixel-level annotations, we take ad…
▽ More
We propose a weakly-supervised multi-view learning approach to learn category-specific surface map** without dense annotations. We learn the underlying surface geometry of common categories, such as human faces, cars, and airplanes, given instances from those categories. While traditional approaches solve this problem using extensive supervision in the form of pixel-level annotations, we take advantage of the fact that pixel-level UV and mesh predictions can be combined with 3D reprojections to form consistency cycles. As a result of exploiting these cycles, we can establish a dense correspondence map** between image pixels and the mesh acting as a self-supervisory signal, which in turn helps improve our overall estimates. Our approach leverages information from multiple views of the object to establish additional consistency cycles, thus improving surface map** understanding without the need for explicit annotations. We also propose the use of deformation fields for predictions of an instance specific mesh. Given the lack of datasets providing multiple images of similar object instances from different viewpoints, we generate and release a multi-view ShapeNet Cars and Airplanes dataset created by rendering ShapeNet meshes using a 360 degree camera trajectory around the mesh. For the human faces category, we process and adapt an existing dataset to a multi-view setup. Through experimental evaluations, we show that, at test time, our method can generate accurate variations away from the mean shape, is multi-view consistent, and performs comparably to fully supervised approaches.
△ Less
Submitted 4 May, 2021;
originally announced May 2021.
-
RUR53: an Unmanned Ground Vehicle for Navigation, Recognition and Manipulation
Authors:
Nicola Castaman,
Elisa Tosello,
Morris Antonello,
Nicola Bagarello,
Silvia Gandin,
Marco Carraro,
Matteo Munaro,
Roberto Bortoletto,
Stefano Ghidoni,
Emanuele Menegatti,
Enrico Pagello
Abstract:
This paper proposes RUR53: an Unmanned Ground Vehicle able to autonomously navigate through, identify, and reach areas of interest; and there recognize, localize, and manipulate work tools to perform complex manipulation tasks. The proposed contribution includes a modular software architecture where each module solves specific sub-tasks and that can be easily enlarged to satisfy new requirements.…
▽ More
This paper proposes RUR53: an Unmanned Ground Vehicle able to autonomously navigate through, identify, and reach areas of interest; and there recognize, localize, and manipulate work tools to perform complex manipulation tasks. The proposed contribution includes a modular software architecture where each module solves specific sub-tasks and that can be easily enlarged to satisfy new requirements. Included indoor and outdoor tests demonstrate the capability of the proposed system to autonomously detect a target object (a panel) and precisely dock in front of it while avoiding obstacles. They show it can autonomously recognize and manipulate target work tools (i.e., wrenches and valve stems) to accomplish complex tasks (i.e., use a wrench to rotate a valve stem). A specific case study is described where the proposed modular architecture lets easy switch to a semi-teleoperated mode. The paper exhaustively describes description of both the hardware and software setup of RUR53, its performance when tests at the 2017 Mohamed Bin Zayed International Robotics Challenge, and the lessons we learned when participating at this competition, where we ranked third in the Gran Challenge in collaboration with the Czech Technical University in Prague, the University of Pennsylvania, and the University of Lincoln (UK).
△ Less
Submitted 3 October, 2020; v1 submitted 23 November, 2017;
originally announced November 2017.
-
Real-time marker-less multi-person 3D pose estimation in RGB-Depth camera networks
Authors:
Marco Carraro,
Matteo Munaro,
Jeff Burke,
Emanuele Menegatti
Abstract:
This paper proposes a novel system to estimate and track the 3D poses of multiple persons in calibrated RGB-Depth camera networks. The multi-view 3D pose of each person is computed by a central node which receives the single-view outcomes from each camera of the network. Each single-view outcome is computed by using a CNN for 2D pose estimation and extending the resulting skeletons to 3D by means…
▽ More
This paper proposes a novel system to estimate and track the 3D poses of multiple persons in calibrated RGB-Depth camera networks. The multi-view 3D pose of each person is computed by a central node which receives the single-view outcomes from each camera of the network. Each single-view outcome is computed by using a CNN for 2D pose estimation and extending the resulting skeletons to 3D by means of the sensor depth. The proposed system is marker-less, multi-person, independent of background and does not make any assumption on people appearance and initial pose. The system provides real-time outcomes, thus being perfectly suited for applications requiring user interaction. Experimental results show the effectiveness of this work with respect to a baseline multi-view approach in different scenarios. To foster research and applications based on this work, we released the source code in OpenPTrack, an open source project for RGB-D people tracking.
△ Less
Submitted 17 October, 2017;
originally announced October 2017.