-
MoRF: Mobile Realistic Fullbody Avatars from a Monocular Video
Authors:
Renat Bashirov,
Alexey Larionov,
Evgeniya Ustinova,
Mikhail Sidorenko,
David Svitov,
Ilya Zakharkin,
Victor Lempitsky
Abstract:
We present a system to create Mobile Realistic Fullbody (MoRF) avatars. MoRF avatars are rendered in real-time on mobile devices, learned from monocular videos, and have high realism. We use SMPL-X as a proxy geometry and render it with DNR (neural texture and image-2-image network). We improve on prior work by overfitting per-frame warping fields in the neural texture space, which better aligns the training signal between different frames. We also refine the SMPL-X mesh fitting procedure to improve the overall avatar quality. In comparisons to other monocular video-based avatar systems, MoRF avatars achieve higher image sharpness and temporal consistency. Participants of our user study also preferred avatars generated by MoRF.
Submitted 11 December, 2023; v1 submitted 17 March, 2023;
originally announced March 2023.
-
DINAR: Diffusion Inpainting of Neural Textures for One-Shot Human Avatars
Authors:
David Svitov,
Dmitrii Gudkov,
Renat Bashirov,
Victor Lempitsky
Abstract:
We present DINAR, an approach for creating realistic rigged fullbody avatars from single RGB images. Similar to previous work, our method uses neural textures combined with the SMPL-X body model to achieve photo-realistic quality of avatars while keeping them easy to animate and fast to infer. To restore the texture, we use a latent diffusion model and show how such a model can be trained in the neural texture space. The use of the diffusion model allows us to realistically reconstruct large unseen regions such as the back of a person given the frontal view. The models in our pipeline are trained using 2D images and videos only. In the experiments, our approach achieves state-of-the-art rendering quality and good generalization to new poses and viewpoints. In particular, the approach improves on the state of the art on the SnapshotPeople public benchmark.
Submitted 10 December, 2023; v1 submitted 16 March, 2023;
originally announced March 2023.
-
StylePeople: A Generative Model of Fullbody Human Avatars
Authors:
Artur Grigorev,
Karim Iskakov,
Anastasia Ianina,
Renat Bashirov,
Ilya Zakharkin,
Alexander Vakhitov,
Victor Lempitsky
Abstract:
We propose a new type of full-body human avatar that combines a parametric mesh-based body model with a neural texture. We show that with the help of neural textures, such avatars can successfully model clothing and hair, which usually pose a problem for mesh-based approaches. We also show how these avatars can be created from multiple frames of a video using backpropagation. We then propose a generative model for such avatars that can be trained from datasets of images and videos of people. The generative model allows us to sample random avatars as well as to create dressed avatars of people from one or a few images. The code for the project is available at saic-violet.github.io/style-people.
Submitted 16 April, 2021;
originally announced April 2021.
-
Real-time RGBD-based Extended Body Pose Estimation
Authors:
Renat Bashirov,
Anastasia Ianina,
Karim Iskakov,
Yevgeniy Kononenko,
Valeriya Strizhkova,
Victor Lempitsky,
Alexander Vakhitov
Abstract:
We present a system for real-time RGBD-based estimation of 3D human pose. We use a parametric 3D deformable human mesh model (SMPL-X) as a representation and focus on the real-time estimation of parameters for the body pose, hand pose and facial expression from a Kinect Azure RGB-D camera. We train estimators of body pose and facial expression parameters. Both estimators use previously published landmark extractors as input and custom annotated datasets for supervision, while hand pose is estimated directly by a previously published method. We combine the predictions of those estimators into a temporally smooth human pose. We train the facial expression extractor on a large talking face dataset, which we annotate with facial expression parameters. For the body pose we collect and annotate a dataset of 56 people captured from a rig of 5 Kinect Azure RGB-D cameras and use it together with the large AMASS motion capture dataset. Our RGB-D body pose model outperforms the state-of-the-art RGB-only methods and matches the accuracy of a slower RGB-D optimization-based solution. The combined system runs at 30 FPS on a server with a single GPU. The code will be available at https://saic-violet.github.io/rgbd-kinect-pose
Submitted 5 March, 2021;
originally announced March 2021.
-
Textured Neural Avatars
Authors:
Aliaksandra Shysheya,
Egor Zakharov,
Kara-Ali Aliev,
Renat Bashirov,
Egor Burkov,
Karim Iskakov,
Aleksei Ivakhnenko,
Yury Malkov,
Igor Pasechnik,
Dmitry Ulyanov,
Alexander Vakhitov,
Victor Lempitsky
Abstract:
We present a system for learning full-body neural avatars, i.e. deep networks that produce full-body renderings of a person for varying body pose and camera position. Our system takes the middle path between the classical graphics pipeline and the recent deep learning approaches that generate images of humans using image-to-image translation. In particular, our system estimates an explicit two-dimensional texture map of the model surface. At the same time, it abstains from explicit shape modeling in 3D. Instead, at test time, the system uses a fully-convolutional network to directly map the configuration of body feature points w.r.t. the camera to the 2D texture coordinates of individual pixels in the image frame. We show that such a system is capable of learning to generate realistic renderings while being trained on videos annotated with 3D poses and foreground masks. We also demonstrate that maintaining an explicit texture representation helps our system to achieve better generalization compared to systems that use direct image-to-image translation.
Submitted 21 May, 2019;
originally announced May 2019.