DM-VTON: Distilled Mobile Real-time Virtual Try-On

Authors: Khoi-Nguyen Nguyen-Ngoc, Thanh-Tung Phan-Nguyen, Khanh-Duy Le, Tam V. Nguyen, Minh-Triet Tran, Trung-Nghia Le

Abstract: The fashion e-commerce industry has witnessed significant growth in recent years, prompting exploring image-based virtual try-on techniques to incorporate Augmented Reality (AR) experiences into online shopping platforms. However, existing research has primarily overlooked a crucial aspect - the runtime of the underlying machine-learning model. While existing methods prioritize enhancing output quality, they often disregard the execution time, which restricts their applications on a limited range of devices. To address this gap, we propose Distilled Mobile Real-time Virtual Try-On (DM-VTON), a novel virtual try-on framework designed to achieve simplicity and efficiency. Our approach is based on a knowledge distillation scheme that leverages a strong Teacher network as supervision to guide a Student network without relying on human parsing. Notably, we introduce an efficient Mobile Generative Module within the Student network, significantly reducing the runtime while ensuring high-quality output. Additionally, we propose Virtual Try-on-guided Pose for Data Synthesis to address the limited pose variation observed in training images. Experimental results show that the proposed method can achieve 40 frames per second on a single Nvidia Tesla T4 GPU and only take up 37 MB of memory while producing almost the same output quality as other state-of-the-art methods. DM-VTON stands poised to facilitate the advancement of real-time AR applications, in addition to the generation of lifelike attired human figures tailored for diverse specialized training tasks.
https://sites.google.com/view/ltnghia/research/DMVTON

Submitted 26 August, 2023; originally announced August 2023.

Comments: Accepted to ISMAR 2023 (Poster paper)

arXiv:2212.00305

Multilingual Communication System with Deaf Individuals Utilizing Natural and Visual Languages

Authors: Tuan-Luc Huynh, Khoi-Nguyen Nguyen-Ngoc, Chi-Bien Chu, Minh-Triet Tran, Trung-Nghia Le

Abstract: According to the World Federation of the Deaf, more than two hundred sign languages exist. Therefore, it is challenging to understand deaf individuals, even proficient sign language users, resulting in a barrier between the deaf community and the rest of society. To bridge this language barrier, we propose a novel multilingual communication system, namely MUGCAT, to improve the communication efficiency of sign language users. By converting recognized specific hand gestures into expressive pictures, which is universal usage and language independence, our MUGCAT system significantly helps deaf people convey their thoughts. To overcome the limitation of sign language usage, which is mostly impossible to translate into complete sentences for ordinary people, we propose to reconstruct meaningful sentences from the incomplete translation of sign language. We also measure the semantic similarity of generated sentences with fragmented recognized hand gestures to keep the original meaning. Experimental results show that the proposed system can work in a real-time manner and synthesize exquisite stunning illustrations and meaningful sentences from a few hand gestures of sign language. This proves that our MUGCAT has promising potential in assisting deaf communication.

Submitted 1 December, 2022; originally announced December 2022.

arXiv:2207.04945

SHREC'22 Track: Sketch-Based 3D Shape Retrieval in the Wild

Authors: Jie Qin, Shuaihang Yuan, Jiaxin Chen, Boulbaba Ben Amor, Yi Fang, Nhat Hoang-Xuan, Chi-Bien Chu, Khoi-Nguyen Nguyen-Ngoc, Thien-Tri Cao, Nhat-Khang Ngo, Tuan-Luc Huynh, Hai-Dang Nguyen, Minh-Triet Tran, Haoyang Luo, Jianning Wang, Zheng Zhang, Zihao Xin, Yang Wang, Feng Wang, Ying Tang, Haiqin Chen, Yan Wang, Qunying Zhou, Ji Zhang, Hongyuan Wang

Abstract: Sketch-based 3D shape retrieval (SBSR) is an important yet challenging task, which has drawn more and more attention in recent years. Existing approaches address the problem in a restricted setting, without appropriately simulating real application scenarios. To mimic the realistic setting, in this track, we adopt large-scale sketches drawn by amateurs of different levels of drawing skills, as well as a variety of 3D shapes including not only CAD models but also models scanned from real objects. We define two SBSR tasks and construct two benchmarks consisting of more than 46,000 CAD models, 1,700 realistic models, and 145,000 sketches in total. Four teams participated in this track and submitted 15 runs for the two tasks, evaluated by 7 commonly-adopted metrics. We hope that, the benchmarks, the comparative results, and the open-sourced evaluation code will foster future research in this direction among the 3D object retrieval community.

Submitted 11 July, 2022; originally announced July 2022.

arXiv:2206.07636

DOI: 10.1016/j.cag.2022.07.004

SHREC 2022: Fitting and recognition of simple geometric primitives on point clouds

Authors: Chiara Romanengo, Andrea Raffo, Silvia Biasotti, Bianca Falcidieno, Vlassis Fotis, Ioannis Romanelis, Eleftheria Psatha, Konstantinos Moustakas, Ivan Sipiran, Quang-Thuc Nguyen, Chi-Bien Chu, Khoi-Nguyen Nguyen-Ngoc, Dinh-Khoi Vo, Tuan-An To, Nham-Tan Nguyen, Nhat-Quynh Le-Pham, Hai-Dang Nguyen, Minh-Triet Tran, Yifan Qie, Nabil Anwer

Abstract: This paper presents the methods that have participated in the SHREC 2022 track on the fitting and recognition of simple geometric primitives on point clouds. As simple primitives we mean the classical surface primitives derived from constructive solid geometry, i.e., planes, spheres, cylinders, cones and tori. The aim of the track is to evaluate the quality of automatic algorithms for fitting and recognising geometric primitives on point clouds. Specifically, the goal is to identify, for each point cloud, its primitive type and some geometric descriptors. For this purpose, we created a synthetic dataset, divided into a training set and a test set, containing segments perturbed with different kinds of point cloud artifacts. Among the six participants to this track, two are based on direct methods, while four are either fully based on deep learning or combine direct and neural approaches. The performance of the methods is evaluated using various classification and approximation measures.

Submitted 7 July, 2022; v1 submitted 15 June, 2022; originally announced June 2022.

MSC Class: 68U05; 68U07; 65D18; 65D17

ACM Class: G.1.2; I.3.5; I.5.4

Journal ref: Computers & Graphics 107 (2022) 32-49

Showing 1–4 of 4 results for author: Nguyen-Ngoc, K