Showing 1–17 of 17 results for author: Tjiputra, E

Searching in archive cs.
  1. arXiv:2310.18986

    cs.CV

    Controllable Group Choreography using Contrastive Diffusion

    Authors: Nhat Le, Tuong Do, Khoa Do, Hien Nguyen, Erman Tjiputra, Quang D. Tran, Anh Nguyen

    Abstract: Music-driven group choreography poses a considerable challenge but holds significant potential for a wide range of industrial applications. The ability to generate synchronized and visually appealing group dance motions that are aligned with music opens up opportunities in many fields such as entertainment, advertising, and virtual performances. However, most of the recent works are not able to ge…

    Submitted 3 November, 2023; v1 submitted 29 October, 2023; originally announced October 2023.

  2. arXiv:2303.12337

    cs.MM cs.CV cs.SD eess.AS

    Music-Driven Group Choreography

    Authors: Nhat Le, Thang Pham, Tuong Do, Erman Tjiputra, Quang D. Tran, Anh Nguyen

    Abstract: Music-driven choreography is a challenging problem with a wide variety of industrial applications. Recently, many methods have been proposed to synthesize dance motions from music for a single dancer. However, generating dance motion for a group remains an open problem. In this paper, we present $\rm AIOZ-GDANCE$, a new large-scale dataset for music-driven group dance generation. Unlike existing d…

    Submitted 26 March, 2023; v1 submitted 22 March, 2023; originally announced March 2023.

    Comments: accepted in CVPR 2023

  3. arXiv:2303.09799

    cs.CV

    Style Transfer for 2D Talking Head Animation

    Authors: Trong-Thang Pham, Nhat Le, Tuong Do, Hung Nguyen, Erman Tjiputra, Quang D. Tran, Anh Nguyen

    Abstract: Audio-driven talking head animation is a challenging research topic with many real-world applications. Recent works have focused on creating photo-realistic 2D animation, while learning different talking or singing styles remains an open problem. In this paper, we present a new method to generate talking head animation with learnable style references. Given a set of style reference frames, our fra…

    Submitted 22 March, 2023; v1 submitted 17 March, 2023; originally announced March 2023.

  4. arXiv:2303.06305

    cs.RO

    Reducing Non-IID Effects in Federated Autonomous Driving with Contrastive Divergence Loss

    Authors: Tuong Do, Binh X. Nguyen, Hien Nguyen, Erman Tjiputra, Quang D. Tran, Te-Chuan Chiu, Anh Nguyen

    Abstract: Federated learning has been widely applied in autonomous driving since it enables training a learning model among vehicles without sharing users' data. However, data from autonomous vehicles usually suffer from the non-independent-and-identically-distributed (non-IID) problem, which may cause negative effects on the convergence of the learning process. In this paper, we propose a new contrastive d…

    Submitted 8 October, 2023; v1 submitted 10 March, 2023; originally announced March 2023.

  5. arXiv:2209.10448

    cs.CV

    Uncertainty-aware Label Distribution Learning for Facial Expression Recognition

    Authors: Nhat Le, Khanh Nguyen, Quang Tran, Erman Tjiputra, Bac Le, Anh Nguyen

    Abstract: Despite significant progress over the past few years, ambiguity is still a key challenge in Facial Expression Recognition (FER). It can lead to noisy and inconsistent annotation, which hinders the performance of deep learning models in real-world scenarios. In this paper, we propose a new uncertainty-aware label distribution learning method to improve the robustness of deep models against uncertai…

    Submitted 21 September, 2022; originally announced September 2022.

    Comments: Accepted to WACV 2023. The first two authors contributed equally to this work

  6. arXiv:2207.09657

    cs.LG cs.DC

    Reducing Training Time in Cross-Silo Federated Learning using Multigraph Topology

    Authors: Tuong Do, Binh X. Nguyen, Vuong Pham, Toan Tran, Erman Tjiputra, Quang D. Tran, Anh Nguyen

    Abstract: Federated learning is an active research topic since it enables several participants to jointly train a model without sharing local data. Currently, cross-silo federated learning is a popular training setting that utilizes a few hundred reliable data silos with high-speed access links to training a model. While this approach has been widely applied in real-world scenarios, designing a robust topol…

    Submitted 30 July, 2023; v1 submitted 20 July, 2022; originally announced July 2022.

    Comments: accepted in ICCV 2023

  7. arXiv:2205.10529

    cs.CV

    Fine-Grained Visual Classification using Self Assessment Classifier

    Authors: Tuong Do, Huy Tran, Erman Tjiputra, Quang D. Tran, Anh Nguyen

    Abstract: Extracting discriminative features plays a crucial role in the fine-grained visual classification task. Most of the existing methods focus on developing attention or augmentation mechanisms to achieve this goal. However, addressing the ambiguity in the top-k prediction classes is not fully investigated. In this paper, we introduce a Self Assessment Classifier, which simultaneously leverages the re…

    Submitted 21 May, 2022; originally announced May 2022.

  8. arXiv:2110.05754

    cs.LG cs.DC cs.RO

    Deep Federated Learning for Autonomous Driving

    Authors: Anh Nguyen, Tuong Do, Minh Tran, Binh X. Nguyen, Chien Duong, Tu Phan, Erman Tjiputra, Quang D. Tran

    Abstract: Autonomous driving is an active research topic in both academia and industry. However, most of the existing solutions focus on improving the accuracy by training learnable models with centralized large-scale data. Therefore, these methods do not take into account the user's privacy. In this paper, we present a new approach to learn autonomous driving policy while respecting privacy concerns. We pr…

    Submitted 19 April, 2022; v1 submitted 12 October, 2021; originally announced October 2021.

    Comments: Accepted in IEEE Intelligent Vehicles Symposium 2022 (IV 2022)

  9. arXiv:2110.02526

    cs.CV

    Coarse-to-Fine Reasoning for Visual Question Answering

    Authors: Binh X. Nguyen, Tuong Do, Huy Tran, Erman Tjiputra, Quang D. Tran, Anh Nguyen

    Abstract: Bridging the semantic gap between image and question is an important step to improve the accuracy of the Visual Question Answering (VQA) task. However, most of the existing VQA methods focus on attention mechanisms or visual relations for reasoning the answer, while the features at different semantic levels are not fully utilized. In this paper, we present a new reasoning framework to fill the gap…

    Submitted 19 April, 2022; v1 submitted 6 October, 2021; originally announced October 2021.

    Comments: Accepted in CVPR 2022 Workshops

  10. arXiv:2110.01293

    eess.IV cs.CV

    Light-weight Deformable Registration using Adversarial Learning with Distilling Knowledge

    Authors: Minh Q. Tran, Tuong Do, Huy Tran, Erman Tjiputra, Quang D. Tran, Anh Nguyen

    Abstract: Deformable registration is a crucial step in many medical procedures such as image-guided surgery and radiation therapy. Most recent learning-based methods focus on improving the accuracy by optimizing the non-linear spatial correspondence between the input images. Therefore, these methods are computationally expensive and require modern graphic cards for real-time deployment. In this paper, we in…

    Submitted 4 October, 2021; originally announced October 2021.

  11. arXiv:2105.08913

    cs.CV

    Multiple Meta-model Quantifying for Medical Visual Question Answering

    Authors: Tuong Do, Binh X. Nguyen, Erman Tjiputra, Minh Tran, Quang D. Tran, Anh Nguyen

    Abstract: Transfer learning is an important step to extract meaningful features and overcome the data limitation in the medical Visual Question Answering (VQA) task. However, most of the existing medical VQA methods rely on external data for transfer learning, while the meta-data within the dataset is not fully utilized. In this paper, we present a new multiple meta-model quantifying method that effectively…

    Submitted 26 June, 2021; v1 submitted 19 May, 2021; originally announced May 2021.

    Comments: Provisional accepted in MICCAI 2021

  12. arXiv:2104.06770

    cs.CV

    Graph-based Person Signature for Person Re-Identifications

    Authors: Binh X. Nguyen, Binh D. Nguyen, Tuong Do, Erman Tjiputra, Quang D. Tran, Anh Nguyen

    Abstract: The task of person re-identification (ReID) is to match images of the same person over multiple non-overlapping camera views. Due to the variations in visual factors, previous works have investigated how the person identity, body parts, and attributes benefit the person ReID problem. However, the correlations between attributes, body parts, and within each attribute are not fully utilized. In this…

    Submitted 17 April, 2021; v1 submitted 14 April, 2021; originally announced April 2021.

    Comments: Accepted in CVPR 2021 Workshops

  13. arXiv:2009.11118

    cs.CV

    Multiple interaction learning with question-type prior knowledge for constraining answer search space in visual question answering

    Authors: Tuong Do, Binh X. Nguyen, Huy Tran, Erman Tjiputra, Quang D. Tran, Thanh-Toan Do

    Abstract: Different approaches have been proposed to Visual Question Answering (VQA). However, few works are aware of the behaviors of varying joint modality methods over question type prior knowledge extracted from data in constraining answer search space, of which information gives a reliable cue to reason about answers for questions asked in input images. In this paper, we propose a novel VQA model that…

    Submitted 23 September, 2020; originally announced September 2020.

    Comments: Accepted in ECCV Workshop 2020

  14. arXiv:2009.04091

    cs.CV

    Deep Metric Learning Meets Deep Clustering: An Novel Unsupervised Approach for Feature Embedding

    Authors: Binh X. Nguyen, Binh D. Nguyen, Gustavo Carneiro, Erman Tjiputra, Quang D. Tran, Thanh-Toan Do

    Abstract: Unsupervised Deep Distance Metric Learning (UDML) aims to learn sample similarities in the embedding space from an unlabeled dataset. Traditional UDML methods usually use the triplet loss or pairwise loss which requires the mining of positive and negative samples w.r.t. anchor data points. This is, however, challenging in an unsupervised setting as the label information is not available. In this p…

    Submitted 9 September, 2020; originally announced September 2020.

    Comments: Accepted in BMVC 2020

  15. arXiv:2007.15945

    cs.RO

    Autonomous Navigation in Complex Environments with Deep Multimodal Fusion Network

    Authors: Anh Nguyen, Ngoc Nguyen, Kim Tran, Erman Tjiputra, Quang D. Tran

    Abstract: Autonomous navigation in complex environments is a crucial task in time-sensitive scenarios such as disaster response or search and rescue. However, complex environments pose significant challenges for autonomous platforms to navigate due to their challenging properties: constrained narrow passages, unstable pathway with debris and obstacles, or irregular geological structures and poor lighting co…

    Submitted 31 July, 2020; originally announced July 2020.

    Comments: Accepted to IROS 2020

  16. arXiv:1909.11874

    cs.CV

    Compact Trilinear Interaction for Visual Question Answering

    Authors: Tuong Do, Thanh-Toan Do, Huy Tran, Erman Tjiputra, Quang D. Tran

    Abstract: In Visual Question Answering (VQA), answers have a great correlation with question meaning and visual contents. Thus, to selectively utilize image, question and answer information, we propose a novel trilinear interaction model which simultaneously learns high level associations between these three inputs. In addition, to overcome the interaction complexity, we introduce a multimodal tensor-based…

    Submitted 26 September, 2019; originally announced September 2019.

    Comments: Accepted in ICCV 2019

  17. arXiv:1909.11867

    cs.CV

    Overcoming Data Limitation in Medical Visual Question Answering

    Authors: Binh D. Nguyen, Thanh-Toan Do, Binh X. Nguyen, Tuong Do, Erman Tjiputra, Quang D. Tran

    Abstract: Traditional approaches for Visual Question Answering (VQA) require large amount of labeled data for training. Unfortunately, such large scale data is usually not available for medical domain. In this paper, we propose a novel medical VQA framework that overcomes the labeled data limitation. The proposed framework explores the use of the unsupervised Denoising Auto-Encoder (DAE) and the supervised…

    Submitted 25 September, 2019; originally announced September 2019.

    Comments: Accepted in MICCAI 2019