Showing 1–14 of 14 results for author: Pham, H X

  1. arXiv:2401.13594  [pdf, other]

    cs.CL cs.AI

    Graph Guided Question Answer Generation for Procedural Question-Answering

    Authors: Hai X. Pham, Isma Hadji, Xinnuo Xu, Ziedune Degutyte, Jay Rainey, Evangelos Kazakos, Afsaneh Fazly, Georgios Tzimiropoulos, Brais Martinez

    Abstract: In this paper, we focus on task-specific question answering (QA). To this end, we introduce a method for generating exhaustive and high-quality training data, which allows us to train compact (e.g., run on a mobile device), task-specific QA models that are competitive against GPT variants. The key technological enabler is a novel mechanism for automatic question-answer generation from procedural t…

    Submitted 24 January, 2024; originally announced January 2024.

    Comments: Accepted to EACL 2024 as long paper. 25 pages including appendix

    MSC Class: I.2.7

  2. arXiv:2310.14030  [pdf, other]

    cs.RO

    Visual Tracking Nonlinear Model Predictive Control Method for Autonomous Wind Turbine Inspection

    Authors: Abdelhakim Amer, Mohit Mehndiratta, Jonas le Fevre Sejersen, Huy Xuan Pham, Erdal Kayacan

    Abstract: Automated visual inspection of on- and offshore wind turbines using aerial robots provides several benefits, namely, a safe working environment by circumventing the need for workers to be suspended high above the ground, reduced inspection time, preventive maintenance, and access to hard-to-reach areas. A novel nonlinear model predictive control (NMPC) framework alongside a global wind turbine path…

    Submitted 21 October, 2023; originally announced October 2023.

    Comments: 8 pages, accepted for publication at ICAR conference

  3. arXiv:2207.14131  [pdf, other]

    cs.RO

    PencilNet: Zero-Shot Sim-to-Real Transfer Learning for Robust Gate Perception in Autonomous Drone Racing

    Authors: Huy Xuan Pham, Andriy Sarabakha, Mykola Odnoshyvkin, Erdal Kayacan

    Abstract: In autonomous and mobile robotics, one of the main challenges is the robust on-the-fly perception of the environment, which is often unknown and dynamic, like in autonomous drone racing. In this work, we propose a novel deep neural network-based perception method for racing gate detection -- PencilNet -- which relies on a lightweight neural network backbone on top of a pencil filter. This approach…

    Submitted 28 July, 2022; originally announced July 2022.

    Comments: accepted for publication by IEEE RA-L/IROS 2022

  4. arXiv:2204.02120  [pdf, other]

    cs.RO

    Event-based Navigation for Autonomous Drone Racing with Sparse Gated Recurrent Network

    Authors: Kristoffer Fogh Andersen, Huy Xuan Pham, Halil Ibrahim Ugurlu, Erdal Kayacan

    Abstract: Event-based vision has already revolutionized the perception task for robots by promising faster response, lower energy consumption, and lower bandwidth without introducing motion blur. In this work, a novel deep learning method based on gated recurrent units utilizing sparse convolutions for detecting gates in a race track is proposed using event-based vision for the autonomous drone racing probl…

    Submitted 5 April, 2022; originally announced April 2022.

    Comments: Accepted to present at the 20th European Control Conference (ECC)

  5. arXiv:2102.02547  [pdf, other]

    cs.CV

    CHEF: Cross-modal Hierarchical Embeddings for Food Domain Retrieval

    Authors: Hai X. Pham, Ricardo Guerrero, Jiatong Li, Vladimir Pavlovic

    Abstract: Despite the abundance of multi-modal data, such as image-text pairs, there has been little effort in understanding the individual entities and their different roles in the construction of these data instances. In this work, we endeavour to discover the entities and their corresponding importance in cooking recipes automatically as a visual-linguistic association problem. More specifically, we intr…

    Submitted 4 February, 2021; originally announced February 2021.

    Comments: 22 pages, accepted in AAAI 2021

  6. arXiv:2012.01345  [pdf, other]

    cs.CV cs.IR cs.LG

    Cross-Modal Retrieval and Synthesis (X-MRS): Closing the Modality Gap in Shared Representation Learning

    Authors: Ricardo Guerrero, Hai Xuan Pham, Vladimir Pavlovic

    Abstract: Computational food analysis (CFA) naturally requires multi-modal evidence of a particular food, e.g., images, recipe text, etc. A key to making CFA possible is multi-modal shared representation learning, which aims to create a joint representation of the multiple views (text and image) of the data. In this work we propose a method for food domain cross-modal shared representation learning that pre…

    Submitted 30 September, 2021; v1 submitted 2 December, 2020; originally announced December 2020.

  7. arXiv:1811.00690  [pdf, other]

    cs.RO

    A Multi-Robotic System for Environmental Cleaning

    Authors: Chuong Le, Huy Xuan Pham, Hung Manh La

    Abstract: There is a lot of waste in an industrial environment that could cause harmful effects to both the products and the workers resulting in product defects, itchy eyes or chronic obstructive pulmonary disease, etc. While automative cleaning robots could be used, the environment is often too big for one robot to clean alone in addition to the fact that it does not have adequate stored dirt capacity. We…

    Submitted 1 November, 2018; originally announced November 2018.

  8. arXiv:1803.07926  [pdf, other]

    cs.RO

    A Distributed Control Framework of Multiple Unmanned Aerial Vehicles for Dynamic Wildfire Tracking

    Authors: Huy Xuan Pham, Hung Manh La, David Feil-Seifer, Matthew Dean

    Abstract: Wild-land fire fighting is a hazardous job. A key task for firefighters is to observe the "fire front" to chart the progress of the fire and areas that will likely spread next. Lack of information of the fire front causes many accidents. Using Unmanned Aerial Vehicles (UAVs) to cover wildfire is promising because it can replace humans in hazardous fire tracking and significantly reduce operation c…

    Submitted 19 March, 2018; originally announced March 2018.

    Comments: arXiv admin note: substantial text overlap with arXiv:1704.02630

  9. arXiv:1803.07716  [pdf, other]

    cs.CV

    Generative Adversarial Talking Head: Bringing Portraits to Life with a Weakly Supervised Neural Network

    Authors: Hai X. Pham, Yuting Wang, Vladimir Pavlovic

    Abstract: This paper presents Generative Adversarial Talking Head (GATH), a novel deep generative neural network that enables fully automatic facial expression synthesis of an arbitrary portrait with continuous action unit (AU) coefficients. Specifically, our model directly manipulates image pixels to make the unseen subject in the still photo express various emotions controlled by values of facial AU coeff…

    Submitted 28 March, 2018; v1 submitted 20 March, 2018; originally announced March 2018.

    Comments: Fix typos, add youtube link of supplementary video

  10. arXiv:1803.07250  [pdf, other]

    cs.RO

    Cooperative and Distributed Reinforcement Learning of Drones for Field Coverage

    Authors: Huy Xuan Pham, Hung Manh La, David Feil-Seifer, Aria Nefian

    Abstract: This paper proposes a distributed Multi-Agent Reinforcement Learning (MARL) algorithm for a team of Unmanned Aerial Vehicles (UAVs). The proposed MARL algorithm allows UAVs to learn cooperatively to provide a full coverage of an unknown field of interest while minimizing the overlapping sections among their field of views. Two challenges in MARL for such a system are discussed in the paper: firstl…

    Submitted 16 September, 2018; v1 submitted 20 March, 2018; originally announced March 2018.

  11. arXiv:1801.05086  [pdf, other]

    cs.RO

    Autonomous UAV Navigation Using Reinforcement Learning

    Authors: Huy X. Pham, Hung M. La, David Feil-Seifer, Luan V. Nguyen

    Abstract: Unmanned aerial vehicles (UAV) are commonly used for missions in unknown environments, where an exact mathematical model of the environment may not be available. This paper provides a framework for using reinforcement learning to allow the UAV to navigate successfully in such environments. We conducted our simulation and real implementation to show how the UAVs can successfully learn to navigate t…

    Submitted 15 January, 2018; originally announced January 2018.

  12. arXiv:1710.00920  [pdf, other]

    cs.CV

    End-to-end Learning for 3D Facial Animation from Raw Waveforms of Speech

    Authors: Hai X. Pham, Yuting Wang, Vladimir Pavlovic

    Abstract: We present a deep learning framework for real-time speech-driven 3D facial animation from just raw waveforms. Our deep neural network directly maps an input sequence of speech audio to a series of micro facial action unit activations and head rotations to drive a 3D blendshape face model. In particular, our deep model is able to learn the latent representations of time-varying contextual informati…

    Submitted 7 December, 2017; v1 submitted 2 October, 2017; originally announced October 2017.

  13. arXiv:1704.02630  [pdf, other]

    cs.RO

    A Distributed Control Framework for a Team of Unmanned Aerial Vehicles for Dynamic Wildfire Tracking

    Authors: Huy X. Pham, Hung M. La, David Feil-Seifer, Matthew Deans

    Abstract: Wildland fire fighting is a very dangerous job, and the lack of information of the fire front is one of main reasons that causes many accidents. Using unmanned aerial vehicle (UAV) to cover wildfire is promising because it can replace human in hazardous fire tracking and save operation costs significantly. In this paper we propose a distributed control framework designed for a team of UAVs that ca…

    Submitted 9 April, 2017; originally announced April 2017.

  14. arXiv:1507.02779  [pdf, other]

    cs.CV

    Robust Performance-driven 3D Face Tracking in Long Range Depth Scenes

    Authors: Hai X. Pham, Chongyu Chen, Luc N. Dao, Vladimir Pavlovic, Jianfei Cai, Tat-jen Cham

    Abstract: We introduce a novel robust hybrid 3D face tracking framework from RGBD video streams, which is capable of tracking head pose and facial actions without pre-calibration or intervention from a user. In particular, we emphasize on improving the tracking performance in instances where the tracked subject is at a large distance from the cameras, and the quality of point cloud deteriorates severely. Th…

    Submitted 10 July, 2015; originally announced July 2015.

    Comments: 10 pages, 8 figures, 4 tables