Showing 1–14 of 14 results for author: Le, C

Searching in archive eess.
  1. arXiv:2405.17809  [pdf, other]

    cs.CL cs.AI cs.SD eess.AS

    TransVIP: Speech to Speech Translation System with Voice and Isochrony Preservation

    Authors: Chenyang Le, Yao Qian, Dongmei Wang, Long Zhou, Shujie Liu, Xiaofei Wang, Midia Yousefi, Yanmin Qian, Jinyu Li, Sheng Zhao, Michael Zeng

    Abstract: There is a rising interest and trend in research towards directly translating speech from one language to another, known as end-to-end speech-to-speech translation. However, most end-to-end models struggle to outperform cascade models, i.e., a pipeline framework by concatenating speech recognition, machine translation and text-to-speech models. The primary challenges stem from the inherent complex…

    Submitted 28 May, 2024; originally announced May 2024.

    Comments: Work in progress

  2. arXiv:2312.16717  [pdf, other]

    cs.CV cs.LG eess.IV

    Landslide Detection and Segmentation Using Remote Sensing Images and Deep Neural Network

    Authors: Cam Le, Lam Pham, Jasmin Lampert, Matthias Schlögl, Alexander Schindler

    Abstract: Knowledge about historic landslide event occurrence is important for supporting disaster risk reduction strategies. Building upon findings from 2022 Landslide4Sense Competition, we propose a deep neural network based system for landslide detection and segmentation from multisource remote sensing image input. We use a U-Net trained with Cross Entropy loss as baseline model. We then improve the U-Ne…

    Submitted 27 December, 2023; originally announced December 2023.

  3. arXiv:2311.05600  [pdf, other]

    cs.RO eess.SY

    FogROS2-Config: Optimizing Latency and Cost for Multi-Cloud Robot Applications

    Authors: Kaiyuan Chen, Kush Hari, Rohil Khare, Charlotte Le, Trinity Chung, Jaimyn Drake, Jeffrey Ichnowski, John Kubiatowicz, Ken Goldberg

    Abstract: Cloud service providers provide over 50,000 distinct and dynamically changing set of cloud server options. To help roboticists make cost-effective decisions, we present FogROS2-Config, an open toolkit that takes ROS2 nodes as input and automatically runs relevant benchmarks to quickly return a menu of cloud compute services that tradeoff latency and cost. Because it is infeasible to try every hard…

    Submitted 13 May, 2024; v1 submitted 9 November, 2023; originally announced November 2023.

    Comments: Published 2024 IEEE International Conference on Robotics and Automation (ICRA), Former name: FogROS2-Sky

  4. arXiv:2308.06539  [pdf, other]

    cs.IT eess.SP

    Phase Shift Design for RIS-Aided Cell-Free Massive MIMO with Improved Differential Evolution

    Authors: Trinh Van Chien, Cuong V. Le, Huynh Thi Thanh Binh, Hien Quoc Ngo, Symeon Chatzinotas

    Abstract: This paper proposes a novel phase shift design for cell-free massive multiple-input and multiple-output (MIMO) systems assisted by reconfigurable intelligent surface (RIS), which only utilizes channel statistics to achieve the uplink sum ergodic throughput maximization under spatial channel correlations. Due to the non-convexity and the scale of the derived optimization problem, we develop an impr…

    Submitted 12 August, 2023; originally announced August 2023.

    Comments: 5 pages, 2 figures. Accepted by IEEE WCL

  5. arXiv:2307.16834  [pdf]

    cs.CV cs.AI cs.LG eess.IV

    Benchmarking Jetson Edge Devices with an End-to-end Video-based Anomaly Detection System

    Authors: Hoang Viet Pham, Thinh Gia Tran, Chuong Dinh Le, An Dinh Le, Hien Bich Vo

    Abstract: Innovative enhancement in embedded system platforms, specifically hardware accelerations, significantly influence the application of deep learning in real-world scenarios. These innovations translate human labor efforts into automated intelligent systems employed in various areas such as autonomous driving, robotics, Internet-of-Things (IoT), and numerous other impactful applications. NVIDIA's Jet…

    Submitted 12 September, 2023; v1 submitted 28 July, 2023; originally announced July 2023.

    Comments: Accepted in Future of Information and Communication Conference (FICC) 2024

  6. arXiv:2305.14838  [pdf, other]

    cs.CL cs.SD eess.AS

    ComSL: A Composite Speech-Language Model for End-to-End Speech-to-Text Translation

    Authors: Chenyang Le, Yao Qian, Long Zhou, Shujie Liu, Yanmin Qian, Michael Zeng, Xuedong Huang

    Abstract: Joint speech-language training is challenging due to the large demand for training data and GPU consumption, as well as the modality gap between speech and language. We present ComSL, a speech-language model built atop a composite architecture of public pretrained speech-only and language-only models and optimized data-efficiently for spoken language tasks. Particularly, we propose to incorporate…

    Submitted 14 October, 2023; v1 submitted 24 May, 2023; originally announced May 2023.

    Comments: NeurIPS 2023, Poster

  7. arXiv:2305.09463  [pdf, other]

    cs.SD cs.AI eess.AS

    Low-complexity deep learning frameworks for acoustic scene classification using teacher-student scheme and multiple spectrograms

    Authors: Lam Pham, Dat Ngo, Cam Le, Anahid Jalali, Alexander Schindler

    Abstract: In this technical report, a low-complexity deep learning system for acoustic scene classification (ASC) is presented. The proposed system comprises two main phases: (Phase I) Training a teacher network; and (Phase II) training a student network using distilled knowledge from the teacher. In the first phase, the teacher, which presents a large footprint model, is trained. After training the teacher…

    Submitted 16 May, 2023; originally announced May 2023.

    Comments: arXiv admin note: text overlap with arXiv:2206.06057

  8. arXiv:2305.01476  [pdf, other]

    cs.SD cs.MM eess.AS

    Deep Learning Based Multimodal with Two-phase Training Strategy for Daily Life Video Classification

    Authors: Lam Pham, Trang Le, Cam Le, Dat Ngo, Weissenfeld Axel, Alexander Schindler

    Abstract: In this paper, we present a deep learning based multimodal system for classifying daily life videos. To train the system, we propose a two-phase training strategy. In the first training phase (Phase I), we extract the audio and visual (image) data from the original video. We then train the audio data and the visual data with independent deep learning based models. After the training processes, we…

    Submitted 30 April, 2023; originally announced May 2023.

  9. arXiv:2212.14353  [pdf, other]

    cs.DC eess.SP

    Sheaf-theoretic self-filtering network of low-cost sensors for local air quality monitoring: A causal approach

    Authors: Anh-Duy Pham, Chuong Dinh Le, Hoang Viet Pham, Thinh Gia Tran, Dat Thanh Vo, Chau Long Tran, An Dinh Le, Hien Bich Vo

    Abstract: Sheaf theory, which is a complex but powerful tool supported by topological theory, offers more flexibility and precision than traditional graph theory when it comes to modeling relationships between multiple features. In the realm of air quality monitoring, this can be incredibly useful in detecting sudden changes in local dust particle density, which can be difficult to accurately measure using…

    Submitted 29 December, 2022; originally announced December 2022.

  10. arXiv:2212.04313  [pdf]

    eess.SY

    Scalable, low-cost, and versatile system design for air pollution and traffic density monitoring and analysis

    Authors: Thinh Gia Tran, Dat Thanh Vo, Long Chau Tran, Hoang Viet Pham, Chuong Dinh Le, An Dinh Le, Duy Anh Pham, Hien Bich Vo

    Abstract: Vietnam requires a sustainable urbanization, for which city sensing is used in planning and decision-making. Large cities need portable, scalable, and inexpensive digital technology for this purpose. End-to-end air quality monitoring companies such as AirVisual and Plume Air have shown their reliability with portable devices outfitted with superior air sensors. They are pricey, yet homeowners use…

    Submitted 8 December, 2022; originally announced December 2022.

  11. arXiv:2211.02820  [pdf, other]

    cs.CV cs.LG eess.IV

    A Robust and Low Complexity Deep Learning Model for Remote Sensing Image Classification

    Authors: Cam Le, Lam Pham, Nghia NVN, Truong Nguyen, Le Hong Trang

    Abstract: In this paper, we present a robust and low complexity deep learning model for Remote Sensing Image Classification (RSIC), the task of identifying the scene of a remote sensing image. In particular, we firstly evaluate different low complexity and benchmark deep neural networks: MobileNetV1, MobileNetV2, NASNetMobile, and EfficientNetB0, which present the number of trainable parameters lower than 5…

    Submitted 12 December, 2022; v1 submitted 5 November, 2022; originally announced November 2022.

    Comments: 8 pages

  12. arXiv:2206.09146  [pdf, other]

    eess.IV cs.AI cs.CV

    A Perceptually Optimized and Self-Calibrated Tone Mapping Operator

    Authors: Peibei Cao, Chenyang Le, Yuming Fang, Kede Ma

    Abstract: With the increasing popularity and accessibility of high dynamic range (HDR) photography, tone mapping operators (TMOs) for dynamic range compression are practically demanding. In this paper, we develop a two-stage neural network-based TMO that is self-calibrated and perceptually optimized. In Stage one, motivated by the physiology of the early stages of the human visual system, we first decompose…

    Submitted 25 August, 2023; v1 submitted 18 June, 2022; originally announced June 2022.

    Comments: 15 pages, 17 figures

  13. arXiv:2103.12827  [pdf, other]

    cs.LG eess.IV stat.ML

    Fisher Task Distance and Its Application in Neural Architecture Search

    Authors: Cat P. Le, Mohammadreza Soltani, Juncheng Dong, Vahid Tarokh

    Abstract: We formulate an asymmetric (or non-commutative) distance between tasks based on Fisher Information Matrices, called Fisher task distance. This distance represents the complexity of transferring the knowledge from one task to another. We provide a proof of consistency for our distance through theorems and experiments on various classification tasks from MNIST, CIFAR-10, CIFAR-100, ImageNet, and Tas…

    Submitted 30 April, 2022; v1 submitted 23 March, 2021; originally announced March 2021.

    Comments: Published in IEEE Access, Volume 10, 2022

  14. arXiv:2001.08366  [pdf, other]

    cs.CV cs.LG eess.IV stat.ML

    Continual Local Replacement for Few-shot Learning

    Authors: Canyu Le, Zhonggui Chen, Xihan Wei, Biao Wang, Lei Zhang

    Abstract: The goal of few-shot learning is to learn a model that can recognize novel classes based on one or few training data. It is challenging mainly due to two aspects: (1) it lacks good feature representation of novel classes; (2) a few of labeled data could not accurately represent the true data distribution and thus it's hard to learn a good decision function for classification. In this work, we use…

    Submitted 10 March, 2020; v1 submitted 22 January, 2020; originally announced January 2020.

    Comments: Update experiment results and reorganize paper writing