Showing 1–50 of 178 results for author: Vu, N

  1. arXiv:2407.02937  [pdf, other]

    cs.CL cs.SD eess.AS

    Probing the Feasibility of Multilingual Speaker Anonymization

    Authors: Sarina Meyer, Florian Lux, Ngoc Thang Vu

    Abstract: In speaker anonymization, speech recordings are modified in a way that the identity of the speaker remains hidden. While this technology could help to protect the privacy of individuals around the globe, current research restricts this by focusing almost exclusively on English data. In this study, we extend a state-of-the-art anonymization system to nine languages by transforming language-dependen…

    Submitted 3 July, 2024; originally announced July 2024.

    Comments: accepted at Interspeech 2024

  2. arXiv:2407.01381  [pdf, other]

    physics.chem-ph

    Polaritonic Chemistry using the Density Matrix Renormalization Group Method

    Authors: Mikuláš Matoušek, Nam Vu, Niranjan Govind, Jonathan J. Foley IV, Libor Veis

    Abstract: The emerging field of polaritonic chemistry explores the behavior of molecules under strong coupling with cavity modes. Despite recent developments in ab initio polaritonic methods for simulating polaritonic chemistry under electronic strong coupling, their capabilities are limited, especially in cases where the molecule also features strong electronic correlation. To bridge this gap, we have deve…

    Submitted 1 July, 2024; originally announced July 2024.

  3. arXiv:2407.00145  [pdf, other]

    physics.soc-ph math.DS

    Co-evolving networks for opinion and social dynamics in agent-based models

    Authors: Nataša Djurdjevac Conrad, Nhu Quang Vu, Sören Nagel

    Abstract: The rise of digital social media has strengthened the coevolution of public opinions and social interactions, that shape social structures and collective outcomes in increasingly complex ways. Existing literature often explores this interplay as a one-directional influence, focusing on how opinions determine social ties within adaptive networks. However, this perspective overlooks the intrinsic dy…

    Submitted 28 June, 2024; originally announced July 2024.

    MSC Class: 91Dxx; 05C82; 37Hxx

  4. arXiv:2406.19038  [pdf, other]

    gr-qc

    Binary neutron star mergers using a discontinuous Galerkin-finite difference hybrid method

    Authors: Nils Deppe, Francois Foucart, Marceline S. Bonilla, Michael Boyle, Nicholas J. Corso, Matthew D. Duez, Matthew Giesler, François Hébert, Lawrence E. Kidder, Yoonsoo Kim, Prayush Kumar, Isaac Legred, Geoffrey Lovelace, Elias R. Most, Jordan Moxon, Kyle C. Nelli, Harald P. Pfeiffer, Mark A. Scheel, Saul A. Teukolsky, William Throwe, Nils L. Vu

    Abstract: We present a discontinuous Galerkin-finite difference hybrid scheme that allows high-order shock capturing with the discontinuous Galerkin method for general relativistic magnetohydrodynamics in dynamical spacetimes. We present several optimizations and stability improvements to our algorithm that allow the hybrid method to successfully simulate single, rotating, and binary neutron stars. The hybr…

    Submitted 27 June, 2024; originally announced June 2024.

    Comments: 31 pages, 8 figures, comments welcome!

  5. arXiv:2406.09489  [pdf, other]

    cs.CV

    Language-driven Grasp Detection

    Authors: An Dinh Vuong, Minh Nhat Vu, Baoru Huang, Nghia Nguyen, Hieu Le, Thieu Vo, Anh Nguyen

    Abstract: Grasp detection is a persistent and intricate challenge with various industrial applications. Recently, many methods and datasets have been proposed to tackle the grasp detection problem. However, most of them do not consider using natural language as a condition to detect the grasp poses. In this paper, we introduce Grasp-Anything++, a new language-driven grasp detection dataset featuring 1M samp…

    Submitted 13 June, 2024; originally announced June 2024.

    Comments: 19 pages. Accepted to CVPR24

  6. arXiv:2406.09039  [pdf, other]

    cs.RO

    Language-Driven Closed-Loop Grasping with Model-Predictive Trajectory Replanning

    Authors: Huy Hoang Nguyen, Minh Nhat Vu, Florian Beck, Gerald Ebmer, Anh Nguyen, Andreas Kugi

    Abstract: Combining a vision module inside a closed-loop control system for a \emph{seamless movement} of a robot in a manipulation task is challenging due to the inconsistent update rates between utilized modules. This task is even more difficult in a dynamic environment, e.g., objects are moving. This paper presents a \emph{modular} zero-shot framework for language-driven manipulation of (dynamic) objects…

    Submitted 19 June, 2024; v1 submitted 13 June, 2024; originally announced June 2024.

    Comments: 9 pages, 6 figures

  7. arXiv:2406.08410  [pdf, other]

    gr-qc

    Quasistationary hair for binary black hole initial data in scalar Gauss-Bonnet gravity

    Authors: Peter James Nee, Guillermo Lara, Harald P. Pfeiffer, Nils L. Vu

    Abstract: Recent efforts to numerically simulate compact objects in alternative theories of gravity have largely focused on the time-evolution equations. Another critical aspect is the construction of constraint-satisfying initial data with precise control over the properties of the systems under consideration. Here, we augment the extended conformal thin sandwich framework to construct quasistationary init…

    Submitted 12 June, 2024; originally announced June 2024.

    Comments: 13 pages, 11 figures

  8. arXiv:2406.07124  [pdf, other]

    cs.AI cs.LG

    CHARME: A chain-based reinforcement learning approach for the minor embedding problem

    Authors: Hoang M. Ngo, Nguyen H K. Do, Minh N. Vu, Tamer Kahveci, My T. Thai

    Abstract: Quantum Annealing (QA) holds great potential for solving combinatorial optimization problems efficiently. However, the effectiveness of QA algorithms heavily relies on the embedding of problem instances, represented as logical graphs, into the quantum unit processing (QPU) whose topology is in form of a limited connectivity graph, known as the minor embedding Problem. Existing methods for the mino…

    Submitted 11 June, 2024; originally announced June 2024.

  9. arXiv:2406.06406  [pdf, other]

    cs.CL cs.SD eess.AS

    Controlling Emotion in Text-to-Speech with Natural Language Prompts

    Authors: Thomas Bott, Florian Lux, Ngoc Thang Vu

    Abstract: In recent years, prompting has quickly become one of the standard ways of steering the outputs of generative machine learning models, due to its intuitive use of natural language. In this work, we propose a system conditioned on embeddings derived from an emotionally rich text that serves as prompt. Thereby, a joint representation of speaker and prompt embeddings is integrated at several points wi…

    Submitted 11 June, 2024; v1 submitted 10 June, 2024; originally announced June 2024.

    Comments: accepted at Interspeech 2024

  10. arXiv:2406.06403  [pdf, other]

    cs.CL cs.LG cs.SD eess.AS

    Meta Learning Text-to-Speech Synthesis in over 7000 Languages

    Authors: Florian Lux, Sarina Meyer, Lyonel Behringer, Frank Zalkow, Phat Do, Matt Coler, Emanuël A. P. Habets, Ngoc Thang Vu

    Abstract: In this work, we take on the challenging task of building a single text-to-speech synthesis system that is capable of generating speech in over 7000 languages, many of which lack sufficient data for traditional TTS development. By leveraging a novel integration of massively multilingual pretraining and meta learning to approximate language representations, our approach enables zero-shot speech syn…

    Submitted 10 June, 2024; originally announced June 2024.

    Comments: accepted at Interspeech 2024

  11. arXiv:2405.09335  [pdf, other]

    cs.CL

    Prompting-based Synthetic Data Generation for Few-Shot Question Answering

    Authors: Maximilian Schmidt, Andrea Bartezzaghi, Ngoc Thang Vu

    Abstract: Although language models (LMs) have boosted the performance of Question Answering, they still need plenty of data. Data annotation, in contrast, is a time-consuming process. This especially applies to Question Answering, where possibly large documents have to be parsed and annotated with questions and their corresponding answers. Furthermore, Question Answering models often only work well for the…

    Submitted 15 May, 2024; originally announced May 2024.

    Comments: LREC-COLING 2024

  12. arXiv:2405.08868  [pdf, other]

    gr-qc hep-th

    A Review of Gravitational Memory and BMS Frame Fixing in Numerical Relativity

    Authors: Keefe Mitman, Michael Boyle, Leo C. Stein, Nils Deppe, Lawrence E. Kidder, Jordan Moxon, Harald P. Pfeiffer, Mark A. Scheel, Saul A. Teukolsky, William Throwe, Nils L. Vu

    Abstract: Gravitational memory effects and the BMS freedoms exhibited at future null infinity have recently been resolved and utilized in numerical relativity simulations. With this, gravitational wave models and our understanding of the fundamental nature of general relativity have been vastly improved. In this paper, we review the history and intuition behind memory effects and BMS symmetries, how they ma…

    Submitted 14 May, 2024; originally announced May 2024.

    Comments: 20 pages, 8 figures. Submitted to CQG's focus issue: Gravitational-Wave Memory Effects: From Theory to Observation

  13. arXiv:2405.06197  [pdf, other]

    gr-qc

    Improved frequency spectra of gravitational waves with memory in a binary-black-hole simulation

    Authors: Yitian Chen, Michael Boyle, Nils Deppe, Lawrence E. Kidder, Keefe Mitman, Jordan Moxon, Kyle C. Nelli, Harald P. Pfeiffer, Mark A. Scheel, William Throwe, Nils L. Vu, Saul A. Teukolsky

    Abstract: Numerical relativists can now produce gravitational waveforms with memory effects routinely and accurately. The gravitational-wave memory effect contains very low-frequency components, including a persistent offset. The presence of these components violates basic assumptions about time-shift behavior underpinning standard data-analysis techniques in gravitational-wave astronomy. This poses a chall…

    Submitted 9 May, 2024; originally announced May 2024.

    Comments: 24 pages, 11 figures, 5 tables

  14. arXiv:2405.06120  [pdf, other]

    gr-qc math.NA

    A discontinuous Galerkin scheme for elliptic equations on extremely stretched grids

    Authors: Nils L. Vu

    Abstract: Discontinuous Galerkin (DG) methods for solving elliptic equations are gaining popularity in the computational physics community for their high-order spectral convergence and their potential for parallelization on computing clusters. However, problems in numerical relativity with extremely stretched grids, such as initial data problems for binary black holes that impose boundary conditions at larg…

    Submitted 9 May, 2024; originally announced May 2024.

    Comments: 12 pages, 10 figures. Results are reproducible with the ancillary input files

  15. arXiv:2404.10922  [pdf, other]

    cs.CL cs.SD eess.AS

    Teaching a Multilingual Large Language Model to Understand Multilingual Speech via Multi-Instructional Training

    Authors: Pavel Denisov, Ngoc Thang Vu

    Abstract: Recent advancements in language modeling have led to the emergence of Large Language Models (LLMs) capable of various natural language processing tasks. Despite their success in text-based tasks, applying LLMs to the speech domain remains limited and challenging. This paper presents BLOOMZMMS, a novel model that integrates a multilingual LLM with a multilingual speech encoder, aiming to harness th…

    Submitted 16 April, 2024; originally announced April 2024.

    Comments: NAACL Findings 2024

  16. arXiv:2404.10222  [pdf, other]

    quant-ph

    Simulating electronic structure on bosonic quantum computers

    Authors: Rishab Dutta, Nam P. Vu, Ningyi Lyu, Chen Wang, Victor S. Batista

    Abstract: Computations with quantum harmonic oscillators or qumodes is a promising and rapidly evolving approach towards quantum computing. In contrast to qubits, which are two-level quantum systems, bosonic qumodes can in principle have infinite discrete levels, and can also be represented with continuous variable bases. One of the most promising applications of quantum computing is simulating many-fermion…

    Submitted 27 April, 2024; v1 submitted 15 April, 2024; originally announced April 2024.

    Comments: 47 pages including references, 7 figures, revised

  17. arXiv:2404.10214  [pdf, other]

    quant-ph

    Simulating Chemistry on Bosonic Quantum Devices

    Authors: Rishab Dutta, Delmar G. A. Cabral, Ningyi Lyu, Nam P. Vu, Yuchen Wang, Brandon Allen, Xiaohan Dan, Rodrigo G. Cortiñas, Pouya Khazaei, Max Schäfer, Alejandro C. C. d. Albornoz, Scott E. Smart, Scott Nie, Michel H. Devoret, David A. Mazziotti, Prineha Narang, Chen Wang, James D. Whitfield, Angela K. Wilson, Heidi P. Hendrickson, Daniel A. Lidar, Francisco Pérez-Bernal, Lea F. Santos, Sabre Kais, Eitan Geva, et al. (1 additional author not shown)

    Abstract: Bosonic quantum devices offer a novel approach to realize quantum computations, where the quantum two-level system (qubit) is replaced with the quantum (an)harmonic oscillator (qumode) as the fundamental building block of the quantum simulator. The simulation of chemical structure and dynamics can then be achieved by representing or mapping the system Hamiltonians in terms of bosonic operators. In…

    Submitted 5 July, 2024; v1 submitted 15 April, 2024; originally announced April 2024.

    Comments: 40 pages including references, 13 figures, revised

  18. arXiv:2404.07122  [pdf, other]

    cs.CV

    Driver Attention Tracking and Analysis

    Authors: Dat Viet Thanh Nguyen, Anh Tran, Hoai Nam Vu, Cuong Pham, Minh Hoai

    Abstract: We propose a novel method to estimate a driver's points-of-gaze using a pair of ordinary cameras mounted on the windshield and dashboard of a car. This is a challenging problem due to the dynamics of traffic environments with 3D scenes of unknown depths. This problem is further complicated by the volatile distance between the driver and the camera system. To tackle these challenges, we develop a n…

    Submitted 11 April, 2024; v1 submitted 10 April, 2024; originally announced April 2024.

  19. Superior Genetic Algorithms for the Target Set Selection Problem Based on Power-Law Parameter Choices and Simple Greedy Heuristics

    Authors: Benjamin Doerr, Martin S. Krejca, Nguyen Vu

    Abstract: The target set selection problem (TSS) asks for a set of vertices such that an influence spreading process started in these vertices reaches the whole graph. The current state of the art for this NP-hard problem are three recently proposed randomized search heuristics, namely a biased random-key genetic algorithm (BRKGA) obtained from extensive parameter tuning, a max-min ant system (MMAS), and a…

    Submitted 5 April, 2024; originally announced April 2024.

  20. arXiv:2403.17647  [pdf, other]

    cs.CL

    Intrinsic Subgraph Generation for Interpretable Graph based Visual Question Answering

    Authors: Pascal Tilli, Ngoc Thang Vu

    Abstract: The large success of deep learning based methods in Visual Question Answering (VQA) has concurrently increased the demand for explainable methods. Most methods in Explainable Artificial Intelligence (XAI) focus on generating post-hoc explanations rather than taking an intrinsic approach, the latter characterizing an interpretable model. In this work, we introduce an interpretable approach for grap…

    Submitted 27 March, 2024; v1 submitted 26 March, 2024; originally announced March 2024.

    Comments: Accepted at LREC-COLING 2024

  21. arXiv:2403.17582  [pdf, other]

    cs.CL cs.AI cs.LG

    Towards a Zero-Data, Controllable, Adaptive Dialog System

    Authors: Dirk Väth, Lindsey Vanderlyn, Ngoc Thang Vu

    Abstract: Conversational Tree Search (Väth et al., 2023) is a recent approach to controllable dialog systems, where domain experts shape the behavior of a Reinforcement Learning agent through a dialog tree. The agent learns to efficiently navigate this tree, while adapting to information needs, e.g., domain familiarity, of different users. However, the need for additional training data hinders deployment in…

    Submitted 26 March, 2024; originally announced March 2024.

  22. arXiv:2403.08705  [pdf, other]

    gr-qc

    Scalarization of isolated black holes in scalar Gauss-Bonnet theory in the fixing-the-equations approach

    Authors: Guillermo Lara, Harald P. Pfeiffer, Nikolas A. Wittek, Nils L. Vu, Kyle C. Nelli, Alexander Carpenter, Geoffrey Lovelace, Mark A. Scheel, William Throwe

    Abstract: One of the most promising avenues to perform numerical evolutions in theories beyond General Relativity is the fixing-the-equations approach, a proposal in which new ``driver'' equations are added to the evolution equations in a way that allows for stable numerical evolutions. In this direction, we extend the numerical relativity code SpECTRE to evolve a ``fixed'' version of scalar Gauss-Bonnet th…

    Submitted 13 March, 2024; originally announced March 2024.

    Comments: 16 pages, 12 figures

  23. arXiv:2403.05338  [pdf, other]

    cs.CL

    Explaining Pre-Trained Language Models with Attribution Scores: An Analysis in Low-Resource Settings

    Authors: Wei Zhou, Heike Adel, Hendrik Schuff, Ngoc Thang Vu

    Abstract: Attribution scores indicate the importance of different input parts and can, thus, explain model behaviour. Currently, prompt-based models are gaining popularity, i.a., due to their easier adaptability in low-resource settings. However, the quality of attribution scores extracted from prompt-based models has not been investigated yet. In this work, we address this topic by analyzing attribution sc…

    Submitted 8 March, 2024; originally announced March 2024.

  24. arXiv:2403.04784  [pdf, other]

    cs.CR cs.LG

    Analysis of Privacy Leakage in Federated Large Language Models

    Authors: Minh N. Vu, Truc Nguyen, Tre' R. Jeter, My T. Thai

    Abstract: With the rapid adoption of Federated Learning (FL) as the training and tuning protocol for applications utilizing Large Language Models (LLMs), recent research highlights the need for significant modifications to FL to accommodate the large-scale of LLMs. While substantial adjustments to the protocol have been introduced as a response, comprehensive privacy analysis for the adapted FL protocol is…

    Submitted 2 March, 2024; originally announced March 2024.

  25. arXiv:2402.04769  [pdf, other]

    cs.RO

    Hierarchical Motion Planning and Offline Robust Model Predictive Control for Autonomous Vehicles

    Authors: Hung Duy Nguyen, Minh Nhat Vu, Nguyen Ngoc Nam, Kyoungseok Han

    Abstract: Driving vehicles in complex scenarios under harsh conditions is the biggest challenge for autonomous vehicles (AVs). To address this issue, we propose hierarchical motion planning and robust control strategy using the front-active steering system in complex scenarios with various slippery road adhesion coefficients while considering vehicle uncertain parameters. Behaviors of human vehicles (HVs) a…

    Submitted 7 February, 2024; originally announced February 2024.

    Comments: 6 pages, 9 illustrations, Accepted for publication in American Control Conference (ACC) 2024

  26. arXiv:2402.04730  [pdf, other]

    cs.RO

    Model Predictive Trajectory Optimization With Dynamically Changing Waypoints for Serial Manipulators

    Authors: Florian Beck, Minh Nhat Vu, Christian Hartl-Nesic, Andreas Kugi

    Abstract: Systematically including dynamically changing waypoints as desired discrete actions, for instance, resulting from superordinate task planning, has been challenging for online model predictive trajectory optimization with short planning horizons. This paper presents a novel waypoint model predictive control (wMPC) concept for online replanning tasks. The main idea is to split the planning horizon a…

    Submitted 7 February, 2024; originally announced February 2024.

    Comments: 8 pages, 6 figures

  27. Striking the right tone: toward a self-consistent framework for measuring black hole ringdowns

    Authors: Teagan A. Clarke, Maximiliano Isi, Paul D. Lasky, Eric Thrane, Michael Boyle, Nils Deppe, Lawrence E. Kidder, Keefe Mitman, Jordan Moxon, Kyle C. Nelli, William Throwe, Nils L. Vu

    Abstract: The ringdown portion of a binary black hole merger consists of a sum of modes, each containing an infinite number of tones that are exponentially damped sinusoids. In principle, these can be measured as gravitational-waves with observatories like LIGO/Virgo/KAGRA, however in practice it is unclear how many tones can be meaningfully resolved. We investigate the consistency and resolvability of the…

    Submitted 11 June, 2024; v1 submitted 5 February, 2024; originally announced February 2024.

    Comments: 14 pages, 8 figures, 2 tables. Published in PRD

  28. arXiv:2401.17676  [pdf, other]

    cs.RO

    Observer-based Controller Design for Oscillation Damping of a Novel Suspended Underactuated Aerial Platform

    Authors: Hemjyoti Das, Minh Nhat Vu, Tobias Egle, Christian Ott

    Abstract: In this work, we present a novel actuation strategy for a suspended aerial platform. By utilizing an underactuation approach, we demonstrate the successful oscillation damping of the proposed platform, modeled as a spherical double pendulum. A state estimator is designed in order to obtain the deflection angles of the platform, which uses only onboard IMU measurements. The state estimator is an ex…

    Submitted 31 January, 2024; originally announced January 2024.

    Comments: 7 pages, 11 figures, Accepted for publication to ICRA 2024

  29. arXiv:2401.09059  [pdf, other]

    cs.RO cs.CV

    Autonomous Catheterization with Open-source Simulator and Expert Trajectory

    Authors: Tudor Jianu, Baoru Huang, Tuan Vo, Minh Nhat Vu, Jingxuan Kang, Hoan Nguyen, Olatunji Omisore, Pierre Berthet-Rayne, Sebastiano Fichera, Anh Nguyen

    Abstract: Endovascular robots have been actively developed in both academia and industry. However, progress toward autonomous catheterization is often hampered by the widespread use of closed-source simulators and physical phantoms. Additionally, the acquisition of large-scale datasets for training machine learning algorithms with endovascular robots is usually infeasible due to expensive medical procedures…

    Submitted 19 January, 2024; v1 submitted 17 January, 2024; originally announced January 2024.

    Comments: Code: https://github.com/airvlab/cathsim

  30. arXiv:2401.00805  [pdf, other]

    gr-qc astro-ph.CO

    Nonlinear Effects In Black Hole Ringdown From Scattering Experiments I: spin and initial data dependence of quadratic mode coupling

    Authors: Hengrui Zhu, Justin L. Ripley, Frans Pretorius, Sizheng Ma, Keefe Mitman, Robert Owen, Michael Boyle, Yitian Chen, Nils Deppe, Lawrence E. Kidder, Jordan Moxon, Kyle C. Nelli, Harald P. Pfeiffer, Mark A. Scheel, William Throwe, Nils L. Vu

    Abstract: We investigate quadratic quasinormal mode coupling in black hole spacetime through numerical simulations of single perturbed black holes using both numerical relativity and second-order black hole perturbation theory. Focusing on the dominant $\ell=|m|=2$ quadrupolar modes, we find good agreement (within $\sim10\%$) between these approaches, with discrepancies attributed to truncation error and un…

    Submitted 1 January, 2024; originally announced January 2024.

  31. arXiv:2312.08588  [pdf, other]

    gr-qc astro-ph.CO astro-ph.SR

    Black Hole Spectroscopy for Precessing Binary Black Hole Coalescences

    Authors: Hengrui Zhu, Harrison Siegel, Keefe Mitman, Maximiliano Isi, Will M. Farr, Michael Boyle, Nils Deppe, Lawrence E. Kidder, Sizheng Ma, Jordan Moxon, Kyle C. Nelli, Harald P. Pfeiffer, Mark A. Scheel, Saul A. Teukolsky, William Throwe, Vijay Varma, Nils L. Vu

    Abstract: To accurately perform black hole spectroscopy, it is essential to know which quasinormal modes dominate astrophysical ringdown signals. In this Letter, we present a phenomenological description of the quasinormal modes that are excited in the ringdowns of comparable mass, quasi-circular precessing binary black hole coalescences. By analyzing an exhaustive catalog of numerical relativity simulation…

    Submitted 13 December, 2023; originally announced December 2023.

    Comments: Data Release and Analysis Scripts: https://github.com/HengruiPrinceton/precession_ringdown

  32. arXiv:2311.14465  [pdf, other]

    cs.CL

    DP-NMT: Scalable Differentially-Private Machine Translation

    Authors: Timour Igamberdiev, Doan Nam Long Vu, Felix Künnecke, Zhuo Yu, Jannik Holmer, Ivan Habernal

    Abstract: Neural machine translation (NMT) is a widely popular text generation task, yet there is a considerable research gap in the development of privacy-preserving NMT models, despite significant data privacy concerns for NMT systems. Differentially private stochastic gradient descent (DP-SGD) is a popular method for training machine learning models with concrete privacy guarantees; however, the implemen…

    Submitted 24 April, 2024; v1 submitted 24 November, 2023; originally announced November 2023.

    Comments: Accepted at EACL 2024

  33. Controllable Generation of Artificial Speaker Embeddings through Discovery of Principal Directions

    Authors: Florian Lux, Pascal Tilli, Sarina Meyer, Ngoc Thang Vu

    Abstract: Customizing voice and speaking style in a speech synthesis system with intuitive and fine-grained controls is challenging, given that little data with appropriate labels is available. Furthermore, editing an existing human's voice also comes with ethical concerns. In this paper, we propose a method to generate artificial speaker embeddings that cannot be linked to a real human while offering intui…

    Submitted 26 October, 2023; originally announced October 2023.

    Comments: Published at ISCA Interspeech 2023 https://www.isca-speech.org/archive/interspeech_2023/lux23_interspeech.html

  34. arXiv:2310.17499  [pdf, other]

    cs.CL cs.LG eess.AS

    The IMS Toucan System for the Blizzard Challenge 2023

    Authors: Florian Lux, Julia Koch, Sarina Meyer, Thomas Bott, Nadja Schauffler, Pavel Denisov, Antje Schweitzer, Ngoc Thang Vu

    Abstract: For our contribution to the Blizzard Challenge 2023, we improved on the system we submitted to the Blizzard Challenge 2021. Our approach entails a rule-based text-to-phoneme processing system that includes rule-based disambiguation of homographs in the French language. It then transforms the phonemes to spectrograms as intermediate representations using a fast and efficient non-autoregressive synt…

    Submitted 26 October, 2023; originally announced October 2023.

    Comments: Published at the Blizzard Challenge Workshop 2023, colocated with the Speech Synthesis Workshop 2023, a satellite event of Interspeech 2023

  35. arXiv:2310.16618  [pdf, other]

    cs.CV cs.RO

    Real-time 6-DoF Pose Estimation by an Event-based Camera using Active LED Markers

    Authors: Gerald Ebmer, Adam Loch, Minh Nhat Vu, Germain Haessig, Roberto Mecca, Markus Vincze, Christian Hartl-Nesic, Andreas Kugi

    Abstract: Real-time applications for autonomous operations depend largely on fast and robust vision-based localization systems. Since image processing tasks require processing large amounts of data, the computational resources often limit the performance of other processes. To overcome this limitation, traditional marker-based localization systems are widely used since they are easy to integrate and achieve…

    Submitted 25 October, 2023; originally announced October 2023.

    Comments: 14 pages, 12 figures, this paper has been accepted to WACV 2024

  36. arXiv:2310.15948  [pdf, other]

    cs.CV

    Language-driven Scene Synthesis using Multi-conditional Diffusion Model

    Authors: An Vuong, Minh Nhat Vu, Toan Tien Nguyen, Baoru Huang, Dzung Nguyen, Thieu Vo, Anh Nguyen

    Abstract: Scene synthesis is a challenging problem with several industrial applications. Recently, substantial efforts have been directed to synthesize the scene using human motions, room layouts, or spatial graphs as the input. However, few studies have addressed this problem from multiple modalities, especially combining text prompts. In this paper, we propose a language-driven scene synthesis task, which…

    Submitted 24 October, 2023; originally announced October 2023.

    Comments: Accepted to NeurIPS 2023

  37. arXiv:2310.15262  [pdf, other]

    cs.CL

    Data Augmentation Techniques for Machine Translation of Code-Switched Texts: A Comparative Study

    Authors: Injy Hamed, Nizar Habash, Ngoc Thang Vu

    Abstract: Code-switching (CSW) text generation has been receiving increasing attention as a solution to address data scarcity. In light of this growing interest, we need more comprehensive studies comparing different augmentation approaches. In this work, we compare three popular approaches: lexical replacements, linguistic theories, and back-translation (BT), in the context of Egyptian Arabic-English CSW.…

    Submitted 23 October, 2023; originally announced October 2023.

    Comments: Findings of EMNLP 2023

  38. arXiv:2310.06103  [pdf, other]

    cs.CL cs.SD eess.AS

    Leveraging Multilingual Self-Supervised Pretrained Models for Sequence-to-Sequence End-to-End Spoken Language Understanding

    Authors: Pavel Denisov, Ngoc Thang Vu

    Abstract: A number of methods have been proposed for End-to-End Spoken Language Understanding (E2E-SLU) using pretrained models, however their evaluation often lacks multilingual setup and tasks that require prediction of lexical fillers, such as slot filling. In this work, we propose a unified method that integrates multilingual pretrained speech and text models and performs E2E-SLU on six datasets in four…

    Submitted 9 October, 2023; originally announced October 2023.

    Comments: IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU) 2023

  39. arXiv:2309.10932  [pdf, other]

    cs.RO

    Open-Vocabulary Affordance Detection using Knowledge Distillation and Text-Point Correlation

    Authors: Tuan Van Vo, Minh Nhat Vu, Baoru Huang, Toan Nguyen, Ngan Le, Thieu Vo, Anh Nguyen

    Abstract: Affordance detection presents intricate challenges and has a wide range of robotic applications. Previous works have faced limitations such as the complexities of 3D object shapes, the wide range of potential affordances on real-world objects, and the lack of open-vocabulary support for affordance understanding. In this paper, we introduce a new open-vocabulary affordance detection method in 3D po…

    Submitted 19 September, 2023; originally announced September 2023.

    Comments: 8 pages

  40. arXiv:2309.10911  [pdf, other]

    cs.RO

    Language-Conditioned Affordance-Pose Detection in 3D Point Clouds

    Authors: Toan Nguyen, Minh Nhat Vu, Baoru Huang, Tuan Van Vo, Vy Truong, Ngan Le, Thieu Vo, Bac Le, Anh Nguyen

    Abstract: Affordance detection and pose estimation are of great importance in many robotic applications. Their combination helps the robot gain an enhanced manipulation capability, in which the generated pose can facilitate the corresponding affordance task. Previous methods for affordance-pose joint learning are limited to a predefined set of affordances, thus limiting the adaptability of robots in real-wor…

    Submitted 19 September, 2023; originally announced September 2023.

    Comments: Project page: https://3DAPNet.github.io

  41. arXiv:2309.09818  [pdf, other]

    cs.RO cs.CV

    Grasp-Anything: Large-scale Grasp Dataset from Foundation Models

    Authors: An Dinh Vuong, Minh Nhat Vu, Hieu Le, Baoru Huang, Binh Huynh, Thieu Vo, Andreas Kugi, Anh Nguyen

    Abstract: Foundation models such as ChatGPT have made significant strides in robotic tasks due to their universal representation of real-world domains. In this paper, we leverage foundation models to tackle grasp detection, a persistent challenge in robotics with broad industrial applications. Despite numerous grasp datasets, their object diversity remains limited compared to real-world figures. Fortunately…

    Submitted 18 September, 2023; originally announced September 2023.

    Comments: Project page: https://grasp-anything-2023.github.io

  42. VoicePAT: An Efficient Open-source Evaluation Toolkit for Voice Privacy Research

    Authors: Sarina Meyer, Xiaoxiao Miao, Ngoc Thang Vu

    Abstract: Speaker anonymization is the task of modifying a speech recording such that the original speaker cannot be identified anymore. Since the first Voice Privacy Challenge in 2020, along with the release of a framework, the popularity of this research topic is continually increasing. However, the comparison and combination of different anonymization approaches remains challenging due to the complexity…

    Submitted 21 December, 2023; v1 submitted 14 September, 2023; originally announced September 2023.

    Comments: Accepted by OJSP-ICASSP 2024 https://ieeexplore.ieee.org/document/10365329

  43. arXiv:2308.15005  [pdf, other]

    cs.CV

    Few-Shot Object Detection via Synthetic Features with Optimal Transport

    Authors: Anh-Khoa Nguyen Vu, Thanh-Toan Do, Vinh-Tiep Nguyen, Tam Le, Minh-Triet Tran, Tam V. Nguyen

    Abstract: Few-shot object detection aims to simultaneously localize and classify the objects in an image with limited training samples. However, most existing few-shot object detection methods focus on extracting the features of a few samples of novel classes that lack diversity. Hence, they may not be sufficient to capture the data distribution. To address that limitation, in this paper, we propose a novel…

    Submitted 29 August, 2023; v1 submitted 28 August, 2023; originally announced August 2023.

  44. Fully relativistic three-dimensional Cauchy-characteristic matching for physical degrees of freedom

    Authors: Sizheng Ma, Jordan Moxon, Mark A. Scheel, Kyle C. Nelli, Nils Deppe, Marceline S. Bonilla, Lawrence E. Kidder, Prayush Kumar, Geoffrey Lovelace, William Throwe, Nils L. Vu

    Abstract: A fully relativistic three-dimensional Cauchy-characteristic matching (CCM) algorithm is implemented for physical degrees of freedom in a numerical relativity code SpECTRE. The method is free of approximations and can be applied to any physical system. We test the algorithm with various scenarios involving smooth data, including the propagation of Teukolsky waves within a flat background, the pert…

    Submitted 11 June, 2024; v1 submitted 20 August, 2023; originally announced August 2023.

    Journal ref: Phys. Rev. D 109, 124027 (2024)

  45. arXiv:2308.06420  [pdf, other]

    cs.CV

    M&M: Tackling False Positives in Mammography with a Multi-view and Multi-instance Learning Sparse Detector

    Authors: Yen Nhi Truong Vu, Dan Guo, Ahmed Taha, Jason Su, Thomas Paul Matthews

    Abstract: Deep-learning-based object detection methods show promise for improving screening mammography, but high rates of false positives can hinder their effectiveness in clinical practice. To reduce false positives, we identify three challenges: (1) unlike natural images, a malignant mammogram typically contains only one malignant finding; (2) mammography exams contain two views of each breast, and both…

    Submitted 11 August, 2023; originally announced August 2023.

    Comments: MICCAI 2023 with supplementary materials

  46. Extending black-hole remnant surrogate models to extreme mass ratios

    Authors: Matteo Boschini, Davide Gerosa, Vijay Varma, Cristobal Armaza, Michael Boyle, Marceline S. Bonilla, Andrea Ceja, Yitian Chen, Nils Deppe, Matthew Giesler, Lawrence E. Kidder, Prayush Kumar, Guillermo Lara, Oliver Long, Sizheng Ma, Keefe Mitman, Peter James Nee, Harald P. Pfeiffer, Antoni Ramos-Buades, Mark A. Scheel, Nils L. Vu, Jooheon Yoo

    Abstract: Numerical-relativity surrogate models for both black-hole merger waveforms and remnants have emerged as important tools in gravitational-wave astronomy. While producing very accurate predictions, their applicability is limited to the region of the parameter space where numerical-relativity simulations are available and computationally feasible. Notably, this excludes extreme mass ratios. We presen…

    Submitted 24 October, 2023; v1 submitted 7 July, 2023; originally announced July 2023.

    Comments: 10 pages, 3 figures. Published in PRD. Model publicly available at https://pypi.org/project/surfinBH

    Journal ref: Phys.Rev.D 108 (2023) 8, 084015

  47. arXiv:2306.11377  [pdf, other]

    cs.CV

    HabiCrowd: A High Performance Simulator for Crowd-Aware Visual Navigation

    Authors: An Dinh Vuong, Toan Tien Nguyen, Minh Nhat Vu, Baoru Huang, Dzung Nguyen, Huynh Thi Thanh Binh, Thieu Vo, Anh Nguyen

    Abstract: Visual navigation, a foundational aspect of Embodied AI (E-AI), has been significantly studied in the past few years. While many 3D simulators have been introduced to support visual navigation tasks, few works have been directed towards combining human dynamics, creating a gap between simulation and real-world applications. Furthermore, current 3D simulators incorporating human dynamics hav…

    Submitted 20 June, 2023; originally announced June 2023.

    Comments: 14 pages, 10 figures

  48. arXiv:2306.06804  [pdf, other]

    cs.CL stat.ML

    Neural Machine Translation for the Indigenous Languages of the Americas: An Introduction

    Authors: Manuel Mager, Rajat Bhatnagar, Graham Neubig, Ngoc Thang Vu, Katharina Kann

    Abstract: Neural models have drastically advanced the state of the art for machine translation (MT) between high-resource languages. Traditionally, these models rely on large amounts of training data, but many language pairs lack these resources. However, an important part of the languages in the world do not have this amount of data. Most languages from the Americas are among them, having a limited amount of p…

    Submitted 11 June, 2023; originally announced June 2023.

    Comments: Accepted to AmericasNLP 2023

  49. arXiv:2306.05653  [pdf]

    physics.bio-ph physics.optics q-bio.QM

    Rapid, antibiotic incubation-free determination of tuberculosis drug resistance using machine learning and Raman spectroscopy

    Authors: Babatunde Ogunlade, Loza F. Tadesse, Hongquan Li, Nhat Vu, Niaz Banaei, Amy K. Barczak, Amr A. E. Saleh, Manu Prakash, Jennifer A. Dionne

    Abstract: Tuberculosis (TB) is the world's deadliest infectious disease, with over 1.5 million deaths annually and 10 million new cases reported each year. The causative organism, Mycobacterium tuberculosis (Mtb) can take nearly 40 days to culture, a required step to determine the pathogen's antibiotic susceptibility. Both rapid identification of Mtb and rapid antibiotic susceptibility testing (AST) are ess…

    Submitted 9 April, 2024; v1 submitted 8 June, 2023; originally announced June 2023.

  50. arXiv:2306.04755  [pdf, other]

    astro-ph.HE gr-qc

    A positivity-preserving adaptive-order finite-difference scheme for GRMHD

    Authors: Nils Deppe, Lawrence E. Kidder, Saul A. Teukolsky, Marceline S. Bonilla, François Hébert, Yoonsoo Kim, Mark A. Scheel, William Throwe, Nils L. Vu

    Abstract: We present an adaptive-order positivity-preserving conservative finite-difference scheme that allows a high-order solution away from shocks and discontinuities while guaranteeing positivity and robustness at discontinuities. This is achieved by monitoring the relative power in the highest mode of the reconstructed polynomial and reducing the order when the polynomial series no longer converges. Ou…

    Submitted 18 January, 2024; v1 submitted 7 June, 2023; originally announced June 2023.

    Comments: 48 pages, 17 figures. Matches published version, minor changes only