Showing 1–50 of 341 results for author: Le, H

Searching in archive cs.
  1. arXiv:2407.00021  [pdf, other]

    cs.CV cs.GR eess.IV

    Neural Graphics Texture Compression Supporting Random Access

    Authors: Farzad Farhadzadeh, Qiqi Hou, Hoang Le, Amir Said, Randall Rauwendaal, Alex Bourd, Fatih Porikli

    Abstract: Advances in rendering have led to tremendous growth in texture assets, including resolution, complexity, and novel textures components, but this growth in data volume has not been matched by advances in its compression. Meanwhile Neural Image Compression (NIC) has advanced significantly and shown promising results, but the proposed methods cannot be directly adapted to neural texture compression.…

    Submitted 6 May, 2024; originally announced July 2024.

    Comments: ECCV submission

  2. arXiv:2406.19765  [pdf, other]

    cs.SE cs.LG

    Systematic Literature Review on Application of Learning-based Approaches in Continuous Integration

    Authors: Ali Kazemi Arani, Triet Huynh Minh Le, Mansooreh Zahedi, M. Ali Babar

    Abstract: Context: Machine learning (ML) and deep learning (DL) analyze raw data to extract valuable insights in specific phases. The rise of continuous practices in software projects emphasizes automating Continuous Integration (CI) with these learning-based methods, while the growing adoption of such approaches underscores the need for systematizing knowledge. Objective: Our objective is to comprehensivel…

    Submitted 28 June, 2024; originally announced June 2024.

    Comments: This paper has been accepted to be published in IEEE Access

  3. arXiv:2406.15883  [pdf, other]

    cs.CL cs.AI

    SimSMoE: Solving Representational Collapse via Similarity Measure

    Authors: Giang Do, Hung Le, Truyen Tran

    Abstract: Sparse mixture of experts (SMoE) have emerged as an effective approach for scaling large language models while keeping a constant computational cost. Regardless of several notable successes of SMoE, effective training such architecture remains elusive due to the representation collapse problem, which in turn harms model performance and causes parameter redundancy. In this work, we present Similari…

    Submitted 22 June, 2024; originally announced June 2024.

  4. arXiv:2406.09489  [pdf, other]

    cs.CV

    Language-driven Grasp Detection

    Authors: An Dinh Vuong, Minh Nhat Vu, Baoru Huang, Nghia Nguyen, Hieu Le, Thieu Vo, Anh Nguyen

    Abstract: Grasp detection is a persistent and intricate challenge with various industrial applications. Recently, many methods and datasets have been proposed to tackle the grasp detection problem. However, most of them do not consider using natural language as a condition to detect the grasp poses. In this paper, we introduce Grasp-Anything++, a new language-driven grasp detection dataset featuring 1M samp…

    Submitted 13 June, 2024; originally announced June 2024.

    Comments: 19 pages. Accepted to CVPR24

  5. arXiv:2406.08953  [pdf, other]

    cs.CV cs.LG

    Preserving Identity with Variational Score for General-purpose 3D Editing

    Authors: Duong H. Le, Tuan Pham, Aniruddha Kembhavi, Stephan Mandt, Wei-Chiu Ma, Jiasen Lu

    Abstract: We present Piva (Preserving Identity with Variational Score Distillation), a novel optimization-based method for editing images and 3D models based on diffusion models. Specifically, our approach is inspired by the recently proposed method for 2D image editing - Delta Denoising Score (DDS). We pinpoint the limitations in DDS for 2D and 3D editing, which causes detail loss and over-saturation. To a…

    Submitted 13 June, 2024; originally announced June 2024.

    Comments: 22 pages, 14 figures

  6. arXiv:2406.06239  [pdf, other]

    cs.CV

    I-MPN: Inductive Message Passing Network for Effective and Efficient Human-in-the-Loop Annotation of Mobile Eye Tracking Data

    Authors: Hoang H. Le, Duy M. H. Nguyen, Omair Shahzad Bhatti, Laszlo Kopacsi, Thinh P. Ngo, Binh T. Nguyen, Michael Barz, Daniel Sonntag

    Abstract: Understanding human visual processing in dynamic environments is essential for psychology and human-centered interaction design. Mobile eye-tracking systems, combining egocentric video and gaze signals, offer valuable insights. However, manual analysis of these recordings is time-intensive. In this work, we present a novel human-centered learning algorithm designed for automated object recognition…

    Submitted 10 June, 2024; originally announced June 2024.

    Comments: First version

  7. arXiv:2405.18435  [pdf, other]

    eess.IV cs.CV

    QUBIQ: Uncertainty Quantification for Biomedical Image Segmentation Challenge

    Authors: Hongwei Bran Li, Fernando Navarro, Ivan Ezhov, Amirhossein Bayat, Dhritiman Das, Florian Kofler, Suprosanna Shit, Diana Waldmannstetter, Johannes C. Paetzold, Xiaobin Hu, Benedikt Wiestler, Lucas Zimmer, Tamaz Amiranashvili, Chinmay Prabhakar, Christoph Berger, Jonas Weidner, Michelle Alonso-Basant, Arif Rashid, Ujjwal Baid, Wesam Adel, Deniz Ali, Bhakti Baheti, Yingbin Bai, Ishaan Bhatt, Sabri Can Cetindag, et al. (55 additional authors not shown)

    Abstract: Uncertainty in medical image segmentation tasks, especially inter-rater variability, arising from differences in interpretations and annotations by various experts, presents a significant challenge in achieving consistent and reliable image segmentation. This variability not only reflects the inherent complexity and subjective nature of medical image interpretation but also directly impacts the de…

    Submitted 24 June, 2024; v1 submitted 19 March, 2024; originally announced May 2024.

    Comments: initial technical report

  8. arXiv:2405.17926  [pdf, other]

    cs.CV

    SarcNet: A Novel AI-based Framework to Automatically Analyze and Score Sarcomere Organizations in Fluorescently Tagged hiPSC-CMs

    Authors: Huyen Le, Khiet Dang, Tien Lai, Nhung Nguyen, Mai Tran, Hieu Pham

    Abstract: Quantifying sarcomere structure organization in human-induced pluripotent stem cell-derived cardiomyocytes (hiPSC-CMs) is crucial for understanding cardiac disease pathology, improving drug screening, and advancing regenerative medicine. Traditional methods, such as manual annotation and Fourier transform analysis, are labor-intensive, error-prone, and lack high-throughput capabilities. In this st…

    Submitted 28 May, 2024; originally announced May 2024.

  9. arXiv:2405.16388  [pdf, other]

    cs.CL cs.LG

    Multi-Reference Preference Optimization for Large Language Models

    Authors: Hung Le, Quan Tran, Dung Nguyen, Kien Do, Saloni Mittal, Kelechi Ogueji, Svetha Venkatesh

    Abstract: How can Large Language Models (LLMs) be aligned with human intentions and values? A typical solution is to gather human preference on model outputs and finetune the LLMs accordingly while ensuring that updates do not deviate too far from a reference model. Recent approaches, such as direct preference optimization (DPO), have eliminated the need for unstable and sluggish reinforcement learning opti…

    Submitted 25 May, 2024; originally announced May 2024.

    Comments: 20 pages

  10. arXiv:2405.15665  [pdf]

    cs.SE

    Examining Ownership Models in Software Teams: A Systematic Literature Review and a Replication Study

    Authors: Umme Ayman Koana, Quang Hy Le, Shadikur Rahman, Chris Carlson, Francis Chew, Maleknaz Nayebi

    Abstract: Effective ownership of software artifacts, particularly code, is crucial for accountability, knowledge sharing, and code quality enhancement. Researchers have proposed models linking ownership of software artifacts with developer performance and code quality. Our study aims to systematically examine various ownership models and provide a structured literature overview. Conducting a systematic lite…

    Submitted 24 May, 2024; originally announced May 2024.

    Comments: Pre-print an accepted paper for the ESE journal

  11. arXiv:2405.15394  [pdf, other]

    cs.CV

    Leveraging knowledge distillation for partial multi-task learning from multiple remote sensing datasets

    Authors: Hoàng-Ân Lê, Minh-Tan Pham

    Abstract: Partial multi-task learning where training examples are annotated for one of the target tasks is a promising idea in remote sensing as it allows combining datasets annotated for different tasks and predicting more tasks with fewer network parameters. The naïve approach to partial multi-task learning is sub-optimal due to the lack of all-task annotations for learning joint representations. This pap…

    Submitted 24 May, 2024; originally announced May 2024.

    Comments: Accepted for oral presentation at IGARSS 2024

  12. arXiv:2405.07615  [pdf, other]

    cs.CL

    ViWikiFC: Fact-Checking for Vietnamese Wikipedia-Based Textual Knowledge Source

    Authors: Hung Tuan Le, Long Truong To, Manh Trong Nguyen, Kiet Van Nguyen

    Abstract: Fact-checking is essential due to the explosion of misinformation in the media ecosystem. Although false information exists in every language and country, most research to solve the problem mainly concentrated on huge communities like English and Chinese. Low-resource languages like Vietnamese are necessary to explore corpora and models for fact verification. To bridge this gap, we construct ViWik…

    Submitted 13 May, 2024; originally announced May 2024.

  13. arXiv:2405.01979  [pdf, other]

    cs.IT eess.SP

    Graph Neural Network based Active and Passive Beamforming for Distributed STAR-RIS-Assisted Multi-User MISO Systems

    Authors: Ha An Le, Trinh Van Chien, Wan Choi

    Abstract: This paper investigates a joint active and passive beamforming design for distributed simultaneous transmitting and reflecting (STAR) reconfigurable intelligent surface (RIS) assisted multi-user (MU)-multiple input single output (MISO) systems, where the energy splitting (ES) mode is considered for the STAR-RIS. We aim to design the active beamforming vectors at the base station (BS) and the passi…

    Submitted 3 May, 2024; originally announced May 2024.

    Comments: 13 pages, 7 figures

  14. arXiv:2405.00681  [pdf, other]

    eess.SP cs.IT cs.NI eess.SY

    Delay and Overhead Efficient Transmission Scheduling for Federated Learning in UAV Swarms

    Authors: Duc N. M. Hoang, Vu Tuan Truong, Hung Duy Le, Long Bao Le

    Abstract: This paper studies the wireless scheduling design to coordinate the transmissions of (local) model parameters of federated learning (FL) for a swarm of unmanned aerial vehicles (UAVs). The overall goal of the proposed design is to realize the FL training and aggregation processes with a central aggregator exploiting the sensory data collected by the UAVs but it considers the multi-hop wireless net…

    Submitted 22 February, 2024; originally announced May 2024.

    Comments: accepted to WCNC'24

  15. arXiv:2404.17110  [pdf, other]

    cs.SE cs.CR cs.LG

    Software Vulnerability Prediction in Low-Resource Languages: An Empirical Study of CodeBERT and ChatGPT

    Authors: Triet H. M. Le, M. Ali Babar, Tung Hoang Thai

    Abstract: Background: Software Vulnerability (SV) prediction in emerging languages is increasingly important to ensure software security in modern systems. However, these languages usually have limited SV data for developing high-performing prediction models. Aims: We conduct an empirical study to evaluate the impact of SV data scarcity in emerging languages on the state-of-the-art SV prediction model and i…

    Submitted 25 April, 2024; originally announced April 2024.

    Comments: Accepted in the 4th International Workshop on Software Security co-located with the 28th International Conference on Evaluation and Assessment in Software Engineering (EASE) 2024

  16. arXiv:2404.11870  [pdf, ps, other]

    cs.LG cs.CL

    Enhancing Length Extrapolation in Sequential Models with Pointer-Augmented Neural Memory

    Authors: Hung Le, Dung Nguyen, Kien Do, Svetha Venkatesh, Truyen Tran

    Abstract: We propose Pointer-Augmented Neural Memory (PANM) to help neural networks understand and apply symbol processing to new, longer sequences of data. PANM integrates an external neural memory that uses novel physical addresses and pointer manipulation techniques to mimic human and computer symbol processing abilities. PANM facilitates pointer assignment, dereference, and arithmetic by explicitly usin…

    Submitted 17 April, 2024; originally announced April 2024.

    Comments: Preprint

  17. arXiv:2404.10730   

    cs.LG cs.AI

    Insight Gained from Migrating a Machine Learning Model to Intelligence Processing Units

    Authors: Hieu Le, Zhenhua He, Mai Le, Dhruva K. Chakravorty, Lisa M. Perez, Akhil Chilumuru, Yan Yao, Jiefu Chen

    Abstract: The discoveries in this paper show that Intelligence Processing Units (IPUs) offer a viable accelerator alternative to GPUs for machine learning (ML) applications within the fields of materials science and battery research. We investigate the process of migrating a model from GPU to IPU and explore several optimization techniques, including pipelining and gradient accumulation, aimed at enhancing…

    Submitted 16 April, 2024; originally announced April 2024.

    Comments: arXiv admin note: This version has been removed by arXiv administrators as the submitter did not have the right to agree to the license at the time of submission

  18. arXiv:2404.10404  [pdf, other]

    cs.CR

    Sisu: Decentralized Trustless Bridge For Full Ethereum Node

    Authors: Billy Pham, Huy Le

    Abstract: In this paper, we present a detailed approach and implementation to prove Ethereum full node using recursive SNARK, distributed general GKR and Groth16. Our protocol's name is Sisu whose architecture is based on distributed Virgo in zkBridge with some major improvements. Besides proving signature aggregation, we provide solutions to 2 hard problems in proving Ethereum full node: 1) any public key…

    Submitted 17 April, 2024; v1 submitted 16 April, 2024; originally announced April 2024.

  19. arXiv:2404.09259  [pdf, other]

    cs.CV cs.AI

    FedCCL: Federated Dual-Clustered Feature Contrast Under Domain Heterogeneity

    Authors: Yu Qiao, Huy Q. Le, Mengchun Zhang, Apurba Adhikary, Chaoning Zhang, Choong Seon Hong

    Abstract: Federated learning (FL) facilitates a privacy-preserving neural network training paradigm through collaboration between edge clients and a central server. One significant challenge is that the distributed data is not independently and identically distributed (non-IID), typically including both intra-domain and inter-domain heterogeneity. However, recent research is limited to simply using averaged…

    Submitted 14 April, 2024; originally announced April 2024.

  20. arXiv:2404.05045  [pdf, other]

    cs.CG cs.DS

    Spanners in Planar Domains via Steiner Spanners and non-Steiner Tree Covers

    Authors: Sujoy Bhore, Balázs Keszegh, Andrey Kupavskii, Hung Le, Alexandre Louvet, Dömötör Pálvölgyi, Csaba D. Tóth

    Abstract: We study spanners in planar domains, including polygonal domains, polyhedral terrain, and planar metrics. Previous work showed that for any constant $ε\in (0,1)$, one could construct a $(2+ε)$-spanner with $O(n\log(n))$ edges (SICOMP 2019), and there is a lower bound of $Ω(n^2)$ edges for any $(2-ε)$-spanner (SoCG 2015). The main open question is whether a linear number of edges suffices and the s…

    Submitted 7 April, 2024; originally announced April 2024.

    Comments: 40 pages, 11 figures. Abstract shorten to meet Arxiv limits

  21. arXiv:2404.02717  [pdf, other]

    cs.CL cs.LG

    Automatic Prompt Selection for Large Language Models

    Authors: Viet-Tung Do, Van-Khanh Hoang, Duy-Hung Nguyen, Shahab Sabahi, Jeff Yang, Hajime Hotta, Minh-Tien Nguyen, Hung Le

    Abstract: Large Language Models (LLMs) can perform various natural language processing tasks with suitable instruction prompts. However, designing effective prompts manually is challenging and time-consuming. Existing methods for automatic prompt optimization either lack flexibility or efficiency. In this paper, we propose an effective approach to automatically select the optimal prompt for a given input fr…

    Submitted 3 April, 2024; originally announced April 2024.

    Comments: preprint

  22. arXiv:2403.18172  [pdf, other]

    cs.RO

    Vision-Based Force Estimation for Minimally Invasive Telesurgery Through Contact Detection and Local Stiffness Models

    Authors: Shuyuan Yang, My H. Le, Kyle R. Golobish, Juan C. Beaver, Zonghe Chua

    Abstract: In minimally invasive telesurgery, obtaining accurate force information is difficult due to the complexities of in-vivo end effector force sensing. This constrains development and implementation of haptic feedback and force-based automated performance metrics, respectively. Vision-based force sensing approaches using deep learning are a promising alternative to intrinsic end effector force sensing…

    Submitted 26 March, 2024; originally announced March 2024.

    Comments: Preprint of an article accepted in Journal of Medical Robotics Research ©2024 copyright World Scientific Publishing Company

  23. arXiv:2403.17879  [pdf, other]

    cs.CV eess.IV

    Low-Latency Neural Stereo Streaming

    Authors: Qiqi Hou, Farzad Farhadzadeh, Amir Said, Guillaume Sautiere, Hoang Le

    Abstract: The rise of new video modalities like virtual reality or autonomous driving has increased the demand for efficient multi-view video compression methods, both in terms of rate-distortion (R-D) performance and in terms of delay and runtime. While most recent stereo video compression approaches have shown promising performance, they compress left and right views sequentially, leading to poor parallel…

    Submitted 26 March, 2024; originally announced March 2024.

    Comments: Accepted by CVPR2024

  24. arXiv:2403.17754  [pdf, other]

    cs.CG

    Optimal Euclidean Tree Covers

    Authors: Hsien-Chih Chang, Jonathan Conroy, Hung Le, Lazar Milenkovic, Shay Solomon, Cuong Than

    Abstract: A $(1+\varepsilon)\textit{-stretch tree cover}$ of a metric space is a collection of trees, where every pair of points has a $(1+\varepsilon)$-stretch path in one of the trees. The celebrated $\textit{Dumbbell Theorem}$ [Arya et~al. STOC'95] states that any set of $n$ points in $d$-dimensional Euclidean space admits a $(1+\varepsilon)$-stretch tree cover with…

    Submitted 26 March, 2024; originally announced March 2024.

  25. arXiv:2403.16732  [pdf, other]

    cs.AI

    Enabling Uncertainty Estimation in Iterative Neural Networks

    Authors: Nikita Durasov, Doruk Oner, Jonathan Donier, Hieu Le, Pascal Fua

    Abstract: Turning pass-through network architectures into iterative ones, which use their own output as input, is a well-known approach for boosting performance. In this paper, we argue that such architectures offer an additional benefit: The convergence rate of their successive outputs is highly correlated with the accuracy of the value to which they converge. Thus, we can use the convergence rate as a use…

    Submitted 30 May, 2024; v1 submitted 25 March, 2024; originally announced March 2024.

    Comments: Accepted at ICML 2024

  26. arXiv:2403.13575  [pdf, other]

    cs.CV

    Leveraging feature communication in federated learning for remote sensing image classification

    Authors: Anh-Kiet Duong, Hoàng-Ân Lê, Minh-Tan Pham

    Abstract: In the realm of Federated Learning (FL) applied to remote sensing image classification, this study introduces and assesses several innovative communication strategies. Our exploration includes feature-centric communication, pseudo-weight amalgamation, and a combined method utilizing both weights and features. Experiments conducted on two public scene classification datasets unveil the effectivenes…

    Submitted 23 May, 2024; v1 submitted 20 March, 2024; originally announced March 2024.

    Comments: 5 pages, to appear in IGARSS 2024

  27. arXiv:2403.02495  [pdf, other]

    cs.RO cs.AI

    Pseudo-Labeling and Contextual Curriculum Learning for Online Grasp Learning in Robotic Bin Picking

    Authors: Huy Le, Philipp Schillinger, Miroslav Gabriel, Alexander Qualmann, Ngo Anh Vien

    Abstract: The prevailing grasp prediction methods predominantly rely on offline learning, overlooking the dynamic grasp learning that occurs during real-time adaptation to novel picking scenarios. These scenarios may involve previously unseen objects, variations in camera perspectives, and bin configurations, among other factors. In this paper, we introduce a novel approach, SSL-ConvSAC, that combines semi-…

    Submitted 4 March, 2024; originally announced March 2024.

    Comments: Accepted to ICRA 2024

  28. arXiv:2402.13613  [pdf, other]

    cs.CL cs.LG

    Overview of the VLSP 2023 -- ComOM Shared Task: A Data Challenge for Comparative Opinion Mining from Vietnamese Product Reviews

    Authors: Hoang-Quynh Le, Duy-Cat Can, Khanh-Vinh Nguyen, Mai-Vu Tran

    Abstract: This paper presents a comprehensive overview of the Comparative Opinion Mining from Vietnamese Product Reviews shared task (ComOM), held as part of the 10$^{th}$ International Workshop on Vietnamese Language and Speech Processing (VLSP 2023). The primary objective of this shared task is to advance the field of natural language processing by developing techniques that proficiently extract comparati…

    Submitted 4 March, 2024; v1 submitted 21 February, 2024; originally announced February 2024.

    Comments: In Proceedings of VLSP 2023

  29. arXiv:2402.12179  [pdf, other]

    cs.CV cs.AI cs.CY

    Examining Monitoring System: Detecting Abnormal Behavior In Online Examinations

    Authors: Dinh An Ngo, Thanh Dat Nguyen, Thi Le Chi Dang, Huy Hoan Le, Ton Bao Ho, Vo Thanh Khang Nguyen, Truong Thanh Hung Nguyen

    Abstract: Cheating in online exams has become a prevalent issue over the past decade, especially during the COVID-19 pandemic. To address this issue of academic dishonesty, our "Exam Monitoring System: Detecting Abnormal Behavior in Online Examinations" is designed to assist proctors in identifying unusual student behavior. Our system demonstrates high accuracy and speed in detecting cheating in real-time s…

    Submitted 19 February, 2024; originally announced February 2024.

  30. arXiv:2402.12035  [pdf, other]

    cs.LG cs.AI

    Class-incremental Learning for Time Series: Benchmark and Evaluation

    Authors: Zhongzheng Qiao, Quang Pham, Zhen Cao, Hoang H Le, P. N. Suganthan, Xudong Jiang, Ramasamy Savitha

    Abstract: Real-world environments are inherently non-stationary, frequently introducing new classes over time. This is especially common in time series classification, such as the emergence of new disease classification in healthcare or the addition of new activities in human activity recognition. In such cases, a learning system is required to assimilate novel classes effectively while avoiding catastrophi…

    Submitted 19 February, 2024; originally announced February 2024.

    Comments: Currently under review for KDD 2024 (ADS track)

  31. arXiv:2402.09132  [pdf, other]

    cs.AI cs.LG

    Exploring the Adversarial Capabilities of Large Language Models

    Authors: Lukas Struppek, Minh Hieu Le, Dominik Hintersdorf, Kristian Kersting

    Abstract: The proliferation of large language models (LLMs) has sparked widespread and general interest due to their strong language generation capabilities, offering great potential for both industry and research. While previous research delved into the security and privacy issues of LLMs, the extent to which these models can exhibit adversarial behavior remains largely unexplored. Addressing this gap, we…

    Submitted 25 March, 2024; v1 submitted 14 February, 2024; originally announced February 2024.

  32. arXiv:2402.04931  [pdf, other]

    cs.DM cs.CC cs.DS math.CO

    Complexity of the (Connected) Cluster Vertex Deletion problem on $H$-free graphs

    Authors: Hoang-Oanh Le, Van Bang Le

    Abstract: The well-known Cluster Vertex Deletion problem (CVD) asks for a given graph $G$ and an integer $k$ whether it is possible to delete a set $S$ of at most $k$ vertices of $G$ such that the resulting graph $G-S$ is a cluster graph (a disjoint union of cliques). We give a complete characterization of graphs $H$ for which CVD on $H$-free graphs is polynomially solvable and for which it is NP-complete.…

    Submitted 7 February, 2024; originally announced February 2024.

    Comments: Extended version of a MFCS 2022 paper. To appear in Theory of Computing Systems

  33. arXiv:2402.03577  [pdf, other]

    cs.LG

    Revisiting the Dataset Bias Problem from a Statistical Perspective

    Authors: Kien Do, Dung Nguyen, Hung Le, Thao Le, Dang Nguyen, Haripriya Harikumar, Truyen Tran, Santu Rana, Svetha Venkatesh

    Abstract: In this paper, we study the "dataset bias" problem from a statistical standpoint, and identify the main cause of the problem as the strong correlation between a class attribute u and a non-class attribute b in the input x, represented by p(u|b) differing significantly from p(u). Since p(u|b) appears as part of the sampling distributions in the standard maximum log-likelihood (MLL) objective, a mod…

    Submitted 5 February, 2024; originally announced February 2024.

  34. arXiv:2402.02977  [pdf, other]

    cs.LG cs.AI

    Variational Flow Models: Flowing in Your Style

    Authors: Kien Do, Duc Kieu, Toan Nguyen, Dang Nguyen, Hung Le, Dung Nguyen, Thin Nguyen

    Abstract: We introduce "posterior flows" - generalizations of "probability flows" to a broader class of stochastic processes not necessarily diffusion processes - and propose a systematic training-free method to transform the posterior flow of a "linear" stochastic process characterized by the equation Xt = at * X0 + st * X1 into a straight constant-speed (SC) flow, reminiscent of Rectified Flow. This trans…

    Submitted 29 March, 2024; v1 submitted 5 February, 2024; originally announced February 2024.

  35. arXiv:2402.01955  [pdf, other]

    cs.LG cs.AI math.FA

    OPSurv: Orthogonal Polynomials Quadrature Algorithm for Survival Analysis

    Authors: Lilian W. Bialokozowicz, Hoang M. Le, Tristan Sylvain, Peter A. I. Forsyth, Vineel Nagisetty, Greg Mori

    Abstract: This paper introduces the Orthogonal Polynomials Quadrature Algorithm for Survival Analysis (OPSurv), a new method providing time-continuous functional outputs for both single and competing risks scenarios in survival analysis. OPSurv utilizes the initial zero condition of the Cumulative Incidence function and a unique decomposition of probability densities using orthogonal polynomials, allowing i…

    Submitted 2 February, 2024; originally announced February 2024.

    MSC Class: 68W25 (Primary); 65Z05 (Secondary) ACM Class: I.2.0; J.3

  36. arXiv:2401.13898  [pdf, other]

    cs.LG

    Cross-Modal Prototype based Multimodal Federated Learning under Severely Missing Modality

    Authors: Huy Q. Le, Chu Myaet Thwal, Yu Qiao, Ye Lin Tun, Minh N. H. Nguyen, Choong Seon Hong

    Abstract: Multimodal federated learning (MFL) has emerged as a decentralized machine learning paradigm, allowing multiple clients with different modalities to collaborate on training a machine learning model across diverse data sources without sharing their private data. However, challenges, such as data heterogeneity and severely missing modalities, pose crucial hindrances to the robustness of MFL, signifi…

    Submitted 24 January, 2024; originally announced January 2024.

    Comments: 12 pages, 8 figures, 5 tables

  37. arXiv:2401.12881  [pdf, other]

    cs.DS cs.CG

    Computing Diameter+2 in Truly Subquadratic Time for Unit-Disk Graphs

    Authors: Hsien-Chih Chang, Jie Gao, Hung Le

    Abstract: Finding the diameter of a graph in general cannot be done in truly subquadratic time assuming the Strong Exponential Time Hypothesis (SETH), even when the underlying graph is unweighted and sparse. When restricting to concrete classes of graphs and assuming SETH, planar graphs and minor-free graphs admit truly subquadratic algorithms, while geometric intersection graphs of unit balls, congruent equilat…

    Submitted 23 January, 2024; originally announced January 2024.

    Comments: 28 pages, 7 figures

  38. arXiv:2401.11105  [pdf, other]

    cs.SE cs.CR cs.LG

    Are Latent Vulnerabilities Hidden Gems for Software Vulnerability Prediction? An Empirical Study

    Authors: Triet H. M. Le, Xiaoning Du, M. Ali Babar

    Abstract: Collecting relevant and high-quality data is integral to the development of effective Software Vulnerability (SV) prediction models. Most of the current SV datasets rely on SV-fixing commits to extract vulnerable functions and lines. However, none of these datasets have considered latent SVs existing between the introduction and fix of the collected SVs. There is also little known about the useful…

    Submitted 19 January, 2024; originally announced January 2024.

    Comments: Accepted as a full paper in the technical track at the 21st International Conference on Mining Software Repositories (MSR) 2024

  39. arXiv:2401.03754  [pdf, other]

    cs.IT eess.SP

    Joint Power Allocation and User Scheduling in Integrated Satellite-Terrestrial Cell-Free Massive MIMO IoT Systems

    Authors: Trinh Van Chien, Ha An Le, Ta Hai Tung, Hien Quoc Ngo, Symeon Chatzinotas

    Abstract: Both space and ground communications have been proven effective solutions under different perspectives in Internet of Things (IoT) networks. This paper investigates multiple-access scenarios, where plenty of IoT users are cooperatively served by a satellite in space and access points (APs) on the ground. Available users in each coherence interval are split into scheduled and unscheduled subsets to…

    Submitted 8 January, 2024; originally announced January 2024.

    Comments: 15 pages, 10 figures, 1 table. Submitted for publication

  40. arXiv:2401.01827  [pdf, other]

    cs.CV

    Moonshot: Towards Controllable Video Generation and Editing with Multimodal Conditions

    Authors: David Junhao Zhang, Dongxu Li, Hung Le, Mike Zheng Shou, Caiming Xiong, Doyen Sahoo

    Abstract: Most existing video diffusion models (VDMs) are limited to mere text conditions. Thereby, they are usually lacking in control over visual appearance and geometry structure of the generated videos. This work presents Moonshot, a new video generation model that conditions simultaneously on multimodal inputs of image and text. The model builds upon a core module, called multimodal video block (MVB),…

    Submitted 3 January, 2024; originally announced January 2024.

    Comments: project page: https://showlab.github.io/Moonshot/

  41. arXiv:2401.01108  [pdf, other]

    cs.CL

    Unveiling Comparative Sentiments in Vietnamese Product Reviews: A Sequential Classification Framework

    Authors: Ha Le, Bao Tran, Phuong Le, Tan Nguyen, Dac Nguyen, Ngoan Pham, Dang Huynh

    Abstract: Comparative opinion mining is a specialized field of sentiment analysis that aims to identify and extract sentiments expressed comparatively. To address this task, we propose an approach that consists of solving three sequential sub-tasks: (i) identifying comparative sentence, i.e., if a sentence has a comparative meaning, (ii) extracting comparative elements, i.e., what are comparison subjects, o…

    Submitted 2 January, 2024; originally announced January 2024.

    Comments: Accepted manuscript at VLSP 2023

  42. WAVER: Writing-style Agnostic Text-Video Retrieval via Distilling Vision-Language Models Through Open-Vocabulary Knowledge

    Authors: Huy Le, Tung Kieu, Anh Nguyen, Ngan Le

    Abstract: Text-video retrieval, a prominent sub-field within the domain of multimodal information retrieval, has witnessed remarkable growth in recent years. However, existing methods assume video scenes are consistent with unbiased descriptions. These limitations fail to align with real-world scenarios since descriptions can be influenced by annotator biases, diverse writing styles, and varying textual per…

    Submitted 10 January, 2024; v1 submitted 14 December, 2023; originally announced December 2023.

    Comments: Accepted to ICASSP 2024

  43. arXiv:2312.07740  [pdf, other]

    cs.CV

    HAtt-Flow: Hierarchical Attention-Flow Mechanism for Group Activity Scene Graph Generation in Videos

    Authors: Naga VS Raviteja Chappa, Pha Nguyen, Thi Hoang Ngan Le, Khoa Luu

    Abstract: Group Activity Scene Graph (GASG) generation is a challenging task in computer vision, aiming to anticipate and describe relationships between subjects and objects in video sequences. Traditional Video Scene Graph Generation (VidSGG) methods focus on retrospective analysis, limiting their predictive capabilities. To enrich the scene understanding capabilities, we introduced a GASG dataset extendin…

    Submitted 28 November, 2023; originally announced December 2023.

    Comments: 11 pages, 5 figures, 6 tables

  44. arXiv:2312.00921  [pdf, ps, other]

    eess.IV cs.IT

    Bitstream Organization for Parallel Entropy Coding on Neural Network-based Video Codecs

    Authors: Amir Said, Hoang Le, Farzad Farhadzadeh

    Abstract: Video compression systems must support increasing bandwidth and data throughput at low cost and power, and can be limited by entropy coding bottlenecks. Efficiency can be greatly improved by parallelizing coding, which can be done at much larger scales with new neural-based codecs, but with some compression loss related to data organization. We analyze the bit rate overhead needed to support multi…

    Submitted 1 December, 2023; originally announced December 2023.

    Journal ref: Proc. IEEE International Conference on Multimedia, Dec. 2023

  45. arXiv:2312.00398  [pdf, other]

    cs.CV

    Learning to Estimate Critical Gait Parameters from Single-View RGB Videos with Transformer-Based Attention Network

    Authors: Quoc Hung T. Le, Hieu H. Pham

    Abstract: Musculoskeletal diseases and cognitive impairments in patients lead to difficulties in movement as well as negative effects on their psychological health. Clinical gait analysis, a vital tool for early diagnosis and treatment, traditionally relies on expensive optical motion capture systems. Recent advances in computer vision and deep learning have opened the door to more accessible and cost-effec…

    Submitted 1 March, 2024; v1 submitted 1 December, 2023; originally announced December 2023.

    Comments: Accepted at ISBI 2024 (21st IEEE International Symposium on Biomedical Imaging)

  46. arXiv:2311.15525  [pdf, other]

    cs.CL

    Overview of the VLSP 2022 -- Abmusu Shared Task: A Data Challenge for Vietnamese Abstractive Multi-document Summarization

    Authors: Mai-Vu Tran, Hoang-Quynh Le, Duy-Cat Can, Quoc-An Nguyen

    Abstract: This paper reports the overview of the VLSP 2022 - Vietnamese abstractive multi-document summarization (Abmusu) shared task for Vietnamese News. This task is hosted at the 9$^{th}$ annual workshop on Vietnamese Language and Speech Processing (VLSP 2022). The goal of the Abmusu shared task is to develop summarization systems that could create abstractive summaries automatically for a set of documents o…

    Submitted 26 November, 2023; originally announced November 2023.

    Comments: VLSP 2022

  47. arXiv:2311.11096  [pdf, other]

    eess.IV cs.CV

    On the Out of Distribution Robustness of Foundation Models in Medical Image Segmentation

    Authors: Duy Minh Ho Nguyen, Tan Ngoc Pham, Nghiem Tuong Diep, Nghi Quoc Phan, Quang Pham, Vinh Tong, Binh T. Nguyen, Ngan Hoang Le, Nhat Ho, Pengtao Xie, Daniel Sonntag, Mathias Niepert

    Abstract: Constructing a robust model that can effectively generalize to test samples under distribution shifts remains a significant challenge in the field of medical imaging. The foundational models for vision and language, pre-trained on extensive sets of natural image and text data, have emerged as a promising approach. They showcase impressive learning abilities across different tasks with the need for…

    Submitted 18 November, 2023; originally announced November 2023.

    Comments: Advances in Neural Information Processing Systems (NeurIPS) 2023, Workshop on robustness of zero/few-shot learning in foundation models

  48. arXiv:2311.09854  [pdf, other]

    cs.LG cs.AI math.NA

    SurvTimeSurvival: Survival Analysis On The Patient With Multiple Visits/Records

    Authors: Hung Le, Ong Eng-Jon, Bober Miroslaw

    Abstract: The accurate prediction of survival times for patients with severe diseases remains a critical challenge despite recent advances in artificial intelligence. This study introduces "SurvTimeSurvival: Survival Analysis On Patients With Multiple Visits/Records", utilizing the Transformer model to not only handle the complexities of time-varying covariates but also covariates data. We also tackle the d…

    Submitted 16 November, 2023; originally announced November 2023.

    Comments: Accepted as Findings Track in Machine Learning For Health (ML4H) 2023

  49. arXiv:2311.06956  [pdf, other]

    cs.CV

    SegReg: Segmenting OARs by Registering MR Images and CT Annotations

    Authors: Zeyu Zhang, Xuyin Qi, Bowen Zhang, Biao Wu, Hien Le, Bora Jeong, Zhibin Liao, Yunxiang Liu, Johan Verjans, Minh-Son To, Richard Hartley

    Abstract: Organ at risk (OAR) segmentation is a critical process in radiotherapy treatment planning such as head and neck tumors. Nevertheless, in clinical practice, radiation oncologists predominantly perform OAR segmentations manually on CT scans. This manual process is highly time-consuming and expensive, limiting the number of patients who can receive timely radiotherapy. Additionally, CT scans offer lo…

    Submitted 1 March, 2024; v1 submitted 12 November, 2023; originally announced November 2023.

    Comments: Accepted to ISBI 2024

  50. arXiv:2311.04040  [pdf, other]

    cs.CV

    Data exploitation: multi-task learning of object detection and semantic segmentation on partially annotated data

    Authors: Hoàng-Ân Lê, Minh-Tan Pham

    Abstract: Multi-task partially annotated data where each data point is annotated for only a single task are potentially helpful for data scarcity if a network can leverage the inter-task relationship. In this paper, we study the joint learning of object detection and semantic segmentation, the two most popular vision problems, from multi-task data with partial annotations. Extensive experiments are performe…

    Submitted 7 November, 2023; originally announced November 2023.

    Comments: Accepted for publishing at BMVC 2023