Showing 1–50 of 175 results for author: Tran, M

  1. arXiv:2406.19871 [pdf, other]

    cs.LG cs.NI eess.SY

    Koopman based trajectory model and computation offloading for high mobility paradigm in ISAC enabled IoT system

    Authors: Minh-Tuan Tran

    Abstract: User experience on mobile devices is constrained by limited battery capacity and processing power, but 6G technology advancements are diving rapidly into mobile technical evolution. Mobile edge computing (MEC) offers a solution, offloading computationally intensive tasks to edge cloud servers, reducing battery drain compared to local processing. The upcoming integrated sensing and communication in…

    Submitted 28 June, 2024; originally announced June 2024.

    MSC Class: 52-08 ACM Class: C.2

  2. arXiv:2406.14819 [pdf, other]

    cs.CV

    SAM-EG: Segment Anything Model with Egde Guidance framework for efficient Polyp Segmentation

    Authors: Quoc-Huy Trinh, Hai-Dang Nguyen, Bao-Tram Nguyen Ngoc, Debesh Jha, Ulas Bagci, Minh-Triet Tran

    Abstract: Polyp segmentation, a critical concern in medical imaging, has prompted numerous proposed methods aimed at enhancing the quality of segmented masks. While current state-of-the-art techniques produce impressive results, the size and computational cost of these models pose challenges for practical industry applications. Recently, the Segment Anything Model (SAM) has been proposed as a robust foundat…

    Submitted 20 June, 2024; originally announced June 2024.

  3. What is in the Chrome Web Store? Investigating Security-Noteworthy Browser Extensions

    Authors: Sheryl Hsu, Manda Tran, Aurore Fass

    Abstract: This paper is the first attempt at providing a holistic view of the Chrome Web Store (CWS). We leverage historical data provided by ChromeStats to study global trends in the CWS and security implications. We first highlight the extremely short life cycles of extensions: roughly 60% of extensions stay in the CWS for one year. Second, we define and show that Security-Noteworthy Extensions (SNE) are…

    Submitted 18 June, 2024; originally announced June 2024.

    Comments: Published in ACM AsiaCCS 2024

    Journal ref: ACM AsiaCCS 2024

  4. arXiv:2406.11146 [pdf, other]

    cs.HC

    Designing Interactions with Autonomous Physical Systems

    Authors: Marius Hoggenmueller, Tram Thi Minh Tran, Luke Hespanhol, Martin Tomitsch

    Abstract: In this position paper, we present a collection of four different prototyping approaches which we have developed and applied to prototype and evaluate interfaces for and interactions around autonomous physical systems. Further, we provide a classification of our approaches aiming to support other researchers and designers in choosing appropriate prototyping platforms and representations.

    Submitted 16 June, 2024; originally announced June 2024.

  5. arXiv:2406.09837 [pdf, other]

    cs.LG

    TabularFM: An Open Framework For Tabular Foundational Models

    Authors: Quan M. Tran, Suong N. Hoang, Lam M. Nguyen, Dzung Phan, Hoang Thanh Lam

    Abstract: Foundational models (FMs), pretrained on extensive datasets using self-supervised techniques, are capable of learning generalized patterns from large amounts of data. This reduces the need for extensive labeled datasets for each new task, saving both time and resources by leveraging the broad knowledge base established during pretraining. Most research on FMs has primarily focused on unstructured…

    Submitted 17 June, 2024; v1 submitted 14 June, 2024; originally announced June 2024.

  6. Context-Based Interface Prototyping: Understanding the Effect of Prototype Representation on User Feedback

    Authors: Marius Hoggenmueller, Martin Tomitsch, Luke Hespanhol, Tram Thi Minh Tran, Stewart Worrall, Eduardo Nebot

    Abstract: The rise of autonomous systems in cities, such as automated vehicles (AVs), requires new approaches for prototyping and evaluating how people interact with those systems through context-based user interfaces, such as external human-machine interfaces (eHMIs). In this paper, we present a comparative study of three prototype representations (real-world VR, computer-generated VR, real-world video) of…

    Submitted 12 June, 2024; originally announced June 2024.

  7. arXiv:2406.00307 [pdf, other]

    cs.CV

    HENASY: Learning to Assemble Scene-Entities for Egocentric Video-Language Model

    Authors: Khoa Vo, Thinh Phan, Kashu Yamazaki, Minh Tran, Ngan Le

    Abstract: Current video-language models (VLMs) rely extensively on instance-level alignment between video and language modalities, which presents two major limitations: (1) visual reasoning disobeys the natural perception that humans do in first-person perspective, leading to a lack of reasoning interpretation; and (2) learning is limited in capturing inherent fine-grained relationships between two modaliti…

    Submitted 6 June, 2024; v1 submitted 1 June, 2024; originally announced June 2024.

    Comments: under submission

  8. arXiv:2405.17926 [pdf, other]

    cs.CV

    SarcNet: A Novel AI-based Framework to Automatically Analyze and Score Sarcomere Organizations in Fluorescently Tagged hiPSC-CMs

    Authors: Huyen Le, Khiet Dang, Tien Lai, Nhung Nguyen, Mai Tran, Hieu Pham

    Abstract: Quantifying sarcomere structure organization in human-induced pluripotent stem cell-derived cardiomyocytes (hiPSC-CMs) is crucial for understanding cardiac disease pathology, improving drug screening, and advancing regenerative medicine. Traditional methods, such as manual annotation and Fourier transform analysis, are labor-intensive, error-prone, and lack high-throughput capabilities. In this st…

    Submitted 28 May, 2024; originally announced May 2024.

  9. arXiv:2405.14608 [pdf, other]

    cs.LG cs.AI

    ShapeFormer: Shapelet Transformer for Multivariate Time Series Classification

    Authors: Xuan-May Le, Ling Luo, Uwe Aickelin, Minh-Tuan Tran

    Abstract: Multivariate time series classification (MTSC) has attracted significant research attention due to its diverse real-world applications. Recently, exploiting transformers for MTSC has achieved state-of-the-art performance. However, existing methods focus on generic features, providing a comprehensive understanding of data, but they ignore class-specific features crucial for learning the representat…

    Submitted 23 May, 2024; originally announced May 2024.

    Comments: Accepted at KDD 2024

  10. arXiv:2405.04489 [pdf, other]

    cs.CV

    S3Former: Self-supervised High-resolution Transformer for Solar PV Profiling

    Authors: Minh Tran, Adrian De Luis, Haitao Liao, Ying Huang, Roy McCann, Alan Mantooth, Jack Cothren, Ngan Le

    Abstract: As the impact of climate change escalates, the global necessity to transition to sustainable energy sources becomes increasingly evident. Renewable energies have emerged as a viable solution for users, with Photovoltaic energy being a favored choice for small installations due to its reliability and efficiency. Accurate mapping of PV installations is crucial for understanding the extension of its…

    Submitted 7 May, 2024; originally announced May 2024.

    Comments: Preprint

  11. arXiv:2404.18705 [pdf, other]

    cs.IT eess.SP

    Wireless Information and Energy Transfer in the Era of 6G Communications

    Authors: Constantinos Psomas, Konstantinos Ntougias, Nikita Shanin, Dongfang Xu, Kenneth MacSporran Mayer, Nguyen Minh Tran, Laura Cottatellucci, Kae Won Choi, Dong In Kim, Robert Schober, Ioannis Krikidis

    Abstract: Wireless information and energy transfer (WIET) represents an emerging paradigm which employs controllable transmission of radio-frequency signals for the dual purpose of data communication and wireless charging. As such, WIET is widely regarded as an enabler of envisioned 6G use cases that rely on energy-sustainable Internet-of-Things (IoT) networks, such as smart cities and smart grids. Meeting…

    Submitted 16 May, 2024; v1 submitted 29 April, 2024; originally announced April 2024.

    Comments: Proceedings of the IEEE, 36 pages, 33 figures

  12. arXiv:2404.11429 [pdf, other]

    cs.CV

    CarcassFormer: An End-to-end Transformer-based Framework for Simultaneous Localization, Segmentation and Classification of Poultry Carcass Defect

    Authors: Minh Tran, Sang Truong, Arthur F. A. Fernandes, Michael T. Kidd, Ngan Le

    Abstract: In the food industry, assessing the quality of poultry carcasses during processing is a crucial step. This study proposes an effective approach for automating the assessment of carcass quality without requiring skilled labor or inspector involvement. The proposed system is based on machine learning (ML) and computer vision (CV) techniques, enabling automated defect detection and carcass quality as…

    Submitted 17 April, 2024; originally announced April 2024.

    Comments: Accepted to Poultry Science Journal

  13. arXiv:2404.08590 [pdf, other]

    cs.CV cs.AI

    Improving Referring Image Segmentation using Vision-Aware Text Features

    Authors: Hai Nguyen-Truong, E-Ro Nguyen, Tuan-Anh Vu, Minh-Triet Tran, Binh-Son Hua, Sai-Kit Yeung

    Abstract: Referring image segmentation is a challenging task that involves generating pixel-wise segmentation masks based on natural language descriptions. Existing methods have relied mostly on visual features to generate the segmentation masks while treating text features as supporting components. This over-reliance on visual features can lead to suboptimal results, especially in complex scenarios where t…

    Submitted 12 April, 2024; originally announced April 2024.

    Comments: 30 pages including supplementary

  14. arXiv:2404.04564 [pdf, other]

    cs.CV cs.AI

    Enhancing Video Summarization with Context Awareness

    Authors: Hai-Dang Huynh-Lam, Ngoc-Phuong Ho-Thi, Minh-Triet Tran, Trung-Nghia Le

    Abstract: Video summarization is a crucial research area that aims to efficiently browse and retrieve relevant information from the vast amount of video content available today. With the exponential growth of multimedia data, the ability to extract meaningful representations from videos has become essential. Video summarization techniques automatically generate concise summaries by selecting keyframes, shot…

    Submitted 6 April, 2024; originally announced April 2024.

    Comments: 115 pages, 1 supplementary paper, undergraduate thesis report at US-VNUHCM

  15. Cluster-based Video Summarization with Temporal Context Awareness

    Authors: Hai-Dang Huynh-Lam, Ngoc-Phuong Ho-Thi, Minh-Triet Tran, Trung-Nghia Le

    Abstract: In this paper, we present TAC-SUM, a novel and efficient training-free approach for video summarization that addresses the limitations of existing cluster-based models by incorporating temporal context. Our method partitions the input video into temporally consecutive segments with clustering information, enabling the injection of temporal awareness into the clustering process, setting it apart fr…

    Submitted 6 April, 2024; originally announced April 2024.

    Comments: 14 pages, 6 figures, accepted in PSIVT 2023

  16. Ensemble Learning for Vietnamese Scene Text Spotting in Urban Environments

    Authors: Hieu Nguyen, Cong-Hoang Ta, Phuong-Thuy Le-Nguyen, Minh-Triet Tran, Trung-Nghia Le

    Abstract: This paper presents a simple yet efficient ensemble learning framework for Vietnamese scene text spotting. Leveraging the power of ensemble learning, which combines multiple models to yield more accurate predictions, our approach aims to significantly enhance the performance of scene text spotting in challenging urban settings. Through experimental evaluations on the VinText dataset, our proposed…

    Submitted 31 March, 2024; originally announced April 2024.

    Comments: RIVF 2023

    Journal ref: In 2023 RIVF International Conference on Computing and Communication Technologies (RIVF) (pp. 177-182). IEEE

  17. Exploring Holistic HMI Design for Automated Vehicles: Insights from a Participatory Workshop to Bridge In-Vehicle and External Communication

    Authors: Haoyu Dong, Tram Thi Minh Tran, Rutger Verstegen, Silvia Cazacu, Ruolin Gao, Marius Hoggenmüller, Debargha Dey, Mervyn Franssen, Markus Sasalovici, Pavlo Bazilinskyy, Marieke Martens

    Abstract: Human-Machine Interfaces (HMIs) for automated vehicles (AVs) are typically divided into two categories: internal HMIs for interactions within the vehicle, and external HMIs for communication with other road users. In this work, we examine the prospects of bridging these two seemingly distinct domains. Through a participatory workshop with automotive user interface researchers and practitioners, we…

    Submitted 28 March, 2024; originally announced March 2024.

  18. arXiv:2403.14101 [pdf, other]

    cs.CV cs.CL cs.LG

    Text-Enhanced Data-free Approach for Federated Class-Incremental Learning

    Authors: Minh-Tuan Tran, Trung Le, Xuan-May Le, Mehrtash Harandi, Dinh Phung

    Abstract: Federated Class-Incremental Learning (FCIL) is an underexplored yet pivotal issue, involving the dynamic addition of new classes in the context of federated learning. In this field, Data-Free Knowledge Transfer (DFKT) plays a crucial role in addressing catastrophic forgetting and data privacy problems. However, prior approaches lack the crucial synergy between DFKT and the model training phases, c…

    Submitted 20 March, 2024; originally announced March 2024.

    Comments: Accepted at CVPR 2024

  19. Holistic HMI Design for Automated Vehicles: Bridging In-Vehicle and External Communication

    Authors: Haoyu Dong, Tram Thi Minh Tran, Pavlo Bazilinskyy, Marius Hoggenmüller, Debargha Dey, Silvia Cazacu, Mervyn Franssen, Ruolin Gao

    Abstract: As the field of automated vehicles (AVs) advances, it has become increasingly critical to develop human-machine interfaces (HMI) for both internal and external communication. Critical dialogue is emerging around the potential necessity for a holistic approach to HMI designs, which promotes the integration of both in-vehicle user and external road user perspectives. This approach aims to create a u…

    Submitted 17 March, 2024; originally announced March 2024.

  20. A Review of Virtual Reality Studies on Autonomous Vehicle--Pedestrian Interaction

    Authors: Tram Thi Minh Tran, Callum Parker, Martin Tomitsch

    Abstract: An increasing number of studies employ virtual reality (VR) to evaluate interactions between autonomous vehicles (AVs) and pedestrians. VR simulators are valued for their cost-effectiveness, flexibility in developing various traffic scenarios, safe conduct of user studies, and acceptable ecological validity. Reviewing the literature between 2010 and 2020, we found 31 empirical studies using VR as…

    Submitted 17 March, 2024; originally announced March 2024.

  21. Simulating Wearable Urban Augmented Reality Experiences in VR: Lessons Learnt from Designing Two Future Urban Interfaces

    Authors: Tram Thi Minh Tran, Callum Parker, Marius Hoggenmüller, Luke Hespanhol, Martin Tomitsch

    Abstract: Augmented reality (AR) has the potential to fundamentally change how people engage with increasingly interactive urban environments. However, many challenges exist in designing and evaluating these new urban AR experiences, such as technical constraints and safety concerns associated with outdoor AR. We contribute to this domain by assessing the use of virtual reality (VR) for simulating wearable…

    Submitted 17 March, 2024; originally announced March 2024.

  22. arXiv:2403.11376 [pdf, other]

    cs.CV

    ShapeFormer: Shape Prior Visible-to-Amodal Transformer-based Amodal Instance Segmentation

    Authors: Minh Tran, Winston Bounsavy, Khoa Vo, Anh Nguyen, Tri Nguyen, Ngan Le

    Abstract: Amodal Instance Segmentation (AIS) presents a challenging task as it involves predicting both visible and occluded parts of objects within images. Existing AIS methods rely on a bidirectional approach, encompassing both the transition from amodal features to visible features (amodal-to-visible) and from visible features to amodal features (visible-to-amodal). Our observation shows that the utiliza…

    Submitted 17 April, 2024; v1 submitted 17 March, 2024; originally announced March 2024.

    Comments: Accepted to IJCNN2024

  23. arXiv:2403.09069 [pdf, other]

    cs.CV

    Dyadic Interaction Modeling for Social Behavior Generation

    Authors: Minh Tran, Di Chang, Maksim Siniukov, Mohammad Soleymani

    Abstract: Human-human communication is like a delicate dance where listeners and speakers concurrently interact to maintain conversational dynamics. Hence, an effective model for generating listener nonverbal behaviors requires understanding the dyadic context and interaction. In this paper, we present an effective framework for creating 3D facial motions in dyadic interactions. Existing work consider a lis…

    Submitted 26 March, 2024; v1 submitted 13 March, 2024; originally announced March 2024.

  24. arXiv:2403.08876 [pdf, other]

    cs.CV

    ARtVista: Gateway To Empower Anyone Into Artist

    Authors: Trong-Vu Hoang, Quang-Binh Nguyen, Duy-Nam Ly, Khanh-Duy Le, Tam V. Nguyen, Minh-Triet Tran, Trung-Nghia Le

    Abstract: Drawing is an art that enables people to express their imagination and emotions. However, individuals usually face challenges in drawing, especially when translating conceptual ideas into visually coherent representations and bridging the gap between mental visualization and practical execution. In response, we propose ARtVista - a novel system integrating AR and generative AI technologies. ARtVis…

    Submitted 13 March, 2024; originally announced March 2024.

    Comments: CHI 2024

  25. arXiv:2403.08746 [pdf, other]

    cs.CV

    iCONTRA: Toward Thematic Collection Design Via Interactive Concept Transfer

    Authors: Dinh-Khoi Vo, Duy-Nam Ly, Khanh-Duy Le, Tam V. Nguyen, Minh-Triet Tran, Trung-Nghia Le

    Abstract: Creating thematic collections in industries demands innovative designs and cohesive concepts. Designers may face challenges in maintaining thematic consistency when drawing inspiration from existing objects, landscapes, or artifacts. While AI-powered graphic design tools offer help, they often fail to generate cohesive sets based on specific thematic concepts. In response, we introduce iCONTRA, an…

    Submitted 13 March, 2024; originally announced March 2024.

    Comments: CHI 2024

  26. Designing Wearable Augmented Reality Concepts to Support Scalability in Autonomous Vehicle-Pedestrian Interaction

    Authors: Tram Thi Minh Tran, Callum Parker, Yiyuan Wang, Martin Tomitsch

    Abstract: Wearable augmented reality (AR) offers new ways for supporting the interaction between autonomous vehicles (AVs) and pedestrians due to its ability to integrate timely and contextually relevant data into the user's field of view. This article presents novel wearable AR concepts that assist crossing pedestrians in multi-vehicle scenarios where several AVs frequent the road from both directions. Thr…

    Submitted 8 March, 2024; originally announced March 2024.

  27. arXiv:2403.05727 [pdf, other]

    cs.HC

    Scoping Out the Scalability Issues of Autonomous Vehicle-Pedestrian Interaction

    Authors: Tram Thi Minh Tran, Callum Parker, Martin Tomitsch

    Abstract: Autonomous vehicles (AVs) may use external interfaces, such as LED light bands, to communicate with pedestrians safely and intuitively. While previous research has demonstrated the effectiveness of these interfaces in simple traffic scenarios involving one pedestrian and one vehicle, their performance in more complex scenarios with multiple road users remains unclear. The scalability of AV externa…

    Submitted 8 March, 2024; originally announced March 2024.

  28. Exploring the Impact of Interconnected External Interfaces in Autonomous Vehicles on Pedestrian Safety and Experience

    Authors: Tram Thi Minh Tran, Callum Parker, Marius Hoggenmuller, Yiyuan Wang, Martin Tomitsch

    Abstract: Policymakers advocate for the use of external Human-Machine Interfaces (eHMIs) to allow autonomous vehicles (AVs) to communicate their intentions or status. Nonetheless, scalability concerns in complex traffic scenarios arise, such as potentially increasing pedestrian cognitive load or conveying contradictory signals. Building upon precursory works, our study explores 'interconnected eHMIs,' where…

    Submitted 17 March, 2024; v1 submitted 8 March, 2024; originally announced March 2024.

  29. arXiv:2402.13613 [pdf, other]

    cs.CL cs.LG

    Overview of the VLSP 2023 -- ComOM Shared Task: A Data Challenge for Comparative Opinion Mining from Vietnamese Product Reviews

    Authors: Hoang-Quynh Le, Duy-Cat Can, Khanh-Vinh Nguyen, Mai-Vu Tran

    Abstract: This paper presents a comprehensive overview of the Comparative Opinion Mining from Vietnamese Product Reviews shared task (ComOM), held as part of the 10$^{th}$ International Workshop on Vietnamese Language and Speech Processing (VLSP 2023). The primary objective of this shared task is to advance the field of natural language processing by developing techniques that proficiently extract comparati…

    Submitted 4 March, 2024; v1 submitted 21 February, 2024; originally announced February 2024.

    Comments: In Proceedings of VLSP 2023

  30. arXiv:2401.16500 [pdf, other]

    cs.ET physics.ins-det

    Error detection using pneumatic logic

    Authors: Shane Hoang, Mabel Shehada, Zinal Patel, Minh-Huy Tran, Konstantinos Karydis, Philip Brisk, William H. Grover

    Abstract: Pneumatic systems are common in manufacturing, healthcare, transportation, robotics, and many other fields. Failures in these systems can have very serious consequences, particularly if they go undetected. In this work, we present an air-powered error detector device that can detect and respond to failures in pneumatically actuated systems. The device contains 21 monolithic membrane valves that ac…

    Submitted 29 January, 2024; originally announced January 2024.

    Comments: 23 pages, 5 figures

  31. arXiv:2401.08868 [pdf, other]

    cs.CV

    B-Cos Aligned Transformers Learn Human-Interpretable Features

    Authors: Manuel Tran, Amal Lahiani, Yashin Dicente Cid, Melanie Boxberg, Peter Lienemann, Christian Matek, Sophia J. Wagner, Fabian J. Theis, Eldad Klaiman, Tingying Peng

    Abstract: Vision Transformers (ViTs) and Swin Transformers (Swin) are currently state-of-the-art in computational pathology. However, domain experts are still reluctant to use these models due to their lack of interpretability. This is not surprising, as critical decisions need to be transparent and understandable. The most common approach to understanding transformers is to visualize their attention. Howev…

    Submitted 18 January, 2024; v1 submitted 16 January, 2024; originally announced January 2024.

    Comments: Accepted at MICCAI 2023 (oral). Camera-ready available at https://doi.org/10.1007/978-3-031-43993-3_50

  32. arXiv:2312.12746 [pdf, other]

    cs.CL cs.CY

    ChatFDA: Medical Records Risk Assessment

    Authors: M Tran, C Sun

    Abstract: In healthcare, the emphasis on patient safety and the minimization of medical errors cannot be overstated. Despite concerted efforts, many healthcare systems, especially in low-resource regions, still grapple with preventing these errors effectively. This study explores a pioneering application aimed at addressing this challenge by assisting caregivers in gauging potential risks derived from medic…

    Submitted 19 December, 2023; originally announced December 2023.

  33. arXiv:2312.10187 [pdf, other]

    eess.SP cs.LG

    TSRNet: Simple Framework for Real-time ECG Anomaly Detection with Multimodal Time and Spectrogram Restoration Network

    Authors: Nhat-Tan Bui, Dinh-Hieu Hoang, Thinh Phan, Minh-Triet Tran, Brijesh Patel, Donald Adjeroh, Ngan Le

    Abstract: The electrocardiogram (ECG) is a valuable signal used to assess various aspects of heart health, such as heart rate and rhythm. It plays a crucial role in identifying cardiac conditions and detecting anomalies in ECG data. However, distinguishing between normal and abnormal ECG signals can be a challenging task. In this paper, we propose an approach that leverages anomaly detection to identify unh…

    Submitted 5 March, 2024; v1 submitted 15 December, 2023; originally announced December 2023.

    Comments: Accepted at ISBI 2024

  34. arXiv:2312.10179 [pdf, other]

    cs.LG

    3FM: Multi-modal Meta-learning for Federated Tasks

    Authors: Minh Tran, Roochi Shah, Zejun Gong

    Abstract: We present a novel approach in the domain of federated learning (FL), particularly focusing on addressing the challenges posed by modality heterogeneity, variability in modality availability across clients, and the prevalent issue of missing data. We introduce a meta-learning framework specifically designed for multimodal federated tasks. Our approach is motivated by the need to enable federated m…

    Submitted 15 December, 2023; originally announced December 2023.

  35. arXiv:2312.07489 [pdf, other]

    cs.CV

    NearbyPatchCL: Leveraging Nearby Patches for Self-Supervised Patch-Level Multi-Class Classification in Whole-Slide Images

    Authors: Gia-Bao Le, Van-Tien Nguyen, Trung-Nghia Le, Minh-Triet Tran

    Abstract: Whole-slide image (WSI) analysis plays a crucial role in cancer diagnosis and treatment. In addressing the demands of this critical task, self-supervised learning (SSL) methods have emerged as a valuable resource, leveraging their efficiency in circumventing the need for a large number of annotations, which can be both costly and time-consuming to deploy supervised methods. Nevertheless, patch-wis…

    Submitted 12 December, 2023; originally announced December 2023.

    Comments: MMM 2024

  36. arXiv:2312.05848 [pdf]

    cs.MM

    Super-rays grouping scheme and novel coding architecture for computational time reduction of graph-based Light Field coding

    Authors: Bach Nguyen Gia, Chanh Minh Tran, Tho Nguyen Duc, Tan Phan Xuan, Eiji Kamioka

    Abstract: Graph-based Light Field coding using the concept of super-rays is powerful to exploit signal redundancy along irregular shapes and achieves good energy compaction, compared to rectangular block-based approaches. However, its main limitation lies in the high time complexity for eigen-decomposition of each super-ray local graph, a high number of which can be found in a Light Field when segmented in…

    Submitted 10 December, 2023; originally announced December 2023.

  37. arXiv:2312.05634 [pdf, other]

    cs.CV

    PGDS: Pose-Guidance Deep Supervision for Mitigating Clothes-Changing in Person Re-Identification

    Authors: Quoc-Huy Trinh, Nhat-Tan Bui, Dinh-Hieu Hoang, Phuoc-Thao Vo Thi, Hai-Dang Nguyen, Debesh Jha, Ulas Bagci, Ngan Le, Minh-Triet Tran

    Abstract: Person Re-Identification (Re-ID) task seeks to enhance the tracking of multiple individuals by surveillance cameras. It supports multimodal tasks, including text-based person retrieval and human matching. One of the most significant challenges faced in Re-ID is clothes-changing, where the same person may appear in different outfits. While previous methods have made notable progress in maintaining…

    Submitted 1 June, 2024; v1 submitted 9 December, 2023; originally announced December 2023.

    Comments: Accepted at AVSS 2024

  38. arXiv:2311.15525 [pdf, other]

    cs.CL

    Overview of the VLSP 2022 -- Abmusu Shared Task: A Data Challenge for Vietnamese Abstractive Multi-document Summarization

    Authors: Mai-Vu Tran, Hoang-Quynh Le, Duy-Cat Can, Quoc-An Nguyen

    Abstract: This paper reports the overview of the VLSP 2022 - Vietnamese abstractive multi-document summarization (Abmusu) shared task for Vietnamese News. This task is hosted at the 9$^{th}$ annual workshop on Vietnamese Language and Speech Processing (VLSP 2022). The goal of Abmusu shared task is to develop summarization systems that could create abstractive summaries automatically for a set of documents o…

    Submitted 26 November, 2023; originally announced November 2023.

    Comments: VLSP 2022

  39. arXiv:2311.14764 [pdf, other]

    cs.CV

    SafeSea: Synthetic Data Generation for Adverse & Low Probability Maritime Conditions

    Authors: Martin Tran, Jordan Shipard, Hermawan Mulyono, Arnold Wiliem, Clinton Fookes

    Abstract: High-quality training data is essential for enhancing the robustness of object detection models. Within the maritime domain, obtaining a diverse real image dataset is particularly challenging due to the difficulty of capturing sea images with the presence of maritime objects, especially in stormy conditions. These challenges arise due to resource limitations, in addition to the unpredictable appe…

    Submitted 23 November, 2023; originally announced November 2023.

    Comments: Accepted to WACV 2024 workshop on Maritime Computer Vision

  40. arXiv:2310.20057 [pdf, other]

    cs.CV

    SolarFormer: Multi-scale Transformer for Solar PV Profiling

    Authors: Adrian de Luis, Minh Tran, Taisei Hanyu, Anh Tran, Liao Haitao, Roy McCann, Alan Mantooth, Ying Huang, Ngan Le

    Abstract: As climate change intensifies, the global imperative to shift towards sustainable energy sources becomes more pronounced. Photovoltaic (PV) energy is a favored choice due to its reliability and ease of installation. Accurate mapping of PV installations is crucial for understanding their adoption and informing energy policy. To meet this need, we introduce the SolarFormer, designed to segment solar…

    Submitted 30 October, 2023; originally announced October 2023.

    Comments: Pre-print

  41. arXiv:2310.16112 [pdf, other]

    cs.CV

    Towards long-tailed, multi-label disease classification from chest X-ray: Overview of the CXR-LT challenge

    Authors: Gregory Holste, Yiliang Zhou, Song Wang, Ajay Jaiswal, Mingquan Lin, Sherry Zhuge, Yuzhe Yang, Dongkyun Kim, Trong-Hieu Nguyen-Mau, Minh-Triet Tran, Jaehyup Jeong, Wongi Park, Jongbin Ryu, Feng Hong, Arsh Verma, Yosuke Yamagishi, Changhyun Kim, Hyeryeong Seo, Myungjoo Kang, Leo Anthony Celi, Zhiyong Lu, Ronald M. Summers, George Shih, Zhangyang Wang, Yifan Peng

    Abstract: Many real-world image recognition problems, such as diagnostic medical imaging exams, are "long-tailed" – there are a few common findings followed by many more relatively rare conditions. In chest radiography, diagnosis is both a long-tailed and multi-label problem, as patients often present with multiple findings simultaneously. While researchers have begun to study the problem of…

    Submitted 1 April, 2024; v1 submitted 24 October, 2023; originally announced October 2023.

    Comments: Update after major revision

  42. arXiv:2310.10875 [pdf, other]

    cs.CV cs.CG

    Filling the Holes on 3D Heritage Object Surface based on Automatic Segmentation Algorithm

    Authors: Sinh Van Nguyen, Son Thanh Le, Minh Khai Tran, Le Thanh Sach

    Abstract: Reconstructing and processing the 3D objects are popular activities in the research field of computer graphics, image processing and computer vision. The 3D objects are processed based on the methods like geometric modeling, a branch of applied mathematics and computational geometry, or the machine learning algorithms based on image processing. The computation of geometrical objects includes proce…

    Submitted 16 October, 2023; originally announced October 2023.

    Comments: 20 pages, 11 figures, 37 references

  43. arXiv:2310.03923  [pdf, other]

    cs.CV cs.RO

    Open-Fusion: Real-time Open-Vocabulary 3D Mapping and Queryable Scene Representation

    Authors: Kashu Yamazaki, Taisei Hanyu, Khoa Vo, Thang Pham, Minh Tran, Gianfranco Doretto, Anh Nguyen, Ngan Le

    Abstract: Precise 3D environmental mapping is pivotal in robotics. Existing methods often rely on predefined concepts during training or are time-intensive when generating semantic maps. This paper presents Open-Fusion, a groundbreaking approach for real-time open-vocabulary 3D mapping and queryable scene representation using RGB-D data. Open-Fusion harnesses the power of a pre-trained vision-language found…

    Submitted 5 October, 2023; originally announced October 2023.

  44. arXiv:2310.00258  [pdf, other]

    cs.CV

    NAYER: Noisy Layer Data Generation for Efficient and Effective Data-free Knowledge Distillation

    Authors: Minh-Tuan Tran, Trung Le, Xuan-May Le, Mehrtash Harandi, Quan Hung Tran, Dinh Phung

    Abstract: Data-Free Knowledge Distillation (DFKD) has made significant recent strides by transferring knowledge from a teacher neural network to a student neural network without accessing the original data. Nonetheless, existing approaches encounter a significant challenge when attempting to generate samples from random noise inputs, which inherently lack meaningful information. Consequently, these models s…

    Submitted 21 March, 2024; v1 submitted 30 September, 2023; originally announced October 2023.

    Comments: Accepted at CVPR 2024

  45. arXiv:2309.17215  [pdf, other]

    cs.LG cs.AI

    RSAM: Learning on manifolds with Riemannian Sharpness-aware Minimization

    Authors: Tuan Truong, Hoang-Phi Nguyen, Tung Pham, Minh-Tuan Tran, Mehrtash Harandi, Dinh Phung, Trung Le

    Abstract: Nowadays, understanding the geometry of the loss landscape shows promise in enhancing a model's generalization ability. In this work, we draw upon prior works that apply geometric principles to optimization and present a novel approach to improve robustness and generalization ability for constrained optimization problems. Indeed, this paper aims to generalize the Sharpness-Aware Minimization (SAM)…

    Submitted 29 September, 2023; originally announced September 2023.

  46. Cloud Watching: Understanding Attacks Against Cloud-Hosted Services

    Authors: Liz Izhikevich, Manda Tran, Michalis Kallitsis, Aurore Fass, Zakir Durumeric

    Abstract: Cloud computing has dramatically changed service deployment patterns. In this work, we analyze how attackers identify and target cloud services in contrast to traditional enterprise networks and network telescopes. Using a diverse set of cloud honeypots in 5 providers and 23 countries as well as 2 educational networks and 1 network telescope, we analyze how IP address assignment, geography, networ…

    Submitted 28 September, 2023; v1 submitted 23 September, 2023; originally announced September 2023.

    Comments: Proceedings of the 2023 ACM Internet Measurement Conference (IMC '23), October 24–26, 2023, Montreal, QC, Canada

  47. arXiv:2309.03493  [pdf, other]

    eess.IV cs.CV

    SAM3D: Segment Anything Model in Volumetric Medical Images

    Authors: Nhat-Tan Bui, Dinh-Hieu Hoang, Minh-Triet Tran, Gianfranco Doretto, Donald Adjeroh, Brijesh Patel, Arabinda Choudhary, Ngan Le

    Abstract: Image segmentation remains a pivotal component in medical image analysis, aiding in the extraction of critical information for precise diagnostic practices. With the advent of deep learning, automated image segmentation methods have risen to prominence, showcasing exceptional proficiency in processing medical imagery. Motivated by the Segment Anything Model (SAM), a foundational model renowned for…

    Submitted 5 March, 2024; v1 submitted 7 September, 2023; originally announced September 2023.

    Comments: Accepted at ISBI 2024

  48. arXiv:2309.03329  [pdf, other]

    cs.CV

    MEGANet: Multi-Scale Edge-Guided Attention Network for Weak Boundary Polyp Segmentation

    Authors: Nhat-Tan Bui, Dinh-Hieu Hoang, Quang-Thuc Nguyen, Minh-Triet Tran, Ngan Le

    Abstract: Efficient polyp segmentation in healthcare plays a critical role in enabling early diagnosis of colorectal cancer. However, the segmentation of polyps presents numerous challenges, including the intricate distribution of backgrounds, variations in polyp sizes and shapes, and indistinct boundaries. Defining the boundary between the foreground (i.e. polyp itself) and the background (surrounding tiss…

    Submitted 4 November, 2023; v1 submitted 6 September, 2023; originally announced September 2023.

    Comments: Accepted at the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV 2024)

  49. arXiv:2309.02418  [pdf, other]

    eess.AS cs.SD eess.SP

    Personalized Adaptation with Pre-trained Speech Encoders for Continuous Emotion Recognition

    Authors: Minh Tran, Yufeng Yin, Mohammad Soleymani

    Abstract: There are individual differences in expressive behaviors driven by cultural norms and personality. This between-person variation can result in reduced emotion recognition performance. Therefore, personalization is an important step in improving the generalization and robustness of speech emotion recognition. In this paper, to achieve unsupervised personalized emotion recognition, we first pre-trai…

    Submitted 5 September, 2023; originally announced September 2023.

    Comments: Accepted by INTERSPEECH 2023

  50. arXiv:2309.02072  [pdf, other]

    econ.EM cs.AI q-fin.CP

    Data Scaling Effect of Deep Learning in Financial Time Series Forecasting

    Authors: Chen Liu, Minh-Ngoc Tran, Chao Wang, Richard Gerlach, Robert Kohn

    Abstract: For years, researchers investigated the applications of deep learning in forecasting financial time series. However, they continued to rely on the conventional econometric approach for model training that optimizes the deep learning models on individual assets. This study highlights the importance of global training, where the deep learning model is optimized across a wide spectrum of stocks. Focu…

    Submitted 31 May, 2024; v1 submitted 5 September, 2023; originally announced September 2023.

    Comments: 25 pages, 5 figures