Showing 1–50 of 179 results for author: Le, N

  1. arXiv:2407.00129  [pdf]

    eess.IV cs.AI cs.HC

    Multimodal Learning and Cognitive Processes in Radiology: MedGaze for Chest X-ray Scanpath Prediction

    Authors: Akash Awasthi, Ngan Le, Zhigang Deng, Rishi Agrawal, Carol C. Wu, Hien Van Nguyen

    Abstract: Predicting human gaze behavior within computer vision is integral for developing interactive systems that can anticipate user attention, address fundamental questions in cognitive science, and hold implications for fields like human-computer interaction (HCI) and augmented/virtual reality (AR/VR) systems. Despite methodologies introduced for modeling human eye gaze behavior, applying these models…

    Submitted 28 June, 2024; originally announced July 2024.

    Comments: Submitted to the Journal

  2. arXiv:2406.19686  [pdf]

    eess.IV cs.AI cs.CV cs.HC

    Enhancing Radiological Diagnosis: A Collaborative Approach Integrating AI and Human Expertise for Visual Miss Correction

    Authors: Akash Awasthi, Ngan Le, Zhigang Deng, Carol C. Wu, Hien Van Nguyen

    Abstract: Human-AI collaboration to identify and correct perceptual errors in chest radiographs has not been previously explored. This study aimed to develop a collaborative AI system, CoRaX, which integrates eye gaze data and radiology reports to enhance diagnostic accuracy in chest radiology by pinpointing perceptual errors and refining the decision-making process. Using public datasets REFLACX and EGD-CX…

    Submitted 28 June, 2024; originally announced June 2024.

    Comments: Under Review in Journal

  3. arXiv:2406.12367  [pdf, other]

    cs.CV cs.LG cs.MM

    Competitive Learning for Achieving Content-specific Filters in Video Coding for Machines

    Authors: Honglei Zhang, Jukka I. Ahonen, Nam Le, Ruiying Yang, Francesco Cricri

    Abstract: This paper investigates the efficacy of jointly optimizing content-specific post-processing filters to adapt a human oriented video/image codec into a codec suitable for machine vision tasks. By observing that artifacts produced by video/image codecs are content-dependent, we propose a novel training strategy based on competitive learning principles. This strategy assigns training samples to filte…

    Submitted 18 June, 2024; originally announced June 2024.

    Comments: Accepted to be presented at ICIP 2024

  4. arXiv:2406.03431  [pdf, other]

    cs.CV

    CattleFace-RGBT: RGB-T Cattle Facial Landmark Benchmark

    Authors: Ethan Coffman, Reagan Clark, Nhat-Tan Bui, Trong Thang Pham, Beth Kegley, Jeremy G. Powell, Jiangchao Zhao, Ngan Le

    Abstract: To address this challenge, we introduce CattleFace-RGBT, an RGB-T Cattle Facial Landmark dataset consisting of 2,300 RGB-T image pairs, a total of 4,600 images. Creating a landmark dataset is time-consuming, but AI-assisted annotation can help. However, applying AI to thermal images is challenging due to suboptimal results from direct thermal training and infeasible RGB-thermal alignment due to dif…

    Submitted 5 June, 2024; originally announced June 2024.

  5. arXiv:2406.00307  [pdf, other]

    cs.CV

    HENASY: Learning to Assemble Scene-Entities for Egocentric Video-Language Model

    Authors: Khoa Vo, Thinh Phan, Kashu Yamazaki, Minh Tran, Ngan Le

    Abstract: Current video-language models (VLMs) rely extensively on instance-level alignment between video and language modalities, which presents two major limitations: (1) visual reasoning disobeys the natural perception that humans do in first-person perspective, leading to a lack of reasoning interpretation; and (2) learning is limited in capturing inherent fine-grained relationships between two modaliti…

    Submitted 6 June, 2024; v1 submitted 1 June, 2024; originally announced June 2024.

    Comments: under submission

  6. arXiv:2405.16148  [pdf, other]

    cs.LG

    Accelerating Transformers with Spectrum-Preserving Token Merging

    Authors: Hoai-Chau Tran, Duy M. H. Nguyen, Duy M. Nguyen, Trung-Tin Nguyen, Ngan Le, Pengtao Xie, Daniel Sonntag, James Y. Zou, Binh T. Nguyen, Mathias Niepert

    Abstract: Increasing the throughput of the Transformer architecture, a foundational component used in numerous state-of-the-art models for vision and language tasks (e.g., GPT, LLaVa), is an important problem in machine learning. One recent and effective strategy is to merge token representations within Transformer models, aiming to reduce computational and memory requirements while maintaining accuracy. Pr…

    Submitted 25 May, 2024; originally announced May 2024.

    Comments: Version 1

  7. arXiv:2405.07994  [pdf]

    eess.IV cs.AI cs.CV cs.LG

    BubbleID: A Deep Learning Framework for Bubble Interface Dynamics Analysis

    Authors: Christy Dunlap, Changgen Li, Hari Pandey, Ngan Le, Han Hu

    Abstract: This paper presents BubbleID, a sophisticated deep learning architecture designed to comprehensively identify both static and dynamic attributes of bubbles within sequences of boiling images. By amalgamating segmentation powered by Mask R-CNN with SORT-based tracking techniques, the framework is capable of analyzing each bubble's location, dimensions, interface shape, and velocity over its lifetim…

    Submitted 20 March, 2024; originally announced May 2024.

    Comments: 16 pages, 4 figures

  8. arXiv:2405.04489  [pdf, other]

    cs.CV

    S3Former: Self-supervised High-resolution Transformer for Solar PV Profiling

    Authors: Minh Tran, Adrian De Luis, Haitao Liao, Ying Huang, Roy McCann, Alan Mantooth, Jack Cothren, Ngan Le

    Abstract: As the impact of climate change escalates, the global necessity to transition to sustainable energy sources becomes increasingly evident. Renewable energies have emerged as a viable solution for users, with Photovoltaic energy being a favored choice for small installations due to its reliability and efficiency. Accurate mapping of PV installations is crucial for understanding the extension of its…

    Submitted 7 May, 2024; originally announced May 2024.

    Comments: Preprint

  9. arXiv:2405.02522  [pdf]

    cs.HC cs.AI cs.CY cs.SI

    New contexts, old heuristics: How young people in India and the US trust online content in the age of generative AI

    Authors: Rachel Xu, Nhu Le, Rebekah Park, Laura Murray, Vishnupriya Das, Devika Kumar, Beth Goldberg

    Abstract: We conducted an in-person ethnography in India and the US to investigate how young people (18-24) trusted online content, with a focus on generative AI (GenAI). We had four key findings about how young people use GenAI and determine what to trust online. First, when online, we found participants fluidly shifted between mindsets and emotional states, which we term "information modes." Second, these…

    Submitted 3 May, 2024; originally announced May 2024.

    Comments: 14 pages

  10. arXiv:2404.19052  [pdf, other]

    cs.DB cs.IR

    Exploring Weighted Property Approaches for RDF Graph Similarity Measure

    Authors: Ngoc Luyen Le, Marie-Hélène Abel, Philippe Gouspillou

    Abstract: Measuring similarity between RDF graphs is essential for various applications, including knowledge discovery, semantic web analysis, and recommender systems. However, traditional similarity measures often treat all properties equally, potentially overlooking the varying importance of different properties in different contexts. Consequently, exploring weighted property approaches for RDF graph simi…

    Submitted 29 April, 2024; originally announced April 2024.

  11. arXiv:2404.11429  [pdf, other]

    cs.CV

    CarcassFormer: An End-to-end Transformer-based Framework for Simultaneous Localization, Segmentation and Classification of Poultry Carcass Defect

    Authors: Minh Tran, Sang Truong, Arthur F. A. Fernandes, Michael T. Kidd, Ngan Le

    Abstract: In the food industry, assessing the quality of poultry carcasses during processing is a crucial step. This study proposes an effective approach for automating the assessment of carcass quality without requiring skilled labor or inspector involvement. The proposed system is based on machine learning (ML) and computer vision (CV) techniques, enabling automated defect detection and carcass quality as…

    Submitted 17 April, 2024; originally announced April 2024.

    Comments: Accepted to Poultry Science Journal

  12. arXiv:2404.09951  [pdf, other]

    cs.CV

    Unifying Global and Local Scene Entities Modelling for Precise Action Spotting

    Authors: Kim Hoang Tran, Phuc Vuong Do, Ngoc Quoc Ly, Ngan Le

    Abstract: Sports videos pose complex challenges, including cluttered backgrounds, camera angle changes, small action-representing objects, and imbalanced action class distribution. Existing methods for detecting actions in sports videos heavily rely on global features, utilizing a backbone network as a black box that encompasses the entire spatial frame. However, these approaches tend to overlook the nuance…

    Submitted 15 April, 2024; originally announced April 2024.

    Comments: Accepted to IJCNN 2024

  13. arXiv:2403.17606  [pdf, other]

    cs.RO

    Interactive Identification of Granular Materials using Force Measurements

    Authors: Samuli Hynninen, Tran Nguyen Le, Ville Kyrki

    Abstract: The ability to identify granular materials facilitates the emergence of various new applications in robotics, ranging from cooking at home to truck loading at mining sites. However, granular material identification remains a challenging and underexplored area. In this work, we present a novel interactive material identification framework that enables robots to identify a wide range of granular mat…

    Submitted 26 March, 2024; originally announced March 2024.

    Comments: Submitted to 2024 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2024)

  14. arXiv:2403.12685  [pdf, other]

    cs.RO

    Dynamic Manipulation of Deformable Objects using Imitation Learning with Adaptation to Hardware Constraints

    Authors: Eric Hannus, Tran Nguyen Le, David Blanco-Mulero, Ville Kyrki

    Abstract: Imitation Learning (IL) is a promising paradigm for learning dynamic manipulation of deformable objects since it does not depend on difficult-to-create accurate simulations of such objects. However, the translation of motions demonstrated by a human to a robot is a challenge for IL, due to differences in the embodiments and the robot's physical limits. These limits are especially relevant in dynam…

    Submitted 19 March, 2024; originally announced March 2024.

    Comments: Submitted to 2024 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2024). 8 pages, 8 figures

  15. arXiv:2403.11376  [pdf, other]

    cs.CV

    ShapeFormer: Shape Prior Visible-to-Amodal Transformer-based Amodal Instance Segmentation

    Authors: Minh Tran, Winston Bounsavy, Khoa Vo, Anh Nguyen, Tri Nguyen, Ngan Le

    Abstract: Amodal Instance Segmentation (AIS) presents a challenging task as it involves predicting both visible and occluded parts of objects within images. Existing AIS methods rely on a bidirectional approach, encompassing both the transition from amodal features to visible features (amodal-to-visible) and from visible features to amodal features (visible-to-amodal). Our observation shows that the utiliza…

    Submitted 17 April, 2024; v1 submitted 17 March, 2024; originally announced March 2024.

    Comments: Accepted to IJCNN2024

  16. arXiv:2403.03435  [pdf, ps, other]

    cs.CL

    VLSP 2023 -- LTER: A Summary of the Challenge on Legal Textual Entailment Recognition

    Authors: Vu Tran, Ha-Thanh Nguyen, Trung Vo, Son T. Luu, Hoang-Anh Dang, Ngoc-Cam Le, Thi-Thuy Le, Minh-Tien Nguyen, Truong-Son Nguyen, Le-Minh Nguyen

    Abstract: In this new era of rapid AI development, especially in language processing, the demand for AI in the legal domain is increasingly critical. In the context where research in other languages such as English, Japanese, and Chinese has been well-established, we introduce the first fundamental research for the Vietnamese language in the legal domain: legal textual entailment recognition through the Vie…

    Submitted 5 March, 2024; originally announced March 2024.

  17. arXiv:2403.02974  [pdf, other]

    cs.RO cs.HC cs.LG

    Online Learning of Human Constraints from Feedback in Shared Autonomy

    Authors: Shibei Zhu, Tran Nguyen Le, Samuel Kaski, Ville Kyrki

    Abstract: Real-time collaboration with humans poses challenges due to the different behavior patterns of humans resulting from diverse physical constraints. Existing works typically focus on learning safety constraints for collaboration, or how to divide and distribute the subtasks between the participating agents to carry out the main task. In contrast, we propose to learn a human constraints model that, i…

    Submitted 5 March, 2024; originally announced March 2024.

    Comments: Accepted to AAAI-24 Bridge Program on Collaborative AI and Modeling of Humans & AAAI-24 Workshop on Ad Hoc Teamwork

  18. arXiv:2402.18753  [pdf]

    cs.HC cs.CY cs.SI

    Like-minded, like-bodied: How users (18-26) trust online eating and health information

    Authors: Rachel Xu, Nhu Le, Rebekah Park, Laura Murray

    Abstract: This paper investigates the relationship between social media and eating practices amongst 42 internet users aged 18-26. We conducted an ethnography in the US and India to observe how they navigated eating and health information online. We found that participants portrayed themselves online through a vocabulary we have labeled "the good life": performing holistic health by displaying a socially-id…

    Submitted 28 February, 2024; originally announced February 2024.

    Comments: 10 pages

  19. arXiv:2401.10761  [pdf, other]

    eess.IV cs.CV

    NN-VVC: Versatile Video Coding boosted by self-supervisedly learned image coding for machines

    Authors: Jukka I. Ahonen, Nam Le, Honglei Zhang, Antti Hallapuro, Francesco Cricri, Hamed Rezazadegan Tavakoli, Miska M. Hannuksela, Esa Rahtu

    Abstract: The recent progress in artificial intelligence has led to an ever-increasing usage of images and videos by machine analysis algorithms, mainly neural networks. Nonetheless, compression, storage and transmission of media have traditionally been designed considering human beings as the viewers of the content. Recent research on image and video coding for machine analysis has progressed mainly in two…

    Submitted 19 January, 2024; originally announced January 2024.

    Comments: ISM 2023 Best paper award winner version

  20. Bridging the gap between image coding for machines and humans

    Authors: Nam Le, Honglei Zhang, Francesco Cricri, Ramin G. Youvalari, Hamed Rezazadegan Tavakoli, Emre Aksu, Miska M. Hannuksela, Esa Rahtu

    Abstract: Image coding for machines (ICM) aims at reducing the bitrate required to represent an image while minimizing the drop in machine vision analysis accuracy. In many use cases, such as surveillance, it is also important that the visual quality is not drastically deteriorated by the compression process. Recent works on using neural network (NN) based ICM codecs have shown significant coding gains agai…

    Submitted 19 January, 2024; originally announced January 2024.

    Journal ref: IEEE International Conference on Image Processing (ICIP), Bordeaux, France, 2022, pp. 3411-3415

  21. arXiv:2401.04474  [pdf, other]

    cs.IR cs.AI

    Combining Embedding-Based and Semantic-Based Models for Post-hoc Explanations in Recommender Systems

    Authors: Ngoc Luyen Le, Marie-Hélène Abel, Philippe Gouspillou

    Abstract: In today's data-rich environment, recommender systems play a crucial role in decision support systems. They provide to users personalized recommendations and explanations about these recommendations. Embedding-based models, despite their widespread use, often suffer from a lack of interpretability, which can undermine trust and user engagement. This paper presents an approach that combines embeddi…

    Submitted 9 January, 2024; originally announced January 2024.

  22. arXiv:2401.03770  [pdf, other]

    cs.IR

    Recognizing Similar Crises through the Application of Ontology-based Knowledge Mining

    Authors: Ngoc Luyen Le, Marie-Hélène Abel, Elsa Negre

    Abstract: Recognizing and learning from similar crisis situations is crucial for the development of effective response strategies. This study addresses the challenge of identifying similarities within a wide range of crisis-related information. To overcome this challenge, we employed an ontology-based crisis situation knowledge base enriched with crisis-related information. Additionally, we implemented a se…

    Submitted 8 January, 2024; originally announced January 2024.

  23. arXiv:2312.10187  [pdf, other]

    eess.SP cs.LG

    TSRNet: Simple Framework for Real-time ECG Anomaly Detection with Multimodal Time and Spectrogram Restoration Network

    Authors: Nhat-Tan Bui, Dinh-Hieu Hoang, Thinh Phan, Minh-Triet Tran, Brijesh Patel, Donald Adjeroh, Ngan Le

    Abstract: The electrocardiogram (ECG) is a valuable signal used to assess various aspects of heart health, such as heart rate and rhythm. It plays a crucial role in identifying cardiac conditions and detecting anomalies in ECG data. However, distinguishing between normal and abnormal ECG signals can be a challenging task. In this paper, we propose an approach that leverages anomaly detection to identify unh…

    Submitted 5 March, 2024; v1 submitted 15 December, 2023; originally announced December 2023.

    Comments: Accepted at ISBI 2024

  24. WAVER: Writing-style Agnostic Text-Video Retrieval via Distilling Vision-Language Models Through Open-Vocabulary Knowledge

    Authors: Huy Le, Tung Kieu, Anh Nguyen, Ngan Le

    Abstract: Text-video retrieval, a prominent sub-field within the domain of multimodal information retrieval, has witnessed remarkable growth in recent years. However, existing methods assume video scenes are consistent with unbiased descriptions. These limitations fail to align with real-world scenarios since descriptions can be influenced by annotator biases, diverse writing styles, and varying textual per…

    Submitted 10 January, 2024; v1 submitted 14 December, 2023; originally announced December 2023.

    Comments: Accepted to ICASSP 2024

  25. arXiv:2312.07740  [pdf, other]

    cs.CV

    HAtt-Flow: Hierarchical Attention-Flow Mechanism for Group Activity Scene Graph Generation in Videos

    Authors: Naga VS Raviteja Chappa, Pha Nguyen, Thi Hoang Ngan Le, Khoa Luu

    Abstract: Group Activity Scene Graph (GASG) generation is a challenging task in computer vision, aiming to anticipate and describe relationships between subjects and objects in video sequences. Traditional Video Scene Graph Generation (VidSGG) methods focus on retrospective analysis, limiting their predictive capabilities. To enrich the scene understanding capabilities, we introduced a GASG dataset extendin…

    Submitted 28 November, 2023; originally announced December 2023.

    Comments: 11 pages, 5 figures, 6 tables

  26. arXiv:2312.05634  [pdf, other]

    cs.CV

    PGDS: Pose-Guidance Deep Supervision for Mitigating Clothes-Changing in Person Re-Identification

    Authors: Quoc-Huy Trinh, Nhat-Tan Bui, Dinh-Hieu Hoang, Phuoc-Thao Vo Thi, Hai-Dang Nguyen, Debesh Jha, Ulas Bagci, Ngan Le, Minh-Triet Tran

    Abstract: Person Re-Identification (Re-ID) task seeks to enhance the tracking of multiple individuals by surveillance cameras. It supports multimodal tasks, including text-based person retrieval and human matching. One of the most significant challenges faced in Re-ID is clothes-changing, where the same person may appear in different outfits. While previous methods have made notable progress in maintaining…

    Submitted 1 June, 2024; v1 submitted 9 December, 2023; originally announced December 2023.

    Comments: Accepted at AVSS 2024

  27. arXiv:2311.11362  [pdf, other]

    quant-ph cs.LG physics.chem-ph physics.comp-ph

    Symmetry-invariant quantum machine learning force fields

    Authors: Isabel Nha Minh Le, Oriel Kiss, Julian Schuhmacher, Ivano Tavernelli, Francesco Tacchino

    Abstract: Machine learning techniques are essential tools to compute efficient, yet accurate, force fields for atomistic simulations. This approach has recently been extended to incorporate quantum computational methods, making use of variational quantum learning models to predict potential energy surfaces and atomic forces from ab initio training data. However, the trainability and scalability of such mode…

    Submitted 19 November, 2023; originally announced November 2023.

    Comments: 12 pages, 8 figures

  28. arXiv:2311.11096  [pdf, other]

    eess.IV cs.CV

    On the Out of Distribution Robustness of Foundation Models in Medical Image Segmentation

    Authors: Duy Minh Ho Nguyen, Tan Ngoc Pham, Nghiem Tuong Diep, Nghi Quoc Phan, Quang Pham, Vinh Tong, Binh T. Nguyen, Ngan Hoang Le, Nhat Ho, Pengtao Xie, Daniel Sonntag, Mathias Niepert

    Abstract: Constructing a robust model that can effectively generalize to test samples under distribution shifts remains a significant challenge in the field of medical imaging. The foundational models for vision and language, pre-trained on extensive sets of natural image and text data, have emerged as a promising approach. It showcases impressive learning abilities across different tasks with the need for…

    Submitted 18 November, 2023; originally announced November 2023.

    Comments: Advances in Neural Information Processing Systems (NeurIPS) 2023, Workshop on robustness of zero/few-shot learning in foundation models

  29. arXiv:2311.00729  [pdf, other]

    cs.CV cs.AI

    ZEETAD: Adapting Pretrained Vision-Language Model for Zero-Shot End-to-End Temporal Action Detection

    Authors: Thinh Phan, Khoa Vo, Duy Le, Gianfranco Doretto, Donald Adjeroh, Ngan Le

    Abstract: Temporal action detection (TAD) involves the localization and classification of action instances within untrimmed videos. While standard TAD follows fully supervised learning with closed-set setting on large training data, recent zero-shot TAD methods showcase the promising open-set setting by leveraging large-scale contrastive visual-language (ViL) pretrained models. However, existing zero-shot T…

    Submitted 4 November, 2023; v1 submitted 31 October, 2023; originally announced November 2023.

  30. arXiv:2310.20057  [pdf, other]

    cs.CV

    SolarFormer: Multi-scale Transformer for Solar PV Profiling

    Authors: Adrian de Luis, Minh Tran, Taisei Hanyu, Anh Tran, Liao Haitao, Roy McCann, Alan Mantooth, Ying Huang, Ngan Le

    Abstract: As climate change intensifies, the global imperative to shift towards sustainable energy sources becomes more pronounced. Photovoltaic (PV) energy is a favored choice due to its reliability and ease of installation. Accurate mapping of PV installations is crucial for understanding their adoption and informing energy policy. To meet this need, we introduce the SolarFormer, designed to segment solar…

    Submitted 30 October, 2023; originally announced October 2023.

    Comments: Pre-print

  31. arXiv:2310.18986  [pdf, other]

    cs.CV

    Controllable Group Choreography using Contrastive Diffusion

    Authors: Nhat Le, Tuong Do, Khoa Do, Hien Nguyen, Erman Tjiputra, Quang D. Tran, Anh Nguyen

    Abstract: Music-driven group choreography poses a considerable challenge but holds significant potential for a wide range of industrial applications. The ability to generate synchronized and visually appealing group dance motions that are aligned with music opens up opportunities in many fields such as entertainment, advertising, and virtual performances. However, most of the recent works are not able to ge…

    Submitted 3 November, 2023; v1 submitted 29 October, 2023; originally announced October 2023.

  32. arXiv:2310.03923  [pdf, other]

    cs.CV cs.RO

    Open-Fusion: Real-time Open-Vocabulary 3D Mapping and Queryable Scene Representation

    Authors: Kashu Yamazaki, Taisei Hanyu, Khoa Vo, Thang Pham, Minh Tran, Gianfranco Doretto, Anh Nguyen, Ngan Le

    Abstract: Precise 3D environmental mapping is pivotal in robotics. Existing methods often rely on predefined concepts during training or are time-intensive when generating semantic maps. This paper presents Open-Fusion, a groundbreaking approach for real-time open-vocabulary 3D mapping and queryable scene representation using RGB-D data. Open-Fusion harnesses the power of a pre-trained vision-language found…

    Submitted 5 October, 2023; originally announced October 2023.

  33. arXiv:2310.02143  [pdf, other]

    cs.CY cs.IR

    CORec-Cri: How collaborative and social technologies can help to contextualize crises?

    Authors: Ngoc Luyen Le, **feng Zhong, Elsa Negre, Marie-Hélène Abel

    Abstract: Crisis situations can present complex and multifaceted challenges, often requiring the involvement of multiple organizations and stakeholders with varying areas of expertise, responsibilities, and resources. Acquiring accurate and timely information about impacted areas is crucial to effectively respond to these crises. In this paper, we investigate how collaborative and social technologies help t…

    Submitted 3 October, 2023; originally announced October 2023.

  34. arXiv:2309.13550  [pdf, other]

    cs.CV cs.AI

    I-AI: A Controllable & Interpretable AI System for Decoding Radiologists' Intense Focus for Accurate CXR Diagnoses

    Authors: Trong Thang Pham, Jacob Brecheisen, Anh Nguyen, Hien Nguyen, Ngan Le

    Abstract: In the field of chest X-ray (CXR) diagnosis, existing works often focus solely on determining where a radiologist looks, typically through tasks such as detection, segmentation, or classification. However, these approaches are often designed as black-box models, lacking interpretability. In this paper, we introduce Interpretable Artificial Intelligence (I-AI) a novel and unified controllable inter…

    Submitted 9 December, 2023; v1 submitted 24 September, 2023; originally announced September 2023.

    Comments: Accepted at WACV 2024

  35. arXiv:2309.12323  [pdf, other]

    cond-mat.mtrl-sci cs.LG physics.comp-ph

    Evaluating the diversity and utility of materials proposed by generative models

    Authors: Alexander New, Michael Pekala, Elizabeth A. Pogue, Nam Q. Le, Janna Domenico, Christine D. Piatko, Christopher D. Stiles

    Abstract: Generative machine learning models can use data generated by scientific modeling to create large quantities of novel material structures. Here, we assess how one state-of-the-art generative model, the physics-guided crystal generation model (PGCGM), can be used as part of the inverse design process. We show that the default PGCGM's input space is not smooth with respect to parameter variation, mak…

    Submitted 9 August, 2023; originally announced September 2023.

    Comments: 12 pages, 9 figures. Published at SynS & ML @ ICML2023: https://openreview.net/forum?id=2ZYbmYTKoR

  36. arXiv:2309.10932  [pdf, other]

    cs.RO

    Open-Vocabulary Affordance Detection using Knowledge Distillation and Text-Point Correlation

    Authors: Tuan Van Vo, Minh Nhat Vu, Baoru Huang, Toan Nguyen, Ngan Le, Thieu Vo, Anh Nguyen

    Abstract: Affordance detection presents intricate challenges and has a wide range of robotic applications. Previous works have faced limitations such as the complexities of 3D object shapes, the wide range of potential affordances on real-world objects, and the lack of open-vocabulary support for affordance understanding. In this paper, we introduce a new open-vocabulary affordance detection method in 3D po…

    Submitted 19 September, 2023; originally announced September 2023.

    Comments: 8 pages

  37. arXiv:2309.10911  [pdf, other]

    cs.RO

    Language-Conditioned Affordance-Pose Detection in 3D Point Clouds

    Authors: Toan Nguyen, Minh Nhat Vu, Baoru Huang, Tuan Van Vo, Vy Truong, Ngan Le, Thieu Vo, Bac Le, Anh Nguyen

    Abstract: Affordance detection and pose estimation are of great importance in many robotic applications. Their combination helps the robot gain an enhanced manipulation capability, in which the generated pose can facilitate the corresponding affordance task. Previous methods for affordance-pose joint learning are limited to a predefined set of affordances, thus limiting the adaptability of robots in real-wor…

    Submitted 19 September, 2023; originally announced September 2023.

    Comments: Project page: https://3DAPNet.github.io

  38. arXiv:2309.03506  [pdf, other]

    cs.CV cs.AI

    Towards Robust Natural-Looking Mammography Lesion Synthesis on Ipsilateral Dual-Views Breast Cancer Analysis

    Authors: Thanh-Huy Nguyen, Quang Hien Kha, Thai Ngoc Toan Truong, Ba Thinh Lam, Ba Hung Ngo, Quang Vinh Dinh, Nguyen Quoc Khanh Le

    Abstract: In recent years, many mammographic image analysis methods have been introduced for improving cancer classification tasks. Two major issues of mammogram classification tasks are leveraging multi-view mammographic information and class-imbalance handling. In the first problem, many multi-view methods have been released for concatenating features of two or more views for the training and inference st…

    Submitted 7 September, 2023; originally announced September 2023.

  39. arXiv:2309.03493  [pdf, other]

    eess.IV cs.CV

    SAM3D: Segment Anything Model in Volumetric Medical Images

    Authors: Nhat-Tan Bui, Dinh-Hieu Hoang, Minh-Triet Tran, Gianfranco Doretto, Donald Adjeroh, Brijesh Patel, Arabinda Choudhary, Ngan Le

    Abstract: Image segmentation remains a pivotal component in medical image analysis, aiding in the extraction of critical information for precise diagnostic practices. With the advent of deep learning, automated image segmentation methods have risen to prominence, showcasing exceptional proficiency in processing medical imagery. Motivated by the Segment Anything Model (SAM)-a foundational model renowned for…

    Submitted 5 March, 2024; v1 submitted 7 September, 2023; originally announced September 2023.

    Comments: Accepted at ISBI 2024

  40. arXiv:2309.03329  [pdf, other]

    cs.CV

    MEGANet: Multi-Scale Edge-Guided Attention Network for Weak Boundary Polyp Segmentation

    Authors: Nhat-Tan Bui, Dinh-Hieu Hoang, Quang-Thuc Nguyen, Minh-Triet Tran, Ngan Le

    Abstract: Efficient polyp segmentation in healthcare plays a critical role in enabling early diagnosis of colorectal cancer. However, the segmentation of polyps presents numerous challenges, including the intricate distribution of backgrounds, variations in polyp sizes and shapes, and indistinct boundaries. Defining the boundary between the foreground (i.e. polyp itself) and the background (surrounding tiss…

    Submitted 4 November, 2023; v1 submitted 6 September, 2023; originally announced September 2023.

    Comments: Accepted at the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV 2024)

  41. arXiv:2308.06018  [pdf, other]

    cs.IR

    Designing a User Contextual Profile Ontology: A Focus on the Vehicle Sales Domain

    Authors: Ngoc Luyen Le, Marie-Hélène Abel, Philippe Gouspillou

    Abstract: In the digital age, it is crucial to understand and tailor experiences for users interacting with systems and applications. This requires the creation of user contextual profiles that combine user profiles with contextual information. However, there is a lack of research on the integration of contextual information with different user profiles. This study aims to address this gap by designing a us…

    Submitted 11 August, 2023; originally announced August 2023.

  42. A Constraint-based Recommender System via RDF Knowledge Graphs

    Authors: Ngoc Luyen Le, Marie-Hélène Abel, Philippe Gouspillou

    Abstract: Knowledge graphs, represented in RDF, are able to model entities and their relations by means of ontologies. The use of knowledge graphs for information modeling has attracted interest in recent years. In recommender systems, items and users can be mapped and integrated into the knowledge graph, which can represent more links and relationships between users and items. Constraint-based recommender…

    Submitted 20 July, 2023; originally announced July 2023.

    Journal ref: The 26th International Conference on Computer Supported Cooperative Work in Design (CSCWD 2023), May 2023, Rio de Janeiro, Brazil. pp.849-854

  43. A Personalized Recommender System Based-on Knowledge Graph Embeddings

    Authors: Ngoc Luyen Le, Marie-Hélène Abel, Philippe Gouspillou

    Abstract: Knowledge graphs have proven to be effective for modeling entities and their relationships through the use of ontologies. The recent emergence in interest for using knowledge graphs as a form of information modeling has led to their increased adoption in recommender systems. By incorporating users and items into the knowledge graph, these systems can better capture the implicit connections between…

    Submitted 20 July, 2023; originally announced July 2023.

    Journal ref: The International Conference on Artificial Intelligence and Computer Vision (AICV2023), Mar 2023, Marrakesh, Morocco. pp.368-378

  44. Improving Semantic Similarity Measure Within a Recommender System Based-on RDF Graphs

    Authors: Ngoc Luyen Le, Marie-Hélène Abel, Philippe Gouspillou

    Abstract: In today's era of information explosion, more users are becoming more reliant upon recommender systems to have better advice, suggestions, or inspire them. The measure of the semantic relatedness or likeness between terms, words, or text data plays an important role in different applications dealing with textual data, as in a recommender system. Over the past few years, many ontologies have been d…

    Submitted 20 July, 2023; originally announced July 2023.

    Journal ref: International Conference on Information Technology & Systems. ICITS 2023, Apr 2023, Cusco, Peru. pp.463-474

  45. arXiv:2307.08272  [pdf, other]

    cs.CL

    ChatGPT is Good but Bing Chat is Better for Vietnamese Students

    Authors: Xuan-Quy Dao, Ngoc-Bich Le

    Abstract: This study examines the efficacy of two SOTA large language models (LLMs), namely ChatGPT and Microsoft Bing Chat (BingChat), in catering to the needs of Vietnamese students. Although ChatGPT exhibits proficiency in multiple disciplines, Bing Chat emerges as the more advantageous option. We conduct a comparative analysis of their academic achievements in various disciplines, encompassing mathemati…

    Submitted 29 July, 2023; v1 submitted 17 July, 2023; originally announced July 2023.

    Comments: 13 pages; 6 figures

  46. arXiv:2307.04251   

    cs.CL cs.AI cs.LG

    ChatGPT in the Age of Generative AI and Large Language Models: A Concise Survey

    Authors: Salman Mohamadi, Ghulam Mujtaba, Ngan Le, Gianfranco Doretto, Donald A. Adjeroh

    Abstract: ChatGPT is a large language model (LLM) created by OpenAI that has been carefully trained on a large amount of data. It has revolutionized the field of natural language processing (NLP) and has pushed the boundaries of LLM capabilities. ChatGPT has played a pivotal role in enabling widespread public interaction with generative artificial intelligence (GAI) on a large scale. It has also sparked res…

    Submitted 15 July, 2023; v1 submitted 9 July, 2023; originally announced July 2023.

    Comments: The paper was uploaded in error, before it was ready for submission. The paper requires a deep revision, and significant changes and modifications before uploading to the archive

  47. arXiv:2307.01844  [pdf, other]

    cs.CV

    Advancing Wound Filling Extraction on 3D Faces: Auto-Segmentation and Wound Face Regeneration Approach

    Authors: Duong Q. Nguyen, Thinh D. Le, Phuong D. Nguyen, Nga T. K. Le, H. Nguyen-Xuan

    Abstract: Facial wound segmentation plays a crucial role in preoperative planning and optimizing patient outcomes in various medical applications. In this paper, we propose an efficient approach for automating 3D facial wound segmentation using a two-stream graph convolutional network. Our method leverages the Cir3D-FaIR dataset and addresses the challenge of data imbalance through extensive experimentation…

    Submitted 12 July, 2023; v1 submitted 4 July, 2023; originally announced July 2023.

  48. arXiv:2307.01159  [pdf, other]

    cs.RO cs.AI

    Soft Gripping: Specifying for Trustworthiness

    Authors: Dhaminda B. Abeywickrama, Nguyen Hao Le, Greg Chance, Peter D. Winter, Arianna Manzini, Alix J. Partridge, Jonathan Ives, John Downer, Graham Deacon, Jonathan Rossiter, Kerstin Eder, Shane Windsor

    Abstract: Soft robotics is an emerging technology in which engineers create flexible devices for use in a variety of applications. In order to advance the wide adoption of soft robots, ensuring their trustworthiness is essential; if soft robots are not trusted, they will not be used to their full potential. In order to demonstrate trustworthiness, a specification needs to be formulated to define what is tru…

    Submitted 30 October, 2023; v1 submitted 3 July, 2023; originally announced July 2023.

    Comments: Updated the Standards subsection of paper. 9 pages, 2 figures, 1 table, 34 references

    ACM Class: D.2.1; I.2.9

  49. arXiv:2306.09170  [pdf, ps, other]

    cs.CL cs.HC

    Can ChatGPT pass the Vietnamese National High School Graduation Examination?

    Authors: Xuan-Quy Dao, Ngoc-Bich Le, Xuan-Dung Phan, Bac-Bien Ngo

    Abstract: This research article highlights the potential of AI-powered chatbots in education and presents the results of using ChatGPT, a large language model, to complete the Vietnamese National High School Graduation Examination (VNHSGE). The study dataset included 30 essays in the literature test case and 1,700 multiple-choice questions designed for other subjects. The results showed that ChatGPT was abl…

    Submitted 10 July, 2023; v1 submitted 15 June, 2023; originally announced June 2023.

    Comments: 9 pages, 13 figures, 4 tables

  50. arXiv:2306.06842  [pdf, other]

    cs.CV

    AerialFormer: Multi-resolution Transformer for Aerial Image Segmentation

    Authors: Kashu Yamazaki, Taisei Hanyu, Minh Tran, Adrian de Luis, Roy McCann, Haitao Liao, Chase Rainwater, Meredith Adkins, Jackson Cothren, Ngan Le

    Abstract: Aerial Image Segmentation is a top-down perspective semantic segmentation and has several challenging characteristics such as strong imbalance in the foreground-background distribution, complex background, intra-class heterogeneity, inter-class homogeneity, and tiny objects. To handle these problems, we inherit the advantages of Transformers and propose AerialFormer, which unifies Transformers at…

    Submitted 1 October, 2023; v1 submitted 11 June, 2023; originally announced June 2023.

    Comments: under review