Showing 1–49 of 49 results for author: Vo, H

  1. arXiv:2405.15613  [pdf, other]

    cs.LG cs.AI cs.CV

    Automatic Data Curation for Self-Supervised Learning: A Clustering-Based Approach

    Authors: Huy V. Vo, Vasil Khalidov, Timothée Darcet, Théo Moutakanni, Nikita Smetanin, Marc Szafraniec, Hugo Touvron, Camille Couprie, Maxime Oquab, Armand Joulin, Hervé Jégou, Patrick Labatut, Piotr Bojanowski

    Abstract: Self-supervised features are the cornerstone of modern machine learning systems. They are typically pre-trained on data collections whose construction and curation typically require extensive human effort. This manual process has some limitations similar to those encountered in supervised learning, e.g., the crowd-sourced selection of data is costly and time-consuming, preventing scaling the datas…

    Submitted 28 June, 2024; v1 submitted 24 May, 2024; originally announced May 2024.

  2. arXiv:2405.05272  [pdf, other]

    math.GT cs.LG

    Learning bridge numbers of knots

    Authors: Hanh Vo, Puttipong Pongtanapaisan, Thieu Nguyen

    Abstract: This paper employs various computational techniques to determine the bridge numbers of both classical and virtual knots. For classical knots, there is no ambiguity of what the bridge number means. For virtual knots, there are multiple natural definitions of bridge number, and we demonstrate that the difference can be arbitrarily far apart. We then acquired two datasets, one for classical and one f…

    Submitted 29 April, 2024; originally announced May 2024.

  3. arXiv:2403.00833  [pdf, other]

    cs.AI

    Position Paper: Agent AI Towards a Holistic Intelligence

    Authors: Qiuyuan Huang, Naoki Wake, Bidipta Sarkar, Zane Durante, Ran Gong, Rohan Taori, Yusuke Noda, Demetri Terzopoulos, Noboru Kuno, Ade Famoti, Ashley Llorens, John Langford, Hoi Vo, Li Fei-Fei, Katsu Ikeuchi, Jianfeng Gao

    Abstract: Recent advancements in large foundation models have remarkably enhanced our understanding of sensory information in open-world environments. In leveraging the power of foundation models, it is crucial for AI research to pivot away from excessive reductionism and toward an emphasis on systems that function as cohesive wholes. Specifically, we emphasize developing Agent AI -- an embodied system that…

    Submitted 28 February, 2024; originally announced March 2024.

    Comments: 22 pages, 4 figures. arXiv admin note: substantial text overlap with arXiv:2401.03568

  4. arXiv:2402.05929  [pdf, other]

    cs.AI cs.LG cs.RO

    An Interactive Agent Foundation Model

    Authors: Zane Durante, Bidipta Sarkar, Ran Gong, Rohan Taori, Yusuke Noda, Paul Tang, Ehsan Adeli, Shrinidhi Kowshika Lakshmikanth, Kevin Schulman, Arnold Milstein, Demetri Terzopoulos, Ade Famoti, Noboru Kuno, Ashley Llorens, Hoi Vo, Katsu Ikeuchi, Li Fei-Fei, Jianfeng Gao, Naoki Wake, Qiuyuan Huang

    Abstract: The development of artificial intelligence systems is transitioning from creating static, task-specific models to dynamic, agent-based systems capable of performing well in a wide range of applications. We propose an Interactive Agent Foundation Model that uses a novel multi-task agent training paradigm for training AI agents across a wide range of domains, datasets, and tasks. Our training paradi…

    Submitted 17 June, 2024; v1 submitted 8 February, 2024; originally announced February 2024.

  5. arXiv:2402.03805  [pdf, other]

    cs.SE

    Automated Description Generation for Software Patches

    Authors: Thanh Trong Vu, Tuan-Dung Bui, Thanh-Dat Do, Thu-Trang Nguyen, Hieu Dinh Vo, Son Nguyen

    Abstract: Software patches are pivotal in refining and evolving codebases, addressing bugs, vulnerabilities, and optimizations. Patch descriptions provide detailed accounts of changes, aiding comprehension and collaboration among developers. However, manual description creation poses challenges in terms of time consumption and variations in quality and detail. In this paper, we propose PATCHEXPLAINER, an ap…

    Submitted 6 February, 2024; originally announced February 2024.

    Comments: Pre-print version of PATCHEXPLAINER

  6. arXiv:2401.03568  [pdf, other]

    cs.AI cs.HC cs.LG

    Agent AI: Surveying the Horizons of Multimodal Interaction

    Authors: Zane Durante, Qiuyuan Huang, Naoki Wake, Ran Gong, Jae Sung Park, Bidipta Sarkar, Rohan Taori, Yusuke Noda, Demetri Terzopoulos, Yejin Choi, Katsushi Ikeuchi, Hoi Vo, Li Fei-Fei, Jianfeng Gao

    Abstract: Multi-modal AI systems will likely become a ubiquitous presence in our everyday lives. A promising approach to making these systems more interactive is to embody them as agents within physical and virtual environments. At present, systems leverage existing foundation models as the basic building blocks for the creation of embodied agents. Embedding agents within such environments facilitates the a…

    Submitted 25 January, 2024; v1 submitted 7 January, 2024; originally announced January 2024.

  7. arXiv:2311.14971  [pdf]

    cs.CV cs.LG q-bio.TO

    Segmentation of diagnostic tissue compartments on whole slide images with renal thrombotic microangiopathies (TMAs)

    Authors: Huy Q. Vo, Pietro A. Cicalese, Surya Seshan, Syed A. Rizvi, Aneesh Vathul, Gloria Bueno, Anibal Pedraza Dorado, Niels Grabe, Katharina Stolle, Francesco Pesce, Joris J. T. H. Roelofs, Jesper Kers, Vitoantonio Bevilacqua, Nicola Altini, Bernd Schröppel, Dario Roccatello, Antonella Barreca, Savino Sciascia, Chandra Mohan, Hien V. Nguyen, Jan U. Becker

    Abstract: The thrombotic microangiopathies (TMAs) manifest in renal biopsy histology with a broad spectrum of acute and chronic findings. Precise diagnostic criteria for a renal biopsy diagnosis of TMA are missing. As a first step towards a machine learning- and computer vision-based analysis of whole slide images from renal biopsies, we trained a segmentation model for the decisive diagnostic kidney tissu…

    Submitted 28 November, 2023; v1 submitted 25 November, 2023; originally announced November 2023.

    Comments: 12 pages, 3 figures

  8. arXiv:2309.09971  [pdf, other]

    cs.AI cs.HC cs.MA

    MindAgent: Emergent Gaming Interaction

    Authors: Ran Gong, Qiuyuan Huang, Xiaojian Ma, Hoi Vo, Zane Durante, Yusuke Noda, Zilong Zheng, Song-Chun Zhu, Demetri Terzopoulos, Li Fei-Fei, Jianfeng Gao

    Abstract: Large Language Models (LLMs) have the capacity of performing complex scheduling in a multi-agent system and can coordinate these agents into completing sophisticated tasks that require extensive collaboration. However, despite the introduction of numerous gaming frameworks, the community has insufficient benchmarks towards building general multi-agents collaboration infrastructure that encompass b…

    Submitted 19 September, 2023; v1 submitted 18 September, 2023; originally announced September 2023.

    Comments: The first three authors contributed equally. 28 pages

  9. arXiv:2309.08225  [pdf, other]

    cs.SE

    Silent Vulnerability-fixing Commit Identification Based on Graph Neural Networks

    Authors: Hieu Dinh Vo, Thanh Trong Vu, Son Nguyen

    Abstract: The growing dependence of software projects on external libraries has generated apprehensions regarding the security of these libraries because of concealed vulnerabilities. Handling these vulnerabilities presents difficulties due to the temporal delay between remediation and public exposure. Furthermore, a substantial fraction of open-source projects covertly address vulnerabilities without any f…

    Submitted 15 September, 2023; originally announced September 2023.

    Comments: arXiv admin note: substantial text overlap with arXiv:2304.08396, arXiv:2309.01971

  10. arXiv:2309.01971  [pdf, other]

    cs.SE

    VFFINDER: A Graph-based Approach for Automated Silent Vulnerability-Fix Identification

    Authors: Son Nguyen, Thanh Trong Vu, Hieu Dinh Vo

    Abstract: The increasing reliance of software projects on third-party libraries has raised concerns about the security of these libraries due to hidden vulnerabilities. Managing these vulnerabilities is challenging due to the time gap between fixes and public disclosures. Moreover, a significant portion of open-source projects silently fix vulnerabilities without disclosure, impacting vulnerability manageme…

    Submitted 5 September, 2023; originally announced September 2023.

    Comments: Accepted by IEEE KSE 2023

  11. arXiv:2308.16262  [pdf, other]

    cs.AI

    Causal Strategic Learning with Competitive Selection

    Authors: Kiet Q. H. Vo, Muneeb Aadil, Siu Lun Chau, Krikamol Muandet

    Abstract: We study the problem of agent selection in causal strategic learning under multiple decision makers and address two key challenges that come with it. Firstly, while much of prior work focuses on studying a fixed pool of agents that remains static regardless of their evaluations, we consider the impact of selection procedure by which agents are not only evaluated, but also selected. When each decis…

    Submitted 3 February, 2024; v1 submitted 30 August, 2023; originally announced August 2023.

    Comments: Added more discussions on assumptions and the algorithm, and expand the Conclusion

  12. arXiv:2308.13735  [pdf, other]

    cs.CV

    MST-compression: Compressing and Accelerating Binary Neural Networks with Minimum Spanning Tree

    Authors: Quang Hieu Vo, Linh-Tam Tran, Sung-Ho Bae, Lok-Won Kim, Choong Seon Hong

    Abstract: Binary neural networks (BNNs) have been widely adopted to reduce the computational cost and memory storage on edge-computing devices by using one-bit representation for activations and weights. However, as neural networks become wider/deeper to improve accuracy and meet practical requirements, the computational burden remains a significant challenge even on the binary version. To address these iss…

    Submitted 25 August, 2023; originally announced August 2023.

    Comments: 11 pages, 9 figures, ICCV 2023

  13. arXiv:2308.06246  [pdf, other]

    cs.HC

    ARGUS: Visualization of AI-Assisted Task Guidance in AR

    Authors: Sonia Castelo, Joao Rulff, Erin McGowan, Bea Steers, Guande Wu, Shaoyu Chen, Iran Roman, Roque Lopez, Ethan Brewer, Chen Zhao, Jing Qian, Kyunghyun Cho, He He, Qi Sun, Huy Vo, Juan Bello, Michael Krone, Claudio Silva

    Abstract: The concept of augmented reality (AR) assistants has captured the human imagination for decades, becoming a staple of modern science fiction. To pursue this goal, it is necessary to develop artificial intelligence (AI)-based methods that simultaneously perceive the 3D environment, reason about physical tasks, and model the performer, all in real-time. Within this framework, a wide variety of senso…

    Submitted 11 August, 2023; originally announced August 2023.

    Comments: 11 pages, 8 figures. This is the author's version of the article that has been accepted for publication in IEEE Transactions on Visualization and Computer Graphics

  14. arXiv:2307.16834  [pdf]

    cs.CV cs.AI cs.LG eess.IV

    Benchmarking Jetson Edge Devices with an End-to-end Video-based Anomaly Detection System

    Authors: Hoang Viet Pham, Thinh Gia Tran, Chuong Dinh Le, An Dinh Le, Hien Bich Vo

    Abstract: Innovative enhancement in embedded system platforms, specifically hardware accelerations, significantly influence the application of deep learning in real-world scenarios. These innovations translate human labor efforts into automated intelligent systems employed in various areas such as autonomous driving, robotics, Internet-of-Things (IoT), and numerous other impactful applications. NVIDIA's Jet…

    Submitted 12 September, 2023; v1 submitted 28 July, 2023; originally announced July 2023.

    Comments: Accepted in Future of Information and Communication Conference (FICC) 2024

  15. arXiv:2306.14726  [pdf, other]

    cs.SE

    Can An Old Fashioned Feature Extraction and A Light-weight Model Improve Vulnerability Type Identification Performance?

    Authors: Hieu Dinh Vo, Son Nguyen

    Abstract: Recent advances in automated vulnerability detection have achieved potential results in helping developers determine vulnerable components. However, after detecting vulnerabilities, investigating to fix vulnerable code is a non-trivial task. In fact, the types of vulnerability, such as buffer overflow or memory corruption, could help developers quickly understand the nature of the weaknesses and l…

    Submitted 26 June, 2023; originally announced June 2023.

  16. arXiv:2306.14418  [pdf, other]

    cs.SE

    Context-Encoded Code Change Representation for Automated Commit Message Generation

    Authors: Thanh Trong Vu, Thanh-Dat Do, Hieu Dinh Vo

    Abstract: Changes in source code are an inevitable part of software development. They are the results of indispensable activities such as fixing bugs or improving functionality. Descriptions for code changes (commit messages) help people better understand the changes. However, due to a lack of motivation and time pressure, writing high-quality commit messages remains reluctantly considered. Several methods…

    Submitted 26 June, 2023; originally announced June 2023.

    Comments: 16 pages

  17. arXiv:2306.06620  [pdf, other]

    cs.SE cs.AI

    ARIST: An Effective API Argument Recommendation Approach

    Authors: Son Nguyen, Cuong Tran Manh, Kien T. Tran, Tan M. Nguyen, Thu-Trang Nguyen, Kien-Tuan Ngo, Hieu Dinh Vo

    Abstract: Learning and remembering to use APIs are difficult. Several techniques have been proposed to assist developers in using APIs. Most existing techniques focus on recommending the right API methods to call, but very few techniques focus on recommending API arguments. In this paper, we propose ARIST, a novel automated argument recommendation approach which suggests arguments by predicting developers'…

    Submitted 11 June, 2023; originally announced June 2023.

  18. arXiv:2304.08396  [pdf, other]

    cs.SE

    Code-centric Learning-based Just-In-Time Vulnerability Detection

    Authors: Son Nguyen, Thu-Trang Nguyen, Thanh Trong Vu, Thanh-Dat Do, Kien-Tuan Ngo, Hieu Dinh Vo

    Abstract: Attacks against computer systems exploiting software vulnerabilities can cause substantial damage to the cyber-infrastructure of our modern society and economy. To minimize the consequences, it is vital to detect and fix vulnerabilities as soon as possible. Just-in-time vulnerability detection (JIT-VD) discovers vulnerability-prone ("dangerous") commits to prevent them from being merged into sourc…

    Submitted 17 April, 2023; originally announced April 2023.

  19. Very high resolution canopy height maps from RGB imagery using self-supervised vision transformer and convolutional decoder trained on Aerial Lidar

    Authors: Jamie Tolan, Hung-I Yang, Ben Nosarzewski, Guillaume Couairon, Huy Vo, John Brandt, Justine Spore, Sayantan Majumdar, Daniel Haziza, Janaki Vamaraju, Theo Moutakanni, Piotr Bojanowski, Tracy Johns, Brian White, Tobias Tiecke, Camille Couprie

    Abstract: Vegetation structure mapping is critical for understanding the global carbon cycle and monitoring nature-based approaches to climate adaptation and mitigation. Repeated measurements of these data allow for the observation of deforestation or degradation of existing forests, natural forest regeneration, and the implementation of sustainable agricultural practices like agroforestry. Assessments of t…

    Submitted 15 December, 2023; v1 submitted 14 April, 2023; originally announced April 2023.

    Journal ref: Remote Sensing of Environment 300, 113888, 2024

  20. arXiv:2304.07193  [pdf, other]

    cs.CV

    DINOv2: Learning Robust Visual Features without Supervision

    Authors: Maxime Oquab, Timothée Darcet, Théo Moutakanni, Huy Vo, Marc Szafraniec, Vasil Khalidov, Pierre Fernandez, Daniel Haziza, Francisco Massa, Alaaeldin El-Nouby, Mahmoud Assran, Nicolas Ballas, Wojciech Galuba, Russell Howes, Po-Yao Huang, Shang-Wen Li, Ishan Misra, Michael Rabbat, Vasu Sharma, Gabriel Synnaeve, Hu Xu, Hervé Jegou, Julien Mairal, Patrick Labatut, Armand Joulin, et al. (1 additional author not shown)

    Abstract: The recent breakthroughs in natural language processing for model pretraining on large quantities of data have opened the way for similar foundation models in computer vision. These models could greatly simplify the use of images in any system by producing all-purpose visual features, i.e., features that work across image distributions and tasks without finetuning. This work shows that existing pr…

    Submitted 2 February, 2024; v1 submitted 14 April, 2023; originally announced April 2023.

  21. arXiv:2304.05731  [pdf, other]

    cs.CV

    SketchANIMAR: Sketch-based 3D Animal Fine-Grained Retrieval

    Authors: Trung-Nghia Le, Tam V. Nguyen, Minh-Quan Le, Trong-Thuan Nguyen, Viet-Tham Huynh, Trong-Le Do, Khanh-Duy Le, Mai-Khiem Tran, Nhat Hoang-Xuan, Thang-Long Nguyen-Ho, Vinh-Tiep Nguyen, Nhat-Quynh Le-Pham, Huu-Phuc Pham, Trong-Vu Hoang, Quang-Binh Nguyen, Trong-Hieu Nguyen-Mau, Tuan-Luc Huynh, Thanh-Danh Le, Ngoc-Linh Nguyen-Ha, Tuong-Vy Truong-Thuy, Truong Hoai Phong, Tuong-Nghiem Diep, Khanh-Duy Ho, Xuan-Hieu Nguyen, Thien-Phuc Tran, et al. (9 additional authors not shown)

    Abstract: The retrieval of 3D objects has gained significant importance in recent years due to its broad range of applications in computer vision, computer graphics, virtual reality, and augmented reality. However, the retrieval of 3D objects presents significant challenges due to the intricate nature of 3D models, which can vary in shape, size, and texture, and have numerous polygons and vertices. To this…

    Submitted 9 August, 2023; v1 submitted 12 April, 2023; originally announced April 2023.

    Comments: Accepted to Computers & Graphics (3DOR 2023, Journal track)

  22. arXiv:2212.14353  [pdf, other]

    cs.DC eess.SP

    Sheaf-theoretic self-filtering network of low-cost sensors for local air quality monitoring: A causal approach

    Authors: Anh-Duy Pham, Chuong Dinh Le, Hoang Viet Pham, Thinh Gia Tran, Dat Thanh Vo, Chau Long Tran, An Dinh Le, Hien Bich Vo

    Abstract: Sheaf theory, which is a complex but powerful tool supported by topological theory, offers more flexibility and precision than traditional graph theory when it comes to modeling relationships between multiple features. In the realm of air quality monitoring, this can be incredibly useful in detecting sudden changes in local dust particle density, which can be difficult to accurately measure using…

    Submitted 29 December, 2022; originally announced December 2022.

  23. arXiv:2212.01761  [pdf]

    physics.soc-ph cs.CV

    A PM2.5 concentration prediction framework with vehicle tracking system: From cause to effect

    Authors: Chuong D. Le, Hoang V. Pham, Duy A. Pham, An D. Le, Hien B. Vo

    Abstract: Air pollution is an emerging problem that needs to be solved especially in developed and developing countries. In Vietnam, air pollution is also a concerning issue in big cities such as Hanoi and Ho Chi Minh cities where air pollution comes mostly from vehicles such as cars and motorbikes. In order to tackle the problem, the paper focuses on developing a solution that can estimate the emitted PM2.…

    Submitted 4 December, 2022; originally announced December 2022.

  24. arXiv:2209.12181  [pdf]

    cs.SE

    Using Multiple Code Representations to Prioritize Static Analysis Warnings

    Authors: Thanh Trong Vu, Hieu Dinh Vo

    Abstract: In order to ensure the quality of software and prevent attacks from hackers on critical systems, static analysis tools are frequently utilized to detect vulnerabilities in the early development phase. However, these tools often report a large number of warnings with a high false-positive rate, which causes many difficulties for developers. In this paper, we introduce VulRG, a novel approach to add…

    Submitted 26 September, 2022; v1 submitted 25 September, 2022; originally announced September 2022.

    Comments: 6 pages, 2 figures, 4 tables

    MSC Class: 68N20 ACM Class: D.2.5

  25. arXiv:2209.02415  [pdf, other]

    cs.CV cs.AI

    Automatic Infectious Disease Classification Analysis with Concept Discovery

    Authors: Elena Sizikova, Joshua Vendrow, Xu Cao, Rachel Grotheer, Jamie Haddock, Lara Kassab, Alona Kryshchenko, Thomas Merkh, R. W. M. A. Madushani, Kenny Moise, Annie Ulichney, Huy V. Vo, Chuntian Wang, Megan Coffee, Kathryn Leonard, Deanna Needell

    Abstract: Automatic infectious disease classification from images can facilitate needed medical diagnoses. Such an approach can identify diseases, like tuberculosis, which remain under-diagnosed due to resource constraints and also novel and emerging diseases, like monkeypox, which clinicians have little experience or acumen in diagnosing. Avoiding missed or delayed diagnoses would prevent further transmiss…

    Submitted 14 November, 2022; v1 submitted 28 August, 2022; originally announced September 2022.

    Comments: Extended Abstract presented at Machine Learning for Health (ML4H) symposium 2022, November 28th, 2022, New Orleans, United States & Virtual, http://www.ml4h.cc, 13 pages

  26. arXiv:2207.12112  [pdf, other]

    cs.CV

    Active Learning Strategies for Weakly-supervised Object Detection

    Authors: Huy V. Vo, Oriane Siméoni, Spyros Gidaris, Andrei Bursuc, Patrick Pérez, Jean Ponce

    Abstract: Object detectors trained with weak annotations are affordable alternatives to fully-supervised counterparts. However, there is still a significant performance gap between them. We propose to narrow this gap by fine-tuning a base pre-trained weakly-supervised detector with a few fully-annotated samples automatically selected from the training set using ``box-in-box'' (BiB), a novel active learning…

    Submitted 25 July, 2022; originally announced July 2022.

    Comments: Accepted to European Conference on Computer Vision (ECCV) 2022. Contains 27 pages, 9 tables and 6 figures

  27. arXiv:2205.05194  [pdf, other]

    cs.CV

    Multiplexed Immunofluorescence Brain Image Analysis Using Self-Supervised Dual-Loss Adaptive Masked Autoencoder

    Authors: Son T. Ly, Bai Lin, Hung Q. Vo, Dragan Maric, Badri Roysam, Hien V. Nguyen

    Abstract: Reliable large-scale cell detection and segmentation is the fundamental first step to understanding biological processes in the brain. The ability to phenotype cells at scale can accelerate preclinical drug evaluation and system-level brain histology studies. The impressive advances in deep learning offer a practical solution to cell image detection and segmentation. Unfortunately, categorizing ce…

    Submitted 30 December, 2022; v1 submitted 10 May, 2022; originally announced May 2022.

    Comments: Adding new results on multiplexed image data and data efficiency. Pytorch code: https://github.com/hula-ai/DAMA

  28. arXiv:2112.14594  [pdf, other]

    physics.soc-ph cs.CY

    The Levy Flight of Cities: Analyzing Social-Economical Trajectories with Auto-Embedding

    Authors: Linfang Tian, Kai Zhao, Jiaming Yin, Huy Vo, Weixiong Rao

    Abstract: It has been found that human mobility exhibits random patterns following the Levy flight, where human movement contains many short flights and some long flights, and these flights follow a power-law distribution. In this paper, we study the social-economical development trajectories of urban cities. We observe that social-economical movement of cities also exhibit the Levy flight characteristics.…

    Submitted 29 December, 2021; originally announced December 2021.

  29. arXiv:2112.13341  [pdf, other]

    cs.CV

    AlertTrap: A study on object detection in remote insects trap monitoring system using on-the-edge deep learning platform

    Authors: An D. Le, Duy A. Pham, Dong T. Pham, Hien B. Vo

    Abstract: Fruit flies are one of the most harmful insect species to fruit yields. In AlertTrap, implementation of SSD architecture with different state-of-the-art backbone feature extractors such as MobileNetV1 and MobileNetV2 appear to be potential solutions for the real-time detection problem. SSD-MobileNetV1 and SSD-MobileNetV2 perform well and result in [email protected] of 0.957 and 1.0 respectively. YOLOv4-tiny…

    Submitted 4 March, 2022; v1 submitted 26 December, 2021; originally announced December 2021.

  30. Predicting Job Titles from Job Descriptions with Multi-label Text Classification

    Authors: Hieu Trung Tran, Hanh Hong Phuc Vo, Son T. Luu

    Abstract: Finding a suitable job and hunting for eligible candidates are important to job seeking and human resource agencies. With the vast information about job descriptions, employees and employers need assistance to automatically detect job titles based on job description texts. In this paper, we propose the multi-label classification approach for predicting relevant job titles from job description text…

    Submitted 9 February, 2022; v1 submitted 21 December, 2021; originally announced December 2021.

    Comments: Published in the 2021 NAFOSTED Conference on Information and Computer Science (NICS 2021)

  31. arXiv:2111.00640  [pdf, other]

    cs.CL

    VSEC: Transformer-based Model for Vietnamese Spelling Correction

    Authors: Dinh-Truong Do, Ha Thanh Nguyen, Thang Ngoc Bui, Dinh Hieu Vo

    Abstract: Spelling error correction is one of topics which have a long history in natural language processing. Although previous studies have achieved remarkable results, challenges still exist. In the Vietnamese language, a state-of-the-art method for the task infers a syllable's context from its adjacent syllables. The method's accuracy can be unsatisfactory, however, because the model may lose the contex…

    Submitted 8 November, 2021; v1 submitted 31 October, 2021; originally announced November 2021.

  32. Ranking Warnings of Static Analysis Tools Using Representation Learning

    Authors: Kien-Tuan Ngo, Dinh-Truong Do, Thu-Trang Nguyen, Hieu Dinh Vo

    Abstract: Static analysis tools are frequently used to detect potential vulnerabilities in software systems. However, an inevitable problem of these tools is their large number of warnings with a high false positive rate, which consumes time and effort for investigating. In this paper, we present DeFP, a novel method for ranking static analysis warnings. Based on the intuition that warnings which have simil…

    Submitted 7 October, 2021; originally announced October 2021.

    Comments: Published in Proceedings of the 28th Asia-Pacific Software Engineering Conference (APSEC'21)

  33. arXiv:2109.14279  [pdf, other]

    cs.CV

    Localizing Objects with Self-Supervised Transformers and no Labels

    Authors: Oriane Siméoni, Gilles Puy, Huy V. Vo, Simon Roburin, Spyros Gidaris, Andrei Bursuc, Patrick Pérez, Renaud Marlet, Jean Ponce

    Abstract: Localizing objects in image collections without supervision can help to avoid expensive annotation campaigns. We propose a simple approach to this problem, that leverages the activation features of a vision transformer pre-trained in a self-supervised manner. Our method, LOST, does not require any external object proposal nor any exploration of the image collection; it operates on a single image.…

    Submitted 29 September, 2021; originally announced September 2021.

    Journal ref: BMVC 2021

  34. A Variability Fault Localization Approach for Software Product Lines

    Authors: Thu-Trang Nguyen, Kien-Tuan Ngo, Son Nguyen, Hieu Dinh Vo

    Abstract: Software fault localization is one of the most expensive, tedious, and time-consuming activities in program debugging. This activity becomes even much more challenging in Software Product Line (SPL) systems due to variability of failures. These unexpected behaviors are induced by variability faults which can only be exposed under some combinations of system features. The interaction among these fe…

    Submitted 21 September, 2021; originally announced September 2021.

    Comments: Published in IEEE Transactions on Software Engineering (Early Access)

  35. arXiv:2109.02917  [pdf, other]

    cs.CV

    Fine-grained Hand Gesture Recognition in Multi-viewpoint Hand Hygiene

    Authors: Huy Q. Vo, Tuong Do, Vi C. Pham, Duy Nguyen, An T. Duong, Quang D. Tran

    Abstract: This paper contributes a new high-quality dataset for hand gesture recognition in hand hygiene systems, named "MFH". Generally, current datasets are not focused on: (i) fine-grained actions; and (ii) data mismatch between different viewpoints, which are available under realistic settings. To address the aforementioned issues, the MFH dataset is proposed to contain a total of 731147 samples obtaine…

    Submitted 7 September, 2021; originally announced September 2021.

    Comments: 6 pages, accepted for oral in IEEE SMC 2021

  36. Multimodal Breast Lesion Classification Using Cross-Attention Deep Networks

    Authors: Hung Q. Vo, Pengyu Yuan, Tiancheng He, Stephen T. C. Wong, Hien V. Nguyen

    Abstract: Accurate breast lesion risk estimation can significantly reduce unnecessary biopsies and help doctors decide optimal treatment plans. Most existing computer-aided systems rely solely on mammogram features to classify breast lesions. While this approach is convenient, it does not fully exploit useful information in clinical reports to achieve the optimal performance. Would clinical features signifi…

    Submitted 21 August, 2021; originally announced August 2021.

  37. Variability Fault Localization: A Benchmark

    Authors: Kien-Tuan Ngo, Thu-Trang Nguyen, Son Nguyen, Hieu Dinh Vo

    Abstract: Software fault localization is one of the most expensive, tedious, and time-consuming activities in program debugging. This activity becomes even much more challenging in Software Product Line (SPL) systems due to the variability of failures in SPL systems. These unexpected behaviors are caused by variability faults which can only be exposed under some combinations of system features. Although loc…

    Submitted 21 September, 2021; v1 submitted 9 July, 2021; originally announced July 2021.

    Comments: Published in Proceedings of the 25th ACM International Systems and Software Product Line Conference - Volume A (SPLC '21)

  38. arXiv:2106.06650  [pdf, other]

    cs.CV

    Large-Scale Unsupervised Object Discovery

    Authors: Huy V. Vo, Elena Sizikova, Cordelia Schmid, Patrick Pérez, Jean Ponce

    Abstract: Existing approaches to unsupervised object discovery (UOD) do not scale up to large datasets without approximations that compromise their performance. We propose a novel formulation of UOD as a ranking problem, amenable to the arsenal of distributed methods available for eigenvalue problems and link analysis. Through the use of self-supervised features, we also demonstrate the first effective full…

    Submitted 16 November, 2021; v1 submitted 11 June, 2021; originally announced June 2021.

    Comments: Accepted to NeurIPS 2021, 19 pages with supplemental materials

  39. Automatically Detecting Cyberbullying Comments on Online Game Forums

    Authors: Hanh Hong-Phuc Vo, Hieu Trung Tran, Son T. Luu

    Abstract: Online game forums are popular to most of game players. They use it to communicate and discuss the strategy of the game, or even to make friends. However, game forums also contain abusive and harassment speech, disturbing and threatening players. Therefore, it is necessary to automatically detect and remove cyberbullying comments to keep the game forum clean and friendly. We use the Cyberbullying…

    Submitted 26 December, 2021; v1 submitted 3 June, 2021; originally announced June 2021.

    Comments: Published in the 2021 RIVF International Conference on Computing and Communication Technologies (RIVF)

  40. arXiv:2105.00994  [pdf, other]

    eess.SY cs.CE physics.soc-ph

    Fleet management for ride-pooling with meeting points at scale: a case study in the five boroughs of New York City

    Authors: Motahare Mounesan, Vindula Jayawardana, Yaocheng Wu, Samitha Samaranayake, Huy T. Vo

    Abstract: Introducing meeting points to ride-pooling (RP) services has been shown to increase the satisfaction level of both riders and service providers. Passengers may choose to walk to a meeting point for a cost reduction. Drivers may also get matched with more riders without making additional stops. There are economic benefits of using ride-pooling with meeting points (RPMP) compared to the traditional…

    Submitted 25 April, 2021; originally announced May 2021.

  41. Single Stage Class Agnostic Common Object Detection: A Simple Baseline

    Authors: Chuong H. Nguyen, Thuy C. Nguyen, Anh H. Vo, Yamazaki Masayuki

    Abstract: This paper addresses the problem of common object detection, which aims to detect objects of similar categories from a set of images. Although it shares some similarities with the standard object detection and co-segmentation, common object detection, recently promoted by \cite{Jiang2019a}, has some unique advantages and challenges. First, it is designed to work on both closed-set and open-set con…

    Submitted 25 April, 2021; originally announced April 2021.

    Comments: This paper is accepted to International Conference on Pattern Recognition Applications and Methods (ICPRAM) 2021

    Report number: ISBN 978-989-758-486-2 ISSN 2184-4313, pages 396-407

  42. arXiv:2007.02662  [pdf, other]

    cs.CV

    Toward unsupervised, multi-object discovery in large-scale image collections

    Authors: Huy V. Vo, Patrick Pérez, Jean Ponce

    Abstract: This paper addresses the problem of discovering the objects present in a collection of images without any supervision. We build on the optimization approach of Vo et al. (CVPR'19) with several key novelties: (1) We propose a novel saliency-based region proposal algorithm that achieves significantly higher overlap with ground-truth objects than other competitive methods. This procedure leverages of…

    Submitted 25 August, 2020; v1 submitted 6 July, 2020; originally announced July 2020.

    Comments: Accepted for publication in European Conference on Computer Vision (ECCV) 2020

  43. AMIC: An Adaptive Information Theoretic Method to Identify Multi-Scale Temporal Correlations in Big Time Series Data -- Accepted Version

    Authors: Nguyen Ho, Huy Vo, Mai Vu, Torben Bach Pedersen

    Abstract: Recent development in computing, sensing and crowd-sourced data have resulted in an explosion in the availability of quantitative information. The possibilities of analyzing this so-called Big Data to inform research and the decision-making process are virtually endless. In general, analyses have to be done across multiple data sets in order to bring out the most value of Big Data. A first importa…

    Submitted 7 July, 2019; v1 submitted 24 June, 2019; originally announced June 2019.

  44. arXiv:1904.03148  [pdf, other]

    cs.CV

    Unsupervised Image Matching and Object Discovery as Optimization

    Authors: Huy V. Vo, Francis Bach, Minsu Cho, Kai Han, Yann LeCun, Patrick Perez, Jean Ponce

    Abstract: Learning with complete or partial supervision is powerful but relies on ever-growing human annotation efforts. As a way to mitigate this serious problem, as well as to serve specific applications, unsupervised learning has emerged as an important field of research. In computer vision, unsupervised learning comes in various guises. We focus here on the unsupervised discovery and matching of object…

    Submitted 5 April, 2019; originally announced April 2019.

    Comments: Accepted to CVPR 2019

  45. arXiv:1904.02062  [pdf]

    cs.SI

    An Ensemble Deep Learning Model for Drug Abuse Detection in Sparse Twitter-Sphere

    Authors: Han Hu, NhatHai Phan, James Geller, Stephen Iezzi, Huy Vo, Dejing Dou, Soon Ae Chun

    Abstract: As the problem of drug abuse intensifies in the U.S., many studies that primarily utilize social media data, such as postings on Twitter, to study drug abuse-related activities use machine learning as a powerful tool for text classification and filtering. However, given the wide range of topics of Twitter users, tweets related to drug abuse are rare in most of the datasets. This imbalanced data re…

    Submitted 3 April, 2019; originally announced April 2019.

    Comments: The 17th World Congress of Medical and Health Informatics [MedInfo 2019]

  46. Structural inpainting

    Authors: Huy V. Vo, Ngoc Q. K. Duong, Patrick Perez

    Abstract: Scene-agnostic visual inpainting remains very challenging despite progress in patch-based methods. Recently, Pathak et al. 2016 have introduced convolutional "context encoders" (CEs) for unsupervised feature learning through image completion tasks. With the additional help of adversarial training, CEs turned out to be a promising tool to complete complex structures in real inpainting problems. In…

    Submitted 27 March, 2018; originally announced March 2018.

  47. arXiv:1306.4411  [pdf, other]

    cs.AI

    Event-Object Reasoning with Curated Knowledge Bases: Deriving Missing Information

    Authors: Chitta Baral, Nguyen H. Vo

    Abstract: The broader goal of our research is to formulate answers to why and how questions with respect to knowledge bases, such as AURA. One issue we face when reasoning with many available knowledge bases is that at times needed information is missing. Examples of this include partially missing information about next sub-event, first sub-event, last sub-event, result of an event, input to an event, desti…

    Submitted 19 June, 2013; v1 submitted 18 June, 2013; originally announced June 2013.

    Comments: 13 pages

  48. arXiv:1207.0140  [pdf, other]

    cs.DB

    LogBase: A Scalable Log-structured Database System in the Cloud

    Authors: Hoang Tam Vo, Sheng Wang, Divyakant Agrawal, Gang Chen, Beng Chin Ooi

    Abstract: Numerous applications such as financial transactions (e.g., stock trading) are write-heavy in nature. The shift from reads to writes in web applications has also been accelerating in recent years. Write-ahead-logging is a common approach for providing recovery capability while improving performance in most storage systems. However, the separation of log and application data incurs write overheads…

    Submitted 30 June, 2012; originally announced July 2012.

    Comments: VLDB2012

    Journal ref: Proceedings of the VLDB Endowment (PVLDB), Vol. 5, No. 10, pp. 1004-1015 (2012)

  49. arXiv:1109.1618  [pdf]

    cs.SI cs.CL physics.soc-ph

    An analysis of Twitter messages in the 2011 Tohoku Earthquake

    Authors: Son Doan, Bao-Khanh Ho Vo, Nigel Collier

    Abstract: Social media such as Facebook and Twitter have proven to be a useful resource to understand public opinion towards real world events. In this paper, we investigate over 1.5 million Twitter messages (tweets) for the period 9th March 2011 to 31st May 2011 in order to track awareness and anxiety levels in the Tokyo metropolitan district to the 2011 Tohoku Earthquake and subsequent tsunami and nuclear…

    Submitted 7 September, 2011; originally announced September 2011.

    Comments: 9 pages, 4 figures, eHealth 2011 conference, Malaga (Spain) (accepted)

    Journal ref: Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering, 2012, Volume 91, Part 4, 58-66