Showing 1–50 of 113 results for author: Tran, A

Searching in archive cs.
  1. arXiv:2407.00337 [pdf, other]

    cs.CE cs.LG math.NA

    WgLaSDI: Weak-Form Greedy Latent Space Dynamics Identification

    Authors: Xiaolong He, April Tran, David M. Bortz, Youngsoo Choi

    Abstract: The parametric greedy latent space dynamics identification (gLaSDI) framework has demonstrated promising potential for accurate and efficient modeling of high-dimensional nonlinear physical systems. However, it remains challenging to handle noisy data. To enhance robustness against noise, we incorporate the weak-form estimation of nonlinear dynamics (WENDy) into gLaSDI. In the proposed weak-form g…

    Submitted 29 June, 2024; originally announced July 2024.

  2. arXiv:2406.19753 [pdf, other]

    cs.LG

    Backdoor Attack in Prompt-Based Continual Learning

    Authors: Trang Nguyen, Anh Tran, Nhat Ho

    Abstract: Prompt-based approaches offer a cutting-edge solution to data privacy issues in continual learning, particularly in scenarios involving multiple data suppliers where long-term storage of private user data is prohibited. Despite delivering state-of-the-art performance, its impressive remembering capability can become a double-edged sword, raising security concerns as it might inadvertently retain p…

    Submitted 28 June, 2024; originally announced June 2024.

  3. arXiv:2405.16204 [pdf, other]

    cs.CV cs.AI cs.GR

    VOODOO XP: Expressive One-Shot Head Reenactment for VR Telepresence

    Authors: Phong Tran, Egor Zakharov, Long-Nhat Ho, Liwen Hu, Adilbek Karmanov, Aviral Agarwal, McLean Goldwhite, Ariana Bermudez Venegas, Anh Tuan Tran, Hao Li

    Abstract: We introduce VOODOO XP: a 3D-aware one-shot head reenactment method that can generate highly expressive facial expressions from any input driver video and a single 2D portrait. Our solution is real-time, view-consistent, and can be instantly used without calibration or fine-tuning. We demonstrate our solution on a monocular video setting and an end-to-end VR telepresence system for two-way communi…

    Submitted 28 May, 2024; v1 submitted 25 May, 2024; originally announced May 2024.

  4. arXiv:2404.07122 [pdf, other]

    cs.CV

    Driver Attention Tracking and Analysis

    Authors: Dat Viet Thanh Nguyen, Anh Tran, Hoai Nam Vu, Cuong Pham, Minh Hoai

    Abstract: We propose a novel method to estimate a driver's points-of-gaze using a pair of ordinary cameras mounted on the windshield and dashboard of a car. This is a challenging problem due to the dynamics of traffic environments with 3D scenes of unknown depths. This problem is further complicated by the volatile distance between the driver and the camera system. To tackle these challenges, we develop a n…

    Submitted 11 April, 2024; v1 submitted 10 April, 2024; originally announced April 2024.

  5. arXiv:2403.18871 [pdf]

    cs.CV cs.AI cs.LG

    Clinical Domain Knowledge-Derived Template Improves Post Hoc AI Explanations in Pneumothorax Classification

    Authors: Han Yuan, Chuan Hong, Pengtao Jiang, Gangming Zhao, Nguyen Tuan Anh Tran, Xinxing Xu, Yet Yen Yan, Nan Liu

    Abstract: Background: Pneumothorax is an acute thoracic disease caused by abnormal air collection between the lungs and chest wall. To address the opaqueness often associated with deep learning (DL) models, explainable artificial intelligence (XAI) methods have been introduced to outline regions related to pneumothorax diagnoses made by DL models. However, these explanations sometimes diverge from actual le…

    Submitted 26 March, 2024; originally announced March 2024.

  6. arXiv:2403.18605 [pdf, other]

    cs.CV

    FlexEdit: Flexible and Controllable Diffusion-based Object-centric Image Editing

    Authors: Trong-Tung Nguyen, Duc-Anh Nguyen, Anh Tran, Cuong Pham

    Abstract: Our work addresses limitations seen in previous approaches for object-centric editing problems, such as unrealistic results due to shape discrepancies and limited control in object replacement or insertion. To this end, we introduce FlexEdit, a flexible and controllable editing framework for objects where we iteratively adjust latents at each denoising step using our FlexEdit block. Initially, we…

    Submitted 27 March, 2024; v1 submitted 27 March, 2024; originally announced March 2024.

    Comments: Our project page: https://flex-edit.github.io/

  7. arXiv:2403.16205 [pdf, other]

    cs.CV

    Blur2Blur: Blur Conversion for Unsupervised Image Deblurring on Unknown Domains

    Authors: Bang-Dang Pham, Phong Tran, Anh Tran, Cuong Pham, Rang Nguyen, Minh Hoai

    Abstract: This paper presents an innovative framework designed to train an image deblurring algorithm tailored to a specific camera device. This algorithm works by transforming a blurry input image, which is challenging to deblur, into another blurry image that is more amenable to deblurring. The transformation process, from one blurry state to another, leverages unpaired data consisting of sharp and blurry…

    Submitted 24 March, 2024; originally announced March 2024.

    Comments: Accepted to CVPR 2024

  8. arXiv:2403.10748 [pdf, other]

    cs.CE cs.LG cs.MS math.NA

    A Comprehensive Review of Latent Space Dynamics Identification Algorithms for Intrusive and Non-Intrusive Reduced-Order-Modeling

    Authors: Christophe Bonneville, Xiaolong He, April Tran, Jun Sur Park, William Fries, Daniel A. Messenger, Siu Wun Cheung, Yeonjong Shin, David M. Bortz, Debojyoti Ghosh, Jiun-Shyan Chen, Jonathan Belof, Youngsoo Choi

    Abstract: Numerical solvers of partial differential equations (PDEs) have been widely employed for simulating physical systems. However, the computational cost remains a major bottleneck in various scientific and engineering applications, which has motivated the development of reduced-order models (ROMs). Recently, machine-learning-based ROMs have gained significant popularity and are promising for addressi…

    Submitted 15 March, 2024; originally announced March 2024.

  9. arXiv:2403.07371 [pdf, other]

    cs.CV

    Time-Efficient and Identity-Consistent Virtual Try-On Using A Variant of Altered Diffusion Models

    Authors: Phuong Dam, Jihoon Jeong, Anh Tran, Daeyoung Kim

    Abstract: This study discusses the critical issues of Virtual Try-On in contemporary e-commerce and the prospective metaverse, emphasizing the challenges of preserving intricate texture details and distinctive features of the target person and the clothes in various scenarios, such as clothing texture and identity characteristics like tattoos or accessories. In addition to the fidelity of the synthesized im…

    Submitted 25 March, 2024; v1 submitted 12 March, 2024; originally announced March 2024.

  10. arXiv:2402.15321 [pdf, other]

    cs.CV cs.AI cs.LG

    OpenSUN3D: 1st Workshop Challenge on Open-Vocabulary 3D Scene Understanding

    Authors: Francis Engelmann, Ayca Takmaz, Jonas Schult, Elisabetta Fedele, Johanna Wald, Songyou Peng, Xi Wang, Or Litany, Siyu Tang, Federico Tombari, Marc Pollefeys, Leonidas Guibas, Hongbo Tian, Chunjie Wang, Xiaosheng Yan, Bingwen Wang, Xuanyang Zhang, Xiao Liu, Phuc Nguyen, Khoi Nguyen, Anh Tran, Cuong Pham, Zhening Huang, Xiaoyang Wu, Xi Chen, et al. (3 additional authors not shown)

    Abstract: This report provides an overview of the challenge hosted at the OpenSUN3D Workshop on Open-Vocabulary 3D Scene Understanding held in conjunction with ICCV 2023. The goal of this workshop series is to provide a platform for exploration and discussion of open-vocabulary 3D scene understanding tasks, including but not limited to segmentation, detection and mapping. We provide an overview of the chall…

    Submitted 17 March, 2024; v1 submitted 23 February, 2024; originally announced February 2024.

    Comments: Our OpenSUN3D workshop website for ICCV 2023: https://opensun3d.github.io/index_iccv23.html

  11. Situating Data Sets: Making Public Data Actionable for Housing Justice

    Authors: Anh-Ton Tran, Grace Guo, Jordan Taylor, Katsuki Chan, Elora Raymond, Carl DiSalvo

    Abstract: Activists, governments, and academics regularly advocate for more open data. But how is data made open, and for whom is it made useful and usable? In this paper, we investigate and describe the work of making eviction data open to tenant organizers. We do this through an ethnographic description of ongoing work with a local housing activist organization. This work combines observation, direct part…

    Submitted 19 February, 2024; originally announced February 2024.

    Comments: 16 pages including references, 4 figures, 1 table, ACM CHI 2024

  12. Cruising Queer HCI on the DL: A Literature Review of LGBTQ+ People in HCI

    Authors: Jordan Taylor, Ellen Simpson, Anh-Ton Tran, Jed Brubaker, Sarah Fox, Haiyi Zhu

    Abstract: LGBTQ+ people have received increased attention in HCI research, paralleling a greater emphasis on social justice in recent years. However, there has not been a systematic review of how LGBTQ+ people are researched or discussed in HCI. In this work, we review all research mentioning LGBTQ+ people across the HCI venues of CHI, CSCW, DIS, and TOCHI. Since 2014, we find a linear growth in the number…

    Submitted 12 February, 2024; originally announced February 2024.

    Journal ref: Proceedings of the 2024 CHI Conference on Human Factors in Computing Systems (CHI '24)

  13. arXiv:2312.17205 [pdf, other]

    cs.CV

    EFHQ: Multi-purpose ExtremePose-Face-HQ dataset

    Authors: Trung Tuan Dao, Duc Hong Vu, Cuong Pham, Anh Tran

    Abstract: The existing facial datasets, while having plentiful images at near frontal views, lack images with extreme head poses, leading to the downgraded performance of deep learning models when dealing with profile or pitched faces. This work aims to address this gap by introducing a novel dataset named Extreme Pose Face High-Quality Dataset (EFHQ), which includes a maximum of 450k high-quality images of…

    Submitted 11 April, 2024; v1 submitted 28 December, 2023; originally announced December 2023.

    Comments: Project Page: https://bomcon123456.github.io/efhq/

  14. arXiv:2312.10671 [pdf, other]

    cs.CV

    Open3DIS: Open-Vocabulary 3D Instance Segmentation with 2D Mask Guidance

    Authors: Phuc D. A. Nguyen, Tuan Duc Ngo, Evangelos Kalogerakis, Chuang Gan, Anh Tran, Cuong Pham, Khoi Nguyen

    Abstract: We introduce Open3DIS, a novel solution designed to tackle the problem of Open-Vocabulary Instance Segmentation within 3D scenes. Objects within 3D environments exhibit diverse shapes, scales, and colors, making precise instance-level identification a challenging task. Recent advancements in Open-Vocabulary scene understanding have made significant strides in this area by employing class-agnostic…

    Submitted 5 April, 2024; v1 submitted 17 December, 2023; originally announced December 2023.

    Comments: CVPR 2024. Project page: https://open3dis.github.io/

  15. arXiv:2312.05239 [pdf, other]

    cs.CV

    SwiftBrush: One-Step Text-to-Image Diffusion Model with Variational Score Distillation

    Authors: Thuan Hoang Nguyen, Anh Tran

    Abstract: Despite their ability to generate high-resolution and diverse images from text prompts, text-to-image diffusion models often suffer from slow iterative sampling processes. Model distillation is one of the most effective directions to accelerate these models. However, previous distillation methods fail to retain the generation quality while requiring a significant amount of images for training, eit…

    Submitted 14 April, 2024; v1 submitted 8 December, 2023; originally announced December 2023.

    Comments: Accepted to CVPR 2024; Project Page: https://thuanz123.github.io/swiftbrush/

  16. arXiv:2312.04651 [pdf, other]

    cs.CV

    VOODOO 3D: Volumetric Portrait Disentanglement for One-Shot 3D Head Reenactment

    Authors: Phong Tran, Egor Zakharov, Long-Nhat Ho, Anh Tuan Tran, Liwen Hu, Hao Li

    Abstract: We present a 3D-aware one-shot head reenactment method based on a fully volumetric neural disentanglement framework for source appearance and driver expressions. Our method is real-time and produces high-fidelity and view-consistent output, suitable for 3D teleconferencing systems based on holographic displays. Existing cutting-edge 3D-aware reenactment methods often use neural radiance fields or…

    Submitted 7 December, 2023; originally announced December 2023.

  17. arXiv:2312.03419 [pdf, other]

    cs.CR

    Synthesizing Physical Backdoor Datasets: An Automated Framework Leveraging Deep Generative Models

    Authors: Sze Jue Yang, Chinh D. La, Quang H. Nguyen, Kok-Seng Wong, Anh Tuan Tran, Chee Seng Chan, Khoa D. Doan

    Abstract: Backdoor attacks, representing an emerging threat to the integrity of deep neural networks, have garnered significant attention due to their ability to compromise deep learning systems clandestinely. While numerous backdoor attacks occur within the digital realm, their practical implementation in real-world prediction systems remains limited and vulnerable to disturbances in the physical world. Co…

    Submitted 15 March, 2024; v1 submitted 6 December, 2023; originally announced December 2023.

  18. arXiv:2312.01284 [pdf, other]

    cs.CV

    Stable Messenger: Steganography for Message-Concealed Image Generation

    Authors: Quang Nguyen, Truong Vu, Cuong Pham, Anh Tran, Khoi Nguyen

    Abstract: In the ever-expanding digital landscape, safeguarding sensitive information remains paramount. This paper delves deep into digital protection, specifically focusing on steganography. While prior research predominantly fixated on individual bit decoding, we address this limitation by introducing "message accuracy", a novel metric evaluating the entirety of decoded messages for a more holistic eva…

    Submitted 3 December, 2023; originally announced December 2023.

  19. arXiv:2312.00656 [pdf, other]

    cs.LG cs.AI stat.ML

    Simple Transferability Estimation for Regression Tasks

    Authors: Cuong N. Nguyen, Phong Tran, Lam Si Tung Ho, Vu Dinh, Anh T. Tran, Tal Hassner, Cuong V. Nguyen

    Abstract: We consider transferability estimation, the problem of estimating how well deep learning models transfer from a source to a target task. We focus on regression tasks, which received little previous attention, and propose two simple and computationally efficient approaches that estimate transferability based on the negative regularized mean squared error of a linear regression model. We prove novel…

    Submitted 3 December, 2023; v1 submitted 1 December, 2023; originally announced December 2023.

    Comments: Paper published at The 39th Conference on Uncertainty in Artificial Intelligence (UAI) 2023

  20. arXiv:2311.17101 [pdf, other]

    cs.CV

    Robust Diffusion GAN using Semi-Unbalanced Optimal Transport

    Authors: Quan Dao, Binh Ta, Tung Pham, Anh Tran

    Abstract: Diffusion models, a type of generative model, have demonstrated great potential for synthesizing highly detailed images. By integrating with GAN, advanced diffusion models like DDGAN (Xiao et al., 2022) could approach real-time performance for expansive practical applications. While DDGAN has effectively addressed the challenges of generative modeling, namely producing high-quality samples, cove…

    Submitted 28 November, 2023; originally announced November 2023.

  21. arXiv:2311.16017 [pdf, other]

    cs.HC cs.AI

    Decoding Logic Errors: A Comparative Study on Bug Detection by Students and Large Language Models

    Authors: Stephen MacNeil, Paul Denny, Andrew Tran, Juho Leinonen, Seth Bernstein, Arto Hellas, Sami Sarsa, Joanne Kim

    Abstract: Identifying and resolving logic errors can be one of the most frustrating challenges for novice programmers. Unlike syntax errors, for which a compiler or interpreter can issue a message, logic errors can be subtle. In certain conditions, buggy code may even exhibit correct behavior -- in other cases, the issue might be about how a problem statement has been interpreted. Such errors can be hard t…

    Submitted 27 November, 2023; originally announced November 2023.

  22. arXiv:2311.15213 [pdf]

    eess.IV cs.CV cs.LG

    Leveraging Anatomical Constraints with Uncertainty for Pneumothorax Segmentation

    Authors: Han Yuan, Chuan Hong, Nguyen Tuan Anh Tran, Xinxing Xu, Nan Liu

    Abstract: Pneumothorax is a medical emergency caused by abnormal accumulation of air in the pleural space - the potential space between the lungs and chest wall. On 2D chest radiographs, pneumothorax occurs within the thoracic cavity and outside of the mediastinum and we refer to this area as "lung+ space". While deep learning (DL) has increasingly been utilized to segment pneumothorax lesions in chest radi…

    Submitted 26 November, 2023; originally announced November 2023.

  23. arXiv:2311.12880 [pdf, other]

    eess.SY cs.CE cs.LG math.NA

    Weak-Form Latent Space Dynamics Identification

    Authors: April Tran, Xiaolong He, Daniel A. Messenger, Youngsoo Choi, David M. Bortz

    Abstract: Recent work in data-driven modeling has demonstrated that a weak formulation of model equations enhances the noise robustness of a wide range of computational methods. In this paper, we demonstrate the power of the weak form to enhance the LaSDI (Latent Space Dynamics Identification) algorithm, a recently developed data-driven reduced order modeling technique. We introduce a weak form-based vers…

    Submitted 20 November, 2023; originally announced November 2023.

    Comments: 21 pages, 15 figures

  24. arXiv:2311.10219 [pdf, other]

    cs.SI

    Measuring Moral Dimensions in Social Media with Mformer

    Authors: Tuan Dung Nguyen, Ziyu Chen, Nicholas George Carroll, Alasdair Tran, Colin Klein, Lexing Xie

    Abstract: The ever-growing textual records of contemporary social issues, often discussed online with moral rhetoric, present both an opportunity and a challenge for studying how moral concerns are debated in real life. Moral foundations theory is a taxonomy of intuitions widely used in data-driven analyses of online content, but current computational tools to detect moral foundations suffer from the incomp…

    Submitted 19 April, 2024; v1 submitted 16 November, 2023; originally announced November 2023.

    Comments: To be published in ICWSM 2024

  25. arXiv:2310.20057 [pdf, other]

    cs.CV

    SolarFormer: Multi-scale Transformer for Solar PV Profiling

    Authors: Adrian de Luis, Minh Tran, Taisei Hanyu, Anh Tran, Liao Haitao, Roy McCann, Alan Mantooth, Ying Huang, Ngan Le

    Abstract: As climate change intensifies, the global imperative to shift towards sustainable energy sources becomes more pronounced. Photovoltaic (PV) energy is a favored choice due to its reliability and ease of installation. Accurate mapping of PV installations is crucial for understanding their adoption and informing energy policy. To meet this need, we introduce the SolarFormer, designed to segment solar…

    Submitted 30 October, 2023; originally announced October 2023.

    Comments: Pre-print

  26. arXiv:2309.14303 [pdf, other]

    cs.CV

    Dataset Diffusion: Diffusion-based Synthetic Dataset Generation for Pixel-Level Semantic Segmentation

    Authors: Quang Nguyen, Truong Vu, Anh Tran, Khoi Nguyen

    Abstract: Preparing training data for deep vision models is a labor-intensive task. To address this, generative models have emerged as an effective solution for generating synthetic data. While current generative models produce image-level category labels, we propose a novel method for generating pixel-level semantic segmentation labels using the text-to-image generative model Stable Diffusion (SD). By util…

    Submitted 13 November, 2023; v1 submitted 25 September, 2023; originally announced September 2023.

    Comments: Accepted to NeurIPS 2023. Our project page: https://dataset-diffusion.github.io/

  27. arXiv:2309.05381 [pdf, other]

    cs.SE cs.AI

    Hazards in Deep Learning Testing: Prevalence, Impact and Recommendations

    Authors: Salah Ghamizi, Maxime Cordy, Yuejun Guo, Mike Papadakis, Yves Le Traon

    Abstract: Much research on Machine Learning testing relies on empirical studies that evaluate and show their potential. However, in this context empirical results are sensitive to a number of parameters that can adversely impact the results of the experiments and potentially lead to wrong conclusions (Type I errors, i.e., incorrectly rejecting the Null Hypothesis). To this end, we survey the related literat…

    Submitted 11 September, 2023; originally announced September 2023.

  28. arXiv:2309.01078 [pdf, other]

    cs.CV cs.AI

    UnsMOT: Unified Framework for Unsupervised Multi-Object Tracking with Geometric Topology Guidance

    Authors: Son Tran, Cong Tran, Anh Tran, Cuong Pham

    Abstract: Object detection has long been a topic of high interest in computer vision literature. Motivated by the fact that annotating data for the multi-object tracking (MOT) problem is immensely expensive, recent studies have turned their attention to the unsupervised learning setting. In this paper, we push forward the state-of-the-art performance of unsupervised MOT methods by proposing UnsMOT, a novel…

    Submitted 3 September, 2023; originally announced September 2023.

  29. A Text-based Approach For Link Prediction on Wikipedia Articles

    Authors: Anh Hoang Tran, Tam Minh Nguyen, Son T. Luu

    Abstract: This paper presents our work in the DSAA 2023 Challenge about Link Prediction for Wikipedia Articles. We use traditional machine learning models with part-of-speech (POS) tag features extracted from text to train the classification model for predicting whether two nodes have a link. Then, we use these tags to test on various machine learning models. We obtained the results by F1 score at 0.9…

    Submitted 6 November, 2023; v1 submitted 1 September, 2023; originally announced September 2023.

    Comments: Accepted by DSAA 2023 Conference in the DSAA Student Competition Section

  30. arXiv:2307.08698 [pdf, other]

    cs.CV cs.LG

    Flow Matching in Latent Space

    Authors: Quan Dao, Hao Phung, Binh Nguyen, Anh Tran

    Abstract: Flow matching is a recent framework to train generative models that exhibits impressive empirical performance while being relatively easier to train compared with diffusion-based models. Despite its advantageous properties, prior methods still face the challenges of expensive computing and a large number of function evaluations of off-the-shelf solvers in the pixel space. Furthermore, although lat…

    Submitted 17 July, 2023; originally announced July 2023.

    Comments: Project Page: https://vinairesearch.github.io/LFM/

  31. Beyond Geo-localization: Fine-grained Orientation of Street-view Images by Cross-view Matching with Satellite Imagery with Supplementary Materials

    Authors: Wenmiao Hu, Yichen Zhang, Yuxuan Liang, Yifang Yin, Andrei Georgescu, An Tran, Hannes Kruppa, See-Kiong Ng, Roger Zimmermann

    Abstract: Street-view imagery provides us with novel experiences to explore different places remotely. Carefully calibrated street-view images (e.g. Google Street View) can be used for different downstream tasks, e.g. navigation, map features extraction. As personal high-quality cameras have become much more affordable and portable, an enormous amount of crowdsourced street-view images are uploaded to the i…

    Submitted 13 July, 2023; v1 submitted 7 July, 2023; originally announced July 2023.

    Comments: This paper has been accepted by ACM Multimedia 2022. This version contains additional supplementary materials

    ACM Class: I.4.9; I.4.8

    Journal ref: Proceedings of the 30th ACM International Conference on Multimedia (2022) 6155-6164

  32. arXiv:2307.01142 [pdf, other]

    cs.HC

    Prompt Middleware: Mapping Prompts for Large Language Models to UI Affordances

    Authors: Stephen MacNeil, Andrew Tran, Joanne Kim, Ziheng Huang, Seth Bernstein, Dan Mogil

    Abstract: To help users do complex work, researchers have developed techniques to integrate AI and human intelligence into user interfaces (UIs). With the recent introduction of large language models (LLMs), which can generate text in response to a natural language prompt, there are new opportunities to consider how to integrate LLMs into UIs. We present Prompt Middleware, a framework for generating prompts…

    Submitted 3 July, 2023; originally announced July 2023.

  33. arXiv:2305.11286 [pdf, ps, other]

    cs.DC cs.DS

    Improved and Partially-Tight Lower Bounds for Message-Passing Implementations of Multiplicity Queues

    Authors: Anh Tran, Edward Talmage

    Abstract: A multiplicity queue is a concurrently-defined data type which relaxes the conditions of a linearizable FIFO queue to allow concurrent Dequeue instances to return the same value. It would seem that this should allow faster implementations, as processes should not need to wait as long to learn about concurrent operations at remote processes and previous work has shown that multiplicity queues are c…

    Submitted 18 May, 2023; originally announced May 2023.

    Comments: 17 pages, 0 figures. Full version of Brief Announcement: Improved, Partially-Tight Multiplicity Queue Lower Bounds (Accepted to ACM PODC). Full version of submission to the International Symposium on Distributed Computing (DISC).

    ACM Class: F.2.0; C.2.4; E.1

  34. arXiv:2305.05001 [pdf, other]

    cs.CL

    GersteinLab at MEDIQA-Chat 2023: Clinical Note Summarization from Doctor-Patient Conversations through Fine-tuning and In-context Learning

    Authors: Xiangru Tang, Andrew Tran, Jeffrey Tan, Mark Gerstein

    Abstract: This paper presents our contribution to the MEDIQA-2023 Dialogue2Note shared task, encompassing both subtask A and subtask B. We approach the task as a dialogue summarization problem and implement two distinct pipelines: (a) a fine-tuning of a pre-trained dialogue summarization model and GPT-3, and (b) few-shot in-context learning (ICL) using a large language model, GPT-4. Both methods achieve exc…

    Submitted 8 May, 2023; originally announced May 2023.

  35. Uncertainty Quantification in Machine Learning for Engineering Design and Health Prognostics: A Tutorial

    Authors: Venkat Nemani, Luca Biggio, Xun Huan, Zhen Hu, Olga Fink, Anh Tran, Yan Wang, Xiaoge Zhang, Chao Hu

    Abstract: On top of machine learning models, uncertainty quantification (UQ) functions as an essential layer of safety assurance that could lead to more principled decision making by enabling sound risk assessment and management. The safety and reliability improvement of ML models empowered by UQ has the potential to significantly facilitate the broad adoption of ML solutions in high-stakes decision setting…

    Submitted 19 September, 2023; v1 submitted 6 May, 2023; originally announced May 2023.

    Journal ref: Mechanical Systems and Signal Processing 205 (2023) 110796

  36. arXiv:2305.03088 [pdf, other]

    cs.CL cs.AI

    Modeling What-to-ask and How-to-ask for Answer-unaware Conversational Question Generation

    Authors: Xuan Long Do, Bowei Zou, Shafiq Joty, Anh Tai Tran, Liangming Pan, Nancy F. Chen, Ai Ti Aw

    Abstract: Conversational Question Generation (CQG) is a critical task for machines to assist humans in fulfilling their information needs through conversations. The task is generally cast into two different settings: answer-aware and answer-unaware. While the former facilitates the models by exposing the expected answer, the latter is more realistic and has received growing attention recently. What-to-ask and…

    Submitted 4 May, 2023; originally announced May 2023.

    Comments: 17 pages, ACL 2023

  37. arXiv:2304.08252 [pdf, other]

    cs.RO

    PaaS: Planning as a Service for reactive driving in CARLA Leaderboard

    Authors: Nhat Hao Truong, Huu Thien Mai, Tuan Anh Tran, Minh Quang Tran, Duc Duy Nguyen, Ngoc Viet Phuong Pham

    Abstract: End-to-end deep learning approaches have been proven to be efficient in autonomous driving and robotics. Because they use deep learning techniques for decision-making, those systems are often referred to as black boxes, and the result is driven by data. In this paper, we propose PaaS (Planning as a Service), a vanilla module to generate local trajectory planning for autonomous driving in CARLA simulation.…

    Submitted 14 June, 2023; v1 submitted 17 April, 2023; originally announced April 2023.

    Comments: Accepted on 05.06.2023, revised on 15.06.2023; to be published at ICSSE 2023

  38. arXiv:2304.03938 [pdf, other]

    cs.CY cs.AI cs.CL cs.HC cs.SE

    Comparing Code Explanations Created by Students and Large Language Models

    Authors: Juho Leinonen, Paul Denny, Stephen MacNeil, Sami Sarsa, Seth Bernstein, Joanne Kim, Andrew Tran, Arto Hellas

    Abstract: Reasoning about code and explaining its purpose are fundamental skills for computer scientists. There has been extensive research in the field of computing education on the relationship between a student's ability to explain code and other skills such as writing and tracing code. In particular, the ability to describe at a high level of abstraction how code will behave over all possible inputs cor…

    Submitted 8 April, 2023; originally announced April 2023.

    Comments: 8 pages, 3 figures. To be published in Proceedings of the 2023 Conference on Innovation and Technology in Computer Science Education V. 1

  39. arXiv:2304.01686 [pdf, other]

    cs.CV cs.AI

    HyperCUT: Video Sequence from a Single Blurry Image using Unsupervised Ordering

    Authors: Bang-Dang Pham, Phong Tran, Anh Tran, Cuong Pham, Rang Nguyen, Minh Hoai

    Abstract: We consider the challenging task of training models for image-to-video deblurring, which aims to recover a sequence of sharp images corresponding to a given blurry image input. A critical issue disturbing the training of an image-to-video model is the ambiguity of the frame ordering since both the forward and backward sequences are plausible solutions. This paper proposes an effective self-supervi…

    Submitted 5 April, 2023; v1 submitted 4 April, 2023; originally announced April 2023.

    Comments: Accepted to CVPR 2023

  40. arXiv:2303.15433 [pdf, other]

    cs.CV cs.CR cs.LG

    Anti-DreamBooth: Protecting users from personalized text-to-image synthesis

    Authors: Thanh Van Le, Hao Phung, Thuan Hoang Nguyen, Quan Dao, Ngoc Tran, Anh Tran

    Abstract: Text-to-image diffusion models are nothing but a revolution, allowing anyone, even without design skills, to create realistic images from simple text inputs. With powerful personalization tools like DreamBooth, they can generate images of a specific person just by learning from his/her few reference images. However, when misused, such a powerful and convenient tool can produce fake news or disturb…

    Submitted 19 October, 2023; v1 submitted 27 March, 2023; originally announced March 2023.

    Comments: Accepted to ICCV 2023. Project page: https://anti-dreambooth.github.io/

  41. arXiv:2303.14157 [pdf, other]

    cs.CV cs.LG

    Efficient Scale-Invariant Generator with Column-Row Entangled Pixel Synthesis

    Authors: Thuan Hoang Nguyen, Thanh Van Le, Anh Tran

    Abstract: Any-scale image synthesis offers an efficient and scalable solution to synthesize photo-realistic images at any scale, even going beyond 2K resolution. However, existing GAN-based solutions depend excessively on convolutions and a hierarchical architecture, which introduce inconsistency and the "texture sticking" issue when scaling the output resolution. From another perspective, INR-based ge…

    Submitted 25 April, 2023; v1 submitted 24 March, 2023; originally announced March 2023.

    Comments: Accepted to CVPR 2023; Project Page: https://thuanz123.github.io/creps/

  42. arXiv:2303.09115  [pdf, other]

    cs.CV

    Learning for Amalgamation: A Multi-Source Transfer Learning Framework For Sentiment Classification

    Authors: Cuong V. Nguyen, Khiem H. Le, Anh M. Tran, Quang H. Pham, Binh T. Nguyen

    Abstract: Transfer learning plays an essential role in Deep Learning, which can remarkably improve the performance of the target domain, whose training data is not sufficient. Our work explores beyond the common practice of transfer learning with a single pre-trained model. We focus on the task of Vietnamese sentiment classification and propose LIFA, a framework to learn a unified embedding from several pre…

    Submitted 16 March, 2023; originally announced March 2023.

    Comments: Information Sciences

  43. arXiv:2303.07618  [pdf, other]

    cs.CV

    Medical Phrase Grounding with Region-Phrase Context Contrastive Alignment

    Authors: Zhihao Chen, Yang Zhou, Anh Tran, Junting Zhao, Liang Wan, Gideon Ooi, Lionel Cheng, Choon Hua Thng, Xinxing Xu, Yong Liu, Huazhu Fu

    Abstract: Medical phrase grounding (MPG) aims to locate the most relevant region in a medical image, given a phrase query describing certain medical findings, which is an important task for medical image analysis and radiological diagnosis. However, existing visual grounding methods rely on general visual features for identifying objects in natural images and are not capable of capturing the subtle and spec…

    Submitted 13 March, 2023; originally announced March 2023.

  44. arXiv:2302.12133  [pdf, other]

    cs.MM

    Practical Analyses of How Common Social Media Platforms and Photo Storage Services Handle Uploaded Images

    Authors: Duc-Tien Dang-Nguyen, Vegard Velle Sjøen, Dinh-Hai Le, Thien-Phu Dao, Anh-Duy Tran, Minh-Triet Tran

    Abstract: The research done in this study has delved deeply into the changes made to digital images that are uploaded to three of the major social media platforms and image storage services in today's society: Facebook, Flickr, and Google Photos. In addition to providing up-to-date data on an ever-changing landscape of different social media networks' digital fingerprints, a deep analysis of the social netw…

    Submitted 23 February, 2023; originally announced February 2023.

  45. arXiv:2212.11050  [pdf]

    cs.CY

    CNN waste classification project report

    Authors: Fei Wu, LiQin Zhang, An Tran

    Abstract: This report is about waste management project. We used CNN as classifier to classify waste image captured from mobile phone. Our model can identify 6 waste classes with highly accurate and our model is successfully transferred into IOS platform as application by swift. In addition, this report also introduced some basic project management from planning project to landing project, for instance usin…

    Submitted 21 December, 2022; originally announced December 2022.

    Comments: 23 pages,11 figures

  46. Marginal-Certainty-aware Fair Ranking Algorithm

    Authors: Tao Yang, Zhichao Xu, Zhenduo Wang, Anh Tran, Qingyao Ai

    Abstract: Ranking systems are ubiquitous in modern Internet services, including online marketplaces, social media, and search engines. Traditionally, ranking systems only focus on how to get better relevance estimation. When relevance estimation is available, they usually adopt a user-centric optimization strategy where ranked lists are generated by sorting items according to their estimated relevance. Howe…

    Submitted 18 December, 2022; originally announced December 2022.

    Comments: 10 pages, 5 figures

    ACM Class: H.3.3

  47. Automatically Generating CS Learning Materials with Large Language Models

    Authors: Stephen MacNeil, Andrew Tran, Juho Leinonen, Paul Denny, Joanne Kim, Arto Hellas, Seth Bernstein, Sami Sarsa

    Abstract: Recent breakthroughs in Large Language Models (LLMs), such as GPT-3 and Codex, now enable software developers to generate code based on a natural language prompt. Within computer science education, researchers are exploring the potential for LLMs to generate code explanations and programming assignments using carefully crafted prompts. These advances may enable students to interact with code in ne…

    Submitted 9 December, 2022; originally announced December 2022.

    Comments: In Proceedings of the 54th ACM Technical Symposium on Computing Science Education

  48. arXiv:2212.00981  [pdf, other]

    cs.CV cs.AI

    QC-StyleGAN -- Quality Controllable Image Generation and Manipulation

    Authors: Dat Viet Thanh Nguyen, Phong Tran The, Tan M. Dinh, Cuong Pham, Anh Tuan Tran

    Abstract: The introduction of high-quality image generation models, particularly the StyleGAN family, provides a powerful tool to synthesize and manipulate images. However, existing models are built upon high-quality (HQ) data as desired outputs, making them unfit for in-the-wild low-quality (LQ) images, which are common inputs for manipulation. In this work, we bridge this gap by proposing a novel GAN stru…

    Submitted 7 December, 2022; v1 submitted 2 December, 2022; originally announced December 2022.

    Comments: Accepted to NeurIPS 2022; The code is available at https://github.com/VinAIResearch/QC-StyleGAN

  49. arXiv:2211.16152  [pdf, other]

    cs.CV eess.IV

    Wavelet Diffusion Models are fast and scalable Image Generators

    Authors: Hao Phung, Quan Dao, Anh Tran

    Abstract: Diffusion models are rising as a powerful solution for high-fidelity image generation, which exceeds GANs in quality in many circumstances. However, their slow training and inference speed is a huge bottleneck, blocking them from being used in real-time applications. A recent DiffusionGAN method significantly decreases the models' running time by reducing the number of sampling steps from thousand…

    Submitted 22 March, 2023; v1 submitted 29 November, 2022; originally announced November 2022.

    Comments: Accepted to CVPR 2023

  50. arXiv:2211.05199  [pdf]

    cs.HC physics.ed-ph

    An Open, Multi-Platform Software Architecture for Online Education in the Metaverse

    Authors: S. Lombeyda, S. G. Djorgovski, A. Tran, J. Liu, A. Noyes, S. Fomina

    Abstract: Use of online platforms for education is a vibrant and growing arena, incorporating a variety of software platforms and technologies, including various modalities of extended reality. We present our Enhanced Reality Teaching Concierge, an open networking hub architected to enable efficient and easy connectivity between a wide variety of services or applications to a wide variety of clients, design…

    Submitted 9 November, 2022; originally announced November 2022.

    Comments: 4 pages, 3 figures; In The 27th International Conference on 3D Web Technology (Web3D 2022). ACM, New York, NY, USA

    ACM Class: H.5; I.3.7; K.3