Showing 1–50 of 314 results for author: Hoang, T

Searching in archive cs.
  1. arXiv:2406.18518  [pdf, other]

    cs.CL cs.AI cs.LG cs.SE

    APIGen: Automated Pipeline for Generating Verifiable and Diverse Function-Calling Datasets

    Authors: Zuxin Liu, Thai Hoang, Jianguo Zhang, Ming Zhu, Tian Lan, Shirley Kokane, Juntao Tan, Weiran Yao, Zhiwei Liu, Yihao Feng, Rithesh Murthy, Liangwei Yang, Silvio Savarese, Juan Carlos Niebles, Huan Wang, Shelby Heinecke, Caiming Xiong

    Abstract: The advancement of function-calling agent models requires diverse, reliable, and high-quality datasets. This paper presents APIGen, an automated data generation pipeline designed to synthesize verifiable high-quality datasets for function-calling applications. We leverage APIGen and collect 3,673 executable APIs across 21 different categories to generate diverse function-calling datasets in a scal…

    Submitted 26 June, 2024; originally announced June 2024.

  2. arXiv:2406.15877  [pdf, other]

    cs.SE cs.AI cs.CL

    BigCodeBench: Benchmarking Code Generation with Diverse Function Calls and Complex Instructions

    Authors: Terry Yue Zhuo, Minh Chien Vu, Jenny Chim, Han Hu, Wenhao Yu, Ratnadira Widyasari, Imam Nur Bani Yusuf, Haolan Zhan, Junda He, Indraneil Paul, Simon Brunner, Chen Gong, Thong Hoang, Armel Randy Zebaze, Xiaoheng Hong, Wen-Ding Li, Jean Kaddour, Ming Xu, Zhihan Zhang, Prateek Yadav, Naman Jain, Alex Gu, Zhoujun Cheng, Jiawei Liu, Qian Liu, et al. (8 additional authors not shown)

    Abstract: Automated software engineering has been greatly empowered by the recent advances in Large Language Models (LLMs) for programming. While current benchmarks have shown that LLMs can perform various software engineering tasks like human developers, the majority of their evaluations are limited to short and self-contained algorithmic tasks. Solving challenging and practical programming tasks requires…

    Submitted 26 June, 2024; v1 submitted 22 June, 2024; originally announced June 2024.

    Comments: 44 pages, 14 figures, 7 tables, built with love by the BigCode community :)

  3. arXiv:2406.06518  [pdf, other]

    cs.LG

    Data Augmentation for Multivariate Time Series Classification: An Experimental Study

    Authors: Romain Ilbert, Thai V. Hoang, Zonghua Zhang

    Abstract: Our study investigates the impact of data augmentation on the performance of multivariate time series models, focusing on datasets from the UCR archive. Despite the limited size of these datasets, we achieved classification accuracy improvements in 10 out of 13 datasets using the Rocket and InceptionTime models. This highlights the essential role of sufficient data in training effective models, pa…

    Submitted 10 June, 2024; originally announced June 2024.

    Comments: Workshop on Multivariate Time Series Analytics (MulTiSA), ICDE Workshop

  4. arXiv:2405.15506  [pdf, other]

    cs.LG

    Learning to Discretize Denoising Diffusion ODEs

    Authors: Vinh Tong, Anji Liu, Trung-Dung Hoang, Guy Van den Broeck, Mathias Niepert

    Abstract: Diffusion Probabilistic Models (DPMs) are powerful generative models showing competitive performance in various domains, including image synthesis and 3D point cloud generation. However, sampling from pre-trained DPMs involves multiple neural function evaluations (NFE) to transform Gaussian noise samples into images, resulting in higher computational costs compared to single-step generative models…

    Submitted 24 May, 2024; originally announced May 2024.

  5. arXiv:2405.11895  [pdf, other]

    cs.LG eess.SY

    Sparse Attention-driven Quality Prediction for Production Process Optimization in Digital Twins

    Authors: Yanlei Yin, Lihua Wang, Wenbo Wang, Dinh Thai Hoang

    Abstract: In the process industry, optimizing production lines for long-term efficiency requires real-time monitoring and analysis of operation states to fine-tune production line parameters. However, the complexity in operational logic and the intricate coupling of production process parameters make it difficult to develop an accurate mathematical model for the entire process, thus hindering the deployment…

    Submitted 20 May, 2024; originally announced May 2024.

  6. arXiv:2405.05349  [pdf, other]

    cs.LG cs.AI

    Offline Model-Based Optimization via Policy-Guided Gradient Search

    Authors: Yassine Chemingui, Aryan Deshwal, Trong Nghia Hoang, Janardhan Rao Doppa

    Abstract: Offline optimization is an emerging problem in many experimental engineering domains including protein, drug or aircraft design, where online experimentation to collect evaluation data is too expensive or dangerous. To avoid that, one has to optimize an unknown function given only its offline evaluation at a fixed set of inputs. A naive solution to this problem is to learn a surrogate model of the…

    Submitted 8 May, 2024; originally announced May 2024.

    Comments: Published at AAAI Conference on Artificial Intelligence, 2024

  7. arXiv:2404.18199  [pdf, other]

    eess.IV cs.CV

    Rethinking Attention Gated with Hybrid Dual Pyramid Transformer-CNN for Generalized Segmentation in Medical Imaging

    Authors: Fares Bougourzi, Fadi Dornaika, Abdelmalik Taleb-Ahmed, Vinh Truong Hoang

    Abstract: Inspired by the success of Transformers in Computer vision, Transformers have been widely investigated for medical imaging segmentation. However, most of Transformer architecture are using the recent transformer architectures as encoder or as parallel encoder with the CNN encoder. In this paper, we introduce a novel hybrid CNN-Transformer segmentation architecture (PAG-TransYnet) designed for effi…

    Submitted 28 April, 2024; originally announced April 2024.

  8. arXiv:2404.06257  [pdf, other]

    cs.NI

    DDPG-E2E: A Novel Policy Gradient Approach for End-to-End Communication Systems

    Authors: Bolun Zhang, Nguyen Van Huynh, Dinh Thai Hoang, Diep N. Nguyen, Quoc-Viet Pham

    Abstract: The End-to-end (E2E) learning-based approach has great potential to reshape the existing communication systems by replacing the transceivers with deep neural networks. To this end, the E2E learning approach needs to assume the availability of prior channel information to mathematically formulate a differentiable channel layer for the backpropagation (BP) of the error gradients, thereby jointly opt…

    Submitted 9 April, 2024; originally announced April 2024.

  9. arXiv:2404.01676  [pdf, other]

    cs.LG

    Incentives in Private Collaborative Machine Learning

    Authors: Rachael Hwee Ling Sim, Yehong Zhang, Trong Nghia Hoang, Xinyi Xu, Bryan Kian Hsiang Low, Patrick Jaillet

    Abstract: Collaborative machine learning involves training models on data from multiple parties but must incentivize their participation. Existing data valuation methods fairly value and reward each party based on shared data or model parameters but neglect the privacy risks involved. To address this, we introduce differential privacy (DP) as an incentive. Each party can select its required DP guarantee and…

    Submitted 2 April, 2024; originally announced April 2024.

    Comments: Accepted to NeurIPS 2023

  10. arXiv:2403.15917  [pdf, other]

    cs.SE

    Who Uses Personas in Requirements Engineering: The Practitioners' Perspective

    Authors: Yi Wang, Chetan Arora, Xiao Liu, Thuong Hoang, Vasudha Malhotra, Ben Cheng, John Grundy

    Abstract: Personas are commonly used in software projects to gain a better understanding of end-users' needs. However, there is a limited understanding of their usage and effectiveness in practice. This paper presents the results of a two-step investigation, comprising interviews with 26 software developers, UI/UX designers, business analysts and product managers and a survey of 203 practitioners, aimed at…

    Submitted 23 March, 2024; originally announced March 2024.

  11. arXiv:2403.15511  [pdf, other]

    cs.LG cs.AI cs.CR

    Multiple-Input Auto-Encoder Guided Feature Selection for IoT Intrusion Detection Systems

    Authors: Phai Vu Dinh, Diep N. Nguyen, Dinh Thai Hoang, Quang Uy Nguyen, Eryk Dutkiewicz, Son Pham Bao

    Abstract: While intrusion detection systems (IDSs) benefit from the diversity and generalization of IoT data features, the data diversity (e.g., the heterogeneity and high dimensions of data) also makes it difficult to train effective machine learning models in IoT IDSs. This also leads to potentially redundant/noisy features that may decrease the accuracy of the detection engine in IDSs. This paper first i…

    Submitted 21 March, 2024; originally announced March 2024.

  12. arXiv:2403.14697  [pdf]

    cs.CY cs.AI cs.SE

    An AIC-based approach for articulating unpredictable problems in open complex environments

    Authors: Haider AL-Shareefy, Michael Butler, Thai Son Hoang

    Abstract: This research paper presents an approach to enhancing the predictive capability of architects in the design and assurance of systems, focusing on systems operating in dynamic and unpredictable environments. By adopting a systems approach, we aim to improve architects' predictive capabilities in designing dependable systems (for example, ML-based systems). An aerospace case study is used to illustr…

    Submitted 15 March, 2024; originally announced March 2024.

    Comments: S. Bernardi, T. Zoppi (Editors), "Fast Abstracts and Student Forum Proceedings - EDCC 2024 - 19th European Dependable Computing Conference, Leuven, Belgium, 8-11 April 2024"

  13. arXiv:2403.08876  [pdf, other]

    cs.CV

    ARtVista: Gateway To Empower Anyone Into Artist

    Authors: Trong-Vu Hoang, Quang-Binh Nguyen, Duy-Nam Ly, Khanh-Duy Le, Tam V. Nguyen, Minh-Triet Tran, Trung-Nghia Le

    Abstract: Drawing is an art that enables people to express their imagination and emotions. However, individuals usually face challenges in drawing, especially when translating conceptual ideas into visually coherent representations and bridging the gap between mental visualization and practical execution. In response, we propose ARtVista - a novel system integrating AR and generative AI technologies. ARtVis…

    Submitted 13 March, 2024; originally announced March 2024.

    Comments: CHI 2024

  14. arXiv:2403.07763  [pdf, other]

    cs.NI cs.ET

    Emerging Technologies for 6G Non-Terrestrial-Networks: From Academia to Industrial Applications

    Authors: Cong T. Nguyen, Yuris Mulya Saputra, Nguyen Van Huynh, Tan N. Nguyen, Dinh Thai Hoang, Diep N Nguyen, Van-Quan Pham, Miroslav Voznak, Symeon Chatzinotas, Dinh-Hieu Tran

    Abstract: Terrestrial networks form the fundamental infrastructure of modern communication systems, serving more than 4 billion users globally. However, terrestrial networks are facing a wide range of challenges, from coverage and reliability to interference and congestion. As the demands of the 6G era are expected to be much higher, it is crucial to address these challenges to ensure a robust and efficient…

    Submitted 12 March, 2024; originally announced March 2024.

    Comments: 26 pages

  15. arXiv:2402.18062  [pdf, other]

    cs.RO cs.AI

    Generative AI for Unmanned Vehicle Swarms: Challenges, Applications and Opportunities

    Authors: Guangyuan Liu, Nguyen Van Huynh, Hongyang Du, Dinh Thai Hoang, Dusit Niyato, Kun Zhu, Jiawen Kang, Zehui Xiong, Abbas Jamalipour, Dong In Kim

    Abstract: With recent advances in artificial intelligence (AI) and robotics, unmanned vehicle swarms have received great attention from both academia and industry due to their potential to provide services that are difficult and dangerous to perform by humans. However, learning and coordinating movements and actions for a large number of unmanned vehicles in complex and dynamic environments introduce signif…

    Submitted 28 February, 2024; originally announced February 2024.

    Comments: 23 pages

  16. arXiv:2402.15506  [pdf, other]

    cs.AI cs.CL cs.LG

    AgentOhana: Design Unified Data and Training Pipeline for Effective Agent Learning

    Authors: Jianguo Zhang, Tian Lan, Rithesh Murthy, Zhiwei Liu, Weiran Yao, Juntao Tan, Thai Hoang, Liangwei Yang, Yihao Feng, Zuxin Liu, Tulika Awalgaonkar, Juan Carlos Niebles, Silvio Savarese, Shelby Heinecke, Huan Wang, Caiming Xiong

    Abstract: Autonomous agents powered by large language models (LLMs) have garnered significant research attention. However, fully harnessing the potential of LLMs for agent-based tasks presents inherent challenges due to the heterogeneous nature of diverse data sources featuring multi-turn trajectories. In this paper, we introduce AgentOhana as a comprehensive solution to address these challenges. …

    Submitted 20 March, 2024; v1 submitted 23 February, 2024; originally announced February 2024.

    Comments: Add GitHub repo link at https://github.com/SalesforceAIResearch/xLAM and HuggingFace model link at https://huggingface.co/Salesforce/xLAM-v0.1-r

  17. arXiv:2402.14544  [pdf, other]

    cs.CR cs.SE

    A New Hope: Contextual Privacy Policies for Mobile Applications and An Approach Toward Automated Generation

    Authors: Shidong Pan, Zhen Tao, Thong Hoang, Dawen Zhang, Tianshi Li, Zhenchang Xing, Sherry Xu, Mark Staples, Thierry Rakotoarivelo, David Lo

    Abstract: Privacy policies have emerged as the predominant approach to conveying privacy notices to mobile application users. In an effort to enhance both readability and user engagement, the concept of contextual privacy policies (CPPs) has been proposed by researchers. The aim of CPPs is to fragment privacy policies into concise snippets, displaying them only within the corresponding contexts within the a…

    Submitted 10 March, 2024; v1 submitted 22 February, 2024; originally announced February 2024.

    Comments: USENIX Security 2024. arXiv admin note: text overlap with arXiv:2307.01691

  18. arXiv:2402.14305  [pdf, other]

    cs.IR cs.LG

    Towards Efficient Pareto-optimal Utility-Fairness between Groups in Repeated Rankings

    Authors: Phuong Dinh Mai, Duc-Trong Le, Tuan-Anh Hoang, Dung D. Le

    Abstract: In this paper, we tackle the problem of computing a sequence of rankings with the guarantee of the Pareto-optimal balance between (1) maximizing the utility of the consumers and (2) minimizing unfairness between producers of the items. Such a multi-objective optimization problem is typically solved using a combination of a scalarization method and linear programming on bi-stochastic matrices, repr…

    Submitted 22 February, 2024; originally announced February 2024.

  19. arXiv:2402.13549  [pdf, ps, other]

    cs.IT eess.SY

    Q-learning-based Joint Design of Adaptive Modulation and Precoding for Physical Layer Security in Visible Light Communications

    Authors: Duc M. T. Hoang, Thanh V. Pham, Anh T. Pham, Chuyen T Nguyen

    Abstract: There has been an increasing interest in physical layer security (PLS), which, compared with conventional cryptography, offers a unique approach to guaranteeing information confidentiality against eavesdroppers. In this paper, we study a joint design of adaptive $M$-ary pulse amplitude modulation (PAM) and precoding, which aims to optimize wiretap visible-light channels' secrecy capacity and bit e…

    Submitted 21 February, 2024; originally announced February 2024.

  20. arXiv:2402.02319  [pdf]

    cs.RO

    Smart Textile-Driven Soft Spine Exosuit for Lifting Tasks in Industrial Applications

    Authors: Kefan Zhu, Bibhu Sharma, Phuoc Thien Phan, James Davies, Mai Thanh Thai, Trung Thien Hoang, Chi Cong Nguyen, Adrienne Ji, Emanuele Nicotra, Nigel H. Lovell, Thanh Nho Do

    Abstract: Work related musculoskeletal disorders (WMSDs) are often caused by repetitive lifting, making them a significant concern in occupational health. Although wearable assist devices have become the norm for mitigating the risk of back pain, most spinal assist devices still possess a partially rigid structure that impacts the user comfort and flexibility. This paper addresses this issue by presenting a…

    Submitted 3 February, 2024; originally announced February 2024.

    Comments: 6 pages, 7 figures

  21. arXiv:2402.00238  [pdf, other]

    cs.LG eess.IV q-bio.QM

    CNN-FL for Biotechnology Industry Empowered by Internet-of-BioNano Things and Digital Twins

    Authors: Mohammad Jamshidi, Dinh Thai Hoang, Diep N. Nguyen

    Abstract: Digital twins (DTs) are revolutionizing the biotechnology industry by enabling sophisticated digital representations of biological assets, microorganisms, drug development processes, and digital health applications. However, digital twinning at micro and nano scales, particularly in modeling complex entities like bacteria, presents significant challenges in terms of requiring advanced Internet of…

    Submitted 31 January, 2024; originally announced February 2024.

  22. Effective Multi-Stage Training Model For Edge Computing Devices In Intrusion Detection

    Authors: Thua Huynh Trong, Thanh Nguyen Hoang

    Abstract: Intrusion detection poses a significant challenge within expansive and persistently interconnected environments. As malicious code continues to advance and sophisticated attack methodologies proliferate, various advanced deep learning-based detection approaches have been proposed. Nevertheless, the complexity and accuracy of intrusion detection models still need further enhancement to render them…

    Submitted 30 January, 2024; originally announced January 2024.

  23. arXiv:2401.16176  [pdf, other]

    cs.LG cs.AI

    A Survey on Structure-Preserving Graph Transformers

    Authors: Van Thuy Hoang, O-Joun Lee

    Abstract: The transformer architecture has shown remarkable success in various domains, such as natural language processing and computer vision. When it comes to graph learning, transformers are required not only to capture the interactions between pairs of nodes but also to preserve graph structures connoting the underlying relations and proximity between them, showing the expressive power to capture diffe…

    Submitted 29 January, 2024; originally announced January 2024.

    Comments: 12

  24. arXiv:2401.15625  [pdf, other]

    cs.CR cs.AI

    Generative AI-enabled Blockchain Networks: Fundamentals, Applications, and Case Study

    Authors: Cong T. Nguyen, Yinqiu Liu, Hongyang Du, Dinh Thai Hoang, Dusit Niyato, Diep N. Nguyen, Shiwen Mao

    Abstract: Generative Artificial Intelligence (GAI) has recently emerged as a promising solution to address critical challenges of blockchain technology, including scalability, security, privacy, and interoperability. In this paper, we first introduce GAI techniques, outline their applications, and discuss existing solutions for integrating GAI into blockchains. Then, we discuss emerging solutions that demon…

    Submitted 28 January, 2024; originally announced January 2024.

  25. BugsInPy: A Database of Existing Bugs in Python Programs to Enable Controlled Testing and Debugging Studies

    Authors: Ratnadira Widyasari, Sheng Qin Sim, Camellia Lok, Haodi Qi, Jack Phan, Qijin Tay, Constance Tan, Fiona Wee, Jodie Ethelda Tan, Yuheng Yieh, Brian Goh, Ferdian Thung, Hong Jin Kang, Thong Hoang, David Lo, Eng Lieh Ouh

    Abstract: The 2019 edition of Stack Overflow developer survey highlights that, for the first time, Python outperformed Java in terms of popularity. The gap between Python and Java further widened in the 2020 edition of the survey. Unfortunately, despite the rapid increase in Python's popularity, there are not many testing and debugging tools that are designed for Python. This is in stark contrast with the a…

    Submitted 27 January, 2024; originally announced January 2024.

    Journal ref: Proceedings of the 28th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering (2020) 1556-1560

  26. arXiv:2401.14420  [pdf, other]

    cs.CR

    A Novel Blockchain Based Information Management Framework for Web 3.0

    Authors: Md Arif Hassan, Cong T. Nguyen, Chi-Hieu Nguyen, Dinh Thai Hoang, Diep N. Nguyen, Eryk Dutkiewicz

    Abstract: Web 3.0 is the third generation of the World Wide Web (WWW), concentrating on the critical concepts of decentralization, availability, and increasing client usability. Although Web 3.0 is undoubtedly an essential component of the future Internet, it currently faces critical challenges, including decentralized data collection and management. To overcome these challenges, blockchain has emerged as o…

    Submitted 23 January, 2024; originally announced January 2024.

  27. arXiv:2401.10901  [pdf, other]

    cs.CY

    Enabling Technologies for Web 3.0: A Comprehensive Survey

    Authors: Md Arif Hassan, Mohammad Behdad Jamshidi, Bui Duc Manh, Nam H. Chu, Chi-Hieu Nguyen, Nguyen Quang Hieu, Cong T. Nguyen, Dinh Thai Hoang, Diep N. Nguyen, Nguyen Van Huynh, Mohammad Abu Alsheikh, Eryk Dutkiewicz

    Abstract: Web 3.0 represents the next stage of Internet evolution, aiming to empower users with increased autonomy, efficiency, quality, security, and privacy. This evolution can potentially democratize content access by utilizing the latest developments in enabling technologies. In this paper, we conduct an in-depth survey of enabling technologies in the context of Web 3.0, such as blockchain, semantic web…

    Submitted 29 December, 2023; originally announced January 2024.

  28. Improving Graph Convolutional Networks with Transformer Layer in social-based items recommendation

    Authors: Thi Linh Hoang, Tuan Dung Pham, Viet Cuong Ta

    Abstract: In this work, we have proposed an approach for improving the GCN for predicting ratings in social networks. Our model is expanded from the standard model with several layers of transformer architecture. The main focus of the paper is on the encoder architecture for node embedding in the network. Using the embedding layer from the graph-based convolution layer, the attention mechanism could rearran…

    Submitted 12 January, 2024; originally announced January 2024.

  29. arXiv:2401.05712  [pdf, other]

    cs.DB

    BOD: Blindly Optimal Data Discovery

    Authors: Thomas Hoang

    Abstract: Combining discovery and augmentation is important in the era of data usage when it comes to predicting the outcome of tasks. However, having to ask the user the utility function to discover the goal to achieve the optimal small rightful dataset is not an optimal solution. The existing solutions do not make good use of this combination, hence underutilizing the data. In this paper, we introduce a n…

    Submitted 12 January, 2024; v1 submitted 11 January, 2024; originally announced January 2024.

    Comments: Still working on other parts

  30. arXiv:2401.04468  [pdf, other]

    cs.CV cs.AI

    MagicVideo-V2: Multi-Stage High-Aesthetic Video Generation

    Authors: Weimin Wang, Jiawei Liu, Zhijie Lin, Jiangqiao Yan, Shuo Chen, Chetwin Low, Tuyen Hoang, Jie Wu, Jun Hao Liew, Hanshu Yan, Daquan Zhou, Jiashi Feng

    Abstract: The growing demand for high-fidelity video generation from textual descriptions has catalyzed significant research in this field. In this work, we introduce MagicVideo-V2 that integrates the text-to-image model, video motion generator, reference image embedding module and frame interpolation module into an end-to-end video generation pipeline. Benefiting from these architecture designs, MagicVideo…

    Submitted 9 January, 2024; originally announced January 2024.

  31. arXiv:2401.03748  [pdf, other]

    cs.LG cs.CR cs.DC cs.IR

    Towards Efficient Communication and Secure Federated Recommendation System via Low-rank Training

    Authors: Ngoc-Hieu Nguyen, Tuan-Anh Nguyen, Tuan Nguyen, Vu Tien Hoang, Dung D. Le, Kok-Seng Wong

    Abstract: Federated Recommendation (FedRec) systems have emerged as a solution to safeguard users' data in response to growing regulatory concerns. However, one of the major challenges in these systems lies in the communication costs that arise from the need to transmit neural network models between user devices and a central server. Prior approaches to these challenges often lead to issues such as computat…

    Submitted 28 February, 2024; v1 submitted 8 January, 2024; originally announced January 2024.

    Comments: 12 pages, 6 figures, 4 tables

  32. arXiv:2312.16788  [pdf, other]

    cs.LG cs.AI cs.SI

    Mitigating Degree Biases in Message Passing Mechanism by Utilizing Community Structures

    Authors: Van Thuy Hoang, O-Joun Lee

    Abstract: This study utilizes community structures to address node degree biases in message-passing (MP) via learnable graph augmentations and novel graph transformers. Recent augmentation-based methods showed that MP neural networks often perform poorly on low-degree nodes, leading to degree biases due to a lack of messages reaching low-degree nodes. Despite their success, most methods use heuristic or uni…

    Submitted 27 December, 2023; originally announced December 2023.

    Comments: 11

  33. arXiv:2312.08701  [pdf, other]

    cs.DC

    Enabling End-to-End Secure Federated Learning in Biomedical Research on Heterogeneous Computing Environments with APPFLx

    Authors: Trung-Hieu Hoang, Jordan Fuhrman, Ravi Madduri, Miao Li, Pranshu Chaturvedi, Zilinghan Li, Kibaek Kim, Minseok Ryu, Ryan Chard, E. A. Huerta, Maryellen Giger

    Abstract: Facilitating large-scale, cross-institutional collaboration in biomedical machine learning projects requires a trustworthy and resilient federated learning (FL) environment to ensure that sensitive information such as protected health information is kept confidential. In this work, we introduce APPFLx, a low-code FL framework that enables the easy setup, configuration, and running of FL experiment…

    Submitted 14 December, 2023; originally announced December 2023.

  34. arXiv:2312.07011  [pdf, ps, other]

    cs.IT eess.SP

    Securing MIMO Wiretap Channel with Learning-Based Friendly Jamming under Imperfect CSI

    Authors: Bui Minh Tuan, Diep N. Nguyen, Nguyen Linh Trung, Van-Dinh Nguyen, Nguyen Van Huynh, Dinh Thai Hoang, Marwan Krunz, Eryk Dutkiewicz

    Abstract: Wireless communications are particularly vulnerable to eavesdropping attacks due to their broadcast nature. To effectively deal with eavesdroppers, existing security techniques usually require accurate channel state information (CSI), e.g., for friendly jamming (FJ), and/or additional computing resources at transceivers, e.g., cryptography-based solutions, which unfortunately may not be feasible i…

    Submitted 12 December, 2023; originally announced December 2023.

    Comments: 12 pages, 15 figures

  35. arXiv:2312.06797  [pdf, other]

    cs.CV

    Improving the Robustness of 3D Human Pose Estimation: A Benchmark and Learning from Noisy Input

    Authors: Trung-Hieu Hoang, Mona Zehni, Huy Phan, Duc Minh Vo, Minh N. Do

    Abstract: Despite the promising performance of current 3D human pose estimation techniques, understanding and enhancing their generalization on challenging in-the-wild videos remain an open problem. In this work, we focus on the robustness of 2D-to-3D pose lifters. To this end, we develop two benchmark datasets, namely Human3.6M-C and HumanEva-I-C, to examine the robustness of video-based 3D pose lifters to…

    Submitted 15 April, 2024; v1 submitted 11 December, 2023; originally announced December 2023.

  36. arXiv:2312.05594  [pdf, other]

    cs.NI cs.AI

    Generative AI for Physical Layer Communications: A Survey

    Authors: Nguyen Van Huynh, Jiacheng Wang, Hongyang Du, Dinh Thai Hoang, Dusit Niyato, Diep N. Nguyen, Dong In Kim, Khaled B. Letaief

    Abstract: The recent evolution of generative artificial intelligence (GAI) leads to the emergence of groundbreaking applications such as ChatGPT, which not only enhances the efficiency of digital content production, such as text, audio, video, or even network traffic data, but also enriches its diversity. Beyond digital content creation, GAI's capability in analyzing complex data distributions offers great…

    Submitted 9 December, 2023; originally announced December 2023.

  37. arXiv:2312.04095  [pdf, other]

    cs.LG cs.CV

    Learn to Unlearn for Deep Neural Networks: Minimizing Unlearning Interference with Gradient Projection

    Authors: Tuan Hoang, Santu Rana, Sunil Gupta, Svetha Venkatesh

    Abstract: Recent data-privacy laws have sparked interest in machine unlearning, which involves removing the effect of specific training samples from a learnt model as if they were never present in the original training dataset. The challenge of machine unlearning is to discard information about the "forget" data in the learnt model without altering the knowledge about the remaining dataset and to do so mo…

    Submitted 7 December, 2023; originally announced December 2023.

    Comments: Accepted to WACV 2024

  38. arXiv:2312.02490  [pdf, other]

    cs.LG cs.CR

    Constrained Twin Variational Auto-Encoder for Intrusion Detection in IoT Systems

    Authors: Phai Vu Dinh, Quang Uy Nguyen, Dinh Thai Hoang, Diep N. Nguyen, Son Pham Bao, Eryk Dutkiewicz

    Abstract: Intrusion detection systems (IDSs) play a critical role in protecting billions of IoT devices from malicious attacks. However, the IDSs for IoT devices face inherent challenges of IoT systems, including the heterogeneity of IoT data/devices, the high dimensionality of training data, and the imbalanced data. Moreover, the deployment of IDSs on IoT systems is challenging, and sometimes impossible, d…

    Submitted 4 December, 2023; originally announced December 2023.

  39. arXiv:2311.18252  [pdf, other]

    cs.SE cs.AI cs.CY cs.LG

    Navigating Privacy and Copyright Challenges Across the Data Lifecycle of Generative AI

    Authors: Dawen Zhang, Boming Xia, Yue Liu, Xiwei Xu, Thong Hoang, Zhenchang Xing, Mark Staples, Qinghua Lu, Liming Zhu

    Abstract: The advent of Generative AI has marked a significant milestone in artificial intelligence, demonstrating remarkable capabilities in generating realistic images, texts, and data patterns. However, these advancements come with heightened concerns over data privacy and copyright infringement, primarily due to the reliance on vast datasets for model training. Traditional approaches like differential p…

    Submitted 10 January, 2024; v1 submitted 30 November, 2023; originally announced November 2023.

    Comments: Accepted by 2024 IEEE/ACM 3rd International Conference on AI Engineering - Software Engineering for AI (CAIN)

  40. arXiv:2311.18193  [pdf, other]

    cs.CV

    Persistent Test-time Adaptation in Episodic Testing Scenarios

    Authors: Trung-Hieu Hoang, Duc Minh Vo, Minh N. Do

    Abstract: Current test-time adaptation (TTA) approaches aim to adapt to environments that change continuously. Yet, when the environments not only change but also recur in a correlated manner over time, such as in the case of day-night surveillance cameras, it is unclear whether the adaptability of these methods is sustained after a long run. This study aims to examine the error accumulation of TTA models w…

    Submitted 16 January, 2024; v1 submitted 29 November, 2023; originally announced November 2023.

  41. arXiv:2311.09790  [pdf, other]

    cs.LG cs.AI cs.CR

    Breaking Boundaries: Balancing Performance and Robustness in Deep Wireless Traffic Forecasting

    Authors: Romain Ilbert, Thai V. Hoang, Zonghua Zhang, Themis Palpanas

    Abstract: Balancing the trade-off between accuracy and robustness is a long-standing challenge in time series forecasting. While most of existing robust algorithms have achieved certain suboptimal performance on clean data, sustaining the same performance level in the presence of data perturbations remains extremely hard. In this paper, we study a wide array of perturbation scenarios and propose novel defen…

    Submitted 28 November, 2023; v1 submitted 16 November, 2023; originally announced November 2023.

    Comments: Accepted for presentation at the ARTMAN workshop, part of the ACM Conference on Computer and Communications Security (CCS), 2023

    MSC Class: 68T05; 62M10; 68T01 ACM Class: I.2.6; I.2.4; K.6.5

    Journal ref: Proceedings of the 2023 Workshop on Recent Advances in Resilient and Trustworthy ML Systems in Autonomous Networks; pp.17-28

  42. arXiv:2311.05256  [pdf, other]

    cs.LG

    Latent Task-Specific Graph Network Simulators

    Authors: Philipp Dahlinger, Niklas Freymuth, Michael Volpp, Tai Hoang, Gerhard Neumann

    Abstract: Simulating dynamic physical interactions is a critical challenge across multiple scientific domains, with applications ranging from robotics to material science. For mesh-based simulations, Graph Network Simulators (GNSs) pose an efficient alternative to traditional physics-based simulators. Their inherent differentiability and speed make them particularly well-suited for inverse design problems.…

    Submitted 9 November, 2023; originally announced November 2023.

  43. arXiv:2310.20228  [pdf, other]

    cs.HC

    Reconstructing Human Pose from Inertial Measurements: A Generative Model-based Compressive Sensing Approach

    Authors: Nguyen Quang Hieu, Dinh Thai Hoang, Diep N. Nguyen, Mohammad Abu Alsheikh

    Abstract: The ability to sense, localize, and estimate the 3D position and orientation of the human body is critical in virtual reality (VR) and extended reality (XR) applications. This becomes more important and challenging with the deployment of VR/XR applications over the next generation of wireless systems such as 5G and beyond. In this paper, we propose a novel framework that can reconstruct the 3D hum…

    Submitted 12 May, 2024; v1 submitted 31 October, 2023; originally announced October 2023.

  44. arXiv:2310.19251  [pdf, other]

    cs.IR cs.AI

    Pre-trained Recommender Systems: A Causal Debiasing Perspective

    Authors: Ziqian Lin, Hao Ding, Nghia Trong Hoang, Branislav Kveton, Anoop Deoras, Hao Wang

    Abstract: Recent studies on pre-trained vision/language models have demonstrated the practical benefit of a new, promising solution-building paradigm in AI where models can be pre-trained on broad data describing a generic task space and then adapted successfully to solve a wide range of downstream tasks, even when training data is severely limited (e.g., in zero- or few-shot learning scenarios). Inspired b…

    Submitted 8 January, 2024; v1 submitted 29 October, 2023; originally announced October 2023.

    Comments: 8 pages, WSDM 24

  45. arXiv:2310.17250  [pdf, other]

    cs.LG cs.AI cs.NE

    IDENAS: Internal Dependency Exploration for Neural Architecture Search

    Authors: Anh T. Hoang, Zsolt J. Viharos

    Abstract: Machine learning is a powerful tool for extracting valuable information and making various predictions from diverse datasets. Traditional algorithms rely on well-defined input and output variables; however, there are scenarios where the distinction between the input and output variables and the underlying, associated (input and output) layers of the model are unknown. Neural Architecture Search (N…

    Submitted 26 October, 2023; originally announced October 2023.

    Comments: 57 pages, 19 figures + appendix, the related software code can be found under the link: https://github.com/viharoszsolt/IDENAS

    MSC Class: 68T07; 68T10; 93A10 ACM Class: I.5.2; I.5.1; I.2.6

  46. arXiv:2310.07915  [pdf, other]

    cs.NI cs.CY cs.SI

    Tag Your Fish in the Broken Net: A Responsible Web Framework for Protecting Online Privacy and Copyright

    Authors: Dawen Zhang, Boming Xia, Yue Liu, Xiwei Xu, Thong Hoang, Zhenchang Xing, Mark Staples, Qinghua Lu, Liming Zhu

    Abstract: The World Wide Web, a ubiquitous source of information, serves as a primary resource for countless individuals, amassing a vast amount of data from global internet users. However, this online data, when scraped, indexed, and utilized for activities like web crawling, search engine indexing, and, notably, AI model training, often diverges from the original intent of its contributors. The ascent of…

    Submitted 5 November, 2023; v1 submitted 11 October, 2023; originally announced October 2023.

    Comments: added some information on how to deal with CDN in the design section; minor fixes on writing

  47. arXiv:2310.07780  [pdf, other]

    cs.LG

    Promoting Robustness of Randomized Smoothing: Two Cost-Effective Approaches

    Authors: Linbo Liu, Trong Nghia Hoang, Lam M. Nguyen, Tsui-Wei Weng

    Abstract: Randomized smoothing has recently attracted attention in the field of adversarial robustness to provide provable robustness guarantees on smoothed neural network classifiers. However, existing works show that vanilla randomized smoothing usually does not provide good robustness performance and often requires (re)training techniques on the base classifier in order to boost the robustness of the re…

    Submitted 11 October, 2023; originally announced October 2023.

  48. arXiv:2310.07497  [pdf, other]

    cs.LG cs.AI

    Sample-Driven Federated Learning for Energy-Efficient and Real-Time IoT Sensing

    Authors: Minh Ngoc Luu, Minh-Duong Nguyen, Ebrahim Bedeer, Van Duc Nguyen, Dinh Thai Hoang, Diep N. Nguyen, Quoc-Viet Pham

    Abstract: In the domain of Federated Learning (FL) systems, recent cutting-edge methods heavily rely on ideal conditions convergence analysis. Specifically, these approaches assume that the training datasets on IoT devices possess similar attributes to the global data distribution. However, this approach fails to capture the full spectrum of data characteristics in real-time sensing FL systems. In order to…

    Submitted 11 October, 2023; originally announced October 2023.

    Comments: 17 pages, 5 figures

    MSC Class: 68-00 ACM Class: I.2.11

  49. arXiv:2310.03614  [pdf]

    cs.LG cs.CY

    Adversarial Machine Learning for Social Good: Reframing the Adversary as an Ally

    Authors: Shawqi Al-Maliki, Adnan Qayyum, Hassan Ali, Mohamed Abdallah, Junaid Qadir, Dinh Thai Hoang, Dusit Niyato, Ala Al-Fuqaha

    Abstract: Deep Neural Networks (DNNs) have been the driving force behind many of the recent advances in machine learning. However, research has shown that DNNs are vulnerable to adversarial examples -- input samples that have been perturbed to force DNN-based models to make errors. As a result, Adversarial Machine Learning (AdvML) has gained a lot of attention, and researchers have investigated these vulner…

    Submitted 5 October, 2023; originally announced October 2023.

  50. arXiv:2309.08474  [pdf, other]

    cs.CR cs.AI

    VulnSense: Efficient Vulnerability Detection in Ethereum Smart Contracts by Multimodal Learning with Graph Neural Network and Language Model

    Authors: Phan The Duy, Nghi Hoang Khoa, Nguyen Huu Quyen, Le Cong Trinh, Vu Trung Kien, Trinh Minh Hoang, Van-Hau Pham

    Abstract: This paper presents the VulnSense framework, a comprehensive approach to efficiently detect vulnerabilities in Ethereum smart contracts using a multimodal learning approach on graph-based and natural language processing (NLP) models. Our proposed framework combines three types of features from smart contracts comprising source code, opcode sequences, and control flow graph (CFG) extracted from bytecod…

    Submitted 15 September, 2023; originally announced September 2023.