
Showing 1–29 of 29 results for author: Bui, A

Searching in archive cs.
  1. arXiv:2406.15633  [pdf, other]

    cs.SE

    Good things come in three: Generating SO Post Titles with Pre-Trained Models, Self Improvement and Post Ranking

    Authors: Duc Anh Le, Anh M. T. Bui, Phuong T. Nguyen, Davide Di Ruscio

    Abstract: Stack Overflow is a prominent Q&A forum, supporting developers in seeking suitable resources on programming-related matters. Having high-quality question titles is an effective means to attract developers' attention. Unfortunately, this is often underestimated, leaving room for improvement. Research has been conducted, predominantly leveraging pre-trained models to generate titles from code sn…

    Submitted 21 June, 2024; originally announced June 2024.

    Comments: The paper has been peer-reviewed and accepted for publication at the International Symposium on Empirical Software Engineering and Measurement (ESEM 2024)

  2. arXiv:2403.13204  [pdf, other]

    cs.LG cs.CV stat.ML

    Diversity-Aware Agnostic Ensemble of Sharpness Minimizers

    Authors: Anh Bui, Vy Vo, Tung Pham, Dinh Phung, Trung Le

    Abstract: There has long been plenty of theoretical and empirical evidence supporting the success of ensemble learning. Deep ensembles in particular take advantage of training randomness and expressivity of individual neural networks to gain prediction diversity, ultimately leading to better generalization, robustness and uncertainty estimation. In respect of generalization, it is found that pursuing wider…

    Submitted 19 March, 2024; originally announced March 2024.

  3. arXiv:2403.12326  [pdf, other]

    cs.LG cs.CV

    Removing Undesirable Concepts in Text-to-Image Generative Models with Learnable Prompts

    Authors: Anh Bui, Khanh Doan, Trung Le, Paul Montague, Tamas Abraham, Dinh Phung

    Abstract: Generative models have demonstrated remarkable potential in generating visually impressive content from textual descriptions. However, training these models on unfiltered internet data poses the risk of learning and subsequently propagating undesirable concepts, such as copyrighted or unethical content. In this paper, we propose a novel method to remove undesirable concepts from text-to-image gene…

    Submitted 18 March, 2024; originally announced March 2024.

  4. arXiv:2403.05873  [pdf, other]

    cs.SE cs.IR cs.LG

    LEGION: Harnessing Pre-trained Language Models for GitHub Topic Recommendations with Distribution-Balance Loss

    Authors: Yen-Trang Dang, Thanh-Le Cong, Phuc-Thanh Nguyen, Anh M. T. Bui, Phuong T. Nguyen, Bach Le, Quyet-Thang Huynh

    Abstract: Open-source development has revolutionized the software industry by promoting collaboration, transparency, and community-driven innovation. Today, a vast amount of various kinds of open-source software, which form networks of repositories, is often hosted on GitHub - a popular software development platform. To enhance the discoverability of the repository networks, i.e., groups of similar reposito…

    Submitted 9 March, 2024; originally announced March 2024.

    Comments: Accepted to EASE'24

  5. arXiv:2402.11101  [pdf]

    cond-mat.mtrl-sci cs.CE cs.LG

    Physics-based material parameters extraction from perovskite experiments via Bayesian optimization

    Authors: Hualin Zhan, Viqar Ahmad, Azul Mayon, Grace Tabi, Anh Dinh Bui, Zhuofeng Li, Daniel Walter, Hieu Nguyen, Klaus Weber, Thomas White, Kylie Catchpole

    Abstract: The ability to extract material parameters of perovskite from quantitative experimental analysis is essential for rational design of photovoltaic and optoelectronic applications. However, the difficulty of this analysis increases significantly with the complexity of the theoretical model and the number of material parameters for perovskite. Here we use Bayesian optimization to develop an analysis…

    Submitted 29 May, 2024; v1 submitted 16 February, 2024; originally announced February 2024.

    Comments: The work is published in Energy & Environmental Science (DOI: 10.1039/D4EE00911H). This work is supported by the Australian Centre for Advanced Photovoltaics (ACAP) and received funding from the Australian Renewable Energy Agency (ARENA). H.Z. acknowledges the support of the ACAP Fellowship. H.Z. thanks Pawsey for providing the Nimbus Research Cloud Service

  6. arXiv:2402.00024  [pdf, other]

    q-bio.BM cs.AI cs.CL cs.LG

    Can Large Language Models Understand Molecules?

    Authors: Shaghayegh Sadeghi, Alan Bui, Ali Forooghi, Jianguo Lu, Alioune Ngom

    Abstract: Purpose: Large Language Models (LLMs) like GPT (Generative Pre-trained Transformer) from OpenAI and LLaMA (Large Language Model Meta AI) from Meta AI are increasingly recognized for their potential in the field of cheminformatics, particularly in understanding Simplified Molecular Input Line Entry System (SMILES), a standard method for representing chemical structures. These LLMs also have the abi…

    Submitted 20 May, 2024; v1 submitted 5 January, 2024; originally announced February 2024.

  7. arXiv:2311.09671  [pdf, ps, other]

    cs.LG cs.CV

    Robust Contrastive Learning With Theory Guarantee

    Authors: Ngoc N. Tran, Lam Tran, Hoang Phan, Anh Bui, Tung Pham, Toan Tran, Dinh Phung, Trung Le

    Abstract: Contrastive learning (CL) is a self-supervised training paradigm that allows us to extract meaningful features without any label information. A typical CL framework is divided into two phases, where it first tries to learn the features from unlabelled data, and then uses those features to train a linear classifier with the labeled data. While a fair amount of existing theoretical works have analyz…

    Submitted 16 November, 2023; originally announced November 2023.

    Comments: 27 pages, 0 figures. arXiv admin note: text overlap with arXiv:2305.10252

  8. arXiv:2311.00737  [pdf]

    cs.LG physics.ins-det physics.med-ph

    Real-Time Magnetic Tracking and Diagnosis of COVID-19 via Machine Learning

    Authors: Dang Nguyen, Phat K. Huynh, Vinh Duc An Bui, Kee Young Hwang, Nityanand Jain, Chau Nguyen, Le Huu Nhat Minh, Le Van Truong, Xuan Thanh Nguyen, Dinh Hoang Nguyen, Le Tien Dung, Trung Q. Le, Manh-Huong Phan

    Abstract: The COVID-19 pandemic underscored the importance of reliable, noninvasive diagnostic tools for robust public health interventions. In this work, we fused magnetic respiratory sensing technology (MRST) with machine learning (ML) to create a diagnostic platform for real-time tracking and diagnosis of COVID-19 and other respiratory diseases. The MRST precisely captures breathing patterns through thre…

    Submitted 1 November, 2023; originally announced November 2023.

  9. arXiv:2306.04178  [pdf, other]

    cs.LG cs.CG

    Optimal Transport Model Distributional Robustness

    Authors: Van-Anh Nguyen, Trung Le, Anh Tuan Bui, Thanh-Toan Do, Dinh Phung

    Abstract: Distributional robustness is a promising framework for training deep learning models that are less vulnerable to adversarial examples and data distribution shifts. Previous works have mainly focused on exploiting distributional robustness in the data space. In this work, we explore an optimal transport-based distributional robustness framework in model spaces. Specifically, we examine a model dist…

    Submitted 1 November, 2023; v1 submitted 7 June, 2023; originally announced June 2023.

    Comments: Accepted at NeurIPS 2023

    Journal ref: Advances in Neural Information Processing Systems, 2023

  10. arXiv:2304.13229  [pdf, other]

    cs.LG cs.CV

    Generating Adversarial Examples with Task Oriented Multi-Objective Optimization

    Authors: Anh Bui, Trung Le, He Zhao, Quan Tran, Paul Montague, Dinh Phung

    Abstract: Deep learning models, even state-of-the-art ones, are highly vulnerable to adversarial examples. Adversarial training is one of the most efficient methods to improve the model's robustness. The key factor for the success of adversarial training is the capability to generate qualified and divergent adversarial examples which satisfy some objectives/goals (e.g., finding adversarial examples that…

    Submitted 1 June, 2023; v1 submitted 25 April, 2023; originally announced April 2023.

  11. arXiv:2304.10175  [pdf]

    cs.LG cs.AI

    Automated Dynamic Bayesian Networks for Predicting Acute Kidney Injury Before Onset

    Authors: David Gordon, Panayiotis Petousis, Anders O. Garlid, Keith Norris, Katherine Tuttle, Susanne B. Nicholas, Alex A. T. Bui

    Abstract: Several algorithms for learning the structure of dynamic Bayesian networks (DBNs) require an a priori ordering of variables, which influences the determined graph topology. However, it is often unclear how to determine this order if feature importance is unknown, especially as an exhaustive search is usually impractical. In this paper, we introduce Ranking Approaches for Unknown Structures (RAUS),…

    Submitted 20 April, 2023; originally announced April 2023.

    Comments: 27 pages (including 8 pages supplementary information)

  12. arXiv:2212.03069  [pdf, other]

    cs.CV cs.CR cs.LG

    Multiple Perturbation Attack: Attack Pixelwise Under Different $\ell_p$-norms For Better Adversarial Performance

    Authors: Ngoc N. Tran, Anh Tuan Bui, Dinh Phung, Trung Le

    Abstract: Adversarial machine learning has been both a major concern and a hot topic recently, especially with the ubiquitous use of deep neural networks in the current landscape. Adversarial attacks and defenses are usually likened to a cat-and-mouse game in which defenders and attackers evolve over time. On one hand, the goal is to develop strong and robust deep networks that are resistant to maliciou…

    Submitted 7 December, 2022; v1 submitted 5 December, 2022; originally announced December 2022.

    Comments: 18 pages, 8 figures, 7 tables

  13. arXiv:2211.04773  [pdf, other]

    cs.CV

    SG-Shuffle: Multi-aspect Shuffle Transformer for Scene Graph Generation

    Authors: Anh Duc Bui, Soyeon Caren Han, Josiah Poon

    Abstract: Scene Graph Generation (SGG) serves as a comprehensive representation of images for human understanding as well as visual understanding tasks. Due to the long-tail bias problem of the object and predicate labels in the available annotated data, the scene graphs generated by current methodologies can be biased toward common, non-informative relationship labels. Relationship can sometimes be non-m…

    Submitted 9 November, 2022; originally announced November 2022.

  14. arXiv:2203.00553  [pdf, other]

    cs.LG cs.AI

    Global-Local Regularization Via Distributional Robustness

    Authors: Hoang Phan, Trung Le, Trung Phung, Tuan Anh Bui, Nhat Ho, Dinh Phung

    Abstract: Despite superior performance in many situations, deep neural networks are often vulnerable to adversarial examples and distribution shifts, limiting model generalization ability in real-world applications. To alleviate these problems, recent approaches leverage distributional robustness optimization (DRO) to find the most challenging distribution, and then minimize loss function over this most cha…

    Submitted 12 February, 2023; v1 submitted 1 March, 2022; originally announced March 2022.

    Comments: Accepted to International Conference on Artificial Intelligence and Statistics (AISTATS 2023)

  15. arXiv:2202.13437  [pdf, other]

    cs.LG cs.CV

    A Unified Wasserstein Distributional Robustness Framework for Adversarial Training

    Authors: Tuan Anh Bui, Trung Le, Quan Tran, He Zhao, Dinh Phung

    Abstract: It is well known that deep neural networks (DNNs) are susceptible to adversarial attacks, exposing a severe fragility of deep learning systems. As a result, the adversarial training (AT) method, which incorporates adversarial examples during training, represents a natural and effective approach to strengthen the robustness of a DNN-based classifier. However, most AT-based methods, notably PGD-AT and T…

    Submitted 27 February, 2022; originally announced February 2022.

  16. Dimension Reduction with Prior Information for Knowledge Discovery

    Authors: Anh Tuan Bui

    Abstract: This paper addresses the problem of mapping high-dimensional data to a low-dimensional space, in the presence of other known features. This problem is ubiquitous in science and engineering as there are often controllable/measurable features in most applications. To solve this problem, this paper proposes a broad class of methods, which is referred to as conditional multidimensional scaling (MDS).…

    Submitted 29 December, 2023; v1 submitted 26 November, 2021; originally announced November 2021.

    Comments: Article accepted for publication in IEEE Transactions on Pattern Analysis and Machine Intelligence, 12 pages, 8 figures

  17. arXiv:2101.10027  [pdf, other]

    cs.LG cs.AI cs.CV

    Understanding and Achieving Efficient Robustness with Adversarial Supervised Contrastive Learning

    Authors: Anh Bui, Trung Le, He Zhao, Paul Montague, Seyit Camtepe, Dinh Phung

    Abstract: Contrastive learning (CL) has recently emerged as an effective approach to learning representation in a range of downstream tasks. Central to this approach is the selection of positive (similar) and negative (dissimilar) sets to provide the model the opportunity to `contrast' between data and class representation in the latent space. In this paper, we investigate CL for improving model robustness…

    Submitted 22 October, 2021; v1 submitted 25 January, 2021; originally announced January 2021.

  18. arXiv:2012.06916  [pdf, other]

    stat.ML cs.LG

    Concept Drift Monitoring and Diagnostics of Supervised Learning Models via Score Vectors

    Authors: Kungang Zhang, Anh T. Bui, Daniel W. Apley

    Abstract: Supervised learning models are one of the most fundamental classes of models. Viewing supervised learning from a probabilistic perspective, the set of training data to which the model is fitted is usually assumed to follow a stationary distribution. However, this stationarity assumption is often violated in a phenomenon called concept drift, which refers to changes over time in the predictive rela…

    Submitted 12 September, 2022; v1 submitted 12 December, 2020; originally announced December 2020.

  19. arXiv:2009.09612  [pdf, other]

    cs.CV cs.LG

    Improving Ensemble Robustness by Collaboratively Promoting and Demoting Adversarial Robustness

    Authors: Anh Bui, Trung Le, He Zhao, Paul Montague, Olivier deVel, Tamas Abraham, Dinh Phung

    Abstract: Ensemble-based adversarial training is a principled approach to achieve robustness against adversarial attacks. An important technique of this approach is to control the transferability of adversarial examples among ensemble members. We propose in this work a simple yet effective strategy to collaborate among committee models of an ensemble model. This is achieved via the secure and insecure sets…

    Submitted 4 February, 2022; v1 submitted 21 September, 2020; originally announced September 2020.

  20. arXiv:2007.05123  [pdf, other]

    cs.LG cs.CV cs.NE stat.ML

    Improving Adversarial Robustness by Enforcing Local and Global Compactness

    Authors: Anh Bui, Trung Le, He Zhao, Paul Montague, Olivier deVel, Tamas Abraham, Dinh Phung

    Abstract: The fact that deep neural networks are susceptible to crafted perturbations severely impacts the use of deep learning in certain domains of application. Among many developed defense models against such attacks, adversarial training emerges as the most successful method that consistently resists a wide range of attacks. In this work, based on an observation from a previous study that the representa…

    Submitted 9 July, 2020; originally announced July 2020.

    Comments: Proceedings of the European Conference on Computer Vision (ECCV) 2020

  21. arXiv:1908.06337  [pdf, other]

    eess.IV cs.CV cs.LG stat.ML

    EigenRank by Committee: A Data Subset Selection and Failure Prediction paradigm for Robust Deep Learning based Medical Image Segmentation

    Authors: Bilwaj Gaonkar, Joel Beckett, Mark Attiah, Christine Ahn, Matthew Edwards, Bayard Wilson, Azim Laiwalla, Banafsheh Salehi, Bryan Yoo, Alex Bui, Luke Macyszyn

    Abstract: Translation of fully automated deep learning based medical image segmentation technologies to clinical workflows faces two main algorithmic challenges. The first is the collection and archival of large quantities of manually annotated ground truth data for both training and validation. The second is the relative inability of the majority of deep learning based segmentation techniques to alert phys…

    Submitted 18 January, 2021; v1 submitted 17 August, 2019; originally announced August 2019.

    MSC Class: 68T45 (Primary) 68T05; 68T20 (Secondary) ACM Class: I.5.4; I.4.6

    Journal ref: Medical Image Analysis, Volume 67, 2021, 101834, ISSN 1361-8415

  22. arXiv:1810.01621  [pdf, other]

    cs.CV

    Extreme Augmentation : Can deep learning based medical image segmentation be trained using a single manually delineated scan?

    Authors: Bilwaj Gaonkar, Matthew Edwards, Alex Bui, Matthew Brown, Luke Macyszyn

    Abstract: Yes, it can. Data augmentation is perhaps the oldest preprocessing step in computer vision literature. Almost every computer vision model trained on imaging data uses some form of augmentation. In this paper, we use the inter-vertebral disk segmentation task alongside a deep residual U-Net as the learning model, to explore the effectiveness of augmentation. In the extreme, we observed that a model…

    Submitted 6 September, 2019; v1 submitted 3 October, 2018; originally announced October 2018.

  23. arXiv:1806.00712  [pdf, ps, other]

    cs.CV cs.AI

    An Interpretable Deep Hierarchical Semantic Convolutional Neural Network for Lung Nodule Malignancy Classification

    Authors: Shiwen Shen, Simon X. Han, Denise R. Aberle, Alex A. T. Bui, William Hsu

    Abstract: While deep learning methods are increasingly being applied to tasks such as computer-aided diagnosis, these models are difficult to interpret, do not incorporate prior domain knowledge, and are often considered as a "black-box." The lack of model interpretability hinders them from being fully understood by target users such as radiologists. In this paper, we present a novel interpretable deep hier…

    Submitted 2 June, 2018; originally announced June 2018.

  24. arXiv:1706.06087  [pdf]

    cs.DL

    Aztec: A Platform to Render Biomedical Software Findable, Accessible, Interoperable, and Reusable

    Authors: Wei Wang, Brian Bleakley, Chelsea Ju, Vincent Kyi, Patrick Tan, Howard Choi, Xinxin Huang, Yichao Zhou, Justin Wood, Ding Wang, Alex Bui, Peipei Ping

    Abstract: Precision medicine and health requires the characterization and phenotyping of biological systems and patient datasets using a variety of data formats. This scenario mandates the centralization of various tools and resources in a unified platform to render them Findable, Accessible, Interoperable, and Reusable (FAIR Principles). Leveraging these principles, Aztec provides the scientific community…

    Submitted 19 June, 2017; originally announced June 2017.

    Comments: 21 pages, 4 figures, 2 tables

    ACM Class: H.2.8; H.3.1; H.3.3; H.3.6; H.3.7; I.2.6; I.2.7

  25. arXiv:1112.4536  [pdf, ps, other]

    cs.CC

    Intractability of the Minimum-Flip Supertree problem and its variants

    Authors: Sebastian Böcker, Quang Bao Anh Bui, Francois Nicolas, Anke Truss

    Abstract: Computing supertrees is a central problem in phylogenetics. The supertree method that is by far the most widely used today was introduced in 1992 and is called Matrix Representation with Parsimony analysis (MRP). Matrix Representation using Flipping (MRF), which was introduced in 2002, is an interesting variant of MRP: MRF is arguably more relevant than MRP and various efficient implementations o…

    Submitted 19 December, 2011; originally announced December 2011.

    Comments: To be submitted

  26. arXiv:1109.3561  [pdf, other]

    cs.DC

    Universal adaptive self-stabilizing traversal scheme: random walk and reloading wave

    Authors: Thibault Bernard, Alain Bui, Devan Sohier

    Abstract: In this paper, we investigate random walk based token circulation in dynamic environments subject to failures. We describe hypotheses on the dynamic environment that allow random walks to meet the important property that the token visits any node infinitely often. The randomness of this scheme allows it to work on any topology, and require no adaptation after a topological change, which is a desir…

    Submitted 16 September, 2011; originally announced September 2011.

  27. arXiv:1011.2953  [pdf, other]

    cs.DC

    A Distributed Clustering Algorithm for Dynamic Networks

    Authors: Thibault Bernard, Alain Bui, Laurence Pilard, Devan Sohier

    Abstract: We propose an algorithm that builds and maintains clusters over a network subject to mobility. This algorithm is fully decentralized and makes all the different clusters grow concurrently. The algorithm uses circulating tokens that collect data and move according to a random walk traversal scheme. Their task consists in (i) creating a cluster with the nodes it discovers and (ii) managing the clust…

    Submitted 12 November, 2010; originally announced November 2010.

  28. arXiv:0812.0736  [pdf, ps, other]

    cs.DC

    Fully distributed and fault tolerant task management based on diffusions

    Authors: Alain Bui, Olivier Flauzac, Cyril Rabat

    Abstract: Task management is a critical component of computational grids. The aim is to assign tasks to nodes according to a global scheduling policy and a view of the local resources of nodes. A peer-to-peer approach to task management offers better scalability for the grid and higher fault tolerance. But some mechanisms have to be proposed to avoid the computation of replicated tasks tha…

    Submitted 3 December, 2008; originally announced December 2008.

  29. arXiv:0807.3632  [pdf, other]

    cs.DC cs.DM

    How to Compute Times of Random Walks based Distributed Algorithms

    Authors: Alain Bui, Devan Sohier

    Abstract: Random walk based distributed algorithms make use of a token that circulates in the system according to a random walk scheme to achieve their goal. To study their efficiency and compare it to one of the deterministic solutions, one is led to compute certain quantities, namely the hitting times and the cover time. Until now, only bounds on these quantities were defined. First, this paper presents…

    Submitted 23 July, 2008; originally announced July 2008.

    Comments: 18 pages