Showing 1–15 of 15 results for author: Le, T H M

  1. arXiv:2406.19765  [pdf, other]

    cs.SE cs.LG

    Systematic Literature Review on Application of Learning-based Approaches in Continuous Integration

    Authors: Ali Kazemi Arani, Triet Huynh Minh Le, Mansooreh Zahedi, M. Ali Babar

    Abstract: Context: Machine learning (ML) and deep learning (DL) analyze raw data to extract valuable insights in specific phases. The rise of continuous practices in software projects emphasizes automating Continuous Integration (CI) with these learning-based methods, while the growing adoption of such approaches underscores the need for systematizing knowledge. Objective: Our objective is to comprehensivel…

    Submitted 2 July, 2024; v1 submitted 28 June, 2024; originally announced June 2024.

    Comments: This paper has been accepted to be published in IEEE Access

  2. arXiv:2404.17110  [pdf, other]

    cs.SE cs.CR cs.LG

    Software Vulnerability Prediction in Low-Resource Languages: An Empirical Study of CodeBERT and ChatGPT

    Authors: Triet H. M. Le, M. Ali Babar, Tung Hoang Thai

    Abstract: Background: Software Vulnerability (SV) prediction in emerging languages is increasingly important to ensure software security in modern systems. However, these languages usually have limited SV data for developing high-performing prediction models. Aims: We conduct an empirical study to evaluate the impact of SV data scarcity in emerging languages on the state-of-the-art SV prediction model and i…

    Submitted 25 April, 2024; originally announced April 2024.

    Comments: Accepted in the 4th International Workshop on Software Security co-located with the 28th International Conference on Evaluation and Assessment in Software Engineering (EASE) 2024

  3. arXiv:2401.11105  [pdf, other]

    cs.SE cs.CR cs.LG

    Are Latent Vulnerabilities Hidden Gems for Software Vulnerability Prediction? An Empirical Study

    Authors: Triet H. M. Le, Xiaoning Du, M. Ali Babar

    Abstract: Collecting relevant and high-quality data is integral to the development of effective Software Vulnerability (SV) prediction models. Most of the current SV datasets rely on SV-fixing commits to extract vulnerable functions and lines. However, none of these datasets have considered latent SVs existing between the introduction and fix of the collected SVs. There is also little known about the useful…

    Submitted 19 January, 2024; originally announced January 2024.

    Comments: Accepted as a full paper in the technical track at the 21st International Conference on Mining Software Repositories (MSR) 2024

  4. arXiv:2305.12736   

    cs.SE

    Mitigating ML Model Decay in Continuous Integration with Data Drift Detection: An Empirical Study

    Authors: Ali Kazemi Arani, Triet Huynh Minh Le, Mansooreh Zahedi, Muhammad Ali Babar

    Abstract: Background: Machine Learning (ML) methods are being increasingly used for automating different activities, e.g., Test Case Prioritization (TCP), of Continuous Integration (CI). However, ML models need frequent retraining as a result of changes in the CI environment, more commonly known as data drift. Also, continuously retraining ML models consumes a lot of time and effort. Hence, there is an urgen…

    Submitted 17 July, 2023; v1 submitted 22 May, 2023; originally announced May 2023.

    Comments: This paper got a rejection and we need to address the comments and upload the new version with new results

  5. arXiv:2305.12695   

    cs.SE cs.LG

    Systematic Literature Review on Application of Machine Learning in Continuous Integration

    Authors: Ali Kazemi Arani, Triet Huynh Minh Le, Mansooreh Zahedi, Muhammad Ali Babar

    Abstract: This research conducted a systematic review of the literature on machine learning (ML)-based methods in the context of Continuous Integration (CI) over the past 22 years. The study aimed to identify and describe the techniques used in ML-based solutions for CI and analyzed various aspects such as data engineering, feature engineering, hyper-parameter tuning, ML models, evaluation methods, and metr…

    Submitted 17 July, 2023; v1 submitted 22 May, 2023; originally announced May 2023.

    Comments: This paper got a rejection and we need to address the comments and upload the new version with new results

  6. arXiv:2304.02829  [pdf, other]

    cs.SE cs.LG

    SoK: Machine Learning for Continuous Integration

    Authors: Ali Kazemi Arani, Mansooreh Zahedi, Triet Huynh Minh Le, Muhammad Ali Babar

    Abstract: Continuous Integration (CI) has become a well-established software development practice for automatically and continuously integrating code changes during software development. An increasing number of Machine Learning (ML) based approaches for automation of CI phases are being reported in the literature. It is timely and relevant to provide a Systemization of Knowledge (SoK) of ML-based approaches…

    Submitted 5 April, 2023; originally announced April 2023.

    Comments: 6 pages, 2 figures, accepted in the ICSE'23 Workshop on Cloud Intelligence / AIOps

  7. arXiv:2207.11708  [pdf, other]

    cs.SE cs.CR cs.LG

    Towards an Improved Understanding of Software Vulnerability Assessment Using Data-Driven Approaches

    Authors: Triet H. M. Le

    Abstract: The thesis advances the field of software security by providing knowledge and automation support for software vulnerability assessment using data-driven approaches. Software vulnerability assessment provides important and multifaceted information to prevent and mitigate dangerous cyber-attacks in the wild. The key contributions include a systematisation of knowledge, along with a suite of novel da…

    Submitted 20 June, 2023; v1 submitted 24 July, 2022; originally announced July 2022.

    Comments: A thesis submitted for the degree of Doctor of Philosophy at The University of Adelaide. The official version of the thesis can be found at the institutional repository: https://hdl.handle.net/2440/135914

  8. arXiv:2203.08417  [pdf, other]

    cs.SE cs.CR cs.LG

    On the Use of Fine-grained Vulnerable Code Statements for Software Vulnerability Assessment Models

    Authors: Triet H. M. Le, M. Ali Babar

    Abstract: Many studies have developed Machine Learning (ML) approaches to detect Software Vulnerabilities (SVs) in functions and fine-grained code statements that cause such SVs. However, there is little work on leveraging such detection outputs for data-driven SV assessment to give information about exploitability, impact, and severity of SVs. The information is important to understand SVs and prioritize t…

    Submitted 16 March, 2022; originally announced March 2022.

    Comments: Accepted as a full paper in the technical track at the 19th International Conference on Mining Software Repositories (MSR) 2022

  9. arXiv:2109.04029  [pdf, other]

    cs.CR cs.AI cs.LG

    Automated Security Assessment for the Internet of Things

    Authors: Xuanyu Duan, Mengmeng Ge, Triet H. M. Le, Faheem Ullah, Shang Gao, Xuequan Lu, M. Ali Babar

    Abstract: Internet of Things (IoT) based applications face an increasing number of potential security risks, which need to be systematically assessed and addressed. Expert-based manual assessment of IoT security is a predominant approach, which is usually inefficient. To address this problem, we propose an automated security assessment framework for IoT networks. Our framework first leverages machine learni…

    Submitted 9 September, 2021; originally announced September 2021.

    Comments: Accepted for publication at the 26th IEEE Pacific Rim International Symposium on Dependable Computing (PRDC 2021)

  10. arXiv:2108.08041  [pdf, other]

    cs.SE cs.CR cs.LG

    DeepCVA: Automated Commit-level Vulnerability Assessment with Deep Multi-task Learning

    Authors: Triet H. M. Le, David Hin, Roland Croft, M. Ali Babar

    Abstract: It is increasingly suggested to identify Software Vulnerabilities (SVs) in code commits to give early warnings about potential security risks. However, there is a lack of effort to assess vulnerability-contributing commits right after they are detected to provide timely information about the exploitability, impact and severity of SVs. Such information is important to plan and prioritize the mitiga…

    Submitted 18 August, 2021; originally announced August 2021.

    Comments: Accepted as a full paper at the 36th IEEE/ACM International Conference on Automated Software Engineering (ASE) 2021

  11. arXiv:2107.08364  [pdf, other]

    cs.SE cs.AI cs.CR cs.LG

    A Survey on Data-driven Software Vulnerability Assessment and Prioritization

    Authors: Triet H. M. Le, Huaming Chen, M. Ali Babar

    Abstract: Software Vulnerabilities (SVs) are increasing in complexity and scale, posing great security risks to many software systems. Given the limited resources in practice, SV assessment and prioritization help practitioners devise optimal SV mitigation plans based on various SV characteristics. The surges in SV data sources and data-driven techniques such as Machine Learning and Deep Learning have taken…

    Submitted 3 April, 2022; v1 submitted 18 July, 2021; originally announced July 2021.

    Comments: Accepted for publication in the ACM Computing Surveys journal (CSUR), 2022

    Journal ref: ACM Comput. Surv., 55, 5 (2022), Article 100

  12. Automated Software Vulnerability Assessment with Concept Drift

    Authors: Triet H. M. Le, Bushra Sabir, M. Ali Babar

    Abstract: Software Engineering researchers are increasingly using Natural Language Processing (NLP) techniques to automate Software Vulnerabilities (SVs) assessment using the descriptions in public repositories. However, the existing NLP-based approaches suffer from concept drift. This problem is caused by a lack of proper treatment of new (out-of-vocabulary) terms for the evaluation of unseen SVs over time…

    Submitted 21 March, 2021; originally announced March 2021.

    Comments: Published as a full paper at the 16th International Conference on Mining Software Repositories 2019

    Journal ref: Proceedings of the 16th International Conference on Mining Software Repositories, 2019, pp. 371-382

  13. A Large-scale Study of Security Vulnerability Support on Developer Q&A Websites

    Authors: Triet H. M. Le, Roland Croft, David Hin, M. Ali Babar

    Abstract: Context: Security Vulnerabilities (SVs) pose many serious threats to software systems. Developers usually seek solutions to addressing these SVs on developer Question and Answer (Q&A) websites. However, there is still little known about on-going SV-specific discussions on different developer Q&A sites. Objective: We present a large-scale empirical study to understand developers' SV discussions and…

    Submitted 21 April, 2021; v1 submitted 10 August, 2020; originally announced August 2020.

    Comments: Accepted for publication at the 25th International Conference on Evaluation and Assessment in Software Engineering (EASE 2021)

  14. arXiv:2003.03741  [pdf]

    cs.SE cs.IR cs.LG

    PUMiner: Mining Security Posts from Developer Question and Answer Websites with PU Learning

    Authors: Triet H. M. Le, David Hin, Roland Croft, M. Ali Babar

    Abstract: Security is an increasing concern in software development. Developer Question and Answer (Q&A) websites provide a large amount of security discussion. Existing studies have used human-defined rules to mine security discussions, but these works still miss many posts, which may lead to an incomplete analysis of the security practices reported on Q&A websites. Traditional supervised Machine Learning…

    Submitted 8 March, 2020; originally announced March 2020.

    Comments: Accepted for publication at the 17th Mining Software Repositories 2020 conference

  15. arXiv:2002.05442  [pdf, other]

    cs.SE cs.AI cs.LG

    Deep Learning for Source Code Modeling and Generation: Models, Applications and Challenges

    Authors: Triet H. M. Le, Hao Chen, M. Ali Babar

    Abstract: Deep Learning (DL) techniques for Natural Language Processing have been evolving remarkably fast. Recently, the DL advances in language modeling, machine translation and paragraph understanding are so prominent that the potential of DL in Software Engineering cannot be overlooked, especially in the field of program learning. To facilitate further research and applications of DL in this field, we p…

    Submitted 13 February, 2020; originally announced February 2020.

    Journal ref: ACM Comput. Surv., 53, 3 (2020), Article 62