Showing 1–16 of 16 results for author: Fard, F H

  1. arXiv:2406.13840  [pdf, other]

    cs.AI cs.CL

    StackRAG Agent: Improving Developer Answers with Retrieval-Augmented Generation

    Authors: Davit Abrahamyan, Fatemeh H. Fard

    Abstract: Developers spend much time finding information that is relevant to their questions. Stack Overflow has been the leading resource, and with the advent of Large Language Models (LLMs), generative models such as ChatGPT are used frequently. However, there is a catch in using each one separately. Searching for answers is time-consuming and tedious, as shown by the many tools developed by researchers t…

    Submitted 19 June, 2024; originally announced June 2024.

  2. arXiv:2406.00215  [pdf, other]

    cs.SE

    Benchmarking the Communication Competence of Code Generation for LLMs and LLM Agent

    Authors: Jie JW Wu, Fatemeh H. Fard

    Abstract: Large language models (LLMs) have significantly improved their ability to perform tasks in the field of code generation. However, there is still a gap between LLMs being capable coders and being top-tier software engineers. Based on the observation that top-level software engineers often ask clarifying questions to reduce ambiguity in both requirements and coding solutions, we argue that the same…

    Submitted 31 May, 2024; originally announced June 2024.

  3. arXiv:2405.01553  [pdf, ps, other]

    cs.SE cs.AI

    Empirical Studies of Parameter Efficient Methods for Large Language Models of Code and Knowledge Transfer to R

    Authors: Amirreza Esmaeili, Iman Saberi, Fatemeh H. Fard

    Abstract: Recently, Large Language Models (LLMs) have gained a lot of attention in the Software Engineering (SE) community. LLMs or their variants pre-trained on code are used for many SE tasks. A main approach for adapting LLMs to the downstream task is to fine-tune the models. However, with having billions-parameters-LLMs, fine-tuning the models is not practical. An alternative approach is using Parameter…

    Submitted 15 March, 2024; originally announced May 2024.

  4. arXiv:2402.04421  [pdf, other]

    cs.SE cs.AI

    Studying Vulnerable Code Entities in R

    Authors: Zixiao Zhao, Millon Madhur Das, Fatemeh H. Fard

    Abstract: Pre-trained Code Language Models (Code-PLMs) have shown many advancements and achieved state-of-the-art results for many software engineering tasks in the past few years. These models are mainly targeted for popular programming languages such as Java and Python, leaving out many other ones like R. Though R has a wide community of developers and users, there is little known about the applicability…

    Submitted 6 February, 2024; originally announced February 2024.

    Comments: 5 pages, 3 figures, and 2 tables. To be published in ICPC 2024

  5. arXiv:2401.13802  [pdf, other]

    cs.SE cs.AI cs.CL cs.LG

    Investigating the Efficacy of Large Language Models for Code Clone Detection

    Authors: Mohamad Khajezade, Jie JW Wu, Fatemeh Hendijani Fard, Gema Rodríguez-Pérez, Mohamed Sami Shehata

    Abstract: Large Language Models (LLMs) have demonstrated remarkable success in various natural language processing and software engineering tasks, such as code generation. The LLMs are mainly utilized in the prompt-based zero/few-shot paradigm to guide the model in accomplishing the task. GPT-based models are one of the popular ones studied for tasks such as code comment generation or test generation. These…

    Submitted 30 January, 2024; v1 submitted 24 January, 2024; originally announced January 2024.

  6. arXiv:2303.06233  [pdf, other]

    cs.SE

    Model-Agnostic Syntactical Information for Pre-Trained Programming Language Models

    Authors: Iman Saberi, Fatemeh H. Fard

    Abstract: Pre-trained Programming Language Models (PPLMs) achieved many recent state-of-the-art results for many code-related software engineering tasks. Though some studies use data flow or propose tree-based models that utilize Abstract Syntax Tree (AST), most PPLMs do not fully utilize the rich syntactical information in source code. Still, the input is considered a sequence of tokens. There are two iss…

    Submitted 10 March, 2023; originally announced March 2023.

    Comments: 11 pages, 5 figures, accepted at ICSE 2023

  7. arXiv:2204.08653  [pdf, other]

    cs.SE cs.CL cs.LG

    On The Cross-Modal Transfer from Natural Language to Code through Adapter Modules

    Authors: Divyam Goel, Ramansh Grover, Fatemeh H. Fard

    Abstract: Pre-trained neural Language Models (PTLM), such as CodeBERT, are recently used in software engineering as models pre-trained on large source code corpora. Their knowledge is transferred to downstream tasks (e.g. code clone detection) via fine-tuning. In natural language processing (NLP), other alternatives for transferring the knowledge of PTLMs are explored through using adapters, compact, parame…

    Submitted 19 April, 2022; originally announced April 2022.

    Comments: 11 pages, 6 figures, ICPC 2022. 30th International Conference on Program Comprehension (ICPC '22), May 16-17, 2022, Virtual Event, USA

  8. arXiv:2204.07501  [pdf, other]

    cs.SE

    Evaluating few shot and Contrastive learning Methods for Code Clone Detection

    Authors: Mohamad Khajezade, Fatemeh Hendijani Fard, Mohamed S. Shehata

    Abstract: Context: Code Clone Detection (CCD) is a software engineering task that is used for plagiarism detection, code search, and code comprehension. Recently, deep learning-based models have achieved an F1 score (a metric used to assess classifiers) of ~95% on the CodeXGLUE benchmark. These models require many training data, mainly fine-tuned on Java or C++ datasets. However, no previous study eva…

    Submitted 9 November, 2023; v1 submitted 15 April, 2022; originally announced April 2022.

  9. On the Effectiveness of Pretrained Models for API Learning

    Authors: Mohammad Abdul Hadi, Imam Nur Bani Yusuf, Ferdian Thung, Kien Gia Luong, Jiang Lingxiao, Fatemeh H. Fard, David Lo

    Abstract: Developers frequently use APIs to implement certain functionalities, such as parsing Excel Files, reading and writing text files line by line, etc. Developers can greatly benefit from automatic API usage sequence generation based on natural language queries for building applications in a faster and cleaner manner. Existing approaches utilize information retrieval models to search for matching API…

    Submitted 5 April, 2022; originally announced April 2022.

    Comments: 12 pages, 4 figures, ICPC 2022

    Journal ref: 30th International Conference on Program Comprehension (ICPC '22), May 16-17, 2022, Virtual Event, USA

  10. arXiv:2202.02294  [pdf, other]

    cs.CL cs.LG cs.SE

    Pre-Trained Neural Language Models for Automatic Mobile App User Feedback Answer Generation

    Authors: Yue Cao, Fatemeh H. Fard

    Abstract: Studies show that developers' answers to the mobile app users' feedbacks on app stores can increase the apps' star rating. To help app developers generate answers that are related to the users' issues, recent studies develop models to generate the answers automatically. Aims: The app response generation models use deep neural networks and require training data. Pre-Trained neural language Models (…

    Submitted 4 February, 2022; originally announced February 2022.

    Comments: 6 pages, published in the 2021 ASE RAISE workshop

  11. arXiv:2104.05861  [pdf, other]

    cs.SE cs.LG

    Evaluating Pre-Trained Models for User Feedback Analysis in Software Engineering: A Study on Classification of App-Reviews

    Authors: Mohammad Abdul Hadi, Fatemeh H. Fard

    Abstract: Context: Mobile app reviews written by users on app stores or social media are significant resources for app developers. Analyzing app reviews has proved to be useful for many areas of software engineering (e.g., requirement engineering, testing). Automatic classification of app reviews requires extensive efforts to manually curate a labeled dataset. When the classification purpose changes (e.g. i…

    Submitted 6 April, 2022; v1 submitted 12 April, 2021; originally announced April 2021.

    Comments: 55 pages, 13 tables, 6 figures, EMSE 2022

  12. arXiv:2103.10668  [pdf, other]

    cs.SE cs.CL cs.LG

    API2Com: On the Improvement of Automatically Generated Code Comments Using API Documentations

    Authors: Ramin Shahbazi, Rishab Sharma, Fatemeh H. Fard

    Abstract: Code comments can help in program comprehension and are considered as important artifacts to help developers in software maintenance. However, the comments are mostly missing or are outdated, especially in complex software projects. As a result, several automatic comment generation models are developed as a solution. The recent models explore the integration of external knowledge resources such as…

    Submitted 19 March, 2021; originally announced March 2021.

  13. arXiv:2103.09340  [pdf, other]

    cs.SE

    Technical Debt in the Peer-Review Documentation of R Packages: a rOpenSci Case Study

    Authors: Zadia Codabux, Melina Vidoni, Fatemeh H. Fard

    Abstract: Context: Technical Debt is a metaphor used to describe code that is "not quite right." Although TD studies have gained momentum, TD has yet to be studied as thoroughly in non-Object-Oriented (OO) or scientific software such as R. R is a multi-paradigm programming language, whose popularity in data science and statistical applications has amplified in recent years. Due to R's inherent ability to ex…

    Submitted 16 March, 2021; originally announced March 2021.

  14. arXiv:2009.09930  [pdf, other]

    cs.IR cs.LG stat.ML

    AOBTM: Adaptive Online Biterm Topic Modeling for Version Sensitive Short-texts Analysis

    Authors: Mohammad Abdul Hadi, Fatemeh H Fard

    Abstract: Analysis of mobile app reviews has shown its important role in requirement engineering, software maintenance and evolution of mobile apps. Mobile app developers check their users' reviews frequently to clarify the issues experienced by users or capture the new issues that are introduced due to a recent app update. App reviews have a dynamic nature and their discussed topics change over time. The c…

    Submitted 13 September, 2020; originally announced September 2020.

    Comments: 13 pages, 7 figures, 7 tables

  15. ReviewViz: Assisting Developers Perform Empirical Study on Energy Consumption Related Reviews for Mobile Applications

    Authors: Mohammad Abdul Hadi, Fatemeh H Fard

    Abstract: Improving the energy efficiency of mobile applications is a topic that has gained a lot of attention recently. It has been addressed in a number of ways such as identifying energy bugs and developing a catalog of energy patterns. Previous work shows that users discuss the battery-related issues (energy inefficiency or energy consumption) of the apps in their reviews. However, there is no work that…

    Submitted 19 March, 2021; v1 submitted 13 September, 2020; originally announced September 2020.

    Comments: 4 pages, 5 figures

  16. arXiv:2009.05936  [pdf, other]

    cs.HC

    Geo-Spatial Data Visualization and Critical Metrics Predictions for Canadian Elections

    Authors: Mohammad Abdul Hadi, Fatemeh H Fard, Irene Vrbik

    Abstract: Open data published by various organizations is intended to make the data available to the public. All over the world, numerous organizations maintain a considerable number of open databases containing a lot of facts and numbers. However, most of them do not offer a concise and insightful data interpretation or visualization tool, which can help users to process all of the information in a consist…

    Submitted 13 September, 2020; originally announced September 2020.

    Comments: 7 pages, 11 figures