Search | arXiv e-print repository

Prompt Learning for Multi-Label Code Smell Detection: A Promising Approach

Authors: Haiyang Liu, Yang Zhang, Vidya Saikrishna, Quanquan Tian, Kun Zheng

Abstract: Code smells indicate the potential problems of software quality so that developers can identify refactoring opportunities by detecting code smells. State-of-the-art approaches leverage heuristics, machine learning, and deep learning to detect code smells. However, existing approaches have not fully explored the potential of large language models (LLMs). In this paper, we propose \textit{PromptSmel… ▽ More Code smells indicate the potential problems of software quality so that developers can identify refactoring opportunities by detecting code smells. State-of-the-art approaches leverage heuristics, machine learning, and deep learning to detect code smells. However, existing approaches have not fully explored the potential of large language models (LLMs). In this paper, we propose \textit{PromptSmell}, a novel approach based on prompt learning for detecting multi-label code smell. Firstly, code snippets are acquired by traversing abstract syntax trees. Combined code snippets with natural language prompts and mask tokens, \textit{PromptSmell} constructs the input of LLMs. Secondly, to detect multi-label code smell, we leverage a label combination approach by converting a multi-label problem into a multi-classification problem. A customized answer space is added to the word list of pre-trained language models, and the probability distribution of intermediate answers is obtained by predicting the words at the mask positions. Finally, the intermediate answers are mapped to the target class labels by a verbalizer as the final classification result. We evaluate the effectiveness of \textit{PromptSmell} by answering six research questions. The experimental results demonstrate that \textit{PromptSmell} obtains an improvement of 11.17\% in $precision_{w}$ and 7.4\% in $F1_{w}$ compared to existing approaches. △ Less

Submitted 15 February, 2024; originally announced February 2024.

arXiv:2402.03732 [pdf, other]

doi 10.1109/ICDMW60847.2023.00184

Deep Outdated Fact Detection in Knowledge Graphs

Authors: Huiling Tu, Shuo Yu, Vidya Saikrishna, Feng Xia, Karin Verspoor

Abstract: Knowledge graphs (KGs) have garnered significant attention for their vast potential across diverse domains. However, the issue of outdated facts poses a challenge to KGs, affecting their overall quality as real-world information evolves. Existing solutions for outdated fact detection often rely on manual recognition. In response, this paper presents DEAN (Deep outdatEd fAct detectioN), a novel dee… ▽ More Knowledge graphs (KGs) have garnered significant attention for their vast potential across diverse domains. However, the issue of outdated facts poses a challenge to KGs, affecting their overall quality as real-world information evolves. Existing solutions for outdated fact detection often rely on manual recognition. In response, this paper presents DEAN (Deep outdatEd fAct detectioN), a novel deep learning-based framework designed to identify outdated facts within KGs. DEAN distinguishes itself by capturing implicit structural information among facts through comprehensive modeling of both entities and relations. To effectively uncover latent out-of-date information, DEAN employs a contrastive approach based on a pre-defined Relations-to-Nodes (R2N) graph, weighted by the number of entities. Experimental results demonstrate the effectiveness and superiority of DEAN over state-of-the-art baseline methods. △ Less

Submitted 6 February, 2024; originally announced February 2024.

Comments: 10 pages, 6 figures

MSC Class: 68T09; 68T30; 68P20 ACM Class: I.2.6; I.2.4; H.3.7; H.3.3

Journal ref: 2023 IEEE International Conference on Data Mining Workshops (ICDMW), December 1-4, 2023, Shanghai, China

arXiv:2312.09478 [pdf, other]

Entropy Causal Graphs for Multivariate Time Series Anomaly Detection

Authors: Falih Gozi Febrinanto, Kristen Moore, Chandra Thapa, Mujie Liu, Vidya Saikrishna, Jiangang Ma, Feng Xia

Abstract: Many multivariate time series anomaly detection frameworks have been proposed and widely applied. However, most of these frameworks do not consider intrinsic relationships between variables in multivariate time series data, thus ignoring the causal relationship among variables and degrading anomaly detection performance. This work proposes a novel framework called CGAD, an entropy Causal Graph for… ▽ More Many multivariate time series anomaly detection frameworks have been proposed and widely applied. However, most of these frameworks do not consider intrinsic relationships between variables in multivariate time series data, thus ignoring the causal relationship among variables and degrading anomaly detection performance. This work proposes a novel framework called CGAD, an entropy Causal Graph for multivariate time series Anomaly Detection. CGAD utilizes transfer entropy to construct graph structures that unveil the underlying causal relationships among time series data. Weighted graph convolutional networks combined with causal convolutions are employed to model both the causal graph structures and the temporal patterns within multivariate time series data. Furthermore, CGAD applies anomaly scoring, leveraging median absolute deviation-based normalization to improve the robustness of the anomaly identification process. Extensive experiments demonstrate that CGAD outperforms state-of-the-art methods on real-world datasets with a 15% average improvement based on three different multivariate time series anomaly detection metrics. △ Less

Submitted 14 December, 2023; originally announced December 2023.

arXiv:2202.10679 [pdf, other]

Physics-Informed Graph Learning

Authors: Ciyuan Peng, Feng Xia, Vidya Saikrishna, Huan Liu

Abstract: An expeditious development of graph learning in recent years has found innumerable applications in several diversified fields. Of the main associated challenges are the volume and complexity of graph data. The graph learning models suffer from the inability to efficiently learn graph information. In order to indemnify this inefficacy, physics-informed graph learning (PIGL) is emerging. PIGL incorp… ▽ More An expeditious development of graph learning in recent years has found innumerable applications in several diversified fields. Of the main associated challenges are the volume and complexity of graph data. The graph learning models suffer from the inability to efficiently learn graph information. In order to indemnify this inefficacy, physics-informed graph learning (PIGL) is emerging. PIGL incorporates physics rules while performing graph learning, which has enormous benefits. This paper presents a systematic review of PIGL methods. We begin with introducing a unified framework of graph learning models followed by examining existing PIGL methods in relation to the unified framework. We also discuss several future challenges for PIGL. This survey paper is expected to stimulate innovative research and development activities pertaining to PIGL. △ Less

Submitted 20 October, 2022; v1 submitted 22 February, 2022; originally announced February 2022.

Comments: 8 pages, 3 figures

MSC Class: 68T07; 68T30 ACM Class: I.2.6

Journal ref: 2022 IEEE International Conference on Data Mining Workshops (ICDMW)

Showing 1–4 of 4 results for author: Saikrishna, V