Is a Single Model Enough? MuCoS: A Multi-Model Ensemble Learning for Semantic Code Search

Du, Lun; Shi, Xiaozhou; Wang, Yanlin; Shi, Ensheng; Han, Shi; Zhang, Dongmei

arXiv:2107.04773 (cs)

[Submitted on 10 Jul 2021 (v1), last revised 13 Jul 2021 (this version, v2)]

Abstract: Deep learning methods have recently become mainstream in code search because they are better at capturing semantic correlations between code snippets and search queries and show promising performance. However, code snippets carry diverse information along different dimensions, such as business logic, specific algorithms, and hardware communication, so it is hard for a single code representation module to cover all of these perspectives. Likewise, because a specific query may focus on one or several perspectives, it is difficult for a single query representation module to represent different user intents. In this paper, we propose MuCoS, a multi-model ensemble learning architecture for semantic code search. It combines several individual learners, each of which emphasizes a specific perspective of code snippets. We train the individual learners on different datasets, each containing a different perspective of code information, and we obtain these datasets with a data augmentation strategy. We then ensemble the learners to capture comprehensive features of code snippets.
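
The ensemble idea in the abstract can be illustrated with a small sketch. The abstract does not say how MuCoS combines its learners, so everything below is an assumption made for illustration: the learners are toy token-overlap scorers rather than neural models, the snippet fields `names` and `calls` are hypothetical "perspectives", and the combination rule is a plain weighted average of per-learner scores.

```python
from typing import Callable, Dict, List, Optional, Sequence

# Hypothetical representation: each snippet exposes several textual "views".
Snippet = Dict[str, str]
Learner = Callable[[str, Snippet], float]


def token_overlap(query: str, text: str) -> float:
    """Jaccard overlap of lowercase word sets; a toy stand-in for a trained model."""
    q, t = set(query.lower().split()), set(text.lower().split())
    union = q | t
    return len(q & t) / len(union) if union else 0.0


def make_learner(view: str) -> Learner:
    """Build a learner that only looks at one view (perspective) of a snippet."""
    def score(query: str, snippet: Snippet) -> float:
        return token_overlap(query, snippet[view])
    return score


def ensemble_rank(
    query: str,
    snippets: Sequence[Snippet],
    learners: Sequence[Learner],
    weights: Optional[Sequence[float]] = None,
) -> List[int]:
    """Rank snippet indices by the weighted average of the learners' scores.

    Assumption: a weighted average is used here only because it is the
    simplest ensemble rule; the paper may combine learners differently.
    """
    if weights is None:
        weights = [1.0 / len(learners)] * len(learners)
    scored = []
    for idx, snippet in enumerate(snippets):
        combined = sum(w * learn(query, snippet) for w, learn in zip(weights, learners))
        scored.append((combined, idx))
    # Highest combined score first; ties fall back to the original order.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [idx for _, idx in scored]


# Illustrative data, not taken from the paper.
SNIPPETS: List[Snippet] = [
    {"names": "read file lines", "calls": "open readlines"},
    {"names": "sort list numbers", "calls": "sorted"},
]
LEARNERS: List[Learner] = [make_learner("names"), make_learner("calls")]
```

Calling `ensemble_rank("sort numbers", SNIPPETS, LEARNERS)` ranks the sorting snippet ahead of the file-reading one, because the `names` learner finds shared tokens even though the `calls` learner finds none. In a realistic setting each learner would be a trained model and the weights would be tuned on held-out data.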

Comments:	5 pages
Subjects:	Software Engineering (cs.SE); Machine Learning (cs.LG)
Cite as:	arXiv:2107.04773 [cs.SE]
	(or arXiv:2107.04773v2 [cs.SE] for this version)
	https://doi.org/10.48550/arXiv.2107.04773

Submission history

From: Lun Du
[v1] Sat, 10 Jul 2021 06:40:44 UTC (999 KB)
[v2] Tue, 13 Jul 2021 02:42:51 UTC (999 KB)
