NIR-Prompt: A Multi-task Generalized Neural Information Retrieval Training Framework

Xu, Shicheng; Pang, Liang; Shen, Huawei; Cheng, Xueqi


arXiv:2212.00229 (cs)

[Submitted on 1 Dec 2022 (v1), last revised 18 Dec 2023 (this version, v3)]

Abstract: Information retrieval (IR) aims to find information that meets users' needs from a corpus. Different needs correspond to different IR tasks, such as document retrieval, open-domain question answering, and retrieval-based dialogue, yet these tasks share the same schema: estimating the relationship between texts. This suggests that a good IR model should be able to generalize across tasks and domains. However, previous studies indicate that state-of-the-art neural information retrieval (NIR) models, e.g., pre-trained language models (PLMs), are hard to generalize, mainly because the end-to-end fine-tuning paradigm makes the model overemphasize task-specific signals and domain biases and lose the ability to capture generalized essential signals. To address this problem, we propose a novel NIR training framework named NIR-Prompt for the retrieval and reranking stages, based on the idea of decoupling signal capturing and combination. NIR-Prompt uses an Essential Matching Module (EMM) to capture the essential matching signals and obtains task descriptions through a Matching Description Module (MDM). The description serves as task-adaptation information for combining the essential matching signals to suit different tasks. Experiments under in-domain multi-task, out-of-domain multi-task, and new task adaptation settings show that NIR-Prompt can improve the generalization of PLMs in NIR for both retrieval and reranking stages compared with baselines.

Comments:	This article is an extension of arXiv:2204.02725 and has been accepted by TOIS
Subjects:	Information Retrieval (cs.IR)
Cite as:	arXiv:2212.00229 [cs.IR]
	(or arXiv:2212.00229v3 [cs.IR] for this version)
	https://doi.org/10.48550/arXiv.2212.00229

Submission history

From: Shicheng Xu
[v1] Thu, 1 Dec 2022 02:26:52 UTC (2,985 KB)
[v2] Fri, 2 Dec 2022 02:30:19 UTC (2,985 KB)
[v3] Mon, 18 Dec 2023 06:33:57 UTC (4,746 KB)
