-
Generative Relevance Feedback and Convergence of Adaptive Re-Ranking: University of Glasgow Terrier Team at TREC DL 2023
Authors:
Andrew Parry,
Thomas Jaenich,
Sean MacAvaney,
Iadh Ounis
Abstract:
This paper describes our participation in the TREC 2023 Deep Learning Track. We submitted runs that apply generative relevance feedback from a large language model in both a zero-shot and pseudo-relevance feedback setting over two sparse retrieval approaches, namely BM25 and SPLADE. We couple this first stage with adaptive re-ranking over a BM25 corpus graph scored using a monoELECTRA cross-encode…
▽ More
This paper describes our participation in the TREC 2023 Deep Learning Track. We submitted runs that apply generative relevance feedback from a large language model in both a zero-shot and pseudo-relevance feedback setting over two sparse retrieval approaches, namely BM25 and SPLADE. We couple this first stage with adaptive re-ranking over a BM25 corpus graph scored using a monoELECTRA cross-encoder. We investigate the efficacy of these generative approaches for different query types in first-stage retrieval. In re-ranking, we investigate operating points of adaptive re-ranking with different first stages to find the point in graph traversal where the first stage no longer has an effect on the performance of the overall retrieval pipeline. We find some performance gains from the application of generative query reformulation. However, our strongest run in terms of P@10 and nDCG@10 applied both adaptive re-ranking and generative pseudo-relevance feedback, namely uogtr_b_grf_e_gb.
△ Less
Submitted 2 May, 2024;
originally announced May 2024.
-
Language Fairness in Multilingual Information Retrieval
Authors:
Eugene Yang,
Thomas Jänich,
James Mayfield,
Dawn Lawrie
Abstract:
Multilingual information retrieval (MLIR) considers the problem of ranking documents in several languages for a query expressed in a language that may differ from any of those languages. Recent work has observed that approaches such as combining ranked lists representing a single document language each or using multilingual pretrained language models demonstrate a preference for one language over…
▽ More
Multilingual information retrieval (MLIR) considers the problem of ranking documents in several languages for a query expressed in a language that may differ from any of those languages. Recent work has observed that approaches such as combining ranked lists representing a single document language each or using multilingual pretrained language models demonstrate a preference for one language over others. This results in systematic unfair treatment of documents in different languages. This work proposes a language fairness metric to evaluate whether documents across different languages are fairly ranked through statistical equivalence testing using the Kruskal-Wallis test. In contrast to most prior work in group fairness, we do not consider any language to be an unprotected group. Thus our proposed measure, PEER (Probability of EqualExpected Rank), is the first fairness metric specifically designed to capture the language fairness of MLIR systems. We demonstrate the behavior of PEER on artificial ranked lists. We also evaluate real MLIR systems on two publicly available benchmarks and show that the PEER scores align with prior analytical findings on MLIR fairness. Our implementation is compatible with ir-measures and is available at http://github.com/hltcoe/peer_measure.
△ Less
Submitted 1 May, 2024;
originally announced May 2024.
-
Query Exposure Prediction for Groups of Documents in Rankings
Authors:
Thomas Jaenich,
Graham McDonald,
Iadh Ounis
Abstract:
The main objective of an Information Retrieval system is to provide a user with the most relevant documents to the user's query. To do this, modern IR systems typically deploy a re-ranking pipeline in which a set of documents is retrieved by a lightweight first-stage retrieval process and then re-ranked by a more effective but expensive model. However, the success of a re-ranking pipeline is heavi…
▽ More
The main objective of an Information Retrieval system is to provide a user with the most relevant documents to the user's query. To do this, modern IR systems typically deploy a re-ranking pipeline in which a set of documents is retrieved by a lightweight first-stage retrieval process and then re-ranked by a more effective but expensive model. However, the success of a re-ranking pipeline is heavily dependent on the performance of the first stage retrieval, since new documents are not usually identified during the re-ranking stage. Moreover, this can impact the amount of exposure that a particular group of documents, such as documents from a particular demographic group, can receive in the final ranking. For example, the fair allocation of exposure becomes more challenging or impossible if the first stage retrieval returns too few documents from certain groups, since the number of group documents in the ranking affects the exposure more than the documents' positions. With this in mind, it is beneficial to predict the amount of exposure that a group of documents is likely to receive in the results of the first stage retrieval process, in order to ensure that there are a sufficient number of documents included from each of the groups. In this paper, we introduce the novel task of query exposure prediction (QEP). Specifically, we propose the first approach for predicting the distribution of exposure that groups of documents will receive for a given query. Our new approach, called GEP, uses lexical information from individual groups of documents to estimate the exposure the groups will receive in a ranking. Our experiments on the TREC 2021 and 2022 Fair Ranking Track test collections show that our proposed GEP approach results in exposure predictions that are up to 40 % more accurate than the predictions of adapted existing query performance prediction and resource allocation approaches.
△ Less
Submitted 24 January, 2024;
originally announced January 2024.