Showing 1–50 of 217 results for author: de Rijke, M

  1. arXiv:2406.19999  [pdf, other]

    cs.CL

    The SIFo Benchmark: Investigating the Sequential Instruction Following Ability of Large Language Models

    Authors: Xinyi Chen, Baohao Liao, Jirui Qi, Panagiotis Eustratiadis, Christof Monz, Arianna Bisazza, Maarten de Rijke

    Abstract: Following multiple instructions is a crucial ability for large language models (LLMs). Evaluating this ability comes with significant challenges: (i) limited coherence between multiple instructions, (ii) positional bias where the order of instructions affects model performance, and (iii) a lack of objectively verifiable tasks. To address these issues, we introduce a benchmark designed to evaluate…

    Submitted 28 June, 2024; originally announced June 2024.

  2. arXiv:2406.08891  [pdf, ps, other]

    cs.IR

    Robust Information Retrieval

    Authors: Yu-An Liu, Ruqing Zhang, Jiafeng Guo, Maarten de Rijke

    Abstract: Beyond effectiveness, the robustness of an information retrieval (IR) system is increasingly attracting attention. When deployed, a critical technology such as IR should not only deliver strong performance on average but also have the ability to handle a variety of exceptional situations. In recent years, research into the robustness of IR has seen significant growth, with numerous researchers off…

    Submitted 13 June, 2024; originally announced June 2024.

    Comments: accepted by SIGIR2024 Tutorial

  3. Let Me Do It For You: Towards LLM Empowered Recommendation via Tool Learning

    Authors: Yuyue Zhao, Jiancan Wu, Xiang Wang, Wei Tang, Dingxian Wang, Maarten de Rijke

    Abstract: Conventional recommender systems (RSs) face challenges in precisely capturing users' fine-grained preferences. Large language models (LLMs) have shown capabilities in commonsense reasoning and leveraging external tools that may help address these challenges. However, existing LLM-based RSs suffer from hallucinations, misalignment between the semantic space of items and the behavior space of users,…

    Submitted 23 May, 2024; originally announced May 2024.

  4. arXiv:2405.05736  [pdf, other]

    cs.LG cs.IR

    Optimal Baseline Corrections for Off-Policy Contextual Bandits

    Authors: Shashank Gupta, Olivier Jeunen, Harrie Oosterhuis, Maarten de Rijke

    Abstract: The off-policy learning paradigm allows for recommender systems and general ranking applications to be framed as decision-making problems, where we aim to learn decision policies that optimize an unbiased offline estimate of an online reward metric. With unbiasedness comes potentially high variance, and prevalent methods exist to reduce estimation variance. These methods typically make use of cont…

    Submitted 9 May, 2024; originally announced May 2024.

  5. arXiv:2405.05109  [pdf, other]

    cs.CL cs.AI

    QFMTS: Generating Query-Focused Summaries over Multi-Table Inputs

    Authors: Weijia Zhang, Vaishali Pal, Jia-Hong Huang, Evangelos Kanoulas, Maarten de Rijke

    Abstract: Table summarization is a crucial task aimed at condensing information from tabular data into concise and comprehensible textual summaries. However, existing approaches often fall short of adequately meeting users' information and quality requirements and tend to overlook the complexities of real-world queries. In this paper, we propose a novel method to address these limitations by introducing que…

    Submitted 8 May, 2024; originally announced May 2024.

    Comments: 16 pages, 3 figures

  6. Are We Really Achieving Better Beyond-Accuracy Performance in Next Basket Recommendation?

    Authors: Ming Li, Yuanna Liu, Sami Jullien, Mozhdeh Ariannezhad, Mohammad Aliannejadi, Andrew Yates, Maarten de Rijke

    Abstract: Next basket recommendation (NBR) is a special type of sequential recommendation that is increasingly receiving attention. So far, most NBR studies have focused on optimizing the accuracy of the recommendation, whereas optimizing for beyond-accuracy metrics, e.g., item fairness and diversity, remains largely unexplored. Recent studies into NBR have found a substantial performance difference between…

    Submitted 2 May, 2024; originally announced May 2024.

    Comments: To appear at SIGIR'24

  7. arXiv:2405.00554  [pdf, other]

    cs.IR

    A First Look at Selection Bias in Preference Elicitation for Recommendation

    Authors: Shashank Gupta, Harrie Oosterhuis, Maarten de Rijke

    Abstract: Preference elicitation explicitly asks users what kind of recommendations they would like to receive. It is a popular technique for conversational recommender systems to deal with cold-starts. Previous work has studied selection bias in implicit feedback, e.g., clicks, and in some forms of explicit feedback, i.e., ratings on items. Despite the fact that the extreme sparsity of preference elicitati…

    Submitted 1 May, 2024; originally announced May 2024.

    Comments: Accepted at the CONSEQUENCES'23 workshop at RecSys '23

  8. Going Beyond Popularity and Positivity Bias: Correcting for Multifactorial Bias in Recommender Systems

    Authors: Jin Huang, Harrie Oosterhuis, Masoud Mansoury, Herke van Hoof, Maarten de Rijke

    Abstract: Two typical forms of bias in user interaction data with recommender systems (RSs) are popularity bias and positivity bias, which manifest themselves as the over-representation of interactions with popular items or items that users prefer, respectively. Debiasing methods aim to mitigate the effect of selection bias on the evaluation and optimization of RSs. However, existing debiasing methods only…

    Submitted 29 April, 2024; originally announced April 2024.

    Comments: SIGIR 2024

  9. arXiv:2404.18185  [pdf, other]

    cs.IR cs.AI cs.CL cs.LG

    Ranked List Truncation for Large Language Model-based Re-Ranking

    Authors: Chuan Meng, Negar Arabzadeh, Arian Askari, Mohammad Aliannejadi, Maarten de Rijke

    Abstract: We study ranked list truncation (RLT) from a novel "retrieve-then-re-rank" perspective, where we optimize re-ranking by truncating the retrieved list (i.e., trim re-ranking candidates). RLT is crucial for re-ranking as it can improve re-ranking efficiency by sending variable-length candidate lists to a re-ranker on a per-query basis. It also has the potential to improve re-ranking effectiveness. D…

    Submitted 28 April, 2024; originally announced April 2024.

    Comments: Accepted for publication as a long paper at SIGIR 2024

    ACM Class: H.3.3

  10. arXiv:2404.17591  [pdf, other]

    cs.IR cs.AI cs.LG

    Large Language Models for Next Point-of-Interest Recommendation

    Authors: Peibo Li, Maarten de Rijke, Hao Xue, Shuang Ao, Yang Song, Flora D. Salim

    Abstract: The next Point of Interest (POI) recommendation task is to predict users' immediate next POI visit given their historical data. Location-Based Social Network (LBSN) data, which is often used for the next POI recommendation task, comes with challenges. One frequently disregarded challenge is how to effectively use the abundant contextual information present in LBSN data. Previous methods are limite…

    Submitted 19 April, 2024; originally announced April 2024.

  11. arXiv:2404.17288  [pdf, other]

    cs.IR

    ExcluIR: Exclusionary Neural Information Retrieval

    Authors: Wenhao Zhang, Mengqi Zhang, Shiguang Wu, Jiahuan Pei, Zhaochun Ren, Maarten de Rijke, Zhumin Chen, Pengjie Ren

    Abstract: Exclusion is an important and universal linguistic skill that humans use to express what they do not want. However, in the information retrieval community, there is little research on exclusionary retrieval, where users express what they do not want in their queries. In this work, we investigate the scenario of exclusionary retrieval in document retrieval for the first time. We present ExcluIR, a set…

    Submitted 26 April, 2024; originally announced April 2024.

  12. Rethinking the Evaluation of Dialogue Systems: Effects of User Feedback on Crowdworkers and LLMs

    Authors: Clemencia Siro, Mohammad Aliannejadi, Maarten de Rijke

    Abstract: In ad-hoc retrieval, evaluation relies heavily on user actions, including implicit feedback. In a conversational setting such signals are usually unavailable due to the nature of the interactions, and, instead, the evaluation often relies on crowdsourced evaluation labels. The role of user feedback in annotators' assessment of turns in a conversational perception has been little studied. We focus…

    Submitted 29 April, 2024; v1 submitted 19 April, 2024; originally announced April 2024.

    Comments: Accepted at SIGIR 2024 long paper track

  13. Estimating the Hessian Matrix of Ranking Objectives for Stochastic Learning to Rank with Gradient Boosted Trees

    Authors: Jingwei Kang, Maarten de Rijke, Harrie Oosterhuis

    Abstract: Stochastic learning to rank (LTR) is a recent branch in the LTR field that concerns the optimization of probabilistic ranking models. Their probabilistic behavior enables certain ranking qualities that are impossible with deterministic models. For example, they can increase the diversity of displayed documents, increase fairness of exposure over documents, and better balance exploitation and explo…

    Submitted 9 May, 2024; v1 submitted 18 April, 2024; originally announced April 2024.

    Comments: SIGIR2024 conference Short paper track

  14. arXiv:2404.09980  [pdf, other]

    cs.CL cs.HC cs.IR

    Context Does Matter: Implications for Crowdsourced Evaluation Labels in Task-Oriented Dialogue Systems

    Authors: Clemencia Siro, Mohammad Aliannejadi, Maarten de Rijke

    Abstract: Crowdsourced labels play a crucial role in evaluating task-oriented dialogue systems (TDSs). Obtaining high-quality and consistent ground-truth labels from annotators presents challenges. When evaluating a TDS, annotators must fully comprehend the dialogue before providing judgments. Previous studies suggest using only a portion of the dialogue context in the annotation process. However, the impac…

    Submitted 15 April, 2024; originally announced April 2024.

    Comments: Accepted at NAACL 2024 Findings

  15. Unbiased Learning to Rank Meets Reality: Lessons from Baidu's Large-Scale Search Dataset

    Authors: Philipp Hager, Romain Deffayet, Jean-Michel Renders, Onno Zoeter, Maarten de Rijke

    Abstract: Unbiased learning-to-rank (ULTR) is a well-established framework for learning from user clicks, which are often biased by the ranker collecting the data. While theoretically justified and extensively tested in simulation, ULTR techniques lack empirical validation, especially on modern search engines. The Baidu-ULTR dataset released for the WSDM Cup 2023, collected from Baidu's search engine, offer…

    Submitted 15 May, 2024; v1 submitted 3 April, 2024; originally announced April 2024.

  16. arXiv:2404.01574  [pdf, other]

    cs.IR cs.CR cs.LG

    Multi-granular Adversarial Attacks against Black-box Neural Ranking Models

    Authors: Yu-An Liu, Ruqing Zhang, Jiafeng Guo, Maarten de Rijke, Yixing Fan, Xueqi Cheng

    Abstract: Adversarial ranking attacks have gained increasing attention due to their success in probing vulnerabilities, and, hence, enhancing the robustness, of neural ranking models. Conventional attack methods employ perturbations at a single granularity, e.g., word or sentence level, to target documents. However, limiting perturbations to a single level of granularity may reduce the flexibility of advers…

    Submitted 10 April, 2024; v1 submitted 1 April, 2024; originally announced April 2024.

    Comments: Accepted by SIGIR2024

  17. arXiv:2404.01012  [pdf, other]

    cs.IR cs.AI cs.CL cs.LG

    Query Performance Prediction using Relevance Judgments Generated by Large Language Models

    Authors: Chuan Meng, Negar Arabzadeh, Arian Askari, Mohammad Aliannejadi, Maarten de Rijke

    Abstract: Query performance prediction (QPP) aims to estimate the retrieval quality of a search system for a query without human relevance judgments. Previous QPP methods typically return a single scalar value and do not require the predicted values to approximate a specific information retrieval (IR) evaluation measure, leading to certain drawbacks: (i) a single scalar is insufficient to accurately represe…

    Submitted 17 June, 2024; v1 submitted 1 April, 2024; originally announced April 2024.

    ACM Class: H.3.3

  18. arXiv:2404.00684  [pdf, other]

    cs.IR cs.AI

    Generative Retrieval as Multi-Vector Dense Retrieval

    Authors: Shiguang Wu, Wenda Wei, Mengqi Zhang, Zhumin Chen, Jun Ma, Zhaochun Ren, Maarten de Rijke, Pengjie Ren

    Abstract: Generative retrieval generates identifiers of relevant documents in an end-to-end manner using a sequence-to-sequence architecture for a given query. The relation between generative retrieval and other retrieval methods, especially those based on matching within dense retrieval models, is not yet fully comprehended. Prior work has demonstrated that generative retrieval with atomic identifiers is e…

    Submitted 31 March, 2024; originally announced April 2024.

    Comments: 12 pages, 5 figures, 8 tables, accepted at SIGIR 2024

  19. arXiv:2403.19216  [pdf, other]

    cs.IR

    Are Large Language Models Good at Utility Judgments?

    Authors: Hengran Zhang, Ruqing Zhang, Jiafeng Guo, Maarten de Rijke, Yixing Fan, Xueqi Cheng

    Abstract: Retrieval-augmented generation (RAG) is considered to be a promising approach to alleviate the hallucination issue of large language models (LLMs), and it has received widespread attention from researchers recently. Due to the limitation in the semantic understanding of retrieval models, the success of RAG relies heavily on the ability of LLMs to identify passages with utility. Recent efforts have e…

    Submitted 8 June, 2024; v1 submitted 28 March, 2024; originally announced March 2024.

    Comments: Accepted by SIGIR2024

  20. arXiv:2403.19056  [pdf, other]

    cs.CL

    CAUSE: Counterfactual Assessment of User Satisfaction Estimation in Task-Oriented Dialogue Systems

    Authors: Amin Abolghasemi, Zhaochun Ren, Arian Askari, Mohammad Aliannejadi, Maarten de Rijke, Suzan Verberne

    Abstract: An important unexplored aspect in previous work on user satisfaction estimation for Task-Oriented Dialogue (TOD) systems is their evaluation in terms of robustness for the identification of user dissatisfaction: current benchmarks for user satisfaction estimation in TOD systems are highly skewed towards dialogues for which the user is satisfied. The effect of having a more balanced set of satisfac…

    Submitted 27 March, 2024; originally announced March 2024.

  21. arXiv:2403.16085  [pdf, other]

    cs.IR

    RankingSHAP -- Listwise Feature Attribution Explanations for Ranking Models

    Authors: Maria Heuss, Maarten de Rijke, Avishek Anand

    Abstract: Feature attributions are a commonly used explanation type, when we want to posthoc explain the prediction of a trained model. Yet, they are not very well explored in IR. Importantly, feature attribution has rarely been rigorously defined, beyond attributing the most important feature the highest value. What it means for a feature to be more important than others is often left vague. Consequently,…

    Submitted 24 March, 2024; originally announced March 2024.

  22. arXiv:2403.12499  [pdf, other]

    cs.IR

    Listwise Generative Retrieval Models via a Sequential Learning Process

    Authors: Yubao Tang, Ruqing Zhang, Jiafeng Guo, Maarten de Rijke, Wei Chen, Xueqi Cheng

    Abstract: Recently, a novel generative retrieval (GR) paradigm has been proposed, where a single sequence-to-sequence model is learned to directly generate a list of relevant document identifiers (docids) given a query. Existing GR models commonly employ maximum likelihood estimation (MLE) for optimization: this involves maximizing the likelihood of a single relevant docid given an input query, with the ass…

    Submitted 19 March, 2024; originally announced March 2024.

    Comments: Accepted by ACM Transactions on Information Systems

  23. arXiv:2403.05975  [pdf, other]

    cs.CL

    Measuring Bias in a Ranked List using Term-based Representations

    Authors: Amin Abolghasemi, Leif Azzopardi, Arian Askari, Maarten de Rijke, Suzan Verberne

    Abstract: In most recent studies, gender bias in document ranking is evaluated with the NFaiRR metric, which measures bias in a ranked list based on an aggregation over the unbiasedness scores of each ranked document. This perspective in measuring the bias of a ranked list has a key limitation: individual documents of a ranked list might be biased while the ranked list as a whole balances the groups' repres…

    Submitted 9 March, 2024; originally announced March 2024.

    Comments: Accepted at the 46th European Conference on Information Retrieval (ECIR 2024)

  24. arXiv:2402.17535  [pdf, other]

    cs.IR cs.CV

    Multimodal Learned Sparse Retrieval with Probabilistic Expansion Control

    Authors: Thong Nguyen, Mariya Hendriksen, Andrew Yates, Maarten de Rijke

    Abstract: Learned sparse retrieval (LSR) is a family of neural methods that encode queries and documents into sparse lexical vectors that can be indexed and retrieved efficiently with an inverted index. We explore the application of LSR to the multi-modal domain, with a focus on text-image retrieval. While LSR has seen success in text retrieval, its application in multimodal retrieval remains underexplored.…

    Submitted 27 February, 2024; originally announced February 2024.

    Comments: 17 pages, accepted as a full paper at ECIR 2024

  25. arXiv:2402.17510  [pdf, other]

    cs.CV cs.AI

    Demonstrating and Reducing Shortcuts in Vision-Language Representation Learning

    Authors: Maurits Bleeker, Mariya Hendriksen, Andrew Yates, Maarten de Rijke

    Abstract: Vision-language models (VLMs) mainly rely on contrastive training to learn general-purpose representations of images and captions. We focus on the situation when one image is associated with several captions, each caption containing both information shared among all captions and unique information per caption about the scene depicted in the image. In such cases, it is unclear whether contrastive l…

    Submitted 27 February, 2024; originally announced February 2024.

    Comments: 25 pages

  26. arXiv:2402.17263  [pdf, other]

    cs.CL

    MELoRA: Mini-Ensemble Low-Rank Adapters for Parameter-Efficient Fine-Tuning

    Authors: Pengjie Ren, Chengshun Shi, Shiguang Wu, Mengqi Zhang, Zhaochun Ren, Maarten de Rijke, Zhumin Chen, Jiahuan Pei

    Abstract: Parameter-efficient fine-tuning (PEFT) is a popular method for tailoring pre-trained large language models (LLMs), especially as the models' scale and the diversity of tasks increase. Low-rank adaptation (LoRA) is based on the idea that the adaptation process is intrinsically low-dimensional, i.e., significant model changes can be represented with relatively few parameters. However, decreasing the…

    Submitted 24 June, 2024; v1 submitted 27 February, 2024; originally announced February 2024.

    Comments: ACL2024

    MSC Class: 68T50 ACM Class: I.2.7

  27. arXiv:2402.16767  [pdf, other]

    cs.IR cs.CL

    CorpusBrain++: A Continual Generative Pre-Training Framework for Knowledge-Intensive Language Tasks

    Authors: Jiafeng Guo, Changjiang Zhou, Ruqing Zhang, Jiangui Chen, Maarten de Rijke, Yixing Fan, Xueqi Cheng

    Abstract: Knowledge-intensive language tasks (KILTs) typically require retrieving relevant documents from trustworthy corpora, e.g., Wikipedia, to produce specific answers. Very recently, a pre-trained generative retrieval model for KILTs, named CorpusBrain, was proposed and reached new state-of-the-art retrieval performance. However, most existing research on KILTs, including CorpusBrain, has predominantly…

    Submitted 26 February, 2024; originally announced February 2024.

    Comments: Submitted to ACM Transactions on Information Systems

  28. arXiv:2402.15708  [pdf, other]

    cs.CL cs.AI cs.IR

    Query Augmentation by Decoding Semantics from Brain Signals

    Authors: Ziyi Ye, Jingtao Zhan, Qingyao Ai, Yiqun Liu, Maarten de Rijke, Christina Lioma, Tuukka Ruotsalo

    Abstract: Query augmentation is a crucial technique for refining semantically imprecise queries. Traditionally, query augmentation relies on extracting information from initially retrieved, potentially relevant documents. If the quality of the initially retrieved documents is low, then the effectiveness of query augmentation would be limited as well. We propose Brain-Aug, which enhances a query by incorpora…

    Submitted 3 March, 2024; v1 submitted 23 February, 2024; originally announced February 2024.

  29. arXiv:2402.11176  [pdf, other]

    cs.CL cs.AI

    KnowTuning: Knowledge-aware Fine-tuning for Large Language Models

    Authors: Yougang Lyu, Lingyong Yan, Shuaiqiang Wang, Haibo Shi, Dawei Yin, Pengjie Ren, Zhumin Chen, Maarten de Rijke, Zhaochun Ren

    Abstract: Despite their success at many natural language processing (NLP) tasks, large language models still struggle to effectively leverage knowledge for knowledge-intensive tasks, manifesting limitations such as generating incomplete, non-factual, or illogical answers. These limitations stem from inadequate knowledge awareness of LLMs during vanilla fine-tuning. To address these problems, we propose a kn…

    Submitted 17 April, 2024; v1 submitted 16 February, 2024; originally announced February 2024.

  30. arXiv:2402.07742  [pdf, other]

    cs.CL cs.CV

    Asking Multimodal Clarifying Questions in Mixed-Initiative Conversational Search

    Authors: Yifei Yuan, Clemencia Siro, Mohammad Aliannejadi, Maarten de Rijke, Wai Lam

    Abstract: In mixed-initiative conversational search systems, clarifying questions are used to help users who struggle to express their intentions in a single query. These questions aim to uncover user's information needs and resolve query ambiguities. We hypothesize that in scenarios where multimodal information is pertinent, the clarification process can be improved by using non-textual information. Theref…

    Submitted 12 February, 2024; originally announced February 2024.

    Comments: Accepted to WWW24

  31. arXiv:2312.10329  [pdf, other]

    cs.IR cs.CL

    Perturbation-Invariant Adversarial Training for Neural Ranking Models: Improving the Effectiveness-Robustness Trade-Off

    Authors: Yu-An Liu, Ruqing Zhang, Mingkun Zhang, Wei Chen, Maarten de Rijke, Jiafeng Guo, Xueqi Cheng

    Abstract: Neural ranking models (NRMs) have shown great success in information retrieval (IR). But their predictions can easily be manipulated using adversarial examples, which are crafted by adding imperceptible perturbations to legitimate documents. This vulnerability raises significant concerns about their reliability and hinders the widespread deployment of NRMs. By incorporating adversarial examples in…

    Submitted 16 December, 2023; originally announced December 2023.

    Comments: Accepted by AAAI 24

  32. arXiv:2311.16586  [pdf, other]

    cs.IR

    SARDINE: A Simulator for Automated Recommendation in Dynamic and Interactive Environments

    Authors: Romain Deffayet, Thibaut Thonet, Dongyoon Hwang, Vassilissa Lehoux, Jean-Michel Renders, Maarten de Rijke

    Abstract: Simulators can provide valuable insights for researchers and practitioners who wish to improve recommender systems, because they allow one to easily tweak the experimental setup in which recommender systems operate, and as a result lower the cost of identifying general trends and uncovering novel findings about the candidate methods. A key requirement to enable this accelerated improvement cycle i…

    Submitted 8 April, 2024; v1 submitted 28 November, 2023; originally announced November 2023.

  33. arXiv:2311.09889  [pdf, other]

    cs.CL

    Language Generation from Brain Recordings

    Authors: Ziyi Ye, Qingyao Ai, Yiqun Liu, Maarten de Rijke, Min Zhang, Christina Lioma, Tuukka Ruotsalo

    Abstract: Generating human language through non-invasive brain-computer interfaces (BCIs) has the potential to unlock many applications, such as serving disabled patients and improving communication. So far, however, generating language via BCIs has been successful only within a classification setup for selecting pre-generated sentence continuation candidates with the most likely cortical sema…

    Submitted 11 March, 2024; v1 submitted 16 November, 2023; originally announced November 2023.

    Comments: Preprint. Under Submission

  34. arXiv:2311.02446  [pdf, other]

    cs.IR

    Learning Robust Sequential Recommenders through Confident Soft Labels

    Authors: Shiguang Wu, Xin Xin, Pengjie Ren, Zhumin Chen, Jun Ma, Maarten de Rijke, Zhaochun Ren

    Abstract: Sequential recommenders that are trained on implicit feedback are usually learned as a multi-class classification task through softmax-based loss functions on one-hot class labels. However, one-hot training labels are sparse and may lead to biased training and sub-optimal performance. Dense, soft labels have been shown to help improve recommendation performance. But how to generate high-quality an…

    Submitted 4 November, 2023; originally announced November 2023.

  35. arXiv:2310.12809  [pdf, other]

    cs.LG

    Hierarchical Forecasting at Scale

    Authors: Olivier Sprangers, Wander Wadman, Sebastian Schelter, Maarten de Rijke

    Abstract: Existing hierarchical forecasting techniques scale poorly when the number of time series increases. We propose to learn a coherent forecast for millions of time series with a single bottom-level forecast model by using a sparse loss function that directly optimizes the hierarchical product and/or temporal structure. The benefit of our sparse hierarchical loss function is that it provides practitio…

    Submitted 26 February, 2024; v1 submitted 19 October, 2023; originally announced October 2023.

  36. arXiv:2310.11675  [pdf, other]

    cs.IR

    From Relevance to Utility: Evidence Retrieval with Feedback for Fact Verification

    Authors: Hengran Zhang, Ruqing Zhang, Jiafeng Guo, Maarten de Rijke, Yixing Fan, Xueqi Cheng

    Abstract: Retrieval-enhanced methods have become a primary approach in fact verification (FV); it requires reasoning over multiple retrieved pieces of evidence to verify the integrity of a claim. To retrieve evidence, existing work often employs off-the-shelf retrieval models whose design is based on the probability ranking principle. We argue that, rather than relevance, for FV we need to focus on the util…

    Submitted 19 October, 2023; v1 submitted 17 October, 2023; originally announced October 2023.

    Comments: Accepted by EMNLP 2023 Findings

  37. arXiv:2310.09846  [pdf, other]

    cs.IR

    Generalizing Few-Shot Named Entity Recognizers to Unseen Domains with Type-Related Features

    Authors: Zihan Wang, Ziqi Zhao, Zhumin Chen, Pengjie Ren, Maarten de Rijke, Zhaochun Ren

    Abstract: Few-shot named entity recognition (NER) has shown remarkable progress in identifying entities in low-resource domains. However, few-shot NER methods still struggle with out-of-domain (OOD) examples due to their reliance on manual labeling for the target domain. To address this limitation, recent studies enable generalization to an unseen target domain with only a few labeled examples using data au…

    Submitted 15 October, 2023; originally announced October 2023.

    Comments: Accepted at EMNLP findings

  38. Predictive Uncertainty-based Bias Mitigation in Ranking

    Authors: Maria Heuss, Daniel Cohen, Masoud Mansoury, Maarten de Rijke, Carsten Eickhoff

    Abstract: Societal biases that are contained in retrieved documents have received increased interest. Such biases, which are often prevalent in the training data and learned by the model, can cause societal harms, by misrepresenting certain groups, and by enforcing stereotypes. Mitigating such biases demands algorithms that balance the trade-off between maximized utility for the user with fairness objective…

    Submitted 18 September, 2023; originally announced September 2023.

    Journal ref: CIKM 2023: 32nd ACM International Conference on Information and Knowledge Management

  39. Continual Learning for Generative Retrieval over Dynamic Corpora

    Authors: Jiangui Chen, Ruqing Zhang, Jiafeng Guo, Maarten de Rijke, Wei Chen, Yixing Fan, Xueqi Cheng

    Abstract: Generative retrieval (GR) directly predicts the identifiers of relevant documents (i.e., docids) based on a parametric model. It has achieved solid performance on many ad-hoc retrieval tasks. So far, these tasks have assumed a static document collection. In many practical scenarios, however, document collections are dynamic, where new documents are continuously added to the corpus. The ability to…

    Submitted 28 August, 2023; originally announced August 2023.

    Comments: Accepted by CIKM 2023

  40. arXiv:2308.09861  [pdf, other]

    cs.IR cs.CL

    Black-box Adversarial Attacks against Dense Retrieval Models: A Multi-view Contrastive Learning Method

    Authors: Yu-An Liu, Ruqing Zhang, Jiafeng Guo, Maarten de Rijke, Wei Chen, Yixing Fan, Xueqi Cheng

    Abstract: Neural ranking models (NRMs) and dense retrieval (DR) models have given rise to substantial improvements in overall retrieval performance. In addition to their effectiveness, and motivated by the proven lack of robustness of deep learning-based approaches in other areas, there is growing interest in the robustness of deep learning-based approaches to the core retrieval problem. Adversarial attack…

    Submitted 18 August, 2023; originally announced August 2023.

    Comments: Accepted by CIKM2023, 10 pages

  41. arXiv:2308.02887  [pdf, other]

    cs.IR

    The Impact of Group Membership Bias on the Quality and Fairness of Exposure in Ranking

    Authors: Ali Vardasbi, Maarten de Rijke, Fernando Diaz, Mostafa Dehghani

    Abstract: When learning to rank from user interactions, search and recommender systems must address biases in user behavior to provide a high-quality ranking. One type of bias that has recently been studied in the ranking literature is when sensitive attributes, such as gender, have an impact on a user's judgment about an item's utility. For example, in a search for an expertise area, some users may be bias…

    Submitted 29 April, 2024; v1 submitted 5 August, 2023; originally announced August 2023.

  42. Masked and Swapped Sequence Modeling for Next Novel Basket Recommendation in Grocery Shopping

    Authors: Ming Li, Mozhdeh Ariannezhad, Andrew Yates, Maarten de Rijke

    Abstract: Next basket recommendation (NBR) is the task of predicting the next set of items based on a sequence of already purchased baskets. It is a recommendation task that has been widely studied, especially in the context of grocery shopping. In next basket recommendation (NBR), it is useful to distinguish between repeat items, i.e., items that a user has consumed before, and explore items, i.e., items t…

    Submitted 2 August, 2023; originally announced August 2023.

    Comments: To appear at RecSys'23

  43. arXiv:2307.03897  [pdf, other]

    cs.CL

    Answering Ambiguous Questions via Iterative Prompting

    Authors: Weiwei Sun, Hengyi Cai, Hongshen Chen, Pengjie Ren, Zhumin Chen, Maarten de Rijke, Zhaochun Ren

    Abstract: In open-domain question answering, due to the ambiguity of questions, multiple plausible answers may exist. To provide feasible answers to an ambiguous question, one approach is to directly predict all valid answers, but this can struggle with balancing relevance and diversity. An alternative is to gather candidate answers and aggregate them, but this method can be computationally costly and may n…

    Submitted 8 July, 2023; originally announced July 2023.

    Comments: To be published in ACL 2023

  44. arXiv:2306.08947  [pdf, other]

    cs.IR cs.LG

    RecFusion: A Binomial Diffusion Process for 1D Data for Recommendation

    Authors: Gabriel Bénédict, Olivier Jeunen, Samuele Papa, Samarth Bhargav, Daan Odijk, Maarten de Rijke

    Abstract: In this paper we propose RecFusion, which comprises a set of diffusion models for recommendation. Unlike image data, which contain spatial correlations, a user-item interaction matrix, commonly utilized in recommendation, lacks spatial relationships between users and items. We formulate diffusion on a 1D vector and propose binomial diffusion, which explicitly models binary user-item interactions wit…

    Submitted 7 September, 2023; v1 submitted 15 June, 2023; originally announced June 2023.

    Comments: code: https://github.com/gabriben/recfusion

  45. arXiv:2305.16877  [pdf, other]

    cs.LG cs.AI

    Distributional Reinforcement Learning with Dual Expectile-Quantile Regression

    Authors: Sami Jullien, Romain Deffayet, Jean-Michel Renders, Paul Groth, Maarten de Rijke

    Abstract: Distributional reinforcement learning (RL) has proven useful in multiple benchmarks as it enables approximating the full distribution of returns and makes better use of environment samples. The commonly used quantile regression approach to distributional RL -- based on asymmetric $L_1$ losses -- provides a flexible and effective way of learning arbitrary return distributions. In practice, it is…

    Submitted 18 March, 2024; v1 submitted 26 May, 2023; originally announced May 2023.

    Comments: 16 pages, 3 figures, 1 algorithm

    ACM Class: I.2.8; G.3

  46. MultiTabQA: Generating Tabular Answers for Multi-Table Question Answering

    Authors: Vaishali Pal, Andrew Yates, Evangelos Kanoulas, Maarten de Rijke

    Abstract: Recent advances in tabular question answering (QA) with large language models are constrained in their coverage and only answer questions over a single table. However, real-world queries are complex in nature, often over multiple tables in a relational database or web page. Single table questions do not involve common table operations such as set operations, Cartesian products (joins), or nested q…

    Submitted 24 May, 2023; v1 submitted 22 May, 2023; originally announced May 2023.

    Comments: Accepted at ACL-2023

    Journal ref: In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023, pages 6322-6334, Toronto, Canada. Association for Computational Linguistics

  47. arXiv:2305.10923  [pdf, other]

    cs.IR cs.CL cs.LG

    Query Performance Prediction: From Ad-hoc to Conversational Search

    Authors: Chuan Meng, Negar Arabzadeh, Mohammad Aliannejadi, Maarten de Rijke

    Abstract: Query performance prediction (QPP) is a core task in information retrieval. The QPP task is to predict the retrieval quality of a search system for a query without relevance judgments. Research has shown the effectiveness and usefulness of QPP for ad-hoc search. Recent years have witnessed considerable progress in conversational search (CS). Effective QPP could help a CS system to decide an approp…

    Submitted 18 May, 2023; originally announced May 2023.

    Comments: Accepted for publication at SIGIR 2023

    ACM Class: H.3.3

  48. arXiv:2305.10531  [pdf, other]

    cs.IR

    Iteratively Learning Representations for Unseen Entities with Inter-Rule Correlations

    Authors: Zihan Wang, Kai Zhao, Yongquan He, Zhumin Chen, Pengjie Ren, Maarten de Rijke, Zhaochun Ren

    Abstract: Recent work on knowledge graph completion (KGC) focused on learning embeddings of entities and relations in knowledge graphs. These embedding methods require that all test entities are observed at training time, resulting in a time-consuming retraining process for out-of-knowledge-graph (OOKG) entities. To address this issue, current inductive knowledge embedding methods employ graph neural networ…

    Submitted 4 September, 2023; v1 submitted 17 May, 2023; originally announced May 2023.

    Comments: Accepted at CIKM 2023

  49. Improving Implicit Feedback-Based Recommendation through Multi-Behavior Alignment

    Authors: Xin Xin, Xiangyuan Liu, Hanbing Wang, Pengjie Ren, Zhumin Chen, Jiahuan Lei, Xinlei Shi, Hengliang Luo, Joemon Jose, Maarten de Rijke, Zhaochun Ren

    Abstract: Recommender systems that learn from implicit feedback often use large volumes of a single type of implicit user feedback, such as clicks, to enhance the prediction of sparse target behavior such as purchases. Using multiple types of implicit user feedback for such target behavior prediction purposes is still an open question. Existing studies that attempted to learn from multiple types of user beh…

    Submitted 9 May, 2023; originally announced May 2023.

  50. Safe Deployment for Counterfactual Learning to Rank with Exposure-Based Risk Minimization

    Authors: Shashank Gupta, Harrie Oosterhuis, Maarten de Rijke

    Abstract: Counterfactual learning to rank (CLTR) relies on exposure-based inverse propensity scoring (IPS), an LTR-specific adaptation of IPS to correct for position bias. While IPS can provide unbiased and consistent estimates, it often suffers from high variance. Especially when little click data is available, this variance can cause CLTR to learn sub-optimal ranking behavior. Consequently, existing CLTR m…

    Submitted 26 April, 2023; originally announced May 2023.

    Comments: SIGIR 2023 - Full paper