Showing 1–8 of 8 results for author: Kraaij, W

  1. arXiv:2303.01200  [pdf, other]

    cs.IR

    Retrieval for Extremely Long Queries and Documents with RPRS: a Highly Efficient and Effective Transformer-based Re-Ranker

    Authors: Arian Askari, Suzan Verberne, Amin Abolghasemi, Wessel Kraaij, Gabriella Pasi

    Abstract: Retrieval with extremely long queries and documents is a well-known and challenging task in information retrieval and is commonly known as Query-by-Document (QBD) retrieval. Specifically designed Transformer models that can handle long input sequences have not shown high effectiveness in QBD tasks in previous work. We propose a Re-Ranker based on the novel Proportional Relevance Score (RPRS) to co…

    Submitted 1 November, 2023; v1 submitted 2 March, 2023; originally announced March 2023.

    Comments: Accepted at ACM Transactions on Information Systems (ACM TOIS journal)

  2. arXiv:2301.09728  [pdf, other]

    cs.IR

    Injecting the BM25 Score as Text Improves BERT-Based Re-rankers

    Authors: Arian Askari, Amin Abolghasemi, Gabriella Pasi, Wessel Kraaij, Suzan Verberne

    Abstract: In this paper we propose a novel approach for combining first-stage lexical retrieval models and Transformer-based re-rankers: we inject the relevance score of the lexical model as a token in the middle of the input of the cross-encoder re-ranker. It was shown in prior work that interpolation between the relevance score of lexical and BERT-based re-rankers may not consistently result in higher eff…

    Submitted 23 January, 2023; originally announced January 2023.

    Comments: Accepted at ECIR 2023

  3. arXiv:2109.11308  [pdf, other]

    cs.CL cs.IR

    Breaking BERT: Understanding its Vulnerabilities for Named Entity Recognition through Adversarial Attack

    Authors: Anne Dirkson, Suzan Verberne, Wessel Kraaij

    Abstract: Both generic and domain-specific BERT models are widely used for natural language processing (NLP) tasks. In this paper we investigate the vulnerability of BERT models to variation in input data for Named Entity Recognition (NER) through adversarial attack. Experimental results show that BERT models are vulnerable to variation in the entity context with 20.2 to 45.0% of entities predicted complete…

    Submitted 31 January, 2022; v1 submitted 23 September, 2021; originally announced September 2021.

  4. arXiv:2104.13473  [pdf, other]

    cs.CV cs.AI

    TRECVID 2020: A comprehensive campaign for evaluating video retrieval tasks across multiple application domains

    Authors: George Awad, Asad A. Butt, Keith Curtis, Jonathan Fiscus, Afzal Godil, Yooyoung Lee, Andrew Delgado, Jesse Zhang, Eliot Godard, Baptiste Chocot, Lukas Diduch, Jeffrey Liu, Alan F. Smeaton, Yvette Graham, Gareth J. F. Jones, Wessel Kraaij, Georges Quenot

    Abstract: The TREC Video Retrieval Evaluation (TRECVID) is a TREC-style video analysis and retrieval evaluation with the goal of promoting progress in research and development of content-based exploitation and retrieval of information from digital video via open, metrics-based evaluation. Over the last twenty years this effort has yielded a better understanding of how systems can effectively accomplish such…

    Submitted 27 April, 2021; originally announced April 2021.

    Comments: TRECVID 2020 Workshop Overview Paper. arXiv admin note: substantial text overlap with arXiv:2009.09984

  5. arXiv:2009.09984  [pdf, other]

    cs.CV cs.AI

    TRECVID 2019: An Evaluation Campaign to Benchmark Video Activity Detection, Video Captioning and Matching, and Video Search & Retrieval

    Authors: George Awad, Asad A. Butt, Keith Curtis, Yooyoung Lee, Jonathan Fiscus, Afzal Godil, Andrew Delgado, Jesse Zhang, Eliot Godard, Lukas Diduch, Alan F. Smeaton, Yvette Graham, Wessel Kraaij, Georges Quenot

    Abstract: The TREC Video Retrieval Evaluation (TRECVID) 2019 was a TREC-style video analysis and retrieval evaluation, the goal of which remains to promote progress in research and development of content-based exploitation and retrieval of information from digital video via open, metrics-based evaluation. Over the last nineteen years this effort has yielded a better understanding of how systems can effectiv…

    Submitted 21 September, 2020; originally announced September 2020.

    Comments: TRECVID Workshop overview paper. 39 pages

  6. arXiv:1812.04265  [pdf]

    cs.IR

    Proceedings of the 17th Dutch-Belgian Information Retrieval Workshop

    Authors: Alex Brandsen, Anne Dirkson, Wessel Kraaij, Wout Lamers, Suzan Verberne, Hugo de Vos, Gineke Wiggers

    Abstract: This volume contains the papers presented at DIR 2018: 17th Dutch-Belgian Information Retrieval Workshop (DIR) held on November 23, 2018 in Leiden. DIR aims to serve as an international platform (with a special focus on the Netherlands and Belgium) for exchange and discussions on research & applications in the field of information retrieval and related fields. The committee accepted 4 short pape…

    Submitted 11 December, 2018; originally announced December 2018.

  7. arXiv:1804.11131  [pdf, other]

    cs.IR

    Author-topic profiles for academic search

    Authors: Suzan Verberne, Arjen P. de Vries, Wessel Kraaij

    Abstract: We implemented and evaluated a two-stage retrieval method for personalized academic search in which the initial search results are re-ranked using an author-topic profile. In academic search tasks, the user's own data can help optimizing the ranking of search results to match the searcher's specific individual needs. The author-topic profile consists of topic-specific terms, stored in a graph. We…

    Submitted 30 April, 2018; originally announced April 2018.

    Comments: 13 pages, 1 figure

  8. arXiv:cs/0312008  [pdf, ps, other]

    cs.CL cs.IR

    Embedding Web-based Statistical Translation Models in Cross-Language Information Retrieval

    Authors: Wessel Kraaij, Jian-Yun Nie, Michel Simard

    Abstract: Although more and more language pairs are covered by machine translation services, there are still many pairs that lack translation resources. Cross-language information retrieval (CLIR) is an application which needs translation functionality of a relatively low level of sophistication since current models for information retrieval (IR) are still based on a bag-of-words. The Web provides a vast…

    Submitted 3 December, 2003; originally announced December 2003.

    Comments: 37 pages

    ACM Class: H.3.1; H.3.3; I.2.7

    Journal ref: Computational Linguistics 29(3), September 2003