Showing 1–50 of 56 results for author: Kanoulas, E

Searching in archive cs.
  1. arXiv:2406.15264  [pdf, other]

    cs.IR cs.CL

    Towards Fine-Grained Citation Evaluation in Generated Text: A Comparative Analysis of Faithfulness Metrics

    Authors: Weijia Zhang, Mohammad Aliannejadi, Yifei Yuan, Jiahuan Pei, Jia-Hong Huang, Evangelos Kanoulas

    Abstract: Large language models (LLMs) often produce unsupported or unverifiable information, known as "hallucinations." To mitigate this, retrieval-augmented LLMs incorporate citations, grounding the content in verifiable sources. Despite such developments, manually assessing how well a citation supports the associated statement remains a major challenge. Previous studies use faithfulness metrics to estima…

    Submitted 21 June, 2024; originally announced June 2024.

    Comments: 12 pages, 3 figures

  2. arXiv:2405.13003  [pdf, other]

    cs.CL cs.AI cs.IR

    A Survey on Recent Advances in Conversational Data Generation

    Authors: Heydar Soudani, Roxana Petcu, Evangelos Kanoulas, Faegheh Hasibi

    Abstract: Recent advancements in conversational systems have significantly enhanced human-machine interactions across various domains. However, training these systems is challenging due to the scarcity of specialized dialogue data. Traditionally, conversational datasets were created through crowdsourcing, but this method has proven costly, limited in scale, and labor-intensive. As a solution, the developmen…

    Submitted 12 May, 2024; originally announced May 2024.

  3. arXiv:2405.05109  [pdf, other]

    cs.CL cs.AI

    QFMTS: Generating Query-Focused Summaries over Multi-Table Inputs

    Authors: Weijia Zhang, Vaishali Pal, Jia-Hong Huang, Evangelos Kanoulas, Maarten de Rijke

    Abstract: Table summarization is a crucial task aimed at condensing information from tabular data into concise and comprehensible textual summaries. However, existing approaches often fall short of adequately meeting users' information and quality requirements and tend to overlook the complexities of real-world queries. In this paper, we propose a novel method to address these limitations by introducing que…

    Submitted 8 May, 2024; originally announced May 2024.

    Comments: 16 pages, 3 figures

  4. Enhancing Interactive Image Retrieval With Query Rewriting Using Large Language Models and Vision Language Models

    Authors: Hongyi Zhu, Jia-Hong Huang, Stevan Rudinac, Evangelos Kanoulas

    Abstract: Image search stands as a pivotal task in multimedia and computer vision, finding applications across diverse domains, ranging from internet search to medical diagnostics. Conventional image search systems operate by accepting textual or visual queries, retrieving the top-relevant candidate results from the database. However, prevalent methods often rely on single-turn procedures, introducing poten…

    Submitted 29 April, 2024; originally announced April 2024.

  5. arXiv:2403.10939  [pdf, ps, other]

    cs.IR

    Improving the Robustness of Dense Retrievers Against Typos via Multi-Positive Contrastive Learning

    Authors: Georgios Sidiropoulos, Evangelos Kanoulas

    Abstract: Dense retrieval has become the new paradigm in passage retrieval. Despite its effectiveness on typo-free queries, it is not robust when dealing with queries that contain typos. Current works on improving the typo-robustness of dense retrievers combine (i) data augmentation to obtain the typoed queries during training time with (ii) additional robustifying subtasks that aim to align the original, t…

    Submitted 16 March, 2024; originally announced March 2024.

    Comments: Accepted at the 46th European Conference on Information Retrieval (ECIR 2024)

  6. arXiv:2403.01432  [pdf, other]

    cs.CL

    Fine Tuning vs. Retrieval Augmented Generation for Less Popular Knowledge

    Authors: Heydar Soudani, Evangelos Kanoulas, Faegheh Hasibi

    Abstract: Large language models (LLMs) memorize a vast amount of factual knowledge, exhibiting strong performance across diverse tasks and domains. However, it has been observed that the performance diminishes when dealing with less-popular or low-frequency concepts and entities, for example in domain specific applications. The two prominent approaches to enhance the performance of LLMs on low-frequent topi…

    Submitted 7 March, 2024; v1 submitted 3 March, 2024; originally announced March 2024.

  7. arXiv:2402.11633  [pdf, other]

    cs.CL

    Self-seeding and Multi-intent Self-instructing LLMs for Generating Intent-aware Information-Seeking dialogs

    Authors: Arian Askari, Roxana Petcu, Chuan Meng, Mohammad Aliannejadi, Amin Abolghasemi, Evangelos Kanoulas, Suzan Verberne

    Abstract: Identifying user intents in information-seeking dialogs is crucial for a system to meet user's information needs. Intent prediction (IP) is challenging and demands sufficient dialogs with human-labeled intents for training. However, manually annotating intents is resource-intensive. While large language models (LLMs) have been shown to be effective in generating synthetic data, there is no study o…

    Submitted 18 February, 2024; originally announced February 2024.

  8. arXiv:2312.02913  [pdf, other]

    cs.CL cs.AI cs.IR

    Let the LLMs Talk: Simulating Human-to-Human Conversational QA via Zero-Shot LLM-to-LLM Interactions

    Authors: Zahra Abbasiantaeb, Yifei Yuan, Evangelos Kanoulas, Mohammad Aliannejadi

    Abstract: Conversational question-answering (CQA) systems aim to create interactive search systems that effectively retrieve information by interacting with users. To replicate human-to-human conversations, existing work uses human annotators to play the roles of the questioner (student) and the answerer (teacher). Despite its effectiveness, challenges exist as human annotation is time-consuming, inconsiste…

    Submitted 5 December, 2023; originally announced December 2023.

    Comments: Accepted at WSDM 2024

  9. Data Augmentation for Conversational AI

    Authors: Heydar Soudani, Evangelos Kanoulas, Faegheh Hasibi

    Abstract: Advancements in conversational systems have revolutionized information access, surpassing the limitations of single queries. However, developing dialogue systems requires a large amount of training data, which is a challenge in low-resource domains and languages. Traditional data collection methods like crowd-sourcing are labor-intensive and time-consuming, making them ineffective in this context.…

    Submitted 2 March, 2024; v1 submitted 9 September, 2023; originally announced September 2023.

  10. arXiv:2306.13379  [pdf, other]

    cs.CL

    Stress Testing BERT Anaphora Resolution Models for Reaction Extraction in Chemical Patents

    Authors: Chieling Yueh, Evangelos Kanoulas, Bruno Martins, Camilo Thorne, Saber Akhondi

    Abstract: The high volume of published chemical patents and the importance of a timely acquisition of their information gives rise to automating information extraction from chemical patents. Anaphora resolution is an important component of comprehensive information extraction, and is critical for extracting reactions. In chemical patents, there are five anaphoric relations of interest: co-reference, transfo…

    Submitted 23 June, 2023; originally announced June 2023.

  11. MultiTabQA: Generating Tabular Answers for Multi-Table Question Answering

    Authors: Vaishali Pal, Andrew Yates, Evangelos Kanoulas, Maarten de Rijke

    Abstract: Recent advances in tabular question answering (QA) with large language models are constrained in their coverage and only answer questions over a single table. However, real-world queries are complex in nature, often over multiple tables in a relational database or web page. Single table questions do not involve common table operations such as set operations, Cartesian products (joins), or nested q…

    Submitted 24 May, 2023; v1 submitted 22 May, 2023; originally announced May 2023.

    Comments: Accepted at ACL-2023

    Journal ref: In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023, pages 6322-6334, Toronto, Canada. Association for Computational Linguistics

  12. arXiv:2305.02320  [pdf, other]

    cs.IR

    Generating Synthetic Documents for Cross-Encoder Re-Rankers: A Comparative Study of ChatGPT and Human Experts

    Authors: Arian Askari, Mohammad Aliannejadi, Evangelos Kanoulas, Suzan Verberne

    Abstract: We investigate the usefulness of generative Large Language Models (LLMs) in generating training data for cross-encoder re-rankers in a novel direction: generating synthetic documents instead of synthetic queries. We introduce a new dataset, ChatGPT-RetrievalQA, and compare the effectiveness of models fine-tuned on LLM-generated and human-generated data. Data generated with generative LLMs can be u…

    Submitted 3 May, 2023; originally announced May 2023.

  13. Perspectives on Large Language Models for Relevance Judgment

    Authors: Guglielmo Faggioli, Laura Dietz, Charles Clarke, Gianluca Demartini, Matthias Hagen, Claudia Hauff, Noriko Kando, Evangelos Kanoulas, Martin Potthast, Benno Stein, Henning Wachsmuth

    Abstract: When asked, large language models (LLMs) like ChatGPT claim that they can assist with relevance judgments but it is not clear whether automated judgments can reliably be used in evaluations of retrieval systems. In this perspectives paper, we discuss possible ways for LLMs to support relevance judgments along with concerns and issues that arise. We devise a human-machine collaboration spectrum th…

    Submitted 18 November, 2023; v1 submitted 13 April, 2023; originally announced April 2023.

    ACM Class: H.3.3

  14. Market-Aware Models for Efficient Cross-Market Recommendation

    Authors: Samarth Bhargav, Mohammad Aliannejadi, Evangelos Kanoulas

    Abstract: We consider the cross-market recommendation (CMR) task, which involves recommendation in a low-resource target market using data from a richer, auxiliary source market. Prior work in CMR utilised meta-learning to improve recommendation performance in target markets; meta-learning however can be complex and resource intensive. In this paper, we propose market-aware (MA) models, which directly model…

    Submitted 14 February, 2023; originally announced February 2023.

  15. Natural Language Processing of Aviation Occurrence Reports for Safety Management

    Authors: Patrick Jonk, Vincent de Vries, Rombout Wever, Georgios Sidiropoulos, Evangelos Kanoulas

    Abstract: Occurrence reporting is a commonly used method in safety management systems to obtain insight in the prevalence of hazards and accident scenarios. In support of safety data analysis, reports are often categorized according to a taxonomy. However, the processing of the reports can require significant effort from safety analysts and a common problem is interrater variability in labeling processes. A…

    Submitted 13 January, 2023; originally announced January 2023.

    Journal ref: Proceedings of the 32nd European Safety and Reliability Conference (ESREL 2022)

  16. arXiv:2210.06164  [pdf, other]

    cs.CL cs.IR

    Focusing on Context is NICE: Improving Overshadowed Entity Disambiguation

    Authors: Vera Provatorova, Simone Tedeschi, Svitlana Vakulenko, Roberto Navigli, Evangelos Kanoulas

    Abstract: Entity disambiguation (ED) is the task of mapping an ambiguous entity mention to the corresponding entry in a structured knowledge base. Previous research showed that entity overshadowing is a significant challenge for existing ED models: when presented with an ambiguous entity mention, the models are much more likely to rank a more frequent yet less contextually relevant entity at the top. Here,…

    Submitted 12 October, 2022; originally announced October 2022.

  17. On the Impact of Speech Recognition Errors in Passage Retrieval for Spoken Question Answering

    Authors: Georgios Sidiropoulos, Svitlana Vakulenko, Evangelos Kanoulas

    Abstract: Interacting with a speech interface to query a Question Answering (QA) system is becoming increasingly popular. Typically, QA systems rely on passage retrieval to select candidate contexts and reading comprehension to extract the final answer. While there has been some attention to improving the reading comprehension part of QA systems against errors that automatic speech recognition (ASR) models…

    Submitted 26 September, 2022; originally announced September 2022.

    Comments: Accepted at 31st ACM International Conference on Information and Knowledge Management (CIKM 2022)

  18. Analysing the Robustness of Dual Encoders for Dense Retrieval Against Misspellings

    Authors: Georgios Sidiropoulos, Evangelos Kanoulas

    Abstract: Dense retrieval is becoming one of the standard approaches for document and passage ranking. The dual-encoder architecture is widely adopted for scoring question-passage pairs due to its efficiency and high performance. Typically, dense retrieval models are evaluated on clean and curated datasets. However, when deployed in real-life applications, these models encounter noisy user-generated text. T…

    Submitted 4 May, 2022; originally announced May 2022.

    Comments: Accepted at the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2022)

  19. Zero-shot Query Contextualization for Conversational Search

    Authors: Antonios Minas Krasakis, Andrew Yates, Evangelos Kanoulas

    Abstract: Current conversational passage retrieval systems cast conversational search into ad-hoc search by using an intermediate query resolution step that places the user's question in context of the conversation. While the proposed methods have proven effective, they still assume the availability of large-scale question resolution and conversational search datasets. To waive the dependency on the availab…

    Submitted 22 April, 2022; originally announced April 2022.

    Comments: Accepted @ SIGIR 2022

  20. Parameter-Efficient Abstractive Question Answering over Tables or Text

    Authors: Vaishali Pal, Evangelos Kanoulas, Maarten de Rijke

    Abstract: A long-term ambition of information seeking QA systems is to reason over multi-modal contexts and generate natural answers to user queries. Today, memory intensive pre-trained language models are adapted to downstream tasks such as QA by fine-tuning the model on QA data in a specific modality like unstructured text or structured tables. To avoid training such memory-hungry models while utilizing a…

    Submitted 7 April, 2022; originally announced April 2022.

    Comments: Published in Dialdoc Workshop at ACL 2022

  21. arXiv:2201.08742  [pdf, ps, other]

    cs.IR cs.AI cs.CL

    Towards Building Economic Models of Conversational Search

    Authors: Leif Azzopardi, Mohammad Aliannejadi, Evangelos Kanoulas

    Abstract: Various conceptual and descriptive models of conversational search have been proposed in the literature -- while useful, they do not provide insights into how interaction between the agent and user would change in response to the costs and benefits of the different interactions. In this paper, we develop two economic models of conversational search based on patterns previously observed during conv…

    Submitted 21 January, 2022; originally announced January 2022.

    Comments: To appear in ECIR 2022

  22. arXiv:2112.07536  [pdf, other]

    cs.CL cs.IR

    Tackling Query-Focused Summarization as A Knowledge-Intensive Task: A Pilot Study

    Authors: Weijia Zhang, Svitlana Vakulenko, Thilina Rajapakse, Yumo Xu, Evangelos Kanoulas

    Abstract: Query-focused summarization (QFS) requires generating a summary given a query using a set of relevant documents. However, such relevant documents should be annotated manually and thus are not readily available in realistic scenarios. To address this limitation, we tackle the QFS task as a knowledge-intensive (KI) task without access to any relevant documents. Instead, we assume that these document…

    Submitted 31 July, 2023; v1 submitted 14 December, 2021; originally announced December 2021.

    Comments: Accepted by Gen-IR@SIGIR 2023 workshop

  23. arXiv:2110.05056  [pdf, other]

    cs.IR

    Controllable Recommenders using Deep Generative Models and Disentanglement

    Authors: Samarth Bhargav, Evangelos Kanoulas

    Abstract: In this paper, we consider controllability as a means to satisfy dynamic preferences of users, enabling them to control recommendations such that their current preference is met. While deep models have shown improved performance for collaborative filtering, they are generally not amenable to fine grained control by a user, leading to the development of methods like deep language critiquing. We pro…

    Submitted 11 October, 2021; originally announced October 2021.

    Comments: 10 pages, 1 figure

  24. arXiv:2109.11682  [pdf, other]

    cs.CV cs.AI

    Paint4Poem: A Dataset for Artistic Visualization of Classical Chinese Poems

    Authors: Dan Li, Shuai Wang, Jie Zou, Chang Tian, Elisha Nieuwburg, Fengyuan Sun, Evangelos Kanoulas

    Abstract: In this work we propose a new task: artistic visualization of classical Chinese poems, where the goal is to generate paintings of a certain artistic style for classical Chinese poems. For this purpose, we construct a new dataset called Paint4Poem. The first part of Paint4Poem consists of 301 high-quality poem-painting pairs collected manually from an influential modern Chinese artist Feng Zikai. As i…

    Submitted 27 September, 2021; v1 submitted 23 September, 2021; originally announced September 2021.

  25. arXiv:2109.05955  [pdf, other]

    cs.IR cs.CL cs.HC

    Analysing Mixed Initiatives and Search Strategies during Conversational Search

    Authors: Mohammad Aliannejadi, Leif Azzopardi, Hamed Zamani, Evangelos Kanoulas, Paul Thomas, Nick Craswell

    Abstract: Information seeking conversations between users and Conversational Search Agents (CSAs) consist of multiple turns of interaction. While users initiate a search session, ideally a CSA should sometimes take the lead in the conversation by obtaining feedback from the user by offering query suggestions or asking for query clarifications i.e. mixed initiative. This creates the potential for more engagi…

    Submitted 13 September, 2021; originally announced September 2021.

    Comments: Accepted in CIKM 2021

  26. Cross-Market Product Recommendation

    Authors: Hamed Bonab, Mohammad Aliannejadi, Ali Vardasbi, Evangelos Kanoulas, James Allan

    Abstract: We study the problem of recommending relevant products to users in relatively resource-scarce markets by leveraging data from similar, richer in resource auxiliary markets. We hypothesize that data from one market can be used to improve performance in another. Only a few studies have been conducted in this area, partly due to the lack of publicly available experimental data. To this end, we collec…

    Submitted 13 September, 2021; originally announced September 2021.

    Comments: Accepted in CIKM 2021

  27. arXiv:2108.10949  [pdf, other]

    cs.CL

    Robustness Evaluation of Entity Disambiguation Using Prior Probes: the Case of Entity Overshadowing

    Authors: Vera Provatorova, Svitlana Vakulenko, Samarth Bhargav, Evangelos Kanoulas

    Abstract: Entity disambiguation (ED) is the last step of entity linking (EL), when candidate entities are reranked according to the context they appear in. All datasets for training and evaluating models for EL consist of convenience samples, such as news articles and tweets, that propagate the prior probability bias of the entity distribution towards more frequently occurring entities. It was previously sh…

    Submitted 14 December, 2021; v1 submitted 24 August, 2021; originally announced August 2021.

    Journal ref: Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing (2021) 10501-10510

  28. arXiv:2108.10127  [pdf, other]

    cs.CL

    Legal Search in Case Law and Statute Law

    Authors: Julien Rossi, Evangelos Kanoulas

    Abstract: In this work we describe a method to identify document pairwise relevance in the context of a typical legal document collection: limited resources, long queries and long documents. We review the usage of generalized language models, including supervised and unsupervised learning. We observe how our method, while using text summaries, outperforms existing baselines based on full text, and motivate…

    Submitted 23 August, 2021; originally announced August 2021.

  29. VerbCL: A Dataset of Verbatim Quotes for Highlight Extraction in Case Law

    Authors: Julien Rossi, Svitlana Vakulenko, Evangelos Kanoulas

    Abstract: Citing legal opinions is a key part of legal argumentation, an expert task that requires retrieval, extraction and summarization of information from court decisions. The identification of legally salient parts in an opinion for the purpose of citation may be seen as a domain-specific formulation of a highlight extraction or passage retrieval task. As similar tasks in other domains such as web sear…

    Submitted 23 August, 2021; originally announced August 2021.

    Comments: CIKM 2021, Resource Track

  30. arXiv:2106.15467  [pdf, other]

    cs.AI cs.CL

    Few-Shot Electronic Health Record Coding through Graph Contrastive Learning

    Authors: Shanshan Wang, Pengjie Ren, Zhumin Chen, Zhaochun Ren, Huasheng Liang, Qiang Yan, Evangelos Kanoulas, Maarten de Rijke

    Abstract: Electronic health record (EHR) coding is the task of assigning ICD codes to each EHR. Most previous studies either only focus on the frequent ICD codes or treat rare and frequent ICD codes in the same way. These methods perform well on frequent ICD codes but due to the extremely unbalanced distribution of ICD codes, the performance on rare ones is far from satisfactory. We seek to improve the perf…

    Submitted 29 June, 2021; originally announced June 2021.

  31. arXiv:2106.08433  [pdf, other]

    cs.IR

    Combining Lexical and Dense Retrieval for Computationally Efficient Multi-hop Question Answering

    Authors: Georgios Sidiropoulos, Nikos Voskarides, Svitlana Vakulenko, Evangelos Kanoulas

    Abstract: In simple open-domain question answering (QA), dense retrieval has become one of the standard approaches for retrieving the relevant passages to infer an answer. Recently, dense retrieval also achieved state-of-the-art results in multi-hop QA, where aggregating information from multiple pieces of information and reasoning over them is required. Despite their success, dense retrieval methods are co…

    Submitted 22 September, 2021; v1 submitted 15 June, 2021; originally announced June 2021.

    Comments: Accepted at the 2nd Workshop on Simple and Efficient Natural Language Processing (SustaiNLP 2021)

  32. arXiv:2104.07096  [pdf, other]

    cs.IR

    A Large-Scale Analysis of Mixed Initiative in Information-Seeking Dialogues for Conversational Search

    Authors: Svitlana Vakulenko, Evangelos Kanoulas, Maarten de Rijke

    Abstract: Conversational search is a relatively young area of research that aims at automating an information-seeking dialogue. In this paper we help to position it with respect to other research areas within conversational Artificial Intelligence (AI) by analysing the structural properties of an information-seeking dialogue. To this end, we perform a large-scale dialogue analysis of more than 150K transcri…

    Submitted 8 June, 2021; v1 submitted 14 April, 2021; originally announced April 2021.

    Comments: 32 pages; To appear in ACM Transactions on Information Systems (TOIS), Special Issue on Conversational Search and Recommendation. 2021

  33. arXiv:2103.08733  [pdf, other]

    cs.AI cs.HC cs.IR cs.LG

    Category Aware Explainable Conversational Recommendation

    Authors: Nikolaos Kondylidis, Jie Zou, Evangelos Kanoulas

    Abstract: Most conversational recommendation approaches are either not explainable, or they require external user's knowledge for explaining or their explanations cannot be applied in real time due to computational limitations. In this work, we present a real time category based conversational recommendation approach, which can provide concise explanations without prior user knowledge being required. We fir…

    Submitted 15 March, 2021; originally announced March 2021.

    Comments: Workshop on Mixed-Initiative ConveRsatiOnal Systems (MICROS) @ECIR, 2021

    Journal ref: Workshop on Mixed-Initiative ConveRsatiOnal Systems (MICROS) @ECIR, 2021

  34. arXiv:2103.06192  [pdf, other]

    cs.IR cs.HC

    Ranking Clarifying Questions Based on Predicted User Engagement

    Authors: Tom Lotze, Stefan Klut, Mohammad Aliannejadi, Evangelos Kanoulas

    Abstract: To improve online search results, clarification questions can be used to elucidate the information need of the user. This research aims to predict the user engagement with the clarification pane as an indicator of relevance based on the lexical information: query, question, and answers. Subsequently, the predicted user engagement can be used as a feature to rank the clarification panes. Regression…

    Submitted 1 April, 2021; v1 submitted 10 March, 2021; originally announced March 2021.

    Comments: Appeared in MICROS Workshop, co-located with ECIR'21

  35. arXiv:2010.10386  [pdf, other]

    cs.IR cs.CL

    A Benchmark for Lease Contract Review

    Authors: Spyretta Leivaditi, Julien Rossi, Evangelos Kanoulas

    Abstract: Extracting entities and other useful information from legal contracts is an important task whose automation can help legal professionals perform contract reviews more efficiently and reduce relevant risks. In this paper, we tackle the problem of detecting two different types of elements that play an important role in a contract review, namely entities and red flags. The latter are terms or sentenc…

    Submitted 20 October, 2020; originally announced October 2020.

  36. arXiv:2008.03717  [pdf, other]

    cs.IR cs.AI cs.CL cs.HC

    Analysing the Effect of Clarifying Questions on Document Ranking in Conversational Search

    Authors: Antonios Minas Krasakis, Mohammad Aliannejadi, Nikos Voskarides, Evangelos Kanoulas

    Abstract: Recent research on conversational search highlights the importance of mixed-initiative in conversations. To enable mixed-initiative, the system should be able to ask clarifying questions to the user. However, the ability of the underlying ranking models (which support conversational search) to account for these clarifying questions and answers has not been analysed when ranking documents, at large…

    Submitted 11 August, 2020; v1 submitted 9 August, 2020; originally announced August 2020.

    Comments: Proceedings of the 2020 ACM SIGIR International Conference on the Theory of Information Retrieval (ICTIR '20), September 14-17, 2020

  37. arXiv:2008.00279  [pdf, other]

    cs.IR cs.CL

    An Empirical Study of Clarifying Question-Based Systems

    Authors: Jie Zou, Evangelos Kanoulas, Yiqun Liu

    Abstract: Search and recommender systems that take the initiative to ask clarifying questions to better understand users' information needs are receiving increasing attention from the research community. However, to the best of our knowledge, there is no empirical study to quantify whether and to what extent users are willing or able to answer these questions. In this work, we conduct an online experiment b…

    Submitted 1 August, 2020; originally announced August 2020.

    Comments: Parts of the content are published at CIKM 2020

  38. Towards Question-based Recommender Systems

    Authors: Jie Zou, Yifan Chen, Evangelos Kanoulas

    Abstract: Conversational and question-based recommender systems have gained increasing attention in recent years, with users enabled to converse with the system and better control recommendations. Nevertheless, research in the field is still limited, compared to traditional recommender systems. In this work, we propose a novel Question-based recommendation method, Qrec, to assist users to find items interac…

    Submitted 28 May, 2020; originally announced May 2020.

    Comments: accepted by SIGIR 2020

  39. An Analysis of Mixed Initiative and Collaboration in Information-Seeking Dialogues

    Authors: Svitlana Vakulenko, Evangelos Kanoulas, Maarten de Rijke

    Abstract: The ability to engage in mixed-initiative interaction is one of the core requirements for a conversational search system. How to achieve this is poorly understood. We propose a set of unsupervised metrics, termed ConversationShape, that highlights the role each of the conversation participants plays by comparing the distribution of vocabulary and utterance types. Using ConversationShape as a lens,…

    Submitted 25 May, 2020; originally announced May 2020.

    Comments: SIGIR 2020 short conference paper

  40. arXiv:2005.12040  [pdf, other]

    cs.CL

    Knowledge Graph Simple Question Answering for Unseen Domains

    Authors: Georgios Sidiropoulos, Nikos Voskarides, Evangelos Kanoulas

    Abstract: Knowledge graph simple question answering (KGSQA), in its standard form, does not take into account that human-curated question answering training data only cover a small subset of the relations that exist in a Knowledge Graph (KG), or even worse, that new domains covering unseen and rather different to existing domains relations are added to the KG. In this work, we study KGSQA in a previously un…

    Submitted 25 May, 2020; originally announced May 2020.

    Comments: Accepted at AKBC 2020

  41. Query Resolution for Conversational Search with Limited Supervision

    Authors: Nikos Voskarides, Dan Li, Pengjie Ren, Evangelos Kanoulas, Maarten de Rijke

    Abstract: In this work we focus on multi-turn passage retrieval as a crucial component of conversational search. One of the key challenges in multi-turn passage retrieval comes from the fact that the current turn query is often underspecified due to zero anaphora, topic change, or topic return. Context from the conversational history can be used to arrive at a better expression of the current turn query, de…

    Submitted 24 May, 2020; originally announced May 2020.

    Comments: SIGIR 2020 full conference paper

  42. arXiv:2004.14162  [pdf, other]

    cs.IR cs.AI

    Conversations with Search Engines: SERP-based Conversational Response Generation

    Authors: Pengjie Ren, Zhumin Chen, Zhaochun Ren, Evangelos Kanoulas, Christof Monz, Maarten de Rijke

    Abstract: In this paper, we address the problem of answering complex information needs by conversing conversations with search engines, in the sense that users can express their queries in natural language, and directly receive the information they need from a short system response in a conversational manner. Recently, there have been some attempts towards a similar goal, e.g., studies on Conversational Agen…

    Submitted 18 May, 2021; v1 submitted 29 April, 2020; originally announced April 2020.

    Comments: published in TOIS 2021

  43. arXiv:1908.11733  [pdf, other]

    cs.IR

    Learning to Ask: Question-based Sequential Bayesian Product Search

    Authors: Jie Zou, Evangelos Kanoulas

    Abstract: Product search is generally recognized as the first and foremost stage of online shopping and thus significant for users and retailers of e-commerce. Most of the traditional retrieval methods use some similarity functions to match the user's query and the document that describes a product, either directly or in a latent vector space. However, user queries are often too general to capture the minut…

    Submitted 30 August, 2019; originally announced August 2019.

    Comments: This paper is accepted by CIKM 2019

  44. arXiv:1907.03039  [pdf, other]

    cs.IR cs.AI cs.LG

    Global Aggregations of Local Explanations for Black Box models

    Authors: Ilse van der Linden, Hinda Haned, Evangelos Kanoulas

    Abstract: The decision-making process of many state-of-the-art machine learning models is inherently inscrutable to the extent that it is impossible for a human to interpret the model directly: they are black box models. This has led to a call for research on explaining black box models, for which there are two main approaches. Global explanations that aim to explain a model's decision making process in gen…

    Submitted 5 July, 2019; originally announced July 2019.

    Comments: FACTS-IR: Fairness, Accountability, Confidentiality, Transparency, and Safety - SIGIR 2019 Workshop

  45. arXiv:1906.07591  [pdf, ps, other]

    cs.IR cs.CL

    Query Generation for Patent Retrieval with Keyword Extraction based on Syntactic Features

    Authors: Julien Rossi, Matthias Wirth, Evangelos Kanoulas

    Abstract: This paper describes a new method to extract relevant keywords from patent claims, as part of the task of retrieving other patents with similar claims (search for prior art). The method combines a qualitative analysis of the writing style of the claims with NLP methods to parse text, in order to represent a legal text as a specialization arborescence of terms. In this setting, the set of extracted…

    Submitted 18 June, 2019; originally announced June 2019.

    Comments: Presented as short paper at JURIX 2018

  46. arXiv:1812.00382  [pdf, other]

    cs.IR cs.CL

    Improved and Robust Controversy Detection in General Web Pages Using Semantic Approaches under Large Scale Conditions

    Authors: Jasper Linmans, Bob van de Velde, Evangelos Kanoulas

    Abstract: Detecting controversy in general web pages is a daunting task, but increasingly essential to efficiently moderate discussions and effectively filter problematic content. Unfortunately, controversies occur across many topics and domains, with great changes over time. This paper investigates neural classifiers as a more robust methodology for controversy detection in general web pages. Current model…

    Submitted 2 December, 2018; originally announced December 2018.

    Comments: Presented at the 27th ACM International Conference on Information and Knowledge Management (CIKM 2018)

  47. arXiv:1810.05414  [pdf, other]

    cs.IR

    Technology Assisted Reviews: Finding the Last Few Relevant Documents by Asking Yes/No Questions to Reviewers

    Authors: Jie Zou, Dan Li, Evangelos Kanoulas

    Abstract: The goal of a technology-assisted review is to achieve high recall with low human effort. Continuous active learning algorithms have demonstrated good performance in locating the majority of relevant documents in a collection; however, their performance reaches a plateau when 80%-90% of them have been found. Finding the last few relevant documents typically requires exhaustively reviewing the…

    Submitted 12 October, 2018; originally announced October 2018.

    Comments: This paper is accepted by SIGIR 2018

  48. arXiv:1801.10603  [pdf, other]

    cs.IR

    ILPS at TREC 2017 Common Core Track

    Authors: Christophe Van Gysel, Dan Li, Evangelos Kanoulas

    Abstract: The TREC 2017 Common Core Track aimed at gathering a diverse set of participating runs and building a new test collection using advanced pooling methods. In this paper, we describe the participation of the IlpsUvA team at the TREC 2017 Common Core Track. We submitted runs created using two methods to the track: (1) BOIR uses Bayesian optimization to automatically optimize retrieval model hyperpa…

    Submitted 31 January, 2018; originally announced January 2018.

    Comments: TREC 2017

  49. arXiv:1712.01329  [pdf, other]

    cs.CV cs.AI cs.CL cs.IR cs.LG

    Examining Cooperation in Visual Dialog Models

    Authors: Mircea Mironenco, Dana Kianfar, Ke Tran, Evangelos Kanoulas, Efstratios Gavves

    Abstract: In this work we propose a blackbox intervention method for visual dialog models, with the aim of assessing the contribution of individual linguistic or visual components. Concretely, we conduct structured or randomized interventions that aim to impair an individual component of the model, and observe changes in task performance. We reproduce a state-of-the-art visual dialog model and demonstrate t…

    Submitted 4 December, 2017; originally announced December 2017.

    Comments: 9 pages, 5 figures, 2 tables, code at http://github.com/danakianfar/Examining-Cooperation-in-VDM/

  50. Active Sampling for Large-scale Information Retrieval Evaluation

    Authors: Dan Li, Evangelos Kanoulas

    Abstract: Evaluation is crucial in Information Retrieval. The development of models, tools and methods has significantly benefited from the availability of reusable test collections formed through a standardized and thoroughly tested methodology, known as the Cranfield paradigm. Constructing these collections requires obtaining relevance judgments for a pool of documents, retrieved by systems participating…

    Submitted 6 September, 2017; originally announced September 2017.

    MSC Class: 68P20