
Showing 1–50 of 54 results for author: Croft, W

  1. arXiv:2403.17748  [pdf, other]

    cs.CL

    UCxn: Typologically Informed Annotation of Constructions Atop Universal Dependencies

    Authors: Leonie Weissweiler, Nina Böbel, Kirian Guiller, Santiago Herrera, Wesley Scivetti, Arthur Lorenzi, Nurit Melnik, Archna Bhatia, Hinrich Schütze, Lori Levin, Amir Zeldes, Joakim Nivre, William Croft, Nathan Schneider

    Abstract: The Universal Dependencies (UD) project has created an invaluable collection of treebanks with contributions in over 140 languages. However, the UD annotations do not tell the full story. Grammatical constructions that convey meaning through a particular combination of several morphosyntactic elements -- for example, interrogative sentences with special markers and/or word orders -- are not labele…

    Submitted 26 March, 2024; originally announced March 2024.

    Comments: LREC-COLING 2024

  2. arXiv:2307.02740  [pdf, other]

    cs.IR cs.CL

    Dense Retrieval Adaptation using Target Domain Description

    Authors: Helia Hashemi, Yong Zhuang, Sachith Sri Ram Kothur, Srivas Prasad, Edgar Meij, W. Bruce Croft

    Abstract: In information retrieval (IR), domain adaptation is the process of adapting a retrieval model to a new domain whose data distribution is different from the source domain. Existing methods in this area focus on unsupervised domain adaptation where they have access to the target document collection or supervised (often few-shot) domain adaptation where they additionally have access to (limited) labe…

    Submitted 5 July, 2023; originally announced July 2023.

  3. arXiv:2304.08912  [pdf, other]

    cs.IR

    Generalized Weak Supervision for Neural Information Retrieval

    Authors: Yen-Chieh Lien, Hamed Zamani, W. Bruce Croft

    Abstract: Neural ranking models (NRMs) have demonstrated effective performance in several information retrieval (IR) tasks. However, training NRMs often requires large-scale training data, which is difficult and expensive to obtain. To address this issue, one can train NRMs via weak supervision, where a large dataset is automatically generated using an existing ranking model (called the weak labeler) for tr…

    Submitted 18 April, 2023; originally announced April 2023.

  4. arXiv:2210.12908  [pdf, other]

    cs.DL cs.LG

    Predicting the Citation Count and CiteScore of Journals One Year in Advance

    Authors: William Croft, Jörg-Rüdiger Sack

    Abstract: Prediction of the future performance of academic journals is a task that can benefit a variety of stakeholders including editorial staff, publishers, indexing services, researchers, university administrators and granting agencies. Using historical data on journal performance, this can be framed as a machine learning regression problem. In this work, we study two such regression tasks: 1) predictio…

    Submitted 23 October, 2022; originally announced October 2022.

  5. Evaluating Fairness in Argument Retrieval

    Authors: Sachin Pathiyan Cherumanal, Damiano Spina, Falk Scholer, W. Bruce Croft

    Abstract: Existing commercial search engines often struggle to represent different perspectives of a search query. Argument retrieval systems address this limitation of search engines and provide both positive (PRO) and negative (CON) perspectives about a user's information need on a controversial topic (e.g., climate change). The effectiveness of such argument retrieval systems is typically evaluated based…

    Submitted 19 September, 2021; v1 submitted 23 August, 2021; originally announced August 2021.

    Comments: Accepted at CIKM 2021

  6. Asking Clarifying Questions Based on Negative Feedback in Conversational Search

    Authors: Keping Bi, Qingyao Ai, W. Bruce Croft

    Abstract: Users often need to look through multiple search result pages or reformulate queries when they have complex information-seeking needs. Conversational search systems make it possible to improve user satisfaction by asking questions to clarify users' search intents. This, however, can take significant effort to answer a series of questions starting with "what/why/how". To quickly identify user inten…

    Submitted 12 July, 2021; originally announced July 2021.

    Comments: In the proceedings of ICTIR'21

  7. Passage Retrieval for Outside-Knowledge Visual Question Answering

    Authors: Chen Qu, Hamed Zamani, Liu Yang, W. Bruce Croft, Erik Learned-Miller

    Abstract: In this work, we address multi-modal information needs that contain text questions and images by focusing on passage retrieval for outside-knowledge visual question answering. This task requires access to outside knowledge, which in our case we define to be a large unstructured passage collection. We first conduct sparse retrieval with BM25 and study expanding the question with object names and im…

    Submitted 9 May, 2021; originally announced May 2021.

    Comments: Accepted to SIGIR'21 as a short paper

  8. arXiv:2104.10210  [pdf, other]

    cs.CL physics.soc-ph q-bio.PE

    How individuals change language

    Authors: Richard A Blythe, William Croft

    Abstract: Languages emerge and change over time at the population level through interactions between individual speakers. It is, however, hard to directly observe how a single speaker's linguistic innovation precipitates a population-wide change in the language, and many theoretical proposals exist. We introduce a very general mathematical model that encompasses a wide variety of individual-level linguistic…

    Submitted 20 April, 2021; originally announced April 2021.

    Comments: 50 pages, 11 figures

    Journal ref: PLoS ONE 16(6): e0252582 (2021)

  9. A Neural Passage Model for Ad-hoc Document Retrieval

    Authors: Qingyao Ai, Brendan O Connor, W. Bruce Croft

    Abstract: Traditional statistical retrieval models often treat each document as a whole. In many cases, however, a document is relevant to a query only because a small part of it contains the targeted information. In this work, we propose a neural passage model (NPM) that uses passage-level information to improve the performance of ad-hoc retrieval. Instead of using a single window to extract passages, our m…

    Submitted 16 March, 2021; originally announced March 2021.

  10. arXiv:2103.02537  [pdf, other]

    cs.IR cs.CL

    Weakly-Supervised Open-Retrieval Conversational Question Answering

    Authors: Chen Qu, Liu Yang, Cen Chen, W. Bruce Croft, Kalpesh Krishna, Mohit Iyyer

    Abstract: Recent studies on Question Answering (QA) and Conversational QA (ConvQA) emphasize the role of retrieval: a system first retrieves evidence from a large collection and then extracts answers. This open-retrieval ConvQA setting typically assumes that each question is answerable by a single span of text within a particular passage (a span answer). The supervision signal is thus derived from whether o…

    Submitted 3 March, 2021; originally announced March 2021.

    Comments: Accepted to ECIR'21

  11. arXiv:2102.11072  [pdf, other]

    cs.CR

    Obfuscation of Images via Differential Privacy: From Facial Images to General Images

    Authors: William Croft, Jörg-Rüdiger Sack, Wei Shi

    Abstract: Due to the pervasiveness of image capturing devices in every-day life, images of individuals are routinely captured. Although this has enabled many benefits, it also infringes on personal privacy. A promising direction in research on obfuscation of facial images has been the work in the k-same family of methods which employ the concept of k-anonymity from database privacy. However, there are a num…

    Submitted 19 February, 2021; originally announced February 2021.

  12. arXiv:2101.03394  [pdf, other]

    cs.IR cs.AI cs.HC

    Context-Aware Target Apps Selection and Recommendation for Enhancing Personal Mobile Assistants

    Authors: Mohammad Aliannejadi, Hamed Zamani, Fabio Crestani, W. Bruce Croft

    Abstract: Users install many apps on their smartphones, raising issues related to information overload for users and resource management for devices. Moreover, the recent increase in the use of personal assistants has made mobile devices even more pervasive in users' lives. This paper addresses two research problems that are vital for developing effective personal mobile assistants: target apps selection an…

    Submitted 9 January, 2021; originally announced January 2021.

    Comments: Accepted to ACM TOIS, 30 pages

  13. arXiv:2006.07548  [pdf, other]

    cs.IR cs.CL cs.LG

    Guided Transformer: Leveraging Multiple External Sources for Representation Learning in Conversational Search

    Authors: Helia Hashemi, Hamed Zamani, W. Bruce Croft

    Abstract: Asking clarifying questions in response to ambiguous or faceted queries has been recognized as a useful technique for various information retrieval systems, especially conversational search systems with limited bandwidth interfaces. Analyzing and generating clarifying questions have been studied recently but the accurate utilization of user responses to clarifying questions has been relatively les…

    Submitted 12 June, 2020; originally announced June 2020.

    Comments: To appear in the Proceedings of ACM SIGIR 2020. 10 pages

  14. Open-Retrieval Conversational Question Answering

    Authors: Chen Qu, Liu Yang, Cen Chen, Minghui Qiu, W. Bruce Croft, Mohit Iyyer

    Abstract: Conversational search is one of the ultimate goals of information retrieval. Recent research approaches conversational search by simplified settings of response ranking and conversational question answering, where an answer is either selected from a given candidate set or extracted from a given passage. These simplifications neglect the fundamental role of retrieval in conversational search. To ad…

    Submitted 22 May, 2020; originally announced May 2020.

    Comments: Accepted to SIGIR'20

  15. A Transformer-based Embedding Model for Personalized Product Search

    Authors: Keping Bi, Qingyao Ai, W. Bruce Croft

    Abstract: Product search is an important way for people to browse and purchase items on E-commerce platforms. While customers tend to make choices based on their personal tastes and preferences, analysis of commercial product search logs has shown that personalization does not always improve product search quality. Most existing product search techniques, however, conduct undifferentiated personalization ac…

    Submitted 18 May, 2020; originally announced May 2020.

    Comments: In the proceedings of SIGIR 2020

    ACM Class: H.3.3

  16. Learning a Fine-Grained Review-based Transformer Model for Personalized Product Search

    Authors: Keping Bi, Qingyao Ai, W. Bruce Croft

    Abstract: Product search has been a crucial entry point to serve people shopping online. Most existing personalized product models follow the paradigm of representing and matching user intents and items in the semantic space, where finer-grained matching is totally discarded and the ranking of an item cannot be explained further than just user/item level similarity. In addition, while some models in existin…

    Submitted 3 June, 2021; v1 submitted 20 April, 2020; originally announced April 2020.

    Comments: To appear in SIGIR'2021

    MSC Class: 68T07; 68P20 ACM Class: H.3.3

  17. arXiv:2004.06176  [pdf, other]

    cs.CL

    AREDSUM: Adaptive Redundancy-Aware Iterative Sentence Ranking for Extractive Document Summarization

    Authors: Keping Bi, Rahul Jha, W. Bruce Croft, Asli Celikyilmaz

    Abstract: Redundancy-aware extractive summarization systems score the redundancy of the sentences to be included in a summary either jointly with their salience information or separately as an additional sentence scoring step. Previous work shows the efficacy of jointly scoring and selecting sentences with neural sequence generation models. It is, however, not well-understood if the gain is due to better en…

    Submitted 2 April, 2021; v1 submitted 13 April, 2020; originally announced April 2020.

    Comments: In proceedings of EACL'2021

  18. arXiv:2002.00571  [pdf, other]

    cs.IR cs.CL

    IART: Intent-aware Response Ranking with Transformers in Information-seeking Conversation Systems

    Authors: Liu Yang, Minghui Qiu, Chen Qu, Cen Chen, Jiafeng Guo, Yongfeng Zhang, W. Bruce Croft, Haiqing Chen

    Abstract: Personal assistant systems, such as Apple Siri, Google Assistant, Amazon Alexa, and Microsoft Cortana, are becoming ever more widely used. Understanding user intent such as clarification questions, potential answers and user feedback in information-seeking conversations is critical for retrieving good responses. In this paper, we analyze user intent patterns in information-seeking conversations an…

    Submitted 3 February, 2020; originally announced February 2020.

    Comments: Accepted by WWW2020

  19. Differential Privacy Via a Truncated and Normalized Laplace Mechanism

    Authors: William Lee Croft, Jörg-Rüdiger Sack, Wei Shi

    Abstract: When querying databases containing sensitive information, the privacy of individuals stored in the database has to be guaranteed. Such guarantees are provided by differentially private mechanisms which add controlled noise to the query responses. However, most such mechanisms do not take into consideration the valid range of the query being posed. Thus, noisy responses that fall outside of this ra…

    Submitted 25 August, 2020; v1 submitted 1 November, 2019; originally announced November 2019.

    Comments: This is a pre-print of an article published in Journal of Computer Science and Technology. The final authenticated version is available online at: https://doi.org/10.1007/s11390-020-0193-z

  20. arXiv:1909.07212  [pdf, other]

    cs.IR

    Explainable Product Search with a Dynamic Relation Embedding Model

    Authors: Qingyao Ai, Yongfeng Zhang, Keping Bi, W. Bruce Croft

    Abstract: Product search is one of the most popular methods for customers to discover products online. Most existing studies on product search focus on developing effective retrieval models that rank items by their likelihood to be purchased. They, however, ignore the problem that there is a gap between how systems and customers perceive the relevance of items. Without explanations, users may not understand…

    Submitted 16 September, 2019; originally announced September 2019.

  21. A Study of Context Dependencies in Multi-page Product Search

    Authors: Keping Bi, Choon Hui Teo, Yesh Dattatreya, Vijai Mohan, W. Bruce Croft

    Abstract: In product search, users tend to browse results on multiple search result pages (SERPs) (e.g., for queries on clothing and shoes) before deciding which item to purchase. Users' clicks can be considered as implicit feedback which indicates their preferences and used to re-rank subsequent SERPs. Relevance feedback (RF) techniques are usually involved to deal with such scenarios. However, these metho…

    Submitted 9 January, 2020; v1 submitted 9 September, 2019; originally announced September 2019.

    Comments: Accepted by CIKM 2019. arXiv admin note: substantial text overlap with arXiv:1909.02065

  22. Conversational Product Search Based on Negative Feedback

    Authors: Keping Bi, Qingyao Ai, Yongfeng Zhang, W. Bruce Croft

    Abstract: Intelligent assistants change the way people interact with computers and make it possible for people to search for products through conversations when they have purchase needs. During the interactions, the system could ask questions on certain aspects of the ideal products to clarify the users' needs. For example, previous work proposed to ask users the exact characteristics of their ideal items b…

    Submitted 4 September, 2019; originally announced September 2019.

    Comments: Accepted as a long paper in CIKM 2019

  23. arXiv:1909.02065  [pdf, other]

    cs.IR

    Leverage Implicit Feedback for Context-aware Product Search

    Authors: Keping Bi, Choon Hui Teo, Yesh Dattatreya, Vijai Mohan, W. Bruce Croft

    Abstract: Product search serves as an important entry point for online shopping. In contrast to web search, the retrieved results in product search not only need to be relevant but also should satisfy customers' preferences in order to elicit purchases. Previous work has shown the efficacy of purchase history in personalized product search. However, customers with little or no purchase history do not benefi…

    Submitted 9 January, 2020; v1 submitted 4 September, 2019; originally announced September 2019.

    Comments: Presented at 2019 SIGIR Workshop on eCommerce (ECOM'19)

  24. A Zero Attention Model for Personalized Product Search

    Authors: Qingyao Ai, Daniel N. Hill, S. V. N. Vishwanathan, W. Bruce Croft

    Abstract: Product search is one of the most popular methods for people to discover and purchase products on e-commerce websites. Because personal preferences often have an important influence on the purchase decision of each customer, it is intuitive that personalization should be beneficial for product search engines. While synthetic experiments from previous studies show that purchase histories are useful…

    Submitted 29 August, 2019; originally announced August 2019.

  25. Attentive History Selection for Conversational Question Answering

    Authors: Chen Qu, Liu Yang, Minghui Qiu, Yongfeng Zhang, Cen Chen, W. Bruce Croft, Mohit Iyyer

    Abstract: Conversational question answering (ConvQA) is a simplified but concrete setting of conversational search. One of its major challenges is to leverage the conversation history to understand and answer the current question. In this work, we propose a novel solution for ConvQA that involves three aspects. First, we propose a positional history answer embedding method to encode conversation history wit…

    Submitted 25 August, 2019; originally announced August 2019.

    Comments: Accepted to CIKM 2019

  26. arXiv:1907.06554  [pdf, other]

    cs.CL cs.AI cs.IR

    Asking Clarifying Questions in Open-Domain Information-Seeking Conversations

    Authors: Mohammad Aliannejadi, Hamed Zamani, Fabio Crestani, W. Bruce Croft

    Abstract: Users often fail to formulate their complex information needs in a single query. As a consequence, they may need to scan multiple result pages or reformulate their queries, which may be a frustrating experience. Alternatively, systems can improve user satisfaction by proactively asking questions of the users to clarify their information needs. Asking clarifying questions is especially important in…

    Submitted 15 July, 2019; originally announced July 2019.

    Comments: To appear in SIGIR 2019

  27. arXiv:1905.08957  [pdf, other]

    cs.IR cs.CL

    ANTIQUE: A Non-Factoid Question Answering Benchmark

    Authors: Helia Hashemi, Mohammad Aliannejadi, Hamed Zamani, W. Bruce Croft

    Abstract: Considering the widespread use of mobile and voice search, answer passage retrieval for non-factoid questions plays a critical role in modern information retrieval systems. Despite the importance of the task, the community still feels the significant lack of large-scale non-factoid question answering collections with real questions and comprehensive relevance judgments. In this paper, we develop a…

    Submitted 19 August, 2019; v1 submitted 22 May, 2019; originally announced May 2019.

  28. BERT with History Answer Embedding for Conversational Question Answering

    Authors: Chen Qu, Liu Yang, Minghui Qiu, W. Bruce Croft, Yongfeng Zhang, Mohit Iyyer

    Abstract: Conversational search is an emerging topic in the information retrieval community. One of the major challenges to multi-turn conversational search is to model the conversation history to answer the current question. Existing methods either prepend history turns to the current question or use complicated attention mechanisms to model the history. We propose a conceptually simple yet highly effectiv…

    Submitted 27 October, 2019; v1 submitted 14 May, 2019; originally announced May 2019.

    Comments: Accepted to SIGIR 2019 as a short paper

  29. arXiv:1905.01758  [pdf, other]

    cs.IR cs.CL

    Investigating the Successes and Failures of BERT for Passage Re-Ranking

    Authors: Harshith Padigela, Hamed Zamani, W. Bruce Croft

    Abstract: The bidirectional encoder representations from transformers (BERT) model has recently advanced the state-of-the-art in passage re-ranking. In this paper, we analyze the results produced by a fine-tuned BERT model to better understand the reasons behind such substantial improvements. To this aim, we focus on the MS MARCO passage re-ranking dataset and provide potential reasons for the successes and…

    Submitted 5 May, 2019; originally announced May 2019.

  30. arXiv:1904.09068  [pdf, other]

    cs.IR cs.CL

    A Hybrid Retrieval-Generation Neural Conversation Model

    Authors: Liu Yang, Junjie Hu, Minghui Qiu, Chen Qu, Jianfeng Gao, W. Bruce Croft, Xiaodong Liu, Yelong Shen, Jingjing Liu

    Abstract: Intelligent personal assistant systems that are able to have multi-turn conversations with human users are becoming increasingly popular. Most previous research has been focused on using either retrieval-based or generation-based methods to develop such systems. Retrieval-based methods have the advantage of returning fluent and informative responses with great diversity. However, the performance o…

    Submitted 25 August, 2019; v1 submitted 19 April, 2019; originally announced April 2019.

    Comments: Accepted as a Full Paper in CIKM 2019. 10 pages

  31. arXiv:1903.06902  [pdf, other]

    cs.IR

    A Deep Look into Neural Ranking Models for Information Retrieval

    Authors: Jiafeng Guo, Yixing Fan, Liang Pang, Liu Yang, Qingyao Ai, Hamed Zamani, Chen Wu, W. Bruce Croft, Xueqi Cheng

    Abstract: Ranking models lie at the heart of research on information retrieval (IR). During the past decades, different techniques have been proposed for constructing ranking models, from traditional heuristic methods, probabilistic methods, to modern machine learning methods. Recently, with the advance of deep learning technology, we have witnessed a growing body of work in applying shallow or deep neural…

    Submitted 27 June, 2019; v1 submitted 16 March, 2019; originally announced March 2019.

  32. arXiv:1812.11561  [pdf, other]

    cs.IR cs.CL cs.LG

    Learning to Selectively Transfer: Reinforced Transfer Learning for Deep Text Matching

    Authors: Chen Qu, Feng Ji, Minghui Qiu, Liu Yang, Zhiyu Min, Haiqing Chen, Jun Huang, W. Bruce Croft

    Abstract: Deep text matching approaches have been widely studied for many applications including question answering and information retrieval systems. To deal with a domain that has insufficient labeled data, these approaches can be used in a Transfer Learning (TL) setting to leverage labeled data from a resource-rich source domain. To achieve better performance, source domain data selection is essential in…

    Submitted 30 December, 2018; originally announced December 2018.

    Comments: Accepted to WSDM 2019

  33. arXiv:1812.08870  [pdf, other]

    cs.IR

    Iterative Relevance Feedback for Answer Passage Retrieval with Passage-level Semantic Match

    Authors: Keping Bi, Qingyao Ai, W. Bruce Croft

    Abstract: Relevance feedback techniques assume that users provide relevance judgments for the top k (usually 10) documents and then re-rank using a new query model based on those judgments. Even though this is effective, there has been little research recently on this topic because requiring users to provide substantial feedback on a result list is impractical in a typical web search scenario. In new enviro…

    Submitted 20 December, 2018; originally announced December 2018.

    Journal ref: 41st European Conference on IR Research, ECIR 2019

  34. arXiv:1812.05731  [pdf, other]

    cs.IR

    Revisiting Iterative Relevance Feedback for Document and Passage Retrieval

    Authors: Keping Bi, Qingyao Ai, W. Bruce Croft

    Abstract: As more and more search traffic comes from mobile phones, intelligent assistants, and smart-home devices, new challenges (e.g., limited presentation space) and opportunities come up in information retrieval. Previously, an effective technique, relevance feedback (RF), has rarely been used in real search scenarios due to the overhead of collecting users' relevance judgments. However, since users te…

    Submitted 9 June, 2019; v1 submitted 13 December, 2018; originally announced December 2018.

  35. arXiv:1807.05631  [pdf, other]

    cs.IR

    Joint Modeling and Optimization of Search and Recommendation

    Authors: Hamed Zamani, W. Bruce Croft

    Abstract: Despite the somewhat different techniques used in developing search engines and recommender systems, they both follow the same goal: helping people to get the information they need at the right time. Due to this common goal, search and recommendation models can potentially benefit from each other. The recent advances in neural network technologies make them effective and easily extendable for vari…

    Submitted 15 July, 2018; originally announced July 2018.

    Comments: In Proceedings of Design of Experimental Search & Information REtrieval Systems (DESIRES 2018)

  36. arXiv:1806.05434  [pdf, other]

    cs.CL

    Transfer Learning for Context-Aware Question Matching in Information-seeking Conversations in E-commerce

    Authors: Minghui Qiu, Liu Yang, Feng Ji, Weipeng Zhao, Wei Zhou, Jun Huang, Haiqing Chen, W. Bruce Croft, Wei Lin

    Abstract: Building multi-turn information-seeking conversation systems is an important and challenging research topic. Although several advanced neural text matching models have been proposed for this task, they are generally not efficient for industrial applications. Furthermore, they rely on a large amount of labeled data, which may not be available in real-world applications. To alleviate these problems,…

    Submitted 14 June, 2018; originally announced June 2018.

    Comments: 6

    Journal ref: ACL 2018

  37. arXiv:1806.04815  [pdf, ps, other]

    cs.IR

    Towards Theoretical Understanding of Weak Supervision for Information Retrieval

    Authors: Hamed Zamani, W. Bruce Croft

    Abstract: Neural network approaches have recently been shown to be effective in several information retrieval (IR) tasks. However, neural approaches often require large volumes of training data to perform effectively, which is not always available. To mitigate the shortage of labeled data, training neural IR models with weak supervision has been recently proposed and received considerable attention in the litera…

    Submitted 12 June, 2018; originally announced June 2018.

    Comments: A position paper accepted to the 2018 ACM SIGIR Workshop on Learning from Limited or Noisy Data for Information Retrieval (LND4IR)

  38. arXiv:1806.03790  [pdf, other]

    cs.IR

    Distributed Evaluations: Ending Neural Point Metrics

    Authors: Daniel Cohen, Scott M. Jordan, W. Bruce Croft

    Abstract: With the rise of neural models across the field of information retrieval, numerous publications have incrementally pushed the envelope of performance for a multitude of IR tasks. However, these networks often sample data in random order, are initialized randomly, and their success is determined by a single evaluation score. These issues are aggravated by neural models achieving incremental improve…

    Submitted 10 June, 2018; originally announced June 2018.

    Comments: ACM SIGIR - LND4IR Workshop

  39. arXiv:1805.03797  [pdf, other]

    cs.IR

    WikiPassageQA: A Benchmark Collection for Research on Non-factoid Answer Passage Retrieval

    Authors: Daniel Cohen, Liu Yang, W. Bruce Croft

    Abstract: With the rise in mobile and voice search, answer passage retrieval acts as a critical component of an effective information retrieval system for open domain question answering. Currently, there are no comparable collections that address non-factoid question answering within larger documents while simultaneously providing enough examples sufficient to train a deep neural network. In this paper, we…

    Submitted 9 May, 2018; originally announced May 2018.

    Comments: Accepted by SIGIR18

  40. arXiv:1805.03403  [pdf, other]

    cs.IR

    Cross Domain Regularization for Neural Ranking Models Using Adversarial Learning

    Authors: Daniel Cohen, Bhaskar Mitra, Katja Hofmann, W. Bruce Croft

    Abstract: Unlike traditional learning to rank models that depend on hand-crafted features, neural representation learning models learn higher level features for the ranking task by training on large datasets. Their ability to learn new features directly from the data, however, may come at a price. Without any special supervision, these models learn relationships that may hold only in the domain from which t…

    Submitted 9 May, 2018; originally announced May 2018.

    Comments: SIGIR 2018 short paper

  41. Target Apps Selection: Towards a Unified Search Framework for Mobile Devices

    Authors: Mohammad Aliannejadi, Hamed Zamani, Fabio Crestani, W. Bruce Croft

    Abstract: With the recent growth of conversational systems and intelligent assistants such as Apple Siri and Google Assistant, mobile devices are becoming even more pervasive in our lives. As a consequence, users are getting engaged with the mobile apps and frequently search for an information need in their apps. However, users cannot search within their apps through their intelligent assistants. This requi…

    Submitted 13 July, 2018; v1 submitted 6 May, 2018; originally announced May 2018.

    Comments: To appear at SIGIR 2018

  42. Response Ranking with Deep Matching Networks and External Knowledge in Information-seeking Conversation Systems

    Authors: Liu Yang, Minghui Qiu, Chen Qu, Jiafeng Guo, Yongfeng Zhang, W. Bruce Croft, Jun Huang, Haiqing Chen

    Abstract: Intelligent personal assistant systems with either text-based or voice-based conversational interfaces are becoming increasingly popular around the world. Retrieval-based conversation models have the advantages of returning fluent and informative responses. Most existing studies in this area are on open domain "chit-chat" conversations or task / transaction oriented conversations. More research is…

    Submitted 9 May, 2018; v1 submitted 1 May, 2018; originally announced May 2018.

    Comments: Accepted by the 41st International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2018), Ann Arbor, Michigan, U.S.A. July 8-12, 2018 (Full Oral Paper)

  43. Analyzing and Characterizing User Intent in Information-seeking Conversations

    Authors: Chen Qu, Liu Yang, W. Bruce Croft, Johanne R. Trippas, Yongfeng Zhang, Minghui Qiu

    Abstract: Understanding and characterizing how people interact in information-seeking conversations is crucial in developing conversational search systems. In this paper, we introduce a new dataset designed for this purpose and use it to analyze information-seeking conversations by user intent distribution, co-occurrence, and flow patterns. The MSDialog dataset is a labeled dialog dataset of question answer…

    Submitted 23 April, 2018; originally announced April 2018.

    Comments: Accepted by SIGIR 2018 as a short paper

  44. Unbiased Learning to Rank with Unbiased Propensity Estimation

    Authors: Qingyao Ai, Keping Bi, Cheng Luo, Jiafeng Guo, W. Bruce Croft

    Abstract: Learning to rank with biased click data is a well-known challenge. A variety of methods has been explored to debias click data for learning to rank such as click models, result interleaving and, more recently, the unbiased learning-to-rank framework based on inverse propensity weighting. Despite their differences, most existing studies separate the estimation of click bias (namely the prop…

    Submitted 23 April, 2018; v1 submitted 16 April, 2018; originally announced April 2018.

  45. Learning a Deep Listwise Context Model for Ranking Refinement

    Authors: Qingyao Ai, Keping Bi, Jiafeng Guo, W. Bruce Croft

    Abstract: Learning to rank has been intensively studied and widely applied in information retrieval. Typically, a global ranking function is learned from a set of labeled data, which can achieve good performance on average but may be suboptimal for individual queries by ignoring the fact that relevant documents for different queries may have different distributions in the feature space. Inspired by the idea…

    Submitted 23 April, 2018; v1 submitted 16 April, 2018; originally announced April 2018.

  46. arXiv:1801.01641  [pdf, other

    cs.IR cs.CL

    aNMM: Ranking Short Answer Texts with Attention-Based Neural Matching Model

    Authors: Liu Yang, Qingyao Ai, Jiafeng Guo, W. Bruce Croft

    Abstract: As an alternative to question answering methods based on feature engineering, deep learning approaches such as convolutional neural networks (CNNs) and Long Short-Term Memory Models (LSTMs) have recently been proposed for semantic matching of questions and answers. To achieve good results, however, these models have been combined with additional features such as word overlap or BM25 scores. Withou…

    Submitted 31 May, 2019; v1 submitted 5 January, 2018; originally announced January 2018.

    Comments: Accepted as a full paper by CIKM'16

  47. A Deep Relevance Matching Model for Ad-hoc Retrieval

    Authors: Jiafeng Guo, Yixing Fan, Qingyao Ai, W. Bruce Croft

    Abstract: In recent years, deep neural networks have led to exciting breakthroughs in speech recognition, computer vision, and natural language processing (NLP) tasks. However, there have been few positive results of deep models on ad-hoc retrieval tasks. This is partially due to the fact that many important characteristics of the ad-hoc retrieval task have not been well addressed in deep models yet. Typica…

    Submitted 23 November, 2017; originally announced November 2017.

    Comments: CIKM 2016, long paper

  48. arXiv:1707.05409  [pdf, other

    cs.IR

    Neural Matching Models for Question Retrieval and Next Question Prediction in Conversation

    Authors: Liu Yang, Hamed Zamani, Yongfeng Zhang, Jiafeng Guo, W. Bruce Croft

    Abstract: The recent boom of AI has seen the emergence of many human-computer conversation systems such as Google Assistant, Microsoft Cortana, Amazon Echo and Apple Siri. We introduce and formalize the task of predicting questions in conversations, where the goal is to predict the new question that the user will ask, given the past conversational context. This task can be modeled as a "sequence matching" p…

    Submitted 17 July, 2017; originally announced July 2017.

    Comments: Neu-IR 2017: The SIGIR 2017 Workshop on Neural Information Retrieval (SIGIR Neu-IR 2017), Tokyo, Japan, August 7-11, 2017

  49. arXiv:1705.03556  [pdf, other

    cs.IR cs.CL cs.LG cs.NE

    Relevance-based Word Embedding

    Authors: Hamed Zamani, W. Bruce Croft

    Abstract: Learning a high-dimensional dense representation for vocabulary terms, also known as a word embedding, has recently attracted much attention in natural language processing and information retrieval tasks. The embedding vectors are typically learned based on term proximity in a large corpus. This means that the objective in well-known word embedding algorithms, e.g., word2vec, is to accurately pred…

    Submitted 16 July, 2017; v1 submitted 9 May, 2017; originally announced May 2017.

    Comments: to appear in the proceedings of The 40th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR '17)

  50. arXiv:1704.08803  [pdf, other

    cs.IR cs.CL cs.LG

    Neural Ranking Models with Weak Supervision

    Authors: Mostafa Dehghani, Hamed Zamani, Aliaksei Severyn, Jaap Kamps, W. Bruce Croft

    Abstract: Despite the impressive improvements achieved by unsupervised deep neural networks in computer vision and NLP tasks, such improvements have not yet been observed in ranking for information retrieval. The reason may be the complexity of the ranking problem, as it is not obvious how to learn from queries and documents when no supervised signal is available. Hence, in this paper, we propose to train a…

    Submitted 29 May, 2017; v1 submitted 28 April, 2017; originally announced April 2017.

    Comments: In proceedings of The 40th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR2017)