Showing 1–25 of 25 results for author: Roy, R S

  1. arXiv:2310.13505  [pdf, other]

    cs.CL cs.AI cs.IR

    Robust Training for Conversational Question Answering Models with Reinforced Reformulation Generation

    Authors: Magdalena Kaiser, Rishiraj Saha Roy, Gerhard Weikum

    Abstract: Models for conversational question answering (ConvQA) over knowledge graphs (KGs) are usually trained and tested on benchmarks of gold QA pairs. This implies that training is limited to surface forms seen in the respective datasets, and evaluation is on a small set of held-out questions. Through our proposed framework REIGN, we take several steps to remedy this restricted learning setup. First, we…

    Submitted 16 February, 2024; v1 submitted 20 October, 2023; originally announced October 2023.

    Comments: WSDM 2024 Research Paper, 11 pages

  2. arXiv:2306.12235  [pdf, other]

    cs.IR

    CompMix: A Benchmark for Heterogeneous Question Answering

    Authors: Philipp Christmann, Rishiraj Saha Roy, Gerhard Weikum

    Abstract: Fact-centric question answering (QA) often requires access to multiple, heterogeneous, information sources. By jointly considering several sources like a knowledge base (KB), a text collection, and tables from the web, QA systems can enhance their answer coverage and confidence. However, existing QA benchmarks are mostly constructed with a single source of knowledge in mind. This limits capabiliti…

    Submitted 19 August, 2023; v1 submitted 21 June, 2023; originally announced June 2023.

  3. arXiv:2305.01548  [pdf, other]

    cs.IR

    Explainable Conversational Question Answering over Heterogeneous Sources via Iterative Graph Neural Networks

    Authors: Philipp Christmann, Rishiraj Saha Roy, Gerhard Weikum

    Abstract: In conversational question answering, users express their information needs through a series of utterances with incomplete context. Typical ConvQA methods rely on a single source (a knowledge base (KB), or a text corpus, or a set of tables), thus being unable to benefit from increased answer coverage and redundancy of multiple sources. Our method EXPLAIGNN overcomes these limitations by integratin…

    Submitted 18 July, 2023; v1 submitted 2 May, 2023; originally announced May 2023.

    Comments: Accepted at SIGIR 2023 (extended version)

  4. arXiv:2209.08171  [pdf, other]

    q-bio.BM cs.AI cs.LG

    Deep learning for reconstructing protein structures from cryo-EM density maps: recent advances and future directions

    Authors: Nabin Giri, Raj S. Roy, Jianlin Cheng

    Abstract: Cryo-Electron Microscopy (cryo-EM) has emerged as a key technology to determine the structure of proteins, particularly large protein complexes and assemblies in recent years. A key challenge in cryo-EM data analysis is to automatically reconstruct accurate protein structures from cryo-EM density maps. In this review, we briefly overview various deep learning methods for building protein structure…

    Submitted 16 September, 2022; originally announced September 2022.

    Journal ref: Current Opinion in Structural Biology Volume 79, April 2023, 102536

  5. arXiv:2205.13594  [pdf, other]

    cs.LG cs.AI q-bio.BM q-bio.QM

    DRLComplex: Reconstruction of protein quaternary structures using deep reinforcement learning

    Authors: Elham Soltanikazemi, Raj S. Roy, Farhan Quadir, Nabin Giri, Alex Morehead, Jianlin Cheng

    Abstract: Predicted inter-chain residue-residue contacts can be used to build the quaternary structure of protein complexes from scratch. However, only a small number of methods have been developed to reconstruct protein quaternary structures using predicted inter-chain contacts. Here, we present an agent-based self-learning method based on deep reinforcement learning (DRLComplex) to build protein complex s…

    Submitted 26 May, 2022; originally announced May 2022.

    Comments: 20 pages, 8 figures, 12 tables. Under review

    ACM Class: I.2.1; J.3

  6. arXiv:2204.11677  [pdf, other]

    cs.IR cs.CL

    Conversational Question Answering on Heterogeneous Sources

    Authors: Philipp Christmann, Rishiraj Saha Roy, Gerhard Weikum

    Abstract: Conversational question answering (ConvQA) tackles sequential information needs where contexts in follow-up questions are left implicit. Current ConvQA systems operate over homogeneous sources of information: either a knowledge base (KB), or a text corpus, or a collection of tables. This paper addresses the novel issue of jointly tapping into all of these together, this way boosting answer coverag…

    Submitted 30 June, 2023; v1 submitted 25 April, 2022; originally announced April 2022.

    Comments: SIGIR 2022 Research Track Long Paper

  7. Complex Temporal Question Answering on Knowledge Graphs

    Authors: Zhen Jia, Soumajit Pramanik, Rishiraj Saha Roy, Gerhard Weikum

    Abstract: Question answering over knowledge graphs (KG-QA) is a vital topic in IR. Questions with temporal intent are a special class of practical importance, but have not received much attention in research. This work presents EXAQT, the first end-to-end system for answering complex temporal questions that have multiple entities and predicates, and associated temporal conditions. EXAQT answers natural lang…

    Submitted 18 September, 2021; originally announced September 2021.

    Comments: CIKM 2021 Long Paper, 11 pages

  8. arXiv:2108.08614  [pdf]

    cs.IR cs.CL

    UNIQORN: Unified Question Answering over RDF Knowledge Graphs and Natural Language Text

    Authors: Soumajit Pramanik, Jesujoba Alabi, Rishiraj Saha Roy, Gerhard Weikum

    Abstract: Question answering over RDF data like knowledge graphs has been greatly advanced, with a number of good systems providing crisp answers for natural language questions or telegraphic queries. Some of these systems incorporate textual sources as additional evidence for the answering process, but cannot compute answers that are present in text alone. Conversely, the IR and NLP communities have addres…

    Submitted 10 October, 2023; v1 submitted 19 August, 2021; originally announced August 2021.

    Comments: 24 pages

    ACM Class: H.3.3

  9. arXiv:2108.08597  [pdf, other]

    cs.IR cs.CL

    Beyond NED: Fast and Effective Search Space Reduction for Complex Question Answering over Knowledge Bases

    Authors: Philipp Christmann, Rishiraj Saha Roy, Gerhard Weikum

    Abstract: Answering complex questions over knowledge bases (KB-QA) faces huge input data with billions of facts, involving millions of entities and thousands of predicates. For efficiency, QA systems first reduce the answer search space by identifying a set of facts that is likely to contain all answers and relevant cues. The most common technique for doing this is to apply named entity disambiguation (NED)…

    Submitted 4 April, 2022; v1 submitted 19 August, 2021; originally announced August 2021.

    Comments: WSDM 2022 Research Track Long Paper (Extended version)

  10. Counterfactual Explanations for Neural Recommenders

    Authors: Khanh Hiep Tran, Azin Ghazimatin, Rishiraj Saha Roy

    Abstract: Understanding why specific items are recommended to users can significantly increase their trust and satisfaction in the system. While neural recommenders have become the state-of-the-art in recent years, the complexity of deep models still makes the generation of tangible explanations for end users a challenging problem. Existing methods are usually based on attention distributions over a variety…

    Submitted 11 May, 2021; originally announced May 2021.

    Comments: SIGIR 2021 Short Paper, 5 pages

  11. Reinforcement Learning from Reformulations in Conversational Question Answering over Knowledge Graphs

    Authors: Magdalena Kaiser, Rishiraj Saha Roy, Gerhard Weikum

    Abstract: The rise of personal assistants has made conversational question answering (ConvQA) a very popular mechanism for user-system interaction. State-of-the-art methods for ConvQA over knowledge graphs (KGs) can only learn from crisp question-answer pairs found in popular benchmarks. In reality, however, such training data is hard to come by: users would rarely mark answers explicitly as correct or wron…

    Submitted 20 August, 2021; v1 submitted 11 May, 2021; originally announced May 2021.

    Comments: SIGIR 2021 Long Paper, 11 pages

  12. arXiv:2102.09388  [pdf, other]

    cs.IR cs.AI cs.LG

    ELIXIR: Learning from User Feedback on Explanations to Improve Recommender Models

    Authors: Azin Ghazimatin, Soumajit Pramanik, Rishiraj Saha Roy, Gerhard Weikum

    Abstract: System-provided explanations for recommendations are an important component towards transparent and trustworthy AI. In state-of-the-art research, this is a one-way signal, though, to improve user acceptance. In this paper, we turn the role of explanations around and investigate how they can contribute to enhancing the quality of the generated recommendations themselves. We devise a human-in-the-lo…

    Submitted 30 April, 2021; v1 submitted 15 February, 2021; originally announced February 2021.

    Comments: WWW 2021, 11 pages

  13. arXiv:2004.13117  [pdf, other]

    cs.IR cs.CL

    Conversational Question Answering over Passages by Leveraging Word Proximity Networks

    Authors: Magdalena Kaiser, Rishiraj Saha Roy, Gerhard Weikum

    Abstract: Question answering (QA) over text passages is a problem of long-standing interest in information retrieval. Recently, the conversational setting has attracted attention, where a user asks a sequence of questions to satisfy her information needs around a topic. While this setup is a natural one and similar to humans conversing with each other, it introduces two key research challenges: understandin…

    Submitted 25 May, 2020; v1 submitted 27 April, 2020; originally announced April 2020.

    Comments: SIGIR 2020 Demonstrations

  14. Question Answering over Curated and Open Web Sources

    Authors: Rishiraj Saha Roy, Avishek Anand

    Abstract: The last few years have seen an explosion of research on the topic of automated question answering (QA), spanning the communities of information retrieval, natural language processing, and artificial intelligence. This tutorial would cover the highlights of this really active period of growth for QA to give the audience a grasp over the families of algorithms that are currently being used. We part…

    Submitted 7 August, 2020; v1 submitted 24 April, 2020; originally announced April 2020.

    Comments: SIGIR 2020 Tutorial

  15. Towards Query Logs for Privacy Studies: On Deriving Search Queries from Questions

    Authors: Asia J. Biega, Jana Schmidt, Rishiraj Saha Roy

    Abstract: Translating verbose information needs into crisp search queries is a phenomenon that is ubiquitous but hardly understood. Insights into this process could be valuable in several applications, including synthesizing large privacy-friendly query logs from public Web sources which are readily available to the academic research community. In this work, we take a step towards understanding query formul…

    Submitted 3 June, 2021; v1 submitted 4 April, 2020; originally announced April 2020.

    Comments: ECIR 2020 Short Paper

  16. arXiv:1911.08378  [pdf, other]

    cs.LG cs.AI stat.ML

    PRINCE: Provider-side Interpretability with Counterfactual Explanations in Recommender Systems

    Authors: Azin Ghazimatin, Oana Balalau, Rishiraj Saha Roy, Gerhard Weikum

    Abstract: Interpretable explanations for recommender systems and other machine learning models are crucial to gain user trust. Prior works that have focused on paths connecting users and items in a heterogeneous network have several limitations, such as discovering relationships rather than true explanations, or disregarding other users' privacy. In this work, we take a fresh perspective, and present PRINCE…

    Submitted 24 December, 2019; v1 submitted 19 November, 2019; originally announced November 2019.

    Comments: WSDM 2020, 9 pages

  17. arXiv:1911.02850  [pdf, other]

    cs.IR cs.CL

    CROWN: Conversational Passage Ranking by Reasoning over Word Networks

    Authors: Magdalena Kaiser, Rishiraj Saha Roy, Gerhard Weikum

    Abstract: Information needs around a topic cannot be satisfied in a single turn; users typically ask follow-up questions referring to the same theme and a system must be capable of understanding the conversational context of a request to retrieve correct answers. In this paper, we present our submission to the TREC Conversational Assistance Track 2019, in which such a conversational setting is explored. We…

    Submitted 11 February, 2020; v1 submitted 7 November, 2019; originally announced November 2019.

    Comments: TREC 2019, 14 pages

    Journal ref: TREC 2019

  18. Look before you Hop: Conversational Question Answering over Knowledge Graphs Using Judicious Context Expansion

    Authors: Philipp Christmann, Rishiraj Saha Roy, Abdalghani Abujabal, Jyotsna Singh, Gerhard Weikum

    Abstract: Fact-centric information needs are rarely one-shot; users typically ask follow-up questions to explore a topic. In such a conversational setting, the user's inputs are often incomplete, with entities or predicates left out, and ungrammatical phrases. This poses a huge challenge to question answering (QA) systems that typically rely on cues in full-fledged interrogative sentences. As a solution, we…

    Submitted 5 November, 2019; v1 submitted 8 October, 2019; originally announced October 2019.

    Comments: CIKM 2019 Long Paper, 10 pages

    Journal ref: CIKM 2019

  19. TEQUILA: Temporal Question Answering over Knowledge Bases

    Authors: Zhen Jia, Abdalghani Abujabal, Rishiraj Saha Roy, Jannik Stroetgen, Gerhard Weikum

    Abstract: Question answering over knowledge bases (KB-QA) poses challenges in handling complex questions that need to be decomposed into sub-questions. An important case, addressed here, is that of temporal questions, where cues for temporal relations need to be discovered and handled. We present TEQUILA, an enabler method for temporal QA that can run on top of any KB-QA engine. TEQUILA has four stages. It…

    Submitted 25 January, 2021; v1 submitted 9 August, 2019; originally announced August 2019.

    Comments: CIKM 2018 Short Paper

    Journal ref: CIKM 2018

  20. arXiv:1908.03109  [pdf, other]

    cs.SI cs.LG stat.ML

    FAIRY: A Framework for Understanding Relationships between Users' Actions and their Social Feeds

    Authors: Azin Ghazimatin, Rishiraj Saha Roy, Gerhard Weikum

    Abstract: Users increasingly rely on social media feeds for consuming daily information. The items in a feed, such as news, questions, songs, etc., usually result from the complex interplay of a user's social contacts, her interests and her actions on the platform. The relationship of the user's own behavior and the received feed is often puzzling, and many users would like to have a clear explanation on wh…

    Submitted 5 November, 2019; v1 submitted 8 August, 2019; originally announced August 2019.

    Comments: WSDM 2019

    MSC Class: http://www.acm.org/about/class/1998

    Journal ref: WSDM 2019

  21. Answering Complex Questions by Joining Multi-Document Evidence with Quasi Knowledge Graphs

    Authors: Xiaolu Lu, Soumajit Pramanik, Rishiraj Saha Roy, Abdalghani Abujabal, Yafang Wang, Gerhard Weikum

    Abstract: Direct answering of questions that involve multiple entities and relations is a challenge for text-based QA. This problem is most pronounced when answers can be found only by joining evidence from multiple documents. Curated knowledge graphs (KGs) may yield good answers, but are limited by their inherent incompleteness and potential staleness. This paper presents QUEST, a method that can answer co…

    Submitted 28 November, 2020; v1 submitted 1 August, 2019; originally announced August 2019.

    Comments: SIGIR 2019 Long Paper, 10 pages

  22. arXiv:1809.09528  [pdf, other]

    cs.CL

    ComQA: A Community-sourced Dataset for Complex Factoid Question Answering with Paraphrase Clusters

    Authors: Abdalghani Abujabal, Rishiraj Saha Roy, Mohamed Yahya, Gerhard Weikum

    Abstract: To bridge the gap between the capabilities of the state-of-the-art in factoid question answering (QA) and what users ask, we need large datasets of real user questions that capture the various question phenomena users are interested in, and the diverse ways in which these questions are formulated. We introduce ComQA, a large dataset of real user questions that exhibit different challenging aspects…

    Submitted 10 April, 2019; v1 submitted 25 September, 2018; originally announced September 2018.

    Comments: 11 pages, NAACL 2019

  23. arXiv:1305.1861  [pdf, other]

    q-bio.GN cs.CE

    Turtle: Identifying frequent k-mers with cache-efficient algorithms

    Authors: Rajat Shuvro Roy, Debashish Bhattacharya, Alexander Schliep

    Abstract: Counting the frequencies of k-mers in read libraries is often a first step in the analysis of high-throughput sequencing experiments. Infrequent k-mers are assumed to be a result of sequencing errors. The frequent k-mers constitute a reduced but error-free representation of the experiment, which can inform read error correction or serve as the input to de novo assembly methods. Ideally, the memory…

    Submitted 8 May, 2013; originally announced May 2013.

  24. arXiv:1111.1497  [pdf, ps, other]

    cs.IR

    An IR-based Evaluation Framework for Web Search Query Segmentation

    Authors: Rishiraj Saha Roy, Niloy Ganguly, Monojit Choudhury, Srivatsan Laxman

    Abstract: This paper presents the first evaluation framework for Web search query segmentation based directly on IR performance. In the past, segmentation strategies were mainly validated against manual annotations. Our work shows that the goodness of a segmentation algorithm as judged through evaluation against a handful of human annotated segmentations hardly reflects its effectiveness in an IR-based setu…

    Submitted 17 September, 2012; v1 submitted 7 November, 2011; originally announced November 2011.

    ACM Class: H.3.3

  25. arXiv:1111.1426  [pdf, ps, other]

    q-bio.GN cs.CE

    SLIQ: Simple Linear Inequalities for Efficient Contig Scaffolding

    Authors: Rajat S. Roy, Kevin C. Chen, Anirvan M. Sengupta, Alexander Schliep

    Abstract: Scaffolding is an important subproblem in "de novo" genome assembly in which mate pair data are used to construct a linear sequence of contigs separated by gaps. Here we present SLIQ, a set of simple linear inequalities derived from the geometry of contigs on the line that can be used to predict the relative positions and orientations of contigs from individual mate pair reads and thus produce a c…

    Submitted 9 November, 2011; v1 submitted 6 November, 2011; originally announced November 2011.

    Comments: 16 pages, 6 figures, 7 tables