Showing 1–46 of 46 results for author: Moschitti, A

  1. arXiv:2406.10172  [pdf, other]

    cs.CL

    Datasets for Multilingual Answer Sentence Selection

    Authors: Matteo Gabburo, Stefano Campese, Federico Agostini, Alessandro Moschitti

    Abstract: Answer Sentence Selection (AS2) is a critical task for designing effective retrieval-based Question Answering (QA) systems. Most advancements in AS2 focus on English due to the scarcity of annotated datasets for other languages. This lack of resources prevents the training of effective AS2 models in different languages, creating a performance gap between QA systems in English and other locales. In…

    Submitted 14 June, 2024; originally announced June 2024.

  2. arXiv:2406.03592  [pdf, other]

    cs.CL cs.AI

    Measuring Retrieval Complexity in Question Answering Systems

    Authors: Matteo Gabburo, Nicolaas Paul Jedema, Siddhant Garg, Leonardo F. R. Ribeiro, Alessandro Moschitti

    Abstract: In this paper, we investigate which questions are challenging for retrieval-based Question Answering (QA). We (i) propose retrieval complexity (RC), a novel metric conditioned on the completeness of retrieved documents, which measures the difficulty of answering questions, and (ii) propose an unsupervised pipeline to measure RC given an arbitrary retrieval system. Our proposed pipeline measures RC…

    Submitted 5 June, 2024; originally announced June 2024.

    Comments: Accepted to ACL 2024 (findings)

  3. arXiv:2309.12250  [pdf, other]

    cs.CL cs.LG

    SQUARE: Automatic Question Answering Evaluation using Multiple Positive and Negative References

    Authors: Matteo Gabburo, Siddhant Garg, Rik Koncel-Kedziorski, Alessandro Moschitti

    Abstract: Evaluation of QA systems is very challenging and expensive, with the most reliable approach being human annotations of correctness of answers for questions. Recent works (AVA, BEM) have shown that transformer LM encoder based similarity metrics transfer well for QA evaluation, but they are limited by the usage of a single correct reference answer. We propose a new evaluation metric: SQuArE (Senten…

    Submitted 21 September, 2023; originally announced September 2023.

    Comments: Accepted to IJCNLP-AACL 2023

  4. arXiv:2305.16302  [pdf, other]

    cs.CL

    Cross-Lingual Knowledge Distillation for Answer Sentence Selection in Low-Resource Languages

    Authors: Shivanshu Gupta, Yoshitomo Matsubara, Ankit Chadha, Alessandro Moschitti

    Abstract: While impressive performance has been achieved on the task of Answer Sentence Selection (AS2) for English, the same does not hold for languages that lack large labeled datasets. In this work, we propose Cross-Lingual Knowledge Distillation (CLKD) from a strong English AS2 teacher as a method to train AS2 models for low-resource languages in the tasks without the need of labeled data for the target…

    Submitted 25 May, 2023; originally announced May 2023.

    Comments: Accepted at ACL 2023 as a long paper (Findings). Datasets are available at https://huggingface.co/datasets/AmazonScience/xtr-wiki_qa and https://huggingface.co/datasets/AmazonScience/tydi-as2

  5. arXiv:2305.15358  [pdf, other]

    cs.CL cs.LG

    Context-Aware Transformer Pre-Training for Answer Sentence Selection

    Authors: Luca Di Liello, Siddhant Garg, Alessandro Moschitti

    Abstract: Answer Sentence Selection (AS2) is a core component for building an accurate Question Answering pipeline. AS2 models rank a set of candidate sentences based on how likely they answer a given question. The state of the art in AS2 exploits pre-trained transformers by transferring them on large annotated datasets, while using local contextual information around the candidate sentence. In this paper,…

    Submitted 24 May, 2023; originally announced May 2023.

    Comments: Accepted at ACL 2023

  6. arXiv:2305.15344  [pdf, other]

    cs.CL cs.LG

    Learning Answer Generation using Supervision from Automatic Question Answering Evaluators

    Authors: Matteo Gabburo, Siddhant Garg, Rik Koncel-Kedziorski, Alessandro Moschitti

    Abstract: Recent studies show that sentence-level extractive QA, i.e., based on Answer Sentence Selection (AS2), is outperformed by Generation-based QA (GenQA) models, which generate answers using the top-k answer sentences ranked by AS2 models (a la retrieval-augmented generation style). In this paper, we propose a novel training paradigm for GenQA using supervision from automatic QA evaluation models (GAV…

    Submitted 24 May, 2023; originally announced May 2023.

    Comments: Accepted at ACL 2023

  7. arXiv:2304.01003  [pdf, other]

    cs.CL cs.LG

    QUADRo: Dataset and Models for QUestion-Answer Database Retrieval

    Authors: Stefano Campese, Ivano Lauriola, Alessandro Moschitti

    Abstract: An effective paradigm for building Automated Question Answering systems is the re-use of previously answered questions, e.g., for FAQs or forum applications. Given a database (DB) of question/answer (q/a) pairs, it is possible to answer a target question by scanning the DB for similar questions. In this paper, we scale this approach to open domain, making it competitive with other standard methods…

    Submitted 29 March, 2023; originally announced April 2023.

  8. arXiv:2210.13536  [pdf, other]

    cs.CL

    Effective Pre-Training Objectives for Transformer-based Autoencoders

    Authors: Luca Di Liello, Matteo Gabburo, Alessandro Moschitti

    Abstract: In this paper, we study trade-offs between efficiency, cost and accuracy when pre-training Transformer encoders with different pre-training objectives. For this purpose, we analyze features of common objectives and combine them to create new effective pre-training approaches. Specifically, we designed light token generators based on a straightforward statistical approach, which can replace ELECTRA…

    Submitted 24 October, 2022; originally announced October 2022.

    Comments: Accepted at EMNLP 2022 Findings

  9. arXiv:2210.12865  [pdf, other]

    cs.CL cs.LG

    Knowledge Transfer from Answer Ranking to Answer Generation

    Authors: Matteo Gabburo, Rik Koncel-Kedziorski, Siddhant Garg, Luca Soldaini, Alessandro Moschitti

    Abstract: Recent studies show that Question Answering (QA) based on Answer Sentence Selection (AS2) can be improved by generating an improved answer from the top-k ranked answer sentences (termed GenQA). This allows for synthesizing the information from multiple candidates into a concise, natural-sounding answer. However, creating large-scale supervised training data for GenQA models is very challenging. In…

    Submitted 23 October, 2022; originally announced October 2022.

    Comments: Accepted at EMNLP 2022

  10. arXiv:2205.10455  [pdf, other]

    cs.CL

    Pre-training Transformer Models with Sentence-Level Objectives for Answer Sentence Selection

    Authors: Luca Di Liello, Siddhant Garg, Luca Soldaini, Alessandro Moschitti

    Abstract: An important task for designing QA systems is answer sentence selection (AS2): selecting the sentence containing (or constituting) the answer to a question from a set of retrieved relevant documents. In this paper, we propose three novel sentence-level transformer pre-training objectives that incorporate paragraph-level semantics within and across documents, to improve the performance of transform…

    Submitted 20 October, 2022; v1 submitted 20 May, 2022; originally announced May 2022.

    Comments: Accepted at EMNLP 2022

  11. arXiv:2205.01228  [pdf, other]

    cs.CL

    Paragraph-based Transformer Pre-training for Multi-Sentence Inference

    Authors: Luca Di Liello, Siddhant Garg, Luca Soldaini, Alessandro Moschitti

    Abstract: Inference tasks such as answer sentence selection (AS2) or fact verification are typically solved by fine-tuning transformer-based models as individual sentence-pair classifiers. Recent studies show that these tasks benefit from modeling dependencies across multiple candidate sentences jointly. In this paper, we first show that popular pre-trained transformers perform poorly when used for fine-tun…

    Submitted 6 July, 2022; v1 submitted 2 May, 2022; originally announced May 2022.

    Comments: Accepted at NAACL 2022

  12. arXiv:2203.09598  [pdf, other]

    cs.CL cs.LG

    DP-KB: Data Programming with Knowledge Bases Improves Transformer Fine Tuning for Answer Sentence Selection

    Authors: Nic Jedema, Thuy Vu, Manish Gupta, Alessandro Moschitti

    Abstract: While transformers demonstrate impressive performance on many knowledge intensive (KI) tasks, their ability to serve as implicit knowledge bases (KBs) remains limited, as shown on several slot-filling, question-answering (QA), fact verification, and entity-linking tasks. In this paper, we implement an efficient, data-programming technique that enriches training data with KB-derived context and imp…

    Submitted 17 March, 2022; originally announced March 2022.

    Comments: Workshop on Databases and AI @NeurIPS 2021, Oral Presentation

  13. Question-Answer Sentence Graph for Joint Modeling Answer Selection

    Authors: Roshni G. Iyer, Thuy Vu, Alessandro Moschitti, Yizhou Sun

    Abstract: This research studies graph-based approaches for Answer Sentence Selection (AS2), an essential component for retrieval-based Question Answering (QA) systems. During offline learning, our model constructs a small-scale relevant training graph per question in an unsupervised manner, and integrates with Graph Neural Networks. Graph nodes are question sentence to answer sentence pairs. We train and in…

    Submitted 23 April, 2023; v1 submitted 16 February, 2022; originally announced March 2022.

  14. arXiv:2201.05984  [pdf, other]

    cs.CL

    In Situ Answer Sentence Selection at Web-scale

    Authors: Zeyu Zhang, Thuy Vu, Alessandro Moschitti

    Abstract: Current answer sentence selection (AS2) applied in open-domain question answering (ODQA) selects answers by ranking a large set of possible candidates, i.e., sentences, extracted from the retrieved text. In this paper, we present Passage-based Extracting Answer Sentence In-place (PEASI), a novel design for AS2 optimized for Web-scale setting, that, instead, computes such answer without processing…

    Submitted 16 January, 2022; originally announced January 2022.

  15. arXiv:2201.05981  [pdf, other]

    cs.CL

    Double Retrieval and Ranking for Accurate Question Answering

    Authors: Zeyu Zhang, Thuy Vu, Alessandro Moschitti

    Abstract: Recent work has shown that an answer verification step introduced in Transformer-based answer selection models can significantly improve the state of the art in Question Answering. This step is performed by aggregating the embeddings of top $k$ answer candidates to support the verification of a target answer. Although the approach is intuitive and sound, it still shows two limitations: (i) the support…

    Submitted 16 January, 2022; originally announced January 2022.

  16. arXiv:2201.05767  [pdf, other]

    cs.CL

    Ensemble Transformer for Efficient and Accurate Ranking Tasks: an Application to Question Answering Systems

    Authors: Yoshitomo Matsubara, Luca Soldaini, Eric Lind, Alessandro Moschitti

    Abstract: Large transformer models can highly improve Answer Sentence Selection (AS2) tasks, but their high computational costs prevent their use in many real-world applications. In this paper, we explore the following research question: How can we make the AS2 models more accurate without significantly increasing their model complexity? To address the question, we propose a Multiple Heads Student architect…

    Submitted 6 December, 2022; v1 submitted 15 January, 2022; originally announced January 2022.

    Comments: Accepted to EMNLP 2022 as a long paper (Findings). Model code is available at https://github.com/amazon-research/wqa-cerberus

    Journal ref: Findings of the Association for Computational Linguistics: EMNLP 2022

  17. arXiv:2110.07150  [pdf, other]

    cs.CL

    Cross-Lingual Open-Domain Question Answering with Answer Sentence Generation

    Authors: Benjamin Muller, Luca Soldaini, Rik Koncel-Kedziorski, Eric Lind, Alessandro Moschitti

    Abstract: Open-Domain Generative Question Answering has achieved impressive performance in English by combining document-level retrieval with answer generation. These approaches, which we refer to as GenQA, can generate complete sentences, effectively answering both factoid and non-factoid questions. In this paper, we extend GenQA to the multilingual and cross-lingual settings. For this purpose, we first in…

    Submitted 19 December, 2022; v1 submitted 14 October, 2021; originally announced October 2021.

    Comments: AACL 2022 Long Paper

  18. arXiv:2109.07009  [pdf, other]

    cs.CL

    Will this Question be Answered? Question Filtering via Answer Model Distillation for Efficient Question Answering

    Authors: Siddhant Garg, Alessandro Moschitti

    Abstract: In this paper we propose a novel approach towards improving the efficiency of Question Answering (QA) systems by filtering out questions that will not be answered by them. This is based on an interesting new finding: the answer confidence scores of state-of-the-art QA systems can be approximated well by models solely using the input question text. This enables preemptive filtering of questions tha…

    Submitted 14 September, 2021; originally announced September 2021.

    Comments: Accepted at EMNLP 2021 Main Conference (Long)

  19. arXiv:2107.04217  [pdf, other]

    cs.CL

    Joint Models for Answer Verification in Question Answering Systems

    Authors: Zeyu Zhang, Thuy Vu, Alessandro Moschitti

    Abstract: This paper studies joint models for selecting correct answer sentences among the top $k$ provided by answer sentence selection (AS2) modules, which are core components of retrieval-based Question Answering (QA) systems. Our work shows that a critical step to effectively exploit an answer set regards modeling the interrelated information between pairs of answers. For this purpose, we build a three-w…

    Submitted 9 July, 2021; originally announced July 2021.

    Journal ref: ACL 2021

  20. arXiv:2106.00955  [pdf, ps, other]

    cs.CL

    Answer Generation for Retrieval-based Question Answering Systems

    Authors: Chao-Chun Hsu, Eric Lind, Luca Soldaini, Alessandro Moschitti

    Abstract: Recent advancements in transformer-based models have greatly improved the ability of Question Answering (QA) systems to provide correct answers; in particular, answer sentence selection (AS2) models, core components of retrieval-based systems, have achieved impressive results. While generally effective, these models fail to provide a satisfying answer when all retrieved candidates are of poor qual…

    Submitted 2 June, 2021; originally announced June 2021.

    Comments: Short paper, Accepted at Findings of ACL 2021

  21. arXiv:2104.09694  [pdf, ps, other]

    cs.CL cs.IR cs.LG

    Efficient pre-training objectives for Transformers

    Authors: Luca Di Liello, Matteo Gabburo, Alessandro Moschitti

    Abstract: The Transformer architecture has deeply changed natural language processing, outperforming all previous state-of-the-art models. However, well-known Transformer models like BERT, RoBERTa, and GPT-2 require a huge compute budget to create a high quality contextualised representation. In this paper, we study several efficient pre-training objectives for Transformer-based models. By testing these ob…

    Submitted 19 April, 2021; originally announced April 2021.

  22. arXiv:2104.08943  [pdf, other]

    cs.CL cs.AI cs.IR

    Reference-based Weak Supervision for Answer Sentence Selection using Web Data

    Authors: Vivek Krishnamurthy, Thuy Vu, Alessandro Moschitti

    Abstract: Answer sentence selection (AS2) modeling requires annotated data, i.e., hand-labeled question-answer pairs. We present a strategy to collect weakly supervised answers for a question based on its reference to improve AS2 modeling. Specifically, we introduce Reference-based Weak Supervision (RWS), a fully automatic large-scale data pipeline that harvests high-quality weakly-supervised answers from a…

    Submitted 18 April, 2021; originally announced April 2021.

  23. arXiv:2102.10250  [pdf, ps, other]

    cs.CL

    Multilingual Answer Sentence Reranking via Automatically Translated Data

    Authors: Thuy Vu, Alessandro Moschitti

    Abstract: We present a study on the design of multilingual Answer Sentence Selection (AS2) models, which are a core component of modern Question Answering (QA) systems. The main idea is to transfer data, created from one resource rich language, e.g., English, to other languages, less rich in terms of resources. The main findings of this paper are: (i) the training data for AS2 translated into a target langu…

    Submitted 19 February, 2021; originally announced February 2021.

  24. arXiv:2102.10246  [pdf, other]

    cs.CL cs.AI cs.IR

    CDA: a Cost Efficient Content-based Multilingual Web Document Aligner

    Authors: Thuy Vu, Alessandro Moschitti

    Abstract: We introduce a Content-based Document Alignment approach (CDA), an efficient method to align multilingual web documents based on content in creating parallel training data for machine translation (MT) systems operating at the industrial level. CDA works in two steps: (i) projecting documents of a web domain to a shared multilingual space; then (ii) aligning them based on the similarity of their re…

    Submitted 19 February, 2021; originally announced February 2021.

    Journal ref: EACL 2021

  25. arXiv:2102.10243  [pdf, other]

    cs.CL

    Machine Translation Customization via Automatic Training Data Selection from the Web

    Authors: Thuy Vu, Alessandro Moschitti

    Abstract: Machine translation (MT) systems, especially when designed for an industrial setting, are trained with general parallel data derived from the Web. Thus, their style is typically driven by word/structure distribution coming from the average of many domains. In contrast, MT customers want translations to be specialized to their domain, for which they are typically able to provide text samples. We de…

    Submitted 19 February, 2021; originally announced February 2021.

    Journal ref: ECIR 2021

  26. arXiv:2101.12093  [pdf, other]

    cs.CL

    Modeling Context in Answer Sentence Selection Systems on a Latency Budget

    Authors: Rujun Han, Luca Soldaini, Alessandro Moschitti

    Abstract: Answer Sentence Selection (AS2) is an efficient approach for the design of open-domain Question Answering (QA) systems. In order to achieve low latency, traditional AS2 models score question-answer pairs individually, ignoring any information from the document each potential answer was extracted from. In contrast, more computationally expensive models designed for machine reading comprehension tas…

    Submitted 3 February, 2021; v1 submitted 28 January, 2021; originally announced January 2021.

    Comments: Accepted as a short paper at EACL 2021

  27. arXiv:2006.01285  [pdf, other]

    cs.CL cs.LG

    Context-based Transformer Models for Answer Sentence Selection

    Authors: Ivano Lauriola, Alessandro Moschitti

    Abstract: An important task for the design of Question Answering systems is the selection of the sentence containing (or constituting) the answer from documents relevant to the asked question. Most previous work has only used the target sentence to compute its score with the question as the models were not powerful enough to also effectively encode additional contextual information. In this paper, we analyz…

    Submitted 1 June, 2020; originally announced June 2020.

  28. arXiv:2005.02534  [pdf, other]

    cs.CL

    The Cascade Transformer: an Application for Efficient Answer Sentence Selection

    Authors: Luca Soldaini, Alessandro Moschitti

    Abstract: Large transformer-based language models have been shown to be very effective in many classification tasks. However, their computational complexity prevents their use in applications requiring the classification of a large set of candidates. While previous works have investigated approaches to reduce model size, relatively little attention has been paid to techniques to improve batch throughput dur…

    Submitted 7 May, 2020; v1 submitted 5 May, 2020; originally announced May 2020.

    Comments: Accepted to ACL 2020 (long)

  29. arXiv:2005.00705  [pdf, other]

    cs.CL cs.AI cs.IR cs.LG

    AVA: an Automatic eValuation Approach to Question Answering Systems

    Authors: Thuy Vu, Alessandro Moschitti

    Abstract: We introduce AVA, an automatic evaluation approach for Question Answering, which given a set of questions associated with Gold Standard answers, can estimate system Accuracy. AVA uses Transformer-based language models to encode question, answer, and reference text. This allows for effectively measuring the similarity between the reference and an automatic answer, biased towards the question semant…

    Submitted 2 May, 2020; originally announced May 2020.

    Journal ref: NAACL 2021

  30. arXiv:2003.02349  [pdf, ps, other]

    cs.CL

    A Study on Efficiency, Accuracy and Document Structure for Answer Sentence Selection

    Authors: Daniele Bonadiman, Alessandro Moschitti

    Abstract: An essential task of most Question Answering (QA) systems is to re-rank the set of answer candidates, i.e., Answer Sentence Selection (A2S). These candidates are typically sentences either extracted from one or more documents preserving their natural order or retrieved by a search engine. Most state-of-the-art approaches to the task use huge neural models, such as BERT, or complex attentive archit…

    Submitted 4 March, 2020; originally announced March 2020.

  31. arXiv:1912.01972  [pdf, other]

    cs.CL cs.IR

    SemEval-2016 Task 3: Community Question Answering

    Authors: Preslav Nakov, Lluís Màrquez, Alessandro Moschitti, Walid Magdy, Hamdy Mubarak, Abed Alhakim Freihat, James Glass, Bilal Randeree

    Abstract: This paper describes the SemEval-2016 Task 3 on Community Question Answering, which we offered in English and Arabic. For English, we had three subtasks: Question-Comment Similarity (subtask A), Question-Question Similarity (B), and Question-External Comment Similarity (C). For Arabic, we had another subtask: Rerank the correct answers for a new question (D). Eighteen teams participated in the…

    Submitted 3 December, 2019; originally announced December 2019.

    Comments: community question answering, question-question similarity, question-comment similarity, answer reranking, English, Arabic. arXiv admin note: substantial text overlap with arXiv:1912.00730

    MSC Class: 68T50 ACM Class: I.2.7

    Journal ref: SemEval-2016

  32. arXiv:1912.00730  [pdf, other]

    cs.CL cs.AI cs.IR cs.LG

    SemEval-2017 Task 3: Community Question Answering

    Authors: Preslav Nakov, Doris Hoogeveen, Lluís Màrquez, Alessandro Moschitti, Hamdy Mubarak, Timothy Baldwin, Karin Verspoor

    Abstract: We describe SemEval-2017 Task 3 on Community Question Answering. This year, we reran the four subtasks from SemEval-2016: (A) Question-Comment Similarity, (B) Question-Question Similarity, (C) Question-External Comment Similarity, and (D) Rerank the correct answers for a new question in Arabic, providing all the data from 2015 and 2016 for training, and fresh data for testing. Additionally, we added…

    Submitted 2 December, 2019; originally announced December 2019.

    Comments: community question answering, question-question similarity, question-comment similarity, answer reranking, Multi-domain Question Duplicate Detection, StackExchange, English, Arabic

    MSC Class: 68T50 ACM Class: I.2.7

    Journal ref: SemEval-2017

  33. arXiv:1911.11403  [pdf, other]

    cs.CL cs.AI cs.IR

    SemEval-2015 Task 3: Answer Selection in Community Question Answering

    Authors: Preslav Nakov, Lluís Màrquez, Walid Magdy, Alessandro Moschitti, James Glass, Bilal Randeree

    Abstract: Community Question Answering (cQA) provides new interesting research directions to the traditional Question Answering (QA) field, e.g., the exploitation of the interaction between users and the structure of related posts. In this context, we organized SemEval-2015 Task 3 on "Answer Selection in cQA", which included two subtasks: (a) classifying answers as "good", "bad", or "potentially relevant" w…

    Submitted 26 November, 2019; originally announced November 2019.

    Comments: community question answering, answer selection, English, Arabic

    MSC Class: 68T50 ACM Class: I.2.7

    Journal ref: SemEval-2015

  34. arXiv:1911.08755  [pdf, ps, other]

    cs.CL cs.AI cs.IR cs.LO

    Global Thread-Level Inference for Comment Classification in Community Question Answering

    Authors: Shafiq Joty, Alberto Barrón-Cedeño, Giovanni Da San Martino, Simone Filice, Lluís Màrquez, Alessandro Moschitti, Preslav Nakov

    Abstract: Community question answering, a recent evolution of question answering in the Web context, allows a user to quickly consult the opinion of a number of people on a particular topic, thus taking advantage of the wisdom of the crowd. Here we try to help the user by deciding automatically which answers are good and which are bad for a given question. In particular, we focus on exploiting the output st…

    Submitted 20 November, 2019; originally announced November 2019.

    Comments: community question answering, thread-level inference, graph-cut, inductive logic programming

    MSC Class: 68T50 ACM Class: I.2.7

    Journal ref: EMNLP-2015

  35. arXiv:1911.04118  [pdf, other]

    cs.CL

    TANDA: Transfer and Adapt Pre-Trained Transformer Models for Answer Sentence Selection

    Authors: Siddhant Garg, Thuy Vu, Alessandro Moschitti

    Abstract: We propose TANDA, an effective technique for fine-tuning pre-trained Transformer models for natural language tasks. Specifically, we first transfer a pre-trained model into a model for a general task by fine-tuning it with a large and high-quality dataset. We then perform a second fine-tuning step to adapt the transferred model to the target domain. We demonstrate the benefits of our approach for…

    Submitted 20 November, 2019; v1 submitted 11 November, 2019; originally announced November 2019.

    Comments: Thirty-Fourth AAAI Conference on Artificial Intelligence (AAAI 2020), Oral Presentation

  36. arXiv:1902.05309  [pdf, other]

    cs.CL

    Transfer Learning for Sequence Labeling Using Source Model and Target Data

    Authors: Lingzhen Chen, Alessandro Moschitti

    Abstract: In this paper, we propose an approach for transferring the knowledge of a neural model for sequence labeling, learned from the source domain, to a new model trained on a target domain, where new label categories appear. Our transfer learning (TL) techniques make it possible to adapt the source model using the target data and new categories, without accessing the source data. Our solution consists in addi…

    Submitted 14 February, 2019; originally announced February 2019.

    Comments: 9 pages, 4 figures, 3 tables, accepted paper in the Thirty-Third AAAI Conference on Artificial Intelligence (AAAI-19)

  37. arXiv:1809.02255  [pdf, other]

    cs.CL

    Adversarial Domain Adaptation for Duplicate Question Detection

    Authors: Darsh J Shah, Tao Lei, Alessandro Moschitti, Salvatore Romeo, Preslav Nakov

    Abstract: We address the problem of detecting duplicate questions in forums, which is an important step towards automating the process of answering new questions. As finding and annotating such potential duplicates manually is very tedious and costly, automatic methods based on machine learning are a viable alternative. However, many forums do not have annotated data, i.e., questions labeled by experts as d…

    Submitted 6 September, 2018; originally announced September 2018.

    Comments: EMNLP 2018 short paper - camera ready. 8 pages

  38. arXiv:1806.08009  [pdf, other]

    cs.CL

    Injecting Relational Structural Representation in Neural Networks for Question Similarity

    Authors: Antonio Uva, Daniele Bonadiman, Alessandro Moschitti

    Abstract: Effectively using full syntactic parsing information in Neural Networks (NNs) to solve relational tasks, e.g., question similarity, is still an open problem. In this paper, we propose to inject structural representations in NNs by (i) learning an SVM model using Tree Kernels (TKs) on relatively few pairs of questions (few thousands) as gold standard (GS) training data is typically scarce, (ii) pre…

    Submitted 20 June, 2018; originally announced June 2018.

    Comments: ACL2018

  39. arXiv:1804.08012  [pdf, ps, other]

    cs.CL

    Integrating Stance Detection and Fact Checking in a Unified Corpus

    Authors: Ramy Baly, Mitra Mohtarami, James Glass, Lluis Marquez, Alessandro Moschitti, Preslav Nakov

    Abstract: A reasonable approach for fact checking a claim involves retrieving potentially relevant documents from different sources (e.g., news websites, social media, etc.), determining the stance of each document with respect to the claim, and finally making a prediction about the claim's factuality by aggregating the strength of the stances, while taking the reliability of the source into account. Moreov…

    Submitted 21 April, 2018; originally announced April 2018.

    Comments: Stance Detection, Fact-Checking, Veracity, Arabic, NAACL-2018

    MSC Class: 68T50 ACM Class: I.2.7

  40. arXiv:1804.07581  [pdf, other]

    cs.CL

    Automatic Stance Detection Using End-to-End Memory Networks

    Authors: Mitra Mohtarami, Ramy Baly, James Glass, Preslav Nakov, Lluis Marquez, Alessandro Moschitti

    Abstract: We present a novel end-to-end memory network for stance detection, which jointly (i) predicts whether a document agrees, disagrees, discusses or is unrelated with respect to a given target claim, and also (ii) extracts snippets of evidence for that prediction. The network operates at the paragraph level and integrates convolutional and recurrent neural networks, as well as a similarity matrix as p…

    Submitted 20 April, 2018; originally announced April 2018.

    Comments: NAACL-2018; Stance detection; Fact-Checking; Veracity; Memory networks; Neural Networks; Distributed Representations

    MSC Class: 68T50 ACM Class: I.2.7

  41. arXiv:1710.01487  [pdf, other]

    cs.CL

    Cross-Language Question Re-Ranking

    Authors: Giovanni Da San Martino, Salvatore Romeo, Alberto Barron-Cedeno, Shafiq Joty, Lluis Marquez, Alessandro Moschitti, Preslav Nakov

    Abstract: We study how to find relevant questions in community forums when the language of the new questions is different from that of the existing questions in the forum. In particular, we explore the Arabic-English language pair. We compare a kernel-based system with a feed-forward neural network in a scenario where a large parallel corpus is available for training a machine translation system, bilingual…

    Submitted 4 October, 2017; originally announced October 2017.

    Comments: SIGIR-2017; Community Question Answering; Cross-language Approaches; Question Retrieval; Kernel-based Methods; Neural Networks; Distributed Representations

    MSC Class: 68T50 ACM Class: I.2.7

    Journal ref: SIGIR 2017: 1145-1148

  42. arXiv:1710.00689  [pdf, ps, other]

    cs.CL

    Building Chatbots from Forum Data: Model Selection Using Question Answering Metrics

    Authors: Martin Boyanov, Ivan Koychev, Preslav Nakov, Alessandro Moschitti, Giovanni Da San Martino

    Abstract: We propose to use question answering (QA) data from Web forums to train chatbots from scratch, i.e., without dialog training data. First, we extract pairs of question and answer sentences from the typically much longer texts of questions and answers in a forum. We then use these shorter texts to train seq2seq models in a more efficient way. We further improve the parameter optimization using a new…

    Submitted 2 October, 2017; originally announced October 2017.

    Comments: RANLP-2017

    MSC Class: 68T50 ACM Class: I.2.7

  43. arXiv:1702.03706  [pdf, other]

    cs.CL

    Multitask Learning with Deep Neural Networks for Community Question Answering

    Authors: Daniele Bonadiman, Antonio Uva, Alessandro Moschitti

    Abstract: In this paper, we developed a deep neural network (DNN) that learns to solve simultaneously the three tasks of the cQA challenge proposed by the SemEval-2016 Task 3, i.e., question-comment similarity, question-question similarity and new question-comment similarity. The latter is the main task, which can exploit the previous two for achieving better results. Our DNN is trained jointly on all the t…

    Submitted 13 February, 2017; originally announced February 2017.

  44. arXiv:1610.05522  [pdf, other]

    cs.CL

    Addressing Community Question Answering in English and Arabic

    Authors: Giovanni Da San Martino, Alberto Barrón-Cedeño, Salvatore Romeo, Alessandro Moschitti, Shafiq Joty, Fahad A. Al Obaidli, Kateryna Tymoshenko, Antonio Uva

    Abstract: This paper studies the impact of different types of features applied to learning to re-rank questions in community Question Answering. We tested our models on two datasets released in SemEval-2016 Task 3 on "Community Question Answering". Task 3 targeted real-life Web fora both in English and Arabic. Our models include bag-of-words features (BoW), syntactic tree kernels (TKs), rank features, embed…

    Submitted 18 October, 2016; originally announced October 2016.

    Comments: presented at Second WebQA workshop, SIGIR2016 (http://plg2.cs.uwaterloo.ca/~avtyurin/WebQA2016/)

    ACM Class: I.2.7; H.3.4

  45. arXiv:1604.01178  [pdf, other]

    cs.CL

    Modeling Relational Information in Question-Answer Pairs with Convolutional Neural Networks

    Authors: Aliaksei Severyn, Alessandro Moschitti

    Abstract: In this paper, we propose convolutional neural networks for learning an optimal representation of question and answer sentences. Their main aspect is the use of relational information given by the matches between words from the two members of the pair. The matches are encoded as embeddings with additional parameters (dimensions), which are tuned by the network. This allows for better capturing in…

    Submitted 5 April, 2016; originally announced April 2016.

  46. arXiv:1512.05726  [pdf, other]

    cs.CL cs.NE

    Semi-supervised Question Retrieval with Gated Convolutions

    Authors: Tao Lei, Hrishikesh Joshi, Regina Barzilay, Tommi Jaakkola, Katerina Tymoshenko, Alessandro Moschitti, Lluis Marquez

    Abstract: Question answering forums are rapidly growing in size with no effective automated ability to refer to and reuse answers already available for previously posted questions. In this paper, we develop a methodology for finding semantically related questions. The task is difficult since 1) key pieces of information are often buried in extraneous details in the question body and 2) available annotations o…

    Submitted 3 April, 2016; v1 submitted 17 December, 2015; originally announced December 2015.

    Comments: NAACL 2016