Showing 1–50 of 58 results for author: Florian, R

  1. arXiv:2406.11706  [pdf, other]

    cs.IR cs.CL cs.LG

    Prompts as Auto-Optimized Training Hyperparameters: Training Best-in-Class IR Models from Scratch with 10 Gold Labels

    Authors: Jasper Xian, Saron Samuel, Faraz Khoubsirat, Ronak Pradeep, Md Arafat Sultan, Radu Florian, Salim Roukos, Avirup Sil, Christopher Potts, Omar Khattab

    Abstract: We develop a method for training small-scale (under 100M parameter) neural information retrieval models with as few as 10 gold relevance labels. The method depends on generating synthetic queries for documents using a language model (LM), and the key step is that we automatically optimize the LM prompt that is used to generate these queries based on training quality. In experiments with the BIRCO…

    Submitted 17 June, 2024; originally announced June 2024.

  2. arXiv:2404.17342  [pdf, other]

    cs.CL cs.AI

    Can a Multichoice Dataset be Repurposed for Extractive Question Answering?

    Authors: Teresa Lynn, Malik H. Altakrori, Samar Mohamed Magdy, Rocktim Jyoti Das, Chenyang Lyu, Mohamed Nasr, Younes Samih, Alham Fikri Aji, Preslav Nakov, Shantanu Godbole, Salim Roukos, Radu Florian, Nizar Habash

    Abstract: The rapid evolution of Natural Language Processing (NLP) has favored major languages such as English, leaving a significant gap for many others due to limited resources. This is especially evident in the context of data annotation, a task whose importance cannot be underestimated, but which is time-consuming and costly. Thus, any dataset for resource-poor languages is precious, in particular when…

    Submitted 26 April, 2024; originally announced April 2024.

    Comments: Paper 8 pages, Appendix 12 pages. Submitted to ARR

  3. arXiv:2404.02103  [pdf, other]

    cs.CL

    CLAPNQ: Cohesive Long-form Answers from Passages in Natural Questions for RAG systems

    Authors: Sara Rosenthal, Avirup Sil, Radu Florian, Salim Roukos

    Abstract: Retrieval Augmented Generation (RAG) has become a popular application for large language models. It is preferable that successful RAG systems provide accurate answers that are supported by being grounded in a passage without any hallucinations. While considerable work is required for building a full RAG pipeline, being able to benchmark performance is also necessary. We present ClapNQ, a benchmark…

    Submitted 2 April, 2024; originally announced April 2024.

    Comments: 25 pages

  4. arXiv:2403.00827  [pdf, other]

    cs.CL cs.AI cs.LG

    Self-Refinement of Language Models from External Proxy Metrics Feedback

    Authors: Keshav Ramji, Young-Suk Lee, Ramón Fernandez Astudillo, Md Arafat Sultan, Tahira Naseem, Asim Munawar, Radu Florian, Salim Roukos

    Abstract: It is often desirable for Large Language Models (LLMs) to capture multiple objectives when providing a response. In document-grounded response generation, for example, agent responses are expected to be relevant to a user's query while also being grounded in a given document. In this paper, we introduce Proxy Metric-based Self-Refinement (ProMiSe), which enables an LLM to refine its own initial re…

    Submitted 27 February, 2024; originally announced March 2024.

  5. arXiv:2310.13961  [pdf, other]

    cs.CL cs.AI

    Ensemble-Instruct: Generating Instruction-Tuning Data with a Heterogeneous Mixture of LMs

    Authors: Young-Suk Lee, Md Arafat Sultan, Yousef El-Kurdi, Tahira Naseem, Asim Munawar, Radu Florian, Salim Roukos, Ramón Fernandez Astudillo

    Abstract: Using in-context learning (ICL) for data generation, techniques such as Self-Instruct (Wang et al., 2023) or the follow-up Alpaca (Taori et al., 2023) can train strong conversational agents with only a small amount of human supervision. One limitation of these approaches is that they resort to very large language models (around 175B parameters) that are also proprietary and non-public. Here we exp…

    Submitted 21 October, 2023; originally announced October 2023.

    Journal ref: EMNLP 2023

  6. arXiv:2305.17273  [pdf, other]

    cs.CL cs.AI

    Slide, Constrain, Parse, Repeat: Synchronous SlidingWindows for Document AMR Parsing

    Authors: Sadhana Kumaravel, Tahira Naseem, Ramon Fernandez Astudillo, Radu Florian, Salim Roukos

    Abstract: The sliding window approach provides an elegant way to handle contexts of sizes larger than the Transformer's input window, for tasks like language modeling. Here we extend this approach to the sequence-to-sequence task of document parsing. For this, we exploit recent progress in transition-based parsing to implement a parser with synchronous sliding windows over source and target. We develop an o…

    Submitted 26 May, 2023; originally announced May 2023.

  7. arXiv:2304.12272  [pdf, other]

    cs.CL cs.AI

    AMR Parsing with Instruction Fine-tuned Pre-trained Language Models

    Authors: Young-Suk Lee, Ramón Fernandez Astudillo, Radu Florian, Tahira Naseem, Salim Roukos

    Abstract: Instruction fine-tuned language models on a collection of instruction annotated datasets (FLAN) have shown highly effective to improve model performance and generalization to unseen tasks. However, a majority of standard parsing tasks including abstract meaning representation (AMR), universal dependency (UD), semantic role labeling (SRL) has been excluded from the FLAN collections for both model t…

    Submitted 24 April, 2023; originally announced April 2023.

  8. arXiv:2303.00807  [pdf, other]

    cs.IR cs.CL

    UDAPDR: Unsupervised Domain Adaptation via LLM Prompting and Distillation of Rerankers

    Authors: Jon Saad-Falcon, Omar Khattab, Keshav Santhanam, Radu Florian, Martin Franz, Salim Roukos, Avirup Sil, Md Arafat Sultan, Christopher Potts

    Abstract: Many information retrieval tasks require large labeled datasets for fine-tuning. However, such datasets are often unavailable, and their utility for real-world applications can diminish quickly due to domain shifts. To address this challenge, we develop and motivate a method for using large language models (LLMs) to generate large numbers of synthetic queries cheaply. The method begins by generati…

    Submitted 13 October, 2023; v1 submitted 1 March, 2023; originally announced March 2023.

    Comments: Long Paper at Empirical Methods in Natural Language Processing (EMNLP) 2023

  9. arXiv:2301.09715  [pdf, other]

    cs.CL cs.IR cs.LG

    PrimeQA: The Prime Repository for State-of-the-Art Multilingual Question Answering Research and Development

    Authors: Avirup Sil, Jaydeep Sen, Bhavani Iyer, Martin Franz, Kshitij Fadnis, Mihaela Bornea, Sara Rosenthal, Scott McCarley, Rong Zhang, Vishwajeet Kumar, Yulong Li, Md Arafat Sultan, Riyaz Bhat, Radu Florian, Salim Roukos

    Abstract: The field of Question Answering (QA) has made remarkable progress in recent years, thanks to the advent of large pre-trained language models, newer realistic benchmark datasets with leaderboards, and novel algorithms for key components such as retrievers and readers. In this paper, we introduce PRIMEQA: a one-stop and open-source QA repository with an aim to democratize QA research and facilitate…

    Submitted 25 January, 2023; v1 submitted 23 January, 2023; originally announced January 2023.

  10. arXiv:2212.01340  [pdf, other]

    cs.IR cs.CL

    Moving Beyond Downstream Task Accuracy for Information Retrieval Benchmarking

    Authors: Keshav Santhanam, Jon Saad-Falcon, Martin Franz, Omar Khattab, Avirup Sil, Radu Florian, Md Arafat Sultan, Salim Roukos, Matei Zaharia, Christopher Potts

    Abstract: Neural information retrieval (IR) systems have progressed rapidly in recent years, in large part due to the release of publicly available benchmarking tasks. Unfortunately, some dimensions of this progress are illusory: the majority of the popular IR benchmarks today focus exclusively on downstream task accuracy and thus conceal the costs incurred by systems that trade away efficiency for quality.…

    Submitted 2 December, 2022; originally announced December 2022.

  11. arXiv:2206.08441  [pdf, other]

    cs.CL

    GAAMA 2.0: An Integrated System that Answers Boolean and Extractive Questions

    Authors: Scott McCarley, Mihaela Bornea, Sara Rosenthal, Anthony Ferritto, Md Arafat Sultan, Avirup Sil, Radu Florian

    Abstract: Recent machine reading comprehension datasets include extractive and boolean questions but current approaches do not offer integrated support for answering both question types. We present a multilingual machine reading comprehension system and front-end demo that handles boolean questions by providing both a YES/NO answer and highlighting supporting evidence, and handles extractive questions by hi…

    Submitted 21 June, 2022; v1 submitted 16 June, 2022; originally announced June 2022.

  12. arXiv:2205.07257  [pdf, other]

    cs.CL cs.LG

    Not to Overfit or Underfit the Source Domains? An Empirical Study of Domain Generalization in Question Answering

    Authors: Md Arafat Sultan, Avirup Sil, Radu Florian

    Abstract: Machine learning models are prone to overfitting their training (source) domains, which is commonly believed to be the reason why they falter in novel target domains. Here we examine the contrasting view that multi-source domain generalization (DG) is first and foremost a problem of mitigating source domain underfitting: models not adequately learning the signal already present in their multi-doma…

    Submitted 24 October, 2022; v1 submitted 15 May, 2022; originally announced May 2022.

    Comments: Accepted at EMNLP 2022

  13. arXiv:2205.01464  [pdf, other]

    cs.CL

    Inducing and Using Alignments for Transition-based AMR Parsing

    Authors: Andrew Drozdov, Jiawei Zhou, Radu Florian, Andrew McCallum, Tahira Naseem, Yoon Kim, Ramon Fernandez Astudillo

    Abstract: Transition-based parsers for Abstract Meaning Representation (AMR) rely on node-to-word alignments. These alignments are learned separately from parser training and require a complex pipeline of rule-based components, pre-processing, and post-processing to satisfy domain-specific constraints. Parsers also train on a point-estimate of the alignment pipeline, neglecting the uncertainty due to the in…

    Submitted 3 May, 2022; originally announced May 2022.

    Comments: Accepted at NAACL 2022

  14. arXiv:2204.09248  [pdf, ps, other]

    cs.CL cs.IR

    Synthetic Target Domain Supervision for Open Retrieval QA

    Authors: Revanth Gangi Reddy, Bhavani Iyer, Md Arafat Sultan, Rong Zhang, Avirup Sil, Vittorio Castelli, Radu Florian, Salim Roukos

    Abstract: Neural passage retrieval is a new and promising approach in open retrieval question answering. In this work, we stress-test the Dense Passage Retriever (DPR) -- a state-of-the-art (SOTA) open domain neural retrieval model -- on closed and specialized target domains such as COVID-19, and find that it lags behind standard BM25 in this important real-world setting. To make DPR more robust under domai…

    Submitted 20 April, 2022; originally announced April 2022.

    Comments: Published at SIGIR 2021

  15. arXiv:2202.13229  [pdf, other]

    cs.CL cs.AI

    A Generative Model for Relation Extraction and Classification

    Authors: Jian Ni, Gaetano Rossiello, Alfio Gliozzo, Radu Florian

    Abstract: Relation extraction (RE) is an important information extraction task which provides essential information to many NLP applications such as knowledge base population and question answering. In this paper, we present a novel generative model for relation extraction and classification (which we call GREC), where RE is modeled as a sequence-to-sequence generation task. We explore various encoding repr…

    Submitted 26 February, 2022; originally announced February 2022.

  16. arXiv:2112.08513  [pdf, other]

    cs.CL

    DocAMR: Multi-Sentence AMR Representation and Evaluation

    Authors: Tahira Naseem, Austin Blodgett, Sadhana Kumaravel, Tim O'Gorman, Young-Suk Lee, Jeffrey Flanigan, Ramón Fernandez Astudillo, Radu Florian, Salim Roukos, Nathan Schneider

    Abstract: Despite extensive research on parsing of English sentences into Abstract Meaning Representation (AMR) graphs, which are compared to gold graphs via the Smatch metric, full-document parsing into a unified graph representation lacks well-defined representation and evaluation. Taking advantage of a super-sentential level of coreference annotation from previous work, we introduce a simple algorithm…

    Submitted 6 May, 2022; v1 submitted 15 December, 2021; originally announced December 2021.

    MSC Class: I.2.7

  17. arXiv:2112.07877  [pdf, other]

    cs.CL

    Learning to Transpile AMR into SPARQL

    Authors: Mihaela Bornea, Ramon Fernandez Astudillo, Tahira Naseem, Nandana Mihindukulasooriya, Ibrahim Abdelaziz, Pavan Kapanipathi, Radu Florian, Salim Roukos

    Abstract: We propose a transition-based system to transpile Abstract Meaning Representation (AMR) into SPARQL for Knowledge Base Question Answering (KBQA). This allows us to delegate part of the semantic representation to a strongly pre-trained semantic parser, while learning transpiling with small amount of paired data. We depart from recent work relating AMR and SPARQL constructs, but rather than applying…

    Submitted 8 December, 2022; v1 submitted 14 December, 2021; originally announced December 2021.

  18. arXiv:2112.07790  [pdf, ps, other]

    cs.CL cs.AI

    Maximum Bayes Smatch Ensemble Distillation for AMR Parsing

    Authors: Young-Suk Lee, Ramon Fernandez Astudillo, Thanh Lam Hoang, Tahira Naseem, Radu Florian, Salim Roukos

    Abstract: AMR parsing has experienced an unprecedented increase in performance in the last three years, due to a mixture of effects including architecture improvements and transfer learning. Self-learning techniques have also played a role in pushing performance forward. However, for most recent high performant parsers, the effect of self-learning and silver data augmentation seems to be fading. In this pa…

    Submitted 2 May, 2022; v1 submitted 14 December, 2021; originally announced December 2021.

    Journal ref: NAACL-HLT 2022

  19. arXiv:2112.07772  [pdf, other]

    cs.CL

    Do Answers to Boolean Questions Need Explanations? Yes

    Authors: Sara Rosenthal, Mihaela Bornea, Avirup Sil, Radu Florian, Scott McCarley

    Abstract: Existing datasets that contain boolean questions, such as BoolQ and TYDI QA, provide the user with a YES/NO response to the question. However, a one word response is not sufficient for an explainable system. We promote explainability by releasing a new set of annotations marking the evidence in existing TyDi QA and BoolQ datasets. We show that our annotations can be used to train a model that ext…

    Submitted 14 December, 2021; originally announced December 2021.

    Comments: 9 pages

  20. arXiv:2110.15534  [pdf, other]

    cs.CL

    Structure-aware Fine-tuning of Sequence-to-sequence Transformers for Transition-based AMR Parsing

    Authors: Jiawei Zhou, Tahira Naseem, Ramón Fernandez Astudillo, Young-Suk Lee, Radu Florian, Salim Roukos

    Abstract: Predicting linearized Abstract Meaning Representation (AMR) graphs using pre-trained sequence-to-sequence Transformer models has recently led to large improvements on AMR parsing benchmarks. These parsers are simple and avoid explicit modeling of structure but lack desirable properties such as graph well-formedness guarantees or built-in graph-sentence alignments. In this work we explore the integ…

    Submitted 29 October, 2021; originally announced October 2021.

    Comments: EMNLP 2021 main conference

  21. arXiv:2105.03229  [pdf, other]

    cs.CL

    VAULT: VAriable Unified Long Text Representation for Machine Reading Comprehension

    Authors: Haoyang Wen, Anthony Ferritto, Heng Ji, Radu Florian, Avirup Sil

    Abstract: Existing models on Machine Reading Comprehension (MRC) require complex model architecture for effectively modeling long texts with paragraph representation and classification, thereby making inference computationally inefficient for production use. In this work, we propose VAULT: a light-weight and parallel-efficient paragraph representation for MRC based on contextualized representation from long…

    Submitted 2 June, 2021; v1 submitted 7 May, 2021; originally announced May 2021.

    Comments: Accepted at ACL 2021

  22. arXiv:2104.14674  [pdf, other]

    cs.CL

    AMR Parsing with Action-Pointer Transformer

    Authors: Jiawei Zhou, Tahira Naseem, Ramón Fernandez Astudillo, Radu Florian

    Abstract: Abstract Meaning Representation parsing is a sentence-to-graph prediction task where target nodes are not explicitly aligned to sentence tokens. However, since graph nodes are semantically based on one or more sentence tokens, implicit alignments can be derived. Transition-based parsers operate over the sentence from left to right, capturing this inductive bias via alignments at the cost of limite…

    Submitted 18 May, 2021; v1 submitted 29 April, 2021; originally announced April 2021.

    Comments: Accepted at NAACL 2021

  23. arXiv:2102.02189  [pdf, other]

    cs.CL cs.AI

    Bootstrapping Multilingual AMR with Contextual Word Alignments

    Authors: Janaki Sheth, Young-Suk Lee, Ramon Fernandez Astudillo, Tahira Naseem, Radu Florian, Salim Roukos, Todd Ward

    Abstract: We develop high performance multilingual Abstract Meaning Representation (AMR) systems by projecting English AMR annotations to other languages with weak supervision. We achieve this goal by bootstrapping transformer-based multilingual word embeddings, in particular those from cross-lingual RoBERTa (XLM-R large). We develop a novel technique for foreign-text-to-English AMR alignment, using the contex…

    Submitted 3 February, 2021; originally announced February 2021.

    Journal ref: EACL 2021

  24. arXiv:2012.05958  [pdf, ps, other]

    cs.CL

    Multilingual Transfer Learning for QA Using Translation as Data Augmentation

    Authors: Mihaela Bornea, Lin Pan, Sara Rosenthal, Radu Florian, Avirup Sil

    Abstract: Prior work on multilingual question answering has mostly focused on using large multilingual pre-trained language models (LM) to perform zero-shot language-wise learning: train a QA model on English and test on other languages. In this work, we explore strategies that improve cross-lingual transfer by bringing the multilingual embeddings closer in the semantic space. Our first strategy augments th…

    Submitted 10 December, 2020; originally announced December 2020.

    Journal ref: AAAI 2021

  25. arXiv:2012.03925  [pdf, other]

    cs.LG

    SuperCoder: Program Learning Under Noisy Conditions From Superposition of States

    Authors: Ali Davody, Mahmoud Safari, Răzvan V. Florian

    Abstract: We propose a new method of program learning in a Domain Specific Language (DSL) which is based on gradient descent with no direct search. The first component of our method is a probabilistic representation of the DSL variables. At each timestep in the program sequence, different DSL functions are applied on the DSL variables with a certain probability, leading to different possible outcomes. Rathe…

    Submitted 7 December, 2020; originally announced December 2020.

    Comments: 11 pages, 6 figures

  26. arXiv:2012.01414  [pdf, other]

    cs.CL cs.AI cs.IR

    End-to-End QA on COVID-19: Domain Adaptation with Synthetic Training

    Authors: Revanth Gangi Reddy, Bhavani Iyer, Md Arafat Sultan, Rong Zhang, Avi Sil, Vittorio Castelli, Radu Florian, Salim Roukos

    Abstract: End-to-end question answering (QA) requires both information retrieval (IR) over a large document collection and machine reading comprehension (MRC) on the retrieved passages. Recent work has successfully trained neural IR systems using only supervised question answering (QA) examples from open-domain datasets. However, despite impressive performance on Wikipedia, neural IR lags behind traditional…

    Submitted 2 December, 2020; originally announced December 2020.

    Comments: Preprint

  27. arXiv:2010.10673  [pdf, other]

    cs.CL

    Pushing the Limits of AMR Parsing with Self-Learning

    Authors: Young-Suk Lee, Ramon Fernandez Astudillo, Tahira Naseem, Revanth Gangi Reddy, Radu Florian, Salim Roukos

    Abstract: Abstract Meaning Representation (AMR) parsing has experienced a notable growth in performance in the last two years, due both to the impact of transfer learning and the development of novel architectures specific to AMR. At the same time, self-learning techniques have helped push the performance boundaries of other natural language processing applications, such as machine translation or question a…

    Submitted 20 October, 2020; originally announced October 2020.

    Comments: Accepted to Findings of EMNLP2020, open review https://openreview.net/forum?id=4q5-oJgLiO, code https://github.com/IBM/transition-amr-parser

  28. arXiv:2010.10669  [pdf, other]

    cs.CL

    Transition-based Parsing with Stack-Transformers

    Authors: Ramon Fernandez Astudillo, Miguel Ballesteros, Tahira Naseem, Austin Blodgett, Radu Florian

    Abstract: Modeling the parser state is key to good performance in transition-based parsing. Recurrent Neural Networks considerably improved the performance of transition-based systems by modelling the global state, e.g. stack-LSTM parsers, or local state modeling of contextualized features, e.g. Bi-LSTM parsers. Given the success of Transformer architectures in recent parsing systems, this work explores mod…

    Submitted 20 October, 2020; originally announced October 2020.

    Comments: Accepted to Findings of EMNLP2020, open review https://openreview.net/forum?id=b36spsuUAde, code https://github.com/IBM/transition-amr-parser

  29. arXiv:2010.08652  [pdf, other]

    cs.CL cs.AI cs.IR cs.LG

    Cross-Lingual Relation Extraction with Transformers

    Authors: Jian Ni, Taesun Moon, Parul Awasthy, Radu Florian

    Abstract: Relation extraction (RE) is one of the most important tasks in information extraction, as it provides essential information for many NLP applications. In this paper, we propose a cross-lingual RE approach that does not require any human annotation in a target language or any cross-lingual resources. Building upon unsupervised cross-lingual representation learning frameworks, we develop several dee…

    Submitted 16 October, 2020; originally announced October 2020.

    Comments: 11 pages

  30. arXiv:2010.05904  [pdf, other]

    cs.CL

    Multi-Stage Pre-training for Low-Resource Domain Adaptation

    Authors: Rong Zhang, Revanth Gangi Reddy, Md Arafat Sultan, Vittorio Castelli, Anthony Ferritto, Radu Florian, Efsun Sarioglu Kayi, Salim Roukos, Avirup Sil, Todd Ward

    Abstract: Transfer learning techniques are particularly useful in NLP tasks where a sizable amount of high-quality annotated data is difficult to obtain. Current approaches directly adapt a pre-trained language model (LM) on in-domain text before fine-tuning to downstream tasks. We show that extending the vocabulary of the LM with domain-specific terms leads to further gains. To a bigger effect, we utilize…

    Submitted 12 October, 2020; originally announced October 2020.

    Comments: Accepted at EMNLP 2020

  31. arXiv:2009.07317  [pdf, other]

    cs.CL

    Cascaded Models for Better Fine-Grained Named Entity Recognition

    Authors: Parul Awasthy, Taesun Moon, Jian Ni, Radu Florian

    Abstract: Named Entity Recognition (NER) is an essential precursor task for many natural language applications, such as relation extraction or event extraction. Much of the NER research has been done on datasets with few classes of entity types (e.g. PER, LOC, ORG, MISC), but many real world applications (disaster relief, complex event extraction, law enforcement) can benefit from a larger NER typeset. More…

    Submitted 15 September, 2020; originally announced September 2020.

  32. arXiv:2009.07188  [pdf, other]

    cs.CL

    Event Presence Prediction Helps Trigger Detection Across Languages

    Authors: Parul Awasthy, Tahira Naseem, Jian Ni, Taesun Moon, Radu Florian

    Abstract: The task of event detection and classification is central to most information retrieval applications. We show that a Transformer based architecture can effectively model event extraction as a sequence labeling task. We propose a combination of sentence level and token level training objectives that significantly boosts the performance of a BERT based event extraction model. Our approach achieves a…

    Submitted 15 September, 2020; originally announced September 2020.

  33. arXiv:2006.16627  [pdf, other]

    cs.LG cs.NE stat.ML

    Training highly effective connectivities within neural networks with randomly initialized, fixed weights

    Authors: Cristian Ivan, Razvan Florian

    Abstract: We present some novel, straightforward methods for training the connection graph of a randomly initialized neural network without training the weights. These methods do not use hyperparameters defining cutoff thresholds and therefore remove the need for iteratively searching optimal values of such hyperparameters. We can achieve similar or higher performances than in the case of training all weigh…

    Submitted 17 November, 2020; v1 submitted 30 June, 2020; originally announced June 2020.

    Comments: 14 pages, 13 figures

  34. arXiv:2005.09123  [pdf, ps, other]

    cs.CL cs.LG

    GPT-too: A language-model-first approach for AMR-to-text generation

    Authors: Manuel Mager, Ramon Fernandez Astudillo, Tahira Naseem, Md Arafat Sultan, Young-Suk Lee, Radu Florian, Salim Roukos

    Abstract: Abstract Meaning Representations (AMRs) are broad-coverage sentence-level semantic graphs. Existing approaches to generating text from AMR have focused on training sequence-to-sequence or graph-to-sequence models on AMR annotated data only. In this paper, we propose an alternative approach that combines a strong pre-trained language model with cycle consistency-based re-scoring. Despite the simplicity of t…

    Submitted 27 May, 2020; v1 submitted 18 May, 2020; originally announced May 2020.

    Comments: Paper accepted to the Annual Meeting of the Association for Computational Linguistics (ACL 2020)

  35. arXiv:1912.01389  [pdf, other]

    cs.CL cs.LG stat.ML

    Towards Lingua Franca Named Entity Recognition with BERT

    Authors: Taesun Moon, Parul Awasthy, Jian Ni, Radu Florian

    Abstract: Information extraction is an important task in NLP, enabling the automatic extraction of data for relational database filling. Historically, research and data was produced for English text, followed in subsequent years by datasets in Arabic, Chinese (ACE/OntoNotes), Dutch, Spanish, German (CoNLL evaluations), and many others. The natural tendency has been to treat each language as a different data…

    Submitted 12 December, 2019; v1 submitted 19 November, 2019; originally announced December 2019.

  36. arXiv:1911.02984  [pdf, other]

    cs.CL cs.IR

    The TechQA Dataset

    Authors: Vittorio Castelli, Rishav Chakravarti, Saswati Dana, Anthony Ferritto, Radu Florian, Martin Franz, Dinesh Garg, Dinesh Khandelwal, Scott McCarley, Mike McCawley, Mohamed Nasr, Lin Pan, Cezar Pendus, John Pitrelli, Saurabh Pujar, Salim Roukos, Andrzej Sakrajda, Avirup Sil, Rosario Uceda-Sosa, Todd Ward, Rong Zhang

    Abstract: We introduce TechQA, a domain-adaptation question answering dataset for the technical support domain. The TechQA corpus highlights two real-world issues from the automated customer support domain. First, it contains actual questions posed by users on a technical forum, rather than questions generated specifically for a competition or a task. Second, it has a real-world size -- 600 training, 310 de…

    Submitted 7 November, 2019; originally announced November 2019.

    Comments: Long version of conference paper to be submitted

  37. arXiv:1911.00337  [pdf, other]

    cs.CL

    Ensembling Strategies for Answering Natural Questions

    Authors: Anthony Ferritto, Lin Pan, Rishav Chakravarti, Salim Roukos, Radu Florian, J. William Murdock, Avirup Sil

    Abstract: Many of the top question answering systems today utilize ensembling to improve their performance on tasks such as the Stanford Question Answering Dataset (SQuAD) and Natural Questions (NQ) challenges. Unfortunately most of these systems do not publish their ensembling strategies used in their leaderboard submissions. In this work, we investigate a number of ensembling techniques and demonstrate a…

    Submitted 6 November, 2019; v1 submitted 30 October, 2019; originally announced November 2019.

    Comments: arXiv admin note: text overlap with arXiv:1909.05286

  38. arXiv:1911.00069  [pdf, other]

    cs.CL cs.IR cs.LG

    Neural Cross-Lingual Relation Extraction Based on Bilingual Word Embedding Mapping

    Authors: Jian Ni, Radu Florian

    Abstract: Relation extraction (RE) seeks to detect and classify semantic relationships between entities, which provides useful information for many NLP applications. Since the state-of-the-art RE models require large amounts of manually annotated data and language-specific resources to achieve high accuracy, it is very challenging to transfer an RE model of a resource-rich language to a resource-poor langua…

    Submitted 31 October, 2019; originally announced November 2019.

    Comments: 11 pages, Conference on Empirical Methods in Natural Language Processing (EMNLP), 2019

  39. arXiv:1909.05286  [pdf, other]

    cs.CL

    Frustratingly Easy Natural Question Answering

    Authors: Lin Pan, Rishav Chakravarti, Anthony Ferritto, Michael Glass, Alfio Gliozzo, Salim Roukos, Radu Florian, Avirup Sil

    Abstract: Existing literature on Question Answering (QA) mostly focuses on algorithmic novelty, data augmentation, or increasingly large pre-trained language models like XLNet and RoBERTa. Additionally, a lot of systems on the QA leaderboards do not have associated research documentation in order to successfully replicate their experiments. In this paper, we outline these algorithmic components such as Atte…

    Submitted 11 September, 2019; originally announced September 2019.

  40. CFO: A Framework for Building Production NLP Systems

    Authors: Rishav Chakravarti, Cezar Pendus, Andrzej Sakrajda, Anthony Ferritto, Lin Pan, Michael Glass, Vittorio Castelli, J. William Murdock, Radu Florian, Salim Roukos, Avirup Sil

    Abstract: This paper introduces a novel orchestration framework, called CFO (COMPUTATION FLOW ORCHESTRATOR), for building, experimenting with, and deploying interactive NLP (Natural Language Processing) and IR (Information Retrieval) systems to production environments. We then demonstrate a question answering system built using this framework which incorporates state-of-the-art BERT based MRC (Machine Readi…

    Submitted 19 June, 2020; v1 submitted 16 August, 2019; originally announced August 2019.

    Comments: http://ibm.biz/cfo_framework

    Report number: D19-3006

    Journal ref: Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP): System Demonstrations

  41. arXiv:1905.13370  [pdf, ps, other]

    cs.CL

    Rewarding Smatch: Transition-Based AMR Parsing with Reinforcement Learning

    Authors: Tahira Naseem, Abhishek Shah, Hui Wan, Radu Florian, Salim Roukos, Miguel Ballesteros

    Abstract: Our work involves enriching the Stack-LSTM transition-based AMR parser (Ballesteros and Al-Onaizan, 2017) by augmenting training with Policy Learning and rewarding the Smatch score of sampled graphs. In addition, we also combined several AMR-to-text alignments with an attention mechanism and we supplemented the parser with pre-processed concept identification, named entities and contextualized emb…

    Submitted 30 May, 2019; originally announced May 2019.

    Comments: Accepted as short paper at ACL 2019

  42. arXiv:1809.02040  [pdf, other]

    cs.CL cs.AI

    Exploring Graph-structured Passage Representation for Multi-hop Reading Comprehension with Graph Neural Networks

    Authors: Linfeng Song, Zhiguo Wang, Mo Yu, Yue Zhang, Radu Florian, Daniel Gildea

    Abstract: Multi-hop reading comprehension focuses on one type of factoid question, where a system needs to properly integrate multiple pieces of evidence to correctly answer a question. Previous work approximates global evidence with local coreference information, encoding coreference chains with DAG-styled GRU layers within a gated-attention reader. However, coreference is limited in providing information…

    Submitted 6 September, 2018; originally announced September 2018.

  43. arXiv:1806.10201  [pdf, other]

    cs.CL

    Neural Cross-Lingual Coreference Resolution and its Application to Entity Linking

    Authors: Gourab Kundu, Avirup Sil, Radu Florian, Wael Hamza

    Abstract: We propose an entity-centric neural cross-lingual coreference model that builds on multi-lingual embeddings and language-independent features. We perform both intrinsic and extrinsic evaluations of our model. In the intrinsic evaluation, we show that our model, when trained on English and tested on Chinese and Spanish, achieves competitive results to the models trained directly on Chinese and Span…

    Submitted 26 June, 2018; originally announced June 2018.

    Journal ref: ACL 2018

  44. arXiv:1712.01813  [pdf, other]

    cs.CL

    Neural Cross-Lingual Entity Linking

    Authors: Avirup Sil, Gourab Kundu, Radu Florian, Wael Hamza

    Abstract: A major challenge in Entity Linking (EL) is making effective use of contextual information to disambiguate mentions to Wikipedia that might refer to different entities in different contexts. The problem exacerbates with cross-lingual EL which involves linking mentions written in non-English documents to entries in the English Wikipedia: to compare textual clues across languages we need to compute…

    Submitted 5 December, 2017; originally announced December 2017.

    Comments: Association for the Advancement of Artificial Intelligence (AAAI), 2018

  45. arXiv:1712.01797  [pdf, other]

    cs.CL

    One for All: Towards Language Independent Named Entity Linking

    Authors: Avirup Sil, Radu Florian

    Abstract: Entity linking (EL) is the task of disambiguating mentions in text by associating them with entries in a predefined database of mentions (persons, organizations, etc). Most previous EL research has focused mainly on one language, English, with less attention being paid to other languages, such as Spanish or Chinese. In this paper, we introduce LIEL, a Language Independent Entity Linking system, wh…

    Submitted 5 December, 2017; originally announced December 2017.

    Comments: Association for Computational Linguistics (ACL), 2016

  46. Weakly Supervised Cross-Lingual Named Entity Recognition via Effective Annotation and Representation Projection

    Authors: Jian Ni, Georgiana Dinu, Radu Florian

    Abstract: The state-of-the-art named entity recognition (NER) systems are supervised machine learning models that require large amounts of manually annotated data to achieve high accuracy. However, annotating NER data by human is expensive and time-consuming, and can be quite difficult for a new language. In this paper, we present two weakly supervised approaches for cross-lingual NER with no human annotati…

    Submitted 8 July, 2017; originally announced July 2017.

    Comments: 11 pages, The 55th Annual Meeting of the Association for Computational Linguistics (ACL), 2017

  47. arXiv:1707.02459  [pdf, ps, other]

    cs.CL cs.AI cs.IR

    Improving Multilingual Named Entity Recognition with Wikipedia Entity Type Mapping

    Authors: Jian Ni, Radu Florian

    Abstract: The state-of-the-art named entity recognition (NER) systems are statistical machine learning models that have strong generalization capability (i.e., can recognize unseen entities that do not appear in training data) based on lexical and contextual information. However, such a model could still make mistakes if its features favor a wrong entity type. In this paper, we utilize Wikipedia as an open…

    Submitted 8 July, 2017; originally announced July 2017.

    Comments: 11 pages, Conference on Empirical Methods in Natural Language Processing (EMNLP), 2016

  48. arXiv:1707.01075  [pdf, other]

    cs.CL

    Improving Slot Filling Performance with Attentive Neural Networks on Dependency Structures

    Authors: Lifu Huang, Avirup Sil, Heng Ji, Radu Florian

    Abstract: Slot Filling (SF) aims to extract the values of certain types of attributes (or slots, such as person:cities\_of\_residence) for a given entity from a large collection of source documents. In this paper we propose an effective DNN architecture for SF with the following new strategies: (1). Take a regularized dependency graph instead of a raw sentence as input to DNN, to compress the wide contexts…

    Submitted 4 July, 2017; originally announced July 2017.

    Comments: EMNLP'2017

  49. arXiv:1703.04489  [pdf, ps, other]

    cs.CL cs.AI

    Reinforcement Learning for Transition-Based Mention Detection

    Authors: Georgiana Dinu, Wael Hamza, Radu Florian

    Abstract: This paper describes an application of reinforcement learning to the mention detection task. We define a novel action-based formulation for the mention detection task, in which a model can flexibly revise past labeling decisions by grouping together tokens and assigning partial mention labels. We devise a method to create mention-level episodes and we train a model by rewarding correctly labeled c…

    Submitted 13 March, 2017; originally announced March 2017.

    Comments: Deep Reinforcement Learning Workshop, NIPS 2016

  50. arXiv:1702.03814  [pdf, other]

    cs.AI cs.CL

    Bilateral Multi-Perspective Matching for Natural Language Sentences

    Authors: Zhiguo Wang, Wael Hamza, Radu Florian

    Abstract: Natural language sentence matching is a fundamental technology for a variety of tasks. Previous approaches either match sentences from a single direction or only apply single granular (word-by-word or sentence-by-sentence) matching. In this work, we propose a bilateral multi-perspective matching (BiMPM) model under the "matching-aggregation" framework. Given two sentences $P$ and $Q$, our model fi…

    Submitted 14 July, 2017; v1 submitted 13 February, 2017; originally announced February 2017.

    Comments: To appear in Proceedings of IJCAI 2017