Showing 1–25 of 25 results for author: Toutanova, K

Searching in archive cs.
  1. arXiv:2311.09612

    cs.CV cs.CL

    Efficient End-to-End Visual Document Understanding with Rationale Distillation

    Authors: Wang Zhu, Alekh Agarwal, Mandar Joshi, Robin Jia, Jesse Thomason, Kristina Toutanova

    Abstract: Understanding visually situated language requires interpreting complex layouts of textual and visual elements. Pre-processing tools, such as optical character recognition (OCR), can map document image inputs to textual tokens, then large language models (LLMs) can reason over text. However, such methods have high computational and engineering complexity. Can small pretrained image-to-text models a…

    Submitted 1 April, 2024; v1 submitted 16 November, 2023; originally announced November 2023.

    Comments: Accepted by NAACL 2024

  2. arXiv:2306.00245

    cs.LG cs.CL cs.CV cs.HC

    From Pixels to UI Actions: Learning to Follow Instructions via Graphical User Interfaces

    Authors: Peter Shaw, Mandar Joshi, James Cohan, Jonathan Berant, Panupong Pasupat, Hexiang Hu, Urvashi Khandelwal, Kenton Lee, Kristina Toutanova

    Abstract: Much of the previous work towards digital agents for graphical user interfaces (GUIs) has relied on text-based representations (derived from HTML or other structured data sources), which are not always readily available. These input representations have been often coupled with custom, task-specific action spaces. This paper focuses on creating agents that interact with the digital world using the…

    Submitted 6 December, 2023; v1 submitted 31 May, 2023; originally announced June 2023.

  3. arXiv:2305.14337

    cs.CL cs.IR

    Anchor Prediction: Automatic Refinement of Internet Links

    Authors: Nelson F. Liu, Kenton Lee, Kristina Toutanova

    Abstract: Internet links enable users to deepen their understanding of a topic by providing convenient access to related information. However, the majority of links are unanchored -- they link to a target webpage as a whole, and readers may expend considerable effort localizing the specific parts of the target webpage that enrich their understanding of the link's source context. To help readers effectively…

    Submitted 24 May, 2023; v1 submitted 23 May, 2023; originally announced May 2023.

    Comments: 10 pages, 2 figures

  4. arXiv:2305.11694

    cs.CL

    QUEST: A Retrieval Dataset of Entity-Seeking Queries with Implicit Set Operations

    Authors: Chaitanya Malaviya, Peter Shaw, Ming-Wei Chang, Kenton Lee, Kristina Toutanova

    Abstract: Formulating selective information needs results in queries that implicitly specify set operations, such as intersection, union, and difference. For instance, one might search for "shorebirds that are not sandpipers" or "science-fiction films shot in England". To study the ability of retrieval systems to meet such information needs, we construct QUEST, a dataset of 3357 natural language queries wit…

    Submitted 31 May, 2023; v1 submitted 19 May, 2023; originally announced May 2023.

    Comments: ACL 2023; Dataset available at https://github.com/google-research/language/tree/master/language/quest

  5. arXiv:2302.11154

    cs.CV cs.AI cs.CL

    Open-domain Visual Entity Recognition: Towards Recognizing Millions of Wikipedia Entities

    Authors: Hexiang Hu, Yi Luan, Yang Chen, Urvashi Khandelwal, Mandar Joshi, Kenton Lee, Kristina Toutanova, Ming-Wei Chang

    Abstract: Large-scale multi-modal pre-training models such as CLIP and PaLI exhibit strong generalization on various visual domains and tasks. However, existing image classification benchmarks often evaluate recognition on a specific domain (e.g., outdoor images) or a specific task (e.g., classifying plant species), which falls short of evaluating whether pre-trained foundational models are universal visual…

    Submitted 23 February, 2023; v1 submitted 22 February, 2023; originally announced February 2023.

    Comments: Dataset available at https://open-vision-language.github.io/oven

  6. arXiv:2210.03347

    cs.CL cs.CV

    Pix2Struct: Screenshot Parsing as Pretraining for Visual Language Understanding

    Authors: Kenton Lee, Mandar Joshi, Iulia Turc, Hexiang Hu, Fangyu Liu, Julian Eisenschlos, Urvashi Khandelwal, Peter Shaw, Ming-Wei Chang, Kristina Toutanova

    Abstract: Visually-situated language is ubiquitous -- sources range from textbooks with diagrams to web pages with images and tables, to mobile apps with buttons and forms. Perhaps due to this diversity, previous work has typically relied on domain-specific recipes with limited sharing of the underlying data, model architectures, and objectives. We present Pix2Struct, a pretrained image-to-text model for pu…

    Submitted 15 June, 2023; v1 submitted 7 October, 2022; originally announced October 2022.

    Comments: Accepted at ICML

  7. arXiv:2205.12253

    cs.CL

    Evaluating the Impact of Model Scale for Compositional Generalization in Semantic Parsing

    Authors: Linlu Qiu, Peter Shaw, Panupong Pasupat, Tianze Shi, Jonathan Herzig, Emily Pitler, Fei Sha, Kristina Toutanova

    Abstract: Despite their strong performance on many tasks, pre-trained language models have been shown to struggle on out-of-distribution compositional generalization. Meanwhile, recent work has shown considerable improvements on many NLP tasks from model scaling. Can scaling up model size also improve compositional generalization in semantic parsing? We evaluate encoder-decoder models up to 11B parameters a…

    Submitted 24 October, 2022; v1 submitted 24 May, 2022; originally announced May 2022.

    Comments: EMNLP 2022

  8. arXiv:2204.00743

    cs.CL cs.IR

    Entity-Centric Query Refinement

    Authors: David Wadden, Nikita Gupta, Kenton Lee, Kristina Toutanova

    Abstract: We introduce the task of entity-centric query refinement. Given an input query whose answer is a (potentially large) collection of entities, the task output is a small set of query refinements meant to assist the user in efficient domain exploration and entity discovery. We propose a method to create a training dataset for this task. For a given input query, we use an existing knowledge base taxon…

    Submitted 15 September, 2022; v1 submitted 1 April, 2022; originally announced April 2022.

    Comments: AKBC 2022

  9. arXiv:2112.07610

    cs.CL

    Improving Compositional Generalization with Latent Structure and Data Augmentation

    Authors: Linlu Qiu, Peter Shaw, Panupong Pasupat, Paweł Krzysztof Nowak, Tal Linzen, Fei Sha, Kristina Toutanova

    Abstract: Generic unstructured neural networks have been shown to struggle on out-of-distribution compositional generalization. Compositional data augmentation via example recombination has transferred some prior knowledge about compositionality to such black-box neural models for several semantic parsing tasks, but this often required task-specific engineering or provided limited gains. We present a more…

    Submitted 4 May, 2022; v1 submitted 14 December, 2021; originally announced December 2021.

    Comments: NAACL 2022

  10. arXiv:2106.16171

    cs.CL

    Revisiting the Primacy of English in Zero-shot Cross-lingual Transfer

    Authors: Iulia Turc, Kenton Lee, Jacob Eisenstein, Ming-Wei Chang, Kristina Toutanova

    Abstract: Despite their success, large pre-trained multilingual models have not completely alleviated the need for labeled data, which is cumbersome to collect for all target languages. Zero-shot cross-lingual transfer is emerging as a practical solution: pre-trained models later fine-tuned on one transfer language exhibit surprising performance when tested on many target languages. English is the dominant…

    Submitted 30 June, 2021; originally announced June 2021.

  11. arXiv:2104.08445

    cs.CL cs.AI

    Joint Passage Ranking for Diverse Multi-Answer Retrieval

    Authors: Sewon Min, Kenton Lee, Ming-Wei Chang, Kristina Toutanova, Hannaneh Hajishirzi

    Abstract: We study multi-answer retrieval, an under-explored problem that requires retrieving passages to cover multiple distinct answers for a given question. This task requires joint modeling of retrieved passages, as models should not repeatedly retrieve passages containing the same answer at the cost of missing a different valid answer. In this paper, we introduce JPR, the first joint passage retrieval…

    Submitted 19 September, 2021; v1 submitted 17 April, 2021; originally announced April 2021.

    Comments: 13 pages; Published as a conference paper at EMNLP 2021 (long)

  12. arXiv:2101.10573

    cs.CL

    Representations for Question Answering from Documents with Tables and Text

    Authors: Vicky Zayats, Kristina Toutanova, Mari Ostendorf

    Abstract: Tables in Web documents are pervasive and can be directly used to answer many of the queries searched on the Web, motivating their integration in question answering. Very often information presented in tables is succinct and hard to interpret with standard language representations. On the other hand, tables often appear within textual context, such as an article describing the table. Using the inf…

    Submitted 26 January, 2021; originally announced January 2021.

    Comments: To appear at EACL 2021

  13. arXiv:2010.12725

    cs.CL

    Compositional Generalization and Natural Language Variation: Can a Semantic Parsing Approach Handle Both?

    Authors: Peter Shaw, Ming-Wei Chang, Panupong Pasupat, Kristina Toutanova

    Abstract: Sequence-to-sequence models excel at handling natural language variation, but have been shown to struggle with out-of-distribution compositional generalization. This has motivated new specialized architectures with stronger compositional biases, but most of these approaches have only been evaluated on synthetically-generated datasets, which are not representative of natural language variation. In…

    Submitted 1 June, 2021; v1 submitted 23 October, 2020; originally announced October 2020.

    Comments: ACL 2021

  14. arXiv:2005.01898

    cs.CL cs.LG

    Probabilistic Assumptions Matter: Improved Models for Distantly-Supervised Document-Level Question Answering

    Authors: Hao Cheng, Ming-Wei Chang, Kenton Lee, Kristina Toutanova

    Abstract: We address the problem of extractive question answering using document-level distant supervision, pairing questions and relevant documents with answer strings. We compare previously used probability space and distant supervision assumptions (assumptions on the correspondence between the weak answer string labels and possible answer mention spans). We show that these assumptions interact, and tha…

    Submitted 4 May, 2020; originally announced May 2020.

    Comments: ACL 2020

  15. arXiv:2005.00181

    cs.CL

    Sparse, Dense, and Attentional Representations for Text Retrieval

    Authors: Yi Luan, Jacob Eisenstein, Kristina Toutanova, Michael Collins

    Abstract: Dual encoders perform retrieval by encoding documents and queries into dense low-dimensional vectors, scoring each document by its inner product with the query. We investigate the capacity of this architecture relative to sparse bag-of-words models and attentional neural networks. Using both theoretical and empirical analysis, we establish connections between the encoding dimension, the margin betw…

    Submitted 16 February, 2021; v1 submitted 30 April, 2020; originally announced May 2020.

    Comments: To appear in TACL 2020. The arXiv version is a pre-MIT Press publication version

  16. arXiv:2004.12006

    cs.CL

    Contextualized Representations Using Textual Encyclopedic Knowledge

    Authors: Mandar Joshi, Kenton Lee, Yi Luan, Kristina Toutanova

    Abstract: We present a method to represent input texts by contextualizing them jointly with dynamically retrieved textual encyclopedic background knowledge from multiple documents. We apply our method to reading comprehension tasks by encoding questions and passages together with background sentences about the entities they mention. We show that integrating background knowledge from text is effective for ta…

    Submitted 13 July, 2021; v1 submitted 24 April, 2020; originally announced April 2020.

    Comments: Added experiments comparing linkers

  17. arXiv:1908.08962

    cs.CL

    Well-Read Students Learn Better: On the Importance of Pre-training Compact Models

    Authors: Iulia Turc, Ming-Wei Chang, Kenton Lee, Kristina Toutanova

    Abstract: Recent developments in natural language representations have been accompanied by large and expensive models that leverage vast amounts of general-domain text through self-supervised pre-training. Due to the cost of applying such models to down-stream tasks, several model compression techniques on pre-trained language representations have been proposed (Sun et al., 2019; Sanh, 2019). However, surpr…

    Submitted 25 September, 2019; v1 submitted 23 August, 2019; originally announced August 2019.

    Comments: Added comparison to concurrent work

  18. arXiv:1906.07348

    cs.CL cs.LG

    Zero-Shot Entity Linking by Reading Entity Descriptions

    Authors: Lajanugen Logeswaran, Ming-Wei Chang, Kenton Lee, Kristina Toutanova, Jacob Devlin, Honglak Lee

    Abstract: We present the zero-shot entity linking task, where mentions must be linked to unseen entities without in-domain labeled data. The goal is to enable robust transfer to highly specialized domains, and so no metadata or alias tables are assumed. In this setting, entities are only identified by text descriptions, and models must rely strictly on language understanding to resolve the new entities. Fir…

    Submitted 17 June, 2019; originally announced June 2019.

    Comments: ACL 2019

  19. arXiv:1906.00300

    cs.CL

    Latent Retrieval for Weakly Supervised Open Domain Question Answering

    Authors: Kenton Lee, Ming-Wei Chang, Kristina Toutanova

    Abstract: Recent work on open domain question answering (QA) assumes strong supervision of the supporting evidence and/or assumes a blackbox information retrieval (IR) system to retrieve evidence candidates. We argue that both are suboptimal, since gold evidence is not always available, and QA is fundamentally different from IR. We show for the first time that it is possible to jointly learn the retriever a…

    Submitted 27 June, 2019; v1 submitted 1 June, 2019; originally announced June 2019.

    Comments: Accepted to ACL 2019

  20. arXiv:1905.10044

    cs.CL

    BoolQ: Exploring the Surprising Difficulty of Natural Yes/No Questions

    Authors: Christopher Clark, Kenton Lee, Ming-Wei Chang, Tom Kwiatkowski, Michael Collins, Kristina Toutanova

    Abstract: In this paper we study yes/no questions that are naturally occurring --- meaning that they are generated in unprompted and unconstrained settings. We build a reading comprehension dataset, BoolQ, of such questions, and show that they are unexpectedly challenging. They often query for complex, non-factoid information, and require difficult entailment-like inference to solve. We also explore the eff…

    Submitted 24 May, 2019; originally announced May 2019.

    Comments: In NAACL 2019

  21. arXiv:1901.09128

    cs.CL

    Language Model Pre-training for Hierarchical Document Representations

    Authors: Ming-Wei Chang, Kristina Toutanova, Kenton Lee, Jacob Devlin

    Abstract: Hierarchical neural architectures are often used to capture long-distance dependencies and have been applied to many document-level tasks such as summarization, document segmentation, and sentiment analysis. However, effective usage of such a large context can be difficult to learn, especially in the case where there is limited labeled data available. Building on the recent success of language mod…

    Submitted 25 January, 2019; originally announced January 2019.

  22. arXiv:1811.02076

    cs.CL

    Improving Span-based Question Answering Systems with Coarsely Labeled Data

    Authors: Hao Cheng, Ming-Wei Chang, Kenton Lee, Ankur Parikh, Michael Collins, Kristina Toutanova

    Abstract: We study approaches to improve fine-grained short answer Question Answering models by integrating coarse-grained data annotated for paragraph-level relevance and show that coarsely annotated data can bring significant performance gains. Experiments demonstrate that the standard multi-task learning approach of sharing representations is not the most effective way to leverage coarse-grained annotati…

    Submitted 5 November, 2018; originally announced November 2018.

  23. arXiv:1810.04805

    cs.CL

    BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

    Authors: Jacob Devlin, Ming-Wei Chang, Kenton Lee, Kristina Toutanova

    Abstract: We introduce a new language representation model called BERT, which stands for Bidirectional Encoder Representations from Transformers. Unlike recent language representation models, BERT is designed to pre-train deep bidirectional representations from unlabeled text by jointly conditioning on both left and right context in all layers. As a result, the pre-trained BERT model can be fine-tuned with…

    Submitted 24 May, 2019; v1 submitted 10 October, 2018; originally announced October 2018.

  24. arXiv:1708.03743

    cs.CL

    Cross-Sentence N-ary Relation Extraction with Graph LSTMs

    Authors: Nanyun Peng, Hoifung Poon, Chris Quirk, Kristina Toutanova, Wen-tau Yih

    Abstract: Past work in relation extraction has focused on binary relations in single sentences. Recent NLP inroads in high-value domains have sparked interest in the more general setting of extracting n-ary relations that span multiple sentences. In this paper, we explore a general relation extraction framework based on graph long short-term memory networks (graph LSTMs) that can be easily extended to cross…

    Submitted 12 August, 2017; originally announced August 2017.

    Comments: Conditionally accepted by TACL in December 2016; published in April 2017; presented at ACL in August 2017

    Journal ref: Transactions of the Association for Computational Linguistics (TACL) 2017, Vol 5

  25. arXiv:1707.02026

    cs.CL

    A Nested Attention Neural Hybrid Model for Grammatical Error Correction

    Authors: Jianshu Ji, Qinlong Wang, Kristina Toutanova, Yongen Gong, Steven Truong, Jianfeng Gao

    Abstract: Grammatical error correction (GEC) systems strive to correct both global errors in word order and usage, and local errors in spelling and inflection. Further developing upon recent work on neural machine translation, we propose a new hybrid neural model with nested attention layers for GEC. Experiments show that the new model can effectively correct errors of both types by incorporating word and c…

    Submitted 9 July, 2017; v1 submitted 6 July, 2017; originally announced July 2017.