Showing 1–6 of 6 results for author: Turc, I

  1. arXiv:2210.03347

    cs.CL cs.CV

    Pix2Struct: Screenshot Parsing as Pretraining for Visual Language Understanding

    Authors: Kenton Lee, Mandar Joshi, Iulia Turc, Hexiang Hu, Fangyu Liu, Julian Eisenschlos, Urvashi Khandelwal, Peter Shaw, Ming-Wei Chang, Kristina Toutanova

    Abstract: Visually-situated language is ubiquitous -- sources range from textbooks with diagrams to web pages with images and tables, to mobile apps with buttons and forms. Perhaps due to this diversity, previous work has typically relied on domain-specific recipes with limited sharing of the underlying data, model architectures, and objectives. We present Pix2Struct, a pretrained image-to-text model for pu…

    Submitted 15 June, 2023; v1 submitted 7 October, 2022; originally announced October 2022.

    Comments: Accepted at ICML

  2. arXiv:2112.12870

    cs.CL

    Measuring Attribution in Natural Language Generation Models

    Authors: Hannah Rashkin, Vitaly Nikolaev, Matthew Lamm, Lora Aroyo, Michael Collins, Dipanjan Das, Slav Petrov, Gaurav Singh Tomar, Iulia Turc, David Reitter

    Abstract: With recent improvements in natural language generation (NLG) models for various applications, it has become imperative to have the means to identify and evaluate whether NLG output is only sharing verifiable information about the external world. In this work, we present a new evaluation framework entitled Attributable to Identified Sources (AIS) for assessing the output of natural language genera…

    Submitted 2 August, 2022; v1 submitted 23 December, 2021; originally announced December 2021.

  3. arXiv:2106.16171

    cs.CL

    Revisiting the Primacy of English in Zero-shot Cross-lingual Transfer

    Authors: Iulia Turc, Kenton Lee, Jacob Eisenstein, Ming-Wei Chang, Kristina Toutanova

    Abstract: Despite their success, large pre-trained multilingual models have not completely alleviated the need for labeled data, which is cumbersome to collect for all target languages. Zero-shot cross-lingual transfer is emerging as a practical solution: pre-trained models later fine-tuned on one transfer language exhibit surprising performance when tested on many target languages. English is the dominant…

    Submitted 30 June, 2021; originally announced June 2021.

  4. arXiv:2106.16163

    cs.CL

    The MultiBERTs: BERT Reproductions for Robustness Analysis

    Authors: Thibault Sellam, Steve Yadlowsky, Jason Wei, Naomi Saphra, Alexander D'Amour, Tal Linzen, Jasmijn Bastings, Iulia Turc, Jacob Eisenstein, Dipanjan Das, Ian Tenney, Ellie Pavlick

    Abstract: Experiments with pre-trained models such as BERT are often based on a single checkpoint. While the conclusions drawn apply to the artifact tested in the experiment (i.e., the particular instance of the model), it is not always clear whether they hold for the more general procedure which includes the architecture, training data, initialization scheme, and loss function. Recent work has shown that r…

    Submitted 21 March, 2022; v1 submitted 30 June, 2021; originally announced June 2021.

    Comments: Accepted at ICLR'22. Checkpoints and example analyses: http://goo.gle/multiberts

  5. CANINE: Pre-training an Efficient Tokenization-Free Encoder for Language Representation

    Authors: Jonathan H. Clark, Dan Garrette, Iulia Turc, John Wieting

    Abstract: Pipelined NLP systems have largely been superseded by end-to-end neural modeling, yet nearly all commonly-used models still require an explicit tokenization step. While recent tokenization approaches based on data-derived subword lexicons are less brittle than manually engineered tokenizers, these techniques are not equally suited to all languages, and the use of any fixed vocabulary may limit a m…

    Submitted 18 May, 2022; v1 submitted 11 March, 2021; originally announced March 2021.

    Comments: TACL Final Version

    Journal ref: Transactions of the Association for Computational Linguistics (2022) 10: 73--91

  6. arXiv:1908.08962

    cs.CL

    Well-Read Students Learn Better: On the Importance of Pre-training Compact Models

    Authors: Iulia Turc, Ming-Wei Chang, Kenton Lee, Kristina Toutanova

    Abstract: Recent developments in natural language representations have been accompanied by large and expensive models that leverage vast amounts of general-domain text through self-supervised pre-training. Due to the cost of applying such models to down-stream tasks, several model compression techniques on pre-trained language representations have been proposed (Sun et al., 2019; Sanh, 2019). However, surpr…

    Submitted 25 September, 2019; v1 submitted 23 August, 2019; originally announced August 2019.

    Comments: Added comparison to concurrent work