Showing 1–2 of 2 results for author: Mattarella-Micke, A

  1. arXiv:2109.09115  [pdf, other]

    cs.CL

    Do Long-Range Language Models Actually Use Long-Range Context?

    Authors: Simeng Sun, Kalpesh Krishna, Andrew Mattarella-Micke, Mohit Iyyer

    Abstract: Language models are generally trained on short, truncated input sequences, which limits their ability to use discourse-level information present in long-range context to improve their predictions. Recent efforts to improve the efficiency of self-attention have led to a proliferation of long-range Transformer language models, which can process much longer sequences than models of the past. However,…

    Submitted 19 September, 2021; originally announced September 2021.

    Comments: EMNLP2021

  2. arXiv:2005.00770  [pdf, other]

    cs.CL

    Exploring and Predicting Transferability across NLP Tasks

    Authors: Tu Vu, Tong Wang, Tsendsuren Munkhdalai, Alessandro Sordoni, Adam Trischler, Andrew Mattarella-Micke, Subhransu Maji, Mohit Iyyer

    Abstract: Recent advances in NLP demonstrate the effectiveness of training large-scale language models and transferring them to downstream tasks. Can fine-tuning these models on tasks other than language modeling further improve performance? In this paper, we conduct an extensive study of the transferability between 33 NLP tasks across three broad classes of problems (text classification, question answering…

    Submitted 6 October, 2020; v1 submitted 2 May, 2020; originally announced May 2020.

    Comments: Accepted as a conference paper at EMNLP 2020, 45 pages, 3 figures, 34 tables