Showing 1–3 of 3 results for author: Pérez-Mayos, L

  1. arXiv:2109.03160  [pdf, other]

    cs.CL

    How much pretraining data do language models need to learn syntax?

    Authors: Laura Pérez-Mayos, Miguel Ballesteros, Leo Wanner

    Abstract: Transformers-based pretrained language models achieve outstanding results in many well-known NLU benchmarks. However, while pretraining methods are very convenient, they are expensive in terms of time and resources. This calls for a study of the impact of pretraining data size on the knowledge of the models. We explore this impact on the syntactic capabilities of RoBERTa, using models trained on i…

    Submitted 9 September, 2021; v1 submitted 7 September, 2021; originally announced September 2021.

    Comments: To be published in proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing (EMNLP 2021)

  2. arXiv:2105.04688  [pdf, other]

    cs.CL

    Assessing the Syntactic Capabilities of Transformer-based Multilingual Language Models

    Authors: Laura Pérez-Mayos, Alba Táboas García, Simon Mille, Leo Wanner

    Abstract: Multilingual Transformer-based language models, usually pretrained on more than 100 languages, have been shown to achieve outstanding results in a wide range of cross-lingual transfer tasks. However, it remains unknown whether the optimization for different languages conditions the capacity of the models to generalize over syntactic structures, and how languages with syntactic phenomena of differe…

    Submitted 10 May, 2021; originally announced May 2021.

    Comments: To be published in Findings of ACL 2021

  3. arXiv:2101.11492  [pdf, other]

    cs.CL

    On the Evolution of Syntactic Information Encoded by BERT's Contextualized Representations

    Authors: Laura Pérez-Mayos, Roberto Carlini, Miguel Ballesteros, Leo Wanner

    Abstract: The adaptation of pretrained language models to solve supervised tasks has become a baseline in NLP, and many recent works have focused on studying how linguistic information is encoded in the pretrained sentence representations. Among other information, it has been shown that entire syntax trees are implicitly embedded in the geometry of such models. As these models are often fine-tuned, it becom…

    Submitted 10 February, 2021; v1 submitted 27 January, 2021; originally announced January 2021.