Showing 1–3 of 3 results for author: Lévai, D

  1. arXiv:2109.06327  [pdf, other]

    cs.CL

    Evaluating Transferability of BERT Models on Uralic Languages

    Authors: Judit Ács, Dániel Lévai, András Kornai

    Abstract: Transformer-based language models such as BERT have outperformed previous models on a large number of English benchmarks, but their evaluation is often limited to English or a small number of well-resourced languages. In this work, we evaluate monolingual, multilingual, and randomly initialized language models from the BERT family on a variety of Uralic languages including Estonian, Finnish, Hunga…

    Submitted 23 November, 2021; v1 submitted 13 September, 2021; originally announced September 2021.

    Comments: Seventh International Workshop for Computational Linguistics of Uralic Languages (IWCLUL 2021)

  2. arXiv:2102.10848  [pdf, other]

    cs.CL

    Evaluating Contextualized Language Models for Hungarian

    Authors: Judit Ács, Dániel Lévai, Dávid Márk Nemeskey, András Kornai

    Abstract: We present an extended comparison of contextualized language models for Hungarian. We compare huBERT, a Hungarian model, against 4 multilingual models including the multilingual BERT model. We evaluate these models through three tasks: morphological probing, POS tagging, and NER. We find that huBERT works better than the other models, often by a large margin, particularly near the global optimum (ty…

    Submitted 22 February, 2021; originally announced February 2021.

    Journal ref: Hungarian NLP Conference (MSZNY2021)

  3. arXiv:2006.14350  [pdf, other]

    cs.LG stat.ML

    Data-dependent Pruning to find the Winning Lottery Ticket

    Authors: Dániel Lévai, Zsolt Zombori

    Abstract: The Lottery Ticket Hypothesis postulates that a freshly initialized neural network contains a small subnetwork that can be trained in isolation to achieve performance similar to that of the full network. Our paper examines several alternatives to search for such subnetworks. We conclude that incorporating a data-dependent component into the pruning criterion in the form of the gradient of the training los…

    Submitted 25 June, 2020; originally announced June 2020.