Showing 1–3 of 3 results for author: Pedrazzini, N

  1. A quantitative and typological study of Early Slavic participle clauses and their competition

    Authors: Nilo Pedrazzini

    Abstract: This thesis is a corpus-based, quantitative, and typological analysis of the functions of Early Slavic participle constructions and their finite competitors ($jegda$-'when'-clauses). The first part leverages detailed linguistic annotation on Early Slavic corpora at the morphosyntactic, dependency, information-structural, and lexical levels to obtain indirect evidence for different potential functi…

    Submitted 8 May, 2024; v1 submitted 3 May, 2024; originally announced May 2024.

    Comments: 259 pages, 138 figures. DPhil Thesis in Linguistics submitted and defended at the University of Oxford (December 2023). This manuscript is a version formatted for improved readability and broader dissemination

    MSC Class: 68T50; 68U15; 68T35 (Primary); 86A32; 15A03 (Secondary) ACM Class: I.2.7

  2. arXiv:2404.18257  [pdf, other]

    cs.CL cs.IR

    Mapping 'when'-clauses in Latin American and Caribbean languages: an experiment in subtoken-based typology

    Authors: Nilo Pedrazzini

    Abstract: Languages can encode temporal subordination lexically, via subordinating conjunctions, and morphologically, by marking the relation on the predicate. Systematic cross-linguistic variation among the former can be studied using well-established token-based typological approaches to token-aligned parallel corpora. Variation among different morphological means is instead much harder to tackle and ther…

    Submitted 28 April, 2024; originally announced April 2024.

    Comments: 10 pages, 6 figures. To be published in the 2024 Proceedings of the Workshop on Natural Language Processing for Indigenous Languages of the Americas (AmericasNLP)

    MSC Class: 68T50; 68U15; 68T35 (Primary); 86A32; 15A03 (Secondary) ACM Class: I.2.7

  3. arXiv:2011.06467  [pdf, ps, other]

    cs.CL

    Exploiting Cross-Dialectal Gold Syntax for Low-Resource Historical Languages: Towards a Generic Parser for Pre-Modern Slavic

    Authors: Nilo Pedrazzini

    Abstract: This paper explores the possibility of improving the performance of specialized parsers for pre-modern Slavic by training them on data from different related varieties. Because of their linguistic heterogeneity, pre-modern Slavic varieties are treated as low-resource historical languages, whereby cross-dialectal treebank data may be exploited to overcome data scarcity and attempt the training of a…

    Submitted 12 November, 2020; originally announced November 2020.

    Comments: Edited by Folgert Karsdorp, Barbara McGillivray, Adina Nerghes & Melvin Wevers. Conference paper (Preprint version). 11 pages. A link to the repository with the datasets used in the paper can be found in the relevant footnotes

    MSC Class: 68T50; 68T07 (Primary); 91F20 (Secondary) ACM Class: I.2.7

    Journal ref: Proceedings of the Workshop on Computational Humanities Research, 18-20 November 2020 (CEUR Workshop Proceedings, Vol. 2723), 237-247