Showing 1–4 of 4 results for author: Ordan, N

  1. arXiv:2405.18115  [pdf, other]

    cs.CL

    The Knesset Corpus: An Annotated Corpus of Hebrew Parliamentary Proceedings

    Authors: Gili Goldin, Nick Howell, Noam Ordan, Ella Rabinovich, Shuly Wintner

    Abstract: We present the Knesset Corpus, a corpus of Hebrew parliamentary proceedings containing over 30 million sentences (over 384 million tokens) from all the (plenary and committee) protocols held in the Israeli parliament between 1998 and 2022. Sentences are annotated with morpho-syntactic information and are associated with detailed meta-information reflecting demographic and political properties of t…

    Submitted 28 May, 2024; originally announced May 2024.

    Comments: 28 pages, 7 figures

    MSC Class: 68T50; ACM Class: I.2.7

  2. arXiv:2210.07873  [pdf, other]

    cs.CL

    A Second Wave of UD Hebrew Treebanking and Cross-Domain Parsing

    Authors: Amir Zeldes, Nick Howell, Noam Ordan, Yifat Ben Moshe

    Abstract: Foundational Hebrew NLP tasks such as segmentation, tagging and parsing, have relied to date on various versions of the Hebrew Treebank (HTB, Sima'an et al. 2001). However, the data in HTB, a single-source newswire corpus, is now over 30 years old, and does not cover many aspects of contemporary Hebrew on the web. This paper presents a new, freely available UD treebank of Hebrew stratified from a…

    Submitted 18 October, 2022; v1 submitted 14 October, 2022; originally announced October 2022.

    Comments: Proceedings of EMNLP 2022

  3. arXiv:1704.07146  [pdf, other]

    cs.CL

    Found in Translation: Reconstructing Phylogenetic Language Trees from Translations

    Authors: Ella Rabinovich, Noam Ordan, Shuly Wintner

    Abstract: Translation has played an important role in trade, law, commerce, politics, and literature for thousands of years. Translators have always tried to be invisible; ideal translations should look as if they were written originally in the target language. We show that traces of the source language remain in the translation product to the extent that it is possible to uncover the history of the source…

    Submitted 24 April, 2017; originally announced April 2017.

    Comments: ACL2017, 11 pages

  4. arXiv:1609.03204  [pdf, other]

    cs.CL

    On the Similarities Between Native, Non-native and Translated Texts

    Authors: Ella Rabinovich, Sergiu Nisioi, Noam Ordan, Shuly Wintner

    Abstract: We present a computational analysis of three language varieties: native, advanced non-native, and translation. Our goal is to investigate the similarities and differences between non-native language productions and translations, contrasting both with native language. Using a collection of computational methods we establish three main results: (1) the three types of texts are easily distinguishable…

    Submitted 11 September, 2016; originally announced September 2016.

    Comments: ACL2016, 12 pages