Showing 1–1 of 1 results for author: ImaniGooghari, A

  1. arXiv:2305.08487  [pdf, other]

    cs.CL

    Taxi1500: A Multilingual Dataset for Text Classification in 1500 Languages

    Authors: Chunlan Ma, Ayyoob ImaniGooghari, Haotian Ye, Renhao Pei, Ehsaneddin Asgari, Hinrich Schütze

    Abstract: While natural language processing tools have been developed extensively for some of the world's languages, a significant portion of the world's over 7000 languages are still neglected. One reason for this is that evaluation datasets do not yet cover a wide range of languages, including low-resource and endangered ones. We aim to address this issue by creating a text classification dataset encompas…

    Submitted 4 June, 2024; v1 submitted 15 May, 2023; originally announced May 2023.