Showing 1–2 of 2 results for author: N.C., G

  1. arXiv:2205.03018  [pdf]

    cs.CL

    Aksharantar: Open Indic-language Transliteration datasets and models for the Next Billion Users

    Authors: Yash Madhani, Sushane Parthan, Priyanka Bedekar, Gokul NC, Ruchi Khapra, Anoop Kunchukuttan, Pratyush Kumar, Mitesh M. Khapra

    Abstract: Transliteration is very important in the Indian language context due to the usage of multiple scripts and the widespread use of romanized inputs. However, few training and evaluation sets are publicly available. We introduce Aksharantar, the largest publicly available transliteration dataset for Indian languages created by mining from monolingual and parallel corpora, as well as collecting data fr…

    Submitted 26 October, 2023; v1 submitted 6 May, 2022; originally announced May 2022.

    Comments: This manuscript is an extended version of the paper accepted to EMNLP Findings 2023. You can find the EMNLP Findings version at https://anoopkunchukuttan.gitlab.io/publications/emnlp_findings_2023_aksharantar.pdf

  2. arXiv:2110.05877  [pdf, other]

    cs.CL cs.AI cs.CV cs.LG

    OpenHands: Making Sign Language Recognition Accessible with Pose-based Pretrained Models across Languages

    Authors: Prem Selvaraj, Gokul NC, Pratyush Kumar, Mitesh Khapra

    Abstract: AI technologies for Natural Languages have made tremendous progress recently. However, commensurate progress has not been made on Sign Languages, in particular, in recognizing signs as individual words or as complete sentences. We introduce OpenHands, a library where we take four key ideas from the NLP community for low-resource languages and apply them to sign languages for word-level recognition…

    Submitted 12 October, 2021; originally announced October 2021.

    Comments: Submitted to AAAI22, 13 pages, 9 figures, 6 tables

    ACM Class: I.2.7