
Showing 1–5 of 5 results for author: Surange, N

Searching in archive cs.
  1. arXiv:2404.11349  [pdf, other]

    cs.CL

    TeClass: A Human-Annotated Relevance-based Headline Classification and Generation Dataset for Telugu

    Authors: Gopichand Kanumolu, Lokesh Madasu, Nirmal Surange, Manish Shrivastava

    Abstract: News headline generation is a crucial task in increasing productivity for both the readers and producers of news. This task can easily be aided by automated News headline-generation models. However, the presence of irrelevant headlines in scraped news articles results in sub-optimal performance of generation models. We propose that relevance-based headline classification can greatly aid the task o…

    Submitted 17 April, 2024; originally announced April 2024.

    Comments: Accepted at LREC-COLING 2024

  2. arXiv:2403.18933  [pdf, other]

    cs.CL

    SemEval-2024 Task 1: Semantic Textual Relatedness for African and Asian Languages

    Authors: Nedjma Ousidhoum, Shamsuddeen Hassan Muhammad, Mohamed Abdalla, Idris Abdulmumin, Ibrahim Said Ahmad, Sanchit Ahuja, Alham Fikri Aji, Vladimir Araujo, Meriem Beloucif, Christine De Kock, Oumaima Hourrane, Manish Shrivastava, Thamar Solorio, Nirmal Surange, Krishnapriya Vishnubhotla, Seid Muhie Yimam, Saif M. Mohammad

    Abstract: We present the first shared task on Semantic Textual Relatedness (STR). While earlier shared tasks primarily focused on semantic similarity, we instead investigate the broader phenomenon of semantic relatedness across 14 languages: Afrikaans, Algerian Arabic, Amharic, English, Hausa, Hindi, Indonesian, Kinyarwanda, Marathi, Moroccan Arabic, Modern Standard Arabic, Punjabi, Spanish, and Telugu. The…

    Submitted 17 April, 2024; v1 submitted 27 March, 2024; originally announced March 2024.

    Comments: SemEval 2024 Task Description Paper. arXiv admin note: text overlap with arXiv:2402.08638

  3. arXiv:2402.08638  [pdf, other]

    cs.CL

    SemRel2024: A Collection of Semantic Textual Relatedness Datasets for 13 Languages

    Authors: Nedjma Ousidhoum, Shamsuddeen Hassan Muhammad, Mohamed Abdalla, Idris Abdulmumin, Ibrahim Said Ahmad, Sanchit Ahuja, Alham Fikri Aji, Vladimir Araujo, Abinew Ali Ayele, Pavan Baswani, Meriem Beloucif, Chris Biemann, Sofia Bourhim, Christine De Kock, Genet Shanko Dekebo, Oumaima Hourrane, Gopichand Kanumolu, Lokesh Madasu, Samuel Rutunda, Manish Shrivastava, Thamar Solorio, Nirmal Surange, Hailegnaw Getaneh Tilaye, Krishnapriya Vishnubhotla, Genta Winata, et al. (2 additional authors not shown)

    Abstract: Exploring and quantifying semantic relatedness is central to representing language and holds significant implications across various NLP tasks. While earlier NLP research primarily focused on semantic similarity, often within the English language context, we instead investigate the broader phenomenon of semantic relatedness. In this paper, we present SemRel, a new semantic relatedness dat…

    Submitted 31 May, 2024; v1 submitted 13 February, 2024; originally announced February 2024.

    Comments: Accepted to the Findings of ACL 2024

  4. arXiv:2311.17743  [pdf, other]

    cs.CL cs.AI

    Mukhyansh: A Headline Generation Dataset for Indic Languages

    Authors: Lokesh Madasu, Gopichand Kanumolu, Nirmal Surange, Manish Shrivastava

    Abstract: The task of headline generation within the realm of Natural Language Processing (NLP) holds immense significance, as it strives to distill the true essence of textual content into concise and attention-grabbing summaries. While noteworthy progress has been made in headline generation for widely spoken languages like English, there persist numerous challenges when it comes to generating headlines i…

    Submitted 29 November, 2023; originally announced November 2023.

    Comments: Accepted at PACLIC 2023

  5. arXiv:2303.14461  [pdf, other]

    cs.CL

    Indian Language Summarization using Pretrained Sequence-to-Sequence Models

    Authors: Ashok Urlana, Sahil Manoj Bhatt, Nirmal Surange, Manish Shrivastava

    Abstract: The ILSUM shared task focuses on text summarization for two major Indian languages, Hindi and Gujarati, along with English. In this task, we experiment with various pretrained sequence-to-sequence models to find the best model for each of the languages. We present a detailed overview of the models and our approaches in this paper. We secure the first rank across all three sub-tasks (English, H…

    Submitted 25 March, 2023; originally announced March 2023.

    Comments: Accepted at FIRE-2022, Indian Language Summarization (ILSUM) track