
Showing 1–1 of 1 results for author: Sadvilkar, N

  1. arXiv:2010.09657  [pdf, ps, other]

    cs.CL cs.AI

    PySBD: Pragmatic Sentence Boundary Disambiguation

    Authors: Nipun Sadvilkar, Mark Neumann

    Abstract: In this paper, we present a rule-based sentence boundary disambiguation Python package that works out-of-the-box for 22 languages. We aim to provide a realistic segmenter which can provide logical sentences even when the format and domain of the input text is unknown. In our work, we adapt the Golden Rules Set (a language-specific set of sentence boundary exemplars) originally implemented as a rub…

    Submitted 19 October, 2020; originally announced October 2020.

    Comments: 'PySBD: Pragmatic Sentence Boundary Disambiguation' is a short paper (5 pages with references) accepted to the 2nd Workshop for Natural Language Processing Open Source Software (NLP-OSS) at EMNLP 2020, taking place on 19 Nov 2020