Showing 1–2 of 2 results for author: Goldfarb, A R

Searching in archive cs.
  1. arXiv:2405.00505

    cs.IR cs.LG

    KVP10k: A Comprehensive Dataset for Key-Value Pair Extraction in Business Documents

    Authors: Oshri Naparstek, Roi Pony, Inbar Shapira, Foad Abo Dahood, Ophir Azulai, Yevgeny Yaroker, Nadav Rubinstein, Maksym Lysak, Peter Staar, Ahmed Nassar, Nikolaos Livathinos, Christoph Auer, Elad Amrani, Idan Friedman, Orit Prince, Yevgeny Burshtein, Adi Raz Goldfarb, Udi Barzelay

    Abstract: In recent years, the challenge of extracting information from business documents has emerged as a critical task, finding applications across numerous domains. This effort has attracted substantial interest from both industry and academia, highlighting its significance in the current technological landscape. Most datasets in this area are primarily focused on Key Information Extraction (KIE), where…

    Submitted 1 May, 2024; originally announced May 2024.

    Comments: Accepted at ICDAR 2024

  2. arXiv:2111.14103

    cs.CV

    CHARTER: heatmap-based multi-type chart data extraction

    Authors: Joseph Shtok, Sivan Harary, Ophir Azulai, Adi Raz Goldfarb, Assaf Arbelle, Leonid Karlinsky

    Abstract: The digital conversion of information stored in documents is a great source of knowledge. In contrast to the document's text, the conversion of the embedded document graphics, such as charts and plots, has been much less explored. We present a method and a system for end-to-end conversion of document charts into machine-readable tabular data format, which can be easily stored and analyzed in the d…

    Submitted 28 November, 2021; originally announced November 2021.

    Comments: Joseph Shtok, Sivan Harary, and Leonid Karlinsky contributed equally

    Journal ref: Document Intelligence workshop at KDD 2021 conference