Showing 1–1 of 1 results for author: Halama, P

  1. LAMBERT: Layout-Aware (Language) Modeling for information extraction

    Authors: Łukasz Garncarek, Rafał Powalski, Tomasz Stanisławek, Bartosz Topolski, Piotr Halama, Michał Turski, Filip Graliński

    Abstract: We introduce a simple new approach to the problem of understanding documents where non-trivial layout influences the local semantics. To this end, we modify the Transformer encoder architecture in a way that allows it to use layout features obtained from an OCR system, without the need to re-learn language semantics from scratch. We only augment the input of the model with the coordinates of token…

    Submitted 28 May, 2021; v1 submitted 19 February, 2020; originally announced February 2020.

    Comments: accepted to ICDAR 2021

    Journal ref: In: Lladós J., Lopresti D., Uchida S. (eds) Document Analysis and Recognition - ICDAR 2021. ICDAR 2021. Lecture Notes in Computer Science, vol 12821. Springer, Cham