
Showing 1–2 of 2 results for author: Hemmer, A

  1. arXiv:2312.03367

    cs.CL

    Lazy-k: Decoding for Constrained Token Classification

    Authors: Arthur Hemmer, Mickaël Coustaty, Nicola Bartolo, Jérôme Brachat, Jean-Marc Ogier

    Abstract: We explore the possibility of improving probabilistic models in structured prediction. Specifically, we combine the models with constrained decoding approaches in the context of token classification for information extraction. The decoding methods search for constraint-satisfying label-assignments while maximizing the total probability. To do this, we evaluate several existing approaches, as well…

    Submitted 6 December, 2023; originally announced December 2023.

    Comments: Accepted EMNLP Main 2023

  2. arXiv:2307.01020

    cs.CL

    Estimating Post-OCR Denoising Complexity on Numerical Texts

    Authors: Arthur Hemmer, Jérôme Brachat, Mickaël Coustaty, Jean-Marc Ogier

    Abstract: Post-OCR processing has significantly improved over the past few years. However, these have been primarily beneficial for texts consisting of natural, alphabetical words, as opposed to documents of numerical nature such as invoices, payslips, medical certificates, etc. To evaluate the OCR post-processing difficulty of these datasets, we propose a method to estimate the denoising complexity of a te…

    Submitted 3 July, 2023; originally announced July 2023.

    Comments: Accepted for publication in the ACIIDS 2023 CCIS PROCEEDINGS