Showing 1–13 of 13 results for author: Coustaty, M

  1. arXiv:2404.13236  [pdf, other]

    cs.DC cs.ET

    LLMChain: Blockchain-based Reputation System for Sharing and Evaluating Large Language Models

    Authors: Mouhamed Amine Bouchiha, Quentin Telnoff, Souhail Bakkali, Ronan Champagnat, Mourad Rabah, Mickaël Coustaty, Yacine Ghamri-Doudane

    Abstract: Large Language Models (LLMs) have witnessed rapid growth in emerging challenges and capabilities of language understanding, generation, and reasoning. Despite their remarkable performance in natural language processing-based applications, LLMs are susceptible to undesirable and erratic behaviors, including hallucinations, unreliable reasoning, and the generation of harmful content. These flawed be…

    Submitted 3 May, 2024; v1 submitted 19 April, 2024; originally announced April 2024.

    Comments: Paper accepted at IEEE 48th Annual Computers, Software, and Applications Conference (COMPSAC) IEEE, Osaka, Japan (2024)

  2. arXiv:2403.00573   

    cs.CV

    IDTrust: Deep Identity Document Quality Detection with Bandpass Filtering

    Authors: Musab Al-Ghadi, Joris Voerman, Souhail Bakkali, Mickaël Coustaty, Nicolas Sidere, Xavier St-Georges

    Abstract: The increasing use of digital technologies and mobile-based registration procedures highlights the vital role of personal identity documents (IDs) in verifying users and safeguarding sensitive information. However, the rise in counterfeit ID production poses a significant challenge, necessitating the development of reliable and efficient automated verification methods. This paper introduces IDTrus…

    Submitted 26 June, 2024; v1 submitted 1 March, 2024; originally announced March 2024.

    Comments: The primary reason is confidentiality and the joint ownership between the L3i laboratory and the company IMDS

  3. arXiv:2312.03367  [pdf, other]

    cs.CL

    Lazy-k: Decoding for Constrained Token Classification

    Authors: Arthur Hemmer, Mickaël Coustaty, Nicola Bartolo, Jérôme Brachat, Jean-Marc Ogier

    Abstract: We explore the possibility of improving probabilistic models in structured prediction. Specifically, we combine the models with constrained decoding approaches in the context of token classification for information extraction. The decoding methods search for constraint-satisfying label-assignments while maximizing the total probability. To do this, we evaluate several existing approaches, as well…

    Submitted 6 December, 2023; originally announced December 2023.

    Comments: Accepted EMNLP Main 2023

  4. arXiv:2309.05756  [pdf, other]

    cs.CV

    TransferDoc: A Self-Supervised Transferable Document Representation Learning Model Unifying Vision and Language

    Authors: Souhail Bakkali, Sanket Biswas, Zuheng Ming, Mickael Coustaty, Marçal Rusiñol, Oriol Ramos Terrades, Josep Lladós

    Abstract: The field of visual document understanding has witnessed a rapid growth in emerging challenges and powerful multi-modal strategies. However, they rely on an extensive amount of document data to learn their pretext objectives in a "pre-train-then-fine-tune" paradigm and thus, suffer a significant performance drop in real-world online industrial settings. One major reason is the over-reliance on O…

    Submitted 11 September, 2023; originally announced September 2023.

    Comments: Preprint to Pattern Recognition

  5. arXiv:2307.01020  [pdf, other]

    cs.CL

    Estimating Post-OCR Denoising Complexity on Numerical Texts

    Authors: Arthur Hemmer, Jérôme Brachat, Mickaël Coustaty, Jean-Marc Ogier

    Abstract: Post-OCR processing has significantly improved over the past few years. However, these have been primarily beneficial for texts consisting of natural, alphabetical words, as opposed to documents of numerical nature such as invoices, payslips, medical certificates, etc. To evaluate the OCR post-processing difficulty of these datasets, we propose a method to estimate the denoising complexity of a te…

    Submitted 3 July, 2023; originally announced July 2023.

    Comments: Accepted for publication in the ACIIDS 2023 CCIS PROCEEDINGS

  6. arXiv:2305.08455  [pdf, other]

    cs.CV cs.CL cs.LG

    Document Understanding Dataset and Evaluation (DUDE)

    Authors: Jordy Van Landeghem, Rubén Tito, Łukasz Borchmann, Michał Pietruszka, Paweł Józiak, Rafał Powalski, Dawid Jurkiewicz, Mickaël Coustaty, Bertrand Ackaert, Ernest Valveny, Matthew Blaschko, Sien Moens, Tomasz Stanisławek

    Abstract: We call on the Document AI (DocAI) community to reevaluate current methodologies and embrace the challenge of creating more practically-oriented benchmarks. Document Understanding Dataset and Evaluation (DUDE) seeks to remediate the halted research progress in understanding visually-rich documents (VRDs). We present a new dataset with novelties related to types of questions, answers, and document…

    Submitted 11 September, 2023; v1 submitted 15 May, 2023; originally announced May 2023.

    Comments: Accepted at ICCV 2023

  7. arXiv:2305.06923  [pdf, other]

    cs.CV

    EAML: Ensemble Self-Attention-based Mutual Learning Network for Document Image Classification

    Authors: Souhail Bakkali, Ziheng Ming, Mickael Coustaty, Marçal Rusiñol

    Abstract: In the recent past, complex deep neural networks have received huge interest in various document understanding tasks such as document image classification and document retrieval. As many document types have a distinct visual style, learning only visual features with deep CNNs to classify document images have encountered the problem of low inter-class discrimination, and high intra-class structural…

    Submitted 11 May, 2023; originally announced May 2023.

    Comments: Accepted at IJDAR 2021

  8. arXiv:2305.01054  [pdf, other]

    cs.DB cs.IR

    CHIC: Corporate Document for Visual question Answering

    Authors: Ibrahim Souleiman Mahamoud, Mickael Coustaty, Aurelie Joseph, Vincent Poulain d Andecy, Jean-Marc Ogier

    Abstract: The massive use of digital documents due to the substantial trend of paperless initiatives confronted some companies to find ways to process thousands of documents per day automatically. To achieve this, they use automatic information retrieval (IR) allowing them to extract useful information from large datasets quickly. In order to have effective IR methods, it is first necessary to have an adequ…

    Submitted 1 May, 2023; originally announced May 2023.

  9. arXiv:2302.05658  [pdf, other]

    cs.CL cs.AI cs.LG

    DocILE Benchmark for Document Information Localization and Extraction

    Authors: Štěpán Šimsa, Milan Šulc, Michal Uřičář, Yash Patel, Ahmed Hamdi, Matěj Kocián, Matyáš Skalický, Jiří Matas, Antoine Doucet, Mickaël Coustaty, Dimosthenis Karatzas

    Abstract: This paper introduces the DocILE benchmark with the largest dataset of business documents for the tasks of Key Information Localization and Extraction and Line Item Recognition. It contains 6.7k annotated business documents, 100k synthetically generated documents, and nearly 1M unlabeled documents for unsupervised pre-training. The dataset has been built with knowledge of domain- and task-specific…

    Submitted 3 May, 2023; v1 submitted 11 February, 2023; originally announced February 2023.

    Comments: Accepted to ICDAR 2023

  10. arXiv:2205.12029  [pdf, other]

    cs.CV

    VLCDoC: Vision-Language Contrastive Pre-Training Model for Cross-Modal Document Classification

    Authors: Souhail Bakkali, Zuheng Ming, Mickael Coustaty, Marçal Rusiñol, Oriol Ramos Terrades

    Abstract: Multimodal learning from document data has achieved great success lately as it allows to pre-train semantically meaningful features as a prior into a learnable downstream task. In this paper, we approach the document classification problem by learning cross-modal representations through language and vision cues, considering intra- and inter-modality relationships. Instead of merging features from…

    Submitted 11 May, 2023; v1 submitted 24 May, 2022; originally announced May 2022.

    Comments: Accepted at PR

  11. arXiv:2007.07547  [pdf, other]

    cs.CV cs.LG

    Evaluation of Neural Network Classification Systems on Document Stream

    Authors: Joris Voerman, Aurelie Joseph, Mickael Coustaty, Vincent Poulain d Andecy, Jean-Marc Ogier

    Abstract: One major drawback of state of the art Neural Networks (NN)-based approaches for document classification purposes is the large number of training samples required to obtain an efficient classification. The minimum required number is around one thousand annotated documents for each class. In many cases it is very difficult, if not impossible, to gather this number of samples in real industrial proc…

    Submitted 15 July, 2020; originally announced July 2020.

    Comments: 15 pages, 3 figures and submitted to DAS conferences 2020

    ACM Class: I.7.1; J.1

  12. arXiv:2002.11424  [pdf, other]

    cs.CV eess.IV

    Performance Evaluation of Deep Generative Models for Generating Hand-Written Character Images

    Authors: Tanmoy Mondal, LE Thi Thuy Trang, Mickaël Coustaty, Jean-Marc Ogier

    Abstract: There have been many work in the literature on generation of various kinds of images such as Hand-Written characters (MNIST dataset), scene images (CIFAR-10 dataset), various objects images (ImageNet dataset), road signboard images (SVHN dataset) etc. Unfortunately, there have been very limited amount of work done in the domain of document image processing. Automatic image generation can lead to t…

    Submitted 26 February, 2020; originally announced February 2020.

  13. Applying Semantic Web Technologies for Improving the Visibility of Tourism Data

    Authors: Fayrouz Soualah-Alila, Cyril Faucher, Frédéric Bertrand, Mickaël Coustaty, Antoine Doucet

    Abstract: Tourism industry is an extremely information-intensive, complex and dynamic activity. It can benefit from semantic Web technologies, due to the significant heterogeneity of information sources and the high volume of on-line data. The management of semantically diverse annotated tourism data is facilitated by ontologies that provide methods and standards, which allow flexibility and more intelligen…

    Submitted 15 November, 2015; originally announced November 2015.

    Comments: ESAIR: Exploiting Semantic Annotations in Information Retrieval, Oct 2015, Melbourne, Australia. 2015