Showing 1–6 of 6 results for author: Alkheir, A A

  1. arXiv:2312.00699

    cs.CV cs.IR

    Rethinking Detection Based Table Structure Recognition for Visually Rich Document Images

    Authors: Bin Xiao, Murat Simsek, Burak Kantarci, Ala Abu Alkheir

    Abstract: Table Structure Recognition (TSR) is a widely discussed task aiming at transforming unstructured table images into structured formats, such as HTML sequences, to make text-only models, such as ChatGPT, that can further process these tables. One type of solution is using detection models to detect table components, such as columns and rows, then applying a rule-based post-processing method to conve…

    Submitted 10 January, 2024; v1 submitted 1 December, 2023; originally announced December 2023.

    Comments: under review

  2. Table Detection for Visually Rich Document Images

    Authors: Bin Xiao, Murat Simsek, Burak Kantarci, Ala Abu Alkheir

    Abstract: Table Detection (TD) is a fundamental task to enable visually rich document understanding, which requires the model to extract information without information loss. However, popular Intersection over Union (IoU) based evaluation metrics and IoU-based loss functions for the detection models cannot directly represent the degree of information loss for the prediction results. Therefore, we propose to…

    Submitted 26 October, 2023; v1 submitted 30 May, 2023; originally announced May 2023.

    Comments: Accepted by Knowledge-Based Systems

  3. arXiv:2305.04833

    cs.IR cs.CV

    Revisiting Table Detection Datasets for Visually Rich Documents

    Authors: Bin Xiao, Murat Simsek, Burak Kantarci, Ala Abu Alkheir

    Abstract: Table Detection has become a fundamental task for visually rich document understanding with the surging number of electronic documents. However, popular public datasets widely used in related studies have inherent limitations, including noisy and inconsistent samples, limited training samples, and limited data sources. These limitations make these datasets unreliable to evaluate the model performa…

    Submitted 8 November, 2023; v1 submitted 3 May, 2023; originally announced May 2023.

  4. arXiv:2211.02128

    cs.CV cs.IR

    Efficient Information Sharing in ICT Supply Chain Social Network via Table Structure Recognition

    Authors: Bin Xiao, Yakup Akkaya, Murat Simsek, Burak Kantarci, Ala Abu Alkheir

    Abstract: The global Information and Communications Technology (ICT) supply chain is a complex network consisting of all types of participants. It is often formulated as a Social Network to discuss the supply chain network's relations, properties, and development in supply chain management. Information sharing plays a crucial role in improving the efficiency of the supply chain, and datasheets are the most…

    Submitted 3 November, 2022; originally announced November 2022.

    Comments: Globecom 2022

  5. Handling big tabular data of ICT supply chains: a multi-task, machine-interpretable approach

    Authors: Bin Xiao, Murat Simsek, Burak Kantarci, Ala Abu Alkheir

    Abstract: Due to the characteristics of Information and Communications Technology (ICT) products, the critical information of ICT devices is often summarized in big tabular data shared across supply chains. Therefore, it is critical to automatically interpret tabular structures with the surging amount of electronic assets. To transform the tabular data in electronic documents into a machine-interpretable fo…

    Submitted 11 August, 2022; originally announced August 2022.

    Comments: 6 pages, 7 tables, 4 figures, IEEE Global Communications Conference (Globecom), 2022

  6. arXiv:2203.03819

    cs.CV cs.IR

    Table Structure Recognition with Conditional Attention

    Authors: Bin Xiao, Murat Simsek, Burak Kantarci, Ala Abu Alkheir

    Abstract: Tabular data in digital documents is widely used to express compact and important information for readers. However, it is challenging to parse tables from unstructured digital documents, such as PDFs and images, into machine-readable format because of the complexity of table structures and the missing of meta-information. Table Structure Recognition (TSR) problem aims to recognize the structure of…

    Submitted 7 March, 2022; originally announced March 2022.

    Comments: IJDAR under review