Showing 1–1 of 1 results for author: Mahmood, M F F B

  1. arXiv:2405.11985  [pdf, other]

    cs.CV

    MTVQA: Benchmarking Multilingual Text-Centric Visual Question Answering

    Authors: Jingqun Tang, Qi Liu, Yongjie Ye, Jinghui Lu, Shu Wei, Chunhui Lin, Wanqing Li, Mohamad Fitri Faiz Bin Mahmood, Hao Feng, Zhen Zhao, Yanjie Wang, Yuliang Liu, Hao Liu, Xiang Bai, Can Huang

    Abstract: Text-Centric Visual Question Answering (TEC-VQA) in its proper format not only facilitates human-machine interaction in text-centric visual environments but also serves as a de facto gold proxy to evaluate AI models in the domain of text-centric scene understanding. Nonetheless, most existing TEC-VQA benchmarks have focused on high-resource languages like English and Chinese. Despite pioneering wo…

    Submitted 11 June, 2024; v1 submitted 20 May, 2024; originally announced May 2024.