Predicting Visual Attention in Graphic Design Documents

Chakraborty, Souradeep; Wei, Zijun; Kelton, Conor; Ahn, Seoyoung; Balasubramanian, Aruna; Zelinsky, Gregory J.; Samaras, Dimitris

doi:10.1109/TMM.2022.3176942

arXiv:2407.02439 (cs)

[Submitted on 2 Jul 2024]

Title: Predicting Visual Attention in Graphic Design Documents

Authors: Souradeep Chakraborty, Zijun Wei, Conor Kelton, Seoyoung Ahn, Aruna Balasubramanian, Gregory J. Zelinsky, Dimitris Samaras

Abstract: We present a model for predicting visual attention during free viewing of graphic design documents. While existing work on this topic has aimed at predicting the static saliency of graphic designs, ours is the first attempt to use a deep-learning-based model to predict both spatial attention and the temporal order in which document regions are fixated. We propose a two-stage model for predicting dynamic attention on such documents, using webpages as our primary document type for demonstration. In the first stage, we predict a saliency map for each document component (e.g., logos, banners, and text for webpages), conditioned on the type of document layout. These component saliency maps are then used jointly to predict the overall document saliency. In the second stage, we use these layout-specific component saliency maps as the state representation for an inverse reinforcement learning model that predicts fixation scanpaths during document viewing. To test our model, we collected a new dataset of eye movements from 41 people freely viewing 450 webpages (the largest dataset of its kind). Experimental results show that our model outperforms existing models in both saliency and scanpath prediction for webpages, and that it also generalizes very well to other graphic design documents, such as comics, posters, and mobile UIs, as well as to natural images.
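
As a rough illustration of the data flow described in the abstract, the sketch below blends per-component saliency maps into a single document-level map and also keeps them stacked as a multi-channel array of the kind that could serve as a state representation. This is not the authors' implementation: the function name `combine_component_saliency`, the fixed per-component weights, and the weighted-sum combination are all assumptions made for illustration, and the abstract does not specify how the maps are actually combined.

```python
import numpy as np


def combine_component_saliency(component_maps, weights):
    """Blend per-component saliency maps into one document-level map.

    Hypothetical helper: the weighted sum is an assumed stand-in for the
    learned, layout-conditioned combination described in the abstract.
    """
    names = sorted(component_maps)  # fixed channel order
    stacked = np.stack([component_maps[n] for n in names], axis=0)  # (C, H, W)
    w = np.array([weights[n] for n in names], dtype=float)  # (C,)
    combined = np.tensordot(w, stacked, axes=1)  # weighted sum over C -> (H, W)
    total = combined.sum()
    if total > 0:
        combined = combined / total  # normalize to a probability map
    return combined, stacked


# Toy example with random maps for three webpage components.
rng = np.random.default_rng(0)
maps = {
    "logo": rng.random((4, 6)),
    "banner": rng.random((4, 6)),
    "text": rng.random((4, 6)),
}
weights = {"logo": 0.5, "banner": 0.3, "text": 0.2}  # assumed, not from the paper

doc_saliency, state = combine_component_saliency(maps, weights)
print(doc_saliency.shape, state.shape)
```

Here `doc_saliency` plays the role of the overall document saliency, while `state` keeps one channel per component, mirroring the idea of passing component-level maps to a downstream scanpath model.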

Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2407.02439 [cs.CV]
	(or arXiv:2407.02439v1 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2407.02439
Journal reference:	IEEE Transactions on Multimedia 25 (2022): 4478-4493
Related DOI:	https://doi.org/10.1109/TMM.2022.3176942

Submission history

From: Souradeep Chakraborty
[v1] Tue, 2 Jul 2024 17:15:21 UTC (26,096 KB)
