IMDB-WIKI-SbS: An Evaluation Dataset for Crowdsourced Pairwise Comparisons

Pavlichenko, Nikita; Ustalov, Dmitry

arXiv:2110.14990 (cs)

[Submitted on 28 Oct 2021 (v1), last revised 26 Nov 2021 (this version, v2)]

Abstract: Today, comprehensive evaluation of large-scale machine learning models is possible thanks to open datasets produced using crowdsourcing, such as SQuAD, MS COCO, ImageNet, and SuperGLUE. These datasets capture objective responses and assume a single correct answer, so they cannot capture subjective human perception. In contrast, pairwise comparison tasks, in which one has to choose between only two options, make it possible to take people's preferences into account in very challenging artificial intelligence tasks, such as information retrieval and recommender system evaluation. Unfortunately, the available datasets are either small or proprietary, which slows progress in gathering better feedback from human users. In this paper, we present IMDB-WIKI-SbS, a new large-scale dataset for evaluating pairwise comparisons. It contains 9,150 images appearing in 250,249 pairs annotated on a crowdsourcing platform. Our dataset has balanced age and gender distributions and uses the well-known IMDB-WIKI dataset as ground truth. We describe how our dataset is built and then compare several baseline methods, indicating its suitability for model evaluation.
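
The abstract does not say which baseline methods are compared. As a rough illustration of what aggregating crowdsourced pairwise comparisons can look like, the sketch below fits a Bradley-Terry model with the standard minorization-maximization update. The choice of Bradley-Terry, the function name bradley_terry, and the item identifiers are assumptions made for this example and are not taken from the paper.

```python
from collections import defaultdict


def bradley_terry(comparisons, n_iter=200, tol=1e-9):
    """Estimate Bradley-Terry scores from (winner, loser) pairs.

    Hypothetical helper for illustration only; not the paper's code.
    Returns a dict mapping each item to a score, with scores summing to 1.
    """
    wins = defaultdict(float)                         # total wins per item
    counts = defaultdict(lambda: defaultdict(float))  # comparisons per pair
    for winner, loser in comparisons:
        if winner == loser:
            continue  # a self-comparison carries no information
        wins[winner] += 1.0
        counts[winner][loser] += 1.0
        counts[loser][winner] += 1.0

    items = sorted(counts)
    if not items:
        return {}

    # Start from a uniform score vector.
    scores = {i: 1.0 / len(items) for i in items}
    for _ in range(n_iter):
        updated = {}
        for i in items:
            # MM update: wins_i / sum_j n_ij / (p_i + p_j)
            denom = sum(n / (scores[i] + scores[j])
                        for j, n in counts[i].items())
            updated[i] = wins[i] / denom
        total = sum(updated.values())
        updated = {i: s / total for i, s in updated.items()}
        delta = max(abs(updated[i] - scores[i]) for i in items)
        scores = updated
        if delta < tol:
            break
    return scores


if __name__ == "__main__":
    # Hypothetical annotations: each tuple is (chosen item, other item).
    votes = [("img_a", "img_b")] * 3 + [("img_b", "img_a")]
    print(bradley_terry(votes))
```

Sorting the returned scores gives a ranking of the items that can then be checked against a ground-truth ordering. Items that never win receive a score of zero under this plain maximum-likelihood update, so a practical setup might add smoothing.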

Comments:	Accepted to NeurIPS Data-Centric AI Workshop
Subjects:	Human-Computer Interaction (cs.HC)
Cite as:	arXiv:2110.14990 [cs.HC]
	(or arXiv:2110.14990v2 [cs.HC] for this version)
	https://doi.org/10.48550/arXiv.2110.14990

Submission history

From: Nikita Pavlichenko
[v1] Thu, 28 Oct 2021 10:05:13 UTC (11 KB)
[v2] Fri, 26 Nov 2021 13:19:56 UTC (11 KB)
