Measuring Annotator Agreement Generally across Complex Structured, Multi-object, and Free-text Annotation Tasks

Braylan, Alexander; Alonso, Omar; Lease, Matthew

doi:10.1145/3485447.3512242


arXiv:2212.09503 (cs)

[Submitted on 15 Dec 2022]


Abstract: When annotators label data, a key metric for quality assurance is inter-annotator agreement (IAA): the extent to which annotators agree on their labels. Though many IAA measures exist for simple categorical and ordinal labeling tasks, relatively little work has considered more complex labeling tasks, such as structured, multi-object, and free-text annotations. Krippendorff's alpha, best known for use with simpler labeling tasks, does have a distance-based formulation with broader applicability, but little work has studied its efficacy and consistency across complex annotation tasks.
We investigate the design and evaluation of IAA measures for complex annotation tasks, with evaluation spanning seven diverse tasks: image bounding boxes, image keypoints, text sequence tagging, ranked lists, free text translations, numeric vectors, and syntax trees. We identify the difficulty of interpretability and the complexity of choosing a distance function as key obstacles in applying Krippendorff's alpha generally across these tasks. We propose two novel, more interpretable measures, showing they yield more consistent IAA measures across tasks and annotation distance functions.
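
For readers unfamiliar with the distance-based formulation of Krippendorff's alpha mentioned in the abstract, the following is a minimal sketch of the textbook definition, alpha = 1 - D_o / D_e, where D_o is the observed disagreement between annotations of the same item and D_e is the disagreement expected when annotations are paired regardless of item. It is not the authors' implementation and does not reproduce the two new measures proposed in the paper. The function names (`krippendorff_alpha`, `nominal`, `squared_diff`) and the input layout (one list of annotations per item) are assumptions made for illustration.

```python
from itertools import combinations


def krippendorff_alpha(items, distance):
    """Distance-based Krippendorff's alpha (illustrative sketch).

    items:    list of lists; each inner list holds the annotations
              that different annotators gave to one item.
    distance: function (a, b) -> non-negative number, symmetric,
              with distance(a, a) == 0. It plays the role of the
              squared difference function in the standard definition.
    """
    # Items with fewer than two annotations have no pairable values.
    units = [u for u in items if len(u) >= 2]
    n = sum(len(u) for u in units)
    if n < 2:
        raise ValueError("need at least one item with two annotations")

    # Observed disagreement: distances between annotations of the same
    # item, each item weighted by 1 / (m_u - 1). The factor 2 converts
    # unordered pairs into the ordered pairs used in the definition.
    d_o = 0.0
    for u in units:
        within = sum(distance(a, b) for a, b in combinations(u, 2))
        d_o += 2.0 * within / (len(u) - 1)
    d_o /= n

    # Expected disagreement: mean distance over all pairs of pooled
    # annotations, ignoring which item they belong to.
    pooled = [a for u in units for a in u]
    total = sum(distance(a, b) for a, b in combinations(pooled, 2))
    d_e = 2.0 * total / (n * (n - 1))
    if d_e == 0:
        raise ValueError("all annotations are identical; alpha is undefined")

    return 1.0 - d_o / d_e


def nominal(a, b):
    """Distance for categorical labels: 0 if equal, 1 otherwise."""
    return 0.0 if a == b else 1.0


def squared_diff(a, b):
    """Distance for interval-scaled numeric labels."""
    return float((a - b) ** 2)


if __name__ == "__main__":
    # Three items, two annotators each; the annotators disagree on the last item.
    labels = [[0, 0], [1, 1], [0, 1]]
    print(krippendorff_alpha(labels, nominal))
```

Swapping in a different `distance` (for example, one minus intersection-over-union for bounding boxes) is what lets the same formula apply to complex annotations; the abstract identifies the choice of that function as one of the key obstacles.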

Subjects:	Computation and Language (cs.CL); Machine Learning (cs.LG)
Cite as:	arXiv:2212.09503 [cs.CL]
	(or arXiv:2212.09503v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2212.09503
Related DOI:	https://doi.org/10.1145/3485447.3512242

Submission history

From: Alexander Braylan
[v1] Thu, 15 Dec 2022 20:12:48 UTC (4,698 KB)
