Ranger: A Toolkit for Effect-Size Based Multi-Task Evaluation

Sertkan, Mete; Althammer, Sophia; Hofstätter, Sebastian

arXiv:2305.15048 (cs)

[Submitted on 24 May 2023]

Abstract: In this paper, we introduce Ranger, a toolkit that facilitates effect-size-based meta-analysis for multi-task evaluation in NLP and IR. We observed that our communities often face the challenge of aggregating results over incomparable metrics and scenarios, which makes conclusions and take-away messages less reliable. With Ranger, we aim to address this issue by providing a task-agnostic toolkit that combines the effect of a treatment on multiple tasks into one statistical evaluation, allowing metrics to be compared and an overall summary effect to be computed. The toolkit produces publication-ready forest plots that communicate evaluation results over multiple tasks clearly. Our goal with the ready-to-use Ranger toolkit is to promote robust, effect-size-based evaluation and to improve evaluation standards in the community. We provide two case studies for common IR and NLP settings to highlight Ranger's benefits.
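
To make the idea of an "overall summary effect" concrete, the following is a minimal sketch of how per-task effect sizes could be pooled in a meta-analysis. It is not Ranger's API: the function name `random_effects_summary`, its parameters, and the example numbers are hypothetical, and the choice of a DerSimonian-Laird random-effects estimator is an assumption, since the abstract does not say which estimator the toolkit uses.

```python
import math


def random_effects_summary(effects, variances, z=1.96):
    """Pool per-task effect sizes with a DerSimonian-Laird random-effects model.

    effects   -- one effect size per task (hypothetical inputs)
    variances -- sampling variance of each effect size
    z         -- normal quantile for the confidence interval (1.96 ~ 95%)
    """
    if len(effects) != len(variances) or len(effects) < 2:
        raise ValueError("need at least two tasks with matching variances")

    # Fixed-effect (inverse-variance) weights and weighted mean.
    w = [1.0 / v for v in variances]
    sum_w = sum(w)
    fixed_mean = sum(wi * yi for wi, yi in zip(w, effects)) / sum_w

    # Cochran's Q measures how much the task effects disagree.
    q = sum(wi * (yi - fixed_mean) ** 2 for wi, yi in zip(w, effects))
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w

    # Between-task variance estimate, truncated at zero.
    tau2 = max(0.0, (q - (len(effects) - 1)) / c)

    # Random-effects weights add tau2 to each task's own variance.
    w_star = [1.0 / (v + tau2) for v in variances]
    sum_w_star = sum(w_star)
    summary = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum_w_star
    se = math.sqrt(1.0 / sum_w_star)

    return {
        "summary": summary,
        "tau2": tau2,
        "ci_low": summary - z * se,
        "ci_high": summary + z * se,
    }


if __name__ == "__main__":
    # Made-up effect sizes and variances for three tasks, for illustration only.
    result = random_effects_summary([0.30, 0.10, 0.25], [0.010, 0.020, 0.015])
    print(result)
```

Because each task contributes a unitless effect size rather than a raw metric value, tasks scored with different metrics can enter the same pooled estimate; the confidence interval is what a forest plot would draw for the summary row.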

Comments:	Accepted at ACL 2023 (System Demonstrations)
Subjects:	Computation and Language (cs.CL); Information Retrieval (cs.IR)
Cite as:	arXiv:2305.15048 [cs.CL]
	(or arXiv:2305.15048v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2305.15048

Submission history

From: Mete Sertkan
[v1] Wed, 24 May 2023 11:38:39 UTC (9,152 KB)
