Token-based Decision Criteria Are Suboptimal in In-context Learning

Cho, Hakaze; Sakai, Yoshihiro; Kato, Mariko; Tanaka, Kenshiro; Ishii, Akira; Inoue, Naoya

arXiv:2406.16535 (cs)

[Submitted on 24 Jun 2024]

Abstract: In-Context Learning (ICL) typically derives its classification criteria from the probabilities of manually selected label tokens. However, we argue that such token-based criteria lead to suboptimal decision boundaries, even after delicate calibration through translation and constrained rotation. To address this problem, we propose Hidden Calibration, which abandons token probabilities and instead applies a nearest centroid classifier to the LM's last hidden states. Specifically, a centroid for each category is first estimated from a few-shot calibration set, and each test sample is then assigned the category of its nearest centroid as the predicted label. Our experiments on 3 models and 10 classification datasets indicate that Hidden Calibration consistently outperforms current token-based calibrations by about 20%. Further analysis shows that Hidden Calibration finds better classification criteria with less inter-category overlap, and that LMs, with the help of demonstrations, produce linearly separable intra-category clusters, which supports Hidden Calibration and offers new insights into conventional ICL.
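
The nearest centroid step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `fit_centroids` and `predict_nearest_centroid` are hypothetical, Euclidean distance is an assumption (the abstract does not name a metric), and synthetic Gaussian vectors stand in for the LM's last hidden states, which would in practice be extracted from the model for each prompt.

```python
import numpy as np

def fit_centroids(hidden_states, labels):
    """Average the calibration vectors of each category into one centroid."""
    hidden_states = np.asarray(hidden_states, dtype=float)
    labels = np.asarray(labels)
    classes = np.unique(labels)
    centroids = np.stack(
        [hidden_states[labels == c].mean(axis=0) for c in classes]
    )
    return classes, centroids

def predict_nearest_centroid(hidden_states, classes, centroids):
    """Assign each vector the category of its closest centroid."""
    hidden_states = np.asarray(hidden_states, dtype=float)
    # Pairwise distances, shape (n_samples, n_classes).
    # Euclidean distance is an assumption of this sketch.
    diffs = hidden_states[:, None, :] - centroids[None, :, :]
    dists = np.linalg.norm(diffs, axis=-1)
    return classes[dists.argmin(axis=1)]

# Synthetic stand-in for last hidden states of a few-shot calibration set:
# two categories, 8 samples each, 4 dimensions (all values are made up).
rng = np.random.default_rng(0)
calib_x = np.concatenate([
    rng.normal(-2.0, 0.5, size=(8, 4)),
    rng.normal(2.0, 0.5, size=(8, 4)),
])
calib_y = np.array([0] * 8 + [1] * 8)

classes, centroids = fit_centroids(calib_x, calib_y)

# Two synthetic "test" vectors, one near each cluster.
test_x = np.array([[-2.0] * 4, [2.0] * 4])
predictions = predict_nearest_centroid(test_x, classes, centroids)
print(predictions)
```

The sketch never consults label-token probabilities: the decision depends only on where a vector lies relative to the per-category centroids, which is the contrast the abstract draws with token-based criteria.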

Comments:	21 pages, 14 figures, 8 tables
Subjects:	Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
Cite as:	arXiv:2406.16535 [cs.CL]
	(or arXiv:2406.16535v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2406.16535

Submission history

From: Cho Hakaze
[v1] Mon, 24 Jun 2024 11:16:26 UTC (2,941 KB)
