Unveiling Key Aspects of Fine-Tuning in Sentence Embeddings: A Representation Rank Analysis

Jung, Euna; Kim, Jaeill; Ko, Jungmin; Park, **woo; Rhee, Wonjong

arXiv:2405.11297 (cs)

[Submitted on 18 May 2024]

Abstract: The latest advancements in unsupervised learning of sentence embeddings predominantly involve employing contrastive learning-based (CL-based) fine-tuning over pre-trained language models. In this study, we analyze the latest sentence embedding methods by adopting representation rank as the primary tool of analysis. We first define Phase 1 and Phase 2 of fine-tuning based on when representation rank peaks. Utilizing these phases, we conduct a thorough analysis and obtain essential findings across key aspects, including alignment and uniformity, linguistic abilities, and correlation between performance and rank. For instance, we find that the dynamics of the key aspects can undergo significant changes as fine-tuning transitions from Phase 1 to Phase 2. Based on these findings, we experiment with a rank reduction (RR) strategy that facilitates rapid and stable fine-tuning of the latest CL-based methods. Through empirical investigations, we showcase the efficacy of RR in enhancing the performance and stability of five state-of-the-art sentence embedding methods.
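The abstract names representation rank, alignment, and uniformity without defining them, so the following is only a minimal sketch under stated assumptions. It assumes rank is measured as the effective rank (the exponential of the entropy of the normalized singular values), and it uses the commonly cited definitions of alignment and uniformity for L2-normalized embeddings. The paper may use different measures, and the function names here are hypothetical.

```python
import numpy as np


def l2_normalize(x):
    # Project each embedding onto the unit hypersphere.
    x = np.asarray(x, dtype=float)
    return x / np.linalg.norm(x, axis=1, keepdims=True)


def effective_rank(embeddings, eps=1e-12):
    # ASSUMPTION: rank is measured as exp(entropy of normalized singular values).
    # `embeddings` has shape (n_sentences, dim).
    s = np.linalg.svd(np.asarray(embeddings, dtype=float), compute_uv=False)
    p = s / (s.sum() + eps)
    p = p[p > eps]  # drop numerically zero directions before taking the log
    return float(np.exp(-np.sum(p * np.log(p))))


def alignment(x, y, alpha=2):
    # Mean distance (to the power alpha) between paired positive embeddings.
    return float((np.linalg.norm(x - y, axis=1) ** alpha).mean())


def uniformity(x, t=2):
    # Log of the mean Gaussian potential over all distinct pairs.
    sq_dists = ((x[:, None, :] - x[None, :, :]) ** 2).sum(-1)
    upper = np.triu_indices(len(x), k=1)
    return float(np.log(np.exp(-t * sq_dists[upper]).mean()))


# Illustrative usage on random data (not results from the paper).
rng = np.random.default_rng(0)
emb = l2_normalize(rng.normal(size=(256, 32)))
noisy = l2_normalize(emb + 0.05 * rng.normal(size=emb.shape))
print(effective_rank(emb), alignment(emb, noisy), uniformity(emb))
```

Under these assumptions, tracking `effective_rank` on a fixed batch of sentences at each checkpoint would give a curve whose peak could separate the two phases the abstract describes.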

Subjects: Computation and Language (cs.CL)
Cite as: arXiv:2405.11297 [cs.CL] (or arXiv:2405.11297v1 [cs.CL] for this version)
DOI: https://doi.org/10.48550/arXiv.2405.11297

Submission history

From: Jaeill Kim
[v1] Sat, 18 May 2024 13:51:27 UTC (4,966 KB)
