Hi Model, generating 'nice' instead of 'good' is not as bad as generating 'rice'! Towards Context and Semantic Infused Dialogue Generation Loss Function and Evaluation Metric

Tiwari, Abhisek; Sinan, Muhammed; Roy, Kaushik; Sheth, Amit; Saha, Sriparna; Bhattacharyya, Pushpak

arXiv:2309.05804 (cs)

[Submitted on 11 Sep 2023 (v1), last revised 29 May 2024 (this version, v2)]

Abstract: Over the past two decades, dialogue modeling has made significant strides, moving from simple rule-based responses to personalized and persuasive response generation. Despite these advances, the objective functions and evaluation metrics for dialogue generation have remained stagnant. Lexical-based objectives and metrics such as cross-entropy and BLEU have two key limitations: (a) word-to-word matching without semantic consideration: they penalize generating "nice" and generating "rice" in place of "good" equally, and (b) no context attribute when evaluating the generated response: even if a generated response is relevant to the ongoing dialogue context, it may still be penalized for not matching the gold utterance provided in the corpus. In this paper, we first investigate these limitations comprehensively and then propose a new loss function, the Semantic Infused Contextualized diaLogue (SemTextualLogue) loss. We also formulate an evaluation metric called Dialuation, which incorporates both context and semantic relevance. We experimented with both non-pretrained and pre-trained models on two dialogue corpora, encompassing task-oriented and open-domain scenarios. Dialogue generation models trained with the SemTextualLogue loss attained superior performance compared to those trained with the traditional cross-entropy loss. The findings establish that effective training of a dialogue generation model hinges significantly on incorporating semantics and context. This pattern is mirrored in the Dialuation metric, which, by considering both context and semantics, correlates more strongly with human evaluation than traditional metrics do.
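
Limitation (a) can be illustrated with a toy calculation. The sketch below is not the paper's SemTextualLogue loss, whose formulation is not given in the abstract. It only shows that token-level cross-entropy gives the same loss whether the model's probability mass lands on "nice" or on "rice" when the gold token is "good", while a similarity-weighted penalty separates the two cases. The three-word vocabulary, the 2-d embedding values, and the function `expected_semantic_distance` are all assumptions made up for this illustration.

```python
import math

# Hand-made 2-d embeddings for a three-word vocabulary (assumed values, illustration only).
EMB = {
    "good": (1.0, 0.1),
    "nice": (0.9, 0.2),
    "rice": (0.0, 1.0),
}

def cross_entropy(probs, gold):
    """Token-level cross-entropy: depends only on the probability of the gold token."""
    return -math.log(probs[gold])

def cosine(u, v):
    """Cosine similarity between two 2-d vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.hypot(*u) * math.hypot(*v))

def expected_semantic_distance(probs, gold):
    """Hypothetical semantic penalty: expected (1 - cosine) between predicted and gold token."""
    return sum(p * (1.0 - cosine(EMB[w], EMB[gold])) for w, p in probs.items())

# Two predicted distributions that give the gold token "good" the same probability,
# but place the remaining mass on a near-synonym versus an unrelated word.
pred_nice = {"good": 0.2, "nice": 0.7, "rice": 0.1}
pred_rice = {"good": 0.2, "nice": 0.1, "rice": 0.7}

ce_nice = cross_entropy(pred_nice, "good")
ce_rice = cross_entropy(pred_rice, "good")
sem_nice = expected_semantic_distance(pred_nice, "good")
sem_rice = expected_semantic_distance(pred_rice, "good")

print(f"cross-entropy:     nice-heavy={ce_nice:.3f}  rice-heavy={ce_rice:.3f}")
print(f"semantic distance: nice-heavy={sem_nice:.3f}  rice-heavy={sem_rice:.3f}")
```

With these made-up embeddings, the two cross-entropy values are identical, whereas the semantic penalty is smaller for the prediction that favors "nice". A context-aware term, limitation (b), would additionally need a representation of the dialogue history and is not sketched here.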

Subjects:	Computation and Language (cs.CL)
Cite as:	arXiv:2309.05804 [cs.CL]
	(or arXiv:2309.05804v2 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2309.05804

Submission history

From: Abhisek Tiwari [view email]
[v1] Mon, 11 Sep 2023 20:16:38 UTC (7,551 KB)
[v2] Wed, 29 May 2024 18:17:12 UTC (769 KB)
