Non-Autoregressive Image Captioning with Counterfactuals-Critical Multi-Agent Learning

Guo, Longteng; Liu, Jing; Zhu, Xinxin; He, Xingjian; Jiang, Jie; Lu, Hanqing

arXiv:2005.04690 (cs)

[Submitted on 10 May 2020]

Abstract: Most image captioning models are autoregressive, i.e. they generate each word by conditioning on previously generated words, which leads to heavy latency during inference. Recently, non-autoregressive decoding has been proposed in machine translation to speed up inference by generating all words in parallel. Typically, these models use a word-level cross-entropy loss to optimize each word independently. However, such a learning process fails to consider sentence-level consistency, which results in inferior generation quality for these non-autoregressive models. In this paper, we propose a Non-Autoregressive Image Captioning (NAIC) model with a novel training paradigm: Counterfactuals-critical Multi-Agent Learning (CMAL). CMAL formulates NAIC as a multi-agent reinforcement learning system in which positions in the target sequence are viewed as agents that learn to cooperatively maximize a sentence-level reward. In addition, we propose to utilize massive unlabeled images to boost captioning performance. Extensive experiments on the MSCOCO image captioning benchmark show that our NAIC model achieves performance comparable to state-of-the-art autoregressive models while bringing a 13.9x decoding speedup.
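
The abstract does not spell out how each position's contribution to the sentence-level reward is estimated. As a rough illustration only, the sketch below assumes a generic counterfactual baseline of the kind used in multi-agent reinforcement learning: for each position, the reward of the sampled sentence is compared with the expected reward when only that position's word is replaced and all other positions are held fixed. The names `counterfactual_advantages` and `toy_reward`, the tiny three-word vocabulary, and the position-matching reward are all hypothetical and are not taken from the paper.

```python
from typing import Callable, List, Sequence


def counterfactual_advantages(
    sampled: Sequence[int],
    probs: Sequence[Sequence[float]],
    reward_fn: Callable[[Sequence[int]], float],
) -> List[float]:
    """Per-position advantage under an assumed counterfactual baseline.

    sampled:   word ids chosen at each position (one "agent" per position).
    probs:     probs[i][w] is the probability of word w at position i.
    reward_fn: sentence-level reward (a stand-in for a captioning metric).
    """
    base_reward = reward_fn(sampled)
    advantages = []
    for i, dist in enumerate(probs):
        # Expected reward if only position i changes its word while the
        # words at all other positions stay as sampled.
        baseline = 0.0
        for word, p in enumerate(dist):
            alternative = list(sampled)
            alternative[i] = word
            baseline += p * reward_fn(alternative)
        advantages.append(base_reward - baseline)
    return advantages


def toy_reward(sentence: Sequence[int], reference: Sequence[int] = (2, 0, 1)) -> float:
    """Hypothetical reward: fraction of positions that match a reference."""
    matches = sum(int(a == b) for a, b in zip(sentence, reference))
    return matches / len(reference)


if __name__ == "__main__":
    uniform = [[1 / 3, 1 / 3, 1 / 3]] * 3
    print(counterfactual_advantages([2, 1, 1], uniform, toy_reward))
```

With the sampled sentence `[2, 1, 1]` and the reference `(2, 0, 1)`, positions 0 and 2 receive a positive advantage because their words match the reference, and position 1 receives a negative one. A single shared sentence-level reward would not separate the positions in this way, which is the motivation for a per-agent baseline in this sketch.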

Comments:	IJCAI 2020 (copyright held by IJCAI)
Subjects:	Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
Cite as:	arXiv:2005.04690 [cs.CL]
	(or arXiv:2005.04690v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2005.04690

Submission history

From: Longteng Guo
[v1] Sun, 10 May 2020 15:09:44 UTC (338 KB)
