Semantic RGB-D Image Synthesis

Li, Shijie; Li, Rong; Gall, Juergen

arXiv:2308.11356v2 (cs)

[Submitted on 22 Aug 2023 (v1), last revised 19 Sep 2023 (this version, v2)]

Abstract: Collecting diverse sets of training images for RGB-D semantic image segmentation is not always possible. In particular, when robots need to operate in privacy-sensitive areas like homes, the collection is often limited to a small set of locations. As a consequence, the annotated images lack diversity in appearance, and approaches for RGB-D semantic image segmentation tend to overfit the training data. In this paper, we thus introduce semantic RGB-D image synthesis to address this problem. It requires synthesising a realistic-looking RGB-D image for a given semantic label map. Current approaches, however, are uni-modal and cannot cope with multi-modal data. Indeed, we show that extending uni-modal approaches to multi-modal data does not perform well. We therefore propose a generator for multi-modal data that separates the modal-independent information of the semantic layout from the modal-dependent information that is needed to generate an RGB and a depth image, respectively. Furthermore, we propose a discriminator that ensures semantic consistency between the label maps and the generated images, as well as perceptual similarity between the real and generated images. Our comprehensive experiments demonstrate that the proposed method outperforms previous uni-modal methods by a large margin and that the accuracy of an approach for RGB-D semantic segmentation can be significantly improved by mixing real and generated images during training.
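The last step mentioned in the abstract, mixing real and generated images during training, can be illustrated with a minimal sketch. The abstract does not state how the two sets are combined or in what proportion, so the function name `mix_real_and_generated`, the `generated_ratio` parameter, and the sample identifiers below are all hypothetical and are not taken from the paper.

```python
import random
from typing import List, Sequence, Tuple

# A training entry is (sample id, source tag). Both fields are hypothetical
# placeholders; a real pipeline would hold RGB, depth, and label-map paths.
Entry = Tuple[str, str]


def mix_real_and_generated(
    real: Sequence[str],
    generated: Sequence[str],
    generated_ratio: float,
    seed: int = 0,
) -> List[Entry]:
    """Build a shuffled training list from all real samples plus a subset
    of generated samples.

    generated_ratio is the number of generated samples added per real
    sample (an assumed knob, not a value reported in the paper).
    """
    if generated_ratio < 0:
        raise ValueError("generated_ratio must be non-negative")

    rng = random.Random(seed)
    # Never request more generated samples than are available.
    n_generated = min(len(generated), round(generated_ratio * len(real)))
    chosen = rng.sample(list(generated), n_generated)

    mixed = [(s, "real") for s in real] + [(s, "generated") for s in chosen]
    rng.shuffle(mixed)
    return mixed


if __name__ == "__main__":
    real_ids = [f"real_{i}" for i in range(4)]
    generated_ids = [f"gen_{i}" for i in range(10)]
    for entry in mix_real_and_generated(real_ids, generated_ids, 0.5):
        print(entry)
```

Keeping every real sample and only sub-sampling the generated ones is one possible design choice; other schemes, such as per-batch mixing, would fit the abstract's description equally well.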

Comments:	ICCV Workshop on Representation Learning with Very Limited Images 2023
Subjects:	Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
Cite as:	arXiv:2308.11356 [cs.CV]
	(or arXiv:2308.11356v2 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2308.11356

Submission history

From: Rong Li
[v1] Tue, 22 Aug 2023 11:16:24 UTC (21,475 KB)
[v2] Tue, 19 Sep 2023 02:52:19 UTC (5,041 KB)
