
Self-Supervised Spatially Variant PSF Estimation for Aberration-Aware Depth-from-Defocus

Authors: Zhuofeng Wu, Yusuke Monno, Masatoshi Okutomi

Abstract: In this paper, we address the task of aberration-aware depth-from-defocus (DfD), which takes account of spatially variant point spread functions (PSFs) of a real camera. To effectively obtain the spatially variant PSFs of a real camera without requiring any ground-truth PSFs, we propose a novel self-supervised learning method that leverages the pair of real sharp and blurred images, which can be easily captured by changing the aperture setting of the camera. In our PSF estimation, we assume rotationally symmetric PSFs and introduce the polar coordinate system to more accurately learn the PSF estimation network. We also handle the focus breathing phenomenon that occurs in real DfD situations. Experimental results on synthetic and real data demonstrate the effectiveness of our method regarding both the PSF estimation and the depth estimation.

Submitted 28 February, 2024; originally announced February 2024.

Journal ref: International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2024

arXiv:2209.06027 [pdf, other]

Two-Step Color-Polarization Demosaicking Network

Authors: Vy Nguyen, Masayuki Tanaka, Yusuke Monno, Masatoshi Okutomi

Abstract: Polarization information of light in a scene is valuable for various image processing and computer vision tasks. A division-of-focal-plane polarimeter is a promising approach to capture the polarization images of different orientations in one shot, while it requires color-polarization demosaicking. In this paper, we propose a two-step color-polarization demosaicking network (TCPDNet), which consists of two sub-tasks of color demosaicking and polarization demosaicking. We also introduce a reconstruction loss in the YCbCr color space to improve the performance of TCPDNet. Experimental comparisons demonstrate that TCPDNet outperforms existing methods in terms of the image quality of polarization images and the accuracy of Stokes parameters.

Submitted 13 September, 2022; originally announced September 2022.

Comments: Accepted in ICIP2022. Project page: http://www.ok.sc.e.titech.ac.jp/res/PolarDem/TCPDNet.html
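
Note: the TCPDNet abstract above evaluates accuracy on Stokes parameters. As a minimal sketch of how those quantities are conventionally derived from the four polarizer orientations (0°, 45°, 90°, 135°) of a division-of-focal-plane sensor, the textbook linear Stokes relations can be written as below. This is not taken from the paper; the function names are hypothetical, and inputs are per-pixel intensities after demosaicking.

```python
import math

def stokes_from_intensities(i0, i45, i90, i135):
    """Linear Stokes parameters from four polarizer-orientation intensities.

    Standard textbook relations (an assumption here, not the paper's code):
      S0 = (I0 + I45 + I90 + I135) / 2   total intensity
      S1 = I0 - I90                      horizontal vs. vertical
      S2 = I45 - I135                    +45 deg vs. -45 deg
    """
    s0 = 0.5 * (i0 + i45 + i90 + i135)
    s1 = i0 - i90
    s2 = i45 - i135
    return s0, s1, s2

def dolp_aolp(s0, s1, s2):
    """Degree and angle (radians) of linear polarization from Stokes parameters."""
    dolp = math.hypot(s1, s2) / s0 if s0 > 0 else 0.0
    aolp = 0.5 * math.atan2(s2, s1)
    return dolp, aolp

# Example pixel: fully horizontally polarized light of unit intensity.
s0, s1, s2 = stokes_from_intensities(1.0, 0.5, 0.0, 0.5)
dolp, aolp = dolp_aolp(s0, s1, s2)
```

Errors in the demosaicked intensities propagate directly into S1 and S2 through the differences, which is one reason Stokes accuracy is a common evaluation target for polarization demosaicking.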

arXiv:2111.03615 [pdf, other]

Single Image Deraining Network with Rain Embedding Consistency and Layered LSTM

Authors: Yizhou Li, Yusuke Monno, Masatoshi Okutomi

Abstract: Single image deraining is typically addressed as residual learning to predict the rain layer from an input rainy image. For this purpose, an encoder-decoder network draws wide attention, where the encoder is required to encode a high-quality rain embedding which determines the performance of the subsequent decoding stage to reconstruct the rain layer. However, most existing studies ignore the significance of rain embedding quality, thus leading to limited performance with over/under-deraining. In this paper, with our observation of the high rain layer reconstruction performance by a rain-to-rain autoencoder, we introduce the idea of "Rain Embedding Consistency" by regarding the encoded embedding by the autoencoder as an ideal rain embedding and aim at enhancing the deraining performance by improving the consistency between the ideal rain embedding and the rain embedding derived by the encoder of the deraining network. To achieve this, a Rain Embedding Loss is applied to directly supervise the encoding process, with a Rectified Local Contrast Normalization (RLCN) as the guide that effectively extracts the candidate rain pixels. We also propose Layered LSTM for recurrent deraining and fine-grained encoder feature refinement considering different scales. Qualitative and quantitative experiments demonstrate that our proposed method outperforms previous state-of-the-art methods particularly on a real-world dataset. Our source code is available at http://www.ok.sc.e.titech.ac.jp/res/SIR/.

Submitted 5 November, 2021; originally announced November 2021.

Comments: Accepted by WACV2022, January 2022

arXiv:2012.10083 [pdf, other]

Spectral Reflectance Estimation Using Projector with Unknown Spectral Power Distribution

Authors: Hironori Hidaka, Yusuke Monno, Masatoshi Okutomi

Abstract: A lighting-based multispectral imaging system using an RGB camera and a projector is one of the most practical and low-cost systems to acquire multispectral observations for estimating the scene's spectral reflectance information. However, existing projector-based systems assume that the spectral power distribution (SPD) of each projector primary is known, which requires additional equipment such as a spectrometer to measure the SPD. In this paper, we present a method for jointly estimating the spectral reflectance and the SPD of each projector primary. In addition to adopting a common spectral reflectance basis model, we model the projector's SPD by a low-dimensional model using basis functions obtained by a newly collected projector's SPD database. Then, the spectral reflectances and the projector's SPDs are alternately estimated based on the basis models. We experimentally show the performance of our joint estimation using a different number of projected illuminations and investigate the potential of the spectral reflectance estimation using a projector with unknown SPD.

Submitted 18 December, 2020; originally announced December 2020.

Comments: Presented at CIC2020. Projector's SPD data is available at http://www.ok.sc.e.titech.ac.jp/res/PCSSfM/pro-cam_reflectance.html

arXiv:2011.10232 [pdf, other]

Deep Snapshot HDR Imaging Using Multi-Exposure Color Filter Array

Authors: Takeru Suda, Masayuki Tanaka, Yusuke Monno, Masatoshi Okutomi

Abstract: In this paper, we propose a deep snapshot high dynamic range (HDR) imaging framework that can effectively reconstruct an HDR image from the RAW data captured using a multi-exposure color filter array (ME-CFA), which consists of a mosaic pattern of RGB filters with different exposure levels. To effectively learn the HDR image reconstruction network, we introduce the idea of luminance normalization that simultaneously enables effective loss computation and input data normalization by considering relative local contrasts in the "normalized-by-luminance" HDR domain. This idea makes it possible to equally handle the errors in both bright and dark areas regardless of absolute luminance levels, which significantly improves the visual image quality in a tone-mapped domain. Experimental results using two public HDR image datasets demonstrate that our framework outperforms other snapshot methods and produces high-quality HDR images with fewer visual artifacts.

Submitted 20 November, 2020; originally announced November 2020.

Comments: Accepted at ACCV2020 (Oral). Project page: http://www.ok.sc.e.titech.ac.jp/res/DSHDR/

arXiv:2007.14292 [pdf, other]

Monochrome and Color Polarization Demosaicking Using Edge-Aware Residual Interpolation

Authors: Miki Morimatsu, Yusuke Monno, Masayuki Tanaka, Masatoshi Okutomi

Abstract: A division-of-focal-plane or microgrid image polarimeter enables us to acquire a set of polarization images in one shot. Since the polarimeter consists of an image sensor equipped with a monochrome or color polarization filter array (MPFA or CPFA), the demosaicking process to interpolate missing pixel values plays a crucial role in obtaining high-quality polarization images. In this paper, we propose a novel MPFA demosaicking method based on edge-aware residual interpolation (EARI) and also extend it to CPFA demosaicking. The key of EARI is a new edge detector for generating an effective guide image used to interpolate the missing pixel values. We also present a newly constructed full color-polarization image dataset captured using a 3-CCD camera and a rotating polarizer. Using the dataset, we experimentally demonstrate that our EARI-based method outperforms existing methods in MPFA and CPFA demosaicking.

Submitted 28 July, 2020; originally announced July 2020.

Comments: Accepted in ICIP2020. Dataset and code are available at http://www.ok.sc.e.titech.ac.jp/res/PolarDem/index.html

arXiv:2006.08145 [pdf, other]

Classifying degraded images over various levels of degradation

Authors: Kazuki Endo, Masayuki Tanaka, Masatoshi Okutomi

Abstract: Classification of degraded images having various levels of degradation is very important in practical applications. This paper proposes a convolutional neural network to classify degraded images by using a restoration network and ensemble learning. The results demonstrate that the proposed network can classify degraded images over various levels of degradation well. This paper also reveals how the image quality of training data for a classification network affects the classification performance of degraded images.

Submitted 15 June, 2020; originally announced June 2020.

Comments: Accepted by the 27th IEEE International Conference on Image Processing (ICIP 2020)

arXiv:1908.08185 [pdf, other]

Pro-Cam SSfM: Projector-Camera System for Structure and Spectral Reflectance from Motion

Authors: Chunyu Li, Yusuke Monno, Hironori Hidaka, Masatoshi Okutomi

Abstract: In this paper, we propose a novel projector-camera system for practical and low-cost acquisition of a dense object 3D model with the spectral reflectance property. In our system, we use a standard RGB camera and leverage an off-the-shelf projector as active illumination for both the 3D reconstruction and the spectral reflectance estimation. We first reconstruct the 3D points while estimating the poses of the camera and the projector, which are alternately moved around the object, by combining multi-view structured light and structure-from-motion (SfM) techniques. We then exploit the projector for multispectral imaging and estimate the spectral reflectance of each 3D point based on a novel spectral reflectance estimation model considering the geometric relationship between the reconstructed 3D points and the estimated projector positions. Experimental results on several real objects demonstrate that our system can precisely acquire a dense 3D model with the full spectral reflectance property using off-the-shelf devices.

Submitted 21 August, 2019; originally announced August 2019.

Comments: Accepted by ICCV 2019. Project homepage: http://www.ok.sc.e.titech.ac.jp/res/PCSSfM/

arXiv:1905.12988 [pdf, other]

3D Reconstruction of Whole Stomach from Endoscope Video Using Structure-from-Motion

Authors: Aji Resindra Widya, Yusuke Monno, Kosuke Imahori, Masatoshi Okutomi, Sho Suzuki, Takuji Gotoda, Kenji Miki

Abstract: Gastric endoscopy is a common clinical practice that enables medical doctors to diagnose the stomach inside a body. To identify the location of a gastric lesion, such as early gastric cancer, within the stomach, this work addresses the reconstruction of the 3D shape of a whole stomach with color texture information generated from a standard monocular endoscope video. Previous works have tried to reconstruct the 3D structures of various organs from endoscope images. However, they mainly focused on a partial surface. In this work, we investigated how to enable structure-from-motion (SfM) to reconstruct the whole shape of a stomach from a standard endoscope video. We specifically investigated the combined effect of chromo-endoscopy and color channel selection on SfM. Our study found that 3D reconstruction of the whole stomach can be achieved by using red channel images captured under chromo-endoscopy by spreading indigo carmine (IC) dye on the stomach surface.

Submitted 30 May, 2019; originally announced May 2019.

Comments: 5 pages, 4 figures, accepted in EMBC 2019

Showing 1–9 of 9 results for author: Okutomi, M