Search | arXiv e-print repository

Towards Realistic Landmark-Guided Facial Video Inpainting Based on GANs

Authors: Fatemeh Ghorbani Lohesara, Karen Egiazarian, Sebastian Knorr

Abstract: Facial video inpainting plays a crucial role in a wide range of applications, including but not limited to the removal of obstructions in video conferencing and telemedicine, enhancement of facial expression analysis, privacy protection, integration of graphical overlays, and virtual makeup. This domain presents serious challenges due to the intricate nature of facial features and the inherent human familiarity with faces, heightening the need for accurate and persuasive completions. In addressing challenges specifically related to occlusion removal in this context, our focus is on the progressive task of generating complete images from facial data covered by masks, ensuring both spatial and temporal coherence. Our study introduces a network designed for expression-based video inpainting, employing generative adversarial networks (GANs) to handle static and moving occlusions across all frames. By utilizing facial landmarks and an occlusion-free reference image, our model maintains the user's identity consistently across frames. We further enhance emotional preservation through a customized facial expression recognition (FER) loss function, ensuring detailed inpainted outputs. Our proposed framework exhibits proficiency in eliminating occlusions from facial videos in an adaptive form, whether appearing static or dynamic on the frames, while providing realistic and coherent results.

Submitted 14 February, 2024; originally announced February 2024.

Comments: Accepted in Electronic Imaging 2024

arXiv:2401.14136 [pdf, other]

doi 10.1145/3626495.3626497

Expression-aware video inpainting for HMD removal in XR applications

Authors: Fatemeh Ghorbani Lohesara, Karen Egiazarian, Sebastian Knorr

Abstract: Head-mounted displays (HMDs) serve as indispensable devices for observing extended reality (XR) environments and virtual content. However, HMDs present an obstacle to external recording techniques as they block the upper face of the user. This limitation significantly affects social XR applications, specifically teleconferencing, where facial features and eye gaze information play a vital role in creating an immersive user experience. In this study, we propose a new network for expression-aware video inpainting for HMD removal (EVI-HRnet) based on generative adversarial networks (GANs). Our model effectively fills in missing information with regard to facial landmarks and a single occlusion-free reference image of the user. The framework and its components ensure the preservation of the user's identity across frames using the reference frame. To further improve the level of realism of the inpainted output, we introduce a novel facial expression recognition (FER) loss function for emotion preservation. Our results demonstrate the remarkable capability of the proposed framework to remove HMDs from facial videos while maintaining the subject's facial expression and identity. Moreover, the outputs exhibit temporal consistency along the inpainted frames. This lightweight framework presents a practical approach for HMD occlusion removal, with the potential to enhance various collaborative XR applications without the need for additional hardware.

Submitted 25 January, 2024; originally announced January 2024.

Comments: Accepted in CVMP 2023

arXiv:2205.05602 [pdf, other]

doi 10.1109/JSTSP.2023.3262443

An Overview of Advances in Signal Processing Techniques for Classical and Quantum Wideband Synthetic Apertures

Authors: Peter Vouras, Kumar Vijay Mishra, Alexandra Artusio-Glimpse, Samuel Pinilla, Angeliki Xenaki, David W. Griffith, Karen Egiazarian

Abstract: Rapid developments in synthetic aperture (SA) systems, which generate a larger aperture with greater angular resolution than is inherently possible from the physical dimensions of a single sensor alone, are leading to novel research avenues in several signal processing applications. The SAs may either use a mechanical positioner to move an antenna through space or deploy a distributed network of sensors. With the advent of new hardware technologies, the SAs tend to be denser nowadays. The recent opening of higher frequency bands has led to wide SA bandwidths. In general, new techniques and setups are required to harness the potential of wide SAs in space and bandwidth. Herein, we provide a brief overview of emerging signal processing trends in such spatially and spectrally wideband SA systems. This guide is intended to aid newcomers in navigating the most critical issues in SA analysis and further supports the development of new theories in the field. In particular, we cover the theoretical framework and practical underpinnings of wideband SA radar, channel sounding, sonar, radiometry, and optical applications. Apart from the classical SA applications, we also discuss the quantum electric-field-sensing probes in SAs that are currently undergoing active research but remain at nascent stages of development.

Submitted 19 March, 2023; v1 submitted 11 May, 2022; originally announced May 2022.

Comments: 48 pages, 65 figures, 3 tables

arXiv:2204.07098 [pdf, other]

Residual Swin Transformer Channel Attention Network for Image Demosaicing

Authors: Wenzhu Xing, Karen Egiazarian

Abstract: Image demosaicing is the problem of interpolating full-resolution color images from raw sensor (color filter array) data. During the last decade, deep neural networks have been widely used in image restoration, and in particular in demosaicing, attaining significant performance improvement. In recent years, vision transformers have been designed and successfully used in various computer vision applications. One of the recent methods of image restoration based on a Swin Transformer (ST), SwinIR, demonstrates state-of-the-art performance with a smaller number of parameters than neural network-based methods. Inspired by the success of SwinIR, we propose in this paper a novel Swin Transformer-based network for image demosaicing, called RSTCANet. To extract image features, RSTCANet stacks several residual Swin Transformer Channel Attention blocks (RSTCAB), introducing the channel attention for each two successive ST blocks. Extensive experiments demonstrate that RSTCANet outperforms state-of-the-art image demosaicing methods and has a smaller number of parameters.

Submitted 14 April, 2022; originally announced April 2022.

arXiv:2203.16985 [pdf, other]

On design of hybrid diffractive optics for achromatic extended depth-of-field (EDoF) RGB imaging

Authors: Seyyed Reza Miri Rostami, Samuel Pinilla, Igor Shevkunov, Vladimir Katkovnik, Karen Egiazarian

Abstract: A hybrid imaging system is a simultaneous physical arrangement of a refractive lens and a multilevel phase mask (MPM) as a diffractive optical element (DOE). The favorable properties of the hybrid setup are improved extended-depth-of-field (EDoF) imaging and low chromatic aberrations. We built a fully differentiable image formation model in order to use neural network techniques to optimize imaging. In the first stage, the design framework relies on the model-based approach with numerical simulation and end-to-end joint optimization of both MPM and imaging algorithms. In the second stage, MPM is fixed as found in the first stage, and the image processing is optimized experimentally using the CNN learning-based approach with MPM implemented by a spatial light modulator. The paper concentrates on a comparative analysis of imaging accuracy and quality for design with various basic optical parameters: aperture size, lens focal length, and distance between MPM and sensor. We point out that varying aperture size, lens focal length, and distance between MPM and sensor are considered for the first time for end-to-end optimization of EDoF. We numerically and experimentally compare the designs for visible wavelength interval [400-700] nm and the following EDoF ranges: [0.5-100] m for simulations and [0.5-1.9] m for experimental tests. This study concerns an application of hybrid optics for compact cameras with aperture [5-9] mm and distance between MPM and sensor [3-10] mm.

Submitted 31 March, 2022; originally announced March 2022.

Comments: 16 pages, 11 figures, 1 table

arXiv:2203.01695 [pdf, other]

Unfolding-Aided Bootstrapped Phase Retrieval in Optical Imaging

Authors: Samuel Pinilla, Kumar Vijay Mishra, Igor Shevkunov, Mojtaba Soltanalian, Vladimir Katkovnik, Karen Egiazarian

Abstract: Phase retrieval in optical imaging refers to the recovery of a complex signal from phaseless data acquired in the form of its diffraction patterns. These patterns are acquired through a system with a coherent light source that employs a diffractive optical element (DOE) to modulate the scene resulting in coded diffraction patterns at the sensor. Recently, the hybrid approach of model-driven network or deep unfolding has emerged as an effective alternative to conventional model-based and learning-based phase retrieval techniques because it allows for bounding the complexity of algorithms while also retaining their efficacy. Additionally, such hybrid approaches have shown promise in improving the design of DOEs that follow theoretical uniqueness conditions. There are opportunities to exploit novel experimental setups and resolve even more complex DOE phase retrieval applications. This paper presents an overview of algorithms and applications of deep unfolding for bootstrapped - regardless of near, middle, and far zones - phase retrieval.

Submitted 9 October, 2022; v1 submitted 3 March, 2022; originally announced March 2022.

Comments: 13 pages, 11 figures, 1 table

arXiv:2109.11877 [pdf, other]

Learning-based Noise Component Map Estimation for Image Denoising

Authors: Sheyda Ghanbaralizadeh Bahnemiri, Mykola Ponomarenko, Karen Egiazarian

Abstract: A problem of image denoising when images are corrupted by a non-stationary noise is considered in this paper. Since in practice no a priori information on noise is available, noise statistics should be pre-estimated for image denoising. In this paper, a deep convolutional neural network (CNN) based method for estimation of a map of local, patch-wise, standard deviations of noise (so-called sigma-map) is proposed. It achieves state-of-the-art performance in accuracy of estimation of the sigma-map for the case of non-stationary noise, as well as estimation of noise variance for the case of additive white Gaussian noise. Extensive experiments on image denoising using estimated sigma-maps demonstrate that our method outperforms recent CNN-based blind image denoising methods by up to 6 dB in PSNR, as well as other state-of-the-art methods based on sigma-map estimation by up to 0.5 dB, while at the same time providing better usage flexibility. Comparison with the ideal case, when denoising is applied using the ground-truth sigma-map, shows that the difference of the corresponding PSNR values for most noise levels is within 0.1-0.2 dB and does not exceed 0.6 dB.

Submitted 24 September, 2021; originally announced September 2021.

Comments: 5 pages, submitted to IEEE Signal Processing Letters

arXiv:2108.05616 [pdf, ps, other]

SSR-PR: Single-shot Super-Resolution Phase Retrieval based two prior calibration tests

Authors: Peter Kocsis, Igor Shevkunov, Vladimir Katkovnik, Heikki Rekola, Karen Egiazarian

Abstract: We propose a novel approach and algorithm based on two preliminary tests of the optical system elements to enhance the super-resolved complex-valued imaging. The approach is developed for inverse phase imaging in a single-shot lensless optical setup. Imaging is based on wavefront modulation by a single binary phase mask. The preliminary tests compensate errors in the optical system and correct a carrying wavefront, reducing the gap between real-life experiments and computational modeling, which improves imaging significantly both qualitatively and quantitatively. These two tests are performed for observation of the laser beam and phase mask alone, and might be considered as a preliminary system calibration. The corrected carrying wavefront is embedded into the proposed iterative Single-shot Super-Resolution Phase Retrieval (SSR-PR) algorithm. Improved initial diffraction pattern upsampling and a combination of sparse and deep learning based filters achieve the super-resolved reconstructions. Simulations and physical experiments demonstrate the high-quality super-resolution phase imaging. In the simulations, we showed that the SSR-PR algorithm corrects the errors of the proposed optical system and reconstructs phase details 4x smaller than the sensor pixel size. In the physical experiment, 2 µm thick lines of the USAF phase target were resolved, which is almost 2x smaller than the sensor pixel size and corresponds to the smallest resolvable group of the test target used.
For phase bio-imaging, we provide Buccal Epithelial Cells reconstructed in computational super-resolution, and the quality was of the same level as a digital holographic system with a 40x magnification objective. Furthermore, the single-shot advantage provides the possibility to record dynamic scenes, where the framerate is limited only by the camera used. We provide an amplitude-phase video clip of a moving, living single-celled eukaryote.

Submitted 12 August, 2021; originally announced August 2021.

arXiv:2105.07891 [pdf, other]

ADMM and Spectral Proximity Operators in Hyperspectral Broadband Phase Retrieval for Quantitative Phase Imaging

Authors: Vladimir Katkovnik, Igor Shevkunov, Karen Egiazarian

Abstract: A novel formulation of the hyperspectral broadband phase retrieval is developed for the scenario where both object and modulation phase masks are spectrally varying. The proposed algorithm is based on a complex domain version of the alternating direction method of multipliers (ADMM) and Spectral Proximity Operators (SPO) derived for Gaussian and Poissonian observations. Computations for these operators are reduced to the solution of sets of cubic (for Gaussian) and quadratic (for Poissonian) algebraic equations. These proximity operators resolve two problems. Firstly, the complex domain spectral components of signals are extracted from the total intensity observations calculated as sums of the signal spectral intensities. In this way, the spectral analysis of the total intensities is achieved. Secondly, the noisy observations are filtered, compromising noisy intensity observations and their predicted counterparts. The ability to resolve the hyperspectral broadband phase retrieval problem and to find the spectrum varying object is essentially defined by the spectral properties of object and image formation operators. The simulation tests demonstrate that the phase retrieval in this formulation can be successfully resolved.

Submitted 14 May, 2021; originally announced May 2021.

arXiv:2103.05720 [pdf, other]

Power-Balanced Hybrid Optics Boosted Design for Achromatic Extended-Depth-of-Field Imaging via Optimized Mixed OTF

Authors: Seyyed Reza Miri Rostami, Samuel Pinilla, Igor Shevkunov, Vladimir Katkovnik, Karen Egiazarian

Abstract: The power-balanced hybrid optical imaging system is a special design of a diffractive computational camera, introduced in this paper, with image formation by a refractive lens and Multilevel Phase Mask (MPM). This system provides a long focal depth with low chromatic aberrations thanks to MPM and a high energy light concentration due to the refractive lens. We introduce the concept of optical power balance between the lens and MPM which controls the contribution of each element to modulate the incoming light. Additional unique features of our MPM design are the inclusion of quantization of the MPM's shape on the number of levels and the Fresnel order (thickness) using a smoothing function. To optimize optical power-balance as well as the MPM, we build a fully-differentiable image formation model for joint optimization of optical and imaging parameters for the proposed camera using Neural Network techniques. Additionally, we optimize a single Wiener-like optical transfer function (OTF) invariant to depth to reconstruct a sharp image. We numerically and experimentally compare the designed system with its counterparts, lensless and just-lens optical systems, for the visible wavelength interval (400-700) nm and the depth-of-field range (0.5-$\infty$ m for numerical and 0.5-2 m for experimental). The attained results demonstrate that the proposed system equipped with the optimal OTF outperforms its counterparts (even when they are used with optimized OTF) in terms of reconstruction quality for off-focus distances.
The simulation results also reveal that optimizing the optical power-balance, Fresnel order, and the number of levels is essential for system performance, attaining an improvement of up to 5 dB of PSNR using the optimized OTF compared with its counterpart lensless setup.

Submitted 1 July, 2021; v1 submitted 9 March, 2021; originally announced March 2021.

Comments: 18 pages, 14 figures

arXiv:2003.00762 [pdf, other]

Flashlight CNN Image Denoising

Authors: Pham Huu Thanh Binh, Cristóvão Cruz, Karen Egiazarian

Abstract: This paper proposes a learning-based denoising method called FlashLight CNN (FLCNN) that implements a deep neural network for image denoising. The proposed approach is based on deep residual networks and inception networks and it is able to leverage many more parameters than residual networks alone for denoising grayscale images corrupted by additive white Gaussian noise (AWGN). FlashLight CNN demonstrates state-of-the-art performance when compared quantitatively and visually with the current state-of-the-art image denoising methods.

Submitted 2 July, 2020; v1 submitted 2 March, 2020; originally announced March 2020.

arXiv:2002.10886 [pdf, other]

doi 10.1364/OE.393009

Lensless hyperspectral imaging by Fourier transform spectroscopy for broadband visible light: phase retrieval technique

Authors: Igor Shevkunov, Vladimir Katkovnik, Karen Egiazarian

Abstract: A novel phase retrieval algorithm for broadband hyperspectral phase imaging from noisy intensity observations is proposed. It utilizes advantages of the Fourier Transform spectroscopy in the self-referencing optical setup and provides, in addition to the spectral intensity distribution, reconstruction of the investigated object's phase. The noise amplification (Fellgett's disadvantage) is relaxed by the application of sparse wavefront noise filtering embedded in the proposed algorithm. The algorithm's reliability is demonstrated by simulation tests and results of physical experiments on transparent objects, which show precise phase imaging and object depth (profile) reconstructions.

Submitted 16 March, 2020; v1 submitted 24 February, 2020; originally announced February 2020.

Comments: 12 pages, 8 figures

arXiv:1910.10100 [pdf, ps, other]

doi 10.1109/TCI.2020.3032101

The Practicality of Stochastic Optimization in Imaging Inverse Problems

Authors: Junqi Tang, Karen Egiazarian, Mohammad Golbabaee, Mike Davies

Abstract: In this work we investigate the practicality of stochastic gradient descent and recently introduced variants with variance-reduction techniques in imaging inverse problems. Such algorithms have been shown in the machine learning literature to have optimal complexities in theory, and provide great improvement empirically over the deterministic gradient methods. Surprisingly, in some tasks such as image deblurring, many such methods fail to converge faster than the accelerated deterministic gradient methods, even in terms of epoch counts. We investigate this phenomenon and propose a theory-inspired mechanism for the practitioners to efficiently characterize whether it is beneficial for an inverse problem to be solved by stochastic optimization techniques or not. Using standard tools in numerical linear algebra, we derive conditions on the spectral structure of the inverse problem for being a suitable application of stochastic gradient methods. Particularly, we show that, for an imaging inverse problem, the stochastic gradient methods can be more advantageous than deterministic methods if and only if its Hessian matrix has a fast-decaying eigenspectrum. Our results also provide guidance on choosing appropriately the partition minibatch schemes, showing that a good minibatch scheme typically has relatively low correlation within each of the minibatches.
Finally, we propose an accelerated primal-dual SGD algorithm in order to tackle another key bottleneck of stochastic optimization, which is the heavy computation of proximal operators. The proposed method has a fast convergence rate in practice, and is able to efficiently handle non-smooth regularization terms which are coupled with linear operators.

Submitted 8 November, 2019; v1 submitted 22 October, 2019; originally announced October 2019.

Journal ref: Published in IEEE Transactions on Computational Imaging, Vol. 6, 2020

arXiv:1910.03013 [pdf, ps, other]

Hyperspectral holography and spectroscopy: computational features of inverse discrete cosine transform

Authors: Vladimir Katkovnik, Igor Shevkunov, Karen Egiazarian

Abstract: Broadband hyperspectral digital holography and Fourier transform spectroscopy are important instruments in various science and application fields. In digital hyperspectral holography and spectroscopy, the variables of interest are obtained as inverse discrete cosine transforms of observed diffractive intensity patterns. In these notes, we provide a variety of algorithms for the inverse cosine transform with proofs of perfect spectrum reconstruction, and we discuss and illustrate some nontrivial features of these algorithms.

Submitted 4 October, 2019; originally announced October 2019.

Comments: 20 pages, 9 figures

arXiv:1907.03104 [pdf, other]

doi 10.1016/j.optlaseng.2019.105973

Hyperspectral phase imaging based on denoising in complex-valued eigensubspace

Authors: Igor Shevkunov, Vladimir Katkovnik, Daniel Claus, Giancarlo Pedrini, Nikolay Petrov, Karen Egiazarian

Abstract: A new denoising algorithm for hyperspectral complex domain data has been developed and studied. This algorithm is based on the complex domain block-matching 3D filter including the 3D Wiener filtering stage. The developed algorithm is applied and tuned to work in the singular value decomposition (SVD) eigenspace of reduced dimension. The accuracy and quantitative advantage of the new algorithm are demonstrated in simulation tests and in the processing of the experimental data. It is shown that the algorithm is effective and provides reliable results even for highly noisy data.

Submitted 6 July, 2019; originally announced July 2019.

arXiv:1803.02112 [pdf, other]

doi 10.1109/LSP.2018.2850222

Nonlocality-Reinforced Convolutional Neural Networks for Image Denoising

Authors: Cristóvão Cruz, Alessandro Foi, Vladimir Katkovnik, Karen Egiazarian

Abstract: We introduce a paradigm for nonlocal sparsity reinforced deep convolutional neural network denoising. It is a combination of a local multiscale denoising by a convolutional neural network (CNN) based denoiser and a nonlocal denoising based on a nonlocal filter (NLF) exploiting the mutual similarities between groups of patches. CNN models are leveraged with noise levels that progressively decrease at every iteration of our framework, while their output is regularized by a nonlocal prior implicit within the NLF. Unlike complicated neural networks that embed the nonlocality prior within the layers of the network, our framework is modular: it uses standard pre-trained CNNs together with standard nonlocal filters. An instance of the proposed framework, called NN3D, is evaluated over large grayscale image datasets showing state-of-the-art performance.

Submitted 21 June, 2018; v1 submitted 6 March, 2018; originally announced March 2018.

Comments: Accepted for publication in IEEE SPL

arXiv:1802.02011 [pdf, other]

Multi-frequency phase retrieval from noisy data

Authors: Vladimir Katkovnik, Karen Egiazarian

Abstract: The phase retrieval from multi-frequency intensity (power) observations is considered. The object to be reconstructed is complex-valued. A novel algorithm is presented that accomplishes both the object phase (absolute phase) retrieval and denoising for Poissonian and Gaussian measurements. The algorithm is derived from the maximum likelihood formulation with Block Matching 3D (BM3D) sparsity priors. These priors result in two filtering steps: one in the complex domain for complex-valued multi-frequency object images and another in the real domain for the object phase. The algorithm is iterative with alternating projections between the object and measurement variables. The simulation experiments are produced for Fourier transform image formation and random phase modulations of the object, so that the observations are random object diffraction patterns. The results demonstrate the success of the algorithm for reconstruction of the complex phase objects with high-accuracy performance even for very noisy data.

Submitted 6 February, 2018; originally announced February 2018.

Comments: 5 pages, 3 figures

arXiv:1711.10792 [pdf, other]

Blind estimation of white Gaussian noise variance in highly textured images

Authors: Mykola Ponomarenko, Nikolay Gapon, Viacheslav Voronin, Karen Egiazarian

Abstract: In the paper, a new method of blind estimation of noise variance in a single highly textured image is proposed. An input image is divided into 8x8 blocks and a discrete cosine transform (DCT) is performed for each block. A subset of the 64 DCT coefficients with the lowest energy, calculated over all blocks, is selected for further analysis. For the selected DCT coefficients, a robust estimate of noise variance is calculated. Based on the obtained estimate, blocks with very large values of local variance, calculated only for the selected DCT coefficients, are excluded from further analysis. These two steps (estimation of noise variance and exclusion of blocks) are iteratively repeated three times. For the verification of the proposed method, a new noise-free test image database TAMPERE17 consisting of many highly textured images is designed. It is shown for this database and different values of noise variance from the set {25, 49, 100, 225} that the proposed method provides approximately two times lower estimation root mean square error than other methods.

Submitted 29 November, 2017; originally announced November 2017.

Comments: IS&T International Symposium on Electronic Imaging (EI 2018), Image Processing: Algorithms and Systems XVI
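
The block-DCT procedure outlined in the abstract above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the number of retained coefficients (`keep`), the rejection factor (`reject`), and the use of a median-absolute-deviation estimator as the "robust estimate" are all hypothetical choices that the abstract does not specify.

```python
import math
from statistics import median

N = 8  # block size named in the abstract

# Orthonormal 8x8 DCT-II basis matrix.
C = [[(math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N))
      * math.cos(math.pi * (2 * n + 1) * k / (2 * N))
      for n in range(N)] for k in range(N)]


def dct2(block):
    """2-D orthonormal DCT of an 8x8 block, returned as a flat list of 64 values."""
    tmp = [[sum(C[k][n] * block[n][j] for n in range(N)) for j in range(N)]
           for k in range(N)]
    out = [[sum(tmp[k][j] * C[l][j] for j in range(N)) for l in range(N)]
           for k in range(N)]
    return [v for row in out for v in row]


def estimate_noise_variance(image, keep=16, reject=3.0, iterations=3):
    """Blind noise-variance estimate for a 2-D list of pixel values.

    keep, reject and the MAD estimator are assumptions made for this sketch.
    """
    h, w = len(image), len(image[0])
    blocks = [dct2([row[x:x + N] for row in image[y:y + N]])
              for y in range(0, h - N + 1, N)
              for x in range(0, w - N + 1, N)]
    # Rank the 63 AC coefficients by energy summed over all blocks
    # and keep the indices with the lowest energy.
    energy = [sum(b[i] ** 2 for b in blocks) for i in range(N * N)]
    selected = sorted(range(1, N * N), key=lambda i: energy[i])[:keep]

    active = blocks
    sigma2 = 0.0
    for _ in range(iterations):
        # Step 1: robust variance estimate from the selected coefficients.
        values = [abs(b[i]) for b in active for i in selected]
        sigma2 = (1.4826 * median(values)) ** 2
        # Step 2: exclude blocks whose local variance on the selected
        # coefficients is far above the current estimate.
        remaining = [b for b in active
                     if sum(b[i] ** 2 for i in selected) / keep <= reject * sigma2]
        if not remaining:
            break
        active = remaining
    return sigma2
```

Calling `estimate_noise_variance(image)` on a list of pixel rows returns a single variance estimate. Because the lowest-energy coefficients are selected from the noisy data itself, this simplified version tends to underestimate slightly on small images.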

arXiv:1711.00693 [pdf, other]

Statistical evaluation of visual quality metrics for image denoising

Authors: Karen Egiazarian, Mykola Ponomarenko, Vladimir Lukin, Oleg Ieremeiem

Abstract: This paper studies the problem of full reference visual quality assessment of denoised images with a special emphasis on images with low contrast and noise-like texture. Denoising of such images together with noise removal often results in image details loss or smoothing. A new test image database, FLT, containing 75 noise-free "reference" images and 300 filtered ("distorted") images is developed. Each reference image, corrupted by an additive white Gaussian noise, is denoised by the BM3D filter with four different values of threshold parameter (four levels of noise suppression). After carrying out a perceptual quality assessment of distorted images, the mean opinion scores (MOS) are obtained and compared with the values of known full reference quality metrics. As a result, the Spearman Rank Order Correlation Coefficient (SROCC) between PSNR values and MOS has a value close to zero, and SROCC between values of known full-reference image visual quality metrics and MOS does not exceed 0.82 (which is reached by a new visual quality metric proposed in this paper). The FLT dataset is more complex than earlier datasets used for assessment of visual quality for image denoising. Thus, it can be effectively used to design new image visual quality metrics for image denoising.

Submitted 2 November, 2017; originally announced November 2017.

Comments: Submitted to ICASSP 2018
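
For readers unfamiliar with the SROCC figure quoted in the abstract above, the following minimal sketch shows how a Spearman rank-order correlation between metric values and mean opinion scores is commonly computed. The function names are illustrative, and the tie handling (average ranks) is an assumption; the abstract does not say how ties were treated.

```python
import math


def average_ranks(values):
    """1-based ranks of the values, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def srocc(metric_values, mos):
    """Spearman correlation: Pearson correlation of the two rank vectors."""
    rx, ry = average_ranks(metric_values), average_ranks(mos)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    return cov / math.sqrt(vx * vy)
```

A value near zero, as reported for PSNR, means the metric's ordering of images is essentially unrelated to the ordering given by human observers. A value of 1 means the two orderings agree exactly.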

arXiv:1711.00362 [pdf, other]

Complex-valued image denoising based on group-wise complex-domain sparsity

Authors: Vladimir Katkovnik, Mykola Ponomarenko, Karen Egiazarian

Abstract: Phase imaging and wavefront reconstruction from noisy observations of complex exponent is a topic of this paper. It is a highly non-linear problem because the exponent is a 2π-periodic function of phase. The reconstruction of phase and amplitude is difficult. Even with an additive Gaussian noise in observations, the distributions of noisy components in phase and amplitude are signal dependent and non-Gaussian. Additional difficulties follow from an a priori unknown correlation of phase and amplitude in real life scenarios. In this paper, we propose a new class of non-iterative and iterative complex domain filters based on group-wise sparsity in complex domain. This sparsity is based on the techniques implemented in Block-Matching 3D filtering (BM3D) and 3D/4D High-Order Singular Value Decomposition (HOSVD) exploited for spectrum design, analysis and filtering. The introduced algorithms are a generalization of the ideas used in the CD-BM3D algorithms presented in our previous publications. The algorithms are implemented as a MATLAB Toolbox. The efficiency of the algorithms is demonstrated by simulation tests.

Submitted 1 November, 2017; originally announced November 2017.

Comments: Submitted to Signal Processing

arXiv:1707.07752 [pdf, other]

A novel two- and multi-level binary phase mask design for enhanced depth-of-focus

Authors: Vladimir Katkovnik, Nicholas Hogasten, Karen Egiazarian

Abstract: This paper introduces a two- and multi-level phase mask design for improved depth of focus. A novel technique is proposed incorporating cubic and generalized cubic wavefront coding (WFC). The obtained system is optical-electronic, requiring computational deblurring post-processing in order to obtain a sharp image from the observed blurred data. A midwave infrared (MWIR) system is simulated showing that this design will produce high quality images even for large amounts of defocus. It is furthermore shown that this technique can be used to design flat, single-optical-element systems in which the phase mask performs both the function of focusing and phase modulation. It is demonstrated that in this lensless design the WFC coding components can be omitted and WFC effects are achieved as a result of the proposed algorithm for phase mask design, which uses the quadratic phase of the thin refractive lens as the input signal.

Submitted 14 July, 2017; originally announced July 2017.

Comments: 15 pages, 12 Figures

MSC Class: 15A29; 78A97; 82B26 ACM Class: J.2; J.7; K.8.1

arXiv:1704.04126 [pdf, other]

doi 10.1109/TIP.2017.2779265

Single Image Super-Resolution based on Wiener Filter in Similarity Domain

Authors: Cristóvão Cruz, Rakesh Mehta, Vladimir Katkovnik, Karen Egiazarian

Abstract: Single image super resolution (SISR) is an ill-posed problem aiming at estimating a plausible high resolution (HR) image from a single low resolution (LR) image. Current state-of-the-art SISR methods are patch-based. They use either external data or internal self-similarity to learn a prior for an HR image. External data based methods utilize a large number of patches from the training data, while self-similarity based approaches leverage one or more similar patches from the input image. In this paper we propose a self-similarity based approach that is able to use large groups of similar patches extracted from the input image to solve the SISR problem. We introduce a novel prior leading to collaborative filtering of patch groups in 1D similarity domain and couple it with an iterative back-projection framework. The performance of the proposed algorithm is evaluated on a number of SISR benchmark datasets. Without using any external data, the proposed approach outperforms the current non-CNN based methods on the tested datasets for various scaling factors. On certain datasets, the gain is over 1 dB, when compared to the recent method A+. For a high sampling rate (x4) the proposed method performs similarly to very recent state-of-the-art deep convolutional network based approaches.

Submitted 29 November, 2017; v1 submitted 13 April, 2017; originally announced April 2017.

Comments: Paper accepted for publication on IEEE Transactions on Image Processing

arXiv:1106.6180 [pdf, ps, other]

doi 10.1109/TIP.2011.2176954

BM3D Frames and Variational Image Deblurring

Authors: Aram Danielyan, Vladimir Katkovnik, Karen Egiazarian

Abstract: A family of the Block Matching 3-D (BM3D) algorithms for various imaging problems has been recently proposed within the framework of nonlocal patch-wise image modeling [1], [2]. In this paper we construct analysis and synthesis frames, formalizing the BM3D image modeling and use these frames to develop novel iterative deblurring algorithms. We consider two different formulations of the deblurring problem: one given by minimization of the single objective function and another based on the Nash equilibrium balance of two objective functions. The latter results in an algorithm where the denoising and deblurring operations are decoupled. The convergence of the developed algorithms is proved. Simulation experiments show that the decoupled algorithm derived from the Nash equilibrium formulation demonstrates the best numerical and visual results and shows superiority with respect to the state of the art in the field, confirming a valuable potential of BM3D-frames as an advanced image modeling tool.

Submitted 30 June, 2011; originally announced June 2011.

Comments: Submitted to IEEE Transactions on Image Processing on May 18, 2011. An implementation of the proposed algorithm is available as part of the BM3D package at http://www.cs.tut.fi/~foi/GCF-BM3D

arXiv:0708.2893 [pdf]

Fast Recursive Coding Based on Grouping of Symbols

Authors: Nikolay Ponomarenko, Vladimir Lukin, Karen Egiazarian, Jaakko Astola, Boris Y Ryabko

Abstract: A novel fast recursive coding technique is proposed. It operates with only integer values no longer than 8 bits and is multiplication free. The recursion on which the algorithm is based indirectly provides rather effective coding of symbols for very large alphabets. The code length for the proposed technique can be up to 20-30% less than for arithmetic coding and, in the worst case, it is only 1-3% larger.

Submitted 21 August, 2007; originally announced August 2007.

Comments: 3 pages, submitted to IEEE Transactions on Information Theory

ACM Class: E.4

arXiv:cs/0504005 [pdf, ps, other]

Fast Codes for Large Alphabets

Authors: Boris Ryabko, Jaakko Astola, Karen Egiazarian

Abstract: We address the problem of constructing a fast lossless code in the case when the source alphabet is large. The main idea of the new scheme may be described as follows. We group letters with small probabilities in subsets (acting as super letters) and use time-consuming coding for these subsets only, whereas letters in the subsets have the same code length and therefore can be coded fast. The described scheme can be applied to sources with known and unknown statistics.

Submitted 2 April, 2005; originally announced April 2005.

Comments: published
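
The grouping idea in the abstract above can be illustrated with a small sketch. It is not the authors' scheme: a binary Huffman code stands in for the "time-consuming coding" of super letters, and the probability cutoff (`threshold`) and subset size (`group_size`) are hypothetical parameters chosen for the example. Letters inside a subset receive a fixed-length index, so they all share one code length.

```python
import heapq
from itertools import count


def huffman_lengths(weights):
    """Code length in bits for each weight under a binary Huffman code."""
    if len(weights) == 1:
        return [1]
    tie = count()  # tie-breaker so heap entries never compare the lists
    heap = [(w, next(tie), [i]) for i, w in enumerate(weights)]
    heapq.heapify(heap)
    lengths = [0] * len(weights)
    while len(heap) > 1:
        w1, _, a = heapq.heappop(heap)
        w2, _, b = heapq.heappop(heap)
        for i in a + b:  # every merged leaf moves one level deeper
            lengths[i] += 1
        heapq.heappush(heap, (w1 + w2, next(tie), a + b))
    return lengths


def grouped_code_lengths(probs, threshold=0.05, group_size=4):
    """Per-letter code lengths when rare letters are merged into super letters.

    Returns (lengths, number_of_super_letters).
    """
    order = sorted(range(len(probs)), key=lambda i: -probs[i])
    frequent = [i for i in order if probs[i] >= threshold]
    rare = [i for i in order if probs[i] < threshold]
    # Frequent letters stay alone; rare letters are packed into subsets.
    groups = [[i] for i in frequent]
    groups += [rare[k:k + group_size] for k in range(0, len(rare), group_size)]
    # The expensive code is built over the (much smaller) super alphabet.
    prefix = huffman_lengths([sum(probs[i] for i in g) for g in groups])
    lengths = [0] * len(probs)
    for g, p in zip(groups, prefix):
        suffix = (len(g) - 1).bit_length()  # fixed-length index inside the subset
        for i in g:
            lengths[i] = p + suffix
    return lengths, len(groups)
```

With a 15-letter source in which 12 letters are rare, the expensive code is built over only 6 super letters. The cost is some redundancy whenever letters within a subset do not have exactly equal probabilities.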

Showing 1–25 of 25 results for author: Egiazarian, K