
SpecNeRF: Gaussian Directional Encoding for Specular Reflections

Authors: Li Ma, Vasu Agrawal, Haithem Turki, Changil Kim, Chen Gao, Pedro Sander, Michael Zollhöfer, Christian Richardt

Abstract: Neural radiance fields have achieved remarkable performance in modeling the appearance of 3D scenes. However, existing approaches still struggle with the view-dependent appearance of glossy surfaces, especially under complex lighting of indoor environments. Unlike existing methods, which typically assume distant lighting like an environment map, we propose a learnable Gaussian directional encoding to better model the view-dependent effects under near-field lighting conditions. Importantly, our new directional encoding captures the spatially-varying nature of near-field lighting and emulates the behavior of prefiltered environment maps. As a result, it enables the efficient evaluation of preconvolved specular color at any 3D location with varying roughness coefficients. We further introduce a data-driven geometry prior that helps alleviate the shape radiance ambiguity in reflection modeling. We show that our Gaussian directional encoding and geometry prior significantly improve the modeling of challenging specular reflections in neural radiance fields, which helps decompose appearance into more physically meaningful components.

Submitted 16 May, 2024; v1 submitted 20 December, 2023; originally announced December 2023.

Comments: Accepted to CVPR2024 as Highlight, Project page: https://limacv.github.io/SpecNeRF_web/

arXiv:2311.12664

The DURel Annotation Tool: Human and Computational Measurement of Semantic Proximity, Sense Clusters and Semantic Change

Authors: Dominik Schlechtweg, Shafqat Mumtaz Virk, Pauline Sander, Emma Sköldberg, Lukas Theuer Linke, Tuo Zhang, Nina Tahmasebi, Jonas Kuhn, Sabine Schulte im Walde

Abstract: We present the DURel tool that implements the annotation of semantic proximity between uses of words into an online, open source interface. The tool supports standardized human annotation as well as computational annotation, building on recent advances with Word-in-Context models. Annotator judgments are clustered with automatic graph clustering techniques and visualized for analysis. This allows measuring word senses with simple and intuitive micro-task judgments between use pairs, requiring minimal preparation efforts. The tool offers additional functionalities to compare the agreement between annotators to guarantee the inter-subjectivity of the obtained judgments and to calculate summary statistics giving insights into sense frequency distributions, semantic variation or changes of senses over time.

Submitted 5 February, 2024; v1 submitted 21 November, 2023; originally announced November 2023.

Comments: EACL Demo, 7 pages

arXiv:2303.05312

3D Video Loops from Asynchronous Input

Authors: Li Ma, Xiaoyu Li, Jing Liao, Pedro V. Sander

Abstract: Looping videos are short video clips that can be looped endlessly without visible seams or artifacts. They provide a very attractive way to capture the dynamism of natural scenes. Existing methods have been mostly limited to 2D representations. In this paper, we take a step forward and propose a practical solution that enables an immersive experience on dynamic 3D looping scenes. The key challenge is to consider the per-view looping conditions from asynchronous input while maintaining view consistency for the 3D representation. We propose a novel sparse 3D video representation, namely Multi-Tile Video (MTV), which not only provides a view-consistent prior, but also greatly reduces memory usage, making the optimization of a 4D volume tractable. Then, we introduce a two-stage pipeline to construct the 3D looping MTV from completely asynchronous multi-view videos with no time overlap. A novel looping loss based on video temporal retargeting algorithms is adopted during the optimization to loop the 3D scene. Experiments of our framework have shown promise in successfully generating and rendering photorealistic 3D looping videos in real time even on mobile devices. The code, dataset, and live demos are available at https://limacv.github.io/VideoLoop3D_web/.

Submitted 21 March, 2023; v1 submitted 9 March, 2023; originally announced March 2023.

Comments: For more information, please visit the homepage at https://limacv.github.io/VideoLoop3D_web/

arXiv:2210.02553

doi 10.1145/3550469.3555415

Water Simulation and Rendering from a Still Photograph

Authors: Ryusuke Sugimoto, Mingming He, Jing Liao, Pedro V. Sander

Abstract: We propose an approach to simulate and render realistic water animation from a single still input photograph. We first segment the water surface, estimate rendering parameters, and compute water reflection textures with a combination of neural networks and traditional optimization techniques. Then we propose an image-based screen space local reflection model to render the water surface overlaid on the input image and generate real-time water animation. Our approach creates realistic results with no user intervention for a wide variety of natural scenes containing large bodies of water with different lighting and water surface conditions. Since our method provides a 3D representation of the water surface, it naturally enables direct editing of water parameters and also supports interactive applications like adding synthetic objects to the scene.

Submitted 5 October, 2022; originally announced October 2022.

Comments: Accepted for publication at ACM SIGGRAPH Asia (Conference Papers). Videos, demos and updates will be on the project website: https://rsugimoto.net/WaterAnimationProject/

arXiv:2207.00210

doi 10.1145/3550454.3555494

Neural Parameterization for Dynamic Human Head Editing

Authors: Li Ma, Xiaoyu Li, Jing Liao, Xuan Wang, Qi Zhang, Jue Wang, Pedro Sander

Abstract: Implicit radiance functions emerged as a powerful scene representation for reconstructing and rendering photo-realistic views of a 3D scene. These representations, however, suffer from poor editability. On the other hand, explicit representations such as polygonal meshes allow easy editing but are not as suitable for reconstructing accurate details in dynamic human heads, such as fine facial features, hair, teeth, and eyes. In this work, we present Neural Parameterization (NeP), a hybrid representation that provides the advantages of both implicit and explicit methods. NeP is capable of photo-realistic rendering while allowing fine-grained editing of the scene geometry and appearance. We first disentangle the geometry and appearance by parameterizing the 3D geometry into 2D texture space. We enable geometric editability by introducing an explicit linear deformation blending layer. The deformation is controlled by a set of sparse key points, which can be explicitly and intuitively displaced to edit the geometry. For appearance, we develop a hybrid 2D texture consisting of an explicit texture map for easy editing and implicit view and time-dependent residuals to model temporal and view variations. We compare our method to several reconstruction and editing baselines. The results show that the NeP achieves almost the same level of rendering accuracy while maintaining high editability.

Submitted 27 October, 2022; v1 submitted 1 July, 2022; originally announced July 2022.

Comments: 15 pages, 18 figures

arXiv:2111.14292

Deblur-NeRF: Neural Radiance Fields from Blurry Images

Authors: Li Ma, Xiaoyu Li, Jing Liao, Qi Zhang, Xuan Wang, Jue Wang, Pedro V. Sander

Abstract: Neural Radiance Field (NeRF) has gained considerable attention recently for 3D scene reconstruction and novel view synthesis due to its remarkable synthesis quality. However, image blurriness caused by defocus or motion, which often occurs when capturing scenes in the wild, significantly degrades its reconstruction quality. To address this problem, we propose Deblur-NeRF, the first method that can recover a sharp NeRF from blurry input. We adopt an analysis-by-synthesis approach that reconstructs blurry views by simulating the blurring process, thus making NeRF robust to blurry inputs. The core of this simulation is a novel Deformable Sparse Kernel (DSK) module that models spatially-varying blur kernels by deforming a canonical sparse kernel at each spatial location. The ray origin of each kernel point is jointly optimized, inspired by the physical blurring process. This module is parameterized as an MLP that has the ability to be generalized to various blur types. Jointly optimizing the NeRF and the DSK module allows us to restore a sharp NeRF. We demonstrate that our method can be used on both camera motion blur and defocus blur: the two most common types of blur in real scenes. Evaluation results on both synthetic and real-world data show that our method outperforms several baselines. The synthetic and real datasets along with the source code are publicly available at https://limacv.github.io/deblurnerf/

Submitted 27 March, 2022; v1 submitted 28 November, 2021; originally announced November 2021.

Comments: accepted in CVPR2022

arXiv:2104.09820

doi 10.1109/TCSVT.2018.2880227

Microshift: An Efficient Image Compression Algorithm for Hardware

Authors: Bo Zhang, Pedro V. Sander, Chi-Ying Tsui, Amine Bermak

Abstract: In this paper, we propose an image compression algorithm called Microshift. We employ an algorithm hardware co-design methodology, yielding a hardware-friendly compression approach with low power consumption. In our method, the image is first micro-shifted, then the sub-quantized values are further compressed. Two methods, the FAST and MRF model, are proposed to recover the bit-depth by exploiting the spatial correlation of natural images. Both methods can decompress images progressively. Our compression algorithm compresses images to 1.25 bits per pixel on average with PSNR of 33.16 dB, outperforming other on-chip compression algorithms. Then, we propose a hardware architecture and implement the algorithm on an FPGA and ASIC. The results on the VLSI design further validate the low hardware complexity and high power efficiency, showing our method is promising, particularly for low-power wireless vision sensor networks.

Submitted 20 April, 2021; originally announced April 2021.

Comments: Accepted to IEEE Transactions on Circuits and Systems for Video Technology

arXiv:2104.08852

Let's See Clearly: Contaminant Artifact Removal for Moving Cameras

Authors: Xiaoyu Li, Bo Zhang, Jing Liao, Pedro V. Sander

Abstract: Contaminants such as dust, dirt and moisture adhering to the camera lens can greatly affect the quality and clarity of the resulting image or video. In this paper, we propose a video restoration method to automatically remove these contaminants and produce a clean video. Our approach first seeks to detect attention maps that indicate the regions that need to be restored. In order to leverage the corresponding clean pixels from adjacent frames, we propose a flow completion module to hallucinate the flow of the background scene to the attention regions degraded by the contaminants. Guided by the attention maps and completed flows, we propose a recurrent technique to restore the input frame by fetching clean pixels from adjacent frames. Finally, a multi-frame processing stage is used to further process the entire video sequence in order to enforce temporal consistency. The entire network is trained on a synthetic dataset that approximates the physical lighting properties of contaminant artifacts. This new dataset and our novel framework lead to our method that is able to address different contaminants and outperforms competitive restoration approaches both qualitatively and quantitatively.

Submitted 18 April, 2021; originally announced April 2021.

Comments: 10 pages, 11 figures

ACM Class: I.2.6; I.4.4

arXiv:2008.04149

doi 10.1109/TVCG.2021.3049419

Deep Sketch-guided Cartoon Video Inbetweening

Authors: Xiaoyu Li, Bo Zhang, Jing Liao, Pedro V. Sander

Abstract: We propose a novel framework to produce cartoon videos by fetching the color information from two input keyframes while following the animated motion guided by a user sketch. The key idea of the proposed approach is to estimate the dense cross-domain correspondence between the sketch and cartoon video frames, and employ a blending module with occlusion estimation to synthesize the middle frame guided by the sketch. After that, the input frames and the synthetic frame equipped with established correspondence are fed into an arbitrary-time frame interpolation pipeline to generate and refine additional inbetween frames. Finally, a module to preserve temporal consistency is employed. Compared to common frame interpolation methods, our approach can address frames with relatively large motion and also has the flexibility to enable users to control the generated video sequences by editing the sketch guidance. By explicitly considering the correspondence between frames and the sketch, we can achieve higher quality results than other image synthesis methods. Our results show that our system generalizes well to different movie frames, achieving better results than existing solutions.

Submitted 18 January, 2021; v1 submitted 10 August, 2020; originally announced August 2020.

Comments: 15 pages, 16 figures

ACM Class: I.2.6; I.4.9

arXiv:1909.09470

Document Rectification and Illumination Correction using a Patch-based CNN

Authors: Xiaoyu Li, Bo Zhang, Jing Liao, Pedro V. Sander

Abstract: We propose a novel learning method to rectify document images with various distortion types from a single input image. As opposed to previous learning-based methods, our approach seeks to first learn the distortion flow on input image patches rather than the entire image. We then present a robust technique to stitch the patch results into the rectified document by processing in the gradient domain. Furthermore, we propose a second network to correct the uneven illumination, further improving the readability and OCR accuracy. Due to the less complex distortion present on the smaller image patches, our patch-based approach followed by stitching and illumination correction can significantly improve the overall accuracy in both the synthetic and real datasets.

Submitted 20 September, 2019; originally announced September 2019.

Comments: 11 pages, 10 figures

ACM Class: I.4.3; I.2.6

arXiv:1909.03459

Blind Geometric Distortion Correction on Images Through Deep Learning

Authors: Xiaoyu Li, Bo Zhang, Pedro V. Sander, Jing Liao

Abstract: We propose the first general framework to automatically correct different types of geometric distortion in a single input image. Our proposed method employs convolutional neural networks (CNNs) trained by using a large synthetic distortion dataset to predict the displacement field between distorted images and corrected images. A model fitting method uses the CNN output to estimate the distortion parameters, achieving a more accurate prediction. The final corrected image is generated based on the predicted flow using an efficient, high-quality resampling method. Experimental results demonstrate that our algorithm outperforms traditional correction methods, and allows for interesting applications such as distortion transfer, distortion exaggeration, and co-occurring distortion correction.

Submitted 8 September, 2019; originally announced September 2019.

Comments: 10 pages, 11 figures, published in CVPR 2019

ACM Class: I.4.3; I.2.6

arXiv:1906.09909

Deep Exemplar-based Video Colorization

Authors: Bo Zhang, Mingming He, Jing Liao, Pedro V. Sander, Lu Yuan, Amine Bermak, Dong Chen

Abstract: This paper presents the first end-to-end network for exemplar-based video colorization. The main challenge is to achieve temporal consistency while remaining faithful to the reference style. To address this issue, we introduce a recurrent framework that unifies the semantic correspondence and color propagation steps. Both steps allow a provided reference image to guide the colorization of every frame, thus reducing accumulated propagation errors. Video frames are colorized in sequence based on the colorization history, and its coherency is further enforced by the temporal consistency loss. All of these components, learned end-to-end, help produce realistic videos with good temporal stability. Experiments show our result is superior to the state-of-the-art methods both quantitatively and qualitatively.

Submitted 24 June, 2019; originally announced June 2019.

arXiv:1807.06587

Deep Exemplar-based Colorization

Authors: Mingming He, Dongdong Chen, Jing Liao, Pedro V. Sander, Lu Yuan

Abstract: We propose the first deep learning approach for exemplar-based local colorization. Given a reference color image, our convolutional neural network directly maps a grayscale image to an output colorized image. Rather than using hand-crafted rules as in traditional exemplar-based methods, our end-to-end colorization network learns how to select, propagate, and predict colors from the large-scale data. The approach performs robustly and generalizes well even when using reference images that are unrelated to the input grayscale image. More importantly, as opposed to other learning-based colorization methods, our network allows the user to achieve customizable results by simply feeding different references. In order to further reduce manual effort in selecting the references, the system automatically recommends references with our proposed image retrieval algorithm, which considers both semantic and luminance information. The colorization can be performed fully automatically by simply picking the top reference suggestion. Our approach is validated through a user study and favorable quantitative comparisons to state-of-the-art methods. Furthermore, our approach can be naturally extended to video colorization. Our code and models will be freely available for public use.

Submitted 21 July, 2018; v1 submitted 17 July, 2018; originally announced July 2018.

Comments: To Appear in Siggraph 2018

arXiv:1710.00756

Progressive Color Transfer with Dense Semantic Correspondences

Authors: Mingming He, Jing Liao, Dongdong Chen, Lu Yuan, Pedro V. Sander

Abstract: We propose a new algorithm for color transfer between images that have perceptually similar semantic structures. We aim to achieve a more accurate color transfer that leverages semantically-meaningful dense correspondence between images. To accomplish this, our algorithm uses neural representations for matching. Additionally, the color transfer should be spatially variant and globally coherent. Therefore, our algorithm optimizes a local linear model for color transfer satisfying both local and global constraints. Our proposed approach jointly optimizes matching and color transfer, adopting a coarse-to-fine strategy. The proposed method can be successfully extended from one-to-one to one-to-many color transfer. The latter further addresses the problem of mismatching elements of the input image. We validate our proposed method by testing it on a large variety of image content.

Submitted 12 December, 2018; v1 submitted 2 October, 2017; originally announced October 2017.

Comments: Accepted by TOG

arXiv:1408.7092

Challenges in Bridging Social Semantics and Formal Semantics on the Web

Authors: Fabien Lucien Gandon, Michel Buffa, Elena Cabrio, Catherine Faron-Zucker, Alain Giboin, Nhan Le Thanh, Isabelle Mirbel, Peter Sander, Andrea G. B. Tettamanzi, Serena Villata

Abstract: This paper describes several results of Wimmics, a research lab whose name stands for: web-instrumented man-machine interactions, communities, and semantics. The approaches introduced here rely on graph-oriented knowledge representation, reasoning and operationalization to model and support actors, actions and interactions in web-based epistemic communities. The research results are applied to support and foster interactions in online communities and manage their resources.

Submitted 29 August, 2014; originally announced August 2014.

Journal ref: 15th International Conference, ICEIS 2013 190 (2013) 3-15

Showing 1–15 of 15 results for author: Sander, P