-
Parametric Modeling and Estimation of Photon Registrations for 3D Imaging
Authors:
Weijian Zhang,
Hashan K. Weerasooriya,
Prateek Chennuri,
Stanley H. Chan
Abstract:
In single-photon light detection and ranging (SP-LiDAR) systems, the histogram distortion due to hardware dead time fundamentally limits the precision of depth estimation. To compensate for the dead time effects, the photon registration distribution is typically modeled based on the Markov chain self-excitation process. However, this is a discrete process and it is computationally expensive, thus hindering potential neural network applications and fast simulations. In this paper, we overcome the modeling challenge by proposing a continuous parametric model. We introduce a Gaussian-uniform mixture model (GUMM) and periodic padding to address high noise floors and noise slopes respectively. By deriving and implementing a customized expectation maximization (EM) algorithm, we achieve accurate histogram matching in scenarios that were deemed difficult in the literature.
Submitted 2 July, 2024;
originally announced July 2024.
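To make the Gaussian-uniform mixture idea in the abstract above concrete, here is a minimal sketch of textbook EM for one Gaussian component plus a uniform noise floor on the interval [0, period). It is not the customized EM algorithm of the paper: it ignores dead time, periodic padding, and noise slopes, and the names `fit_gumm`, `period`, and `iters` are hypothetical choices made only for this illustration.

```python
import math
import random


def fit_gumm(samples, period, iters=200):
    """Generic EM for a mixture of one Gaussian and one uniform on [0, period).

    Illustrative only: this is the standard mixture-model EM update, not the
    customized algorithm described in the paper. Returns (weight, mu, sigma),
    where weight is the estimated fraction of samples drawn from the Gaussian.
    """
    n = len(samples)
    mu = sorted(samples)[n // 2]   # assumption: median as a rough initial peak location
    sigma = period / 10.0          # assumption: broad initial width
    weight = 0.5                   # assumption: equal initial mixing weights
    uniform_pdf = 1.0 / period

    for _ in range(iters):
        # E-step: posterior probability that each sample came from the Gaussian.
        resp = []
        for x in samples:
            z = (x - mu) / sigma
            gauss = weight * math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))
            noise = (1.0 - weight) * uniform_pdf
            resp.append(gauss / (gauss + noise))

        # M-step: responsibility-weighted updates of the three parameters.
        total = sum(resp)
        weight = total / n
        mu = sum(r * x for r, x in zip(resp, samples)) / total
        var = sum(r * (x - mu) ** 2 for r, x in zip(resp, samples)) / total
        sigma = max(math.sqrt(var), 1e-6)  # guard against a collapsing component

    return weight, mu, sigma


# Synthetic timestamps: 60% signal photons near t = 30, 40% uniform background.
rng = random.Random(0)
timestamps = [
    rng.gauss(30.0, 2.0) if rng.random() < 0.6 else rng.uniform(0.0, 100.0)
    for _ in range(2000)
]
weight, mu, sigma = fit_gumm(timestamps, period=100.0)
print(f"signal fraction={weight:.3f}, peak={mu:.2f}, width={sigma:.2f}")
```

The uniform component absorbs background counts that would otherwise inflate the estimated width of the Gaussian peak, which is the motivation the abstract gives for using a mixture when the noise floor is high.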
-
Graphical copula GARCH modeling with dynamic conditional dependence
Authors:
Lupe Shun Hin Chan,
Amanda Man Ying Chu,
Mike Ka Pui So
Abstract:
Modeling returns on large portfolios is a challenging problem as the number of parameters in the covariance matrix grows as the square of the size of the portfolio. Traditional correlation models, for example, the dynamic conditional correlation (DCC)-GARCH model, often ignore the nonlinear dependencies in the tail of the return distribution. In this paper, we aim to develop a framework to model the nonlinear dependencies dynamically, namely the graphical copula GARCH (GC-GARCH) model. Motivated from the capital asset pricing model, to allow modeling of large portfolios, the number of parameters can be greatly reduced by introducing conditional independence among stocks given some risk factors. The joint distribution of the risk factors is factorized using a directed acyclic graph (DAG) with pair-copula construction (PCC) to enhance the modeling of the tails of the return distribution while offering the flexibility of having complex dependent structures. The DAG induces topological orders to the risk factors, which can be regarded as a list of directions of the flow of information. The conditional distributions among stock returns are also modeled using PCC. Dynamic conditional dependence structures are incorporated to allow the parameters in the copulas to be time-varying. Three-stage estimation is used to estimate parameters in the marginal distributions, the risk factor copulas, and the stock copulas. The simulation study shows that the proposed estimation procedure can estimate the parameters and the underlying DAG structure accurately. In the investment experiment of the empirical study, we demonstrate that the GC-GARCH model produces more precise conditional value-at-risk prediction and considerably higher cumulative portfolio returns than the DCC-GARCH model.
Submitted 21 June, 2024;
originally announced June 2024.
-
Generative Quanta Color Imaging
Authors:
Vishal Purohit,
Junjie Luo,
Yiheng Chi,
Qi Guo,
Stanley H. Chan,
Qiang Qiu
Abstract:
The astonishing development of single-photon cameras has created an unprecedented opportunity for scientific and industrial imaging. However, the high data throughput generated by these 1-bit sensors creates a significant bottleneck for low-power applications. In this paper, we explore the possibility of generating a color image from a single binary frame of a single-photon camera. We find this problem to be particularly difficult for standard colorization approaches due to the substantial degree of exposure variation. The core innovation of our paper is an exposure synthesis model framed under a neural ordinary differential equation (Neural ODE) that allows us to generate a continuum of exposures from a single observation. This innovation ensures consistent exposure in the binary images that colorizers take as input, resulting in notably enhanced colorization. We demonstrate applications of the method in single-image and burst colorization and show superior generative performance over baselines. The project website can be found at https://vishal-s-p.github.io/projects/2023/generative_quanta_color.html.
Submitted 27 March, 2024;
originally announced March 2024.
-
Tutorial on Diffusion Models for Imaging and Vision
Authors:
Stanley H. Chan
Abstract:
The astonishing growth of generative tools in recent years has empowered many exciting applications in text-to-image generation and text-to-video generation. The underlying principle behind these generative tools is the concept of diffusion, a particular sampling mechanism that has overcome some shortcomings that were deemed difficult in the previous approaches. The goal of this tutorial is to discuss the essential ideas underlying the diffusion models. The target audience of this tutorial includes undergraduate and graduate students who are interested in doing research on diffusion models or applying these models to solve other problems.
Submitted 26 March, 2024;
originally announced March 2024.
-
Resolution Limit of Single-Photon LiDAR
Authors:
Stanley H. Chan,
Hashan K. Weerasooriya,
Weijian Zhang,
Pamela Abshire,
Istvan Gyongy,
Robert K. Henderson
Abstract:
Single-photon Light Detection and Ranging (LiDAR) systems are often equipped with an array of detectors for improved spatial resolution and sensing speed. However, given a fixed amount of flux produced by the laser transmitter across the scene, the per-pixel Signal-to-Noise Ratio (SNR) will decrease when more pixels are packed in a unit space. This presents a fundamental trade-off between the spatial resolution of the sensor array and the SNR received at each pixel. Theoretical characterization of this fundamental limit is explored. By deriving the photon arrival statistics and introducing a series of new approximation techniques, the Mean Squared Error (MSE) of the maximum-likelihood estimator of the time delay is derived. The theoretical predictions align well with simulations and real data.
Submitted 30 March, 2024; v1 submitted 25 March, 2024;
originally announced March 2024.
-
PCH-EM: A solution to information loss in the photon transfer method
Authors:
Aaron J. Hendrickson,
David P. Haefner,
Stanley H. Chan,
Nicholas R. Shade,
Eric R. Fossum
Abstract:
Working from a Poisson-Gaussian noise model, a multi-sample extension of the Photon Counting Histogram Expectation Maximization (PCH-EM) algorithm is derived as a general-purpose alternative to the Photon Transfer (PT) method. This algorithm is derived from the same model, requires the same experimental data, and estimates the same sensor performance parameters as the time-tested PT method, all while obtaining lower uncertainty estimates. It is shown that as read noise becomes large, multiple data samples are necessary to capture enough information about the parameters of a device under test, justifying the need for a multi-sample extension. An estimation procedure is devised consisting of initial PT characterization followed by repeated iteration of PCH-EM to demonstrate the improvement in estimate uncertainty achievable with PCH-EM, particularly in the regime of Deep Sub-Electron Read Noise (DSERN). A statistical argument based on the information theoretic concept of sufficiency is formulated to explain how PT data reduction procedures discard information contained in raw sensor data, thus explaining why the proposed algorithm is able to obtain lower uncertainty estimates of key sensor performance parameters such as read noise and conversion gain. Experimental data captured from a CMOS quanta image sensor with DSERN is then used to demonstrate the algorithm's usage and validate the underlying theory and statistical model. In support of the reproducible research effort, the code associated with this work can be obtained on the MathWorks File Exchange (Hendrickson et al., 2024).
Submitted 7 March, 2024;
originally announced March 2024.
-
Linear extensions and continued fractions
Authors:
Swee Hong Chan,
Igor Pak
Abstract:
We introduce several new constructions of finite posets with the number of linear extensions given by generalized continued fractions. We apply our results to the problem of the minimum number of elements needed for a poset with a given number of linear extensions.
Submitted 12 June, 2024; v1 submitted 17 January, 2024;
originally announced January 2024.
-
Spatio-Temporal Turbulence Mitigation: A Translational Perspective
Authors:
Xingguang Zhang,
Nicholas Chimitt,
Yiheng Chi,
Zhiyuan Mao,
Stanley H. Chan
Abstract:
Recovering images distorted by atmospheric turbulence is a challenging inverse problem due to the stochastic nature of turbulence. Although numerous turbulence mitigation (TM) algorithms have been proposed, their efficiency and generalization to real-world dynamic scenarios remain severely limited. Building upon the intuitions of classical TM algorithms, we present the Deep Atmospheric TUrbulence Mitigation network (DATUM). DATUM aims to overcome major challenges when transitioning from classical to deep learning approaches. By carefully integrating the merits of classical multi-frame TM methods into a deep network structure, we demonstrate that DATUM can efficiently perform long-range temporal aggregation in a recurrent fashion, while deformable attention and temporal-channel attention seamlessly facilitate pixel registration and lucky imaging. With additional supervision, tilt and blur degradation can be jointly mitigated. These inductive biases empower DATUM to significantly outperform existing methods while delivering a tenfold increase in processing speed. A large-scale training dataset, ATSyn, is presented as a co-invention to enable generalization in real turbulence. Our code and datasets are available at https://xg416.github.io/DATUM.
Submitted 7 April, 2024; v1 submitted 8 January, 2024;
originally announced January 2024.
-
StableKD: Breaking Inter-block Optimization Entanglement for Stable Knowledge Distillation
Authors:
Shiu-hong Kao,
Jierun Chen,
S. H. Gary Chan
Abstract:
Knowledge distillation (KD) has been recognized as an effective tool to compress and accelerate models. However, current KD approaches generally suffer from an accuracy drop and/or an excruciatingly long distillation process. In this paper, we tackle the issue by first providing a new insight into a phenomenon that we call the Inter-Block Optimization Entanglement (IBOE), which makes the conventional end-to-end KD approaches unstable with noisy gradients. We then propose StableKD, a novel KD framework that breaks the IBOE and achieves more stable optimization. StableKD distinguishes itself through two operations: Decomposition and Recomposition, where the former divides a pair of teacher and student networks into several blocks for separate distillation, and the latter progressively merges them back, evolving towards end-to-end distillation. We conduct extensive experiments on CIFAR100, Imagewoof, and ImageNet datasets with various teacher-student pairs. Compared to other KD approaches, our simple yet effective StableKD greatly boosts the model accuracy by 1% ~ 18%, speeds up the convergence up to 10 times, and outperforms them with only 40% of the training data.
Submitted 20 December, 2023;
originally announced December 2023.
-
Kernel Diffusion: An Alternate Approach to Blind Deconvolution
Authors:
Yash Sanghvi,
Yiheng Chi,
Stanley H. Chan
Abstract:
Blind deconvolution problems are severely ill-posed because neither the underlying signal nor the forward operator is known exactly. Conventionally, these problems are solved by alternating between estimation of the image and kernel while keeping the other fixed. In this paper, we show that this framework is flawed because of its tendency to get trapped in local minima and, instead, suggest the use of a kernel estimation strategy with a non-blind solver. This framework is employed by a diffusion method which is trained to sample the blur kernel from the conditional distribution with guidance from a pre-trained non-blind solver. The proposed diffusion method leads to state-of-the-art results on both synthetic and real blur datasets.
Submitted 4 December, 2023;
originally announced December 2023.
-
Linear extensions of finite posets
Authors:
Swee Hong Chan,
Igor Pak
Abstract:
We give a broad survey of inequalities for the number of linear extensions of finite posets. We review many examples, discuss open problems, and present recent results on the subject. We emphasize the bounds, the equality conditions of the inequalities, and the computational complexity aspects of the results.
Submitted 5 November, 2023;
originally announced November 2023.
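For readers unfamiliar with the quantity these entries study, the sketch below counts the linear extensions of a small poset by brute-force dynamic programming over sets of already placed elements. It is a standard exponential-time illustration, not a method taken from the survey; the function name `count_linear_extensions` and the encoding of the order as `(smaller, larger)` pairs are assumptions made for this example.

```python
from functools import lru_cache


def count_linear_extensions(n, relations):
    """Count linear extensions of a poset on elements 0..n-1.

    `relations` is a list of (a, b) pairs meaning a < b in the poset.
    Uses memoized recursion over bitmasks of placed elements, so it is
    only practical for small n (time grows like n * 2**n).
    """
    # preds[x] is a bitmask of the elements that must appear before x.
    preds = [0] * n
    for a, b in relations:
        preds[b] |= 1 << a
    full = (1 << n) - 1

    @lru_cache(maxsize=None)
    def extend(placed):
        if placed == full:
            return 1
        total = 0
        for x in range(n):
            bit = 1 << x
            # x may come next if it is unplaced and all its predecessors are placed.
            if not (placed & bit) and (preds[x] & placed) == preds[x]:
                total += extend(placed | bit)
        return total

    return extend(0)


# Diamond poset: 0 < 1, 0 < 2, 1 < 3, 2 < 3.
diamond = [(0, 1), (0, 2), (1, 3), (2, 3)]
print(count_linear_extensions(4, diamond))
```

An antichain on n elements has n! linear extensions and a chain has exactly one, which gives two quick sanity checks for the function.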
-
Equality cases of the Alexandrov--Fenchel inequality are not in the polynomial hierarchy
Authors:
Swee Hong Chan,
Igor Pak
Abstract:
Describing the equality conditions of the Alexandrov--Fenchel inequality has been a major open problem for decades. We prove that in the case of convex polytopes, this description is not in the polynomial hierarchy unless the polynomial hierarchy collapses to a finite level. This is the first hardness result for the problem, and is a complexity counterpart of the recent result by Shenfeld and van Handel (arXiv:archive/201104059), which gave a geometric characterization of the equality conditions. The proof involves Stanley's order polytopes and employs poset theoretic technology.
Submitted 31 October, 2023; v1 submitted 11 September, 2023;
originally announced September 2023.
-
The Secrets of Non-Blind Poisson Deconvolution
Authors:
Abhiram Gnanasambandam,
Yash Sanghvi,
Stanley H. Chan
Abstract:
Non-blind image deconvolution has been studied for several decades but most of the existing work focuses on blur instead of noise. In photon-limited conditions, however, the excessive amount of shot noise makes traditional deconvolution algorithms fail. In searching for reasons why these methods fail, we present a systematic analysis of the Poisson non-blind deconvolution algorithms reported in the literature, covering both classical and deep learning methods. We compile a list of five "secrets" highlighting the do's and don'ts when designing algorithms. Based on this analysis, we build a proof-of-concept method by combining the five secrets. We find that the new method performs on par with some of the latest methods while outperforming some older ones.
Submitted 6 September, 2023;
originally announced September 2023.
-
Computational complexity of counting coincidences
Authors:
Swee Hong Chan,
Igor Pak
Abstract:
Can you decide if there is a coincidence in the numbers counting two different combinatorial objects? For example, can you decide if two regions in $\mathbb{R}^3$ have the same number of domino tilings? There are two versions of the problem, with $2\times 1 \times 1$ and $2\times 2 \times 1$ boxes. We prove that in both cases the coincidence problem is not in the polynomial hierarchy unless the polynomial hierarchy collapses to a finite level. While the conclusions are the same, the proofs are notably different and generalize in different directions.
We proceed to explore the coincidence problem for counting independent sets and matchings in graphs, matroid bases, order ideals and linear extensions in posets, permutation patterns, and the Kronecker coefficients. We also make a number of conjectures for counting other combinatorial objects such as plane triangulations, contingency tables, standard Young tableaux, reduced factorizations and the Littlewood--Richardson coefficients.
Submitted 9 February, 2024; v1 submitted 20 August, 2023;
originally announced August 2023.
-
Computational Image Formation: Simulators in the Deep Learning Era
Authors:
Stanley H. Chan
Abstract:
At the pinnacle of computational imaging is the co-optimization of camera and algorithm. This, however, is not the only form of computational imaging. In problems such as imaging through adverse weather, the bigger challenge is how to accurately simulate the forward degradation process so that we can synthesize data to train reconstruction models and/or integrate the forward model as part of the reconstruction algorithm. This article introduces the concept of computational image formation (CIF). Compared to the standard inverse problems where the goal is to recover the latent image $x$ from the observation $y = G(x)$, CIF shifts the focus to designing an approximate mapping $H$ such that $H \approx G$ while giving a good image reconstruction result. The word "computational" highlights the fact that the image formation is now replaced by a numerical simulator. While matching nature remains an important goal, CIF pays even greater attention to strategically choosing an $H$ so that the reconstruction performance is maximized.
The goal of this article is to conceptualize the idea of CIF by elaborating on its meaning and implications. The first part of the article is a discussion on the four attributes of a CIF simulator: accurate enough to mimic $G$, fast enough to be integrated as part of the reconstruction, provides a well-posed inverse problem when plugged into the reconstruction, and differentiable to allow backpropagation. The second part of the article is a detailed case study based on imaging through atmospheric turbulence. A plethora of simulators, old and new ones, are discussed. The third part of the article is a collection of other examples that fall into the category of CIF, including imaging through bad weather, dynamic vision sensors, and differentiable optics. Finally, thoughts about the future direction and recommendations to the community are shared.
Submitted 26 October, 2023; v1 submitted 21 July, 2023;
originally announced July 2023.
-
Physics-Driven Turbulence Image Restoration with Stochastic Refinement
Authors:
Ajay Jaiswal,
Xingguang Zhang,
Stanley H. Chan,
Zhangyang Wang
Abstract:
Image distortion by atmospheric turbulence is a stochastic degradation, which is a critical problem in long-range optical imaging systems. A number of studies have been conducted over the past decades, including model-based and emerging deep-learning solutions with the help of synthetic data. Although fast and physics-grounded simulation tools have recently been introduced to help deep-learning models adapt to real-world turbulence conditions, the training of such models relies only on synthetic data and ground truth pairs. This paper proposes the Physics-integrated Restoration Network (PiRN) to bring the physics-based simulator directly into the training process to help the network disentangle the stochasticity from the degradation and the underlying image. Furthermore, to overcome the "average effect" introduced by deterministic models and the domain gap between the synthetic and real-world degradation, we introduce PiRN with Stochastic Refinement (PiRN-SR) to boost its perceptual quality. Overall, our PiRN and PiRN-SR improve the generalization to real-world unknown turbulence conditions and provide state-of-the-art restoration in both pixel-wise accuracy and perceptual quality. Our code is available at https://github.com/VITA-Group/PiRN.
Submitted 20 July, 2023;
originally announced July 2023.
-
Spatially Varying Exposure with 2-by-2 Multiplexing: Optimality and Universality
Authors:
Xiangyu Qu,
Yiheng Chi,
Stanley H. Chan
Abstract:
The advancement of new digital image sensors has enabled the design of exposure multiplexing schemes where a single image capture can have multiple exposures and conversion gains in an interlaced format, similar to that of a Bayer color filter array. In this paper, we ask the question of how to design such multiplexing schemes for adaptive high-dynamic range (HDR) imaging where the multiplexing scheme can be updated according to the scenes. We present two new findings.
(i) We address the problem of design optimality. We show that given a multiplex pattern, the conventional optimality criteria based on the input/output-referred signal-to-noise ratio (SNR) of the independently measured pixels can lead to flawed decisions because it cannot encapsulate the location of the saturated pixels. We overcome the issue by proposing a new concept known as the spatially varying exposure risk (SVE-Risk) which is a pseudo-idealistic quantification of the amount of recoverable pixels. We present an efficient enumeration algorithm to select the optimal multiplex patterns.
(ii) We report a design universality observation that the design of the multiplex pattern can be decoupled from the image reconstruction algorithm. This is a significant departure from the recent literature that the multiplex pattern should be jointly optimized with the reconstruction algorithm. Our finding suggests that in the context of exposure multiplexing, an end-to-end training may not be necessary.
Submitted 29 June, 2023;
originally announced June 2023.
-
On the cross-product conjecture for the number of linear extensions
Authors:
Swee Hong Chan,
Igor Pak,
Greta Panova
Abstract:
We prove a weak version of the cross--product conjecture: ${F}(k+1,\ell) {F}(k,\ell+1) \geq (\frac12+\varepsilon) {F}(k,\ell) {F}(k+1,\ell+1)$, where ${F}(k,\ell)$ is the number of linear extensions for which the values at fixed elements $x,y,z$ are $k$ and $\ell$ apart, respectively, and where $\varepsilon>0$ depends on the poset. We also prove the converse inequality and disprove the generalized cross--product conjecture. The proofs use geometric inequalities for mixed volumes and combinatorics of words.
Submitted 15 June, 2023;
originally announced June 2023.
-
Anisoplanatic Optical Turbulence Simulation for Near-Continuous $C_n^2$ Profiles without Wave Propagation
Authors:
Nicholas Chimitt,
Stanley H. Chan
Abstract:
For the simulation of anisoplanatic optical turbulence, split-step propagation is the gold standard. Within the context of the degradations being limited to phase distortions, one instead may focus on generating the phase realizations directly, a method which has been utilized in previous so-called multi-aperture simulations. Presently, this modality assumes a constant $C_n^2$ profile. This work presents an alternative derivation for Zernike correlations under anisoplanatic conditions. Multi-aperture simulation may easily incorporate these correlations into its framework and achieve a significantly higher degree of accuracy with a minimal increase in time. We additionally use our developed methodology to explain previously reported discrepancies in an empirical implementation of split-step with the analytic tilt correlation. Finally, we outline a major limitation for Zernike-based simulation which still remains.
Submitted 15 May, 2023;
originally announced May 2023.
-
HDR Imaging with Spatially Varying Signal-to-Noise Ratios
Authors:
Yiheng Chi,
Xingguang Zhang,
Stanley H. Chan
Abstract:
While today's high dynamic range (HDR) image fusion algorithms are capable of blending multiple exposures, the acquisition is often controlled so that the dynamic range within one exposure is narrow. For HDR imaging in photon-limited situations, the dynamic range can be enormous and the noise within one exposure is spatially varying. Existing image denoising algorithms and HDR fusion algorithms both fail to handle this situation, leading to severe limitations in low-light HDR imaging. This paper presents two contributions. Firstly, we identify the source of the problem. We find that the issue is associated with the co-existence of (1) spatially varying signal-to-noise ratio, especially the excessive noise due to very dark regions, and (2) a wide luminance range within each exposure. We show that while the issue can be handled by a bank of denoisers, the complexity is high. Secondly, we propose a new method called the spatially varying high dynamic range (SV-HDR) fusion network to simultaneously denoise and fuse images. We introduce a new exposure-shared block within our custom-designed multi-scale transformer framework. In a variety of testing conditions, the performance of the proposed SV-HDR is better than the existing methods.
Submitted 15 April, 2023; v1 submitted 30 March, 2023;
originally announced March 2023.
-
Towards Understanding the Effect of Pretraining Label Granularity
Authors:
Guan Zhe Hong,
Yin Cui,
Ariel Fuxman,
Stanley H. Chan,
Enming Luo
Abstract:
In this paper, we study how the granularity of pretraining labels affects the generalization of deep neural networks in image classification tasks. We focus on the "fine-to-coarse" transfer learning setting, where the pretraining label space is more fine-grained than that of the target problem. Empirically, we show that pretraining on the leaf labels of ImageNet21k produces better transfer results on ImageNet1k than pretraining on other coarser granularity levels, which supports the common practice used in the community. Theoretically, we explain the benefit of fine-grained pretraining by proving that, for a data distribution satisfying certain hierarchy conditions, 1) coarse-grained pretraining only allows a neural network to learn the "common" or "easy-to-learn" features well, while 2) fine-grained pretraining helps the network learn the "rarer" or "fine-grained" features in addition to the common ones, thus improving its accuracy on hard downstream test samples in which common features are missing or weak in strength. Furthermore, we perform comprehensive experiments using the label hierarchies of iNaturalist 2021 and observe that the following conditions, in addition to proper choice of label granularity, enable the transfer to work well in practice: 1) the pretraining dataset needs to have a meaningful label hierarchy, and 2) the pretraining and target label functions need to align well.
Submitted 5 October, 2023; v1 submitted 29 March, 2023;
originally announced March 2023.
-
Scattering and Gathering for Spatially Varying Blurs
Authors:
Nicholas Chimitt,
Xingguang Zhang,
Yiheng Chi,
Stanley H. Chan
Abstract:
A spatially varying blur kernel $h(\mathbf{x},\mathbf{u})$ is specified by an input coordinate $\mathbf{u} \in \mathbb{R}^2$ and an output coordinate $\mathbf{x} \in \mathbb{R}^2$. For computational efficiency, we sometimes write $h(\mathbf{x},\mathbf{u})$ as a linear combination of spatially invariant basis functions. The associated pixelwise coefficients, however, can be indexed by either the input coordinate or the output coordinate. While appearing subtle, the two indexing schemes will lead to two different forms of convolutions known as scattering and gathering, respectively. We discuss the origin of the operations. We discuss conditions under which the two operations are identical. We show that scattering is more suitable for simulating how light propagates and gathering is more suitable for image filtering such as denoising.
Submitted 9 March, 2024; v1 submitted 9 March, 2023;
originally announced March 2023.
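The two indexing schemes described in this abstract can be illustrated with a minimal 1D sketch. It assumes the kernel is a weighted sum of spatially invariant bases, with the weights attached either to the input coordinate (scattering: weight, then convolve) or to the output coordinate (gathering: convolve, then weight). The function names `scatter` and `gather`, the two toy bases, and the coefficient ramps are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def scatter(f, coeffs, bases):
    """Coefficients indexed by the input coordinate u: weight the input, then convolve."""
    return sum(np.convolve(a * f, phi, mode="same") for a, phi in zip(coeffs, bases))

def gather(f, coeffs, bases):
    """Coefficients indexed by the output coordinate x: convolve, then weight the output."""
    return sum(a * np.convolve(f, phi, mode="same") for a, phi in zip(coeffs, bases))

n = 32
f = np.zeros(n)
f[10] = 1.0  # a single point source

# Two spatially invariant bases (hypothetical choice): identity and a 3-tap box blur.
bases = [np.array([0.0, 1.0, 0.0]), np.full(3, 1.0 / 3.0)]

# Spatially varying mixing weights: sharp on the left, blurry on the right.
ramp = np.linspace(0.0, 1.0, n)
varying = [1.0 - ramp, ramp]

# Spatially invariant weights for comparison.
constant = [np.full(n, 0.3), np.full(n, 0.7)]

print(np.allclose(scatter(f, constant, bases), gather(f, constant, bases)))
print(np.abs(scatter(f, varying, bases) - gather(f, varying, bases)).max())
```

With constant weights the two operations coincide, which is one simple case of the "identical" condition the abstract mentions; with varying weights the point source is spread differently because scattering uses the weight at the source pixel while gathering uses the weight at each receiving pixel.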
-
Structured Kernel Estimation for Photon-Limited Deconvolution
Authors:
Yash Sanghvi,
Zhiyuan Mao,
Stanley H. Chan
Abstract:
Images taken in a low light condition with the presence of camera shake suffer from motion blur and photon shot noise. While state-of-the-art image restoration networks show promising results, they are largely limited to well-illuminated scenes and their performance drops significantly when photon shot noise is strong.
In this paper, we propose a new blur estimation technique customized for photon-limited conditions. The proposed method employs a gradient-based backpropagation method to estimate the blur kernel. By modeling the blur kernel using a low-dimensional representation with the key points on the motion trajectory, we significantly reduce the search space and improve the regularity of the kernel estimation problem. When plugged into an iterative framework, our novel low-dimensional representation provides improved kernel estimates and hence significantly better deconvolution performance when compared to end-to-end trained neural networks. The source code and pretrained models are available at \url{https://github.com/sanghviyashiitb/structured-kernel-cvpr23}
Submitted 6 March, 2023;
originally announced March 2023.
-
Multivariate correlation inequalities for $P$-partitions
Authors:
Swee Hong Chan,
Igor Pak
Abstract:
Motivated by the Lam--Pylyavskyy inequalities for Schur functions, we give a far reaching multivariate generalization of Fishburn's correlation inequality for the number of linear extensions of posets. We then give a multivariate generalization of the Daykin--Daykin--Paterson inequality proving log-concavity of the order polynomial of a poset. We also prove a multivariate $P$-partition version of the cross-product inequality by Brightwell--Felsner--Trotter. The proofs are based on a multivariate generalization of the Ahlswede--Daykin inequality.
Submitted 22 December, 2022;
originally announced December 2022.
-
Correlation inequalities for linear extensions
Authors:
Swee Hong Chan,
Igor Pak
Abstract:
We employ the combinatorial atlas technology to prove new correlation inequalities for the number of linear extensions of finite posets. These include the approximate independence of probabilities and expectations of values of random linear extensions, closely related to Stanley's inequality. We also give applications to the numbers of standard Young tableaux and to Euler numbers.
Submitted 29 November, 2022;
originally announced November 2022.
-
Real-Time Dense Field Phase-to-Space Simulation of Imaging through Atmospheric Turbulence
Authors:
Nicholas Chimitt,
Xingguang Zhang,
Zhiyuan Mao,
Stanley H. Chan
Abstract:
Numerical simulation of atmospheric turbulence is one of the biggest bottlenecks in developing computational techniques for solving the inverse problem in long-range imaging. The classical split-step method is based upon numerical wave propagation which splits the propagation path into many segments and propagates every pixel in each segment individually via the Fresnel integral. This repeated evaluation becomes increasingly time-consuming for larger images. As a result, the split-step simulation is often done only on a sparse grid of points followed by an interpolation to the other pixels. Even so, the computation is expensive for real-time applications. In this paper, we present a new simulation method that enables \emph{real-time} processing over a \emph{dense} grid of points. Building upon the recently developed multi-aperture model and the phase-to-space transform, we overcome the memory bottleneck in drawing random samples from the Zernike correlation tensor. We show that the cross-correlation of the Zernike modes has an insignificant contribution to the statistics of the random samples. By approximating these cross-correlation blocks in the Zernike tensor, we restore the homogeneity of the tensor which then enables Fourier-based random sampling. On a $512\times512$ image, the new simulator achieves 0.025 seconds per frame over a dense field. On a $3840 \times 2160$ image which would have taken 13 hours to simulate using the split-step method, the new simulator can run at approximately 60 seconds per frame.
Submitted 13 October, 2022;
originally announced October 2022.
-
What Does a One-Bit Quanta Image Sensor Offer?
Authors:
Stanley H. Chan
Abstract:
The one-bit quanta image sensor (QIS) is a photon-counting device that captures image intensities using binary bits. Assuming that the analog voltage generated at the floating diffusion of the photodiode follows a Poisson-Gaussian distribution, the sensor produces either a ``1'' if the voltage is above a certain threshold or ``0'' if it is below the threshold. The concept of this binary sensor has been proposed for more than a decade, and physical devices have been built to realize the concept. However, what benefits does a one-bit QIS offer compared to a conventional multi-bit CMOS image sensor? Besides the known empirical results, are there theoretical proofs to support these findings?
The goal of this paper is to provide new theoretical support from a signal processing perspective. In particular, it is theoretically found that the sensor can offer three benefits: (1) Low-light: One-bit QIS performs better in low light because it has a low read noise, and its one-bit quantization can produce an error-free measurement. However, this requires the exposure time to be appropriately configured. (2) Frame rate: One-bit sensors can operate at a much higher speed because a response is generated as soon as a photon is detected. However, in the presence of read noise, there exists an optimal frame rate beyond which the performance will degrade. A closed-form expression of the optimal frame rate is derived. (3) Dynamic range: One-bit QIS offers a higher dynamic range. The benefit is brought by two complementary characteristics of the sensor: nonlinearity and exposure bracketing. The decoupling of the two factors is theoretically proved, and closed-form expressions are derived.
Submitted 18 August, 2022;
originally announced August 2022.
-
Photon-Limited Blind Deconvolution using Unsupervised Iterative Kernel Estimation
Authors:
Yash Sanghvi,
Abhiram Gnanasambandam,
Zhiyuan Mao,
Stanley H. Chan
Abstract:
Blind deconvolution is a challenging problem, and it is even more difficult in low light. Existing algorithms, both classical and deep-learning based, are not designed for this condition. When the photon shot noise is strong, conventional deconvolution methods fail for three reasons: (1) the image does not have enough signal-to-noise ratio to perform the blur estimation; (2) while deep neural networks are powerful, many of them do not consider the forward process, so when the noise is strong they fail to simultaneously deblur and denoise; (3) while iterative schemes are known to be robust in the classical frameworks, they are seldom considered in deep neural networks because they require a differentiable non-blind solver.
This paper addresses the above challenges by presenting an \emph{unsupervised} blind deconvolution method. At the core of this method is a reformulation of the general blind deconvolution framework from the conventional image-kernel alternating minimization to a purely kernel-based minimization. This kernel-based minimization leads to a new iterative scheme that backpropagates an unsupervised loss through a pre-trained non-blind solver to update the blur kernel. Experimental results show that the proposed framework achieves better results than state-of-the-art blind deconvolution algorithms in low-light conditions.
Submitted 17 November, 2022; v1 submitted 31 July, 2022;
originally announced August 2022.
-
Single Frame Atmospheric Turbulence Mitigation: A Benchmark Study and A New Physics-Inspired Transformer Model
Authors:
Zhiyuan Mao,
Ajay Jaiswal,
Zhangyang Wang,
Stanley H. Chan
Abstract:
Image restoration algorithms for atmospheric turbulence are known to be much more challenging to design than traditional ones such as blur or noise because the distortion caused by the turbulence is an entanglement of spatially varying blur, geometric distortion, and sensor noise. Existing CNN-based restoration methods built upon convolutional kernels with static weights are insufficient to handle the spatially dynamical atmospheric turbulence effect. To address this problem, in this paper, we propose a physics-inspired transformer model for imaging through atmospheric turbulence. The proposed network utilizes the power of transformer blocks to jointly extract a dynamical turbulence distortion map and restore a turbulence-free image. In addition, recognizing the lack of a comprehensive dataset, we collect and present two new real-world turbulence datasets that allow for evaluation with both classical objective metrics (e.g., PSNR and SSIM) and a new task-driven metric using text recognition accuracy. Both real testing sets and all related code will be made publicly available.
Submitted 24 July, 2022; v1 submitted 20 July, 2022;
originally announced July 2022.
-
Imaging through the Atmosphere using Turbulence Mitigation Transformer
Authors:
Xingguang Zhang,
Zhiyuan Mao,
Nicholas Chimitt,
Stanley H. Chan
Abstract:
Restoring images distorted by atmospheric turbulence is a ubiquitous problem in long-range imaging applications. While existing deep-learning-based methods have demonstrated promising results in specific testing conditions, they suffer from three limitations: (1) lack of generalization capability from synthetic training data to real turbulence data; (2) failure to scale, hence causing memory and speed challenges when extending the idea to a large number of frames; (3) lack of a fast and accurate simulator to generate data for training neural networks. In this paper, we introduce the turbulence mitigation transformer (TMT) that explicitly addresses these issues. TMT brings three contributions: Firstly, TMT explicitly uses turbulence physics by decoupling the turbulence degradation and introducing a multi-scale loss for removing distortion, thus improving effectiveness. Secondly, TMT presents a new attention module along the temporal axis to extract extra features efficiently, thus improving memory and speed. Thirdly, TMT introduces a new simulator based on the Fourier sampler, temporal correlation, and flexible kernel size, thus improving our capability to synthesize better training data. TMT outperforms state-of-the-art video restoration models, especially in generalizing from synthetic to real turbulence data. Code, videos, and datasets are available at \href{https://xg416.github.io/TMT}{https://xg416.github.io/TMT}.
Submitted 11 December, 2023; v1 submitted 13 July, 2022;
originally announced July 2022.
-
Tilt-then-Blur or Blur-then-Tilt? Clarifying the Atmospheric Turbulence Model
Authors:
Stanley H. Chan
Abstract:
Imaging at a long distance often requires advanced image restoration algorithms to compensate for the distortions caused by atmospheric turbulence. However, unlike many standard restoration problems such as deconvolution, the forward image formation model of the atmospheric turbulence does not have a simple expression. Thanks to the Zernike representation of the phase, one can show that the forward model is a combination of tilt (pixel shifting due to the linear phase terms) and blur (image smoothing due to the high order aberrations).
Confusion then arises over the ordering of the two operators. Should the model be tilt-then-blur, or blur-then-tilt? Some papers in the literature say that the model is tilt-then-blur, whereas more papers say that it is blur-then-tilt. This paper clarifies the differences between the two and discusses why the tilt-then-blur is the correct model. Recommendations are given to the research community.
Submitted 18 August, 2022; v1 submitted 13 July, 2022;
originally announced July 2022.
-
Solar eclipse observations with small radio telescope in Hong Kong in 21cm radio frequency band
Authors:
Chun Sing Leung,
Thomas K. T. Fok,
Kenneith H. K. Hui,
K. W. Ng,
C. M. Lee,
S. H. Chan
Abstract:
A small radio telescope operating at 21 cm was used to study the partial solar eclipse, with magnitude 0.89, in Hong Kong on 21st June, 2020. The radio telescope SPIDER 300A was designed and constructed by the Radio2Space Company, Italy. Radio flux density time curves (light curves) and a two-dimensional mapping of the eclipse are presented in this paper. Standard radio data reduction methods were used to obtain the intensity time curve. We also adopted the semi-pipeline method for the reduction of data to obtain the same results as with the built-in software of the radio telescope SPIDER 300A. The total solar radio flux of the eclipse was found to decrease by a maximum of 55 +/- 5 percent, while the maximum eclipsed area of the same eclipse is 86.08%. Other radio observations of solar eclipses in Hong Kong are also discussed in this paper, including the SPIDER 300A observation of the partial solar eclipse on 26th December 2019 (APPENDIX A), and the small radio telescope (SRT), developed by the Haystack Observatory, MIT, USA, observation of the 2020 eclipse (APPENDIX B).
Submitted 2 July, 2022;
originally announced July 2022.
-
Effective poset inequalities
Authors:
Swee Hong Chan,
Igor Pak,
Greta Panova
Abstract:
We explore inequalities on linear extensions of posets and make them effective in different ways. First, we study the Björner--Wachs inequality and generalize it to inequalities on order polynomials and their $q$-analogues via direct injections and FKG inequalities. Second, we give an injective proof of the Sidorenko inequality with computational complexity significance, namely that the difference is in $\#P$. Third, we generalize the Sidorenko inequality to posets with small chain intersections and give complexity theoretic applications.
Submitted 9 May, 2023; v1 submitted 5 May, 2022;
originally announced May 2022.
-
On the Insensitivity of Bit Density to Read Noise in One-bit Quanta Image Sensors
Authors:
Stanley H. Chan
Abstract:
The one-bit quanta image sensor is a photon-counting device that produces binary measurements where each bit represents the presence or absence of a photon. In the presence of read noise, the sensor quantizes the analog voltage into the binary bits using a threshold value $q$. The average number of ones in the bitstream is known as the bit density and is often a sufficient statistic for signal estimation. An intriguing phenomenon is observed when the quanta exposure is at unity and the threshold is $q = 0.5$. The bit density demonstrates a complete insensitivity as long as the read noise level does not exceed a certain limit. In other words, the bit density stays constant, independent of the amount of read noise. This paper provides a mathematical explanation of the phenomenon by deriving conditions under which the phenomenon happens. It was found that the insensitivity holds when some forms of symmetry of the underlying Poisson-Gaussian distribution hold.
Submitted 30 January, 2023; v1 submitted 11 March, 2022;
originally announced March 2022.
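The insensitivity described in this abstract can be checked numerically with a small sketch. It assumes the measurement is a Poisson count with mean equal to the quanta exposure plus zero-mean Gaussian read noise, and that the bit is one whenever the sum reaches the threshold $q$. The function names `normal_cdf` and `bit_density` and the truncation `k_max` are illustrative assumptions; this is not the paper's derivation.

```python
import math

def normal_cdf(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def bit_density(theta, q, sigma, k_max=60):
    """P(X + N >= q) for X ~ Poisson(theta) and N ~ Normal(0, sigma^2)."""
    total = 0.0
    pmf = math.exp(-theta)  # P(X = 0)
    for k in range(k_max + 1):
        # Given X = k, the bit is one when N >= q - k.
        total += pmf * normal_cdf((k - q) / sigma)
        pmf *= theta / (k + 1)  # P(X = k + 1) from P(X = k)
    return total

# Compare unit exposure with a non-unit exposure across read-noise levels.
for sigma in (0.05, 0.1, 0.2, 0.3):
    print(sigma, bit_density(1.0, 0.5, sigma), bit_density(2.0, 0.5, sigma))
```

At unit exposure the Poisson probabilities of zero and one photon are equal, and the Gaussian tail lost from the one-photon term is exactly the tail gained by the zero-photon term when $q = 0.5$, so the first column stays near $1 - e^{-1}$ until the noise is large enough to disturb the two-photon term. At an exposure of 2 the two probabilities differ and the density drifts with the noise level.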
-
Introduction to the combinatorial atlas
Authors:
Swee Hong Chan,
Igor Pak
Abstract:
We give elementary self-contained proofs of the strong Mason conjecture recently proved by Anari et al. (arXiv:1811.01600) and Brändén--Huh (arXiv:1902.03719), and of the classical Alexandrov--Fenchel inequality. Both proofs use the combinatorial atlas technology recently introduced by the authors (arXiv:2110.10740). We also give a formal relationship between combinatorial atlases and Lorentzian polynomials.
Submitted 3 March, 2022;
originally announced March 2022.
-
Exposure-Referred Signal-to-Noise Ratio for Digital Image Sensors
Authors:
Abhiram Gnanasambandam,
Stanley H. Chan
Abstract:
The signal-to-noise ratio (SNR) is a fundamental tool to measure the performance of an image sensor. However, confusions sometimes arise between the two types of SNRs. The first one is the output-referred SNR which measures the ratio between the signal and the noise seen at the sensor's output. This SNR is easy to compute, and it is linear in the log-log scale for most image sensors. The second SNR is the exposure-referred SNR, also known as the input-referred SNR. This SNR considers the noise at the input by including a derivative term to the output-referred SNR. The two SNRs have similar behaviors for sensors with a large full-well capacity. However, for sensors with a small full-well capacity, the exposure-referred SNR can capture some behaviors that the output-referred SNR cannot.
While the exposure-referred SNR has been known and used by the industry for a long time, a theoretically rigorous derivation from a signal processing perspective is lacking. In particular, while various equations can be found in different sources of the literature, there is currently no paper that attempts to assemble, derive, and organize these equations in one place. This paper aims to fill the gap by answering four questions: (1) How is the exposure-referred SNR derived from first principles? (2) Is the output-referred SNR a special case of the exposure-referred SNR, or are they completely different? (3) How to compute the SNR efficiently? (4) What utilities can the SNR bring to solving imaging tasks? New theoretical results are derived for image sensors of any bit-depth and full-well capacity.
Submitted 12 June, 2022; v1 submitted 10 December, 2021;
originally announced December 2021.
-
Graph-Based Depth Denoising & Dequantization for Point Cloud Enhancement
Authors:
Xue Zhang,
Gene Cheung,
Jiahao Pang,
Yash Sanghvi,
Abhiram Gnanasambandam,
Stanley H. Chan
Abstract:
A 3D point cloud is typically constructed from depth measurements acquired by sensors at one or more viewpoints. The measurements suffer from both quantization and noise corruption. To improve quality, previous works denoise a point cloud \textit{a posteriori} after projecting the imperfect depth data onto 3D space. Instead, we enhance depth measurements directly on the sensed images \textit{a priori}, before synthesizing a 3D point cloud. By enhancing near the physical sensing process, we tailor our optimization to our depth formation model before subsequent processing steps that obscure measurement errors.
Specifically, we model depth formation as a combined process of signal-dependent noise addition and non-uniform log-based quantization. The designed model is validated (with parameters fitted) using collected empirical data from a representative depth sensor. To enhance each pixel row in a depth image, we first encode intra-view similarities between available row pixels as edge weights via feature graph learning. We next establish inter-view similarities with another rectified depth image via viewpoint mapping and sparse linear interpolation. This leads to a maximum a posteriori (MAP) graph filtering objective that is convex and differentiable. We minimize the objective efficiently using accelerated gradient descent (AGD), where the optimal step size is approximated via the Gershgorin circle theorem (GCT). Experiments show that our method significantly outperformed recent point cloud denoising schemes and state-of-the-art image denoising schemes in two established point cloud quality metrics.
Submitted 6 October, 2022; v1 submitted 8 November, 2021;
originally announced November 2021.
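One common way to use the Gershgorin circle theorem for step-size selection, which may differ from the paper's exact procedure, is to bound the largest eigenvalue of the Hessian by the largest disc edge and take the reciprocal as the step. The sketch below assumes a quadratic objective $\tfrac{1}{2}x^\top A x - b^\top x$ with a symmetric positive-definite $A$ and uses plain gradient descent rather than the accelerated variant; the function names and the toy matrix are hypothetical.

```python
import numpy as np

def gershgorin_upper_bound(A):
    """Upper bound on the largest eigenvalue of a real symmetric matrix A."""
    A = np.asarray(A, dtype=float)
    diag = np.diag(A)
    # Disc radius for each row: sum of absolute off-diagonal entries.
    radii = np.abs(A).sum(axis=1) - np.abs(diag)
    return float(np.max(diag + radii))

def gradient_descent(A, b, num_iters=500):
    """Minimize 0.5 * x^T A x - b^T x with fixed step 1 / (Gershgorin bound)."""
    step = 1.0 / gershgorin_upper_bound(A)
    x = np.zeros_like(b, dtype=float)
    for _ in range(num_iters):
        x = x - step * (A @ x - b)  # gradient of the quadratic is A x - b
    return x

# Toy positive-definite system: a 3-node path-graph Laplacian plus the identity.
A = np.array([[2.0, -1.0, 0.0],
              [-1.0, 3.0, -1.0],
              [0.0, -1.0, 2.0]])
b = np.array([1.0, 0.0, -1.0])

x = gradient_descent(A, b)
print(gershgorin_upper_bound(A), np.linalg.eigvalsh(A).max())
print(np.allclose(A @ x, b, atol=1e-6))
```

The bound costs one pass over the matrix entries and avoids an eigendecomposition, which is the appeal of this kind of approximation for large graph Laplacians.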
-
Photon Limited Non-Blind Deblurring Using Algorithm Unrolling
Authors:
Yash Sanghvi,
Abhiram Gnanasambandam,
Stanley H. Chan
Abstract:
Image deblurring in photon-limited conditions is ubiquitous in a variety of low-light applications such as photography, microscopy, and astronomy. However, the presence of photon shot noise due to low illumination and/or short exposure makes the deblurring task substantially more challenging than the conventional deblurring problems. In this paper, we present an algorithm unrolling approach for the photon-limited deblurring problem by unrolling a Plug-and-Play algorithm for a fixed number of iterations. By introducing a three-operator splitting formation of the Plug-and-Play framework, we obtain a series of differentiable steps which allows the fixed iteration unrolled network to be trained end-to-end. The proposed algorithm demonstrates significantly better image recovery compared to existing state-of-the-art deblurring approaches. We also present a new photon-limited deblurring dataset for evaluating the performance of algorithms.
Submitted 26 October, 2022; v1 submitted 28 October, 2021;
originally announced October 2021.
-
Log-concave poset inequalities
Authors:
Swee Hong Chan,
Igor Pak
Abstract:
We study combinatorial inequalities for various classes of set systems: matroids, polymatroids, poset antimatroids, and interval greedoids. We prove log-concavity inequalities for counting certain weighted feasible words, which generalize and extend several previous results establishing Mason conjectures for the numbers of independent sets of matroids. Notably, we prove matching equality conditions for both earlier inequalities and our extensions.
In contrast with much of the previous work, our proofs are combinatorial and employ nothing but linear algebra. We use the language formulation of greedoids which allows a linear algebraic setup, which in turn can be analyzed recursively. The underlying non-commutative nature of matrices associated with greedoids allows us to proceed beyond polymatroids and prove the equality conditions. As a further application of our tools, we rederive both Stanley's inequality on the number of certain linear extensions, and its equality conditions, which we then also extend to the weighted case.
Submitted 27 February, 2024; v1 submitted 20 October, 2021;
originally announced October 2021.
-
Detecting and Segmenting Adversarial Graphics Patterns from Images
Authors:
Xiangyu Qu,
Stanley H. Chan
Abstract:
Adversarial attacks pose a substantial threat to computer vision system security, but the social media industry constantly faces another form of "adversarial attack" in which the hackers attempt to upload inappropriate images and fool the automated screening systems by adding artificial graphics patterns. In this paper, we formulate the defense against such attacks as an artificial graphics pattern segmentation problem. We evaluate the efficacy of several segmentation algorithms and, based on observation of their performance, propose a new method tailored to this specific problem. Extensive experiments show that the proposed method outperforms the baselines and has a promising generalization capability, which is the most crucial aspect in segmenting artificial graphics patterns.
Submitted 20 August, 2021;
originally announced August 2021.
-
Optical Adversarial Attack
Authors:
Abhiram Gnanasambandam,
Alex M. Sherman,
Stanley H. Chan
Abstract:
We introduce OPtical ADversarial attack (OPAD). OPAD is an adversarial attack in the physical space aiming to fool image classifiers without physically touching the objects (e.g., moving or painting the objects). The principle of OPAD is to use structured illumination to alter the appearance of the target objects. The system consists of a low-cost projector, a camera, and a computer. The challenge of the problem is the non-linearity of the radiometric response of the projector and the spatially varying spectral response of the scene. Attacks generated in a conventional approach do not work in this setting unless they are calibrated to compensate for such a projector-camera model. The proposed solution incorporates the projector-camera model into the adversarial attack optimization, where a new attack formulation is derived. Experimental results prove the validity of the solution. It is demonstrated that OPAD can optically attack a real 3D object in the presence of background lighting for white-box, black-box, targeted, and untargeted attacks. Theoretical analysis is presented to quantify the fundamental performance limit of the system.
Submitted 15 August, 2021; v1 submitted 13 August, 2021;
originally announced August 2021.
-
Accelerating Atmospheric Turbulence Simulation via Learned Phase-to-Space Transform
Authors:
Zhiyuan Mao,
Nicholas Chimitt,
Stanley H. Chan
Abstract:
Fast and accurate simulation of imaging through atmospheric turbulence is essential for developing turbulence mitigation algorithms. Recognizing the limitations of previous approaches, we introduce a new concept known as the phase-to-space (P2S) transform to significantly speed up the simulation. P2S is built upon three ideas: (1) reformulating the spatially varying convolution as a set of invariant convolutions with basis functions, (2) learning the basis functions via the known turbulence statistics models, and (3) implementing the P2S transform via a lightweight network that directly converts the phase representation to a spatial representation. The new simulator offers a 300x to 1000x speedup compared to the mainstream split-step simulators while preserving the essential turbulence statistics.
Submitted 20 August, 2021; v1 submitted 24 July, 2021;
originally announced July 2021.
-
Graph Signal Restoration Using Nested Deep Algorithm Unrolling
Authors:
Masatoshi Nagahama,
Koki Yamada,
Yuichi Tanaka,
Stanley H. Chan,
Yonina C. Eldar
Abstract:
Graph signal processing is a ubiquitous task in many applications such as sensor, social, transportation and brain networks, point cloud processing, and graph neural networks. Often, graph signals are corrupted in the sensing process, thus requiring restoration. In this paper, we propose two graph signal restoration methods based on deep algorithm unrolling (DAU). First, we present a graph signal denoiser by unrolling iterations of the alternating direction method of multipliers (ADMM). We then suggest a general restoration method for linear degradation by unrolling iterations of Plug-and-Play ADMM (PnP-ADMM). In the second approach, the unrolled ADMM-based denoiser is incorporated as a submodule, leading to a nested DAU structure. The parameters in the proposed denoising/restoration methods are trainable in an end-to-end manner. Our approach is interpretable and keeps the number of parameters small since we only tune graph-independent regularization parameters. We overcome two main challenges in existing graph signal restoration methods: 1) the limited performance of convex optimization algorithms due to fixed parameters, which are often determined manually, and 2) the large number of parameters of graph neural networks, which makes training difficult. Several experiments for graph signal denoising and interpolation are performed on synthetic and real-world data. The proposed methods show performance improvements over several existing techniques in terms of root mean squared error in both tasks.
Submitted 1 June, 2022; v1 submitted 30 June, 2021;
originally announced June 2021.
-
DROID: Driver-centric Risk Object Identification
Authors:
Chengxi Li,
Stanley H. Chan,
Yi-Ting Chen
Abstract:
Identification of high-risk driving situations is generally approached through collision risk estimation or accident pattern recognition. In this work, we approach the problem from the perspective of subjective risk. We operationalize subjective risk assessment by predicting driver behavior changes and identifying the cause of changes. To this end, we introduce a new task called driver-centric risk object identification (DROID), which uses egocentric video to identify object(s) influencing a driver's behavior, given only the driver's response as the supervision signal. We formulate the task as a cause-effect problem and present a novel two-stage DROID framework, taking inspiration from models of situation awareness and causal inference. A subset of data constructed from the Honda Research Institute Driving Dataset (HDD) is used to evaluate DROID. We demonstrate state-of-the-art DROID performance, even compared with strong baseline models using this dataset. Additionally, we conduct extensive ablative studies to justify our design choices. Moreover, we demonstrate the applicability of DROID for risk assessment.
Submitted 28 February, 2023; v1 submitted 24 June, 2021;
originally announced June 2021.
-
Log-concavity in planar random walks
Authors:
Swee Hong Chan,
Igor Pak,
Greta Panova
Abstract:
We prove log-concavity of exit probabilities of lattice random walks in certain planar regions.
Submitted 22 November, 2021; v1 submitted 20 June, 2021;
originally announced June 2021.
-
Extensions of the Kahn--Saks inequality for posets of width two
Authors:
Swee Hong Chan,
Igor Pak,
Greta Panova
Abstract:
The Kahn--Saks inequality is a classical result on the number of linear extensions of finite posets. We give a new proof of this inequality for posets of width two using explicit injections of lattice paths. As a consequence we obtain a $q$-analogue, a multivariate generalization and an equality condition in this case. We also discuss the equality conditions of the Kahn--Saks inequality for general posets and prove several implications between conditions conjectured to be equivalent.
Submitted 14 February, 2022; v1 submitted 13 June, 2021;
originally announced June 2021.
-
The cross-product conjecture for width two posets
Authors:
Swee Hong Chan,
Igor Pak,
Greta Panova
Abstract:
The cross--product conjecture (CPC) of Brightwell, Felsner and Trotter (1995) is a two-parameter quadratic inequality for the number of linear extensions of a poset $P= (X, \prec)$ with given value differences on three distinct elements in $X$. We give two different proofs of this inequality for posets of width two. The first proof is algebraic and generalizes CPC to a four-parameter family. The second proof is combinatorial and extends CPC to a $q$-analogue. Further applications include relationships between CPC and other poset inequalities, including a new $q$-analogue of the Kahn--Saks inequality.
Submitted 9 March, 2022; v1 submitted 18 April, 2021;
originally announced April 2021.
-
Student-Teacher Learning from Clean Inputs to Noisy Inputs
Authors:
Guanzhe Hong,
Zhiyuan Mao,
Xiaojun Lin,
Stanley H. Chan
Abstract:
Feature-based student-teacher learning, a training method that encourages the student's hidden features to mimic those of the teacher network, is empirically successful in transferring the knowledge from a pre-trained teacher network to the student network. Furthermore, recent empirical results demonstrate that the teacher's features can boost the student network's generalization even when the student's input sample is corrupted by noise. However, there is a lack of theoretical insights into why and when this method of transferring knowledge can be successful between such heterogeneous tasks. We analyze this method theoretically using deep linear networks, and experimentally using nonlinear networks. We identify three vital factors to the success of the method: (1) whether the student is trained to zero training loss; (2) how knowledgeable the teacher is on the clean-input problem; (3) how the teacher decomposes its knowledge in its hidden features. Lack of proper control in any of the three factors leads to failure of the student-teacher learning method.
Submitted 12 March, 2021;
originally announced March 2021.
-
Recurrence of horizontal-vertical walks
Authors:
Swee Hong Chan
Abstract:
Consider a nearest neighbor random walk on the two-dimensional integer lattice, where each vertex is initially labeled either `H' or `V', uniformly and independently. At each discrete time step, the walker resamples the label at its current location (changing `H' to `V' and `V' to `H' with probability $q$). Then, it takes a mean zero horizontal step if the new label is `H', and a mean zero vertical step if the new label is `V'. This model is a randomized version of the deterministic rotor walk, whose recurrence (i.e., visiting every vertex infinitely often with probability 1) in two dimensions is still an open problem. We answer the analogous question for the horizontal-vertical walk, by showing that it is recurrent for $q \in (\frac{1}{3},1]$.
Submitted 6 May, 2022; v1 submitted 19 December, 2020;
originally announced December 2020.
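The walk described in the abstract above can be simulated directly. The following is a minimal sketch, not the author's code: the function name `hv_walk`, the lazy sampling of labels on first visit, the fixed seed, and the choice of the origin as starting point are all assumptions made for illustration. A "mean zero" nearest neighbor step is taken here to mean +1 or -1 with equal probability.

```python
import random


def hv_walk(steps, q, seed=0):
    """Simulate a horizontal-vertical walk on the integer lattice Z^2.

    Hypothetical helper for illustration only. Labels are sampled lazily:
    a vertex gets a uniform 'H' or 'V' label the first time it is visited,
    which is equivalent to labeling every vertex independently up front.
    """
    rng = random.Random(seed)
    labels = {}            # vertex -> 'H' or 'V'
    x, y = 0, 0            # assumed starting point: the origin
    path = [(x, y)]
    for _ in range(steps):
        pos = (x, y)
        if pos not in labels:
            labels[pos] = rng.choice("HV")
        # Resample: flip the label at the current vertex with probability q.
        if rng.random() < q:
            labels[pos] = "V" if labels[pos] == "H" else "H"
        # Mean zero nearest neighbor step along the axis given by the new label.
        step = rng.choice((-1, 1))
        if labels[pos] == "H":
            x += step
        else:
            y += step
        path.append((x, y))
    return path


if __name__ == "__main__":
    path = hv_walk(steps=10_000, q=0.5)
    returns = sum(1 for p in path[1:] if p == (0, 0))
    print("returns to the origin:", returns)
```

Counting returns to the origin over a finite run only gives an empirical impression of the behavior; it does not by itself demonstrate recurrence, which is a statement about infinitely many visits.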
-
HDR Imaging with Quanta Image Sensors: Theoretical Limits and Optimal Reconstruction
Authors:
Abhiram Gnanasambandam,
Stanley H. Chan
Abstract:
High dynamic range (HDR) imaging is one of the biggest achievements in modern photography. Traditional solutions to HDR imaging are designed for and applied to CMOS image sensors (CIS). However, the mainstream one-micron CIS cameras today generally have a high read noise and low frame-rate. These, in turn, limit the acquisition speed and quality, making the cameras slow in the HDR mode. In this paper, we propose a new computational photography technique for HDR imaging. Recognizing the limitations of CIS, we use the Quanta Image Sensor (QIS) to trade the spatial-temporal resolution with bit-depth. QIS is a single-photon image sensor that has comparable pixel pitch to CIS but substantially lower dark current and read noise. We provide a complete theoretical characterization of the sensor in the context of HDR imaging, by proving the fundamental limits in the dynamic range that QIS can offer and the trade-offs with noise and speed. In addition, we derive an optimal reconstruction algorithm for single-bit and multi-bit QIS. Our algorithm is theoretically optimal for \emph{all} linear reconstruction schemes based on exposure bracketing. Experimental results confirm the validity of the theory and algorithm, based on synthetic and real QIS data.
Submitted 2 December, 2020; v1 submitted 6 November, 2020;
originally announced November 2020.