Neural Approximate Mirror Maps for Constrained Diffusion Models

Authors: Berthy T. Feng, Ricardo Baptista, Katherine L. Bouman

Abstract: Diffusion models excel at creating visually-convincing images, but they often struggle to meet subtle constraints inherent in the training data. Such constraints could be physics-based (e.g., satisfying a PDE), geometric (e.g., respecting symmetry), or semantic (e.g., including a particular number of objects). When the training data all satisfy a certain constraint, enforcing this constraint on a diffusion model not only improves its distribution-matching accuracy but also makes it more reliable for generating valid synthetic data and solving constrained inverse problems. However, existing methods for constrained diffusion models are inflexible with different types of constraints. Recent work proposed to learn mirror diffusion models (MDMs) in an unconstrained space defined by a mirror map and to impose the constraint with an inverse mirror map, but analytical mirror maps are challenging to derive for complex constraints. We propose neural approximate mirror maps (NAMMs) for general constraints. Our approach only requires a differentiable distance function from the constraint set. We learn an approximate mirror map that pushes data into an unconstrained space and a corresponding approximate inverse that maps data back to the constraint set. A generative model, such as an MDM, can then be trained in the learned mirror space and its samples restored to the constraint set by the inverse map. We validate our approach on a variety of constraints, showing that compared to an unconstrained diffusion model, a NAMM-based MDM substantially improves constraint satisfaction. We also demonstrate how existing diffusion-based inverse-problem solvers can be easily applied in the learned mirror space to solve constrained inverse problems.

Submitted 18 June, 2024; originally announced June 2024.

arXiv:2406.02785 [pdf, other]

Event-horizon-scale Imaging of M87* under Different Assumptions via Deep Generative Image Priors

Authors: Berthy T. Feng, Katherine L. Bouman, William T. Freeman

Abstract: Reconstructing images from the Event Horizon Telescope (EHT) observations of M87*, the supermassive black hole at the center of the galaxy M87, depends on a prior to impose desired image statistics. However, given the impossibility of directly observing black holes, there is no clear choice for a prior. We present a framework for flexibly designing a range of priors, each bringing different biases to the image reconstruction. These priors can be weak (e.g., impose only basic natural-image statistics) or strong (e.g., impose assumptions of black-hole structure). Our framework uses Bayesian inference with score-based priors, which are data-driven priors arising from a deep generative model that can learn complicated image distributions. Using our Bayesian imaging approach with sophisticated data-driven priors, we can assess how visual features and uncertainty of reconstructed images change depending on the prior. In addition to simulated data, we image the real EHT M87* data and discuss how recovered features are influenced by the choice of prior.

Submitted 4 June, 2024; originally announced June 2024.

arXiv:2405.10463 [pdf, other]

Single-shot volumetric fluorescence imaging with neural fields

Authors: Oumeng Zhang, Haowen Zhou, Brandon Y. Feng, Elin M. Larsson, Reinaldo E. Alcalde, Siyuan Yin, Catherine Deng, Changhuei Yang

Abstract: Single-shot volumetric fluorescence (SVF) imaging offers a significant advantage over traditional imaging methods that require scanning across multiple axial planes as it can capture biological processes with high temporal resolution across a large field of view. The key challenges in SVF imaging include requiring sparsity constraints to meet the multiplexing requirements of compressed sensing, eliminating depth ambiguity in the reconstruction, and maintaining high resolution across a large field of view. In this paper, we introduce the QuadraPol point spread function (PSF) combined with neural fields, a novel approach for SVF imaging. This method utilizes a custom polarizer at the back focal plane and a polarization camera to detect fluorescence, effectively encoding the 3D scene within a compact PSF without depth ambiguity. Additionally, we propose a reconstruction algorithm based on the neural fields technique that provides improved reconstruction quality and addresses the inaccuracies of phase retrieval methods used to correct imaging system aberrations. This algorithm combines the accuracy of experimental PSFs with the long depth of field of computationally generated retrieved PSFs. QuadraPol PSF, combined with neural fields, significantly reduces the acquisition time of a conventional fluorescence microscope by approximately 20 times and captures a 100 mm$^3$ cubic volume in one shot. We validate the effectiveness of both our hardware and algorithm through all-in-focus imaging of bacterial colonies on sand surfaces and visualization of plant root morphology. Our approach offers a powerful tool for advancing biological research and ecological studies.

Submitted 4 June, 2024; v1 submitted 16 May, 2024; originally announced May 2024.

arXiv:2404.15761 [pdf, other]

Rechargeable UAV Trajectory Optimization for Real-Time Persistent Data Collection of Large-Scale Sensor Networks

Authors: Rui Wang, Deshi Li, Qingqing Wu, Kaitao Meng, Boning Feng, Lele Cong

Abstract: Unmanned aerial vehicles (UAVs) have received plenty of attention due to their high flexibility and enhanced communication ability; nonetheless, the limited onboard energy restricts UAVs' application on persistent data collection missions in large areas. In this paper, we propose a rechargeable UAV-assisted periodic data collection scheme, where a UAV is dispatched to periodically collect data from sensor nodes (SNs) in the mission area and charged by a wireless charging platform. Specifically, the periodic data collection completion time is minimized by optimizing the UAV trajectory to reach the optimal balance among the collection time, flight time, and recharging time. The formulated problem is non-convex and difficult to solve directly. To tackle this problem, we divide the main problem into two sub-problems and address them by leveraging successive convex approximation (SCA), bisection search, and heuristic methods. Then, we propose a periodic trajectory optimization algorithm to iteratively solve the two sub-problems to minimize the completion time. Furthermore, to deal with the dynamics of SNs, we propose a low-complexity trajectory adjustment strategy, where the trajectory can be maintained or adjusted locally when the SNs change, which significantly mitigates the computation cost of re-optimization. The simulation results show the superiority and robustness of the proposed scheme, with a completion time that is on average 39% and 33% lower than the two benchmarks, respectively.

Submitted 6 June, 2024; v1 submitted 24 April, 2024; originally announced April 2024.

Comments: 13 pages, 17 figures, submitted to IEEE for possible publication

arXiv:2404.09734 [pdf, other]

Weighted Sum-Rate Maximization for Movable Antenna-Enhanced Wireless Networks

Authors: Biqian Feng, Yongpeng Wu, Xiang-Gen Xia, Chengshan Xiao

Abstract: This letter investigates the weighted sum rate maximization problem in movable antenna (MA)-enhanced systems. To reduce the computational complexity, we transform it into a more tractable weighted minimum mean square error (WMMSE) problem well-suited for MA. We then adopt the WMMSE algorithm and majorization-minimization algorithm to optimize the beamforming and antenna positions, respectively. Moreover, we propose a planar movement mode, which constrains each MA to a specified area, and obtain a low-complexity closed-form solution. Numerical results demonstrate that the MA-enhanced system outperforms the conventional system. Besides, the computation time for the planar movement mode is reduced by approximately 30% at a small performance cost.

Submitted 15 April, 2024; originally announced April 2024.

Comments: Accepted by IEEE Wireless Communications Letters

arXiv:2404.07985 [pdf, other]

WaveMo: Learning Wavefront Modulations to See Through Scattering

Authors: Mingyang Xie, Haiyun Guo, Brandon Y. Feng, Lingbo Jin, Ashok Veeraraghavan, Christopher A. Metzler

Abstract: Imaging through scattering media is a fundamental and pervasive challenge in fields ranging from medical diagnostics to astronomy. A promising strategy to overcome this challenge is wavefront modulation, which induces measurement diversity during image acquisition. Despite its importance, designing optimal wavefront modulations to image through scattering remains under-explored. This paper introduces a novel learning-based framework to address the gap. Our approach jointly optimizes wavefront modulations and a computationally lightweight feedforward "proxy" reconstruction network. This network is trained to recover scenes obscured by scattering, using measurements that are modified by these modulations. The learned modulations produced by our framework generalize effectively to unseen scattering scenarios and exhibit remarkable versatility. During deployment, the learned modulations can be decoupled from the proxy network to augment other more computationally expensive restoration algorithms. Through extensive experiments, we demonstrate our approach significantly advances the state of the art in imaging through scattering media. Our project webpage is at https://wavemo-2024.github.io/.

Submitted 11 April, 2024; originally announced April 2024.

arXiv:2404.00471 [pdf, other]

doi 10.1109/ICASSP48485.2024.10447579

Score-Based Diffusion Models for Photoacoustic Tomography Image Reconstruction

Authors: Sreemanti Dey, Snigdha Saha, Berthy T. Feng, Manxiu Cui, Laure Delisle, Oscar Leong, Lihong V. Wang, Katherine L. Bouman

Abstract: Photoacoustic tomography (PAT) is a rapidly-evolving medical imaging modality that combines optical absorption contrast with ultrasound imaging depth. One challenge in PAT is image reconstruction with inadequate acoustic signals due to limited sensor coverage or due to the density of the transducer array. Such cases call for solving an ill-posed inverse reconstruction problem. In this work, we use score-based diffusion models to solve the inverse problem of reconstructing an image from limited PAT measurements. The proposed approach allows us to incorporate an expressive prior learned by a diffusion model on simulated vessel structures while still being robust to varying transducer sparsity conditions.

Submitted 30 March, 2024; originally announced April 2024.

Comments: 5 pages

Journal ref: ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Seoul, Korea, Republic of, 2024, pp. 2470-2474

arXiv:2312.04679 [pdf, other]

ConVRT: Consistent Video Restoration Through Turbulence with Test-time Optimization of Neural Video Representations

Authors: Haoming Cai, Jingxi Chen, Brandon Y. Feng, Weiyun Jiang, Mingyang Xie, Kevin Zhang, Ashok Veeraraghavan, Christopher Metzler

Abstract: Atmospheric turbulence presents a significant challenge in long-range imaging. Current restoration algorithms often struggle with temporal inconsistency, as well as limited generalization ability across varying turbulence levels and scene content different than the training data. To tackle these issues, we introduce a self-supervised method, Consistent Video Restoration through Turbulence (ConVRT), a test-time optimization method featuring a neural video representation designed to enhance temporal consistency in restoration. A key innovation of ConVRT is the integration of a pretrained vision-language model (CLIP) for semantic-oriented supervision, which steers the restoration towards sharp, photorealistic images in the CLIP latent space. We further develop a principled selection strategy of text prompts, based on their statistical correlation with a perceptual metric. ConVRT's test-time optimization allows it to adapt to a wide range of real-world turbulence conditions, effectively leveraging the insights gained from pre-trained models on simulated data. ConVRT offers a comprehensive and effective solution for mitigating real-world turbulence in dynamic videos.

Submitted 7 December, 2023; originally announced December 2023.

Comments: https://convrt-2024.github.io/

arXiv:2310.18529 [pdf, other]

FPM-INR: Fourier ptychographic microscopy image stack reconstruction using implicit neural representations

Authors: Haowen Zhou, Brandon Y. Feng, Haiyun Guo, Siyu Lin, Mingshu Liang, Christopher A. Metzler, Changhuei Yang

Abstract: Image stacks provide invaluable 3D information in various biological and pathological imaging applications. Fourier ptychographic microscopy (FPM) enables reconstructing high-resolution, wide field-of-view image stacks without z-stack scanning, thus significantly accelerating image acquisition. However, existing FPM methods take tens of minutes to reconstruct and gigabytes of memory to store a high-resolution volumetric scene, impeding fast gigapixel-scale remote digital pathology. While deep learning approaches have been explored to address this challenge, existing methods poorly generalize to novel datasets and can produce unreliable hallucinations. This work presents FPM-INR, a compact and efficient framework that integrates physics-based optical models with implicit neural representations (INR) to represent and reconstruct FPM image stacks. FPM-INR is agnostic to system design or sample types and does not require external training data. In our demonstrated experiments, FPM-INR substantially outperforms traditional FPM algorithms with up to a 25-fold increase in speed and an 80-fold reduction in memory usage for continuous image stack representations.

Submitted 31 October, 2023; v1 submitted 27 October, 2023; originally announced October 2023.

Comments: Project Page: https://hwzhou2020.github.io/FPM-INR-Web/

arXiv:2310.10835 [pdf, other]

Provable Probabilistic Imaging using Score-Based Generative Priors

Authors: Yu Sun, Zihui Wu, Yifan Chen, Berthy T. Feng, Katherine L. Bouman

Abstract: Estimating high-quality images while also quantifying their uncertainty are two desired features in an image reconstruction algorithm for solving ill-posed inverse problems. In this paper, we propose plug-and-play Monte Carlo (PMC) as a principled framework for characterizing the space of possible solutions to a general inverse problem. PMC is able to incorporate expressive score-based generative priors for high-quality image reconstruction while also performing uncertainty quantification via posterior sampling. In particular, we introduce two PMC algorithms which can be viewed as the sampling analogues of the traditional plug-and-play priors (PnP) and regularization by denoising (RED) algorithms. We also establish a theoretical analysis for characterizing the convergence of the PMC algorithms. Our analysis provides non-asymptotic stationarity guarantees for both algorithms, even in the presence of non-log-concave likelihoods and imperfect score networks. We demonstrate the performance of the PMC algorithms on multiple representative inverse problems with both linear and nonlinear forward models. Experimental results show that PMC significantly improves reconstruction quality and enables high-fidelity uncertainty quantification.

Submitted 29 December, 2023; v1 submitted 16 October, 2023; originally announced October 2023.

arXiv:2308.06720 [pdf, other]

Joint Beamforming and Antenna Movement Design for Moveable Antenna Systems Based on Statistical CSI

Authors: Xintai Chen, Biqian Feng, Yongpeng Wu, Derrick Wing Kwan Ng, Robert Schober

Abstract: This paper studies a novel movable antenna (MA)-enhanced multiple-input multiple-output (MIMO) system to leverage the corresponding spatial degrees of freedom (DoFs) for improving the performance of wireless communications. We aim to maximize the achievable rate by jointly optimizing the MA positions and the transmit covariance matrix based on statistical channel state information (CSI). To solve the resulting design problem, we develop a constrained stochastic successive convex approximation (CSSCA) algorithm applicable for the general movement mode. Furthermore, we propose two simplified antenna movement modes, namely the linear movement mode and the planar movement mode, to facilitate efficient antenna movement and reduce the computational complexity of the CSSCA algorithm. Numerical results show that the considered MA-enhanced system can significantly improve the achievable rate compared to conventional MIMO systems employing uniform planar arrays (UPAs) and that the proposed planar movement mode performs closely to the performance upper bound achieved by the general movement mode.

Submitted 18 August, 2023; v1 submitted 13 August, 2023; originally announced August 2023.

Comments: Accepted by GLOBECOM 2023

arXiv:2306.05629 [pdf, other]

R-PMAC: A Robust Preamble Based MAC Mechanism Applied in Industrial Internet of Things

Authors: Kai Song, Biqian Feng, Yongpeng Wu, Zhen Gao, Wenjun Zhang

Abstract: This paper proposes a novel media access control (MAC) mechanism, called the robust preamble-based MAC mechanism (R-PMAC), which can be applied to power line communication (PLC) networks in the context of the Industrial Internet of Things (IIoT). Compared with other MAC mechanisms such as P-MAC and the MAC layer of IEEE1901.1, R-PMAC has higher networking speed. Besides, it supports whitelist authentication and functions properly in the presence of data frame loss. Firstly, we outline three basic mechanisms of R-PMAC, containing precise time difference calculation, preambles generation and short ID allocation. Secondly, we elaborate its networking process of single layer and multiple layers. Thirdly, we illustrate its robust mechanisms, including collision handling and data retransmission. Moreover, a low-cost hardware platform is established to measure the time of connecting hundreds of PLC nodes for the R-PMAC, P-MAC, and IEEE1901.1 mechanisms in a real power line environment. The experiment results show that R-PMAC outperforms the other mechanisms by achieving a 50% reduction in networking time. These findings indicate that the R-PMAC mechanism holds great potential for quickly and effectively building a PLC network in actual industrial scenarios.

Submitted 8 June, 2023; originally announced June 2023.

Comments: This paper has been accepted by IEEE Internet of Things Journal

arXiv:2305.07584 [pdf, other]

Proactive Content Caching Scheme in Urban Vehicular Networks

Authors: Biqian Feng, Chenyuan Feng, Daquan Feng, Yongpeng Wu, Xiang-Gen Xia

Abstract: Stream media content caching is a key enabling technology to promote the value chain of future urban vehicular networks. Nevertheless, the high mobility of vehicles, intermittency of information transmissions, high dynamics of user requests, limited caching capacities and extreme complexity of business scenarios pose an enormous challenge to content caching and distribution in vehicular networks. To tackle this problem, this paper aims to design a novel edge-computing-enabled hierarchical cooperative caching framework. Firstly, we profoundly analyze the spatio-temporal correlation between the historical vehicle trajectory of user requests and construct the system model to predict the vehicle trajectory and content popularity, which lays a foundation for mobility-aware content caching and dispatching. Meanwhile, we probe into privacy protection strategies to realize a privacy-preserved prediction model. Furthermore, based on trajectory and popular content prediction results, a content caching strategy is studied, and adaptive and dynamic resource management schemes are proposed for hierarchical cooperative caching networks. Finally, simulations are provided to verify the superiority of our proposed scheme and algorithms. They show that the proposed algorithms effectively improve the performance of the considered system in terms of hit ratio and average delay, and narrow the gap to the optimal caching scheme compared with the traditional schemes.

Submitted 12 May, 2023; originally announced May 2023.

Comments: Accepted by IEEE Transactions on Communications

arXiv:2208.02466 [pdf, other]

Linear MIMO Precoders Design for Finite Alphabet Inputs via Model-Free Training

Authors: Chen Cao, Biqian Feng, Yongpeng Wu, Derrick Wing Kwan Ng, Wenjun Zhang

Abstract: This paper investigates a novel method for designing linear precoders with finite alphabet inputs based on autoencoders (AE) without the knowledge of the channel model. By model-free training of the autoencoder in a multiple-input multiple-output (MIMO) system, the proposed method can effectively solve the optimization problem to design the precoders that maximize the mutual information between the channel inputs and outputs, when only the input-output information of the channel can be observed. Specifically, the proposed method regards the receiver and the precoder as two independent parameterized functions in the AE and alternately trains them using the exact and approximated gradient, respectively. Compared with previous precoders design methods, it alleviates the limitation of requiring the explicit channel model to be known. Simulation results show that the proposed method works as well as those methods under known channel models in terms of maximizing the mutual information and reducing the bit error rate.

Submitted 4 August, 2022; originally announced August 2022.

Comments: Accepted by GLOBECOM 2022

arXiv:2203.11404 [pdf, other]

Enhanced Preamble Based MAC Mechanism for IIoT-oriented PLC Network

Authors: Kai Song, Biqian Feng, Yongpeng Wu, Wenjun Zhang

Abstract: In this paper, we propose an enhanced preamble based media access control mechanism (E-PMAC), which can be applied in power line communication (PLC) networks for the Industrial Internet of Things (IIoT). We introduce detailed technologies used in E-PMAC, including the delay calibration mechanism, preamble design, and slot allocation algorithm. With these technologies, E-PMAC is more robust than the existing preamble based MAC mechanism (P-MAC). Besides, we analyze the disadvantage of P-MAC in multi-layer networking and design the networking process of E-PMAC to accelerate the networking process. We analyze the complexity of the networking process in P-MAC and E-PMAC and prove that E-PMAC has lower complexity than P-MAC. Finally, we simulate the single-layer networking and multi-layer networking of E-PMAC, P-MAC, and the existing PLC protocol, i.e., IEEE1901.1. The simulation results indicate that E-PMAC spends much less time in networking than IEEE1901.1 and P-MAC. With our work, a PLC network based on the E-PMAC mechanism can be realized.

Submitted 21 March, 2022; originally announced March 2022.

Comments: 7 pages, 12 figures, to appear in The 2022 IEEE 95th Vehicular Technology Conference (VTC2022-Spring)

arXiv:2203.06764 [pdf, other]

TurbuGAN: An Adversarial Learning Approach to Spatially-Varying Multiframe Blind Deconvolution with Applications to Imaging Through Turbulence

Authors: Brandon Yushan Feng, Mingyang Xie, Christopher A. Metzler

Abstract: We present a self-supervised and self-calibrating multi-shot approach to imaging through atmospheric turbulence, called TurbuGAN. Our approach requires no paired training data, adapts itself to the distribution of the turbulence, leverages domain-specific data priors, and can generalize from tens to thousands of measurements. We achieve such functionality through an adversarial sensing framework adapted from CryoGAN, which uses a discriminator network to match the distributions of captured and simulated measurements. Our framework builds on CryoGAN by (1) generalizing the forward measurement model to incorporate physically accurate and computationally efficient models for light propagation through anisoplanatic turbulence, (2) enabling adaptation to slightly misspecified forward models, and (3) leveraging domain-specific prior knowledge using pretrained generative networks, when available. We validate TurbuGAN on both computationally simulated and experimentally captured images distorted with anisoplanatic turbulence.

Submitted 2 January, 2023; v1 submitted 13 March, 2022; originally announced March 2022.

arXiv:2202.13566 [pdf]

doi 10.1109/MIS.2020.3026990

Learning Parameters for a Generalized Vidale-Wolfe Response Model with Flexible Ad Elasticity and Word-of-Mouth

Authors: Yanwu Yang, Baozhu Feng, Daniel Zeng

Abstract: In this research, we investigate a generalized form of the Vidale-Wolfe (GVW) model. One key element of our modeling work is that the GVW model contains two useful indexes representing the advertiser's elasticity and the word-of-mouth (WoM) effect, respectively. Moreover, we discuss some desirable properties of the GVW model, and present a deep neural network (DNN)-based estimation method to learn its parameters. Furthermore, based on three real-world datasets, we conduct computational experiments to validate the GVW model and the identified properties. In addition, we discuss potential advantages of the GVW model over econometric models. The research outcome shows that both the ad elasticity index and the WoM index have significant influences on advertising responses, and the GVW model has potential advantages over econometric models of advertising, in terms of several interesting phenomena drawn from practical advertising situations. The GVW model and its deep learning-based estimation method provide a basis to support big data-driven advertising analytics and decision making; meanwhile, the identified properties and experimental findings of this research illuminate critical managerial insights for advertisers in various advertising forms.

Submitted 28 February, 2022; originally announced February 2022.

Comments: 20 pages, 8 figures, 1 table

MSC Class: 68Txx ACM Class: I.2.6

Journal ref: IEEE Intelligent Systems, 36(5), 69-79 (2021)

arXiv:2201.05932 [pdf]

doi 10.1016/j.energy.2022.123226

Joint Planning of Distributed Generations and Energy Storage in Active Distribution Networks: A Bi-Level Programming Approach

Authors: Yang Li, Bo Feng, Bin Wang, Shuchao Sun

Abstract: In order to improve the penetration of renewable energy resources for distribution networks, a joint planning model of distributed generations (DGs) and energy storage is proposed for an active distribution network by using a bi-level programming approach in this paper. In this model, the upper-level aims to seek the optimal location and capacity of DGs and energy storage, while the lower-level optimizes the operation of energy storage devices. To solve this model, an improved binary particle swarm optimization (IBPSO) algorithm based on chaos optimization is developed, and the optimal joint planning is achieved through alternating iterations between the two levels. The simulation results on the PG & E 69-bus distribution system demonstrate that the presented approach manages to reduce the planning deviation caused by the uncertainties of DG outputs and remarkably improve the voltage profile and operational economy of distribution systems.

Submitted 15 January, 2022; originally announced January 2022.

Comments: Accepted by Energy

Journal ref: Energy 245 (2022) 123226

arXiv:2112.08133 [pdf]

doi 10.1016/j.bios.2021.113699

Ptychographic sensor for large-scale lensless microbial monitoring with high spatiotemporal resolution

Authors: Shaowei Jiang, Chengfei Guo, Zichao Bian, Ruihai Wang, Jiakai Zhu, Pengming Song, Patrick Hu, Derek Hu, Zibang Zhang, Kazunori Hoshino, Bin Feng, Guoan Zheng

Abstract: Traditional microbial detection methods often rely on the overall property of microbial cultures and cannot resolve individual growth event at high spatiotemporal resolution. As a result, they require bacteria to grow to confluence and then interpret the results. Here, we demonstrate the application of an integrated ptychographic sensor for lensless cytometric analysis of microbial cultures over a large scale and with high spatiotemporal resolution. The reported device can be placed within a regular incubator or used as a standalone incubating unit for long-term microbial monitoring. For longitudinal study where massive data are acquired at sequential time points, we report a new temporal-similarity constraint to increase the temporal resolution of ptychographic reconstruction by 7-fold. With this strategy, the reported device achieves a centimeter-scale field of view, a half-pitch spatial resolution of 488 nm, and a temporal resolution of 15-second intervals. For the first time, we report the direct observation of bacterial growth in a 15-second interval by tracking the phase wraps of the recovered images, with high phase sensitivity like that in interferometric measurements. We also characterize cell growth via longitudinal dry mass measurement and perform rapid bacterial detection at low concentrations. For drug-screening application, we demonstrate proof-of-concept antibiotic susceptibility testing and perform single-cell analysis of antibiotic-induced filamentation.
The combination of high phase sensitivity, high spatiotemporal resolution, and large field of view is unique among existing microscopy techniques. As a quantitative and miniaturized platform, it can improve studies with microorganisms and other biospecimens at resource-limited settings.

Submitted 15 December, 2021; originally announced December 2021.

Comments: 18 pages, 6 figures

arXiv:2106.15458 [pdf, other]

Optimization Techniques in Reconfigurable Intelligent Surface Aided Networks

Authors: Biqian Feng, Junyuan Gao, Yongpeng Wu, Wenjun Zhang, Xiang-Gen Xia, Chengshan Xiao

Abstract: Reconfigurable intelligent surface (RIS)-aided networks have been investigated for the purpose of improving the system performance. However, the introduced unit modulus phase shifts and coupling characteristic bring enormous challenges to the optimization in the RIS-aided networks. Many efforts have been made to jointly optimize phase shift vector and other parameters. This article intends to survey the latest research results about the optimization in RIS-aided networks. A taxonomy is devised to categorize the existing literatures based on optimization types, phase shift form, and decoupling methods. Furthermore, in alternating optimization framework, we introduce in detail how to exploit the aforementioned technologies flexibly. It is known that most works could not guarantee a stationary point. To overcome this problem, we propose a unified framework for the optimization problem of RIS-aided networks with continuous phase shifts to find a stationary point. Finally, key challenges are outlined to provide guidelines for the domain researchers and designers to explore more efficient optimization frameworks, and then open issues are discussed.

Submitted 18 August, 2021; v1 submitted 29 June, 2021; originally announced June 2021.

Comments: Accepted by IEEE wireless communication magazine

arXiv:2104.02735 [pdf, other]

doi 10.1109/CVPR52688.2022.01575

Visual Vibration Tomography: Estimating Interior Material Properties from Monocular Video

Authors: Berthy T. Feng, Alexander C. Ogren, Chiara Daraio, Katherine L. Bouman

Abstract: An object's interior material properties, while invisible to the human eye, determine motion observed on its surface. We propose an approach that estimates heterogeneous material properties of an object from a monocular video of its surface vibrations. Specifically, we show how to estimate Young's modulus and density throughout a 3D object with known geometry. Knowledge of how these values change across the object is useful for simulating its motion and characterizing any defects. Traditional non-destructive testing approaches, which often require expensive instruments, generally estimate only homogenized material properties or simply identify the presence of defects. In contrast, our approach leverages monocular video to (1) identify image-space modes from an object's sub-pixel motion, and (2) directly infer spatially-varying Young's modulus and density values from the observed modes. We demonstrate our approach on both simulated and real videos.

Submitted 23 April, 2023; v1 submitted 6 April, 2021; originally announced April 2021.

arXiv:2103.06116 [pdf, other]

Spatial Attention-based Non-reference Perceptual Quality Prediction Network for Omnidirectional Images

Authors: Li Yang, Mai Xu, Deng Xin, Bo Feng

Abstract: Due to the strong correlation between visual attention and perceptual quality, many methods attempt to use human saliency information for image quality assessment. Although this mechanism can get good performance, the networks require human saliency labels, which is not easily accessible for omnidirectional images (ODI). To alleviate this issue, we propose a spatial attention-based perceptual quality prediction network for non-reference quality assessment on ODIs (SAP-net). To drive our SAP-net, we establish a large-scale IQA dataset of ODIs (IQA-ODI), which is composed of subjective scores of 200 subjects on 1,080 ODIs. In IQA-ODI, there are 120 high quality ODIs as reference, and 960 ODIs with impairments in both JPEG compression and map projection. Without any human saliency labels, our network can adaptively estimate human perceptual quality on impaired ODIs through a self-attention manner, which significantly promotes the prediction performance of quality scores. Moreover, our method greatly reduces the computational complexity in quality assessment task on ODIs. Extensive experiments validate that our network outperforms 9 state-of-the-art methods for quality assessment on ODIs. The dataset and code have been available on https://github.com/yanglixiaoshen/SAP-Net.

Submitted 10 March, 2021; originally announced March 2021.

Comments: Accepted by IEEE ICME 2021

arXiv:2009.05236 [pdf, other]

doi 10.1145/3459637.3482230

An Efficient Quantitative Approach for Optimizing Convolutional Neural Networks

Authors: Yuke Wang, Boyuan Feng, Xueqiao Peng, Yufei Ding

Abstract: With the increasing popularity of deep learning, Convolutional Neural Networks (CNNs) have been widely applied in various domains, such as image classification and object detection, and achieve stunning success in terms of their high accuracy over the traditional statistical methods. To exploit the potential of CNN models, a huge amount of research and industry efforts have been devoted to optimizing CNNs. Among these endeavors, CNN architecture design has attracted tremendous attention because of its great potential of improving model accuracy or reducing model complexity. However, existing work either introduces repeated training overhead in the search process or lacks an interpretable metric to guide the design. To clear these hurdles, we propose 3D-Receptive Field (3DRF), an explainable and easy-to-compute metric, to estimate the quality of a CNN architecture and guide the search process of designs. To validate the effectiveness of 3DRF, we build a static optimizer to improve the CNN architectures at both the stage level and the kernel level. Our optimizer not only provides a clear and reproducible procedure but also mitigates unnecessary training efforts in the architecture search process. Extensive experiments and studies show that the models generated by our optimizer can achieve up to 5.47% accuracy improvement and up to 65.38% parameters deduction, compared with state-of-the-art CNN structures like MobileNet and ResNet.

Submitted 15 September, 2021; v1 submitted 11 September, 2020; originally announced September 2020.

arXiv:2009.00473 [pdf, other]

doi 10.1109/TSP.2020.3021985

Large Intelligent Surface Aided Physical Layer Security Transmission

Authors: Biqian Feng, Yongpeng Wu, Mengfan Zheng, Xiang-Gen Xia, Yongjian Wang, Chengshan Xiao

Abstract: In this paper, we investigate a large intelligent surface-enhanced (LIS-enhanced) system, where a LIS is deployed to assist secure transmission. Our design aims to maximize the achievable secrecy rates in different channel models, i.e., Rician fading and (or) independent and identically distributed Gaussian fading for the legitimate and eavesdropper channels. In addition, we take into consideration an artificial noise-aided transmission structure for further improving system performance. The difficulties of tackling the aforementioned problems are the structure of the expected secrecy rate expressions and the non-convex phase shift constraint. To facilitate the design, we propose two frameworks, namely the sample average approximation based (SAA-based) algorithm and the hybrid stochastic projected gradient-convergent policy (hybrid SPG-CP) algorithm, to calculate the expectation terms in the secrecy rate expressions. Meanwhile, majorization minimization (MM) is adopted to address the non-convexity of the phase shift constraint. In addition, we give some analyses on two special scenarios by making full use of the expectation terms. Simulation results show that the proposed algorithms effectively optimize the secrecy communication rate for the considered setup, and the LIS-enhanced system greatly improves secrecy performance compared to conventional architectures without LIS.

Submitted 1 September, 2020; originally announced September 2020.

Comments: Accepted by IEEE Transactions on Signal Processing

arXiv:2007.10479 [pdf, other]

doi 10.1016/j.neucom.2020.06.045

Deep multi-metric learning for text-independent speaker verification

Authors: Jiwei Xu, Xinggang Wang, Bin Feng, Wenyu Liu

Abstract: Text-independent speaker verification is an important artificial intelligence problem that has a wide spectrum of applications, such as criminal investigation, payment certification, and interest-based customer services. The purpose of text-independent speaker verification is to determine whether two given uncontrolled utterances originate from the same speaker or not. Extracting speech features for each speaker using deep neural networks is a promising direction to explore and a straightforward solution is to train the discriminative feature extraction network by using a metric learning loss function. However, a single loss function often has certain limitations. Thus, we use deep multi-metric learning to address the problem and introduce three different losses for this problem, i.e., triplet loss, n-pair loss and angular loss. The three loss functions work in a cooperative way to train a feature extraction network equipped with Residual connections and squeeze-and-excitation attention. We conduct experiments on the large-scale VoxCeleb2 dataset, which contains over a million utterances from over 6,000 speakers, and the proposed deep neural network obtains an equal error rate of 3.48%, which is a very competitive result. Codes for both training and testing and pretrained models are available at https://github.com/GreatJiweix/DmmlTiSV, which is the first publicly available code repository for large-scale text-independent speaker verification with performance on par with the state-of-the-art systems.

Submitted 17 July, 2020; originally announced July 2020.

Journal ref: Neurocomputing, Volume 410, 14 October 2020, Pages 394-400

Showing 1–25 of 25 results for author: Feng, B