Search | arXiv e-print repository

Neural Graphics Texture Compression Supporting Random Access

Authors: Farzad Farhadzadeh, Qiqi Hou, Hoang Le, Amir Said, Randall Rauwendaal, Alex Bourd, Fatih Porikli

Abstract: Advances in rendering have led to tremendous growth in texture assets, including resolution, complexity, and novel texture components, but this growth in data volume has not been matched by advances in its compression. Meanwhile Neural Image Compression (NIC) has advanced significantly and shown promising results, but the proposed methods cannot be directly adapted to neural texture compression. First, texture compression requires on-demand and real-time decoding with random access during parallel rendering (e.g. block texture decompression on GPUs). Additionally, NIC does not support multi-resolution reconstruction (mip-levels), nor does it have the ability to efficiently jointly compress different sets of texture channels. In this work, we introduce a novel approach to texture set compression that integrates traditional GPU texture representation and NIC techniques, designed to enable random access and support many-channel texture sets. To achieve this goal, we propose an asymmetric auto-encoder framework that employs a convolutional encoder to capture detailed information in a bottleneck-latent space, and at the decoder side we utilize a fully connected network, whose inputs are sampled latent features plus positional information, for a given texture coordinate and mip level. This latent data is defined to enable simplified access to multi-resolution data by simply changing the scanning strides. Experimental results demonstrate that this approach provides much better results than conventional texture compression, and significant improvement over the latest method using neural networks.
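
The stride-based multi-resolution access mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `sample_mip`, the 4x4 grid, and the assumption that each mip level doubles the stride are all hypothetical choices made for illustration.

```python
def sample_mip(grid, mip_level):
    """Return the samples of `grid` visible at `mip_level` by striding.

    Assumption (not from the paper): each mip level halves the
    resolution, so the scanning stride is 2 ** mip_level.
    """
    stride = 2 ** mip_level
    # Take every `stride`-th row, then every `stride`-th column of each row.
    return [row[::stride] for row in grid[::stride]]


# Hypothetical 4x4 single-channel latent grid holding the values 0..15.
latent = [[r * 4 + c for c in range(4)] for r in range(4)]

level0 = sample_mip(latent, 0)  # full resolution, 4x4
level1 = sample_mip(latent, 1)  # half resolution, 2x2
level2 = sample_mip(latent, 2)  # quarter resolution, 1x1
```

Under these assumptions, a single stored grid serves every resolution, so no separate pyramid of latents has to be kept per mip level.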

Submitted 6 May, 2024; originally announced July 2024.

Comments: ECCV submission

arXiv:2406.15819 [pdf, other]

Automatic AI Model Selection for Wireless Systems: Online Learning via Digital Twinning

Authors: Qiushuo Hou, Matteo Zecchin, Sangwoo Park, Yunlong Cai, Guanding Yu, Kaushik Chowdhury, Osvaldo Simeone

Abstract: In modern wireless network architectures, such as O-RAN, artificial intelligence (AI)-based applications are deployed at intelligent controllers to carry out functionalities like scheduling or power control. The AI "apps" are selected on the basis of contextual information such as network conditions, topology, traffic statistics, and design goals. The mapping between context and AI model parameters is ideally done in a zero-shot fashion via an automatic model selection (AMS) mapping that leverages only contextual information without requiring any current data. This paper introduces a general methodology for the online optimization of AMS mappings. Optimizing an AMS mapping is challenging, as it requires exposure to data collected from many different contexts. Therefore, if carried out online, this initial optimization phase would be extremely time consuming. A possible solution is to leverage a digital twin of the physical system to generate synthetic data from multiple simulated contexts. However, given that the simulator at the digital twin is imperfect, a direct use of simulated data for the optimization of the AMS mapping would yield poor performance when tested in the real system. This paper proposes a novel method for the online optimization of AMS mapping that corrects for the bias of the simulator by means of limited real data collected from the physical system. Experimental results for a graph neural network-based power control app demonstrate the significant advantages of the proposed approach.

Submitted 22 June, 2024; originally announced June 2024.

Comments: submitted for a journal publication

arXiv:2405.08021 [pdf, other]

Diff-ETS: Learning a Diffusion Probabilistic Model for Electromyography-to-Speech Conversion

Authors: Zhao Ren, Kevin Scheck, Qinhan Hou, Stefano van Gogh, Michael Wand, Tanja Schultz

Abstract: Electromyography-to-Speech (ETS) conversion has demonstrated its potential for silent speech interfaces by generating audible speech from Electromyography (EMG) signals during silent articulations. ETS models usually consist of an EMG encoder which converts EMG signals to acoustic speech features, and a vocoder which then synthesises the speech signals. Due to an inadequate amount of available data and noisy signals, the synthesised speech often exhibits a low level of naturalness. In this work, we propose Diff-ETS, an ETS model which uses a score-based diffusion probabilistic model to enhance the naturalness of synthesised speech. The diffusion model is applied to improve the quality of the acoustic features predicted by an EMG encoder. In our experiments, we evaluated fine-tuning the diffusion model on predictions of a pre-trained EMG encoder, and training both models in an end-to-end fashion. We compared Diff-ETS with a baseline ETS model without diffusion using objective metrics and a listening test. The results indicated the proposed Diff-ETS significantly improved speech naturalness over the baseline.

Submitted 11 May, 2024; originally announced May 2024.

Comments: Accepted by EMBC 2024

arXiv:2403.17879 [pdf, other]

Low-Latency Neural Stereo Streaming

Authors: Qiqi Hou, Farzad Farhadzadeh, Amir Said, Guillaume Sautiere, Hoang Le

Abstract: The rise of new video modalities like virtual reality or autonomous driving has increased the demand for efficient multi-view video compression methods, both in terms of rate-distortion (R-D) performance and in terms of delay and runtime. While most recent stereo video compression approaches have shown promising performance, they compress left and right views sequentially, leading to poor parallelization and runtime performance. This work presents Low-Latency neural codec for Stereo video Streaming (LLSS), a novel parallel stereo video coding method designed for fast and efficient low-latency stereo video streaming. Instead of using a sequential cross-view motion compensation like existing methods, LLSS introduces a bidirectional feature shifting module to directly exploit mutual information among views and encode them effectively with a joint cross-view prior model for entropy coding. Thanks to this design, LLSS processes left and right views in parallel, minimizing latency; all while substantially improving R-D performance compared to both existing neural and conventional codecs.

Submitted 26 March, 2024; originally announced March 2024.

Comments: Accepted by CVPR2024

arXiv:2312.08866 [pdf, other]

MCANet: Medical Image Segmentation with Multi-Scale Cross-Axis Attention

Authors: Hao Shao, Quansheng Zeng, Qibin Hou, Jufeng Yang

Abstract: Efficiently capturing multi-scale information and building long-range dependencies among pixels are essential for medical image segmentation because of the various sizes and shapes of the lesion regions or organs. In this paper, we present Multi-scale Cross-axis Attention (MCA) to solve the above challenging issues based on the efficient axial attention. Instead of simply connecting axial attention along the horizontal and vertical directions sequentially, we propose to calculate dual cross attentions between two parallel axial attentions to capture global information better. To process the significant variations of lesion regions or organs in individual sizes and shapes, we also use multiple convolutions of strip-shape kernels with different kernel sizes in each axial attention path to improve the efficiency of the proposed MCA in encoding spatial information. We build the proposed MCA upon the MSCAN backbone, yielding our network, termed MCANet. Our MCANet with only 4M+ parameters performs even better than most previous works with heavy backbones (e.g., Swin Transformer) on four challenging tasks, including skin lesion segmentation, nuclei segmentation, abdominal multi-organ segmentation, and polyp segmentation. Code is available at https://github.com/haoshao-nku/medical_seg.

Submitted 19 December, 2023; v1 submitted 14 December, 2023; originally announced December 2023.

arXiv:2306.13277 [pdf, ps, other]

Meta-Gating Framework for Fast and Continuous Resource Optimization in Dynamic Wireless Environments

Authors: Qiushuo Hou, Mengyuan Lee, Guanding Yu, Yunlong Cai

Abstract: With the great success of deep learning (DL) in image classification, speech recognition, and other fields, more and more studies have applied various neural networks (NNs) to wireless resource allocation. Generally speaking, these artificial intelligence (AI) models are trained under some special learning hypotheses, especially that the statistics of the training data are static during the training stage. However, the distribution of channel state information (CSI) is constantly changing in the real-world wireless communication environment. Therefore, it is essential to study effective dynamic DL technologies to solve wireless resource allocation problems. In this paper, we propose a novel framework, named meta-gating, for solving resource allocation problems in an episodically dynamic wireless environment, where the CSI distribution changes over periods and remains constant within each period. The proposed framework, consisting of an inner network and an outer network, aims to adapt to the dynamic wireless environment by achieving three important goals, i.e., seamlessness, quickness and continuity. Specifically, for the former two goals, we propose a training method by combining a model-agnostic meta-learning (MAML) algorithm with an unsupervised learning mechanism. With this training method, the inner network is able to adapt quickly to different channel distributions because of the good initialization. As for the goal of continuity, the outer network can learn to evaluate the importance of the inner network's parameters under different CSI distributions, and then decide which subset of the inner network should be activated through the gating operation. Additionally, we theoretically analyze the performance of the proposed meta-gating framework.

Submitted 22 June, 2023; originally announced June 2023.

Comments: accepted by IEEE TCOM

arXiv:2301.06943 [pdf, other]

Self-supervised Domain Adaptation for Breaking the Limits of Low-quality Fundus Image Quality Enhancement

Authors: Qingshan Hou, Peng Cao, Jiaqi Wang, Xiaoli Liu, Jinzhu Yang, Osmar R. Zaiane

Abstract: Retinal fundus images have been applied for the diagnosis and screening of eye diseases, such as Diabetic Retinopathy (DR) or Diabetic Macular Edema (DME). However, both low-quality fundus images and style inconsistency potentially increase uncertainty in the diagnosis of fundus disease and even lead to misdiagnosis by ophthalmologists. Most of the existing image enhancement methods mainly focus on improving the image quality by leveraging the guidance of high-quality images, which are difficult to collect in medical applications. In this paper, we tackle image quality enhancement in a fully unsupervised setting, i.e., neither paired images nor high-quality images. To this end, we explore the potential of the self-supervised task for improving the quality of fundus images without the requirement of high-quality reference images. Specifically, we construct multiple patch-wise domains via an auxiliary pre-trained quality assessment network and a style clustering. To achieve robust low-quality image enhancement and address style inconsistency, we formulate two self-supervised domain adaptation tasks to disentangle the features of image content, low-quality factor and style information by exploring intrinsic supervision signals within the low-quality images. Extensive experiments are conducted on EyeQ and Messidor datasets, and results show that our DASQE method achieves new state-of-the-art performance when only low-quality images are available.

Submitted 17 January, 2023; originally announced January 2023.

arXiv:2210.01879 [pdf, other]

A Perceptual Quality Metric for Video Frame Interpolation

Authors: Qiqi Hou, Abhijay Ghildyal, Feng Liu

Abstract: Research on video frame interpolation has made significant progress in recent years. However, existing methods mostly use off-the-shelf metrics to measure the quality of interpolation results with the exception of a few methods that employ user studies, which is time-consuming. As video frame interpolation results often exhibit unique artifacts, existing quality metrics sometimes are not consistent with human perception when measuring the interpolation results. Some recent deep learning-based perceptual quality metrics are shown to be more consistent with human judgments, but their performance on videos is compromised since they do not consider temporal information. In this paper, we present a dedicated perceptual quality metric for measuring video frame interpolation results. Our method learns perceptual features directly from videos instead of individual frames. It compares pyramid features extracted from video frames and employs Swin Transformer blocks-based spatio-temporal modules to extract spatio-temporal information. To train our metric, we collected a new video frame interpolation quality assessment dataset. Our experiments show that our dedicated quality metric outperforms state-of-the-art methods when measuring video frame interpolation results. Our code and model are made publicly available at https://github.com/hqqxyy/VFIPS.

Submitted 4 October, 2022; originally announced October 2022.

Comments: ECCV 2022

arXiv:2106.00532 [pdf, other]

Topology and Admittance Estimation: Precision Limits and Algorithms

Authors: Yuxiao Liu, Ning Zhang, Qingchun Hou, Audun Botterud, Chongqing Kang

Abstract: Distribution grid topology and admittance information are essential for system planning, operation, and protection. In many distribution grids, missing or inaccurate topology and admittance data call for efficient estimation methods. However, measurement data may be insufficient or contaminated with large noise, which will introduce fundamental limits to the estimation accuracy. This work explores the theoretical precision limits of the topology and admittance estimation (TAE) problem, with different measurement devices, noise levels, and the number of measurements. On this basis, we propose a conservative progressive self-adaptive (CPS) algorithm to estimate the topology and admittance. Results on IEEE 33 and 141-bus systems validate that the proposed CPS method can approach the theoretical precision limits under various measurement settings.

Submitted 1 June, 2021; originally announced June 2021.

arXiv:2004.09579 [pdf]

Sparse Oblique Decision Tree for Power System Security Rules Extraction and Embedding

Authors: Qingchun Hou, Ning Zhang, Daniel S. Kirschen, Ershun Du, Yaohua Cheng, Chongqing Kang

Abstract: Increasing the penetration of variable generation has a substantial effect on the operational reliability of power systems. The higher level of uncertainty that stems from this variability makes it more difficult to determine whether a given operating condition will be secure or insecure. Data-driven techniques provide a promising way to identify security rules that can be embedded in an economic dispatch model to keep power system operating states secure. This paper proposes using a sparse weighted oblique decision tree to learn accurate, understandable, and embeddable security rules that are linear and can be extracted as sparse matrices using a recursive algorithm. These matrices can then be easily embedded as security constraints in power system economic dispatch calculations using the Big-M method. Tests on several large datasets with high renewable energy penetration demonstrate the effectiveness of the proposed method. In particular, the sparse weighted oblique decision tree outperforms the state-of-the-art weighted oblique decision tree while keeping the security rules simple. When embedded in the economic dispatch, these rules significantly increase the percentage of secure states and reduce the average solution time.
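
The Big-M embedding of a linear rule mentioned in the abstract can be sketched as follows. This shows the generic Big-M technique only, not the paper's formulation: the function name `big_m_satisfied`, the coefficients, and the value of `big_m` are hypothetical. A linear rule a·x <= b is switched on by a binary variable z through the constraint a·x <= b + M(1 - z), which is binding when z = 1 and effectively vacuous when z = 0 provided M is large enough.

```python
def big_m_satisfied(a, x, b, z, big_m):
    """Check the Big-M form of a switchable linear rule.

    The constraint is a.x <= b + big_m * (1 - z):
    enforced when z == 1, relaxed when z == 0.
    """
    lhs = sum(ai * xi for ai, xi in zip(a, x))
    return lhs <= b + big_m * (1 - z)


# Hypothetical rule: 1.0 * x0 + 2.0 * x1 <= 10.0
a, b = [1.0, 2.0], 10.0
violating = [4.0, 5.0]  # a.x = 14, breaks the rule
feasible = [2.0, 3.0]   # a.x = 8, obeys the rule

rule_on_violating = big_m_satisfied(a, violating, b, z=1, big_m=1000.0)
rule_off_violating = big_m_satisfied(a, violating, b, z=0, big_m=1000.0)
rule_on_feasible = big_m_satisfied(a, feasible, b, z=1, big_m=1000.0)
```

In a real dispatch model the same inequality would be handed to a mixed-integer solver instead of being checked point by point, and `big_m` would be chosen as a valid upper bound on a·x - b.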

Submitted 20 April, 2020; originally announced April 2020.

Comments: 8 pages, 6 figures

arXiv:2002.10864 [pdf, other]

doi 10.1109/TIP.2021.3072811

Cross-layer Feature Pyramid Network for Salient Object Detection

Authors: Zun Li, Congyan Lang, Junhao Liew, Qibin Hou, Yidong Li, Jiashi Feng

Abstract: Feature pyramid network (FPN) based models, which fuse the semantics and salient details in a progressive manner, have been proven highly effective in salient object detection. However, it is observed that these models often generate saliency maps with incomplete object structures or unclear object boundaries, due to the indirect information propagation among distant layers that makes such fusion structure less effective. In this work, we propose a novel Cross-layer Feature Pyramid Network (CFPN), in which direct cross-layer communication is enabled to improve the progressive fusion in salient object detection. Specifically, the proposed network first aggregates multi-scale features from different layers into feature maps that have access to both the high- and low-level information. Then, it distributes the aggregated features to all the involved layers to gain access to richer context. In this way, the distributed features per layer own both semantics and salient details from all other layers simultaneously, and suffer reduced loss of important information. Extensive experimental results over six widely used salient object detection benchmarks and with three popular backbones clearly demonstrate that CFPN can accurately locate fairly complete salient regions and effectively segment the object boundaries.

Submitted 25 February, 2020; originally announced February 2020.

Comments: 10 pages, 7 figures

arXiv:1910.04028 [pdf]

Embedding Lithium-ion Battery Scrapping Criterion and Degradation Model in Optimal Operation of Peak-shaving Energy Storage

Authors: Qingchun Hou, Yanghao Yu, Ershun Du, Hongjie He, Ning Zhang, Chongqing Kang, Guojing Liu, Huan Zhu

Abstract: Lithium-ion battery systems have been used in practical power systems for peak-shaving, demand response, and frequency regulation. However, a lithium-ion battery degrades while cycling and would be scrapped when the capacity reduces to a certain threshold (e.g. 80%). Such a scrapping criterion may not explore the maximum benefit from the battery storage. In this paper, we propose a novel scrapping criterion for peak-shaving energy storage based on battery efficiency, time-of-use price, and arbitrage benefit. A new battery life model with scrapping parameters is then derived using this criterion. Embedded with the life model, an optimal operation method for peak-shaving energy storage system is presented. The results of a case study show that the operation method could maximize the benefits of peak-shaving energy storage while delaying battery degradation. Compared with the traditional 80% capacity-based scrapping criterion, our efficiency-based scrapping criterion can significantly improve the lifetime benefit of the battery.

Submitted 14 June, 2020; v1 submitted 4 October, 2019; originally announced October 2019.

Comments: Add references

arXiv:1910.02400 [pdf]

doi 10.1109/iSPEC48194.2019.8975321

A Linear LMP Model for Active and Reactive Power with Power Loss

Authors: Yanghao Yu, Qingchun Hou, Yi Ge, Guojing Liu, Ning Zhang

Abstract: Pricing the reactive power is more necessary than ever before because of the increasing challenge of renewable energy integration on reactive power balance and voltage control. However, reactive power price is hard to calculate efficiently because of the non-linear nature of the optimal AC power flow equation. This paper proposes a linear model to calculate active and reactive power LMP simultaneously considering power loss. Firstly, a linearized AC power flow equation is proposed based on an augmented Generation Shift Distribution Factors (GSDF) matrix. Secondly, a linearized LMP model is derived using GSDF and loss factors. The formulation of LMP is further decomposed into four components: energy, congestion, voltage limitation and power loss. Finally, an iterative algorithm is proposed for calculating LMP with the proposed model. The performance of the proposed model is validated by the IEEE-118 bus system.

Submitted 6 October, 2019; originally announced October 2019.

Comments: 6 pages, 6 figures, accepted by IEEE Sustainable Power & Energy Conference (iSPEC2019)

Journal ref: 2019 IEEE Sustainable Power and Energy Conference (iSPEC), Beijing, China, 2019, pp. 1699-1704

arXiv:1909.11711 [pdf]

doi 10.1016/j.apenergy.2019.03.067

Probabilistic duck curve in high PV penetration power system: Concept, modeling, and empirical analysis in China

Authors: Qingchun Hou, Ning Zhang, Ershun Du, Miao Miao, Fei Peng, Chongqing Kang

Abstract: The high penetration of photovoltaic (PV) is reshaping the electricity net-load curve and has a significant impact on power system operation and planning. The concept of duck curve is widely used to describe the timing imbalance between peak demand and PV generation. The traditional duck curve is deterministic and only shows a single extreme or typical scenario during a day. Thus, it cannot capture both the probability of that scenario and the uncertainty of PV generation and loads. These weaknesses limit the application of the duck curve on power system planning under high PV penetration. To address this issue, the novel concepts of probabilistic duck curve (PDC) and probabilistic ramp curve (PRC) are proposed to accurately model the uncertainty and variability of electricity net load and ramp under high PV penetration. An efficient method is presented for modeling PDC and PRC using kernel density estimation, copula function, and dependent discrete convolution. Several indices are designed to quantify the characteristics of the PDC and PRC. For the application, we demonstrate how the PDC and PRC will benefit flexible resource planning. Finally, an empirical study on the Qinghai provincial power system of China validates the effectiveness of the presented method. The results of PDC and PRC intuitively illustrate that the ramp demand and the valley of net load face considerable uncertainty under high PV penetration. The results of flexible resource planning indicate that retrofitting coal-fired units has remarkable performance on enhancing the power system flexibility in Qinghai. On average, reducing the minimal output of coal-fired units by 1 MW will increase PV accommodation by over 4 MWh each day.
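
The deterministic quantities behind a duck curve, net load and its ramp, can be sketched in a few lines. This is only the deterministic building block for a single scenario, not the probabilistic PDC/PRC method of the paper; the function names and the four-point load and PV profiles below are hypothetical illustration data.

```python
def net_load(load, pv):
    """Net load per time step: demand minus PV generation."""
    return [l - p for l, p in zip(load, pv)]


def ramps(series):
    """Ramp per step: change in the series between consecutive time steps."""
    return [later - earlier for earlier, later in zip(series, series[1:])]


# Hypothetical profiles in MW at four time steps (morning to evening).
load = [50, 55, 60, 70]
pv = [0, 30, 40, 5]

net = net_load(load, pv)  # midday valley where PV output is high
ramp = ramps(net)         # steep positive ramp as PV falls off in the evening
```

A probabilistic treatment would replace these single profiles with distributions over many scenarios, which is where the kernel density estimation and copula modeling named in the abstract come in.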

Submitted 25 September, 2019; originally announced September 2019.

arXiv:1709.02212 [pdf, ps, other]

Maximizing the Smallest Eigenvalue of a Symmetric Matrix: A Submodular Optimization Approach

Authors: Andrew Clark, Qiqiang Hou, Linda Bushnell, Radha Poovendran

Abstract: This paper studies the problem of selecting a submatrix of a positive definite matrix in order to achieve a desired bound on the smallest eigenvalue of the submatrix. Maximizing this smallest eigenvalue has applications to selecting input nodes in order to guarantee consensus of networks with negative edges as well as maximizing the convergence rate of distributed systems. We develop a submodular optimization approach to maximizing the smallest eigenvalue by first proving that positivity of the eigenvalues of a submatrix can be characterized using the probability distribution of the quadratic form induced by the submatrix. We then exploit that connection to prove that positive-definiteness of a submatrix can be expressed as a constraint on a submodular function. We prove that our approach results in polynomial-time algorithms with provable bounds on the size of the submatrix. We also present generalizations to non-symmetric matrices, alternative sufficient conditions for the smallest eigenvalue to exceed a desired bound that are valid for Laplacian matrices, and a numerical evaluation.

Submitted 7 September, 2017; originally announced September 2017.

Showing 1–15 of 15 results for author: Hou, Q