-
Large Language Model-aided Edge Learning in Distribution System State Estimation
Authors:
Renyou Xie,
Xin Yin,
Chaojie Li,
Nian Liu,
Bo Zhao,
Zhaoyang Dong
Abstract:
Distribution system state estimation (DSSE) plays a crucial role in the real-time monitoring, control, and operation of distribution networks. Besides intensive computational requirements, conventional DSSE methods need high-quality measurements to obtain accurate states, whereas missing values often occur due to sensor failures or communication delays. To address these challenging issues, a forecast-then-estimate framework of edge learning is proposed for DSSE, leveraging large language models (LLMs) to forecast missing measurements and provide pseudo-measurements. Firstly, natural language-based prompts and measurement sequences are integrated by the proposed LLM to learn patterns from historical data and provide accurate forecasting results. Secondly, a convolutional layer-based neural network model is introduced to improve the robustness of state estimation under missing measurements. Thirdly, to alleviate the overfitting of the deep learning-based DSSE, it is reformulated as a multi-task learning framework containing shared and task-specific layers. The uncertainty weighting algorithm is applied to find the optimal weights to balance different tasks. A numerical simulation on the Simbench case demonstrates the effectiveness of the proposed forecast-then-estimate framework.
Submitted 11 May, 2024;
originally announced May 2024.
-
AddSR: Accelerating Diffusion-based Blind Super-Resolution with Adversarial Diffusion Distillation
Authors:
Rui Xie,
Ying Tai,
Chen Zhao,
Kai Zhang,
Zhenyu Zhang,
Jun Zhou,
Xiaoqian Ye,
Qian Wang,
Jian Yang
Abstract:
Blind super-resolution methods based on stable diffusion showcase formidable generative capabilities in reconstructing clear high-resolution images with intricate details from low-resolution inputs. However, their practical applicability is often hampered by poor efficiency, stemming from the requirement of thousands or hundreds of sampling steps. Inspired by the efficient adversarial diffusion distillation (ADD), we design AddSR to address this issue by incorporating the ideas of both distillation and ControlNet. Specifically, we first propose a prediction-based self-refinement strategy to provide high-frequency information in the student model output with marginal additional time cost. Furthermore, we refine the training process by employing HR images, rather than LR images, to regulate the teacher model, providing a more robust constraint for distillation. Second, we introduce a timestep-adaptive ADD to address the perception-distortion imbalance problem introduced by original ADD. Extensive experiments demonstrate our AddSR generates better restoration results, while achieving faster speed than previous SD-based state-of-the-art models (e.g., $7\times$ faster than SeeSR).
Submitted 23 May, 2024; v1 submitted 2 April, 2024;
originally announced April 2024.
-
Deep CSI Compression for Dual-Polarized Massive MIMO Channels with Disentangled Representation Learning
Authors:
Suhang Fan,
Wei Xu,
Renjie Xie,
Shi Jin,
Derrick Wing Kwan Ng,
Naofal Al-Dhahir
Abstract:
Channel state information (CSI) feedback is critical for achieving the promised advantages of enhancing spectral and energy efficiencies in massive multiple-input multiple-output (MIMO) wireless communication systems. Deep learning (DL)-based methods have been proven effective in reducing the required signaling overhead for CSI feedback. In practical dual-polarized MIMO scenarios, channels in the vertical and horizontal polarization directions tend to exhibit high polarization correlation. To fully exploit the inherent propagation similarity within dual-polarized channels, we propose a disentangled representation neural network (NN) for CSI feedback, referred to as DiReNet. The proposed DiReNet disentangles dual-polarized CSI into three components: polarization-shared information, vertical polarization-specific information, and horizontal polarization-specific information. This disentanglement of dual-polarized CSI enables the minimization of information redundancy caused by the polarization correlation and improves the performance of CSI compression and recovery. Additionally, flexible quantization and network extension schemes are designed. Consequently, our method provides a pragmatic solution for CSI feedback to harness the physical MIMO polarization as a priori information. Our experimental results show that the performance of our proposed DiReNet surpasses that of existing DL-based networks, while also effectively reducing the number of network parameters by nearly one third.
Submitted 28 March, 2024;
originally announced March 2024.
-
Controllable Preference Optimization: Toward Controllable Multi-Objective Alignment
Authors:
Yiju Guo,
Ganqu Cui,
Lifan Yuan,
Ning Ding,
Jiexin Wang,
Huimin Chen,
Bowen Sun,
Ruobing Xie,
Jie Zhou,
Yankai Lin,
Zhiyuan Liu,
Maosong Sun
Abstract:
Alignment in artificial intelligence pursues the consistency between model responses and human preferences as well as values. In practice, the multifaceted nature of human preferences inadvertently introduces what is known as the "alignment tax", a compromise where enhancements in alignment within one objective (e.g., harmlessness) can diminish performance in others (e.g., helpfulness). However, existing alignment techniques are mostly unidirectional, leading to suboptimal trade-offs and poor flexibility over various objectives. To navigate this challenge, we argue the prominence of grounding LLMs with evident preferences. We introduce controllable preference optimization (CPO), which explicitly specifies preference scores for different objectives, thereby guiding the model to generate responses that meet the requirements. Our experimental analysis reveals that the aligned models can provide responses that match various preferences among the "3H" (helpfulness, honesty, harmlessness) desiderata. Furthermore, by introducing diverse data and alignment goals, we surpass baseline methods in aligning with single objectives, hence mitigating the impact of the alignment tax and achieving Pareto improvements in multi-objective alignment.
Submitted 29 February, 2024;
originally announced February 2024.
-
AGADIR: Towards Array-Geometry Agnostic Directional Speech Recognition
Authors:
Ju Lin,
Niko Moritz,
Yiteng Huang,
Ruiming Xie,
Ming Sun,
Christian Fuegen,
Frank Seide
Abstract:
Wearable devices like smart glasses are approaching the compute capability to seamlessly generate real-time closed captions for live conversations. We build on our recently introduced directional Automatic Speech Recognition (ASR) for smart glasses that have microphone arrays, which fuses multi-channel ASR with serialized output training, for wearer/conversation-partner disambiguation as well as suppression of cross-talk speech from non-target directions and noise.
When ASR work is part of a broader system-development process, one may be faced with changes to microphone geometries as system development progresses.
This paper aims to make multi-channel ASR insensitive to limited variations of microphone-array geometry. We show that a model trained on multiple similar geometries is largely agnostic and generalizes well to new geometries, as long as they are not too different. Furthermore, training the model this way improves accuracy for seen geometries by 15 to 28% relative. Lastly, we refine the beamforming by a novel Non-Linearly Constrained Minimum Variance criterion.
Submitted 18 January, 2024;
originally announced January 2024.
-
A Multi-timescale and Chance-Constrained Energy Dispatching Strategy of Integrated Heat-Power Community with Shared Hybrid Energy Storage
Authors:
Wenyi Zhang,
Yue Chen,
Rui Xie,
Yunjian Xu
Abstract:
The community in the future may develop into an integrated heat-power system, which includes a high proportion of renewable energy, power generator units, heat generator units, and shared hybrid energy storage. In the integrated heat-power system with coupled heat-power generators and demands, the key challenges lie in the interaction between heat and power, the inherent uncertainty of renewable energy and consumers' demands, and the multi-timescale scheduling of heat and power. In this paper, we propose a game theoretic model of the integrated heat-power system. For the welfare-maximizing community operator, its energy dispatch strategy is under chance constraints, where the day-ahead scheduling determines the scheduled energy dispatching strategies, and the real-time dispatch considers the adjustment of generators. For utility-maximizing consumers, their demands are sensitive to the preference parameters. Taking into account the uncertainty in both renewable energy and consumer demand, we prove the existence and uniqueness of the Stackelberg game equilibrium and develop a fixed point algorithm to find the market equilibrium between the community operator and community consumers. Numerical simulations on an integrated heat-power system validate the effectiveness of the proposed multi-timescale integrated heat and power model.
Submitted 23 October, 2023;
originally announced October 2023.
-
The Multi-modality Cell Segmentation Challenge: Towards Universal Solutions
Authors:
Jun Ma,
Ronald Xie,
Shamini Ayyadhury,
Cheng Ge,
Anubha Gupta,
Ritu Gupta,
Song Gu,
Yao Zhang,
Gihun Lee,
Joonkee Kim,
Wei Lou,
Haofeng Li,
Eric Upschulte,
Timo Dickscheid,
José Guilherme de Almeida,
Yixin Wang,
Lin Han,
Xin Yang,
Marco Labagnara,
Vojislav Gligorovski,
Maxime Scheder,
Sahand Jamal Rahi,
Carly Kempster,
Alice Pollitt,
Leon Espinosa, et al. (15 additional authors not shown)
Abstract:
Cell segmentation is a critical step for quantitative single-cell analysis in microscopy images. Existing cell segmentation methods are often tailored to specific modalities or require manual interventions to specify hyper-parameters in different experimental settings. Here, we present a multi-modality cell segmentation benchmark, comprising over 1500 labeled images derived from more than 50 diverse biological experiments. The top participants developed a Transformer-based deep-learning algorithm that not only exceeds existing methods but can also be applied to diverse microscopy images across imaging platforms and tissue types without manual parameter adjustments. This benchmark and the improved algorithm offer promising avenues for more accurate and versatile cell analysis in microscopy imaging.
Submitted 1 April, 2024; v1 submitted 10 August, 2023;
originally announced August 2023.
-
Self-information Domain-based Neural CSI Compression with Feature Coupling
Authors:
Ziqing Yin,
Renjie Xie,
Wei Xu,
Zhaohui Yang,
Xiaohu You
Abstract:
Deep learning (DL)-based channel state information (CSI) feedback methods compress the CSI matrix by exploiting its delay and angle features straightforwardly, while the amount of information contained in the CSI matrix has rarely been considered. Based on this observation, we introduce self-information as an informative CSI representation from the perspective of information theory, which reflects the amount of information of the original CSI matrix in an explicit way. Then, a novel DL-based network is proposed for temporal CSI compression in the self-information domain, namely SD-CsiNet. The proposed SD-CsiNet projects the raw CSI onto a self-information matrix in the newly-defined self-information domain, extracts both temporal and spatial features of the self-information matrix, and then couples these two features for effective compression. Experimental results verify the effectiveness of the proposed SD-CsiNet by exploiting the self-information of CSI. In particular, for compression ratios of 1/8 and 1/16, SD-CsiNet achieves performance gains of 7.17 dB and 3.68 dB, respectively, compared to state-of-the-art methods.
Submitted 30 April, 2023;
originally announced May 2023.
-
SynthVSR: Scaling Up Visual Speech Recognition With Synthetic Supervision
Authors:
Xubo Liu,
Egor Lakomkin,
Konstantinos Vougioukas,
Pingchuan Ma,
Honglie Chen,
Ruiming Xie,
Morrie Doulaty,
Niko Moritz,
Jáchym Kolář,
Stavros Petridis,
Maja Pantic,
Christian Fuegen
Abstract:
Recently reported state-of-the-art results in visual speech recognition (VSR) often rely on increasingly large amounts of video data, while the publicly available transcribed video datasets are limited in size. In this paper, for the first time, we study the potential of leveraging synthetic visual data for VSR. Our method, termed SynthVSR, substantially improves the performance of VSR systems with synthetic lip movements. The key idea behind SynthVSR is to leverage a speech-driven lip animation model that generates lip movements conditioned on the input speech. The speech-driven lip animation model is trained on an unlabeled audio-visual dataset and could be further optimized towards a pre-trained VSR model when labeled videos are available. As plenty of transcribed acoustic data and face images are available, we are able to generate large-scale synthetic data using the proposed lip animation model for semi-supervised VSR training. We evaluate the performance of our approach on the largest public VSR benchmark, Lip Reading Sentences 3 (LRS3). SynthVSR achieves a WER of 43.3% with only 30 hours of real labeled data, outperforming off-the-shelf approaches using thousands of hours of video. The WER is further reduced to 27.9% when using all 438 hours of labeled data from LRS3, which is on par with the state-of-the-art self-supervised AV-HuBERT method. Furthermore, when combined with large-scale pseudo-labeled audio-visual data, SynthVSR yields a new state-of-the-art VSR WER of 16.9% using publicly available data only, surpassing the recent state-of-the-art approaches trained with 29 times more non-public machine-transcribed video data (90,000 hours). Finally, we perform extensive ablation studies to understand the effect of each component in our proposed method.
Submitted 3 April, 2023; v1 submitted 30 March, 2023;
originally announced March 2023.
-
Sizing Grid-Connected Wind Power Generation and Energy Storage with Wake Effect and Endogenous Uncertainty: A Distributionally Robust Method
Authors:
Rui Xie,
Wei Wei,
Yue Chen
Abstract:
Wind power, as a green energy resource, is growing rapidly worldwide, along with energy storage systems (ESSs) to mitigate its volatility. Sizing of wind power generation and ESSs has become an important problem to be addressed. The wake effect in a wind farm can cause wind speed deficits and a drop in downstream wind turbine power generation, which, however, has rarely been considered in the sizing problem in power systems. In this paper, a bi-objective distributionally robust optimization (DRO) model is proposed to determine the capacities of wind power generation and ESSs considering the wake effect. An ambiguity set based on the Wasserstein metric is established to characterize the wind power and demand uncertainties. In particular, wind power uncertainty is affected by the wind power generation capacity, which is determined in the first stage. Thus, the proposed model is a DRO problem with endogenous uncertainty (or decision-dependent uncertainty). To solve the proposed model, a stochastic programming approximation method based on minimum Lipschitz constants is developed to turn the DRO model into a linear program. Then, an iterative algorithm is built, embedded with methods for evaluating the minimum Lipschitz constants. Case studies demonstrate the necessity of considering the wake effect and the effectiveness of the proposed method.
Submitted 11 June, 2023; v1 submitted 30 December, 2022;
originally announced December 2022.
-
Cooperative Sensing and Heterogeneous Information Fusion in VCPS: A Multi-agent Deep Reinforcement Learning Approach
Authors:
Xincao Xu,
Kai Liu,
Penglin Dai,
Ruitao Xie,
Jingjing Cao,
Jiangtao Luo
Abstract:
Cooperative sensing and heterogeneous information fusion are critical to realizing vehicular cyber-physical systems (VCPSs). This paper makes the first attempt to quantitatively measure the quality of VCPS by designing a new metric called Age of View (AoV). Specifically, we first present the system architecture where heterogeneous information can be cooperatively sensed and uploaded via vehicle-to-infrastructure (V2I) communications in vehicular edge computing (VEC). Logical views are constructed by fusing the heterogeneous information at edge nodes. Further, we formulate the problem by deriving a cooperative sensing model based on the multi-class M/G/1 priority queue, and defining the AoV by modeling the timeliness, completeness and consistency of the logical views. On this basis, a multi-agent deep reinforcement learning solution is proposed. In particular, the system state includes vehicle sensed information, edge cached information and view requirements. The vehicle action space consists of the sensing frequencies and uploading priorities of information. A difference-reward-based credit assignment is designed to divide the system reward, which is defined as the VCPS quality, into the difference reward for vehicles. The edge node allocates V2I bandwidth to vehicles based on predicted vehicle trajectories and view requirements. Finally, we build the simulation model and give a comprehensive performance evaluation, which conclusively demonstrates the superiority of the proposed solution.
Submitted 27 January, 2023; v1 submitted 25 September, 2022;
originally announced September 2022.
-
Efficient low-thrust trajectory data generation based on generative adversarial network
Authors:
Ruida Xie,
Andrew G. Dempster
Abstract:
Deep learning-based techniques have been introduced into the field of trajectory optimization in recent years. Deep Neural Networks (DNNs) are trained and used as surrogates for the conventional optimization process. They can provide low thrust (LT) transfer cost estimation and enable more complex preliminary mission designs. However, it is a challenge to efficiently obtain the required amount of trajectory data for training. A Generative Adversarial Network (GAN) is adapted to generate feasible LT trajectory data efficiently. The GAN consists of a generator and a discriminator, both of which are deep networks. The generator generates fake LT transfer features using random noise as input, while the discriminator distinguishes the generator's fake LT transfer features from real LT transfer features. The GAN is trained until the generator generates fake LT transfers that the discriminator cannot identify. This indicates that the generator generates LT transfer features that have the same distribution as the real transfer features. The generated LT transfer data have a high convergence rate, and they can be used to efficiently produce training data for deep learning models. The proposed approach is validated by generating feasible LT transfers in a Near-Earth Asteroid (NEA) mission scenario. The convergence rate of GAN-generated samples is 84.3%.
Submitted 14 September, 2022;
originally announced September 2022.
-
Disentangled Representation Learning for RF Fingerprint Extraction under Unknown Channel Statistics
Authors:
Renjie Xie,
Wei Xu,
Jiabao Yu,
Aiqun Hu,
Derrick Wing Kwan Ng,
A. Lee Swindlehurst
Abstract:
Deep learning (DL) applied to a device's radio-frequency fingerprint (RFF) has attracted significant attention in physical-layer authentication due to its extraordinary classification performance. Conventional DL-RFF techniques are trained by adopting maximum likelihood estimation (MLE). Although their discriminability has recently been extended to unknown devices in open-set scenarios, they still tend to overfit the channel statistics embedded in the training dataset. This restricts their practical applications as it is challenging to collect sufficient training data capturing the characteristics of all possible wireless channel environments. To address this challenge, we propose a DL framework of disentangled representation (DR) learning that first learns to factor the signals into a device-relevant component and a device-irrelevant component via adversarial learning. Then, it shuffles these two parts within a dataset for implicit data augmentation, which imposes a strong regularization on RFF extractor learning to avoid the possible overfitting of device-irrelevant channel statistics, without collecting additional data from unknown channels. Experiments validate that the proposed approach, referred to as DR-based RFF, outperforms conventional methods in terms of generalizability to unknown devices even under unknown complicated propagation environments, e.g., dispersive multipath fading channels, even though all the training data are collected in a simple environment with dominant direct line-of-sight (LoS) propagation paths.
Submitted 17 October, 2022; v1 submitted 4 August, 2022;
originally announced August 2022.
-
Deep CSI Compression for Massive MIMO: A Self-information Model-driven Neural Network
Authors:
Ziqing Yin,
Wei Xu,
Renjie Xie,
Shaoqing Zhang,
Derrick Wing Kwan Ng,
Xiaohu You
Abstract:
In order to fully exploit the advantages of massive multiple-input multiple-output (mMIMO), it is critical for the transmitter to accurately acquire the channel state information (CSI). Deep learning (DL)-based methods have been proposed for CSI compression and feedback to the transmitter. Although most existing DL-based methods consider the CSI matrix as an image, structural features of the CSI image are rarely exploited in neural network design. As such, we propose a model of self-information that dynamically measures the amount of information contained in each patch of a CSI image from the perspective of structural features. Then, by applying the self-information model, we propose a model-and-data-driven network for CSI compression and feedback, namely IdasNet. The IdasNet includes the design of a module of self-information deletion and selection (IDAS), an encoder of informative feature compression (IFC), and a decoder of informative feature recovery (IFR). In particular, the model-driven module of IDAS pre-compresses the CSI image by removing informative redundancy in terms of the self-information. The encoder of IFC then conducts feature compression on the pre-compressed CSI image and generates a feature codeword which contains two components, i.e., codeword values and position indices of the codeword values. Subsequently, the IFR decoder decouples the codeword values as well as position indices to recover the CSI image. Experimental results verify that the proposed IdasNet noticeably outperforms existing DL-based networks under various compression ratios while reducing the number of network parameters by orders of magnitude compared with various existing methods.
Submitted 25 April, 2022;
originally announced April 2022.
-
Generative Compression for Face Video: A Hybrid Scheme
Authors:
Anni Tang,
Yan Huang,
Jun Ling,
Zhiyu Zhang,
Yiwei Zhang,
Rong Xie,
Li Song
Abstract:
As the latest video coding standard, versatile video coding (VVC) has shown its ability in retaining pixel quality. To excavate more compression potential for video conference scenarios under ultra-low bitrate, this paper proposes a bitrate adjustable hybrid compression scheme for face video. This hybrid scheme combines the pixel-level precise recovery capability of traditional coding with the generation capability of deep learning based on abridged information, where Pixel wise Bi-Prediction, Low-Bitrate-FOM and Lossless Keypoint Encoder collaborate to achieve PSNR up to 36.23 dB at a low bitrate of 1.47 KB/s. Without introducing any additional bitrate, our method has a clear advantage over VVC under a completely fair comparative experiment, which proves the effectiveness of our proposed scheme. Moreover, our scheme can adapt to any existing encoder / configuration to deal with different encoding requirements, and the bitrate can be dynamically adjusted according to the network condition.
Submitted 20 March, 2023; v1 submitted 21 April, 2022;
originally announced April 2022.
-
A Novel Wide-Area Control Strategy for Damping of Critical Frequency Oscillations via Modulation of Active Power Injections
Authors:
Ruichao Xie,
Innocent Kamwa,
C. Y. Chung
Abstract:
This paper proposes a novel wide-area control strategy for modulating the active power injections to damp the critical frequency oscillations in power systems, including the inter-area oscillations and the transient frequency swing. The proposed method pursues an efficient utilization of the limited power reserve of existing distributed energy resources (DERs) to mitigate these oscillations. This is accomplished by decoupling the damping control actions at different sites using the oscillation signals of the concerned mode as the power commands. A theoretical basis for this decoupled modulating control is provided. Technically, the desired sole modal oscillation signals are filtered out by linearly combining the system-wide frequencies, which is determined by the linear quadratic regulator based sparsity-promoting (LQRSP) technique. With the proposed strategy, the modulation of each active power injection can be effectively engineered considering the response limit and steady-state output capability of the supporting device. The method is validated based on a two-area test system and is further demonstrated based on the New England 39-bus test system.
Submitted 25 March, 2020;
originally announced March 2020.
-
Pathological Myopic Image Analysis with Transfer Learning
Authors:
Ruitao Xie,
Libo Liu,
Jingxin Liu,
Connor S Qiu
Abstract:
We present a summary of transfer learning based methods for several challenging myopic fundus image analysis tasks, including classification of pathological and non-pathological myopia, localisation of the fovea, and segmentation of the optic disc. By adapting existing popular deep learning architectures, our proposed methods have achieved 1st and 2nd place in several tasks at the Pathologic Myopia Challenge held at ISBI 2019.
Submitted 31 July, 2019;
originally announced August 2019.
-
Learning an Inverse Tone Mapping Network with a Generative Adversarial Regularizer
Authors:
Shiyu Ning,
Hongteng Xu,
Li Song,
Rong Xie,
Wenjun Zhang
Abstract:
Transferring a low-dynamic-range (LDR) image to a high-dynamic-range (HDR) image, which is the so-called inverse tone mapping (iTM), is an important imaging technique to improve visual effects of imaging devices. In this paper, we propose a novel deep learning-based iTM method, which learns an inverse tone mapping network with a generative adversarial regularizer. In the framework of alternating optimization, we learn a U-Net-based HDR image generator to transfer input LDR images to HDR ones, and a simple CNN-based discriminator to classify the real HDR images and the generated ones. Specifically, when learning the generator we consider the content-related loss and the generative adversarial regularizer jointly to improve the stability and the robustness of the generated HDR images. Using the learned generator as the proposed inverse tone mapping network, we achieve superior iTM results to the state-of-the-art methods consistently.
Submitted 20 April, 2018;
originally announced April 2018.