-
The THU-HCSI Multi-Speaker Multi-Lingual Few-Shot Voice Cloning System for LIMMITS'24 Challenge
Authors:
Yixuan Zhou,
Shuoyi Zhou,
Shun Lei,
Zhiyong Wu,
Menglin Wu
Abstract:
This paper presents the multi-speaker multi-lingual few-shot voice cloning system developed by the THU-HCSI team for the LIMMITS'24 Challenge. To achieve high speaker similarity and naturalness in both mono-lingual and cross-lingual scenarios, we build the system upon YourTTS and add several enhancements. To further improve speaker similarity and speech quality, we introduce a speaker-aware text encoder and a flow-based decoder with Transformer blocks. In addition, we denoise the few-shot data, mix them with pre-training data, and adopt a speaker-balanced sampling strategy to guarantee effective fine-tuning for target speakers. The official evaluations in track 1 show that our system achieves the best speaker similarity MOS of 4.25 and a naturalness MOS of 3.97.
Submitted 25 April, 2024;
originally announced April 2024.
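The speaker-balanced sampling strategy mentioned in the abstract above can be illustrated with a minimal sketch. The abstract does not specify how the sampling is implemented, so everything here is an assumption: the function names, the toy speaker labels, and the choice to weight each utterance inversely to its speaker's utterance count so that every speaker receives the same total sampling probability.

```python
import random
from collections import Counter


def speaker_balanced_weights(speaker_ids):
    """Return one sampling weight per utterance.

    Each speaker's utterances share an equal slice of the total
    probability mass, so speakers with few utterances (for example
    few-shot target speakers) are not drowned out by large ones.
    """
    counts = Counter(speaker_ids)
    n_speakers = len(counts)
    return [1.0 / (n_speakers * counts[s]) for s in speaker_ids]


def sample_batch(utterances, speaker_ids, batch_size, rng=random):
    """Draw a batch (with replacement) using the balanced weights."""
    weights = speaker_balanced_weights(speaker_ids)
    return rng.choices(utterances, weights=weights, k=batch_size)


# Hypothetical toy data: 8 pre-training utterances, 2 few-shot ones.
speaker_ids = ["pretrain_A"] * 8 + ["target_B"] * 2
utterances = [f"utt_{i}" for i in range(len(speaker_ids))]
weights = speaker_balanced_weights(speaker_ids)
batch = sample_batch(utterances, speaker_ids, batch_size=4)
```

With these weights, the two few-shot utterances together are as likely to be drawn as the eight pre-training utterances together.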
-
Uncertainty Modeling in Ultrasound Image Segmentation for Precise Fetal Biometric Measurements
Authors:
Shuge Lei
Abstract:
Medical image segmentation, particularly in the context of ultrasound data, is a crucial aspect of computer vision and medical imaging. This paper delves into the complexities of uncertainty in the segmentation process, focusing on fetal head and femur ultrasound images. The proposed methodology involves extracting target contours and exploring techniques for precise parameter measurement. Uncertainty modeling methods are employed to enhance the training and testing processes of the segmentation network. The study reveals that the average absolute error in fetal head circumference measurement is 8.0833mm, with a relative error of 4.7347%. Similarly, the average absolute error in fetal femur measurement is 2.6163mm, with a relative error of 6.3336%. Uncertainty modeling experiments employing Test-Time Augmentation (TTA) demonstrate effective interpretability of data uncertainty on both datasets. This suggests that incorporating data uncertainty based on the TTA method can support clinical practitioners in making informed decisions and obtaining more reliable measurement results in practical clinical applications. The paper contributes to the advancement of ultrasound image segmentation, addressing critical challenges and improving the reliability of biometric measurements.
Submitted 17 January, 2024;
originally announced January 2024.
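Test-Time Augmentation (TTA), which the abstract above uses to model data uncertainty, can be sketched as follows: run the model on several augmented copies of the input, map each prediction back to the original frame, and read the per-pixel variance as an uncertainty map. This is a generic illustration, not the paper's pipeline; the toy model, the single horizontal-flip augmentation, and all function names are assumptions.

```python
from statistics import mean, pvariance


def identity(img):
    """Return a copy of the image (the 'no augmentation' case)."""
    return [row[:] for row in img]


def hflip(img):
    """Flip the image horizontally; the flip is its own inverse."""
    return [row[::-1] for row in img]


def tta_predict(model, image, augmentations):
    """Average predictions over augmentations.

    augmentations is a list of (forward, inverse) pairs. Returns the
    per-pixel mean prediction and the per-pixel variance, the latter
    serving as a simple data-uncertainty estimate.
    """
    preds = [inverse(model(forward(image))) for forward, inverse in augmentations]
    h, w = len(image), len(image[0])
    mean_map = [[mean(p[i][j] for p in preds) for j in range(w)] for i in range(h)]
    var_map = [[pvariance([p[i][j] for p in preds]) for j in range(w)] for i in range(h)]
    return mean_map, var_map


def toy_model(img):
    """Hypothetical stand-in for a segmentation network.

    Its output depends on the column position, so flipped inputs
    produce different predictions and a non-zero variance.
    """
    w = len(img[0])
    return [[v * (j + 1) / w for j, v in enumerate(row)] for row in img]


image = [[0.0, 1.0], [0.5, 0.0]]
mean_map, var_map = tta_predict(toy_model, image, [(identity, identity), (hflip, hflip)])
```

Pixels where the augmented predictions disagree receive a higher variance, which is the signal a clinician-facing tool could surface alongside the measurement.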
-
Dual-Channel Reliable Breast Ultrasound Image Classification Based on Explainable Attribution and Uncertainty Quantification
Authors:
Shuge Lei,
Haonan Hu,
Dasheng Sun,
Huabin Zhang,
Kehong Yuan,
Jian Dai,
Jijun Tang,
Yan Tong
Abstract:
This paper focuses on the classification of breast ultrasound images and studies how to measure the reliability of the classification results. We propose a dual-channel evaluation framework based on inference reliability and predictive reliability scores. For the inference reliability evaluation, human-aligned and doctor-agreed inference rationales based on the improved feature attribution algorithm SP-RISA are applied. Uncertainty quantification is used to evaluate the predictive reliability via the Test Time Enhancement. The effectiveness of this reliability evaluation framework has been verified on our breast ultrasound clinical dataset YBUS, and its robustness is verified on the public dataset BUSI. The expected calibration errors on both datasets are significantly lower than those of traditional evaluation methods, which demonstrates the effectiveness of our proposed reliability measurement.
Submitted 7 January, 2024;
originally announced January 2024.
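The expected calibration error (ECE) reported in the abstract above is commonly computed by binning predictions by confidence and averaging the gap between accuracy and mean confidence in each bin, weighted by bin size. The sketch below follows that standard binned definition; the number of bins and the equal-width binning are assumptions, since the abstract does not state the paper's exact settings.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """Binned ECE: sum over bins of (bin share) * |accuracy - mean confidence|.

    confidences: predicted probability of the chosen class, in [0, 1].
    correct: truthy if the prediction was right, falsy otherwise.
    """
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        # Equal-width bins; a confidence of exactly 1.0 goes in the last bin.
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, ok))

    total = len(confidences)
    ece = 0.0
    for members in bins:
        if not members:
            continue
        avg_conf = sum(c for c, _ in members) / len(members)
        accuracy = sum(1 for _, ok in members if ok) / len(members)
        ece += len(members) / total * abs(accuracy - avg_conf)
    return ece


# A model that is always fully confident but right only half the time.
overconfident = expected_calibration_error([1.0, 1.0], [1, 0])
```

A lower value means the predicted confidences track the observed accuracy more closely.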
-
Sample Robust Scheduling of Electricity-Gas Systems Under Wind Power Uncertainty
Authors:
Rong-Peng Liu,
Yunhe Hou,
Yujia Li,
Shunbo Lei,
Wei Wei,
Xiaozhe Wang
Abstract:
This paper adopts a two-stage sample robust optimization (SRO) model to address the wind power penetrated unit commitment optimal energy flow (UC-OEF) problem for IEGSs. The two-stage SRO model can be approximately transformed into a computationally efficient form. Specifically, we employ linear decision rules to simplify the proposed UC-OEF model. Moreover, we further enhance the tractability of the simplified model by exploring its structural features and, accordingly, develop a solution method.
Submitted 30 December, 2023;
originally announced January 2024.
-
Improving Language Model-Based Zero-Shot Text-to-Speech Synthesis with Multi-Scale Acoustic Prompts
Authors:
Shun Lei,
Yixuan Zhou,
Liyang Chen,
Dan Luo,
Zhiyong Wu,
Xixin Wu,
Shiyin Kang,
Tao Jiang,
Yahui Zhou,
Yuxing Han,
Helen Meng
Abstract:
Zero-shot text-to-speech (TTS) synthesis aims to clone any unseen speaker's voice without adaptation parameters. By quantizing speech waveform into discrete acoustic tokens and modeling these tokens with the language model, recent language model-based TTS models show zero-shot speaker adaptation capabilities with only a 3-second acoustic prompt of an unseen speaker. However, they are limited by the length of the acoustic prompt, which makes it difficult to clone personal speaking style. In this paper, we propose a novel zero-shot TTS model with the multi-scale acoustic prompts based on a neural codec language model VALL-E. A speaker-aware text encoder is proposed to learn the personal speaking style at the phoneme-level from the style prompt consisting of multiple sentences. Following that, a VALL-E based acoustic decoder is utilized to model the timbre from the timbre prompt at the frame-level and generate speech. The experimental results show that our proposed method outperforms baselines in terms of naturalness and speaker similarity, and can achieve better performance by scaling out to a longer style prompt.
Submitted 9 April, 2024; v1 submitted 21 September, 2023;
originally announced September 2023.
-
Watch the Speakers: A Hybrid Continuous Attribution Network for Emotion Recognition in Conversation With Emotion Disentanglement
Authors:
Shanglin Lei,
** Wang,
Guanting Dong,
Jiang Li,
Yingjian Liu
Abstract:
Emotion Recognition in Conversation (ERC) has attracted widespread attention in the natural language processing field due to its enormous potential for practical applications. Existing ERC methods face challenges in achieving generalization to diverse scenarios due to insufficient modeling of context, ambiguous capture of dialogue relationships and overfitting in speaker modeling. In this work, we present a Hybrid Continuous Attributive Network (HCAN) to address these issues from the perspective of emotional continuation and emotional attribution. Specifically, HCAN adopts a hybrid recurrent and attention-based module to model global emotion continuity. Then a novel Emotional Attribution Encoding (EAE) is proposed to model intra- and inter-emotional attribution for each utterance. Moreover, to enhance the robustness of the model in speaker modeling and improve its performance in different scenarios, a comprehensive loss function, the emotional cognitive loss $\mathcal{L}_{\rm EC}$, is proposed to alleviate emotional drift and overcome the overfitting of the model to speaker modeling. Our model achieves state-of-the-art performance on three datasets, demonstrating the superiority of our work. Extensive comparative experiments and ablation studies on three benchmarks provide evidence for the efficacy of each module. Further experiments on generalization ability show the plug-and-play nature of the EAE module in our method.
Submitted 19 September, 2023; v1 submitted 18 September, 2023;
originally announced September 2023.
-
GRASS: Unified Generation Model for Speech-to-Semantic Tasks
Authors:
Aobo Xia,
Shuyu Lei,
Yushu Yang,
Xiang Guo,
Hua Chai
Abstract:
This paper explores the instruction fine-tuning technique for speech-to-semantic tasks by introducing a unified end-to-end (E2E) framework that generates target text conditioned on a task-related prompt for audio data. We pre-train the model using large and diverse data, where instruction-speech pairs are constructed via a text-to-speech (TTS) system. Extensive experiments demonstrate that our proposed model achieves state-of-the-art (SOTA) results on many benchmarks covering speech named entity recognition, speech sentiment analysis, speech question answering, and more, after fine-tuning. Furthermore, the proposed model achieves competitive performance in zero-shot and few-shot scenarios. To facilitate future work on instruction fine-tuning for speech-to-semantic tasks, we release our instruction dataset and code.
Submitted 11 September, 2023; v1 submitted 6 September, 2023;
originally announced September 2023.
-
Towards Improving the Expressiveness of Singing Voice Synthesis with BERT Derived Semantic Information
Authors:
Shaohuan Zhou,
Shun Lei,
Weiya You,
Deyi Tuo,
Yuren You,
Zhiyong Wu,
Shiyin Kang,
Helen Meng
Abstract:
This paper presents an end-to-end high-quality singing voice synthesis (SVS) system that uses bidirectional encoder representation from Transformers (BERT) derived semantic embeddings to improve the expressiveness of the synthesized singing voice. Based on the main architecture of the recently proposed VISinger, we put forward several specific designs for expressive singing voice synthesis. First, unlike previous SVS models, we use the text representation of lyrics extracted from pre-trained BERT as additional input to the model. The representation contains information about the semantics of the lyrics, which could help the SVS system produce a more expressive and natural voice. Second, we introduce an energy predictor to stabilize the synthesized voice and model the wider range of energy variations that also contribute to the expressiveness of the singing voice. Finally, to attenuate off-key issues, the pitch predictor is re-designed to predict the real-to-note pitch ratio. Both objective and subjective experimental results indicate that the proposed SVS system produces higher-quality singing voice than VISinger.
Submitted 31 August, 2023;
originally announced August 2023.
-
Towards Spontaneous Style Modeling with Semi-supervised Pre-training for Conversational Text-to-Speech Synthesis
Authors:
Weiqin Li,
Shun Lei,
Qiaochu Huang,
Yixuan Zhou,
Zhiyong Wu,
Shiyin Kang,
Helen Meng
Abstract:
The spontaneous behavior that often occurs in conversations makes speech more human-like than reading-style speech. However, synthesizing spontaneous-style speech is challenging due to the lack of high-quality spontaneous datasets and the high cost of labeling spontaneous behavior. In this paper, we propose a semi-supervised pre-training method to increase the amount of spontaneous-style speech and spontaneous behavioral labels. In the process of semi-supervised learning, both text and speech information are considered for detecting spontaneous behavior labels in speech. Moreover, a linguistic-aware encoder is used to model the relationship between each sentence in the conversation. Experimental results indicate that our proposed method achieves superior expressive speech synthesis performance with the ability to model spontaneous behavior in spontaneous-style speech and predict reasonable spontaneous behavior from text.
Submitted 31 August, 2023;
originally announced August 2023.
-
MSStyleTTS: Multi-Scale Style Modeling with Hierarchical Context Information for Expressive Speech Synthesis
Authors:
Shun Lei,
Yixuan Zhou,
Liyang Chen,
Zhiyong Wu,
Xixin Wu,
Shiyin Kang,
Helen Meng
Abstract:
Expressive speech synthesis is crucial for many human-computer interaction scenarios, such as audiobooks, podcasts, and voice assistants. Previous works focus on predicting the style embeddings at a single scale from the information within the current sentence. However, the context information in neighboring sentences and the multi-scale nature of style in human speech are neglected, making it challenging to convert multi-sentence text into natural and expressive speech. In this paper, we propose MSStyleTTS, a style modeling method for expressive speech synthesis, to capture and predict styles at different levels from a wider range of context rather than a single sentence. Two sub-modules, a multi-scale style extractor and a multi-scale style predictor, are trained together with a FastSpeech 2 based acoustic model. The predictor is designed to explore the hierarchical context information by considering structural relationships in context and to predict style embeddings at the global level, sentence level and subword level. The extractor extracts multi-scale style embeddings from the ground-truth speech and explicitly guides the style prediction. Evaluations on both in-domain and out-of-domain audiobook datasets demonstrate that the proposed method significantly outperforms the three baselines. In addition, we analyze the context information and multi-scale style representations, which have never been discussed before.
Submitted 29 July, 2023;
originally announced July 2023.
-
GTN-Bailando: Genre Consistent Long-Term 3D Dance Generation based on Pre-trained Genre Token Network
Authors:
Haolin Zhuang,
Shun Lei,
Long Xiao,
Weiqin Li,
Liyang Chen,
Sicheng Yang,
Zhiyong Wu,
Shiyin Kang,
Helen Meng
Abstract:
Music-driven 3D dance generation has become an intensive research topic in recent years with great potential for real-world applications. Most existing methods lack the consideration of genre, which results in genre inconsistency in the generated dance movements. In addition, the correlation between the dance genre and the music has not been investigated. To address these issues, we propose a genre-consistent dance generation framework, GTN-Bailando. First, we propose the Genre Token Network (GTN), which infers the genre from music to enhance the genre consistency of long-term dance generation. Second, to improve the generalization capability of the model, the strategy of pre-training and fine-tuning is adopted. Experimental results on the AIST++ dataset show that the proposed dance generation framework outperforms state-of-the-art methods in terms of motion quality and genre consistency.
Submitted 25 April, 2023;
originally announced April 2023.
-
Context-aware Coherent Speaking Style Prediction with Hierarchical Transformers for Audiobook Speech Synthesis
Authors:
Shun Lei,
Yixuan Zhou,
Liyang Chen,
Zhiyong Wu,
Shiyin Kang,
Helen Meng
Abstract:
Recent advances in text-to-speech have significantly improved the expressiveness of synthesized speech. However, it is still challenging to generate speech with a contextually appropriate and coherent speaking style for multi-sentence text in audiobooks. In this paper, we propose a context-aware coherent speaking style prediction method for audiobook speech synthesis. To predict the style embedding of the current utterance, a hierarchical transformer-based context-aware style predictor with a mixture attention mask is designed, considering both text-side context information and speech-side style information of previous speeches. Based on this, we can generate long-form speech with coherent style and prosody sentence by sentence. Objective and subjective evaluations on a Mandarin audiobook dataset demonstrate that our proposed model can generate speech with a more expressive and coherent speaking style than baselines, in both single-sentence and multi-sentence tests.
Submitted 13 April, 2023;
originally announced April 2023.
-
A Distributionally Robust Resilience Enhancement Strategy for Distribution Networks Considering Decision-Dependent Contingencies
Authors:
Yujia Li,
Shunbo Lei,
Wei Sun,
Chenxi Hu,
Yunhe Hou
Abstract:
When performing the resilience enhancement for distribution networks, there are two obstacles to reliably model the uncertain contingencies: 1) decision-dependent uncertainty (DDU) due to various line hardening decisions, and 2) distributional ambiguity due to limited outage information during extreme weather events (EWEs). To address these two challenges, this paper develops scenario-wise decision-dependent ambiguity sets (SWDD-ASs), where the DDU and distributional ambiguity inherent in EWE-induced contingencies are simultaneously captured for each possible EWE scenario. Then, a two-stage trilevel decision-dependent distributionally robust resilient enhancement (DD-DRRE) model is formulated, whose outputs include the optimal line hardening, distributed generation (DG) allocation, and proactive network reconfiguration strategy under the worst-case distributions in SWDD-ASs. Subsequently, the DD-DRRE model is equivalently recast to a mixed-integer linear programming (MILP)-based master problem and multiple scenario-wise subproblems, facilitating the adoption of a customized column-and-constraint generation (C&CG) algorithm. Finally, case studies demonstrate a remarkable improvement in the out-of-sample performance of our model, compared to its prevailing stochastic and robust counterparts. Moreover, the potential values of incorporating the ambiguity and distributional information are quantitatively estimated, providing a useful reference for planners with different budgets and risk-aversion levels.
Submitted 23 August, 2022; v1 submitted 2 July, 2022;
originally announced July 2022.
-
Towards Multi-Scale Speaking Style Modelling with Hierarchical Context Information for Mandarin Speech Synthesis
Authors:
Shun Lei,
Yixuan Zhou,
Liyang Chen,
Jiankun Hu,
Zhiyong Wu,
Shiyin Kang,
Helen Meng
Abstract:
Previous works on expressive speech synthesis focus on modelling the mono-scale style embedding from the current sentence or context, but the multi-scale nature of speaking style in human speech is neglected. In this paper, we propose a multi-scale speaking style modelling method to capture and predict multi-scale speaking style for improving the naturalness and expressiveness of synthetic speech. A multi-scale extractor is proposed to extract speaking style embeddings at three different levels from the ground-truth speech, and explicitly guide the training of a multi-scale style predictor based on hierarchical context information. Both objective and subjective evaluations on a Mandarin audiobooks dataset demonstrate that our proposed method can significantly improve the naturalness and expressiveness of the synthesized speech.
Submitted 5 July, 2022; v1 submitted 6 April, 2022;
originally announced April 2022.
-
Resilient Distribution System Restoration with Communication Recovery by Drone Small Cells
Authors:
Haochen Zhang,
Chen Chen,
Shunbo Lei,
Zhaohong Bie
Abstract:
Distribution system (DS) restoration after natural disasters often faces the challenge of communication failures to feeder automation (FA) facilities, resulting in a prolonged load pick-up process. This letter discusses the utilization of drone small cells for wireless communication recovery of FA, and proposes an integrated DS restoration strategy with communication recovery. Demonstrative case studies are conducted to validate the proposed model, and its advantages are illustrated by comparison with benchmark strategies.
△ Less
Submitted 30 March, 2022;
originally announced March 2022.
-
On Time Stepping Schemes Considering Switching Behaviors for Power System Electromagnetic Transient Simulation
Authors:
Sheng Lei
Abstract:
Several difficulties appear when typical electromagnetic transient simulation, using the implicit trapezoidal method and fixed step sizes, is applied to power systems with switching behaviors. These difficulties are addressed by different aspects of time stepping schemes in the literature. This paper first details the different aspects and reviews the corresponding methods. Some misunderstandings in the literature are clarified. Issues that may be encountered by the existing methods are also revealed. Based on the detailed review, the paper then puts forward a novel time stepping scheme which fully addresses the difficulties. The effectiveness of the proposed scheme is demonstrated via numerical case studies.
Submitted 26 March, 2022;
originally announced March 2022.
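The implicit trapezoidal method with a fixed step size, named in the abstract above, can be illustrated on the scalar test equation dx/dt = a x. The abstract does not list the specific difficulties it addresses; the sketch below assumes one well-known example, the sustained numerical oscillation that the trapezoidal rule produces when the step size is large relative to a fast-decaying mode (as happens right after a switching event). The function names and parameter values are illustrative only.

```python
import math


def trapezoidal_step(x, a, h):
    """One implicit trapezoidal step for dx/dt = a * x.

    Solving x_next = x + (h / 2) * (a * x + a * x_next) for x_next
    gives the closed form below.
    """
    return x * (1 + h * a / 2) / (1 - h * a / 2)


def simulate(x0, a, h, n_steps):
    """Integrate with a fixed step size and return the whole trajectory."""
    xs = [x0]
    for _ in range(n_steps):
        xs.append(trapezoidal_step(xs[-1], a, h))
    return xs


# Slow mode: the step is small relative to the time constant, so the
# result tracks the exact solution exp(a * t) closely.
slow = simulate(1.0, a=-1.0, h=0.1, n_steps=10)
exact_slow = math.exp(-1.0)

# Fast mode: the true solution decays almost instantly, but the per-step
# gain (1 + h*a/2) / (1 - h*a/2) is close to -1, so the computed
# trajectory flips sign every step and decays only slowly.
fast = simulate(1.0, a=-1000.0, h=0.1, n_steps=4)
```

The method remains stable in the fast case, yet the alternating sign is exactly the kind of artifact that motivates special handling around discontinuities.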
-
Towards Expressive Speaking Style Modelling with Hierarchical Context Information for Mandarin Speech Synthesis
Authors:
Shun Lei,
Yixuan Zhou,
Liyang Chen,
Zhiyong Wu,
Shiyin Kang,
Helen Meng
Abstract:
Previous works on expressive speech synthesis mainly focus on the current sentence. The context in adjacent sentences is neglected, resulting in an inflexible speaking style for the same text, which lacks speech variations. In this paper, we propose a hierarchical framework to model speaking style from context. A hierarchical context encoder is proposed to explore a wider range of contextual information considering structural relationships in context, including inter-phrase and inter-sentence relations. Moreover, to encourage this encoder to learn style representation better, we introduce a novel training strategy with knowledge distillation, which provides the target for encoder training. Both objective and subjective evaluations on a Mandarin lecture dataset demonstrate that the proposed method can significantly improve the naturalness and expressiveness of the synthesized speech.
Submitted 6 April, 2022; v1 submitted 23 March, 2022;
originally announced March 2022.
-
Improved Method for Dealing with Discontinuities in Power System Transient Simulation Based on Frequency Response Optimized Integrators Considering Second Order Derivative
Authors:
Sheng Lei,
Alexander Flueck
Abstract:
Potential disagreement in the result induced by discontinuities is revealed in this paper between a novel power system transient simulation scheme using numerical integrators considering second order derivative and conventional ones using numerical integrators considering first order derivative. The disagreement is due to the formula of the different numerical integrators. An improved method for dealing with discontinuities in the novel transient simulation scheme is proposed to resolve the disagreement. The effectiveness of the improved method is demonstrated and verified via numerical case studies. Although the disagreement is studied on and the improved method is proposed for a particular transient simulation scheme, similar conclusions also apply to other ones using numerical integrators considering high order derivative.
Submitted 7 June, 2021;
originally announced June 2021.
-
Wide-Beam Array Antenna Power Gain Maximization via ADMM Framework
Authors:
Shiwen Lei,
**g Tian,
Zhipeng Lin,
Haoquan Hu,
Bo Chen,
Wei Yang,
Pu Tang,
Xiangdong Qiu
Abstract:
This paper proposes two algorithms to maximize the minimum array power gain in a wide-beam mainlobe by solving the power gain pattern synthesis (PGPS) problem with and without sidelobe constraints. First, the nonconvex PGPS problem is transformed into a nonconvex linear inequality optimization problem and then converted to an augmented Lagrangian problem by introducing auxiliary variables via the Alternating Direction Method of Multipliers (ADMM) framework. Next, the original intractable problem is converted into a series of nonconvex and convex subproblems. The nonconvex subproblems are solved by dividing their solution space into a finite set of smaller ones, in which the solution is obtained pseudoanalytically. In this way, the proposed algorithms are superior to the existing PGPS-based ones, as their convergence can be theoretically guaranteed with a lower computational burden. Numerical examples with both isotropic element pattern (IEP) and active element pattern (AEP) arrays are simulated to show the effectiveness and superiority of the proposed algorithms in comparison with related existing algorithms.
Submitted 21 April, 2021;
originally announced April 2021.
-
Studies on Frequency Response Optimized Integrators Considering Second Order Derivative
Authors:
Sheng Lei,
Alexander Flueck
Abstract:
This paper presents comprehensive studies on frequency response optimized integrators considering second order derivative regarding their numerical error, numerical stability and transient performance. Frequency domain error analysis is conducted on these numerical integrators to reveal their accuracy. Numerical stability of the numerical integrators is investigated. Interesting new types of numerical stability are recognized. Transient performance of the numerical integrators is defined to qualitatively characterize their ability to track fast decaying transients. This property is related to unsatisfactory phenomena such as numerical oscillation which frequently appear in time domain simulation of circuits and systems. Transient performance analysis of the numerical integrators is provided. Theoretical observations from the analysis of the numerical integrators are verified via time domain case studies.
Submitted 8 January, 2021;
originally announced January 2021.
-
Knowledge AI: New Medical AI Solution for Medical image Diagnosis
Authors:
Yingni Wang,
Shuge Lei,
Jian Dai,
Kehong Yuan
Abstract:
The implementation of medical AI has always been a problem. The effectiveness of traditional perceptual AI algorithms in medical image processing needs to be improved. Here we propose knowledge AI, a method that combines perceptual AI with clinical knowledge and experience. Based on this method, the geometric information mining of medical images can represent the experience and information and evaluate the quality of medical images.
Submitted 8 January, 2021;
originally announced January 2021.
-
Proper Selection of Obreshkov-Like Numerical Integrators Used as Numerical Differentiators for Power System Transient Simulation
Authors:
Sheng Lei,
Alexander Flueck
Abstract:
Obreshkov-like numerical integrators have been widely applied to power system transient simulation. Misuse of the numerical integrators as numerical differentiators may lead to numerical oscillation or bias. Criteria for Obreshkov-like numerical integrators to be used as numerical differentiators are proposed in this paper to avoid these misleading phenomena. The coefficients of a numerical integrator for the highest order derivative turn out to determine its suitability. Some existing Obreshkov-like numerical integrators are examined under the proposed criteria. It is revealed that the notorious numerical oscillations induced by the implicit trapezoidal method cannot always be eliminated by using the backward Euler method for a few time steps. Guided by the proposed criteria, a frequency response optimized integrator considering second order derivative is put forward which is suitable to be used as a numerical differentiator. Theoretical observations are demonstrated in time domain via case studies. The paper points out how to properly select the numerical integrators for power system transient simulation and helps to prevent their misuse.
Submitted 15 February, 2022; v1 submitted 2 December, 2020;
originally announced December 2020.
-
Transient Simulation of Grid-Feeding Converter System for Stability Studies Using Frequency Response Optimized Integrators
Authors:
Sheng Lei,
Alexander Flueck
Abstract:
A grid-feeding converter system is added to a novel power system transient simulation scheme based on frequency response optimized integrators considering second order derivative. The converter system and its implementation in the simulation scheme are detailed. Case studies verify the accuracy and efficiency of the simulation scheme. Furthermore, this paper proposes and justifies extending the simulation scheme by integrating commonly used numerical integrators considering first order derivative for part of the studied system. The proposed extension has an insignificant impact on the accuracy of the simulation scheme while significantly enhancing its efficiency. It also reduces the development burden in adding new devices.
Submitted 20 February, 2021; v1 submitted 10 November, 2020;
originally announced November 2020.
-
Multistep Frequency Response Optimized Integrators and Their Application to Accelerating a Power System Transient Simulation Scheme
Authors:
Sheng Lei,
Alexander Flueck
Abstract:
This paper proposes several explicit and implicit multistep frequency response optimized integrators considering first or second order derivative. A prediction-based method aiming at accelerating a novel power system transient simulation scheme without impacting its accuracy is further put forward utilizing the proposed numerical integrators and some others available in the literature. Case studies verify the effectiveness of the proposed prediction method. Although they are utilized to accelerate the simulation scheme in this paper, the proposed numerical integrators are in fact general-purpose and can be applied to other areas.
Submitted 15 February, 2021; v1 submitted 1 November, 2020;
originally announced November 2020.
-
Initialization Process of a Power System Transient Simulation Scheme for Stability Studies
Authors:
Sheng Lei,
Alexander Flueck
Abstract:
The initialization process of a novel power system transient simulation scheme for stability studies is put forward, by further developing a "time-domain harmonic power-flow algorithm". The initialization process is formulated as an algebraic problem to ensure that the power system under study is in steady state and operated at a specified operating point, at the beginning of a transient simulation run. The algebraic problem is then solved efficiently by a preconditioned finite difference Newton-GMRES method. Case studies verify the validity and efficiency of the initialization process. The proposed initialization process is general-purpose and can be applied to other power system transient simulation schemes.
Submitted 29 August, 2020;
originally announced August 2020.
-
Few-Shot Semantic Segmentation Augmented with Image-Level Weak Annotations
Authors:
Shuo Lei,
Xuchao Zhang,
Jianfeng He,
Fanglan Chen,
Chang-Tien Lu
Abstract:
Despite the great progress made by deep neural networks in the semantic segmentation task, traditional neural-network-based methods typically suffer from a shortage of large amounts of pixel-level annotations. Recent progress in few-shot semantic segmentation tackles the issue using only a few pixel-level annotated examples. However, these few-shot approaches cannot easily be applied to multi-way or weak annotation settings. In this paper, we advance the few-shot segmentation paradigm towards a scenario where image-level annotations are available to help the training process of a few pixel-level annotations. Our key idea is to learn a better prototype representation of the class by fusing the knowledge from the image-level labeled data. Specifically, we propose a new framework, called PAIA, to learn the class prototype representation in a metric space by integrating image-level annotations. Furthermore, by considering the uncertainty of pseudo-masks, a distilled soft masked average pooling strategy is designed to handle distractions in image-level annotations. Extensive empirical results on two datasets show superior performance of PAIA.
Submitted 18 June, 2021; v1 submitted 3 July, 2020;
originally announced July 2020.
-
Efficient Power System Transient Simulation Based on Frequency Response Optimized Integrators Considering Second Order Derivative
Authors:
Sheng Lei,
Alexander Flueck
Abstract:
Frequency response optimized integrators considering second order derivative are proposed in this paper. Based on the proposed numerical integrators, and others which also consider second order derivative, this paper puts forward a novel power system transient simulation scheme. Instead of using a unique numerical integrator, the proposed simulation scheme chooses proper ones according to the dominant frequency component of the differential state variables. With the proposed simulation scheme, computational efficiency is improved by using large step sizes without sacrificing accuracy. Numerical case studies demonstrate the validity and efficiency of the simulation scheme.
Submitted 2 May, 2020;
originally announced May 2020.
-
Baseline Estimation of Commercial Building HVAC Fan Power Using Tensor Completion
Authors:
Shunbo Lei,
David Hong,
Johanna L. Mathieu,
Ian A. Hiskens
Abstract:
Commercial building heating, ventilation, and air conditioning (HVAC) systems have been studied for providing ancillary services to power grids via demand response (DR). One critical issue is to estimate the counterfactual baseline power consumption that would have prevailed without DR. Baseline methods have been developed based on whole building electric load profiles. New methods are necessary to estimate the baseline power consumption of HVAC sub-components (e.g., supply and return fans), which have different characteristics compared to those of the whole building. Tensor completion can estimate the unobserved entries of multi-dimensional tensors describing complex data sets. It exploits high-dimensional data to capture granular insights into the problem. This paper proposes to use it for baselining HVAC fan power, by utilizing its capability of capturing dominant fan power patterns. The tensor completion method is evaluated using HVAC fan power data from several buildings at the University of Michigan, and compared with several existing methods. The tensor completion method generally outperforms the benchmarks.
Submitted 24 April, 2020;
originally announced April 2020.
-
Compressed Sensing for Reconstructing Coherent Multidimensional Spectra
Authors:
Zhengjun Wang,
Shiwen Lei,
Khadga Jung Karki,
Andreas Jakobsson,
Tönu Pullerits
Abstract:
We apply two sparse reconstruction techniques, the least absolute shrinkage and selection operator (LASSO) and the sparse exponential mode analysis (SEMA), to two-dimensional (2D) spectroscopy. The algorithms are first tested on model data, showing that both are able to reconstruct the spectra using only a fraction of the data required by the traditional Fourier-based estimator. Through the analysis of sparsely sampled experimental fluorescence-detected 2D spectra of LH2 complexes, we conclude that both SEMA and LASSO can be used to significantly reduce the required data while still allowing the multidimensional spectra to be reconstructed. Of the two techniques, it is shown that SEMA offers preferable performance, providing more accurate estimation of the spectral line widths and their positions. Furthermore, SEMA allows for off-grid components, enabling the use of a much smaller dictionary than the LASSO, thereby both improving the performance and lowering the computational complexity for reconstructing coherent multidimensional spectra.
Submitted 14 December, 2019;
originally announced December 2019.
-
Transmission System Resilience Enhancement with Extended Steady-state Security Region in Consideration of Uncertain Topology Changes
Authors:
Chong Wang,
Feng Wu,
** Ju,
Shunbo Lei,
Tianguang Lu,
Yunhe Hou
Abstract:
The increasing number of extreme weather events poses unprecedented challenges to power system operation because of their uncertain and sequential impacts on power systems. This paper proposes the concept of an extended steady-state security region (ESSR) and implements ESSR-based resilience enhancement for transmission systems, taking into account the uncertain topology changes caused by extreme weather events. The ESSR is a polytope describing a region in which the operating points satisfy the operating constraints. Taking uncertain topology changes into account with the ESSR, the resilience enhancement problem is formulated as a bilevel programming optimization model, in which the system operators deploy the optimal strategy against the most threatening scenario caused by the extreme weather events. To avoid the curse of dimensionality with regard to system topologies for a large-scale system, the Monte Carlo method is used to generate uncertain system topologies, and a recursive McCormick envelope-based approach is proposed to connect the generated system topologies to optimization variables. Karush-Kuhn-Tucker (KKT) conditions are used to transform the suboptimization model in the second level into a group of equivalent constraints in the first level. A simple test system and the IEEE 118-bus system are used to validate the proposed approach.
Submitted 22 November, 2019;
originally announced November 2019.