-
PDE Control Gym: A Benchmark for Data-Driven Boundary Control of Partial Differential Equations
Authors:
Luke Bhan,
Yuexin Bian,
Miroslav Krstic,
Yuanyuan Shi
Abstract:
Over the last decade, data-driven methods have surged in popularity, emerging as valuable tools for control theory. As such, neural network approximations of control feedback laws, system dynamics, and even Lyapunov functions have attracted growing attention. With the ascent of learning-based control, the need for accurate, fast, and easy-to-use benchmarks has increased. In this work, we present the first learning-based environment for boundary control of PDEs. In our benchmark, we introduce three foundational PDE problems - a 1D transport PDE, a 1D reaction-diffusion PDE, and a 2D Navier-Stokes PDE - whose solvers are bundled in a user-friendly reinforcement learning gym. With this gym, we then present the first set of model-free reinforcement learning algorithms for solving this series of benchmark problems, achieving stability, although at a higher cost compared to model-based PDE backstepping. With the set of benchmark environments and detailed examples, this work significantly lowers the barrier to entry for learning-based PDE control - a topic largely unexplored by the data-driven control community. The entire benchmark is available on GitHub along with detailed documentation, and the presented reinforcement learning models are open sourced.
Submitted 23 May, 2024; v1 submitted 18 May, 2024;
originally announced May 2024.
-
Improving Sequential Market Clearing via Value-oriented Renewable Energy Forecasting
Authors:
Yufan Zhang,
Honglin Wen,
Yuexin Bian,
Yuanyuan Shi
Abstract:
Large penetration of renewable energy sources (RESs) brings huge uncertainty into the electricity markets. While existing deterministic market clearing fails to accommodate the uncertainty, the recently proposed stochastic market clearing struggles to achieve desirable market properties. In this work, we propose a value-oriented forecasting approach, which tactically determines the RESs generation that enters the day-ahead market. With such a forecast, the existing deterministic market clearing framework can be maintained, and the day-ahead and real-time overall operation cost is reduced. At the training phase, the forecast model parameters are estimated to minimize expected day-ahead and real-time overall operation costs, instead of minimizing forecast errors in a statistical sense. Theoretically, we derive the exact form of the loss function for training the forecast model that aligns with such a goal. For market clearing modeled by linear programs, this loss function is a piecewise linear function. Additionally, we derive the analytical gradient of the loss function with respect to the forecast, which inspires an efficient training strategy. A numerical study shows our forecasts can bring significant benefits of the overall cost reduction to deterministic market clearing, compared to quality-oriented forecasting approach.
Submitted 14 May, 2024;
originally announced May 2024.
-
Ventilation and Temperature Control for Energy-efficient and Healthy Buildings: A Differentiable PDE Approach
Authors:
Yuexin Bian,
Xiaohan Fu,
Rajesh K. Gupta,
Yuanyuan Shi
Abstract:
In this paper, we introduce a novel framework for building learning and control, focusing on ventilation and thermal management to enhance energy efficiency. We validate the performance of the proposed framework in system model learning via two case studies: a synthetic study focusing on the joint learning of temperature and CO2 fields, and an application to a real-world dataset for CO2 field learning. For building control, we demonstrate that the proposed framework can optimize the control actions and significantly reduce the energy cost while maintaining a comfortable and healthy indoor environment. When compared to existing traditional methods, an optimization-based method with ODE models and reinforcement learning, our approach can significantly reduce the energy consumption while guaranteeing all the safety-critical air quality and control constraints. Promising future research directions involve validating and improving the proposed PDE models through accurate estimation of airflow fields within indoor environments. Additionally, incorporating uncertainty modeling into the PDE framework for HVAC control presents an opportunity to enhance the efficiency and reliability of building HVAC system management.
Submitted 13 March, 2024;
originally announced March 2024.
-
Secure and Scalable Network Slicing with Plug-and-Play Support for Power Distribution System Communication Networks
Authors:
Jian Zhong,
Chen Chen,
Yuqi Qian,
Yiheng Bian,
Yuxiong Huang,
Zhaohong Bie
Abstract:
With the rapid development of power distribution systems (PDSs), the number of terminal devices and the types of delivered services involved are constantly growing. These trends make the operations of PDSs highly dependent on the support of advanced communication networks, which face two related challenges. The first is to provide sufficient flexibility, resilience, and security to meet varying demands and ensure the proper operation of gradually diversifying network services. The second is to realize the automatic identification of terminal devices, thus reducing the network maintenance burden. To solve these problems, this paper presents a novel multiservice network integration and device authentication slice-based network slicing scheme. In this scheme, the integration of PDS communication networks enables network resource sharing, and recovery from communication interruption is achieved through network slicing in the integrated network. Authentication servers periodically poll terminal devices, adjusting network slice ranges based on authentication results, thereby facilitating dynamic network slicing. Additionally, secure plug-and-play support for PDS terminal devices and network protection are achieved through device identification and dynamic adjustment of network slices. On this basis, a network optimization and upgrading methodology for load balancing and robustness enhancement is further proposed. This approach is designed to improve the performance of PDS communication networks, adapting to ongoing PDS development and the evolution of PDS services. The simulation results show that the proposed schemes endow a PDS communication network with favorable resource utilization, fault recovery, terminal device plug-and-play support, load balancing, and improved network robustness.
Submitted 2 March, 2024;
originally announced March 2024.
-
Resilient Microgrid Formation Considering Communication Interruptions
Authors:
Jian Zhong,
Chen Chen,
Young-** Kim,
Yuxiong Huang,
Mengjie Teng,
Yiheng Bian,
Zhaohong Bie
Abstract:
Distribution system (DS) communication failures following extreme events often degrade monitoring and control functions, thus preventing the acquisition of complete global DS component state information, on which existing post-disaster DS restoration methods are based. This letter proposes methods of inferring the states of DS components in the case of incomplete component state information. By using the known DS information, the operating states of unobservable DS branches and buses can be inferred, providing complete information for DS performance restoration before full communication recovery.
Submitted 2 March, 2024;
originally announced March 2024.
-
Single-pixel imaging based on deep learning
Authors:
Kai Song,
Yaoxing Bian,
Ku Wu,
Hongrui Liu,
Shuang** Han,
Jiaming Li,
Jiazhao Tian,
Chengbin Qin,
Jianyong Hu,
Liantuan Xiao
Abstract:
Single-pixel imaging can collect images at the wavelengths outside the reach of conventional focal plane array detectors. However, the limited image quality and lengthy computational times for iterative reconstruction still impede the practical application of single-pixel imaging. Recently, deep learning has been introduced into single-pixel imaging, which has attracted a lot of attention due to its exceptional reconstruction quality, fast reconstruction speed, and the potential to complete advanced sensing tasks without reconstructing images. Here, this advance is discussed and some opinions are offered. Firstly, based on the fundamental principles of single-pixel imaging and deep learning, the principles and algorithms of single-pixel imaging based on deep learning are described and analyzed. Subsequently, the implementation technologies of single-pixel imaging based on deep learning are reviewed. They are divided into super-resolution single-pixel imaging, single-pixel imaging through scattering media, photon-level single-pixel imaging, optical encryption based on single-pixel imaging, color single-pixel imaging, and image-free sensing according to diverse application fields. Finally, major challenges and corresponding feasible approaches are discussed, as well as more possible applications in the future.
Submitted 16 November, 2023; v1 submitted 25 October, 2023;
originally announced October 2023.
-
Deriving Loss Function for Value-oriented Renewable Energy Forecasting
Authors:
Yufan Zhang,
Honglin Wen,
Yuexin Bian,
Yuanyuan Shi
Abstract:
Renewable energy forecasting is the workhorse for efficient energy dispatch. However, forecasts with small mean squared errors (MSE) may not necessarily lead to low operation costs. Here, we propose a forecasting approach specifically tailored for operational purposes, by incorporating operational problems into the estimation of forecast models via designing a loss function. We formulate a bilevel program, where the operation problem is at the lower level, and the forecast model estimation is at the upper level. We establish the relationship between the lower-level optimal solutions and forecasts through multiparametric programming. By integrating it into the upper-level objective for minimizing expected operation cost, we convert the bilevel problem to a single-level one and derive the loss function for training the model. It is proved to be piecewise linear for linear operation problems. Compared to the commonly used loss functions, e.g., MSE, our approach achieves lower operation costs.
Submitted 1 October, 2023;
originally announced October 2023.
-
AutoPrep: An Automatic Preprocessing Framework for In-the-Wild Speech Data
Authors:
Jianwei Yu,
Hangting Chen,
Yanyao Bian,
Xiang Li,
Yi Luo,
**chuan Tian,
Mengyang Liu,
Jiayi Jiang,
Shuai Wang
Abstract:
Recently, the utilization of extensive open-sourced text data has significantly advanced the performance of text-based large language models (LLMs). However, the use of in-the-wild large-scale speech data in the speech technology community remains constrained. One reason for this limitation is that a considerable amount of the publicly available speech data is compromised by background noise, speech overlapping, lack of speech segmentation information, missing speaker labels, and incomplete transcriptions, which can largely hinder their usefulness. On the other hand, human annotation of speech data is both time-consuming and costly. To address this issue, we introduce an automatic in-the-wild speech data preprocessing framework (AutoPrep) in this paper, which is designed to enhance speech quality, generate speaker labels, and produce transcriptions automatically. The proposed AutoPrep framework comprises six components: speech enhancement, speech segmentation, speaker clustering, target speech extraction, quality filtering and automatic speech recognition. Experiments conducted on the open-sourced WenetSpeech and our self-collected AutoPrepWild corpora demonstrate that the proposed AutoPrep framework can generate preprocessed data with similar DNSMOS and PDNSMOS scores compared to several open-sourced TTS datasets. The corresponding TTS system can achieve up to 0.68 in-domain speaker similarity.
Submitted 25 September, 2023;
originally announced September 2023.
-
DurIAN-E: Duration Informed Attention Network For Expressive Text-to-Speech Synthesis
Authors:
Yu Gu,
Yianrao Bian,
Guangzhi Lei,
Chao Weng,
Dan Su
Abstract:
This paper introduces an improved duration informed attention neural network (DurIAN-E) for expressive and high-fidelity text-to-speech (TTS) synthesis. Inherited from the original DurIAN model, an auto-regressive model structure in which the alignments between the input linguistic information and the output acoustic features are inferred from a duration model is adopted. Meanwhile the proposed DurIAN-E utilizes multiple stacked SwishRNN-based Transformer blocks as linguistic encoders. Style-Adaptive Instance Normalization (SAIN) layers are exploited into frame-level encoders to improve the modeling ability of expressiveness. A denoiser incorporating both denoising diffusion probabilistic model (DDPM) for mel-spectrograms and SAIN modules is conducted to further improve the synthetic speech quality and expressiveness. Experimental results prove that the proposed expressive TTS model in this paper can achieve better performance than the state-of-the-art approaches in both subjective mean opinion score (MOS) and preference tests.
Submitted 22 September, 2023;
originally announced September 2023.
-
SnakeGAN: A Universal Vocoder Leveraging DDSP Prior Knowledge and Periodic Inductive Bias
Authors:
Sipan Li,
Songxiang Liu,
Luwen Zhang,
Xiang Li,
Yanyao Bian,
Chao Weng,
Zhiyong Wu,
Helen Meng
Abstract:
Generative adversarial network (GAN)-based neural vocoders have been widely used in audio synthesis tasks due to their high generation quality, efficient inference, and small computation footprint. However, it is still challenging to train a universal vocoder which can generalize well to out-of-domain (OOD) scenarios, such as unseen speaking styles, non-speech vocalization, singing, and musical pieces. In this work, we propose SnakeGAN, a GAN-based universal vocoder, which can synthesize high-fidelity audio in various OOD scenarios. SnakeGAN takes a coarse-grained signal generated by a differentiable digital signal processing (DDSP) model as prior knowledge, aiming at recovering high-fidelity waveform from a Mel-spectrogram. We introduce periodic nonlinearities through the Snake activation function and anti-aliased representation into the generator, which further brings the desired inductive bias for audio synthesis and significantly improves the extrapolation capacity for universal vocoding in unseen scenarios. To validate the effectiveness of our proposed method, we train SnakeGAN with only speech data and evaluate its performance for various OOD distributions with both subjective and objective metrics. Experimental results show that SnakeGAN significantly outperforms the compared approaches and can generate high-fidelity audio samples including unseen speakers with unseen styles, singing voices, instrumental pieces, and nonverbal vocalization.
Submitted 14 September, 2023;
originally announced September 2023.
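As a hedged illustration of the periodic nonlinearity mentioned in the SnakeGAN abstract above: the Snake activation is commonly written as snake(x) = x + (1/alpha) * sin^2(alpha * x). The sketch below is a scalar, plain-Python version of that formula only. The function name `snake` and the default `alpha` are assumptions made for illustration; the abstract does not say how the paper parameterizes or trains alpha, and the anti-aliasing step it mentions is not shown.

```python
import math


def snake(x: float, alpha: float = 1.0) -> float:
    """Snake activation: identity plus a periodic sin^2 term.

    alpha controls the frequency of the periodic component
    (hypothetical default; in practice it is often a learned parameter).
    """
    if alpha <= 0:
        raise ValueError("alpha must be positive")
    return x + (math.sin(alpha * x) ** 2) / alpha


if __name__ == "__main__":
    # The periodic term is never negative, so snake(x) >= x everywhere.
    for value in (-2.0, 0.0, 0.5, 2.0):
        print(value, snake(value))
```

Because the added term is bounded and periodic, the function stays close to the identity for large inputs while still carrying a periodic component, which is the inductive bias the abstract refers to.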
-
Toward Value-oriented Renewable Energy Forecasting: An Iterative Learning Approach
Authors:
Yufan Zhang,
Mengshuo Jia,
Honglin Wen,
Yuexin Bian,
Yuanyuan Shi
Abstract:
Energy forecasting is an essential task in power system operations. Operators usually issue forecasts and leverage them to schedule energy dispatch ahead of time. However, forecast models are typically developed in a way that overlooks the operational value of the forecasts. To bridge the gap, we design a value-oriented point forecasting approach for sequential energy dispatch problems with renewable energy sources. At the training phase, we align the loss function with the overall operation cost function, thereby achieving reduced operation costs. The forecast model parameter estimation is formulated as a bilevel program. Under mild assumptions, we convert the upper-level objective into an equivalent form using the dual solutions obtained from the lower-level operation problems. Additionally, a novel iterative solution strategy is proposed for the newly formulated bilevel program. Under such an iterative scheme, we show that the upper-level objective is locally linear regarding the forecast model output, and can act as the loss function. Numerical experiments demonstrate that, compared to commonly used statistical quality-oriented point forecasting methods, forecasts obtained by the proposed approach result in lower operation costs. Meanwhile, the proposed approach is more computationally efficient than traditional two-stage stochastic programs.
Submitted 4 April, 2024; v1 submitted 1 September, 2023;
originally announced September 2023.
-
Benchmarking Explanatory Models for Inertia Forecasting using Public Data of the Nordic Area
Authors:
Jemima Graham,
Evelyn Heylen,
Yuankai Bian,
Fei Teng
Abstract:
This paper investigates the performance of a day-ahead explanatory model for inertia forecasting based on field data in the Nordic system, which achieves a 43% reduction in mean absolute percentage error (MAPE) against a state-of-the-art time-series forecast model. The generalizability of the explanatory model is verified by its consistent performance on Nordic and Great Britain datasets. Also, it appears that a long duration of training data is not required to obtain accurate results with this model, but taking a more spatially granular approach reduces the MAPE by 3.6%. Finally, two further model enhancements are studied considering the specific features in Nordic system: (i) a monthly interaction variable applied to the day-ahead national demand forecast feature, reducing the MAPE by up to 18%; and (ii) a feature based on the inertia from hydropower, although this has a negligible impact. The field dataset used for benchmarking is also made publicly available.
Submitted 14 July, 2023;
originally announced July 2023.
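For readers unfamiliar with the metric quoted in the inertia-forecasting abstract above, here is a minimal sketch of mean absolute percentage error (MAPE) using its standard definition. The function name `mape` and the sample numbers are made up for illustration and are not taken from the paper or its dataset.

```python
from typing import Sequence


def mape(actual: Sequence[float], forecast: Sequence[float]) -> float:
    """Mean absolute percentage error, returned in percent."""
    if len(actual) != len(forecast) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    total = sum(abs(a - f) / abs(a) for a, f in zip(actual, forecast))
    return 100.0 * total / len(actual)


if __name__ == "__main__":
    # Illustrative values only: each forecast is off by 10% of the actual.
    print(mape([100.0, 200.0], [90.0, 220.0]))
```

MAPE weights each error relative to the actual value, so it is only meaningful when the target quantity stays well away from zero.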
-
Predicting Strategic Energy Storage Behaviors
Authors:
Yuexin Bian,
Ningkun Zheng,
Yang Zheng,
Bolun Xu,
Yuanyuan Shi
Abstract:
Energy storage systems are strategic participants in electricity markets that arbitrage price differences. Future power system operators must understand and predict strategic storage arbitrage behaviors for market power monitoring and capacity adequacy planning. This paper proposes a novel data-driven approach that incorporates prior model knowledge for predicting the strategic behaviors of price-taker energy storage systems. We propose a gradient-descent method to find the storage model parameters given the historical price signals and observations. We prove that the identified model parameters will converge to the true user parameters under a class of quadratic objective and linear equality-constrained storage models. We demonstrate the effectiveness of our approach through numerical experiments with synthetic and real-world storage behavior data. The proposed approach significantly improves the accuracy of storage model identification and behavior forecasting compared to previous blackbox data-driven approaches.
Submitted 31 January, 2024; v1 submitted 20 June, 2023;
originally announced June 2023.
-
A deep local attention network for pre-operative lymph node metastasis prediction in pancreatic cancer via multiphase CT imaging
Authors:
Zhilin Zheng,
Xu Fang,
Jiawen Yao,
Mengmeng Zhu,
Le Lu,
Lingyun Huang,
**g Xiao,
Yu Shi,
Hong Lu,
Jian** Lu,
Ling Zhang,
Chengwei Shao,
Yun Bian
Abstract:
Lymph node (LN) metastasis status is one of the most critical prognostic and cancer staging factors for patients with resectable pancreatic ductal adenocarcinoma (PDAC), or in general, for any type of solid malignant tumor. Preoperative prediction of LN metastasis from non-invasive CT imaging is highly desired, as it might be straightforwardly used to guide the following neoadjuvant treatment decision and surgical planning. Most studies only capture the tumor characteristics in CT imaging to implicitly infer LN metastasis, and very few works directly exploit the CT imaging information of the LNs themselves. To the best of our knowledge, this is the first work to propose a fully-automated LN segmentation and identification network to directly facilitate the LN metastasis status prediction task. Nevertheless, LN segmentation/detection is very challenging since LNs can be easily confused with other hard negative anatomic structures (e.g., vessels) in radiological images. We explore the anatomical spatial context priors of pancreatic LN locations by generating a guiding attention map from related organs and vessels to assist segmentation and infer LN status. As such, LN segmentation is impelled to focus on regions that are anatomically adjacent or plausible with respect to the specific organs and vessels. The metastasized LN identification network is trained to classify the segmented LN instances into positives or negatives by reusing the segmentation network as a pre-trained backbone and padding a new classification head. More importantly, we develop an LN metastasis status prediction network that combines the patient-wise aggregation results of LN segmentation/identification and deep imaging features extracted from the tumor region. Extensive quantitative nested five-fold cross-validation is conducted on a discovery dataset of 749 patients with PDAC.
Submitted 4 January, 2023;
originally announced January 2023.
-
Carbon-Aware EV Charging
Authors:
Kai-Wen Cheng,
Yuexin Bian,
Yuanyuan Shi,
Yize Chen
Abstract:
This paper examines the problem of optimizing the charging pattern of electric vehicles (EV) by taking real-time electricity grid carbon intensity into consideration. The objective of the proposed charging scheme is to minimize the carbon emissions contributed by EV charging events, while simultaneously satisfying constraints posed by EV user's charging schedules, charging station transformer limits, and battery physical constraints. Using real-world EV charging data and California electricity generation records, this paper shows that our carbon-aware real-time charging scheme saves an average of 3.81% of carbon emission while delivering satisfactory amount of energy. Furthermore, by using an adaptive balanced factor, we can reduce 26.00% of carbon emission on average while compromising 12.61% of total energy delivered.
Submitted 25 September, 2022;
originally announced September 2022.
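To make the idea in the Carbon-Aware EV Charging abstract above concrete, the following is a deliberately simplified sketch: a single vehicle, a known carbon-intensity profile, and one power limit. Under those assumptions, filling the lowest-intensity time slots first minimizes total emissions. This is not the paper's real-time scheme, which also handles multiple users' schedules, transformer limits, and battery constraints; all function names, parameters, and numbers here are hypothetical.

```python
from typing import List, Sequence


def greedy_low_carbon_schedule(
    intensity: Sequence[float],  # grid carbon intensity per slot, gCO2/kWh
    energy_kwh: float,           # energy the vehicle must receive
    max_kw: float,               # charger power limit
    slot_hours: float = 1.0,     # duration of each time slot
) -> List[float]:
    """Return charging power (kW) per slot, filling the cleanest slots first."""
    capacity = max_kw * slot_hours * len(intensity)
    if energy_kwh > capacity:
        raise ValueError("requested energy exceeds what the window can deliver")
    power = [0.0] * len(intensity)
    remaining = float(energy_kwh)
    # Visit slots from lowest to highest carbon intensity.
    for t in sorted(range(len(intensity)), key=lambda i: intensity[i]):
        if remaining <= 0:
            break
        delivered = min(max_kw * slot_hours, remaining)
        power[t] = delivered / slot_hours
        remaining -= delivered
    return power


def emissions_g(intensity: Sequence[float], power: Sequence[float],
                slot_hours: float = 1.0) -> float:
    """Total emissions in grams of CO2 for a given schedule."""
    return sum(c * p * slot_hours for c, p in zip(intensity, power))


if __name__ == "__main__":
    profile = [300.0, 100.0, 200.0]  # illustrative intensities only
    plan = greedy_low_carbon_schedule(profile, energy_kwh=10.0, max_kw=7.0)
    print(plan, emissions_g(profile, plan))
```

The greedy rule is optimal only because this toy problem has one linear objective and independent per-slot limits; coupling constraints such as a shared transformer limit would call for a proper optimization solver.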
-
Identifying Electrocardiogram Abnormalities Using a Handcrafted-Rule-Enhanced Neural Network
Authors:
Yuexin Bian,
**tai Chen,
Xiaojun Chen,
Xiaoxian Yang,
Danny Z. Chen,
Jian Wu
Abstract:
A large number of people suffer from life-threatening cardiac abnormalities, and electrocardiogram (ECG) analysis is beneficial to determining whether an individual is at risk of such abnormalities. Automatic ECG classification methods, especially the deep learning based ones, have been proposed to detect cardiac abnormalities using ECG records, showing good potential to improve clinical diagnosis and help early prevention of cardiovascular diseases. However, the predictions of the known neural networks still do not satisfactorily meet the needs of clinicians, and this phenomenon suggests that some information used in clinical diagnosis may not be well captured and utilized by these methods. In this paper, we introduce some rules into convolutional neural networks, which help present clinical knowledge to deep learning based ECG analysis, in order to improve automated ECG diagnosis performance. Specifically, we propose a Handcrafted-Rule-enhanced Neural Network (called HRNN) for ECG classification with standard 12-lead ECG input, which consists of a rule inference module and a deep learning module. Experiments on two large-scale public ECG datasets show that our new approach considerably outperforms existing state-of-the-art methods. Further, our proposed approach not only can improve the diagnosis performance, but also can assist in detecting mislabelled ECG samples. Our codes are available at https://github.com/alwaysbyx/ecg_processing.
Submitted 16 June, 2022;
originally announced June 2022.
-
Adversarial Patch Attacks and Defences in Vision-Based Tasks: A Survey
Authors:
Abhijith Sharma,
Yijun Bian,
Phil Munz,
Apurva Narayan
Abstract:
Adversarial attacks on deep learning models, especially for safety-critical systems, have gained more and more attention in recent years, due to the lack of trust in the security and robustness of AI models. Yet the more primitive adversarial attacks might be physically infeasible or require some resources that are hard to access, like the training data, which motivated the emergence of patch attacks. In this survey, we provide a comprehensive overview to cover existing techniques of adversarial patch attacks, aiming to help interested researchers quickly catch up with the progress in this field. We also discuss existing techniques for developing detection and defences against adversarial patches, aiming to help the community better understand this field and its applications in the real world.
Submitted 16 June, 2022;
originally announced June 2022.
-
Automatic Prosody Annotation with Pre-Trained Text-Speech Model
Authors:
Ziqian Dai,
Jianwei Yu,
Yan Wang,
Nuo Chen,
Yanyao Bian,
Guangzhi Li,
Deng Cai,
Dong Yu
Abstract:
Prosodic boundary plays an important role in text-to-speech synthesis (TTS) in terms of naturalness and readability. However, the acquisition of prosodic boundary labels relies on manual annotation, which is costly and time-consuming. In this paper, we propose to automatically extract prosodic boundary labels from text-audio data via a neural text-speech model with pre-trained audio encoders. This model is pre-trained on text and speech data separately and jointly fine-tuned on TTS data in a triplet format: {speech, text, prosody}. The experimental results on both automatic evaluation and human evaluation demonstrate that: 1) the proposed text-speech prosody annotation framework significantly outperforms text-only baselines; 2) the quality of automatic prosodic boundary annotations is comparable to human annotations; 3) TTS systems trained with model-annotated boundaries are slightly better than systems that use manual ones.
Submitted 16 June, 2022;
originally announced June 2022.
-
Content-Dependent Fine-Grained Speaker Embedding for Zero-Shot Speaker Adaptation in Text-to-Speech Synthesis
Authors:
Yixuan Zhou,
Changhe Song,
Xiang Li,
Luwen Zhang,
Zhiyong Wu,
Yanyao Bian,
Dan Su,
Helen Meng
Abstract:
Zero-shot speaker adaptation aims to clone an unseen speaker's voice without any adaptation time or parameters. Previous research usually uses a speaker encoder to extract a global fixed speaker embedding from reference speech, and several attempts have tried variable-length speaker embeddings. However, these approaches neglect to transfer the personal pronunciation characteristics related to phoneme content, leading to poor speaker similarity in terms of detailed speaking styles and pronunciation habits. To improve the ability of the speaker encoder to model personal pronunciation characteristics, we propose a content-dependent fine-grained speaker embedding for zero-shot speaker adaptation. The corresponding local content embeddings and speaker embeddings are both extracted from a reference speech. Instead of modeling the temporal relations, a reference attention module is introduced to model the content relevance between the reference speech and the input text, and to generate the fine-grained speaker embedding for each phoneme encoder output. The experimental results show that our proposed method can improve the speaker similarity of synthesized speech, especially for unseen speakers.
Submitted 11 November, 2022; v1 submitted 3 April, 2022;
originally announced April 2022.
-
Research on Flexibility Margin of Electric-Hydrogen Coupling Energy Block Based on Model Predictive Control
Authors:
Zijiao Han,
Shun Yuan,
Yannan Dong,
Shaohua Ma,
Yudong Bian,
Xinyu Mao
Abstract:
Hydrogen energy plays an important role in the low-carbon energy transition, and electric-hydrogen coupling will become a typical energy scenario. To address the operational flexibility of low-carbon electric-hydrogen coupling systems with a high proportion of wind power and photovoltaics, this paper studies the flexibility margin of an electric-hydrogen coupling energy block based on model predictive control (MPC). By analyzing the power exchange characteristics of heterogeneous energy, homogenization models of the various heterogeneous energy sources are established. Based on an analysis of the power system flexibility margin, three dimensions of flexibility margin evaluation indices are defined from the perspective of system operation, and a scheduling model for the electric-hydrogen coupling energy block is established. The model predictive control algorithm is used to optimize the power balance operation of the energy block, and the flexibility margin of the energy block is quantitatively analyzed and calculated. An example analysis verifies that the proposed calculation method can not only realize online power balance optimization of the electric-hydrogen coupling energy block, but also effectively quantify its operational flexibility margin.
Submitted 25 March, 2022;
originally announced March 2022.
-
Enhancing Word-Level Semantic Representation via Dependency Structure for Expressive Text-to-Speech Synthesis
Authors:
Yixuan Zhou,
Changhe Song,
Jingbei Li,
Zhiyong Wu,
Yanyao Bian,
Dan Su,
Helen Meng
Abstract:
Exploiting rich linguistic information in raw text is crucial for expressive text-to-speech (TTS). As large-scale pre-trained text representations develop, bidirectional encoder representations from Transformers (BERT) has been shown to embody semantic information and has recently been applied to TTS. However, original or simply fine-tuned BERT embeddings still cannot provide the semantic knowledge that expressive TTS models should take into account. In this paper, we propose a word-level semantic representation enhancing method based on dependency structure and pre-trained BERT embedding. The BERT embedding of each word is reprocessed considering its specific dependencies and related words in the sentence, to generate a more effective semantic representation for TTS. To better utilize the dependency structure, a relational gated graph network (RGGN) is introduced to make semantic information flow and aggregate through the dependency structure. The experimental results show that the proposed method can further improve the naturalness and expressiveness of synthesized speech on both Mandarin and English datasets.
Submitted 11 November, 2022; v1 submitted 14 April, 2021;
originally announced April 2021.
-
Distributed Model Predicted Control of Multi-agent Systems with Applications to Multi-vehicle Cooperation
Authors:
Yougang Bian,
Changkun Du,
Manjiang Hu,
Haikuo Liu
Abstract:
This paper proposes a distributed model predicted control (DMPC) approach for consensus control of multi-agent systems (MASs) with linear agent dynamics and bounded control input constraints. Within the proposed DMPC framework, each agent exchanges assumed state trajectories with neighbors and solves a local open-loop optimization problem to obtain the optimal control input. In the optimization problem, a discrete-time consensus protocol is introduced into update law design for assumed terminal states, with which asymptotic consensus of assumed terminal states and recursive feasibility are rigorously proved. Together with the optimal cost function, an infinite series of cost-to-go functions is introduced into the design of a Lyapunov function, with which closed-loop asymptotic consensus is finally proved. Two applications including cooperation of autonomous underwater vehicles (AUVs) and connected and automated vehicles (CAVs) are used to validate the effectiveness of the proposed DMPC approach.
Submitted 15 September, 2020;
originally announced September 2020.
-
Robust Pancreatic Ductal Adenocarcinoma Segmentation with Multi-Institutional Multi-Phase Partially-Annotated CT Scans
Authors:
Ling Zhang,
Yu Shi,
Jiawen Yao,
Yun Bian,
Kai Cao,
Dakai Jin,
Jing Xiao,
Le Lu
Abstract:
Accurate and automated tumor segmentation is highly desired since it has great potential to increase the efficiency and reproducibility of computing more complete tumor measurements and imaging biomarkers, compared to (often partial) human measurements. This is probably the only viable means to enable the large-scale clinical oncology patient studies that utilize medical imaging. Deep learning approaches have shown robust segmentation performance for certain types of tumors, e.g., brain tumors in MRI imaging, when a training dataset with plenty of pixel-level fully-annotated tumor images is available. However, more often than not, only (very) limited annotations are feasible to acquire, especially for hard tumors. Pancreatic ductal adenocarcinoma (PDAC) segmentation is one of the most challenging tumor segmentation tasks, yet critically important for clinical needs. Previous work on PDAC segmentation is limited to moderate amounts of annotated patient images (n<300) from venous or venous+arterial phase CT scans. Based on a new self-learning framework, we propose to train the PDAC segmentation model using a much larger quantity of patients (n≈1,000), with a mix of annotated and un-annotated venous or multi-phase CT images. Pseudo annotations are generated by combining two teacher models with different PDAC segmentation specialties on unannotated images, and can be further refined by a teaching assistant model that identifies associated vessels around the pancreas. A student model is trained on both manual and pseudo annotated multi-phase images. Experimental results show that our proposed method provides an absolute improvement of 6.3% Dice score over the strong baseline of nnUNet trained on annotated images, achieving performance (Dice = 0.71) similar to the inter-observer variability between radiologists.
Submitted 24 August, 2020;
originally announced August 2020.
-
High-throughput screening of encapsulated islets using wide-field lens-free on-chip imaging
Authors:
Yibo Zhang,
Michael Alexander,
Sam Yang,
Yinxu Bian,
Elliot Botvinick,
Jonathan R. T. Lakey,
Aydogan Ozcan
Abstract:
Islet microencapsulation is a promising solution to diabetes treatment, but its quality control based on manual microscopic inspection is extremely low-throughput, highly variable and laborious. This study presents a high-throughput islet-encapsulation quality screening system based on lens-free on-chip imaging with a wide field-of-view of 18.15 cm^2, which is more than 100 times larger than that of a lens-based optical microscope, enabling it to image and analyze ~8,000 microcapsules in a single frame. Custom-written image reconstruction and processing software provides the user with clinically important information, such as microcapsule count, size, intactness, and information on whether each capsule contains an islet. This high-throughput and cost-effective platform can be useful for researchers to develop better encapsulation protocols as well as perform quality control prior to transplantation.
Submitted 8 March, 2018;
originally announced March 2018.