-
Battery Operations in Electricity Markets: Strategic Behavior and Distortions
Authors:
Jerry Anunrojwong,
Santiago R. Balseiro,
Omar Besbes,
Bolun Xu
Abstract:
Electric power systems are undergoing a major transformation as they integrate intermittent renewable energy sources, and batteries to smooth out variations in renewable energy production. As privately-owned batteries grow from their role as marginal "price-takers" to significant players in the market, a natural question arises: How do batteries operate in electricity markets, and how does the strategic behavior of decentralized batteries distort decisions compared to centralized batteries?
We propose an analytically tractable model that captures salient features of the highly complex electricity market. We derive in closed form the resulting battery behavior and generation cost in three operating regimes: (i) no battery, (ii) centralized battery, and (iii) decentralized profit-maximizing battery. We establish that a decentralized battery distorts its discharge decisions in three ways. First, there is quantity withholding, i.e., discharging less than centrally optimal. Second, there is a shift in participation from day-ahead to real-time, i.e., postponing some of its discharge from day-ahead to real-time. Third, there is a reduction in real-time responsiveness, i.e., discharging less in response to smoothing real-time demand than centrally optimal. We quantify each of the three forms of distortion in terms of market fundamentals. To illustrate our results, we calibrate our model to Los Angeles and Houston and show that the loss from incentive misalignment could be consequential.
Submitted 26 June, 2024;
originally announced June 2024.
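The quantity-withholding effect described in this abstract can be illustrated with a deliberately simplified one-period toy model. This is a sketch and not the authors' model: it assumes a quadratic generation cost `0.5 * a * g**2` (so the market price equals the marginal cost `a * (demand - q)`), a fixed peak demand, and a constant per-unit opportunity cost `c` for the stored energy. All function names and parameters are hypothetical.

```python
def central_discharge(a: float, demand: float, c: float) -> float:
    """Discharge q minimizing total cost 0.5*a*(demand - q)**2 + c*q (toy model)."""
    # First-order condition: a * (demand - q) = c
    q = demand - c / a
    return min(max(q, 0.0), demand)  # keep q within [0, demand]


def strategic_discharge(a: float, demand: float, c: float) -> float:
    """Discharge q maximizing battery profit (a*(demand - q) - c) * q (toy model)."""
    # First-order condition: a*demand - 2*a*q - c = 0
    q = (a * demand - c) / (2.0 * a)
    return min(max(q, 0.0), demand)  # keep q within [0, demand]


def withholding(a: float, demand: float, c: float) -> float:
    """Gap between the centrally optimal and the profit-maximizing discharge."""
    return central_discharge(a, demand, c) - strategic_discharge(a, demand, c)


if __name__ == "__main__":
    # Hypothetical numbers: cost slope 2, peak demand 10, opportunity cost 4.
    print(central_discharge(2.0, 10.0, 4.0))
    print(strategic_discharge(2.0, 10.0, 4.0))
    print(withholding(2.0, 10.0, 4.0))
```

In this toy setting the profit-maximizing battery internalizes the price drop caused by its own discharge, so it discharges half of the centrally optimal quantity whenever the interior solution applies. The paper's closed-form results cover a richer setting with day-ahead and real-time stages.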
-
Perturbed Decision-Focused Learning for Modeling Strategic Energy Storage
Authors:
Ming Yi,
Saud Alghumayjan,
Bolun Xu
Abstract:
This paper presents a novel decision-focused framework integrating the physical energy storage model into machine learning pipelines. Motivated by the model predictive control for energy storage, our end-to-end method incorporates the prior knowledge of the storage model and infers the hidden reward that incentivizes energy storage decisions. This is achieved through a dual-layer framework, combining a prediction layer with an optimization layer. We introduce the perturbation idea into the designed decision-focused loss function to ensure the differentiability over linear storage models, supported by a theoretical analysis of the perturbed loss function. We also develop a hybrid loss function for effective model training. We provide two challenging applications for our proposed framework: energy storage arbitrage, and energy storage behavior prediction. The numerical experiments on real price data demonstrate that our arbitrage approach achieves the highest profit against existing methods. The numerical experiments on synthetic and real-world energy storage data show that our approach achieves the best behavior prediction performance against existing benchmark methods, which shows the effectiveness of our method.
Submitted 24 June, 2024;
originally announced June 2024.
-
AV-CrossNet: an Audiovisual Complex Spectral Mapping Network for Speech Separation By Leveraging Narrow- and Cross-Band Modeling
Authors:
Vahid Ahmadi Kalkhorani,
Cheng Yu,
Anurag Kumar,
Ke Tan,
Buye Xu,
DeLiang Wang
Abstract:
Adding visual cues to audio-based speech separation can improve separation performance. This paper introduces AV-CrossNet, an audiovisual (AV) system for speech enhancement, target speaker extraction, and multi-talker speaker separation. AV-CrossNet is extended from the CrossNet architecture, which is a recently proposed network that performs complex spectral mapping for speech separation by leveraging global attention and positional encoding. To effectively utilize visual cues, the proposed system incorporates pre-extracted visual embeddings and employs a visual encoder comprising temporal convolutional layers. Audio and visual features are fused in an early fusion layer before feeding to AV-CrossNet blocks. We evaluate AV-CrossNet on multiple datasets, including LRS, VoxCeleb, and COG-MHEAR challenge. Evaluation results demonstrate that AV-CrossNet advances the state-of-the-art performance in all audiovisual tasks, even on untrained and mismatched datasets.
Submitted 17 June, 2024;
originally announced June 2024.
-
Prudent Price-Responsive Demands
Authors:
Liudong Chen,
Bolun Xu
Abstract:
We investigate a flexible demand with a risk-neutral cost-saving objective in response to volatile electricity prices. We introduce the concept of prudent demand, which states that future price uncertainties will affect immediate consumption patterns, despite the price expectations remaining unchanged. We develop a theoretical framework and prove that demand exhibits prudence when the third-order derivative of its utility cost function is positive, and show that a prudent demand demonstrates risk-averse behaviors despite the objective being risk-neutral. Prudent demands exhibit skewness aversion, with increased price skewness elevating the cost associated with prudence. We validate our theoretical findings through numerical simulations and discuss their implications for demand response modeling and the future design of incentive-based demand response mechanisms.
Submitted 25 May, 2024;
originally announced May 2024.
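One standard way to see why the sign of a third derivative matters in the abstract above is Jensen's inequality applied to the marginal cost. The block below is a generic sketch, not the paper's proof: the function $f$ and the random variable $X$ are placeholder symbols introduced here for illustration.

```latex
% Assumption (not from the paper): f is a three-times differentiable cost
% function and X is a random quantity with finite mean.
f'''(x) > 0 \ \text{for all } x
\;\Longrightarrow\; f' \text{ is strictly convex}
\;\Longrightarrow\; \mathbb{E}\!\left[ f'(X) \right] \;\ge\; f'\!\left( \mathbb{E}[X] \right),
% with equality only when X is deterministic.
```

Under these assumptions, adding zero-mean uncertainty to $X$ raises the expected marginal cost even though $\mathbb{E}[X]$ is unchanged, so a first-order optimality condition that balances today's marginal cost against tomorrow's expected marginal cost shifts as soon as uncertainty appears.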
-
Study of 5G base station antenna array performance for self-interference reduction
Authors:
Bing Xue,
Katsuyuki Haneda,
Clemens Icheln
Abstract:
This work studies 5G base station antenna array performance for self-interference reduction. A line-of-sight channel model and a Rayleigh channel model are developed, and the relevant channel capacity calculations are shown. This is preliminary material for the study; more results and conclusions will be presented soon.
Submitted 23 May, 2024;
originally announced May 2024.
-
Permittivity Characterization of Human Skin Based on a Quasi-optical System at Sub-THz
Authors:
Bing Xue,
Juha Tuomela,
Katsuyuki Haneda,
Clemens Icheln,
Juha Ala-Laurinaho
Abstract:
This paper introduces a novel approach to experimentally characterize effective human skin permittivity at sub-Terahertz (sub-THz) frequencies, specifically from 140 to 210 GHz, utilizing a quasi-optical measurement system. To ensure accurate measurement of the reflection coefficients of human skin, a planar, rigid, and thick reference plate with a low-loss dielectric is utilized to flatten the human skin surface. A permittivity characterization method is proposed both to reduce permittivity estimation deviations resulting from the pressure effects on the phase displacements of skins under measurement and to ensure repeatability of the measurement. In practical permittivity characterizations, the complex permittivities of the finger, palm, and arm of seven volunteers show small standard deviations across repeated measurements, while they vary significantly across different skin regions and between persons. The proposed measurement system holds significant potential for future skin permittivity estimation in sub-THz bands, facilitating further studies on human-electromagnetic-wave interactions based on the measured permittivity values.
Submitted 20 May, 2024;
originally announced May 2024.
-
Inner-approximate Reachability Computation via Zonotopic Boundary Analysis
Authors:
Dejin Ren,
Zhen Liang,
Chenyu Wu,
Jianqiang Ding,
Taoran Wu,
Bai Xue
Abstract:
Inner-approximate reachability analysis involves calculating subsets of reachable sets, known as inner-approximations. This analysis is crucial in the fields of dynamic systems analysis and control theory as it provides a reliable estimation of the set of states that a system can reach from given initial states at a specific time instant. In this paper, we study the inner-approximate reachability analysis problem based on the set-boundary reachability method for systems modelled by ordinary differential equations, in which the computed inner-approximations are represented with zonotopes. The set-boundary reachability method computes an inner-approximation by excluding states reached from the initial set's boundary. The effectiveness of this method is highly dependent on the efficient extraction of the exact boundary of the initial set. To address this, we propose methods leveraging boundary and tiling matrices that can efficiently extract and refine the exact boundary of the initial set represented by zonotopes. Additionally, we enhance the exclusion strategy by contracting the outer-approximations in a flexible way, which allows for the computation of less conservative inner-approximations. To evaluate the proposed method, we compare it with state-of-the-art methods against a series of benchmarks. The numerical results demonstrate that our method is not only efficient but also accurate in computing inner-approximations.
Submitted 21 May, 2024; v1 submitted 17 May, 2024;
originally announced May 2024.
-
Co-learning-aided Multi-modal-deep-learning Framework of Passive DOA Estimators for a Heterogeneous Hybrid Massive MIMO Receiver
Authors:
Jiatong Bai,
Feng Shu,
Qinghe Zheng,
Bo Xu,
Baihua Shi,
Yiwen Chen,
Weibin Zhang,
Xianpeng Wang
Abstract:
Due to their excellent performance in rate and resolution, fully-digital (FD) massive multiple-input multiple-output (MIMO) antenna arrays have been widely applied in data transmission and direction of arrival (DOA) measurements. However, they face two main challenges: high computational complexity and high circuit cost. Both problems may be addressed by a hybrid analog-digital (HAD) structure, but HAD suffers from phase ambiguity, which leads to low efficiency or high latency. Is there a MIMO structure that offers low cost, low complexity, and high time efficiency at the same time? To satisfy these three properties, a novel heterogeneous hybrid MIMO receiver structure integrating FD and heterogeneous HAD ($\rm{H}^2$AD-FD) is proposed, and a corresponding multi-modal (MD) learning framework is developed. The framework includes three major stages: 1) generate the candidate sets via root multiple signal classification (Root-MUSIC) or deep learning (DL); 2) infer the class of true solutions from the candidate sets using machine learning (ML) methods; 3) fuse the two-part true solutions to achieve a better DOA estimate. This process forms two methods named MD-Root-MUSIC and MDDL. To improve DOA estimation accuracy and reduce the clustering complexity, a co-learning-aided MD framework is proposed to form two enhanced methods named CoMDDL and CoMD-RootMUSIC. Moreover, the Cramer-Rao lower bound (CRLB) for the proposed $\rm{H}^2$AD-FD structure is also derived. Experimental results demonstrate that the four proposed methods can approach the CRLB for signal-to-noise ratio (SNR) > 0 dB, and that the proposed CoMDDL and MDDL perform better than CoMD-RootMUSIC and MD-RootMUSIC, particularly in the extremely low SNR region.
Submitted 12 June, 2024; v1 submitted 27 April, 2024;
originally announced May 2024.
-
Human Skin Permittivity Characterization for Mobile Handset Evaluation at Sub-THz
Authors:
Bing Xue,
Katsuyuki Haneda,
Clemens Icheln,
Juha Ala-Laurinaho
Abstract:
This manuscript proposes a method for characterizing the complex permittivity of the human finger skin based on an open-ended waveguide covered with a thin dielectric sheet at sub-terahertz frequencies. The measurement system is initially analyzed through full-wave simulations with a detailed finger model. Next, the model is simplified by replacing the finger with an infinite sheet of human skin to calculate the forward electromagnetic problem related to the permittivity characterization. Following this, a radial basis network is employed to train the inverse problem solver. Finally, the complex permittivities of finger skins are characterized for 10 volunteers. The variations in complex relative permittivity across different individuals and skin regions are analyzed, revealing a deviation of $<\pm 1.5$ for both the dielectric constants and loss factors across 140 to 220 GHz. Repeated measurements at the same location on the finger demonstrate good repeatability with a relative estimation uncertainty $<\pm 1.5\%$.
Submitted 23 May, 2024; v1 submitted 9 May, 2024;
originally announced May 2024.
-
Complex Permittivity Characterization of Low-Loss Dielectric Slabs at Sub-THz
Authors:
Bing Xue,
Katsuyuki Haneda,
Clemens Icheln,
Juha Tuomela
Abstract:
This manuscript presents a novel method for characterizing the permittivities of low-loss dielectric slabs in sub-terahertz (sub-THz) frequencies, specifically above 100 GHz using a quasi-optical system. The algorithm is introduced with detailed derivations, and the measurement sensitivity is analyzed through simulations. Subsequently, the method's validity is established via simulations, demonstrating high accuracy (error 0.1% for the loss tangent) for a 30 mm thick plate material and relatively lower accuracy (error <5% for the loss tangent) for a 6 mm thick plate material. Notably, this accuracy surpasses that of the approach presented in [1] when the same window width is used to extract signals. Furthermore, a comparison between the permittivities of plexiglass with a 30 mm thickness characterized by the proposed method and the approach in [1] reveals a maximum difference in the dielectric constant of 0.011 and in loss tangent of 0.00071 from 140 to 220 GHz. Finally, the relative complex permittivities of plexiglass at 142.86 GHz obtained by both methods are compared with the reference values provided in [2], exhibiting differences of 0.06 in the dielectric constant.
Submitted 9 May, 2024;
originally announced May 2024.
-
Market Power and Withholding Behavior of Energy Storage Units
Authors:
Yiqian Wu,
Bolun Xu,
James Anderson
Abstract:
Electricity markets are experiencing a rapid increase in energy storage unit participation. Unlike conventional generation resources, quantifying the competitive operation and identifying if a storage unit is exercising market power is challenging, particularly in the context of multi-interval bidding strategies. We present a framework to differentiate strategic capacity withholding behaviors attributed to market power from inherent competitive bidding in storage unit strategies. Our framework evaluates the profitability of strategic storage unit participation, analyzing bidding behaviors as both price takers and price makers using a self-scheduling model, and investigates how they leverage market inefficiencies. Specifically, we propose a price sensitivity model derived from the linear supply function equilibrium model to examine the price-anticipating bidding strategy, effectively capturing the influence of market power. We introduce a sufficient ex-post analysis for market operators to identify potential exploitative behaviors by monitoring instances of withholding within the bidding profiles, ensuring market resilience and competitiveness. We discuss and verify applicability of the proposed framework to realistic settings. Our analysis substantiates commonly observed economic bidding behaviors of storage units. Furthermore, it demonstrates that significant price volatility offers considerable profit opportunities not only for participants possessing market power but also for typical strategic profit seekers.
Submitted 2 May, 2024;
originally announced May 2024.
-
Finite-time Safety and Reach-avoid Verification of Stochastic Discrete-time Systems
Authors:
Bai Xue
Abstract:
This paper studies finite-time safety and reach-avoid verification for stochastic discrete-time dynamical systems. The aim is to ascertain lower and upper bounds of the probability that, within a predefined finite-time horizon, a system starting from an initial state in a safe set will either exit the safe set (safety verification) or reach a target set while remaining within the safe set until the first encounter with the target (reach-avoid verification). We introduce novel barrier-like sufficient conditions for characterizing these bounds, which either complement existing ones or fill gaps. Finally, we demonstrate the efficacy of these conditions on two examples.
Submitted 28 April, 2024;
originally announced April 2024.
-
Energy Storage Arbitrage in Two-settlement Markets: A Transformer-Based Approach
Authors:
Saud Alghumayjan,
Jiajun Han,
Ningkun Zheng,
Ming Yi,
Bolun Xu
Abstract:
This paper presents an integrated model for bidding energy storage in day-ahead and real-time markets to maximize profits. We show that in integrated two-stage bidding, the real-time bids are independent of day-ahead settlements, while the day-ahead bids should be based on predicted real-time prices. We utilize a transformer-based model for real-time price prediction, which captures complex dynamical patterns of real-time prices, and use the result for day-ahead bidding design. For real-time bidding, we utilize a long short-term memory-dynamic programming hybrid real-time bidding model. We train and test our model with historical data from New York State, and our results show that the integrated system achieves an almost 20% increase in profit compared to bidding only in real-time markets, while also reducing risk in terms of the number of days with negative profits.
Submitted 26 April, 2024;
originally announced April 2024.
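The separation between real-time and day-ahead decisions stated in this abstract is consistent with standard two-settlement accounting, which can be written out as follows. This is a sketch under assumptions not taken from the paper: the storage unit is a price-taker, physical constraints apply only to the real-time dispatch, and the symbols $q^{DA}_t$, $q^{RT}_t$, $\lambda^{DA}_t$, $\lambda^{RT}_t$ are hypothetical notation introduced here.

```latex
% Assumed notation: q^{DA}_t is the day-ahead cleared quantity, q^{RT}_t the
% physical real-time dispatch, and \lambda^{DA}_t, \lambda^{RT}_t the prices.
\begin{aligned}
\Pi &= \sum_t \Big[ \lambda^{DA}_t \, q^{DA}_t
      + \lambda^{RT}_t \big( q^{RT}_t - q^{DA}_t \big) \Big] \\
    &= \sum_t \big( \lambda^{DA}_t - \lambda^{RT}_t \big) \, q^{DA}_t
      \;+\; \sum_t \lambda^{RT}_t \, q^{RT}_t .
\end{aligned}
```

In this decomposition the second sum depends only on real-time prices and the physical dispatch, so the real-time decision does not depend on what was settled day-ahead. The first sum is a purely financial position on the spread between day-ahead and real-time prices, which is why the day-ahead quantity hinges on a prediction of real-time prices.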
-
Sky-GVIO: an enhanced GNSS/INS/Vision navigation with FCN-based sky-segmentation in urban canyon
Authors:
Jingrong Wang,
Bo Xu,
Ronghe Jin,
Shoujian Zhang,
Kefu Gao,
Jingnan Liu
Abstract:
Accurate, continuous, and reliable positioning is a critical component of achieving autonomous driving. However, in complex urban canyon environments, the vulnerability of a stand-alone sensor and non-line-of-sight (NLOS) reception caused by high buildings, trees, and elevated structures seriously affect positioning results. To address these challenges, a sky-view image segmentation algorithm based on a Fully Convolutional Network (FCN) is proposed for GNSS NLOS detection. Building upon this, a novel NLOS detection and mitigation algorithm (named S-NDM) is extended to the tightly coupled Global Navigation Satellite Systems (GNSS), Inertial Measurement Units (IMU), and visual feature system, which is called Sky-GVIO, with the aim of achieving continuous and accurate positioning in urban canyon environments. Furthermore, the system combines Single Point Positioning (SPP) with Real-Time Kinematic (RTK) methods to improve its versatility and resilience. In urban canyon environments, the positioning performance of the proposed S-NDM algorithm is evaluated under different tightly coupled SPP-related and RTK-related models. The results show that the Sky-GVIO system achieves meter-level accuracy under SPP mode and sub-decimeter precision with RTK, surpassing the performance of GNSS/INS/Vision frameworks without S-NDM. Additionally, the sky-view image dataset, including training and evaluation subsets, has been made publicly available at https://github.com/whuwangjr/sky-view-images.
Submitted 17 April, 2024;
originally announced April 2024.
-
A Framework for Safe Probabilistic Invariance Verification of Stochastic Dynamical Systems
Authors:
Taoran Wu,
Yiqing Yu,
Bican Xia,
Ji Wang,
Bai Xue
Abstract:
Ensuring safety through set invariance has proven to be a valuable method in various robotics and control applications. This paper introduces a comprehensive framework for the safe probabilistic invariance verification of both discrete- and continuous-time stochastic dynamical systems over an infinite time horizon. The objective is to ascertain the lower and upper bounds of the liveness probability for a given safe set and set of initial states. This probability signifies the likelihood of the system remaining within the safe set indefinitely, starting from the set of initial states. To address this problem, we propose optimizations for verifying safe probabilistic invariance in discrete-time and continuous-time stochastic dynamical systems. These optimizations adapt classical stochastic barrier certificates, which are based on Doob's non-negative supermartingale inequality, and the equations described in [29],[31], which can precisely define the probability of reaching a target set while avoiding unsafe states. Finally, we demonstrate the effectiveness of these optimizations through several examples using semi-definite programming tools.
Submitted 13 April, 2024;
originally announced April 2024.
-
Economic Capacity Withholding Bounds of Competitive Energy Storage Bidders
Authors:
Xin Qin,
Ioannis Lestas,
Bolun Xu
Abstract:
Economic withholding in electricity markets refers to generators bidding higher than their true marginal fuel cost, and is a typical approach to exercising market power. However, existing market designs require storage to design bids strategically based on their own future price predictions, motivating storage to conduct economic withholding without assuming market power. As energy storage takes up more significant roles in wholesale electricity markets, understanding its motivations for economic withholding and the consequent effects on social welfare becomes increasingly vital. This paper derives a theoretical framework to study the economic capacity withholding behavior of storage participating in competitive electricity markets and validates the results in simulations based on the ISO New England system. We demonstrate that storage bids can reach unbounded high levels under conditions where future price predictions show bounded expectations but unbounded deviations. Conversely, in scenarios with peak price limitations, we show the upper bounds of storage bids are grounded in bounded price expectations. Most importantly, we show that storage capacity withholding can potentially lower the overall system cost when price models account for system uncertainties. Our paper reveals that energy storage is not a market manipulator but an honest player contributing to social welfare. It helps electricity market researchers and operators better understand the economic withholding behavior of storage and reform market policies to maximize storage's contribution to a cost-efficient decarbonization.
Submitted 8 March, 2024;
originally announced March 2024.
-
A Closer Look at Wav2Vec2 Embeddings for On-Device Single-Channel Speech Enhancement
Authors:
Ravi Shankar,
Ke Tan,
Buye Xu,
Anurag Kumar
Abstract:
Self-supervised learned (SSL) models have been found to be very effective for certain speech tasks such as automatic speech recognition, speaker identification, keyword spotting, and others. While the features are undeniably useful in speech recognition and associated tasks, their utility in speech enhancement systems is yet to be firmly established, and perhaps not properly understood. In this paper, we investigate the uses of SSL representations for single-channel speech enhancement in challenging conditions and find that they add very little value for the enhancement task. Our constraints are designed around on-device real-time speech enhancement: the model is causal and the compute footprint is small. Additionally, we focus on low SNR conditions where such models struggle to provide good enhancement. In order to systematically examine how SSL representations impact the performance of such enhancement models, we propose a variety of techniques to utilize these embeddings, which include different forms of knowledge distillation and pre-training.
Submitted 2 March, 2024;
originally announced March 2024.
-
Converse Barrier Certificates for Finite-time Safety Verification of Continuous-time Perturbed Deterministic Systems
Authors:
Yonghan Li,
Chenyu Wu,
Taoran Wu,
Shijie Wang,
Bai Xue
Abstract:
In this paper, we investigate the problem of verifying the finite-time safety of continuous-time perturbed deterministic systems represented by ordinary differential equations in the presence of measurable disturbances. Given a finite time horizon, the system is safe if, starting from a compact initial set, it remains within an open and bounded safe region throughout the specified time horizon, regardless of the disturbances. The main contribution of this work is to show that there exists a time-dependent barrier certificate if and only if the system is safe. This barrier certificate satisfies the following conditions: negativity over the initial set at the initial time instant, non-negativity over the boundary of the safe set, and non-increasing behavior along the system dynamics over the specified finite time horizon. The existence problem is explored using a Hamilton-Jacobi differential equation, which has a unique Lipschitz viscosity solution.
Submitted 26 February, 2024;
originally announced February 2024.
-
On the Importance of Neural Wiener Filter for Resource Efficient Multichannel Speech Enhancement
Authors:
Tsun-An Hsieh,
Jacob Donley,
Daniel Wong,
Buye Xu,
Ashutosh Pandey
Abstract:
We introduce a time-domain framework for efficient multichannel speech enhancement, emphasizing low latency and computational efficiency. This framework incorporates two compact deep neural networks (DNNs) surrounding a multichannel neural Wiener filter (NWF). The first DNN enhances the speech signal to estimate NWF coefficients, while the second DNN refines the output from the NWF. The NWF, while conceptually similar to the traditional frequency-domain Wiener filter, undergoes a training process optimized for low-latency speech enhancement, involving fine-tuning of both analysis and synthesis transforms. Our research results illustrate that the NWF output, having minimal nonlinear distortions, attains performance levels akin to those of the first DNN, deviating from conventional Wiener filter paradigms. Training all components jointly outperforms sequential training, despite its simplicity. Consequently, this framework achieves superior performance with fewer parameters and reduced computational demands, making it a compelling solution for resource-efficient multichannel speech enhancement.
Submitted 15 January, 2024;
originally announced January 2024.
-
Decoupled Spatial and Temporal Processing for Resource Efficient Multichannel Speech Enhancement
Authors:
Ashutosh Pandey,
Buye Xu
Abstract:
We present a novel model designed for resource-efficient multichannel speech enhancement in the time domain, with a focus on low latency, a lightweight design, and low computational requirements. The proposed model incorporates explicit spatial and temporal processing within deep neural network (DNN) layers. Inspired by frequency-dependent multichannel filtering, our spatial filtering process applies multiple trainable filters to each hidden unit across the spatial dimension, resulting in a multichannel output. The temporal processing is applied over a single-channel output stream from the spatial processing using a Long Short-Term Memory (LSTM) network. The output from the temporal processing stage is then further integrated into the spatial dimension through elementwise multiplication. This explicit separation of spatial and temporal processing results in a resource-efficient network design. Empirical findings from our experiments show that our proposed model significantly outperforms robust baseline models while demanding far fewer parameters and computations and achieving an ultra-low algorithmic latency of just 2 milliseconds.
Submitted 15 January, 2024;
originally announced January 2024.
-
A New Framework for Bounding Reachability Probabilities of Continuous-time Stochastic Systems
Authors:
Bai Xue
Abstract:
This manuscript presents an innovative framework for constructing barrier functions to bound reachability probabilities for continuous-time stochastic systems described by stochastic differential equations (SDEs). The reachability probabilities considered in this paper encompass two aspects: the probability of reaching a set of specified states within a predefined finite time horizon, and the probability of reaching a set of specified states at a particular time instant. The barrier functions presented in this manuscript are developed either by relaxing a parabolic partial differential equation that characterizes the exact reachability probability or by applying Grönwall's inequality. In comparison to the prevailing construction method, which relies on Doob's non-negative supermartingale inequality (or Ville's inequality), the proposed barrier functions provide stronger alternatives, complement existing methods, or fill gaps.
Submitted 25 December, 2023;
originally announced December 2023.
-
A WECC-based Model for Simulating Two-stage Market Clearing with High-temporal-resolution
Authors:
Ningkun Zheng,
Bolun Xu
Abstract:
This paper presents a new open-source model for simulating two-stage market clearing based on the Western Electricity Coordinating Council Anchor Data Set. We model accurate two-stage market clearing with day-ahead unit commitment at hourly resolution and real-time economic dispatch with five-minute resolution. Both day-ahead unit commitment and real-time economic dispatch can incorporate look-ahead rolling horizons. The model includes seven market regions and a full year of data, detailing 2,403 individual generation assets across diverse energy sources. The year-long simulation demonstrates the capability of our model to closely reflect the generation and price patterns of the California ISO. Our sensitivity analysis revealed that extending the ED look-ahead horizon reduces system costs by up to 0.12%. We expect this new system model to fulfill the needs of conducting electricity market analysis at finer time granularity for market designs and emerging technology integration. While we focus on the western interconnection, the model serves as a base to simulate other two-stage clearing market locations.
Submitted 23 December, 2023;
originally announced December 2023.
-
Explainable Severity ranking via pairwise n-hidden comparison: a case study of glaucoma
Authors:
Hong Nguyen,
Cuong V. Nguyen,
Shrikanth Narayanan,
Benjamin Y. Xu,
Michael Pazzani
Abstract:
Primary open-angle glaucoma (POAG) is a chronic and progressive optic nerve condition that results in an acquired loss of optic nerve fibers and potential blindness. The gradual onset of glaucoma results in patients progressively losing their vision without being consciously aware of the changes. To diagnose POAG and determine its severity, patients must undergo a comprehensive dilated eye examination. In this work, we build a framework to rank, compare, and interpret the severity of glaucoma using fundus images. We introduce a siamese-based severity ranking using pairwise n-hidden comparisons. We additionally have a novel approach to explaining why a specific image is deemed more severe than others. Our findings indicate that the proposed severity ranking model surpasses traditional ones in terms of diagnostic accuracy and delivers improved saliency explanations.
Submitted 5 December, 2023;
originally announced December 2023.
-
Universal Deoxidation of Semiconductor Substrates Assisted by Machine-Learning and Real-Time-Feedback-Control
Authors:
Chao Shen,
Wenkang Zhan,
Jian Tang,
Zhaofeng Wu,
Bo Xu,
Chao Zhao,
Zhanguo Wang
Abstract:
Thin film deposition is an essential step in the semiconductor process. During preparation or loading, the substrate is exposed to the air unavoidably, which has motivated studies of the process control to remove the surface oxide before thin film deposition. Optimizing the deoxidation process in molecular beam epitaxy (MBE) for a random substrate is a multidimensional challenge and sometimes controversial. Due to variations in semiconductor materials and growth processes, the determination of substrate deoxidation temperature is highly dependent on the grower's expertise; the same substrate may yield inconsistent results when evaluated by different growers. Here, we employ a machine learning (ML) hybrid convolution and vision transformer (CNN-ViT) model. This model utilizes reflection high-energy electron diffraction (RHEED) video as input to determine the deoxidation status of the substrate as output, enabling automated substrate deoxidation under a controlled architecture. This also extends to the successful application of deoxidation processes on other substrates. Furthermore, we showcase the potential of models trained on data from a single MBE equipment to achieve high-accuracy deployment on other equipment. In contrast to traditional methods, our approach holds exceptional practical value. It standardizes deoxidation temperatures across various equipment and substrate materials, advancing the standardization research process in semiconductor preparation, a significant milestone in thin film growth technology. The concepts and methods demonstrated in this work are anticipated to revolutionize semiconductor manufacturing in optoelectronics and microelectronics industries by applying them to diverse material growth processes.
Submitted 4 December, 2023;
originally announced December 2023.
-
Automatic Detection of Alzheimer's Disease with Multi-Modal Fusion of Clinical MRI Scans
Authors:
Long Chen,
Liben Chen,
Binfeng Xu,
Wenxin Zhang,
Narges Razavian
Abstract:
The aging population of the U.S. drives the prevalence of Alzheimer's disease. Brookmeyer et al. forecast that approximately 15 million Americans will have either clinical AD or mild cognitive impairment by 2060. In response to this urgent call, methods for early detection of Alzheimer's disease have been developed for prevention and pre-treatment. Notably, literature on the application of deep learning in the automatic detection of the disease has been proliferating. This study builds upon previous literature and maintains a focus on leveraging multi-modal information to enhance automatic detection. We aim to predict the stage of the disease, namely Cognitively Normal (CN), Mild Cognitive Impairment (MCI), or Alzheimer's Disease (AD), based on two different types of brain MRI scans. We design an AlexNet-based deep learning model that learns the synergy of complementary information from both T1 and FLAIR MRI scans.
Submitted 29 November, 2023;
originally announced November 2023.
-
A Quantitative Approach to Understand Self-Supervised Models as Cross-lingual Feature Extractors
Authors:
Shuyue Stella Li,
Beining Xu,
Xiangyu Zhang,
Hexin Liu,
Wenhan Chao,
Leibny Paola Garcia
Abstract:
In this work, we study the features extracted by English self-supervised learning (SSL) models in cross-lingual contexts and propose a new metric to predict the quality of feature representations. Using automatic speech recognition (ASR) as a downstream task, we analyze the effect of model size, training objectives, and model architecture on the models' performance as a feature extractor for a set of typologically diverse corpora. We develop a novel metric, the Phonetic-Syntax Ratio (PSR), to measure the phonetic and syntactic information in the extracted representations using deep generalized canonical correlation analysis. Results show the contrastive loss in the wav2vec2.0 objective facilitates more effective cross-lingual feature extraction. There is a positive correlation between PSR scores and ASR performance, suggesting that phonetic information extracted by monolingual SSL models can be used for downstream tasks in cross-lingual settings. The proposed metric is an effective indicator of the quality of the representations and can be useful for model selection.
Submitted 27 November, 2023;
originally announced November 2023.
-
Double Reverse Regularization Network Based on Self-Knowledge Distillation for SAR Object Classification
Authors:
Bo Xu,
Hao Zheng,
Zhigang Hu,
Liu Yang,
Meiguang Zheng
Abstract:
In current synthetic aperture radar (SAR) object classification, one of the major challenges is the severe overfitting issue due to the limited dataset (few-shot) and noisy data. Considering the advantages of knowledge distillation as a learned label smoothing regularization, this paper proposes a novel Double Reverse Regularization Network based on Self-Knowledge Distillation (DRRNet-SKD). Specifically, through exploring the effect of distillation weight on the process of distillation, we are inspired to adopt the double reverse thought to implement an effective regularization network by combining offline and online distillation in a complementary way. Then, the Adaptive Weight Assignment (AWA) module is designed to adaptively assign two reverse-changing weights based on the network performance, allowing the student network to better benefit from both teachers. The experimental results on OpenSARShip and FUSAR-Ship demonstrate that DRRNet-SKD exhibits remarkable performance improvement on classical CNNs, outperforming state-of-the-art self-knowledge distillation methods.
Submitted 26 November, 2023;
originally announced November 2023.
-
Assessment of Transmission-level Fault Impacts on 3-phase and 1-phase Distribution IBR Operation
Authors:
Qi Xiao,
Jongha Woo,
Lidong Song,
Bei Xu,
David Lubkeman,
Ning Lu,
Abdul Shafae Mohammed,
Johan Enslin,
Cara De Coste Chacko,
Kat Sico,
Steven G. Whisenant
Abstract:
The widespread deployment of inverter-based resources (IBRs) renders distribution systems susceptible to transmission-level faults. This paper presents a comprehensive analysis of the impact of transmission-level faults on 3-phase and 1-phase distribution IBR operation. To evaluate distributed IBR tripping across various phases and locations on a distribution feeder, we conduct simulations of both symmetrical and unsymmetrical transmission faults at progressively greater electrical distances on a real-time transmission and distribution (T&D) co-simulation platform. The IBR power-to-load ratios (PLRs) at 50%, 100%, and 300% are considered to emulate low, medium, and high IBR conditions. Our results indicate that, while 1-phase and 2-phase faults typically trigger fewer IBR trips when compared to 3-phase faults, a significant power imbalance arises from the tripping of 1-phase IBRs on the affected phases. The imbalance can result in significant power quality problems and unintended equipment tripping. It may be necessary to design fault-ride-through mechanisms specifically tailored to 1-phase IBRs to help mitigate the power imbalances caused by unbalanced faults.
Submitted 1 April, 2024; v1 submitted 19 November, 2023;
originally announced November 2023.
-
Using Shallow Neural Networks with Functional Connectivity from EEG signals for Early Diagnosis of Alzheimer's and Frontotemporal Dementia
Authors:
Zaineb Ajra,
Binbin Xu,
Gérard Dray,
Jacky Montmain,
Stéphane Perrey
Abstract:
Introduction: Dementia is a neurological disorder associated with aging that can cause a loss of cognitive functions, impacting daily life. Alzheimer's disease (AD) is the most common cause of dementia, accounting for 50-70% of cases, while frontotemporal dementia (FTD) affects social skills and personality. Electroencephalography (EEG) provides an effective tool to study the effects of AD on the brain. Methods: In this study, we propose to use shallow neural networks applied to two sets of features: spectral-temporal and functional connectivity using four methods. We compare three supervised machine learning techniques to the CNN models to classify EEG signals of AD / FTD and control cases. We also evaluate different measures of functional connectivity from common EEG frequency bands considering multiple thresholds. Results and Discussion: Results showed that the shallow CNN-based models achieved the highest accuracy of 94.54% with AEC in test dataset when considering all connections, outperforming conventional methods and providing potentially an additional early dementia diagnosis tool. https://doi.org/10.3389%2Ffneur.2023.1270405
Submitted 6 November, 2023;
originally announced November 2023.
-
Resource Allocation for Near-Field Communications: Fundamentals, Tools, and Outlooks
Authors:
Bokai Xu,
Jiayi Zhang,
Hongyang Du,
Zhe Wang,
Yuanwei Liu,
Dusit Niyato,
Bo Ai,
Khaled B. Letaief
Abstract:
Extremely large-scale multiple-input multiple-output (XL-MIMO) is a promising technology to achieve high spectral efficiency (SE) and energy efficiency (EE) in future wireless systems. The larger array aperture of XL-MIMO makes communication scenarios closer to the near-field region. Therefore, near-field resource allocation is essential in realizing the above key performance indicators (KPIs). Moreover, the overall performance of XL-MIMO systems heavily depends on the channel characteristics of the selected users, eliminating interference between users through beamforming, power control, etc. The above resource allocation issue constitutes a complex joint multi-objective optimization problem since many variables and parameters must be optimized, including the spatial degree of freedom, rate, power allocation, and transmission technique. In this article, we review the basic properties of near-field communications and focus on the corresponding "resource allocation" problems. First, we identify available resources in near-field communication systems and highlight their distinctions from far-field communications. Then, we summarize optimization tools, such as numerical techniques and machine learning methods, for addressing near-field resource allocation, emphasizing their strengths and limitations. Finally, several important research directions of near-field communications are pointed out for further investigation.
Submitted 11 May, 2024; v1 submitted 26 October, 2023;
originally announced October 2023.
-
Lookup Table meets Local Laplacian Filter: Pyramid Reconstruction Network for Tone Mapping
Authors:
Feng Zhang,
Ming Tian,
Zhiqiang Li,
Bin Xu,
Qingbo Lu,
Changxin Gao,
Nong Sang
Abstract:
Tone mapping aims to convert high dynamic range (HDR) images to low dynamic range (LDR) representations, a critical task in the camera imaging pipeline. In recent years, 3-Dimensional LookUp Table (3D LUT) based methods have gained attention due to their ability to strike a favorable balance between enhancement performance and computational efficiency. However, these methods often fail to deliver satisfactory results in local areas since the look-up table is a global operator for tone mapping, which works based on pixel values and fails to incorporate crucial local information. To this end, this paper aims to address this issue by exploring a novel strategy that integrates global and local operators by utilizing closed-form Laplacian pyramid decomposition and reconstruction. Specifically, we employ image-adaptive 3D LUTs to manipulate the tone in the low-frequency image by leveraging the specific characteristics of the frequency information. Furthermore, we utilize local Laplacian filters to refine the edge details in the high-frequency components in an adaptive manner. Local Laplacian filters are widely used to preserve edge details in photographs, but their conventional usage involves manual tuning and fixed implementation within camera imaging pipelines or photo editing tools. We propose to learn parameter value maps progressively for local Laplacian filters from annotated data using a lightweight network. Our model achieves simultaneous global tone manipulation and local edge detail preservation in an end-to-end manner. Extensive experimental results on two benchmark datasets demonstrate that the proposed method performs favorably against state-of-the-art methods.
Submitted 3 January, 2024; v1 submitted 26 October, 2023;
originally announced October 2023.
-
Frequency-Aware Re-Parameterization for Over-Fitting Based Image Compression
Authors:
Yun Ye,
Yanjie Pan,
Qually Jiang,
Ming Lu,
Xiaoran Fang,
Beryl Xu
Abstract:
Over-fitting-based image compression requires weights compactness for compression and fast convergence for practical use, posing challenges for deep convolutional neural networks (CNNs) based methods. This paper presents a simple re-parameterization method to train CNNs with reduced weights storage and accelerated convergence. The convolution kernels are re-parameterized as a weighted sum of discrete cosine transform (DCT) kernels enabling direct optimization in the frequency domain. Combined with L1 regularization, the proposed method surpasses vanilla convolutions by achieving a significantly improved rate-distortion with low computational cost. The proposed method is verified with extensive experiments of over-fitting-based image restoration on various datasets, achieving up to -46.12% BD-rate on top of HEIF with only 200 iterations.
Submitted 12 October, 2023;
originally announced October 2023.
-
Safe Exit Controllers Synthesis for Continuous-time Stochastic Systems
Authors:
Bai Xue
Abstract:
This paper tackles the problem of generating safe exit controllers for continuous-time systems described by stochastic differential equations (SDEs). The primary aim is to develop controllers that maximize the lower bounds of the exit probability that the system escapes from a safe but uncomfortable set within a specified time frame and guide it towards a comfortable set. The paper considers two distinct cases: one in which the boundary of the safe set is a subset of the boundary of the uncomfortable set, and the other where the boundaries of the two sets do not intersect. To begin, we present a sufficient condition for establishing lower bounds on the exit probability in the first case. This condition serves as a guideline for constructing an online linear programming problem. The linear programming problem is designed to implicitly synthesize an optimal exit controller that maximizes the lower bounds of the exit probability. The method employed in the first case is then extended to the second one. Finally, we demonstrate the effectiveness of the proposed approaches on one example.
Submitted 8 October, 2023;
originally announced October 2023.
-
Reach-avoid Analysis for Sampled-data Systems with Measurement Uncertainties
Authors:
Taoran Wu,
De** Ren,
Shuyuan Zhang,
Lei Wang,
Bai Xue
Abstract:
Digital control has become increasingly prevalent in modern systems, making continuous-time plants controlled by discrete-time (digital) controllers ubiquitous and crucial across industries, including aerospace, automotive, and manufacturing. This paper focuses on investigating the reach-avoid problem in such systems, where the objective is to reach a goal set while avoiding unsafe states, especially in the presence of state measurement uncertainties. We propose an approach that builds upon the concept of exponential control guidance barrier functions, originally used for synthesizing continuous-time feedback controllers. We introduce a sufficient condition that, if met by a given continuous-time feedback controller, ensures the safe guidance of the system into the goal set in its sampled-data implementation, despite state measurement uncertainties. The event of reaching the goal set is determined based on state measurements obtained at the sampling time instants. Numerical examples are provided to demonstrate the validity of our theoretical developments, showcasing successful implementation in solving the reach-avoid problem in sampled-data systems with state measurement uncertainties.
Submitted 7 October, 2023;
originally announced October 2023.
-
Correct-by-Construction for Hybrid Systems by Synthesizing Reset Controller
Authors:
Jiang Liu,
Han Su,
Yunjun Bai,
Bin Gu,
Bai Xue,
Mengfei Yang,
Naijun Zhan
Abstract:
Controller synthesis, including reset controller, feedback controller, and switching logic controller, provides an essential mechanism to guarantee the correctness and reliability of hybrid systems in a correct-by-construction manner. Unfortunately, reset controller synthesis is still in its infancy in the literature, although it has theoretical and practical significance. In this paper, we propose a convex programming based method to synthesize reset controllers for polynomial hybrid systems subject to safety, possibly together with liveness. Such a problem essentially corresponds to computing an initial set of continuous states in each mode and a reset map associated with each discrete jump such that, through the computed reset maps, any trajectory starting from any computed initial state stays safe if only safety constraints are given, or eventually reaches the target set and stays safe before that if both safety and liveness are given. Both cases can be reduced to reach-avoid and/or differential invariant generation problems, further encoded as convex optimization problems. Finally, several examples are provided to demonstrate the efficiency and effectiveness of our method.
Submitted 11 September, 2023;
originally announced September 2023.
-
Under-frequency Load Shedding for Power Reserve Management in Islanded Microgrids
Authors:
Bei Xu,
Victor Paduani,
Qi Xiao,
Lidong Song,
David Lubkeman,
Ning Lu
Abstract:
This paper introduces under-frequency load shedding (UFLS) schemes specially designed to fulfill the power reserve requirements in islanded microgrids (MGs), where only one grid-forming resource is available for frequency regulation. When the power consumption of the MG exceeds a pre-defined threshold, the MG frequency will be lowered to various setpoints, thereby triggering UFLS for different levels of load reduction. Three types of controllable devices are considered for executing UFLS: sectionalizers, smart meters, and controllable appliances. To avoid unnecessary UFLS activation, various time delay settings are analyzed, allowing short-lived power spikes caused by events like motor startups or cold-load pickups to be disregarded. We tested the proposed UFLS schemes on a modified IEEE 123-bus system on the OPAL-RT eMEGASIM platform. Simulation results verify the efficacy of the proposed approaches in restoring power reserves, maintaining phase power balance, and effectively handling short-lived power fluctuations. Furthermore, in comparison to sectionalizer-based UFLS, using smart meters or controllable loads for UFLS allows for a more accurate per-phase load shedding in a progressive manner. As a result, it leads to better balanced three-phase voltage and serves more loads.
Submitted 6 September, 2023; v1 submitted 3 September, 2023;
originally announced September 2023.
-
Karma: Adaptive Video Streaming via Causal Sequence Modeling
Authors:
Bowei Xu,
Hao Chen,
Zhan Ma
Abstract:
Optimal adaptive bitrate (ABR) decision depends on a comprehensive characterization of state transitions that involve interrelated modalities over time including environmental observations, returns, and actions. However, state-of-the-art learning-based ABR algorithms solely rely on past observations to decide the next action. This paradigm tends to cause a chain of deviations from the optimal action when encountering unfamiliar observations, which consequently undermines the model's generalization. This paper presents Karma, an ABR algorithm that utilizes causal sequence modeling to improve generalization by comprehending the interrelated causality among past observations, returns, and actions and refining actions in a timely manner when deviations occur. Unlike direct observation-to-action mapping, Karma recurrently maintains a multi-dimensional time series of observations, returns, and actions as input and employs causal sequence modeling via a decision transformer to determine the next action. In the input sequence, Karma uses the maximum cumulative future quality of experience (QoE) (a.k.a. QoE-to-go) as an extended return signal, which is periodically estimated based on current network conditions and playback status. We evaluate Karma through trace-driven simulations and real-world field tests, demonstrating superior performance compared to existing state-of-the-art ABR algorithms, with an average QoE improvement ranging from 10.8% to 18.7% across diverse network conditions. Furthermore, Karma exhibits strong generalization capabilities, showing leading performance under unseen networks in both simulations and real-world tests.
Submitted 20 August, 2023;
originally announced August 2023.
-
ApproBiVT: Lead ASR Models to Generalize Better Using Approximated Bias-Variance Tradeoff Guided Early Stopping and Checkpoint Averaging
Authors:
Fangyuan Wang,
Ming Hao,
Yuhai Shi,
Bo Xu
Abstract:
The conventional recipe for Automatic Speech Recognition (ASR) models is to 1) train multiple checkpoints on a training set while relying on a validation set to prevent overfitting using early stopping and 2) average the last several checkpoints or those with the lowest validation losses to obtain the final model. In this paper, we rethink and update early stopping and checkpoint averaging from the perspective of the bias-variance tradeoff. Theoretically, the bias and variance represent the fitness and variability of a model, and the tradeoff between them determines the overall generalization error. However, it is impractical to evaluate them precisely. As an alternative, we take the training loss and validation loss as proxies of bias and variance and guide early stopping and checkpoint averaging using their tradeoff, namely an Approximated Bias-Variance Tradeoff (ApproBiVT). When evaluated with advanced ASR models, our recipe provides 2.5%-3.7% and 3.1%-4.6% CER reduction on AISHELL-1 and AISHELL-2, respectively.
Submitted 5 August, 2023;
originally announced August 2023.
-
Towards General Low-Light Raw Noise Synthesis and Modeling
Authors:
Feng Zhang,
Bin Xu,
Zhiqiang Li,
Xinran Liu,
Qingbo Lu,
Changxin Gao,
Nong Sang
Abstract:
Modeling and synthesizing low-light raw noise is a fundamental problem for computational photography and image processing applications. Although most recent works have adopted physics-based models to synthesize noise, the signal-independent noise in low-light conditions is far more complicated and varies dramatically across camera sensors, which is beyond the description of these models. To address this issue, we introduce a new perspective to synthesize the signal-independent noise by a generative model. Specifically, we synthesize the signal-dependent and signal-independent noise in a physics- and learning-based manner, respectively. In this way, our method can be considered as a general model, that is, it can simultaneously learn different noise characteristics for different ISO levels and generalize to various sensors. Subsequently, we present an effective multi-scale discriminator termed Fourier transformer discriminator (FTD) to distinguish the noise distribution accurately. Additionally, we collect a new low-light raw denoising (LRD) dataset for training and benchmarking. Qualitative validation shows that the noise generated by our proposed noise model can be highly similar to the real noise in terms of distribution. Furthermore, extensive denoising experiments demonstrate that our method performs favorably against state-of-the-art methods on different sensors.
Submitted 17 August, 2023; v1 submitted 31 July, 2023;
originally announced July 2023.
-
Equitable Time-Varying Pricing Tariff Design: A Joint Learning and Optimization Approach
Authors:
Liudong Chen,
Bolun Xu
Abstract:
Time-varying pricing tariffs incentivize consumers to shift their electricity demand and reduce costs, but may increase the energy burden for consumers with limited response capability. The utility must thus balance affordability and response incentives when designing these tariffs by considering consumers' response expectations. This paper proposes a joint learning-based identification and optimization method to design equitable time-varying tariffs. Our proposed method encodes historical prices and demand response data into a recurrent neural network (RNN) to capture high-dimensional and non-linear consumer price response behaviors. We then embed the RNN into the tariff design optimization, formulating a non-linear optimization problem with a quadratic objective. We propose a gradient-based solution method that achieves fast and scalable computation. Simulation using real-world consumer data shows that our equitable tariffs protect low-income consumers from price surges while effectively motivating consumers to reduce peak demand. The method also ensures revenue recovery for the utility company and achieves robust performance against demand response uncertainties and prediction errors.
Submitted 26 July, 2023;
originally announced July 2023.
-
Efficient Gaussian Process Classification-based Physical-Layer Authentication with Configurable Fingerprints for 6G-Enabled IoT
Authors:
Rui Meng,
Fangzhou Zhu,
Xiaodong Xu,
Liang **,
Bizhu Wang,
Bingxuan Xu,
Han Meng,
** Zhang
Abstract:
Physical-Layer Authentication (PLA) has recently been regarded as an endogenously secure and energy-efficient technique to recognize IoT terminals. However, the major challenge of applying state-of-the-art PLA schemes directly to 6G-enabled IoT is the inaccurate channel fingerprint estimation in low Signal-to-Noise Ratio (SNR) environments, which greatly affects the reliability and robustness of PLA. To tackle this issue, we propose a configurable-fingerprint-based PLA architecture through Intelligent Reflecting Surface (IRS) that helps create an alternative wireless transmission path to provide more accurate fingerprints. According to Bayes' theorem, we propose a Gaussian Process Classification (GPC)-based PLA scheme, which utilizes the Expectation Propagation (EP) method to obtain the identities of unknown fingerprints. Considering that obtaining sufficient labeled fingerprint samples to train the GPC-based authentication model is challenging for future 6G systems, we further extend the GPC-based PLA to the Efficient-GPC (EGPC)-based PLA through active learning, which requires fewer labeled fingerprints and is more feasible. We also propose three fingerprint selection algorithms to choose the fingerprints whose identities are queried from the upper-layer authentication mechanisms. For this reason, the proposed EGPC-based scheme is also a lightweight cross-layer authentication method that offers a superior security level. The simulations conducted on synthetic datasets demonstrate that the IRS-assisted scheme reduces the authentication error rate by 98.69% compared to the non-IRS-based scheme. Additionally, the proposed fingerprint selection algorithms reduce the authentication error rate by 65.96% to 86.93% and 45.45% to 70.00% under perfect and imperfect channel estimation conditions, respectively, when compared with baseline algorithms.
Submitted 23 July, 2023;
originally announced July 2023.
-
BigDipper: A hyperscale BFT system with short term censorship resistance
Authors:
Bowen Xue,
Soubhik Deb,
Sreeram Kannan
Abstract:
Byzantine-fault-tolerant (BFT) protocols underlie a variety of decentralized applications including payments, auctions, data feed oracles, and decentralized social networks \cite{chainlink,lens}. In most leader-based BFT protocols, an important property that has been missing is short-term censorship resistance for transactions. The protocol should provide inclusion guarantees in the next block height even if the current and future leaders intend to censor. In this paper, we present a BFT system, BigDipper, that achieves censorship resistance while providing fast confirmation for clients and hyperscale throughput. The core idea is to decentralize inclusion of transactions by allowing every BFT replica to create its own mini-block, and then forcing the leader to include them. To achieve this, BigDipper creates a modular system made of three components. First, clients use a transaction broadcast protocol to send transactions to multiple replicas. As a distribution of replicas receiving the client's transactions, they prepare mini-blocks to send to the data availability (DA) component, which characterizes the censorship resistant properties of the whole system. We design three censorship resistant DA (DA-CR) protocols whose properties are captured by three parameters. The third component interleaves the second DA-CR protocol into the leader-based BFT protocol; it forces the leader to include all the data from the DA-CR into the final block. Finally, we demonstrate an integration with a two-phase Hotstuff-2.
Submitted 24 September, 2023; v1 submitted 3 July, 2023;
originally announced July 2023.
-
Machine-Learning-Assisted and Real-Time-Feedback-Controlled Growth of InAs/GaAs Quantum Dots
Authors:
Chao Shen,
Wenkang Zhan,
Kaiyao Xin,
Manyang Li,
Zhenyu Sun,
Hui Cong,
Chi Xu,
Jian Tang,
Zhaofeng Wu,
Bo Xu,
Zhongming Wei,
Chunlai Xue,
Chao Zhao,
Zhanguo Wang
Abstract:
Self-assembled InAs/GaAs quantum dots (QDs) have properties highly valuable for developing various optoelectronic devices such as QD lasers and single photon sources. The applications strongly rely on the density and quality of these dots, which has motivated studies of the growth process control to realize high-quality epi-wafers and devices. Establishing the process parameters in molecular beam epitaxy (MBE) for a specific density of QDs is a multidimensional optimization challenge, usually addressed through time-consuming and iterative trial-and-error. Here, we report a real-time feedback control method to realize the growth of QDs with arbitrary density, which is fully automated and intelligent. We developed a machine learning (ML) model named 3D ResNet 50 trained using reflection high-energy electron diffraction (RHEED) videos as input instead of static images and providing real-time feedback on surface morphologies for process control. As a result, we demonstrated that ML from previous growth could predict the post-growth density of QDs, by successfully tuning the QD densities in near-real time from 1.5E10 cm^-2 down to 3.8E8 cm^-2 or up to 1.4E11 cm^-2. Compared to traditional methods, our approach, with in situ tuning capabilities and excellent reliability, can dramatically expedite the material optimization process and improve the reproducibility of MBE, constituting significant progress for thin film growth techniques. The concepts and methodologies proved feasible in this work are promising to be applied to a variety of material growth processes, which will revolutionize semiconductor manufacturing for optoelectronic and microelectronic industries.
Submitted 11 October, 2023; v1 submitted 22 June, 2023;
originally announced June 2023.
-
Predicting Strategic Energy Storage Behaviors
Authors:
Yuexin Bian,
Ningkun Zheng,
Yang Zheng,
Bolun Xu,
Yuanyuan Shi
Abstract:
Energy storage systems are strategic participants in electricity markets that arbitrage price differences. Future power system operators must understand and predict strategic storage arbitrage behaviors for market power monitoring and capacity adequacy planning. This paper proposes a novel data-driven approach that incorporates prior model knowledge for predicting the strategic behaviors of price-taker energy storage systems. We propose a gradient-descent method to find the storage model parameters given the historical price signals and observations. We prove that the identified model parameters will converge to the true user parameters under a class of quadratic objective and linear equality-constrained storage models. We demonstrate the effectiveness of our approach through numerical experiments with synthetic and real-world storage behavior data. The proposed approach significantly improves the accuracy of storage model identification and behavior forecasting compared to previous black-box data-driven approaches.
Submitted 31 January, 2024; v1 submitted 20 June, 2023;
originally announced June 2023.
-
VILAS: Exploring the Effects of Vision and Language Context in Automatic Speech Recognition
Authors:
Ziyi Ni,
Minglun Han,
Feilong Chen,
Linghui Meng,
**g Shi,
Pin Lv,
Bo Xu
Abstract:
Enhancing automatic speech recognition (ASR) performance by leveraging additional multimodal information has shown promising results in previous studies. However, most of these works have primarily focused on utilizing visual cues derived from human lip motions. In fact, context-dependent visual and linguistic cues can also be beneficial in many scenarios. In this paper, we first propose ViLaS (Vision and Language into Automatic Speech Recognition), a novel multimodal ASR model based on the continuous integrate-and-fire (CIF) mechanism, which can integrate visual and textual context simultaneously or separately, to facilitate speech recognition. Next, we introduce an effective training strategy that improves performance in modal-incomplete test scenarios. Then, to explore the effects of integrating vision and language, we create VSDial, a multimodal ASR dataset with multimodal context cues in both Chinese and English versions. Finally, empirical results are reported on the public Flickr8K and self-constructed VSDial datasets. We explore various cross-modal fusion schemes, analyze fine-grained cross-modal alignment on VSDial, and provide insights into the effects of integrating multimodal information on speech recognition.
Submitted 18 December, 2023; v1 submitted 31 May, 2023;
originally announced May 2023.
-
P-vectors: A Parallel-Coupled TDNN/Transformer Network for Speaker Verification
Authors:
Xiyuan Wang,
Fangyuan Wang,
Bo Xu,
Liang Xu,
**g Xiao
Abstract:
Typically, the Time-Delay Neural Network (TDNN) and Transformer can serve as a backbone for Speaker Verification (SV). Both of them have advantages and disadvantages from the perspective of global and local feature modeling. How to effectively integrate these two styles of features is still an open issue. In this paper, we explore a Parallel-coupled TDNN/Transformer Network (p-vectors) to replace the serial hybrid networks. The p-vectors network allows the TDNN and Transformer to learn complementary information from each other through Soft Feature Alignment Interaction (SFAI) while preserving local and global features. Also, p-vectors uses Spatial Frequency-channel Attention (SFA) to enhance the spatial interdependence modeling for input features. Finally, the outputs of the dual branches of p-vectors are combined by an Embedding Aggregation Layer (EAL). Experiments show that p-vectors outperforms MACCIF-TDNN and MFA-Conformer with relative improvements of 11.5% and 13.9% in EER on VoxCeleb1-O.
Submitted 25 May, 2023; v1 submitted 24 May, 2023;
originally announced May 2023.
-
Jac-PCG Based Low-Complexity Precoding for Extremely Large-Scale MIMO Systems
Authors:
Bokai Xu,
Jiayi Zhang,
Jiaxun Li,
Huahua Xiao,
Bo Ai
Abstract:
Extremely large-scale multiple-input multiple-output (XL-MIMO) has been viewed as a promising technology for future sixth-generation (6G) networks to achieve higher performance. In practice, various linear precoding schemes, such as zero-forcing (ZF) and regularized ZF (RZF) precoding, are sufficient to achieve near-optimal performance in traditional massive MIMO (mMIMO) systems. It is critical to note that in large-scale antenna arrays the operation of channel matrix inversion poses a significant computational challenge for these precoders. Therefore, we explore several iterative methods for determining the precoding matrix for XL-MIMO systems instead of direct matrix inversion. Taking into account small- and large-scale fading as well as spatial correlation between antennas, we study their computational complexity and convergence rate. Furthermore, we propose the Jacobi-Preconditioning Conjugate Gradient (Jac-PCG) iterative inversion method, which enjoys a faster convergence speed than the CG method. In addition, the closed-form expression of spectral efficiency (SE) considering the interference between subarrays in downlink XL-MIMO systems is derived. The numerical results show that the complexity of the Jac-PCG algorithm is about 54% lower than that of the traditional RZF algorithm at essentially the same SE performance.
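As a rough illustration of the idea behind Jacobi-preconditioned conjugate gradient, the sketch below solves a small linear system A x = b without forming an inverse, using the diagonal of A as the preconditioner. It is a generic textbook version and not the authors' implementation: the function name jacobi_pcg, the real-valued symmetric positive-definite test matrix, and the tolerance are all assumptions made for illustration, whereas a precoding problem would involve complex Hermitian matrices built from the channel.

```python
def jacobi_pcg(A, b, tol=1e-10, max_iter=1000):
    """Solve A x = b for a symmetric positive-definite A (list of lists)
    with conjugate gradient preconditioned by M = diag(A).
    Hypothetical helper for illustration only."""
    n = len(b)

    def matvec(v):
        return [sum(A[i][j] * v[j] for j in range(n)) for i in range(n)]

    def dot(u, v):
        return sum(ui * vi for ui, vi in zip(u, v))

    inv_diag = [1.0 / A[i][i] for i in range(n)]  # Jacobi preconditioner M^-1

    x = [0.0] * n
    r = list(b)                                   # residual b - A x for x = 0
    z = [d * ri for d, ri in zip(inv_diag, r)]    # preconditioned residual
    p = list(z)                                   # initial search direction
    rz = dot(r, z)
    for k in range(max_iter):
        if dot(r, r) ** 0.5 < tol:                # converged on residual norm
            return x, k
        Ap = matvec(p)
        alpha = rz / dot(p, Ap)                   # step length along p
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        z = [d * ri for d, ri in zip(inv_diag, r)]
        rz_new = dot(r, z)
        beta = rz_new / rz                        # direction update weight
        p = [zi + beta * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return x, max_iter


# Small example system (assumed values, chosen only to exercise the solver).
A = [[4.0, 1.0], [1.0, 3.0]]
b = [1.0, 2.0]
x, iters = jacobi_pcg(A, b)
print(x, iters)
```

The only change relative to plain CG is the extra scaling by the inverse diagonal when forming z, which is cheap and tends to help most when the diagonal entries of A differ widely in magnitude.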
Submitted 23 May, 2023;
originally announced May 2023.
-
Model Predictive Control with Reach-avoid Analysis
Authors:
De** Ren,
Wanli Lu,
Jidong Lv,
Lijun Zhang,
Bai Xue
Abstract:
In this paper we investigate the optimal controller synthesis problem, so that the system under the controller can reach a specified target set while satisfying given constraints. Existing model predictive control (MPC) methods learn from a set of discrete states visited by previous (sub-)optimized trajectories and thus result in computationally expensive mixed-integer nonlinear optimization. In this paper a novel MPC method is proposed based on reach-avoid analysis to solve the controller synthesis problem iteratively. The reach-avoid analysis is concerned with computing a reach-avoid set which is a set of initial states such that the system can reach the target set successfully. It not only provides terminal constraints, which ensure feasibility of MPC, but also expands discrete states in existing methods into a continuous set (i.e., reach-avoid sets) and thus leads to nonlinear optimization which is more computationally tractable online due to the absence of integer variables. Finally, we evaluate the proposed method and make comparisons with state-of-the-art ones based on several examples.
Submitted 21 June, 2023; v1 submitted 15 May, 2023;
originally announced May 2023.
-
X-LLM: Bootstrapping Advanced Large Language Models by Treating Multi-Modalities as Foreign Languages
Authors:
Feilong Chen,
Minglun Han,
Haozhi Zhao,
Qingyang Zhang,
**g Shi,
Shuang Xu,
Bo Xu
Abstract:
Large language models (LLMs) have demonstrated remarkable language abilities. GPT-4, based on advanced LLMs, exhibits extraordinary multimodal capabilities beyond previous visual language models. We attribute this to the use of more advanced LLMs compared with previous multimodal models. Unfortunately, the model architecture and training strategies of GPT-4 are unknown. To endow LLMs with multimodal capabilities, we propose X-LLM, which converts multi-modalities (images, speech, videos) into foreign languages using X2L interfaces and inputs them into a large language model (ChatGLM). Specifically, X-LLM aligns multiple frozen single-modal encoders and a frozen LLM using X2L interfaces, where "X" denotes multi-modalities such as image, speech, and videos, and "L" denotes languages. X-LLM's training consists of three stages: (1) Converting Multimodal Information: The first stage trains each X2L interface to align with its respective single-modal encoder separately to convert multimodal information into languages. (2) Aligning X2L representations with the LLM: single-modal encoders are aligned with the LLM through X2L interfaces independently. (3) Integrating multiple modalities: all single-modal encoders are aligned with the LLM through X2L interfaces to integrate multimodal capabilities into the LLM. Our experiments show that X-LLM demonstrates impressive multimodal chat abilities, sometimes exhibiting the behaviors of multimodal GPT-4 on unseen images/instructions, and yields an 84.5% relative score compared with GPT-4 on a synthetic multimodal instruction-following dataset. We also conduct quantitative tests on using LLM for ASR and multimodal ASR, hoping to promote the era of LLM-based speech recognition.
Submitted 21 May, 2023; v1 submitted 6 May, 2023;
originally announced May 2023.
-
Simulation of Non-inductive Vector Control of Permanent Magnet Synchronous Motor Based on Sliding Mode Observer
Authors:
Caiyue Zhang,
Zipin Liu,
Bowen Xu
Abstract:
Permanent magnet synchronous motors (PMSM) are widely used due to their numerous benefits. It is critical to obtain rotor position and speed information in order to operate the motor accurately. Sensorless control techniques have emerged as a popular study area both at home and overseas. The sliding mode observer (SMO) can indirectly detect rotor position and has the benefits of easy implementation and efficient algorithms. In this study, with a surface-mounted PMSM as the study object, a mathematical model for sensorless control of a PMSM is developed using SMO, vector control, and other techniques. The PMSM's sliding mode observer model is built in the MATLAB/Simulink environment. Experiments demonstrate that the system can track the rotor position and speed of the motor precisely and fulfill the requirements of sensorless vector control of PMSM.
Submitted 6 May, 2023;
originally announced May 2023.