Smart Speech Segmentation using Acousto-Linguistic Features with look-ahead

Authors: Piyush Behre, Naveen Parihar, Sharman Tan, Amy Shah, Eva Sharma, Geoffrey Liu, Shuangyu Chang, Hosam Khalil, Chris Basoglu, Sayan Pathak

Abstract: Segmentation for continuous Automatic Speech Recognition (ASR) has traditionally used silence timeouts or voice activity detectors (VADs), which are both limited to acoustic features. This segmentation is often overly aggressive, given that people naturally pause to think as they speak. Consequently, segmentation happens mid-sentence, hindering both punctuation and downstream tasks like machine translation for which high-quality segmentation is critical. Model-based segmentation methods that leverage acoustic features are powerful, but without an understanding of the language itself, these approaches are limited. We present a hybrid approach that leverages both acoustic and language information to improve segmentation. Furthermore, we show that including one word as a look-ahead boosts segmentation quality. On average, our models improve segmentation-F0.5 score by 9.8% over baseline. We show that this approach works for multiple languages. For the downstream task of machine translation, it improves the translation BLEU score by an average of 1.05 points.

Submitted 27 October, 2022; v1 submitted 25 October, 2022; originally announced October 2022.

arXiv:2203.00888 [pdf, other]

Towards Contextual Spelling Correction for Customization of End-to-end Speech Recognition Systems

Authors: Xiaoqiang Wang, Yanqing Liu, Jinyu Li, Veljko Miljanic, Sheng Zhao, Hosam Khalil

Abstract: Contextual biasing is an important and challenging task for end-to-end automatic speech recognition (ASR) systems, which aims to achieve better recognition performance by biasing the ASR system to particular context phrases such as person names, music list, proper nouns, etc. Existing methods mainly include contextual LM biasing and adding bias encoder into end-to-end ASR models. In this work, we introduce a novel approach to do contextual biasing by adding a contextual spelling correction model on top of the end-to-end ASR system. We incorporate contextual information into a sequence-to-sequence spelling correction model with a shared context encoder. Our proposed model includes two different mechanisms: autoregressive (AR) and non-autoregressive (NAR). We propose filtering algorithms to handle large-size context lists, and performance balancing mechanisms to control the biasing degree of the model. We demonstrate the proposed model is a general biasing solution which is domain-insensitive and can be adopted in different scenarios. Experiments show that the proposed method achieves as much as 51% relative word error rate (WER) reduction over ASR system and outperforms traditional biasing methods. Compared to the AR solution, the proposed NAR model reduces model size by 43.2% and speeds up inference by 2.1 times.

Submitted 7 September, 2022; v1 submitted 2 March, 2022; originally announced March 2022.

Comments: Accepted in IEEE Transactions on Audio, Speech and Language Processing (TASLP)

arXiv:2003.06390 [pdf, other]

Robust tracking of an unknown trajectory with a multi-rotor UAV: A high-gain observer approach

Authors: C. J. Boss, V. Srivastava, H. K. Khalil

Abstract: We study a trajectory tracking problem for a multi-rotor in the presence of modeling error and external disturbances. The desired trajectory is unknown and generated from a reference system with unknown or partially known dynamics. We assume that only position and orientation measurements for the multi-rotor and position measurements for the reference system can be accessed. We adopt an extended high-gain observer (EHGO) estimation framework to estimate the feed-forward term required for trajectory tracking, the multi-rotor states, as well as modeling error and external disturbances. We design an output feedback controller for trajectory tracking that comprises a feedback linearizing controller and the EHGO. We rigorously analyze the proposed controller and establish its stability properties. Finally, we numerically illustrate our theoretical results using the example of a multi-rotor landing on a ground vehicle.

Submitted 27 April, 2020; v1 submitted 13 March, 2020; originally announced March 2020.

arXiv:1607.07402 [pdf, ps, other]

Semi-global Output Feedback Stabilization of Non-Minimum Phase Nonlinear Systems

Authors: Almuatazbellah M. Boker, Hassan K. Khalil

Abstract: We solve the problem of output feedback stabilization of a class of nonlinear systems, which may have unstable zero dynamics. We allow for any globally stabilizing full state feedback control scheme to be used as long as it satisfies a particular ISS condition. We show semi-global stability of the origin of the closed-loop system and also the recovery of the performance of an auxiliary system using a full-order observer. This observer is based on the use of an extended high-gain observer to provide estimates of the output and its derivatives plus a signal used by an extended Kalman filter to provide estimates of the remaining states. Finally, we provide a simulation example that illustrates the design procedure.

Submitted 25 July, 2016; originally announced July 2016.

Comments: 9 pages, 1 figure

Showing 1–4 of 4 results for author: Khalil, H