Showing 1–2 of 2 results for author: Asaad, I

Search v0.5.6 released 2020-02-24

arXiv:2405.20101 [pdf, other]

cs.SD cs.CL eess.AS

Fill in the Gap! Combining Self-supervised Representation Learning with Neural Audio Synthesis for Speech Inpainting

Authors: Ihab Asaad, Maxime Jacquelin, Olivier Perrotin, Laurent Girin, Thomas Hueber

Abstract: Most speech self-supervised learning (SSL) models are trained with a pretext task which consists in predicting missing parts of the input signal, either future segments (causal prediction) or segments masked anywhere within the input (non-causal prediction). Learned speech representations can then be efficiently transferred to downstream tasks (e.g., automatic speech or speaker recognition). In the present study, we investigate the use of a speech SSL model for speech inpainting, that is reconstructing a missing portion of a speech signal from its surrounding context, i.e., fulfilling a downstream task that is very similar to the pretext task. To that purpose, we combine an SSL encoder, namely HuBERT, with a neural vocoder, namely HiFiGAN, playing the role of a decoder. In particular, we propose two solutions to match the HuBERT output with the HiFiGAN input, by freezing one and fine-tuning the other, and vice versa. Performance of both approaches was assessed in single- and multi-speaker settings, for both informed and blind inpainting configurations (i.e., the position of the mask is known or unknown, respectively), with different objective metrics and a perceptual evaluation. Performances show that if both solutions allow to correctly reconstruct signal portions up to the size of 200ms (and even 400ms in some cases), fine-tuning the SSL encoder provides a more accurate signal reconstruction in the single-speaker setting case, while freezing it (and training the neural vocoder instead) is a better strategy when dealing with multi-speaker data.

Submitted 30 May, 2024; originally announced May 2024.

arXiv:1909.10568 [pdf]

eess.SY

Design of neural nonlinear PFC Controller to control speed of Autonomous Car

Authors: Isam Asaad, Bilal Chiha

Abstract: In this research, we are going to design a neural nonlinear predictive functional controller (PFC) to achieve a reduced fuel consumption for a chosen autonomous car walks according to a supplied speed trajectory on known roads. We used a fitting neural network as a simple tool for modelling the car's engine and control laws needed to calculate the suitable control commands passed to the brakes and gas pedals' actuators. Independent model method and constraints handling are used to provide controller robustness. We used MATLAB Simulink and IPG CarMaker to design and test our PFC controller. The performance of designed PFC controller is compared to the performance of a PI controller which exists within IPG CarMaker simulator. Keywords :- Predictive Functional Controller, Fuel Consumption, Neural Network, Independent Model, Constraint Handling, PI Controller.

Submitted 23 September, 2019; originally announced September 2019.

Comments: 9 pages, 16 figures, Published with International Journal of Computer Science Trends and Technology (IJCST)

Journal ref: International Journal of Computer Science Trends and Technology (IJCST) V7(3): Page(125-133) May-Jun 2019. ISSN: 2347-8578
