Showing 1–19 of 19 results for author: Yousefi, M

Searching in archive eess.
  1. arXiv:2405.17809  [pdf, other]

    cs.CL cs.AI cs.SD eess.AS

    TransVIP: Speech to Speech Translation System with Voice and Isochrony Preservation

    Authors: Chenyang Le, Yao Qian, Dongmei Wang, Long Zhou, Shujie Liu, Xiaofei Wang, Midia Yousefi, Yanmin Qian, Jinyu Li, Sheng Zhao, Michael Zeng

    Abstract: There is a rising interest and trend in research towards directly translating speech from one language to another, known as end-to-end speech-to-speech translation. However, most end-to-end models struggle to outperform cascade models, i.e., a pipeline framework by concatenating speech recognition, machine translation and text-to-speech models. The primary challenges stem from the inherent complex…

    Submitted 28 May, 2024; originally announced May 2024.

    Comments: Work in progress

  2. arXiv:2404.15961  [pdf, other]

    eess.SP cs.AI

    Soil analysis with machine-learning-based processing of stepped-frequency GPR field measurements: Preliminary study

    Authors: Chunlei Xu, Michael Pregesbauer, Naga Sravani Chilukuri, Daniel Windhager, Mahsa Yousefi, Pedro Julian, Lothar Ratschbacher

    Abstract: Ground Penetrating Radar (GPR) has been widely studied as a tool for extracting soil parameters relevant to agriculture and horticulture. When combined with Machine-Learning-based (ML) methods, high-resolution Stepped Frequency Continuous Wave Radar (SFCW) measurements hold the promise to give cost effective access to depth resolved soil parameters, including at root-level depth. In a first step…

    Submitted 24 April, 2024; originally announced April 2024.

  3. arXiv:2404.07239  [pdf]

    q-bio.QM cs.AI eess.IV

    Advancements in Radiomics and Artificial Intelligence for Thyroid Cancer Diagnosis

    Authors: Milad Yousefi, Shadi Farabi Maleki, Ali Jafarizadeh, Mahya Ahmadpour Youshanlui, Aida Jafari, Siamak Pedrammehr, Roohallah Alizadehsani, Ryszard Tadeusiewicz, Pawel Plawiak

    Abstract: Thyroid cancer is an increasing global health concern that requires advanced diagnostic methods. The application of AI and radiomics to thyroid cancer diagnosis is examined in this review. A review of multiple databases was conducted in compliance with PRISMA guidelines until October 2023. A combination of keywords led to the discovery of an English academic publication on thyroid cancer and relat…

    Submitted 9 April, 2024; originally announced April 2024.

    Comments: 50 pages, 8 figures, 1 table, 119 references

    ACM Class: J.3.2; J.3.3

  4. arXiv:2404.06690  [pdf, other]

    eess.AS cs.AI cs.CL cs.LG cs.SD

    CoVoMix: Advancing Zero-Shot Speech Generation for Human-like Multi-talker Conversations

    Authors: Leying Zhang, Yao Qian, Long Zhou, Shujie Liu, Dongmei Wang, Xiaofei Wang, Midia Yousefi, Yanmin Qian, Jinyu Li, Lei He, Sheng Zhao, Michael Zeng

    Abstract: Recent advancements in zero-shot text-to-speech (TTS) modeling have led to significant strides in generating high-fidelity and diverse speech. However, dialogue generation, along with achieving human-like naturalness in speech, continues to be a challenge. In this paper, we introduce CoVoMix: Conversational Voice Mixture Generation, a novel model for zero-shot, human-like, multi-speaker, multi-rou…

    Submitted 29 May, 2024; v1 submitted 9 April, 2024; originally announced April 2024.

  5. arXiv:2310.05950  [pdf, other]

    eess.SP physics.optics

    Quantization of Neural Network Equalizers in Optical Fiber Transmission Experiments

    Authors: Jamal Darweesh, Nelson Costa, Antonio Napoli, Bernhard Spinnler, Yves Jaouen, Mansoor Yousefi

    Abstract: The quantization of neural networks for the mitigation of the nonlinear and components' distortions in dual-polarization optical fiber transmission is studied. Two low-complexity neural network equalizers are applied in three 16-QAM 34.4 GBaud transmission experiments with different representative fibers. A number of post-training quantization and quantization-aware training algorithms are compare…

    Submitted 9 September, 2023; originally announced October 2023.

    Comments: 15 pages, 9 figures, 5 tables

  6. arXiv:2309.12521  [pdf, other]

    cs.SD eess.AS

    Profile-Error-Tolerant Target-Speaker Voice Activity Detection

    Authors: Dongmei Wang, Xiong Xiao, Naoyuki Kanda, Midia Yousefi, Takuya Yoshioka, Jian Wu

    Abstract: Target-Speaker Voice Activity Detection (TS-VAD) utilizes a set of speaker profiles alongside an input audio signal to perform speaker diarization. While its superiority over conventional methods has been demonstrated, the method can suffer from errors in speaker profiles, as those profiles are typically obtained by running a traditional clustering-based diarization method over the input signal. T…

    Submitted 3 April, 2024; v1 submitted 21 September, 2023; originally announced September 2023.

    Comments: Submission for ICASSP 2024

  7. arXiv:2307.06821  [pdf, other]

    cs.NI cs.LG eess.SP

    Equalization in Dispersion-Managed Systems Using Learned Digital Back-Propagation

    Authors: Mohannad Abu-Romoh, Nelson Costa, Yves Jaouën, Antonio Napoli, João Pedro, Bernhard Spinnler, Mansoor Yousefi

    Abstract: In this paper, we investigate the use of the learned digital back-propagation (LDBP) for equalizing dual-polarization fiber-optic transmission in dispersion-managed (DM) links. LDBP is a deep neural network that optimizes the parameters of DBP using the stochastic gradient descent. We evaluate DBP and LDBP in a simulated WDM dual-polarization fiber transmission system operating at the bitrate of 2…

    Submitted 26 May, 2023; originally announced July 2023.

  8. arXiv:2305.02234  [pdf]

    eess.SP

    Forged Channel: A Breakthrough Approach for Accurate Parkinson's Disease Classification using Leave-One-Subject-Out Cross-Validation

    Authors: A. Hamidi, K. Mohamed-Pour, M. Yousefi

    Abstract: This paper introduces a novel technique called "Forged Channel," which aims to comprehensively represent EEG signals in order to achieve accurate classification of Parkinson's disease. The forged channel method prepares EEG signals in a manner that allows a deep learning model to effectively perceive all EEG channels within a single input. By employing this approach alongside a convolutional neura…

    Submitted 16 April, 2024; v1 submitted 3 May, 2023; originally announced May 2023.

    Comments: 5 pages, 2 figures, 3 tables

  9. Low Complexity Convolutional Neural Networks for Equalization in Optical Fiber Transmission

    Authors: Mohannad Abu-romoh, Nelson Costa, Antonio Napoli, João Pedro, Yves Jaouën, Mansoor Yousefi

    Abstract: A convolutional neural network is proposed to mitigate fiber transmission effects, achieving a five-fold reduction in trainable parameters compared to alternative equalizers, and 3.5 dB improvement in MSE compared to DBP with comparable complexity.

    Submitted 11 October, 2022; originally announced October 2022.

    Comments: 2 pages, 3 figures. Submitted to the OSA Advanced Photonics Congress 2021. Presented in Signal Processing in Photonic Communications (SPPCom) 2021. From the session: Neural Networks Applications for Photonic Systems (SpM5C)

  10. arXiv:2207.12154  [pdf, other]

    eess.SP

    Complexity Reduction over Bi-RNN-Based Nonlinearity Mitigation in Dual-Pol Fiber-Optic Communications via a CRNN-Based Approach

    Authors: Abtin Shahkarami, Mansoor Yousefi, Yves Jaouen

    Abstract: Bidirectional recurrent neural networks (bi-RNNs), in particular, bidirectional long short term memory (bi-LSTM), bidirectional gated recurrent unit, and convolutional bi-LSTM models have recently attracted attention for nonlinearity mitigation in fiber-optic communication. The recently adopted approaches based on these models, however, incur a high computational complexity which may impede their…

    Submitted 25 July, 2022; originally announced July 2022.

  11. arXiv:2205.11376  [pdf, other]

    eess.SP cs.AI cs.LG

    Learned Digital Back-Propagation for Dual-Polarization Dispersion Managed Systems

    Authors: Mohannad Abu-romoh, Nelson Costa, Antonio Napoli, Bernhard Spinnler, Yves Jaouën, Mansoor Yousefi

    Abstract: Digital back-propagation (DBP) and learned DBP (LDBP) are proposed for nonlinearity mitigation in WDM dual-polarization dispersion-managed systems. LDBP achieves Q-factor improvement of 1.8 dB and 1.2 dB, respectively, over linear equalization and a variant of DBP adapted to DM systems.

    Submitted 23 May, 2022; originally announced May 2022.

  12. arXiv:2205.11284  [pdf, other]

    eess.SP

    Few-bit Quantization of Neural Networks for Nonlinearity Mitigation in a Fiber Transmission Experiment

    Authors: Jamal Darweesh, Nelson Costa, Antonio Napoli, Bernhard Spinnler, Yves Jaouen, Mansoor Yousefi

    Abstract: A neural network is quantized for the mitigation of nonlinear and components distortions in a 16-QAM 9x50km dual-polarization fiber transmission experiment. Post-training additive power-of-two quantization at 6 bits incurs a negligible Q-factor penalty. At 5 bits, the model size is reduced by 85%, with 0.8 dB penalty.

    Submitted 25 May, 2022; v1 submitted 23 May, 2022; originally announced May 2022.

    Comments: 4 pages, 3 figures

  13. arXiv:2204.04488  [pdf]

    q-bio.NC eess.SP

    Comparison of EEG based epilepsy diagnosis using neural networks and wavelet transform

    Authors: Mohammad Reza Yousefi, Saina Golnejad, Melika Mohammad Hosseini, Amin Dehghani

    Abstract: Epilepsy is one of the common neurological disorders characterized by recurrent and uncontrollable seizures, which seriously affect the life of patients. In many cases, electroencephalogram signals can provide important physiological information about the activity of the human brain which can be used to diagnose epilepsy. However, visual inspection of a large number of electroencephalogram signals…

    Submitted 12 August, 2023; v1 submitted 9 April, 2022; originally announced April 2022.

    Comments: 8 pages, 4 tables, 3 figures

  14. arXiv:2111.08635  [pdf, other]

    eess.AS cs.LG cs.SD

    Single-channel speech separation using Soft-minimum Permutation Invariant Training

    Authors: Midia Yousefi, John H. L. Hansen

    Abstract: The goal of speech separation is to extract multiple speech sources from a single microphone recording. Recently, with the advancement of deep learning and availability of large datasets, speech separation has been formulated as a supervised learning problem. These approaches aim to learn discriminative patterns of speech, speakers, and background noise using a supervised learning algorithm, typic…

    Submitted 16 November, 2021; originally announced November 2021.

  15. arXiv:2111.00320  [pdf, other]

    eess.AS cs.AI cs.LG cs.SD eess.SP

    Speaker conditioning of acoustic models using affine transformation for multi-speaker speech recognition

    Authors: Midia Yousefi, John H. L. Hansen

    Abstract: This study addresses the problem of single-channel Automatic Speech Recognition of a target speaker within an overlap speech scenario. In the proposed method, the hidden representations in the acoustic model are modulated by speaker auxiliary information to recognize only the desired speaker. Affine transformation layers are inserted into the acoustic model network to integrate speaker information…

    Submitted 30 October, 2021; originally announced November 2021.

  16. arXiv:2111.00316  [pdf, other]

    eess.AS cs.LG cs.SD

    Real-time Speaker counting in a cocktail party scenario using Attention-guided Convolutional Neural Network

    Authors: Midia Yousefi, John H. L. Hansen

    Abstract: Most current speech technology systems are designed to operate well even in the presence of multiple active speakers. However, most solutions assume that the number of co-current speakers is known. Unfortunately, this information might not always be available in real-world applications. In this study, we propose a real-time, single-channel attention-guided Convolutional Neural Network (CNN) to est…

    Submitted 30 October, 2021; originally announced November 2021.

  17. arXiv:2001.09937  [pdf, other]

    eess.AS eess.SP

    Frame-based overlapping speech detection using Convolutional Neural Networks

    Authors: Midia Yousefi, John H. L. Hansen

    Abstract: Naturalistic speech recordings usually contain speech signals from multiple speakers. This phenomenon can degrade the performance of speech technologies due to the complexity of tracing and recognizing individual speakers. In this study, we investigate the detection of overlapping speech on segments as short as 25 ms using Convolutional Neural Networks. We evaluate the detection performance using…

    Submitted 12 February, 2020; v1 submitted 27 January, 2020; originally announced January 2020.

  18. arXiv:1908.01768  [pdf, other]

    eess.AS cs.LG cs.SD stat.ML

    Probabilistic Permutation Invariant Training for Speech Separation

    Authors: Midia Yousefi, Soheil Khorram, John H. L. Hansen

    Abstract: Single-microphone, speaker-independent speech separation is normally performed through two steps: (i) separating the specific speech sources, and (ii) determining the best output-label assignment to find the separation error. The second step is the main obstacle in training neural networks for speech separation. Recently proposed Permutation Invariant Training (PIT) addresses this problem by deter…

    Submitted 4 August, 2019; originally announced August 2019.

    Comments: Interspeech 2019

  19. arXiv:1804.06941  [pdf, other]

    eess.SY

    Reducing Conservatism in Model-Invariant Safety-Preserving Control of Propofol Anesthesia Using Falsification

    Authors: Mahdi Yousefi, Klaske van Heusden, Ian M. Mitchell, J. Mark Ansermino, Guy A. Dumont

    Abstract: This work provides a formalized model-invariant safety system for closed-loop anesthesia that uses feedback from measured data for model falsification to reduce conservatism. The safety system maintains predicted propofol plasma concentrations, as well as the patient's blood pressure, within safety bounds despite uncertainty in patient responses to propofol. Model-invariant formal verification is…

    Submitted 18 April, 2018; originally announced April 2018.

    Comments: 11 pages, 9 figures, submitted to IEEE TCST