Showing 1–20 of 20 results for author: Tewfik, A H

Searching in archive cs.
  1. arXiv:2310.15261  [pdf, ps, other]

    cs.SD cs.HC cs.LG eess.AS

    Modality Dropout for Multimodal Device Directed Speech Detection using Verbal and Non-Verbal Features

    Authors: Gautam Krishna, Sameer Dharur, Oggi Rudovic, Pranay Dighe, Saurabh Adya, Ahmed Hussen Abdelaziz, Ahmed H Tewfik

    Abstract: Device-directed speech detection (DDSD) is the binary classification task of distinguishing between queries directed at a voice assistant versus side conversation or background speech. State-of-the-art DDSD systems use verbal cues, e.g., acoustic, text, and/or automatic speech recognition (ASR) system features, to classify speech as device-directed or otherwise, and often have to contend with one or…

    Submitted 23 October, 2023; originally announced October 2023.

    Comments: 5 pages

  2. arXiv:2103.02087  [pdf, other]

    eess.SP cs.LG

    Deep J-Sense: Accelerated MRI Reconstruction via Unrolled Alternating Optimization

    Authors: Marius Arvinte, Sriram Vishwanath, Ahmed H. Tewfik, Jonathan I. Tamir

    Abstract: Accelerated multi-coil magnetic resonance imaging reconstruction has recently improved substantially by combining compressed sensing with deep learning. However, most of these methods rely on estimates of the coil sensitivity profiles, or on calibration data for estimating model parameters. Prior work has shown that these methods degrade in performance when the quality of these estimators is p…

    Submitted 11 April, 2021; v1 submitted 2 March, 2021; originally announced March 2021.

  3. arXiv:2103.00383  [pdf, other]

    cs.SD cs.LG eess.AS q-bio.QM

    Brain Signals to Rescue Aphasia, Apraxia and Dysarthria Speech Recognition

    Authors: Gautam Krishna, Mason Carnahan, Shilpa Shamapant, Yashitha Surendranath, Saumya Jain, Arundhati Ghosh, Co Tran, Jose del R Millan, Ahmed H Tewfik

    Abstract: In this paper, we propose a deep learning-based algorithm to improve the performance of automatic speech recognition (ASR) systems for aphasia, apraxia, and dysarthria speech by utilizing electroencephalography (EEG) features recorded synchronously with aphasia, apraxia, and dysarthria speech. We demonstrate a significant decoding performance improvement of more than 50% during test time for isol…

    Submitted 17 July, 2021; v1 submitted 27 February, 2021; originally announced March 2021.

    Comments: Accepted to IEEE EMBC 2021

  4. arXiv:2101.04269  [pdf, other]

    cs.CV eess.IV

    Pneumonia Detection on Chest X-ray using Radiomic Features and Contrastive Learning

    Authors: Yan Han, Chongyan Chen, Ahmed H Tewfik, Ying Ding, Yifan Peng

    Abstract: Chest X-ray has become one of the most common medical diagnostic procedures due to its noninvasiveness. The number of chest X-ray images has skyrocketed, but reading chest X-rays is still performed manually by radiologists, which creates substantial burnout and delays. Traditionally, radiomics, a subfield of radiology that can extract a large number of quantitative features from medical images, demonstrates…

    Submitted 4 May, 2022; v1 submitted 11 January, 2021; originally announced January 2021.

    Comments: Accepted for ISBI 2021

  5. arXiv:2012.12843  [pdf, other]

    cs.LG eess.SP stat.ML

    EQ-Net: A Unified Deep Learning Framework for Log-Likelihood Ratio Estimation and Quantization

    Authors: Marius Arvinte, Ahmed H. Tewfik, Sriram Vishwanath

    Abstract: In this work, we introduce EQ-Net: the first holistic framework that solves both the tasks of log-likelihood ratio (LLR) estimation and quantization using a data-driven method. We motivate our approach with theoretical insights on two practical estimation algorithms at the ends of the complexity spectrum and reveal a connection between the complexity of an algorithm and the information bottleneck…

    Submitted 3 May, 2021; v1 submitted 23 December, 2020; originally announced December 2020.

  6. arXiv:2008.07621  [pdf, other]

    eess.AS cs.LG cs.SD eess.SP

    Speech Recognition using EEG signals recorded using dry electrodes

    Authors: Gautam Krishna, Co Tran, Mason Carnahan, Morgan M Hagood, Ahmed H Tewfik

    Abstract: In this paper, we demonstrate speech recognition using electroencephalography (EEG) signals obtained using dry electrodes on a limited English vocabulary consisting of three vowels and one word using a deep learning model. We demonstrate a test accuracy of 79.07 percent on a subset vocabulary consisting of two English vowels. Our results demonstrate the feasibility of using EEG signals recorded us…

    Submitted 13 August, 2020; originally announced August 2020.

  7. arXiv:2006.03638  [pdf, other]

    cs.CV cs.LG

    Robust Face Verification via Disentangled Representations

    Authors: Marius Arvinte, Ahmed H. Tewfik, Sriram Vishwanath

    Abstract: We introduce a robust algorithm for face verification, i.e., deciding whether two images are of the same person or not. Our approach is a novel take on the idea of using deep generative networks for adversarial robustness. We use the generative model during training as an online augmentation method instead of a test-time purifier that removes adversarial noise. Our architecture uses a contrastive loss…

    Submitted 23 June, 2020; v1 submitted 5 June, 2020; originally announced June 2020.

    Comments: Preprint

  8. arXiv:2003.00007  [pdf, other]

    eess.AS cs.LG cs.SD stat.ML

    Generating EEG features from Acoustic features

    Authors: Gautam Krishna, Co Tran, Mason Carnahan, Yan Han, Ahmed H Tewfik

    Abstract: In this paper we demonstrate predicting electroencephalography (EEG) features from acoustic features using a recurrent neural network (RNN) based regression model and a generative adversarial network (GAN). We predict various types of EEG features from acoustic features. We compare our results with the previously studied problem on speech synthesis using EEG and our results demonstrate that EEG featur…

    Submitted 18 March, 2020; v1 submitted 29 February, 2020; originally announced March 2020.

  9. arXiv:2001.00501  [pdf, other]

    eess.AS cs.LG cs.SD stat.ML

    EEG based Continuous Speech Recognition using Transformers

    Authors: Gautam Krishna, Co Tran, Mason Carnahan, Ahmed H Tewfik

    Abstract: In this paper we investigate continuous speech recognition using electroencephalography (EEG) features with a recently introduced end-to-end transformer-based automatic speech recognition (ASR) model. Our results show that the transformer-based model demonstrates faster training compared to recurrent neural network (RNN) based sequence-to-sequence EEG models and better performance during inferenc…

    Submitted 5 May, 2020; v1 submitted 31 December, 2019; originally announced January 2020.

  10. arXiv:1912.07730  [pdf, other]

    cs.LG eess.AS eess.IV stat.ML

    Continuous Speech Recognition using EEG and Video

    Authors: Gautam Krishna, Mason Carnahan, Co Tran, Ahmed H Tewfik

    Abstract: In this paper we investigate whether electroencephalography (EEG) features can be used to improve the performance of continuous visual speech recognition systems. We implemented a connectionist temporal classification (CTC) based end-to-end automatic speech recognition (ASR) model for performing recognition. Our results demonstrate that EEG features are helpful in enhancing the performance of cont…

    Submitted 27 December, 2019; v1 submitted 16 December, 2019; originally announced December 2019.

    Comments: In preparation for submission to EUSIPCO 2020. arXiv admin note: text overlap with arXiv:1911.11610, arXiv:1911.04261

  11. arXiv:1911.11610  [pdf, other]

    eess.AS cs.LG cs.SD stat.ML

    Improving EEG based Continuous Speech Recognition

    Authors: Gautam Krishna, Co Tran, Mason Carnahan, Yan Han, Ahmed H Tewfik

    Abstract: In this paper we introduce various techniques to improve the performance of electroencephalography (EEG) features based continuous speech recognition (CSR) systems. A connectionist temporal classification (CTC) based automatic speech recognition (ASR) system was implemented for performing recognition. We introduce techniques to initialize the weights of the recurrent layers in the encoder of the C…

    Submitted 23 December, 2019; v1 submitted 24 November, 2019; originally announced November 2019.

    Comments: In preparation for submission to EUSIPCO 2020. arXiv admin note: text overlap with arXiv:1911.04261, arXiv:1906.08871

  12. arXiv:1911.04261  [pdf, other]

    cs.SD eess.AS eess.SP

    Voice Activity Detection in presence of background noise using EEG

    Authors: Gautam Krishna, Co Tran, Mason Carnahan, Yan Han, Ahmed H Tewfik

    Abstract: In this paper we demonstrate that the performance of a voice activity detection (VAD) system operating in the presence of background noise can be improved by concatenating acoustic input features with electroencephalography (EEG) features. We also demonstrate that VAD using only EEG features shows better performance than VAD using only acoustic features in the presence of background noise. We implemented a recu…

    Submitted 14 March, 2020; v1 submitted 8 November, 2019; originally announced November 2019.

    Comments: In preparation for submission to EUSIPCO 2020. arXiv admin note: text overlap with arXiv:1906.08871, arXiv:1909.09132

  13. arXiv:1909.09132  [pdf, other]

    eess.AS cs.LG cs.SD stat.ML

    Spoken Speech Enhancement using EEG

    Authors: Gautam Krishna, Co Tran, Yan Han, Mason Carnahan, Ahmed H Tewfik

    Abstract: In this paper we demonstrate spoken speech enhancement using electroencephalography (EEG) signals using a generative adversarial network (GAN) based model, gated recurrent unit (GRU) regression based model, temporal convolutional network (TCN) regression model and finally using a mixed TCN GRU regression model. We compare our EEG based speech enhancement results with traditional log minimum mean…

    Submitted 19 April, 2020; v1 submitted 13 September, 2019; originally announced September 2019.

  14. arXiv:1908.05743  [pdf, other]

    eess.AS cs.SD

    State-of-the-art Speech Recognition using EEG and Towards Decoding of Speech Spectrum From EEG

    Authors: Gautam Krishna, Yan Han, Co Tran, Mason Carnahan, Ahmed H Tewfik

    Abstract: In this paper we first demonstrate continuous noisy speech recognition using electroencephalography (EEG) signals on an English vocabulary using different types of state-of-the-art end-to-end automatic speech recognition (ASR) models. We further provide results obtained using EEG data recorded under different experimental conditions. We finally demonstrate decoding of speech spectrum from EEG signals…

    Submitted 4 March, 2020; v1 submitted 14 August, 2019; originally announced August 2019.

  15. arXiv:1906.08871  [pdf, other]

    eess.AS cs.CL cs.LG cs.SD stat.ML

    Advancing Speech Recognition With No Speech Or With Noisy Speech

    Authors: Gautam Krishna, Co Tran, Mason Carnahan, Ahmed H Tewfik

    Abstract: In this paper we demonstrate end-to-end continuous speech recognition (CSR) using electroencephalography (EEG) signals with no speech signal as input. An attention model based automatic speech recognition (ASR) and connectionist temporal classification (CTC) based ASR systems were implemented for performing recognition. We further demonstrate CSR for noisy speech by fusing with EEG features.

    Submitted 14 March, 2020; v1 submitted 17 June, 2019; originally announced June 2019.

    Comments: Extended version of our accepted IEEE EUSIPCO 2019 paper with additional results for CTC model based recognition. arXiv admin note: substantial text overlap with arXiv:1906.08045, arXiv:1906.08044

  16. arXiv:1906.08045  [pdf, other]

    eess.AS cs.LG cs.SD eess.SP stat.ML

    Speech Recognition With No Speech Or With Noisy Speech Beyond English

    Authors: Gautam Krishna, Co Tran, Yan Han, Mason Carnahan, Ahmed H Tewfik

    Abstract: In this paper we demonstrate continuous noisy speech recognition using connectionist temporal classification (CTC) model on limited Chinese vocabulary using electroencephalography (EEG) features with no speech signal as input and we further demonstrate single CTC model based continuous noisy speech recognition on limited joint English and Chinese vocabulary using EEG features with no speech signal…

    Submitted 26 February, 2020; v1 submitted 17 June, 2019; originally announced June 2019.

    Comments: arXiv admin note: text overlap with arXiv:1906.08871

  17. arXiv:1906.08044  [pdf, other]

    eess.AS cs.LG cs.SD eess.SP stat.ML

    Robust End-to-End Speaker Verification Using EEG

    Authors: Yan Han, Gautam Krishna, Co Tran, Mason Carnahan, Ahmed H Tewfik

    Abstract: In this paper we demonstrate that the performance of a speaker verification system can be improved by concatenating electroencephalography (EEG) signal features with speech signal features or by using only EEG signal features. We use a state-of-the-art end-to-end deep learning model for performing speaker verification and we demonstrate our results for noisy speech. Our results indicate that EEG signals ca…

    Submitted 9 June, 2020; v1 submitted 17 June, 2019; originally announced June 2019.

    Comments: Accepted for EUSIPCO 2020

  18. arXiv:1906.07849  [pdf, other]

    cs.LG eess.SP stat.ML

    Deep Learning-Based Quantization of L-Values for Gray-Coded Modulation

    Authors: Marius Arvinte, Sriram Vishwanath, Ahmed H. Tewfik

    Abstract: In this work, a deep learning-based quantization scheme for log-likelihood ratio (L-value) storage is introduced. We analyze the dependency between the average magnitude of different L-values from the same quadrature amplitude modulation (QAM) symbol and show they follow a consistent ordering. Based on this we design a deep autoencoder that jointly compresses and separately reconstructs each L-val…

    Submitted 9 May, 2021; v1 submitted 18 June, 2019; originally announced June 2019.

    Comments: Submitted to IEEE Globecom 2019

  19. arXiv:1903.04656  [pdf, other]

    cs.LG eess.SP stat.ML

    Deep Log-Likelihood Ratio Quantization

    Authors: Marius Arvinte, Ahmed H. Tewfik, Sriram Vishwanath

    Abstract: In this work, a deep learning-based method for log-likelihood ratio (LLR) lossy compression and quantization is proposed, with emphasis on a single-input single-output uncorrelated fading communication setting. A deep autoencoder network is trained to compress, quantize and reconstruct the bit log-likelihood ratios corresponding to a single transmitted symbol. Specifically, the encoder maps to a l…

    Submitted 9 May, 2021; v1 submitted 11 March, 2019; originally announced March 2019.

    Comments: Accepted for publication at EUSIPCO 2019. Camera-ready version

  20. arXiv:1903.00739  [pdf, other]

    cs.LG stat.ML

    Speech Recognition with no speech or with noisy speech

    Authors: Gautam Krishna, Co Tran, Jianguo Yu, Ahmed H Tewfik

    Abstract: The performance of automatic speech recognition (ASR) systems degrades in the presence of noisy speech. This paper demonstrates that using electroencephalography (EEG) can help automatic speech recognition systems overcome performance loss in the presence of noise. The paper also shows that distillation training of automatic speech recognition systems using EEG features will increase their performa…

    Submitted 2 March, 2019; originally announced March 2019.

    Comments: Accepted for ICASSP 2019