Showing 1–15 of 15 results for author: Mehta, R

Searching in archive eess.
  1. Streaming Bilingual End-to-End ASR model using Attention over Multiple Softmax

    Authors: Aditya Patil, Vikas Joshi, Purvi Agrawal, Rupesh Mehta

    Abstract: Even with several advancements in multilingual modeling, it is challenging to recognize multiple languages using a single neural model without knowing the input language, and most multilingual models assume the availability of the input language. In this work, we propose a novel bilingual end-to-end (E2E) modeling approach, where a single neural model can recognize both languages and also support…

    Submitted 21 January, 2024; originally announced January 2024.

    Comments: Published in IEEE's Spoken Language Technology (SLT) 2022, 8 pages (6 + 2 for references), 5 figures

    Journal ref: 2022 IEEE Spoken Language Technology Workshop (SLT), Doha, Qatar, 2023, pp. 252-259

  2. arXiv:2309.11097  [pdf]

    cs.HC eess.SP

    Evaluating Mental Stress Among College Students Using Heart Rate and Hand Acceleration Data Collected from Wearable Sensors

    Authors: Moein Razavi, Anthony McDonald, Ranjana Mehta, Farzan Sasangohar

    Abstract: Stress is linked to various mental health disorders including depression and anxiety among college students. Early stress diagnosis and intervention may lower the risk of developing mental illnesses. We examined a machine learning-based method for identification of stress using data collected in a naturalistic study utilizing self-reported stress as ground truth as well as physiological data such as heart r…

    Submitted 20 September, 2023; originally announced September 2023.

  3. arXiv:2308.10984  [pdf, other]

    cs.CV eess.IV

    Debiasing Counterfactuals In the Presence of Spurious Correlations

    Authors: Amar Kumar, Nima Fathi, Raghav Mehta, Brennan Nichyporuk, Jean-Pierre R. Falet, Sotirios Tsaftaris, Tal Arbel

    Abstract: Deep learning models can perform well in complex medical imaging classification tasks, even when basing their conclusions on spurious correlations (i.e. confounders), should they be prevalent in the training dataset, rather than on the causal image markers of interest. This would thereby limit their ability to generalize across the population. Explainability based on counterfactual image generatio…

    Submitted 21 August, 2023; originally announced August 2023.

    Comments: Accepted to the FAIMI (Fairness of AI in Medical Imaging) workshop at MICCAI 2023

  4. arXiv:2307.01738  [pdf, other]

    eess.IV cs.CV

    Mitigating Calibration Bias Without Fixed Attribute Grouping for Improved Fairness in Medical Imaging Analysis

    Authors: Changjian Shui, Justin Szeto, Raghav Mehta, Douglas L. Arnold, Tal Arbel

    Abstract: Trustworthy deployment of deep learning medical imaging models into real-world clinical practice requires that they be calibrated. However, models that are well calibrated overall can still be poorly calibrated for a sub-population, potentially resulting in a clinician unwittingly making poor decisions for this group based on the recommendations of the model. Although methods have been shown to su…

    Submitted 20 July, 2023; v1 submitted 4 July, 2023; originally announced July 2023.

  5. arXiv:2210.17398  [pdf, other]

    cs.CV eess.IV

    Rethinking Generalization: The Impact of Annotation Style on Medical Image Segmentation

    Authors: Brennan Nichyporuk, Jillian Cardinell, Justin Szeto, Raghav Mehta, Jean-Pierre R. Falet, Douglas L. Arnold, Sotirios A. Tsaftaris, Tal Arbel

    Abstract: Generalization is an important attribute of machine learning models, particularly for those that are to be deployed in a medical context, where unreliable predictions can have real world consequences. While the failure of models to generalize across datasets is typically attributed to a mismatch in the data distributions, performance gaps are often a consequence of biases in the 'ground-truth' lab…

    Submitted 13 December, 2022; v1 submitted 31 October, 2022; originally announced October 2022.

    Comments: Accepted for publication at the Journal of Machine Learning for Biomedical Imaging (MELBA) https://www.melba-journal.org/papers/2022:029.html

    Journal ref: Machine.Learning.for.Biomedical.Imaging. 1 (2022)

  6. arXiv:2204.00348  [pdf, other]

    cs.CL cs.SD eess.AS

    WavFT: Acoustic model finetuning with labelled and unlabelled data

    Authors: Utkarsh Chauhan, Vikas Joshi, Rupesh R. Mehta

    Abstract: Unsupervised and self-supervised learning methods have leveraged unlabelled data to improve the pretrained models. However, these methods need a significantly large amount of unlabelled data, and the computational cost of training models with such a large amount of data can be prohibitively high. We address this issue by using unlabelled data during finetuning, instead of pretraining. We propose acoust…

    Submitted 1 April, 2022; originally announced April 2022.

  7. arXiv:2112.10074  [pdf, other]

    eess.IV cs.CV cs.LG

    QU-BraTS: MICCAI BraTS 2020 Challenge on Quantifying Uncertainty in Brain Tumor Segmentation - Analysis of Ranking Scores and Benchmarking Results

    Authors: Raghav Mehta, Angelos Filos, Ujjwal Baid, Chiharu Sako, Richard McKinley, Michael Rebsamen, Katrin Datwyler, Raphael Meier, Piotr Radojewski, Gowtham Krishnan Murugesan, Sahil Nalawade, Chandan Ganesh, Ben Wagner, Fang F. Yu, Baowei Fei, Ananth J. Madhuranthakam, Joseph A. Maldjian, Laura Daza, Catalina Gomez, Pablo Arbelaez, Chengliang Dai, Shuo Wang, Hadrien Reynaud, Yuan-han Mo, Elsa Angelini, et al. (67 additional authors not shown)

    Abstract: Deep learning (DL) models have provided state-of-the-art performance in various medical imaging benchmarking challenges, including the Brain Tumor Segmentation (BraTS) challenges. However, the task of focal pathology multi-compartment segmentation (e.g., tumor and lesion sub-regions) is particularly challenging, and potential errors hinder translating DL models into clinical workflows. Quantifying…

    Submitted 23 August, 2022; v1 submitted 19 December, 2021; originally announced December 2021.

    Comments: Accepted for publication at the Journal of Machine Learning for Biomedical Imaging (MELBA): https://www.melba-journal.org/papers/2022:026.html

    Journal ref: Machine.Learning.for.Biomedical.Imaging. 1 (2022)

  8. arXiv:2111.01561  [pdf, other]

    eess.IV cs.CV physics.med-ph

    Sub-cortical structure segmentation database for young population

    Authors: Jayanthi Sivaswamy, Alphin J Thottupattu, Mythri V, Raghav Mehta, R Sheelakumari, Chandrasekharan Kesavadas

    Abstract: Segmentation of sub-cortical structures from MRI scans is of interest in many neurological diagnoses. Since this is a laborious task, machine learning and specifically deep learning (DL) methods have been explored. The structural complexity of the brain demands a large, high quality segmentation dataset to develop good DL-based solutions for sub-cortical structure segmentation. Towards this, we a…

    Submitted 9 November, 2021; v1 submitted 1 November, 2021; originally announced November 2021.

  9. arXiv:2108.00713  [pdf, other]

    eess.IV cs.CV cs.LG

    Cohort Bias Adaptation in Aggregated Datasets for Lesion Segmentation

    Authors: Brennan Nichyporuk, Jillian Cardinell, Justin Szeto, Raghav Mehta, Sotirios Tsaftaris, Douglas L. Arnold, Tal Arbel

    Abstract: Many automatic machine learning models developed for focal pathology (e.g. lesions, tumours) detection and segmentation perform well, but do not generalize as well to new patient cohorts, impeding their widespread adoption into real clinical contexts. One strategy to create a more diverse, generalizable training set is to naively pool datasets from different cohorts. Surprisingly, training on this…

    Submitted 18 May, 2022; v1 submitted 2 August, 2021; originally announced August 2021.

    Comments: Accepted at DART 2021

  10. arXiv:2103.16617  [pdf, other]

    eess.IV cs.CV cs.LG

    HAD-Net: A Hierarchical Adversarial Knowledge Distillation Network for Improved Enhanced Tumour Segmentation Without Post-Contrast Images

    Authors: Saverio Vadacchino, Raghav Mehta, Nazanin Mohammadi Sepahvand, Brennan Nichyporuk, James J. Clark, Tal Arbel

    Abstract: Segmentation of enhancing tumours or lesions from MRI is important for detecting new disease activity in many clinical contexts. However, accurate segmentation requires the inclusion of medical images (e.g., T1 post contrast MRI) acquired after injecting patients with a contrast agent (e.g., Gadolinium), a process no longer thought to be safe. Although a number of modality-agnostic segmentation ne…

    Submitted 12 May, 2021; v1 submitted 30 March, 2021; originally announced March 2021.

    Comments: Accepted at Medical Imaging with Deep Learning (MIDL) 2021

  11. arXiv:2008.06871  [pdf, other]

    eess.SY math.OC

    Attractive Ellipsoid Sliding Mode Observer Design for State of Charge Estimation of Lithium-ion Cells

    Authors: Anirudh Nath, Raghvendra Gupta, Rohit Mehta, Supreet Singh Bahga, Amit Gupta, Shubhendu Bhasin

    Abstract: This work investigates the real-time estimation of the state-of-charge (SoC) of Lithium-ion (Li-ion) cells for reliable, safe and efficient utilization. A novel attractive ellipsoid based sliding-mode observer (AESMO) algorithm is designed to estimate the SoC in real-time. The algorithm utilizes a standard equivalent circuit model of a Li-ion cell and provides reliable and efficient SoC estimate in…

    Submitted 16 August, 2020; originally announced August 2020.

    Comments: 19 pages, 12 figures, 2 tables

  12. arXiv:2008.05086  [pdf, other]

    eess.AS cs.CL cs.LG cs.SD

    Transfer Learning Approaches for Streaming End-to-End Speech Recognition System

    Authors: Vikas Joshi, Rui Zhao, Rupesh R. Mehta, Kshitiz Kumar, Jinyu Li

    Abstract: Transfer learning (TL) is widely used in conventional hybrid automatic speech recognition (ASR) systems to transfer knowledge from a source to a target language. TL can be applied to end-to-end (E2E) ASR systems such as recurrent neural network transducer (RNN-T) models, by initializing the encoder and/or prediction network of the target language with the pre-trained models from the source language. In…

    Submitted 17 August, 2020; v1 submitted 11 August, 2020; originally announced August 2020.

  13. arXiv:2005.14262  [pdf, other]

    eess.IV cs.CV

    Uncertainty Evaluation Metric for Brain Tumour Segmentation

    Authors: Raghav Mehta, Angelos Filos, Yarin Gal, Tal Arbel

    Abstract: In this paper, we develop a metric designed to assess and rank uncertainty measures for the task of brain tumour sub-tissue segmentation in the BraTS 2019 sub-challenge on uncertainty quantification. The metric is designed to: (1) reward uncertainty measures where high confidence is assigned to correct assertions, and where incorrect assertions are assigned low confidence and (2) penalize measures…

    Submitted 28 May, 2020; originally announced May 2020.

    Report number: MIDL/2019/ExtendedAbstract/H-PvDNIex

  14. arXiv:1908.08074  [pdf, other]

    eess.IV cs.CV

    DUAL-GLOW: Conditional Flow-Based Generative Model for Modality Transfer

    Authors: Haoliang Sun, Ronak Mehta, Hao H. Zhou, Zhichun Huang, Sterling C. Johnson, Vivek Prabhakaran, Vikas Singh

    Abstract: Positron emission tomography (PET) imaging is an imaging modality for diagnosing a number of neurological diseases. In contrast to Magnetic Resonance Imaging (MRI), PET is costly and involves injecting a radioactive substance into the patient. Motivated by developments in modality transfer in vision, we study the generation of certain types of PET images from MRI data. We derive new flow-based gen…

    Submitted 21 August, 2019; originally announced August 2019.

    Journal ref: ICCV 2019

  15. arXiv:1906.09426  [pdf, other]

    eess.AS cs.CL cs.LG cs.SD

    End-to-End ASR for Code-switched Hindi-English Speech

    Authors: Brij Mohan Lal Srivastava, Basil Abraham, Sunayana Sitaram, Rupesh Mehta, Preethi Jyothi

    Abstract: End-to-end (E2E) models have been explored for large speech corpora and have been found to match or outperform traditional pipeline-based systems in some languages. However, most prior work on end-to-end models uses speech corpora exceeding hundreds or thousands of hours. In this study, we explore end-to-end models for code-switched Hindi-English language with less than 50 hours of data. We utilize…

    Submitted 22 June, 2019; originally announced June 2019.