Showing 1–16 of 16 results for author: Murthy, H A

Searching in archive eess.
  1. arXiv:2303.07130  [pdf, other]

    eess.IV cs.CV cs.LG

    Enhancing COVID-19 Severity Analysis through Ensemble Methods

    Authors: Anand Thyagachandran, Hema A Murthy

    Abstract: Computed Tomography (CT) scans provide a detailed image of the lungs, allowing clinicians to observe the extent of damage caused by COVID-19. The CT severity score (CTSS) based scoring method is used to identify the extent of lung involvement observed on a CT scan. This paper presents a domain knowledge-based pipeline for extracting regions of infection in COVID-19 patients using a combination of…

    Submitted 17 March, 2023; v1 submitted 13 March, 2023; originally announced March 2023.

  2. arXiv:2302.06227  [pdf, other]

    eess.AS cs.SD

    Fast and small footprint Hybrid HMM-HiFiGAN based system for speech synthesis in Indian languages

    Authors: Sudhanshu Srivastava, Ishika Gupta, Anusha Prakash, Jom Kuriakose, Hema A. Murthy

    Abstract: Hidden-Markov-model (HMM) based text-to-speech (HTS) offers flexibility in speaking styles along with fast training and synthesis while being computationally less intense. HTS performs well even in low-resource scenarios. The primary drawback is that the voice quality is poor compared to that of E2E systems. A hybrid approach combining HMM-based feature generation and neural-network-based HiFi-GAN…

    Submitted 13 February, 2023; originally announced February 2023.

    Comments: 5 pages, 5 figures

  3. arXiv:2212.11982  [pdf, other]

    eess.AS

    HMM-based data augmentation for E2E systems for building conversational speech synthesis systems

    Authors: Ishika Gupta, Anusha Prakash, Jom Kuriakose, Hema A. Murthy

    Abstract: This paper proposes an approach to build a high-quality text-to-speech (TTS) system for technical domains using data augmentation. An end-to-end (E2E) system is trained on hidden Markov model (HMM) based synthesized speech and further fine-tuned with studio-recorded TTS data to improve the timbre of the synthesized voice. The motivation behind the work is that issues of word skips and repetitions…

    Submitted 22 December, 2022; originally announced December 2022.

    Comments: 6 pages, 7 figures, 33 references

  4. arXiv:2211.08790  [pdf, other]

    eess.AS cs.LG

    Structural Segmentation and Labeling of Tabla Solo Performances

    Authors: Gowriprasad R, R Aravind, Hema A Murthy

    Abstract: Tabla is a North Indian percussion instrument used as an accompaniment and an exclusive instrument for solo performances. Tabla solo is intricate and elaborate, exhibiting rhythmic evolution through a sequence of homogeneous sections marked by shared rhythmic characteristics. Each section has a specific structure and name associated with it. Tabla learning and performance in the Indian subcontinen…

    Submitted 16 November, 2022; originally announced November 2022.

    Comments: 35 pages, 11 figures

  5. arXiv:2211.01603  [pdf, other]

    q-bio.GN cs.LG eess.SP

    Using Signal Processing in Tandem With Adapted Mixture Models for Classifying Genomic Signals

    Authors: Saish Jaiswal, Shreya Nema, Hema A Murthy, Manikandan Narayanan

    Abstract: Genomic signal processing has been used successfully in bioinformatics to analyze biomolecular sequences and gain varied insights into DNA structure, gene organization, protein binding, sequence evolution, etc. But challenges remain in finding the appropriate spectral representation of a biomolecular sequence, especially when multiple variable-length sequences need to be handled consistently. In t…

    Submitted 3 November, 2022; originally announced November 2022.

  6. arXiv:2210.17153  [pdf, other]

    eess.AS cs.SD

    The Importance of Accurate Alignments in End-to-End Speech Synthesis

    Authors: Anusha Prakash, Hema A Murthy

    Abstract: Unit selection synthesis systems required accurate segmentation and labeling of the speech signal owing to the concatenative nature. Hidden Markov model-based speech synthesis accommodates some transcription errors, but it was later shown that accurate transcriptions yield highly intelligible speech with smaller amounts of training data. With the arrival of end-to-end (E2E) systems, it was observe…

    Submitted 31 October, 2022; originally announced October 2022.

    Comments: Version 1 uploaded

  7. arXiv:2103.03215  [pdf, other]

    eess.AS cs.SD

    Front-end Diarization for Percussion Separation in Taniavartanam of Carnatic Music Concerts

    Authors: Nauman Dawalatabad, Jilt Sebastian, Jom Kuriakose, C. Chandra Sekhar, Shrikanth Narayanan, Hema A. Murthy

    Abstract: Instrument separation in an ensemble is a challenging task. In this work, we address the problem of separating the percussive voices in the taniavartanam segments of Carnatic music. In taniavartanam, a number of percussive instruments play together or in tandem. Separation of instruments in regions where only one percussion is present leads to interference and artifacts at the output, as source se…

    Submitted 4 March, 2021; originally announced March 2021.

  8. arXiv:2011.02195  [pdf, other]

    eess.SP cs.LG cs.SD eess.AS

    Correlation based Multi-phasal models for improved imagined speech EEG recognition

    Authors: Rini A Sharon, Hema A Murthy

    Abstract: Translation of imagined speech electroencephalogram (EEG) into human understandable commands greatly facilitates the design of naturalistic brain computer interfaces. To achieve improved imagined speech unit classification, this work aims to profit from the parallel information contained in multi-phasal EEG data recorded while speaking, imagining and performing articulatory movements corresponding…

    Submitted 4 November, 2020; originally announced November 2020.

    Journal ref: Interspeech SMM 2020

  9. Novel Architectures for Unsupervised Information Bottleneck based Speaker Diarization of Meetings

    Authors: Nauman Dawalatabad, Srikanth Madikeri, C. Chandra Sekhar, Hema A. Murthy

    Abstract: Speaker diarization is an important problem that is topical, and is especially useful as a preprocessor for conversational speech related applications. The objective of this paper is two-fold: (i) segment initialization by uniformly distributing speaker information across the initial segments, and (ii) incorporating speaker discriminative features within the unsupervised diarization framework. In…

    Submitted 13 October, 2020; originally announced October 2020.

    Comments: Accepted in IEEE/ACM TRANSACTIONS ON AUDIO, SPEECH, AND LANGUAGE PROCESSING

    Journal ref: IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 29, 2021, pp 14-27

  10. arXiv:2010.05497  [pdf, other]

    cs.LG eess.SP q-bio.NC

    The "Sound of Silence" in EEG -- Cognitive voice activity detection

    Authors: Rini A Sharon, Hema A Murthy

    Abstract: Speech cognition bears potential application as a brain computer interface that can improve the quality of life for the otherwise communication impaired people. While speech and resting state EEG are popularly studied, here we attempt to explore a "non-speech" (NS) state of brain activity corresponding to the silence regions of speech audio. Firstly, speech perception is studied to inspect the exis…

    Submitted 12 October, 2020; originally announced October 2020.

  11. arXiv:2009.04983  [pdf, other]

    eess.AS cs.SD

    Exploration of End-to-end Synthesisers for Zero Resource Speech Challenge 2020

    Authors: Karthik Pandia D S, Anusha Prakash, Mano Ranjith Kumar, Hema A Murthy

    Abstract: A spoken dialogue system for an unseen language is referred to as Zero resource speech. It is especially beneficial for developing applications for languages that have low digital resources. Zero resource speech synthesis is the task of building text-to-speech (TTS) models in the absence of transcriptions. In this work, speech is modelled as a sequence of transient and steady-state acoustic units,…

    Submitted 10 September, 2020; originally announced September 2020.

    Comments: Accepted for publication in Interspeech 2020

  12. Evidence of Task-Independent Person-Specific Signatures in EEG using Subspace Techniques

    Authors: Mari Ganesh Kumar, Shrikanth Narayanan, Mriganka Sur, Hema A Murthy

    Abstract: Electroencephalography (EEG) signals are promising as alternatives to other biometrics owing to their protection against spoofing. Previous studies have focused on capturing individual variability by analyzing task/condition-specific EEG. This work attempts to model biometric signatures independent of task/condition by normalizing the associated variance. Toward this goal, the paper extends ideas…

    Submitted 25 March, 2021; v1 submitted 27 July, 2020; originally announced July 2020.

    Comments: ©2021 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works

    Journal ref: IEEE Transactions on Information Forensics and Security, 2021

  13. Generic Indic Text-to-speech Synthesisers with Rapid Adaptation in an End-to-end Framework

    Authors: Anusha Prakash, Hema A Murthy

    Abstract: Building text-to-speech (TTS) synthesisers for Indian languages is a difficult task owing to a large number of active languages. Indian languages can be classified into a finite set of families, prominent among them, Indo-Aryan and Dravidian. The proposed work exploits this property to build a generic TTS system using multiple languages from the same family in an end-to-end framework. Generic syst…

    Submitted 12 June, 2020; originally announced June 2020.

    Journal ref: INTERSPEECH (2020) 2962-2966

  14. Zero resource speech synthesis using transcripts derived from perceptual acoustic units

    Authors: Karthik Pandia D S, Hema A Murthy

    Abstract: Zerospeech synthesis is the task of building vocabulary independent speech synthesis systems, where transcriptions are not available for training data. It is, therefore, necessary to convert training data into a sequence of fundamental acoustic units that can be used for synthesis during the test. This paper attempts to discover, and model perceptual acoustic units consisting of steady-state, and…

    Submitted 8 June, 2020; originally announced June 2020.

  15. arXiv:1904.07453  [pdf, other]

    eess.AS cs.CR cs.LG cs.SD

    Spoof detection using time-delay shallow neural network and feature switching

    Authors: Mari Ganesh Kumar, Suvidha Rupesh Kumar, Saranya M, B. Bharathi, Hema A. Murthy

    Abstract: Detecting spoofed utterances is a fundamental problem in voice-based biometrics. Spoofing can be performed either by logical accesses like speech synthesis, voice conversion or by physical accesses such as replaying the pre-recorded utterance. Inspired by the state-of-the-art x-vector based speaker verification approach, this paper proposes a time-delay shallow neural network (TD-SNN) for s…

    Submitted 23 January, 2020; v1 submitted 16 April, 2019; originally announced April 2019.

    Journal ref: 2019 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU), 1011--1017

  16. Incremental Transfer Learning in Two-pass Information Bottleneck based Speaker Diarization System for Meetings

    Authors: Nauman Dawalatabad, Srikanth Madikeri, C Chandra Sekhar, Hema A Murthy

    Abstract: The two-pass information bottleneck (TPIB) based speaker diarization system operates independently on different conversational recordings. TPIB system does not consider previously learned speaker discriminative information while diarizing new conversations. Hence, the real time factor (RTF) of TPIB system is high owing to the training time required for the artificial neural network (ANN). This pap…

    Submitted 21 February, 2019; originally announced February 2019.

    Comments: 5 pages, 2 figures, To appear in Proc. ICASSP 2019, May 12-17, 2019, Brighton, UK