Showing 1–8 of 8 results for author: Ramoji, S

  1. arXiv:2311.12564

    eess.AS cs.LG eess.SP

    Summary of the DISPLACE Challenge 2023 - DIarization of SPeaker and LAnguage in Conversational Environments

    Authors: Shikha Baghel, Shreyas Ramoji, Somil Jain, Pratik Roy Chowdhuri, Prachi Singh, Deepu Vijayasenan, Sriram Ganapathy

    Abstract: In multi-lingual societies, where multiple languages are spoken in a small geographic vicinity, informal conversations often involve a mix of languages. Existing speech technologies may be inefficient in extracting information from such conversations, where the speech data is rich in diversity with multiple languages and speakers. The DISPLACE (DIarization of SPeaker and LAnguage in Conversational E…

    Submitted 3 January, 2024; v1 submitted 21 November, 2023; originally announced November 2023.

  2. arXiv:2303.00830

    eess.AS cs.SD eess.SP

    DISPLACE Challenge: DIarization of SPeaker and LAnguage in Conversational Environments

    Authors: Shikha Baghel, Shreyas Ramoji, Sidharth, Ranjana H, Prachi Singh, Somil Jain, Pratik Roy Chowdhuri, Kaustubh Kulkarni, Swapnil Padhi, Deepu Vijayasenan, Sriram Ganapathy

    Abstract: In multilingual societies, social conversations often involve code-mixed speech. The current speech technology may not be well equipped to extract information from multi-lingual multi-speaker conversations. The DISPLACE challenge entails a first-of-kind task to benchmark speaker and language diarization on the same data, as the data contains multi-speaker conversations in multilingual code-mixed s…

    Submitted 5 June, 2023; v1 submitted 1 March, 2023; originally announced March 2023.

  3. arXiv:2103.09148

    eess.AS cs.SD

    DiCOVA Challenge: Dataset, task, and baseline system for COVID-19 diagnosis using acoustics

    Authors: Ananya Muguli, Lancelot Pinto, Nirmala R., Neeraj Sharma, Prashant Krishnan, Prasanta Kumar Ghosh, Rohit Kumar, Shrirama Bhat, Srikanth Raj Chetupalli, Sriram Ganapathy, Shreyas Ramoji, Viral Nanda

    Abstract: The DiCOVA challenge aims at accelerating research in diagnosing COVID-19 using acoustics (DiCOVA), a topic at the intersection of speech and audio processing, respiratory health diagnosis, and machine learning. This challenge is an open call for researchers to analyze a dataset of sound recordings collected from COVID-19 infected and non-COVID-19 individuals for a two-class classification. These…

    Submitted 17 June, 2021; v1 submitted 16 March, 2021; originally announced March 2021.

    Comments: To appear in Proceedings of Interspeech, 2021

  4. arXiv:2008.04527

    eess.AS cs.CL cs.LG cs.SD

    Neural PLDA Modeling for End-to-End Speaker Verification

    Authors: Shreyas Ramoji, Prashant Krishnan, Sriram Ganapathy

    Abstract: While deep learning models have made significant advances in supervised classification problems, the application of these models for out-of-set verification tasks like speaker recognition has been limited to deriving feature embeddings. The state-of-the-art x-vector PLDA based speaker verification systems use a generative model based on probabilistic linear discriminant analysis (PLDA) for computi…

    Submitted 11 August, 2020; originally announced August 2020.

    Comments: Accepted in Interspeech 2020. GitHub Implementation Repos: https://github.com/iiscleap/E2E-NPLDA and https://github.com/iiscleap/NeuralPlda

  5. Coswara -- A Database of Breathing, Cough, and Voice Sounds for COVID-19 Diagnosis

    Authors: Neeraj Sharma, Prashant Krishnan, Rohit Kumar, Shreyas Ramoji, Srikanth Raj Chetupalli, Nirmala R., Prasanta Kumar Ghosh, Sriram Ganapathy

    Abstract: The COVID-19 pandemic presents global challenges transcending boundaries of country, race, religion, and economy. The current gold standard method for COVID-19 detection is the reverse transcription polymerase chain reaction (RT-PCR) testing. However, this method is expensive, time-consuming, and violates social distancing. Also, as the pandemic is expected to stay for a while, there is a need for…

    Submitted 11 August, 2020; v1 submitted 21 May, 2020; originally announced May 2020.

    Comments: A description of Coswara dataset to evaluate COVID-19 diagnosis using respiratory sounds

  6. arXiv:2002.03562

    eess.AS cs.CL cs.LG cs.SD

    NPLDA: A Deep Neural PLDA Model for Speaker Verification

    Authors: Shreyas Ramoji, Prashant Krishnan, Sriram Ganapathy

    Abstract: The state-of-art approach for speaker verification consists of a neural network based embedding extractor along with a backend generative model such as the Probabilistic Linear Discriminant Analysis (PLDA). In this work, we propose a neural network approach for backend modeling in speaker recognition. The likelihood ratio score of the generative PLDA model is posed as a discriminative similarity f…

    Submitted 24 May, 2020; v1 submitted 10 February, 2020; originally announced February 2020.

    Comments: Published in Odyssey 2020, the Speaker and Language Recognition Workshop (VOiCES Special Session). Link to GitHub Implementation: https://github.com/iiscleap/NeuralPlda. arXiv admin note: substantial text overlap with arXiv:2001.07034

    Journal ref: in Proc. Odyssey 2020 The Speaker and Language Recognition Workshop, Pages 202-209

  7. arXiv:2002.02735

    eess.AS cs.CL cs.LG cs.SD

    LEAP System for SRE19 CTS Challenge -- Improvements and Error Analysis

    Authors: Shreyas Ramoji, Prashant Krishnan, Bhargavram Mysore, Prachi Singh, Sriram Ganapathy

    Abstract: The NIST Speaker Recognition Evaluation - Conversational Telephone Speech (CTS) challenge 2019 was an open evaluation for the task of speaker verification in challenging conditions. In this paper, we provide a detailed account of the LEAP SRE system submitted to the CTS challenge focusing on the novel components in the back-end system modeling. All the systems used the time-delay neural network (T…

    Submitted 24 May, 2020; v1 submitted 7 February, 2020; originally announced February 2020.

    Comments: Published In Proc. Odyssey 2020, the Speaker and Language Recognition Workshop. Link to GitHub Implementation: https://github.com/iiscleap/NeuralPlda

    Journal ref: in Proc. Odyssey 2020 The Speaker and Language Recognition Workshop, Pages 281-288

  8. arXiv:2001.07034

    eess.AS cs.LG cs.SD eess.SP

    Pairwise Discriminative Neural PLDA for Speaker Verification

    Authors: Shreyas Ramoji, Prashant Krishnan V, Prachi Singh, Sriram Ganapathy

    Abstract: The state-of-art approach to speaker verification involves the extraction of discriminative embeddings like x-vectors followed by a generative model back-end using a probabilistic linear discriminant analysis (PLDA). In this paper, we propose a Pairwise neural discriminative model for the task of speaker verification which operates on a pair of speaker embeddings such as x-vectors/i-vectors and ou…

    Submitted 7 February, 2020; v1 submitted 20 January, 2020; originally announced January 2020.

    Comments: This paper was submitted to IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP) 2020. Link to GitHub Repository: https://github.com/iiscleap/NeuralPlda