Showing 1–6 of 6 results for author: Vijayasenan, D

Searching in archive eess.
  1. arXiv:2406.09494  [pdf, other]

    eess.AS cs.LG

    The Second DISPLACE Challenge : DIarization of SPeaker and LAnguage in Conversational Environments

    Authors: Shareef Babu Kalluri, Prachi Singh, Pratik Roy Chowdhuri, Apoorva Kulkarni, Shikha Baghel, Pradyoth Hegde, Swapnil Sontakke, Deepak K T, S. R. Mahadeva Prasanna, Deepu Vijayasenan, Sriram Ganapathy

    Abstract: The DIarization of SPeaker and LAnguage in Conversational Environments (DISPLACE) 2024 challenge is the second in the series of DISPLACE challenges, which involves tasks of speaker diarization (SD) and language diarization (LD) on a challenging multilingual conversational speech dataset. In the DISPLACE 2024 challenge, we also introduced the task of automatic speech recognition (ASR) on this datas…

    Submitted 13 June, 2024; originally announced June 2024.

    Comments: 5 pages, 3 figures, Interspeech 2024

  2. arXiv:2311.12564  [pdf]

    eess.AS cs.LG eess.SP

    Summary of the DISPLACE Challenge 2023 - DIarization of SPeaker and LAnguage in Conversational Environments

    Authors: Shikha Baghel, Shreyas Ramoji, Somil Jain, Pratik Roy Chowdhuri, Prachi Singh, Deepu Vijayasenan, Sriram Ganapathy

    Abstract: In multi-lingual societies, where multiple languages are spoken in a small geographic vicinity, informal conversations often involve mix of languages. Existing speech technologies may be inefficient in extracting information from such conversations, where the speech data is rich in diversity with multiple languages and speakers. The DISPLACE (DIarization of SPeaker and LAnguage in Conversational E…

    Submitted 3 January, 2024; v1 submitted 21 November, 2023; originally announced November 2023.

  3. arXiv:2303.00830  [pdf, other]

    eess.AS cs.SD eess.SP

    DISPLACE Challenge: DIarization of SPeaker and LAnguage in Conversational Environments

    Authors: Shikha Baghel, Shreyas Ramoji, Sidharth, Ranjana H, Prachi Singh, Somil Jain, Pratik Roy Chowdhuri, Kaustubh Kulkarni, Swapnil Padhi, Deepu Vijayasenan, Sriram Ganapathy

    Abstract: In multilingual societies, social conversations often involve code-mixed speech. The current speech technology may not be well equipped to extract information from multi-lingual multi-speaker conversations. The DISPLACE challenge entails a first-of-kind task to benchmark speaker and language diarization on the same data, as the data contains multi-speaker conversations in multilingual code-mixed s…

    Submitted 5 June, 2023; v1 submitted 1 March, 2023; originally announced March 2023.

  4. arXiv:2208.10737  [pdf, other]

    eess.IV cs.CV eess.SP

    Semi-Automatic Labeling and Semantic Segmentation of Gram-Stained Microscopic Images from DIBaS Dataset

    Authors: Chethan Reddy G. P., Pullagurla Abhijith Reddy, Vidyashree R. Kanabur, Deepu Vijayasenan, Sumam S. David, Sreejith Govindan

    Abstract: In this paper, a semi-automatic annotation of bacteria genera and species from DIBaS dataset is implemented using clustering and thresholding algorithms. A Deep learning model is trained to achieve the semantic segmentation and classification of the bacteria species. Classification accuracy of 95% is achieved. Deep learning models find tremendous applications in biomedical image processing. Automa…

    Submitted 23 August, 2022; originally announced August 2022.

  5. arXiv:2011.04299  [pdf, other]

    cs.SD cs.LG eess.AS

    COVID-19 Patient Detection from Telephone Quality Speech Data

    Authors: Kotra Venkata Sai Ritwik, Shareef Babu Kalluri, Deepu Vijayasenan

    Abstract: In this paper, we try to investigate the presence of cues about the COVID-19 disease in the speech data. We use an approach that is similar to speaker recognition. Each sentence is represented as super vectors of short term Mel filter bank features for each phoneme. These features are used to learn a two-class classifier to separate the COVID-19 speech from normal. Experiments on a small dataset c…

    Submitted 9 November, 2020; originally announced November 2020.

    Comments: 6 pages, 7 figures

  6. arXiv:2007.06021  [pdf, other]

    eess.AS cs.LG

    NISP: A Multi-lingual Multi-accent Dataset for Speaker Profiling

    Authors: Shareef Babu Kalluri, Deepu Vijayasenan, Sriram Ganapathy, Ragesh Rajan M, Prashant Krishnan

    Abstract: Many commercial and forensic applications of speech demand the extraction of information about the speaker characteristics, which falls into the broad category of speaker profiling. The speaker characteristics needed for profiling include physical traits of the speaker like height, age, and gender of the speaker along with the native language of the speaker. Many of the datasets available have onl…

    Submitted 12 July, 2020; originally announced July 2020.

    Comments: 5 pages, initial version submitted to Interspeech 2020