
Showing 1–5 of 5 results for author: Gangashetty, S V

  1. arXiv:2310.14654  [pdf, ps, other]

    cs.CL eess.AS

    SPRING-INX: A Multilingual Indian Language Speech Corpus by SPRING Lab, IIT Madras

    Authors: Nithya R, Malavika S, Jordan F, Arjun Gangwar, Metilda N J, S Umesh, Rithik Sarab, Akhilesh Kumar Dubey, Govind Divakaran, Samudra Vijaya K, Suryakanth V Gangashetty

    Abstract: India is home to a multitude of languages of which 22 languages are recognised by the Indian Constitution as official. Building speech based applications for the Indian population is a difficult problem owing to limited data and the number of languages and accents to accommodate. To encourage the language technology community to build speech based applications in Indian languages, we are open sour…

    Submitted 24 October, 2023; v1 submitted 23 October, 2023; originally announced October 2023.

    Comments: 3 pages, About SPRING-INX Data

  2. arXiv:2007.01359  [pdf, ps, other]

    cs.CL

    A Bayesian Multilingual Document Model for Zero-shot Topic Identification and Discovery

    Authors: Santosh Kesiraju, Sangeet Sagar, Ondřej Glembek, Lukáš Burget, Ján Černocký, Suryakanth V Gangashetty

    Abstract: In this paper, we present a Bayesian multilingual document model for learning language-independent document embeddings. The model is an extension of BaySMM [Kesiraju et al 2020] to the multilingual scenario. It learns to represent the document embeddings in the form of Gaussian distributions, thereby encoding the uncertainty in its covariance. We propagate the learned uncertainties through linear…

    Submitted 23 March, 2024; v1 submitted 2 July, 2020; originally announced July 2020.

  3. Learning document embeddings along with their uncertainties

    Authors: Santosh Kesiraju, Oldřich Plchot, Lukáš Burget, Suryakanth V Gangashetty

    Abstract: Majority of the text modelling techniques yield only point-estimates of document embeddings and lack in capturing the uncertainty of the estimates. These uncertainties give a notion of how well the embeddings represent a document. We present Bayesian subspace multinomial model (Bayesian SMM), a generative log-linear model that learns to represent documents in the form of Gaussian distributions, th…

    Submitted 18 October, 2019; v1 submitted 20 August, 2019; originally announced August 2019.

  4. arXiv:1606.05844  [pdf, other]

    cs.SD cs.LG

    Statistical Parametric Speech Synthesis Using Bottleneck Representation From Sequence Auto-encoder

    Authors: Sivanand Achanta, KNRK Raju Alluri, Suryakanth V Gangashetty

    Abstract: In this paper, we describe a statistical parametric speech synthesis approach with unit-level acoustic representation. In conventional deep neural network based speech synthesis, the input text features are repeated for the entire duration of phoneme for mapping text and speech parameters. This mapping is learnt at the frame-level which is the de-facto acoustic representation. However much of this…

    Submitted 19 June, 2016; originally announced June 2016.

    Comments: 5 pages (with references)

  5. arXiv:1508.00354  [pdf, ps, other]

    cs.SD cs.CL

    Significance of Maximum Spectral Amplitude in Sub-bands for Spectral Envelope Estimation and Its Application to Statistical Parametric Speech Synthesis

    Authors: Sivanand Achanta, Anandaswarup Vadapalli, Sai Krishna R., Suryakanth V. Gangashetty

    Abstract: In this paper we propose a technique for spectral envelope estimation using maximum values in the sub-bands of Fourier magnitude spectrum (MSASB). Most other methods in the literature parametrize spectral envelope in cepstral domain such as Mel-generalized cepstrum etc. Such cepstral domain representations, although compact, are not readily interpretable. This difficulty is overcome by our method…

    Submitted 3 August, 2015; originally announced August 2015.