Showing 1–42 of 42 results for author: Jyothi, P

  1. arXiv:2406.10993  [pdf, other]

    cs.CL cs.LG cs.SD eess.AS

    CoSTA: Code-Switched Speech Translation using Aligned Speech-Text Interleaving

    Authors: Bhavani Shankar, Preethi Jyothi, Pushpak Bhattacharyya

    Abstract: Code-switching is a widely prevalent linguistic phenomenon in multilingual societies like India. Building speech-to-text models for code-switched speech is challenging due to limited availability of datasets. In this work, we focus on the problem of spoken translation (ST) of code-switched speech in Indian languages to English text. We present a new end-to-end model architecture COSTA that scaffol…

    Submitted 16 June, 2024; originally announced June 2024.

  2. arXiv:2405.11200  [pdf, other]

    cs.CL

    LexGen: Domain-aware Multilingual Lexicon Generation

    Authors: Karthika NJ, Ayush Maheshwari, Atul Kumar Singh, Preethi Jyothi, Ganesh Ramakrishnan, Krishnakant Bhatt

    Abstract: Lexicon or dictionary generation across domains is of significant societal importance, as it can potentially enhance information accessibility for a diverse user base while preserving language identity. Prior work in the field primarily focuses on bilingual lexical induction, which deals with word alignments using mapping-based or corpora-based approaches. Though initiated by researchers, the rese…

    Submitted 18 May, 2024; originally announced May 2024.

  3. arXiv:2403.08011  [pdf, other]

    cs.CL cs.AI cs.LG

    Gujarati-English Code-Switching Speech Recognition using ensemble prediction of spoken language

    Authors: Yash Sharma, Basil Abraham, Preethi Jyothi

    Abstract: An important and difficult task in code-switched speech recognition is to recognize the language, as lots of words in two languages can sound similar, especially in some accents. We focus on improving performance of end-to-end Automatic Speech Recognition models by conditioning transformer layers on language ID of words and characters in the output in a per-layer supervised manner. To this end, we…

    Submitted 12 March, 2024; originally announced March 2024.

    Comments: Bachelor's thesis, 28 pages, includes appendix

  4. arXiv:2402.02080  [pdf, other]

    cs.CL

    Translation Errors Significantly Impact Low-Resource Languages in Cross-Lingual Learning

    Authors: Ashish Sunil Agrawal, Barah Fazili, Preethi Jyothi

    Abstract: Popular benchmarks (e.g., XNLI) used to evaluate cross-lingual language understanding consist of parallel versions of English evaluation sets in multiple target languages created with the help of professional translators. When creating such parallel data, it is critical to ensure high-quality translations for all target languages for an accurate characterization of cross-lingual transfer. In this…

    Submitted 3 February, 2024; originally announced February 2024.

    Comments: Accepted to main proceedings of "The 18th Conference of the European Chapter of the Association for Computational Linguistics"

  5. arXiv:2310.16749  [pdf, other]

    cs.CL cs.HC

    DISCO: A Large Scale Human Annotated Corpus for Disfluency Correction in Indo-European Languages

    Authors: Vineet Bhat, Preethi Jyothi, Pushpak Bhattacharyya

    Abstract: Disfluency correction (DC) is the process of removing disfluent elements like fillers, repetitions and corrections from spoken utterances to create readable and interpretable text. DC is a vital post-processing step applied to Automatic Speech Recognition (ASR) outputs, before subsequent processing by downstream language understanding tasks. Existing DC research has primarily focused on English du…

    Submitted 25 October, 2023; originally announced October 2023.

    Comments: Accepted at EMNLP 2023 Findings

  6. arXiv:2310.15970  [pdf, other]

    cs.CL cs.AI cs.LG

    Accented Speech Recognition With Accent-specific Codebooks

    Authors: Darshan Prabhu, Preethi Jyothi, Sriram Ganapathy, Vinit Unni

    Abstract: Speech accents pose a significant challenge to state-of-the-art automatic speech recognition (ASR) systems. Degradation in performance across underrepresented accents is a severe deterrent to the inclusive adoption of ASR. In this work, we propose a novel accent adaptation approach for end-to-end ASR systems using cross-attention with a trainable set of codebooks. These learnable codebooks capture…

    Submitted 26 October, 2023; v1 submitted 24 October, 2023; originally announced October 2023.

    Comments: Accepted to EMNLP 2023 Main Conference (Long Paper)

  7. arXiv:2310.06702  [pdf, other]

    cs.CL cs.LG cs.SD eess.AS

    Temporally Aligning Long Audio Interviews with Questions: A Case Study in Multimodal Data Integration

    Authors: Piyush Singh Pasi, Karthikeya Battepati, Preethi Jyothi, Ganesh Ramakrishnan, Tanmay Mahapatra, Manoj Singh

    Abstract: The problem of audio-to-text alignment has seen a significant amount of research using complete supervision during training. However, this is typically not in the context of long audio recordings wherein the text being queried does not appear verbatim within the audio file. This work is a collaboration with a non-governmental organization called CARE India that collects long audio health surveys fro…

    Submitted 10 October, 2023; originally announced October 2023.

    Comments: Work accepted at IJCAI-23, AI and Social Good Track

  8. arXiv:2307.05006  [pdf, ps, other]

    cs.CL cs.LG eess.AS

    Improving RNN-Transducers with Acoustic LookAhead

    Authors: Vinit S. Unni, Ashish Mittal, Preethi Jyothi, Sunita Sarawagi

    Abstract: RNN-Transducers (RNN-Ts) have gained widespread acceptance as an end-to-end model for speech to text conversion because of their high accuracy and streaming capabilities. A typical RNN-T independently encodes the input audio and the text context, and combines the two encodings by a thin joint network. While this architecture provides SOTA streaming accuracy, it also makes the model vulnerable to s…

    Submitted 10 July, 2023; originally announced July 2023.

    Comments: 5 pages, 1 fig, 7 tables, Proceedings of Interspeech 2023

  9. arXiv:2306.06384  [pdf, other]

    cs.CL

    Adversarial Training For Low-Resource Disfluency Correction

    Authors: Vineet Bhat, Preethi Jyothi, Pushpak Bhattacharyya

    Abstract: Disfluencies commonly occur in conversational speech. Speech with disfluencies can result in noisy Automatic Speech Recognition (ASR) transcripts, which affects downstream tasks like machine translation. In this paper, we propose an adversarially-trained sequence-tagging model for Disfluency Correction (DC) that utilizes a small amount of labeled real disfluent data in conjunction with a large amo…

    Submitted 10 June, 2023; originally announced June 2023.

    Comments: Accepted for Findings of ACL 2023

  10. arXiv:2305.16957  [pdf, other]

    cs.CL cs.LG cs.SD eess.AS

    DisfluencyFixer: A tool to enhance Language Learning through Speech To Speech Disfluency Correction

    Authors: Vineet Bhat, Preethi Jyothi, Pushpak Bhattacharyya

    Abstract: Conversational speech often consists of deviations from the speech plan, producing disfluent utterances that affect downstream NLP tasks. Removing these disfluencies is necessary to create fluent and coherent speech. This paper presents DisfluencyFixer, a tool that performs speech-to-speech disfluency correction in English and Hindi using a pipeline of Automatic Speech Recognition (ASR), Disfluenc…

    Submitted 26 May, 2023; originally announced May 2023.

    Comments: To be published in Interspeech 2023 - Show and Tell Demonstrations

  11. arXiv:2211.01458  [pdf, other]

    cs.CL cs.SD eess.AS

    Towards Zero-Shot Code-Switched Speech Recognition

    Authors: Brian Yan, Matthew Wiesner, Ondrej Klejch, Preethi Jyothi, Shinji Watanabe

    Abstract: In this work, we seek to build effective code-switched (CS) automatic speech recognition systems (ASR) under the zero-shot setting where no transcribed CS speech data is available for training. Previously proposed frameworks which conditionally factorize the bilingual task into its constituent monolingual parts are a promising starting point for leveraging monolingual data efficiently. However, th…

    Submitted 9 November, 2022; v1 submitted 2 November, 2022; originally announced November 2022.

    Comments: 5 pages

  12. arXiv:2210.16892  [pdf, other]

    cs.LG

    Partitioned Gradient Matching-based Data Subset Selection for Compute-Efficient Robust ASR Training

    Authors: Ashish Mittal, Durga Sivasubramanian, Rishabh Iyer, Preethi Jyothi, Ganesh Ramakrishnan

    Abstract: Training state-of-the-art ASR systems such as RNN-T often has a high associated financial and environmental cost. Training with a subset of training data could mitigate this problem if the subset selected could achieve on-par performance with training with the entire dataset. Although there are many data subset selection (DSS) algorithms, direct application to the RNN-T is difficult, especially the…

    Submitted 30 October, 2022; originally announced October 2022.

  13. arXiv:2210.06996  [pdf, other]

    cs.CL cs.LG

    DICTDIS: Dictionary Constrained Disambiguation for Improved NMT

    Authors: Ayush Maheshwari, Piyush Sharma, Preethi Jyothi, Ganesh Ramakrishnan

    Abstract: Domain-specific neural machine translation (NMT) systems (e.g., in educational applications) are socially significant with the potential to help make information accessible to a diverse set of users in multilingual societies. It is desirable that such NMT systems be lexically constrained and draw from domain-specific dictionaries. Dictionaries could present multiple candidate translations for a sou…

    Submitted 21 May, 2023; v1 submitted 13 October, 2022; originally announced October 2022.

  14. arXiv:2204.00871  [pdf, other]

    cs.CL cs.LG

    Accurate Online Posterior Alignments for Principled Lexically-Constrained Decoding

    Authors: Soumya Chatterjee, Sunita Sarawagi, Preethi Jyothi

    Abstract: Online alignment in machine translation refers to the task of aligning a target word to a source word when the target sequence has only been partially decoded. Good online alignments facilitate important applications such as lexically constrained translation where user-defined dictionaries are used to inject lexical constraints into the translation model. We propose a novel posterior alignment tec…

    Submitted 2 April, 2022; originally announced April 2022.

    Comments: 15 pages, 2 figures. ACL 2022

  15. arXiv:2203.16860  [pdf, other]

    cs.CV cs.MM cs.SD eess.AS eess.IV

    Investigating Modality Bias in Audio Visual Video Parsing

    Authors: Piyush Singh Pasi, Shubham Nemani, Preethi Jyothi, Ganesh Ramakrishnan

    Abstract: We focus on the audio-visual video parsing (AVVP) problem that involves detecting audio and visual event labels with temporal boundaries. The task is especially challenging since it is weakly supervised with only event labels available as a bag of labels for each video. An existing state-of-the-art model for AVVP uses a hybrid attention network (HAN) to generate cross-modal features for both audio…

    Submitted 11 November, 2022; v1 submitted 31 March, 2022; originally announced March 2022.

    Comments: Work under review for ICASSP 2023

  16. arXiv:2203.02317  [pdf, other]

    cs.CL cs.LG

    Adaptive Discounting of Implicit Language Models in RNN-Transducers

    Authors: Vinit Unni, Shreya Khare, Ashish Mittal, Preethi Jyothi, Sunita Sarawagi, Samarth Bharadwaj

    Abstract: RNN-Transducer (RNN-T) models have become synonymous with streaming end-to-end ASR systems. While they perform competitively on a number of evaluation categories, rare words pose a serious challenge to RNN-T models. One main reason for the degradation in performance on rare words is that the language model (LM) internal to RNN-Ts can become overconfident and lead to hallucinated predictions that a…

    Submitted 21 February, 2022; originally announced March 2022.

    Comments: Proceedings for ICASSP 2022

  17. arXiv:2202.01157  [pdf, other]

    cs.CL cs.LG

    Error Correction in ASR using Sequence-to-Sequence Models

    Authors: Samrat Dutta, Shreyansh Jain, Ayush Maheshwari, Souvik Pal, Ganesh Ramakrishnan, Preethi Jyothi

    Abstract: Post-editing in Automatic Speech Recognition (ASR) entails automatically correcting common and systematic errors produced by the ASR system. The outputs of an ASR system are largely prone to phonetic and spelling errors. In this paper, we propose to use a powerful pre-trained sequence-to-sequence model, BART, further adaptively trained to serve as a denoising model, to correct errors of such types…

    Submitted 23 August, 2022; v1 submitted 2 February, 2022; originally announced February 2022.

  18. arXiv:2110.04908  [pdf, other]

    eess.AS cs.SD

    DITTO: Data-efficient and Fair Targeted Subset Selection for ASR Accent Adaptation

    Authors: Suraj Kothawade, Anmol Mekala, Chandra Sekhara D, Mayank Kothyari, Rishabh Iyer, Ganesh Ramakrishnan, Preethi Jyothi

    Abstract: State-of-the-art Automatic Speech Recognition (ASR) systems are known to exhibit disparate performance on varying speech accents. To improve performance on a specific target accent, a commonly adopted solution is to finetune the ASR model using accent-specific labeled speech. However, acquiring large amounts of labeled speech for specific target accents is challenging. Choosing an informative subs…

    Submitted 5 June, 2023; v1 submitted 10 October, 2021; originally announced October 2021.

    Comments: ACL 2023

  19. arXiv:2107.09931  [pdf, other]

    cs.CL cs.LG

    The Effectiveness of Intermediate-Task Training for Code-Switched Natural Language Understanding

    Authors: Archiki Prasad, Mohammad Ali Rehan, Shreya Pathak, Preethi Jyothi

    Abstract: While recent benchmarks have spurred a lot of new work on improving the generalization of pretrained multilingual language models on multilingual tasks, techniques to improve code-switched natural language understanding tasks have been far less explored. In this work, we propose the use of bilingual intermediate pretraining as a reliable technique to derive large and consistent performance gains o…

    Submitted 21 July, 2021; originally announced July 2021.

  20. arXiv:2107.06483  [pdf, other]

    cs.CL

    From Machine Translation to Code-Switching: Generating High-Quality Code-Switched Text

    Authors: Ishan Tarunesh, Syamantak Kumar, Preethi Jyothi

    Abstract: Generating code-switched text is a problem of growing interest, especially given the scarcity of corpora containing large volumes of real code-switched text. In this work, we adapt a state-of-the-art neural machine translation model to generate Hindi-English code-switched sentences starting from monolingual Hindi sentences. We outline a carefully designed curriculum of pretraining steps, including…

    Submitted 14 July, 2021; originally announced July 2021.

    Comments: In Proceedings of The Joint Conference of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (ACL-IJCNLP 2021)

  21. arXiv:2106.05852  [pdf]

    eess.AS cs.CL cs.LG cs.SD

    Automatic Speech Recognition in Sanskrit: A New Speech Corpus and Modelling Insights

    Authors: Devaraja Adiga, Rishabh Kumar, Amrith Krishna, Preethi Jyothi, Ganesh Ramakrishnan, Pawan Goyal

    Abstract: Automatic speech recognition (ASR) in Sanskrit is interesting, owing to the various linguistic peculiarities present in the language. The Sanskrit language is lexically productive, undergoes euphonic assimilation of phones at the word boundaries and exhibits variations in spelling conventions and in pronunciations. In this work, we propose the first large scale study of automatic speech recognitio…

    Submitted 23 July, 2021; v1 submitted 2 June, 2021; originally announced June 2021.

    Comments: Accepted paper at the 59th Annual Meeting of the Association for Computational Linguistics (ACL 2021 Findings)

  22. arXiv:2104.04598  [pdf, other]

    cs.SD cs.CV cs.LG eess.AS eess.IV

    Cross-Modal learning for Audio-Visual Video Parsing

    Authors: Jatin Lamba, Abhishek, Jayaprakash Akula, Rishabh Dabral, Preethi Jyothi, Ganesh Ramakrishnan

    Abstract: In this paper, we present a novel approach to the audio-visual video parsing (AVVP) task that demarcates events from a video separately for audio and visual modalities. The proposed parsing approach simultaneously detects the temporal boundaries in terms of start and end times of such events. We show how AVVP can benefit from the following techniques geared towards effective cross-modal learning:…

    Submitted 21 June, 2021; v1 submitted 3 April, 2021; originally announced April 2021.

    Comments: Work accepted at Interspeech 2021

  23. arXiv:2104.02656  [pdf, other]

    cs.CV cs.AI cs.GR cs.MM cs.SD eess.AS eess.IV

    Collaborative Learning to Generate Audio-Video Jointly

    Authors: Vinod K Kurmi, Vipul Bajaj, Badri N Patro, K S Venkatesh, Vinay P Namboodiri, Preethi Jyothi

    Abstract: There have been a number of techniques that have demonstrated the generation of multimedia data for one modality at a time using GANs, such as the ability to generate images, videos, and audio. However, so far, the task of multi-modal generation of data, specifically for audio and videos both, has not been sufficiently well-explored. Towards this, we propose a method that demonstrates that we are…

    Submitted 31 March, 2021; originally announced April 2021.

    Comments: ICASSP 2021 (Accepted)

  24. Multilingual and code-switching ASR challenges for low resource Indian languages

    Authors: Anuj Diwan, Rakesh Vaideeswaran, Sanket Shah, Ankita Singh, Srinivasa Raghavan, Shreya Khare, Vinit Unni, Saurabh Vyas, Akash Rajpuria, Chiranjeevi Yarra, Ashish Mittal, Prasanta Kumar Ghosh, Preethi Jyothi, Kalika Bali, Vivek Seshadri, Sunayana Sitaram, Samarth Bharadwaj, Jai Nanavati, Raoul Nanavati, Karthik Sankaranarayanan, Tejaswi Seeram, Basil Abraham

    Abstract: Recently, there is increasing interest in multilingual automatic speech recognition (ASR) where a speech recognition system caters to multiple low resource languages by taking advantage of low amounts of labeled corpora in multiple languages. With multilingualism becoming common in today's world, there has been increasing interest in code-switching ASR as well. In code-switching, multiple language…

    Submitted 31 March, 2021; originally announced April 2021.

    Comments: 6 pages

  25. Select, Substitute, Search: A New Benchmark for Knowledge-Augmented Visual Question Answering

    Authors: Aman Jain, Mayank Kothyari, Vishwajeet Kumar, Preethi Jyothi, Ganesh Ramakrishnan, Soumen Chakrabarti

    Abstract: Multimodal IR, spanning text corpus, knowledge graph and images, called outside knowledge visual question answering (OKVQA), is of much recent interest. However, the popular data set has serious limitations. A surprisingly large fraction of queries do not assess the ability to integrate cross-modal information. Instead, some are independent of the image, some depend on speculation, some require OC…

    Submitted 10 August, 2021; v1 submitted 9 March, 2021; originally announced March 2021.

    Comments: Accepted at SIGIR 2021

  26. arXiv:2103.05457  [pdf, other]

    cs.IR

    Rudder: A Cross Lingual Video and Text Retrieval Dataset

    Authors: Jayaprakash A, Abhishek, Rishabh Dabral, Ganesh Ramakrishnan, Preethi Jyothi

    Abstract: Video retrieval using natural language queries requires learning semantically meaningful joint embeddings between the text and the audio-visual input. Often, such joint embeddings are learnt using pairwise (or triplet) contrastive loss objectives which cannot give enough attention to 'difficult-to-retrieve' samples during training. This problem is especially pronounced in data-scarce settings wher…

    Submitted 9 March, 2021; originally announced March 2021.

  27. arXiv:2103.03142  [pdf, other]

    cs.SD cs.CL eess.AS

    Error-driven Fixed-Budget ASR Personalization for Accented Speakers

    Authors: Abhijeet Awasthi, Aman Kansal, Sunita Sarawagi, Preethi Jyothi

    Abstract: We consider the task of personalizing ASR models while being constrained by a fixed budget on recording speaker-specific utterances. Given a speaker and an ASR model, we propose a method of identifying sentences for which the speaker's utterances are likely to be harder for the given ASR model to recognize. We assume a tiny amount of speaker-specific data to learn phoneme-level error models which…

    Submitted 2 June, 2021; v1 submitted 4 March, 2021; originally announced March 2021.

    Comments: In ICASSP 2021

  28. arXiv:2102.06237  [pdf, other]

    eess.AS cs.LG cs.SD

    An Investigation of End-to-End Models for Robust Speech Recognition

    Authors: Archiki Prasad, Preethi Jyothi, Rajbabu Velmurugan

    Abstract: End-to-end models for robust automatic speech recognition (ASR) have not been sufficiently well-explored in prior work. With end-to-end models, one could choose to preprocess the input speech using speech enhancement techniques and train the model using enhanced speech. Another alternative is to pass the noisy speech as input and modify the model architecture to adapt to noisy speech. A systematic…

    Submitted 11 February, 2021; originally announced February 2021.

    Comments: Accepted to appear at ICASSP 2021

  29. arXiv:2101.10368  [pdf, other]

    cs.CL

    Meta-Learning for Effective Multi-task and Multilingual Modelling

    Authors: Ishan Tarunesh, Sushil Khyalia, Vishwajeet Kumar, Ganesh Ramakrishnan, Preethi Jyothi

    Abstract: Natural language processing (NLP) tasks (e.g. question-answering in English) benefit from knowledge of other tasks (e.g. named entity recognition in English) and knowledge of other languages (e.g. question-answering in Spanish). Such shared representations are typically learned in isolation, either across tasks or across languages. In this work, we propose a meta-learning approach to learn the int…

    Submitted 22 March, 2021; v1 submitted 25 January, 2021; originally announced January 2021.

    Comments: In Proceedings of The 16th Conference of the European Chapter of the Association for Computational Linguistics (EACL 2021)

  30. arXiv:2010.09322  [pdf, other]

    eess.AS cs.AI cs.CL cs.LG cs.SD

    Reduce and Reconstruct: ASR for Low-Resource Phonetic Languages

    Authors: Anuj Diwan, Preethi Jyothi

    Abstract: This work presents a seemingly simple but effective technique to improve low-resource ASR systems for phonetic languages. By identifying sets of acoustically similar graphemes in these languages, we first reduce the output alphabet of the ASR system using linguistically meaningful reductions and then reconstruct the original alphabet using a standalone module. We demonstrate that this lessens the…

    Submitted 3 June, 2021; v1 submitted 19 October, 2020; originally announced October 2020.

    Comments: 5 pages, 1 figure. Accepted at INTERSPEECH 2021

  31. arXiv:2010.05549  [pdf, ps, other]

    cs.CL

    Improving Low Resource Code-switched ASR using Augmented Code-switched TTS

    Authors: Yash Sharma, Basil Abraham, Karan Taneja, Preethi Jyothi

    Abstract: Building Automatic Speech Recognition (ASR) systems for code-switched speech has recently gained renewed attention due to the widespread use of speech technologies in multilingual communities worldwide. End-to-end ASR systems are a natural modeling choice due to their ease of use and superior performance in monolingual settings. However, it is well known that end-to-end systems require large amoun…

    Submitted 12 October, 2020; originally announced October 2020.

    Comments: Interspeech 2020, 5 pages

  32. arXiv:2006.13519  [pdf, other]

    eess.AS cs.CL cs.SD

    Black-box Adaptation of ASR for Accented Speech

    Authors: Kartik Khandelwal, Preethi Jyothi, Abhijeet Awasthi, Sunita Sarawagi

    Abstract: We introduce the problem of adapting a black-box, cloud-based ASR system to speech from a target accent. While leading online ASR services obtain impressive performance on main-stream accents, they perform poorly on sub-populations - we observed that the word error rate (WER) achieved by Google's ASR API on Indian accents is almost twice the WER on US accents. Existing adaptation methods either re…

    Submitted 24 June, 2020; originally announced June 2020.

    Comments: A slightly different version submitted to INTERSPEECH 2020 (currently under review)

  33. arXiv:1910.11536  [pdf, other]

    cs.CL

    Stem-driven Language Models for Morphologically Rich Languages

    Authors: Yash Shah, Ishan Tarunesh, Harsh Deshpande, Preethi Jyothi

    Abstract: Neural language models (LMs) have been shown to benefit significantly from enhancing word vectors with subword-level information, especially for morphologically rich languages. This has been mainly tackled by providing subword-level information as an input; using subword units in the output layer has been far less explored. In this work, we propose LMs that are cognizant of the underlying stems in each…

    Submitted 25 October, 2019; originally announced October 2019.

    Comments: 5 pages, 3 figures, under review at ICASSP 2020

  34. arXiv:1906.09426  [pdf, other]

    eess.AS cs.CL cs.LG cs.SD

    End-to-End ASR for Code-switched Hindi-English Speech

    Authors: Brij Mohan Lal Srivastava, Basil Abraham, Sunayana Sitaram, Rupesh Mehta, Preethi Jyothi

    Abstract: End-to-end (E2E) models have been explored for large speech corpora and have been found to match or outperform traditional pipeline-based systems in some languages. However, most prior work on end-to-end models uses speech corpora exceeding hundreds or thousands of hours. In this study, we explore end-to-end models for code-switched Hindi-English language with less than 50 hours of data. We utilize…

    Submitted 22 June, 2019; originally announced June 2019.

  35. arXiv:1906.02525  [pdf, other]

    cs.CL

    Cross-Lingual Training for Automatic Question Generation

    Authors: Vishwajeet Kumar, Nitish Joshi, Arijit Mukherjee, Ganesh Ramakrishnan, Preethi Jyothi

    Abstract: Automatic question generation (QG) is a challenging problem in natural language understanding. QG systems are typically built assuming access to a large number of training instances where each instance is a question and its corresponding answer. For a new language, such training instances are hard to obtain, making the QG problem even more challenging. Using this as our motivation, we study the reu…

    Submitted 6 June, 2019; originally announced June 2019.

    Comments: ACL 2019

  36. arXiv:1809.01962  [pdf, other]

    cs.CL cs.LG

    Code-switched Language Models Using Dual RNNs and Same-Source Pretraining

    Authors: Saurabh Garg, Tanmay Parekh, Preethi Jyothi

    Abstract: This work focuses on building language models (LMs) for code-switched text. We propose two techniques that significantly improve these LMs: 1) A novel recurrent neural network unit with dual components that focus on each language in the code-switched text separately 2) Pretraining the LM using synthetic text from a generative model estimated using the training data. We demonstrate the effectivenes…

    Submitted 6 September, 2018; originally announced September 2018.

    Comments: Accepted at EMNLP 2018

  37. arXiv:1808.07733  [pdf, other]

    cs.CL

    Revisiting the Importance of Encoding Logic Rules in Sentiment Classification

    Authors: Kalpesh Krishna, Preethi Jyothi, Mohit Iyyer

    Abstract: We analyze the performance of different sentiment classification models on syntactically complex inputs like A-but-B sentences. The first contribution of this analysis addresses reproducible research: to meaningfully compare different models, their accuracies must be averaged over far more random seeds than what has traditionally been reported. With proper averaging in place, we notice that the di…

    Submitted 23 August, 2018; originally announced August 2018.

    Comments: EMNLP 2018 Camera Ready

  38. arXiv:1804.10745  [pdf, other]

    cs.LG stat.ML

    Generalizing Across Domains via Cross-Gradient Training

    Authors: Shiv Shankar, Vihari Piratla, Soumen Chakrabarti, Siddhartha Chaudhuri, Preethi Jyothi, Sunita Sarawagi

    Abstract: We present CROSSGRAD, a method to use multi-domain training data to learn a classifier that generalizes to new domains. CROSSGRAD does not need an adaptation phase via labeled or unlabeled data, or domain features in the new domain. Most existing domain adaptation methods attempt to erase domain signals using techniques like domain adversarial training. In contrast, CROSSGRAD is free to use domain…

    Submitted 1 May, 2018; v1 submitted 28 April, 2018; originally announced April 2018.

    Comments: The first two authors contributed equally; Accepted at ICLR 2018

  39. arXiv:1712.08992  [pdf, other]

    cs.CL cs.LG cs.SD eess.AS

    Leveraging Native Language Speech for Accent Identification using Deep Siamese Networks

    Authors: Aditya Siddhant, Preethi Jyothi, Sriram Ganapathy

    Abstract: The problem of automatic accent identification is important for several applications like speaker profiling and recognition as well as for improving speech recognition systems. The accented nature of speech can be primarily attributed to the influence of the speaker's native language on the given speech recording. In this paper, we propose a novel accent identification system whose training exploi…

    Submitted 18 June, 2018; v1 submitted 24 December, 2017; originally announced December 2017.

    Comments: Published in ASRU 2017

  40. arXiv:1711.01048  [pdf, other]

    cs.CL

    Dual Language Models for Code Switched Speech Recognition

    Authors: Saurabh Garg, Tanmay Parekh, Preethi Jyothi

    Abstract: In this work, we present a simple and elegant approach to language modeling for bilingual code-switched text. Since code-switching is a blend of two or more different languages, a standard bilingual language model can be improved upon by using structures of the monolingual language models. We propose a novel technique called dual language models, which involves building two complementary monolingu…

    Submitted 3 August, 2018; v1 submitted 3 November, 2017; originally announced November 2017.

    Comments: Accepted at Interspeech 2018

  41. arXiv:1612.03991  [pdf, ps, other]

    cs.CL

    Performance Improvements of Probabilistic Transcript-adapted ASR with Recurrent Neural Network and Language-specific Constraints

    Authors: Xiang Kong, Preethi Jyothi, Mark Hasegawa-Johnson

    Abstract: Mismatched transcriptions have been proposed as a means to acquire probabilistic transcriptions from non-native speakers of a language. Prior work has demonstrated the value of these transcriptions by successfully adapting cross-lingual ASR systems for different target languages. In this work, we describe two techniques to refine these probabilistic transcriptions: a noisy-channel model of non-nati…

    Submitted 12 December, 2016; originally announced December 2016.

  42. arXiv:1607.01958  [pdf]

    cs.CL cs.IR cs.LG

    Stock trend prediction using news sentiment analysis

    Authors: Joshi Kalyani, Prof. H. N. Bharathi, Prof. Rao Jyothi

    Abstract: The Efficient Market Hypothesis is the popular theory about stock prediction. With its failure, much research has been carried out in the area of prediction of stocks. This project is about taking non-quantifiable data such as financial news articles about a company and predicting its future stock trend with news sentiment classification. Assuming that news articles have impact on stock market, this is an…

    Submitted 7 July, 2016; originally announced July 2016.

    Comments: 11 pages, 4 figures