Showing 1–29 of 29 results for author: Joshi, V

  1. arXiv:2406.10247  [pdf, other]

    cs.CL cs.AI

    QCQA: Quality and Capacity-aware grouped Query Attention

    Authors: Vinay Joshi, Prashant Laddha, Shambhavi Sinha, Om Ji Omer, Sreenivas Subramoney

    Abstract: Excessive memory requirements of key and value features (KV-cache) present significant challenges in the autoregressive inference of large language models (LLMs), restricting both the speed and length of text generation. Approaches such as Multi-Query Attention (MQA) and Grouped Query Attention (GQA) mitigate these challenges by grouping query heads and consequently reducing the number of correspo…

    Submitted 8 June, 2024; originally announced June 2024.

  2. arXiv:2405.01568  [pdf]

    cs.SE

    Convert any android device into a programmable IoT device with the help of IoT Everywhere Framework

    Authors: Vishnu Joshi

    Abstract: The world around us is transforming as the field of the Internet of Things is taking over the world faster than we thought. Everyone in the tech industry is building wonderful things with the help of IoT. Smartwatches, smart coffee machines, smart television, smart homes are some of the examples. Building IoT sensor modules with sensors that connect to the internet can be very intimidating for peo…

    Submitted 14 April, 2024; originally announced May 2024.

    Comments: 4 pages, 10 figures

  3. arXiv:2402.11780  [pdf, other]

    cs.AR cs.AI

    CiMNet: Towards Joint Optimization for DNN Architecture and Configuration for Compute-In-Memory Hardware

    Authors: Souvik Kundu, Anthony Sarah, Vinay Joshi, Om J Omer, Sreenivas Subramoney

    Abstract: With the recent growth in demand for large-scale deep neural networks, compute in-memory (CiM) has come up as a prominent solution to alleviate bandwidth and on-chip interconnect bottlenecks that constrain von Neumann architectures. However, the construction of CiM hardware poses a challenge as any specific memory hierarchy in terms of cache sizes and memory bandwidth at different interfaces may no…

    Submitted 18 March, 2024; v1 submitted 18 February, 2024; originally announced February 2024.

    Comments: 6 pages, 4 figures, 5 tables; Accepted as a full paper by the tinyML Research Symposium 2024

  4. arXiv:2402.07118  [pdf, other]

    cs.HC cs.AI cs.LG eess.IV eess.SP

    Next-Generation Teleophthalmology: AI-enabled Quality Assessment Aiding Remote Smartphone-based Consultation

    Authors: Dhruv Srikanth, Jayang Gurung, N Satya Deepika, Vineet Joshi, Pravin Vaddavalli, Soumya Jana

    Abstract: Blindness and other eye diseases are a global health concern, particularly in low- and middle-income countries like India. In this regard, during the COVID-19 pandemic, teleophthalmology became a lifeline, and the Grabi attachment for smartphone-based eye imaging gained in use. However, quality of user-captured image often remained inadequate, requiring clinician vetting and delays. In this backdr…

    Submitted 11 February, 2024; originally announced February 2024.

    Comments: 4 pages, Submitted to IEEE EMBC 2024

  5. Streaming Bilingual End-to-End ASR model using Attention over Multiple Softmax

    Authors: Aditya Patil, Vikas Joshi, Purvi Agrawal, Rupesh Mehta

    Abstract: Even with several advancements in multilingual modeling, it is challenging to recognize multiple languages using a single neural model, without knowing the input language and most multilingual models assume the availability of the input language. In this work, we propose a novel bilingual end-to-end (E2E) modeling approach, where a single neural model can recognize both languages and also support…

    Submitted 21 January, 2024; originally announced January 2024.

    Comments: Published in IEEE's Spoken Language Technology (SLT) 2022, 8 pages (6 + 2 for references), 5 figures

    Journal ref: 2022 IEEE Spoken Language Technology Workshop (SLT), Doha, Qatar, 2023, pp. 252-259

  6. arXiv:2306.00248  [pdf, other]

    cs.IR cs.AI

    TransAct: Transformer-based Realtime User Action Model for Recommendation at Pinterest

    Authors: Xue Xia, Pong Eksombatchai, Nikil Pancha, Dhruvil Deven Badani, Po-Wei Wang, Neng Gu, Saurabh Vishwas Joshi, Nazanin Farahpour, Zhiyuan Zhang, Andrew Zhai

    Abstract: Sequential models that encode user activity for next action prediction have become a popular design choice for building web-scale personalized recommendation systems. Traditional methods of sequential recommendation either utilize end-to-end learning on realtime user actions, or learn user representations separately in an offline batch-generated manner. This paper (1) presents Pinterest's ranking…

    Submitted 31 May, 2023; originally announced June 2023.

    Comments: © ACM 2023. This is the author's version of the work. It is posted here for your personal use. Not for redistribution. The definitive Version of Record was published in KDD'23, http://dx.doi.org/10.1145/3580305.3599918

  7. arXiv:2304.14807  [pdf, other]

    physics.plasm-ph cs.AI cs.LG physics.comp-ph

    Deep Learning assisted microwave-plasma interaction based technique for plasma density estimation

    Authors: Pratik Ghosh, Bhaskar Chaudhury, Shishir Purohit, Vishv Joshi, Ashray Kothari, Devdeep Shetranjiwala

    Abstract: The electron density is a key parameter to characterize any plasma. Most of the plasma applications and research in the area of low-temperature plasmas (LTPs) are based on the accurate estimations of plasma density and plasma temperature. The conventional methods for electron density measurements offer axial and radial profiles for any given linear LTP device. These methods have major disadvantage…

    Submitted 28 June, 2023; v1 submitted 28 April, 2023; originally announced April 2023.

  8. arXiv:2303.17853  [pdf, other]

    physics.pop-ph astro-ph.HE cs.CL

    Can AI Put Gamma-Ray Astrophysicists Out of a Job?

    Authors: Samuel T. Spencer, Vikas Joshi, Alison M. W. Mitchell

    Abstract: In what will likely be a litany of generative-model-themed arXiv submissions celebrating April the 1st, we evaluate the capacity of state-of-the-art transformer models to create a paper detailing the detection of a Pulsar Wind Nebula with a non-existent Imaging Atmospheric Cherenkov Telescope (IACT) Array. We do this to evaluate the ability of such models to interpret astronomical observations and…

    Submitted 4 April, 2023; v1 submitted 31 March, 2023; originally announced March 2023.

  9. arXiv:2303.15218  [pdf, other]

    cs.LG cs.AI

    Evaluating XGBoost for Balanced and Imbalanced Data: Application to Fraud Detection

    Authors: Gissel Velarde, Anindya Sudhir, Sanjay Deshmane, Anuj Deshmunkh, Khushboo Sharma, Vaibhav Joshi

    Abstract: This paper evaluates XGBoost's performance given different dataset sizes and class distributions, from perfectly balanced to highly imbalanced. XGBoost has been selected for evaluation, as it stands out in several benchmarks due to its detection performance and speed. After introducing the problem of fraud detection, the paper reviews evaluation metrics for detection systems or binary classifiers,…

    Submitted 27 March, 2023; originally announced March 2023.

    Comments: 17 pages, 8 figures, 9 tables, Presented at NVIDIA GTC, The Conference for the Era of AI and the Metaverse, March 23, 2023. [S51129]

  10. arXiv:2303.14584  [pdf, other]

    cs.CV

    Learning video embedding space with Natural Language Supervision

    Authors: Phani Krishna Uppala, Abhishek Bamotra, Shriti Priya, Vaidehi Joshi

    Abstract: The recent success of the CLIP model has shown its potential to be applied to a wide range of vision and language tasks. However, this only establishes embedding space relationship of language to images, not to the video domain. In this paper, we propose a novel approach to map video embedding space to natural language. We propose a two-stage approach that first extracts visual features from each…

    Submitted 7 April, 2023; v1 submitted 25 March, 2023; originally announced March 2023.

  11. A 64-core mixed-signal in-memory compute chip based on phase-change memory for deep neural network inference

    Authors: Manuel Le Gallo, Riduan Khaddam-Aljameh, Milos Stanisavljevic, Athanasios Vasilopoulos, Benedikt Kersting, Martino Dazzi, Geethan Karunaratne, Matthias Braendli, Abhairaj Singh, Silvia M. Mueller, Julian Buechel, Xavier Timoneda, Vinay Joshi, Urs Egger, Angelo Garofalo, Anastasios Petropoulos, Theodore Antonakopoulos, Kevin Brew, Samuel Choi, Injo Ok, Timothy Philip, Victor Chan, Claire Silvestre, Ishtiaq Ahsan, Nicole Saulnier, et al. (4 additional authors not shown)

    Abstract: The need to repeatedly shuttle around synaptic weight values from memory to processing units has been a key source of energy inefficiency associated with hardware implementation of artificial neural networks. Analog in-memory computing (AIMC) with spatially instantiated synaptic weights holds high promise to overcome this challenge, by performing matrix-vector multiplications (MVMs) directly withi…

    Submitted 6 December, 2022; originally announced December 2022.

    Journal ref: Nature Electronics 6, 680-693 (2023)

  12. arXiv:2211.15532  [pdf, other]

    cs.CL cs.CY

    YZR-net : Self-supervised Hidden representations Invariant to Transformations for profanity detection

    Authors: Vedant Sandeep Joshi, Sivanagaraja Tatinati, Yubo Wang

    Abstract: On current e-learning platforms, live classes are an important tool that provides students with an opportunity to get more involved while learning new concepts. In such classes, the element of interaction with teachers and fellow peers helps in removing learning silos and gives each student a chance to experience some aspects relevant to offline learning in this era of virtual classes. One c…

    Submitted 22 November, 2022; originally announced November 2022.

  13. EEG aided boosting of single-lead ECG based sleep staging with Deep Knowledge Distillation

    Authors: Vaibhav Joshi, Sricharan V, Preejith SP, Mohanasankar Sivaprakasam

    Abstract: An electroencephalogram (EEG) signal is currently accepted as a standard for automatic sleep staging. Lately, near-human accuracy in automated sleep staging has been achievable by Deep Learning (DL) based approaches, enabling multi-fold progress in this area. However, an extensive and expensive clinical setup is required for EEG based sleep staging. Additionally, the EEG setup being obtrusive in n…

    Submitted 18 November, 2022; originally announced November 2022.

    Comments: Medical Measurements and Applications (MeMeA-2022)

  14. arXiv:2208.09600  [pdf, other]

    cs.LG

    Looking For A Match: Self-supervised Clustering For Automatic Doubt Matching In e-learning Platforms

    Authors: Vedant Sandeep Joshi, Sivanagaraja Tatinati, Yubo Wang

    Abstract: Recently, e-learning platforms have grown as a place where students can post doubts (as a snap taken with smart phones) and get them resolved in minutes. However, the significant increase in the number of student-posted doubts with high variance in quality on these platforms not only presents challenges for teachers' navigation to address them but also increases the resolution time per doubt. Both…

    Submitted 20 August, 2022; originally announced August 2022.

  15. arXiv:2204.00348  [pdf, other]

    cs.CL cs.SD eess.AS

    WavFT: Acoustic model finetuning with labelled and unlabelled data

    Authors: Utkarsh Chauhan, Vikas Joshi, Rupesh R. Mehta

    Abstract: Unsupervised and self-supervised learning methods have leveraged unlabelled data to improve the pretrained models. However, these methods need significantly large amount of unlabelled data and the computational cost of training models with such large amount of data can be prohibitively high. We address this issue by using unlabelled data during finetuning, instead of pretraining. We propose acoust…

    Submitted 1 April, 2022; originally announced April 2022.

  16. arXiv:2112.07252  [pdf, other]

    eess.SP cs.AI

    A Deep Knowledge Distillation framework for EEG assisted enhancement of single-lead ECG based sleep staging

    Authors: Vaibhav Joshi, Sricharan Vijayarangan, Preejith SP, Mohanasankar Sivaprakasam

    Abstract: Automatic Sleep Staging study is presently done with the help of Electroencephalogram (EEG) signals. Recently, Deep Learning (DL) based approaches have enabled significant progress in this area, allowing for near-human accuracy in automated sleep staging. However, EEG based sleep staging requires an extensive as well as an expensive clinical setup. Moreover, the requirement of an expert for setup…

    Submitted 5 February, 2022; v1 submitted 14 December, 2021; originally announced December 2021.

    Comments: Preprint Only

  17. arXiv:2102.05271  [pdf, other]

    cs.AR cs.AI cs.ET cs.LG

    Hybrid In-memory Computing Architecture for the Training of Deep Neural Networks

    Authors: Vinay Joshi, Wangxin He, Jae-sun Seo, Bipin Rajendran

    Abstract: The cost involved in training deep neural networks (DNNs) on von-Neumann architectures has motivated the development of novel solutions for efficient DNN training accelerators. We propose a hybrid in-memory computing (HIC) architecture for the training of DNNs on hardware accelerators that results in memory-efficient inference and outperforms baseline software accuracy in benchmark tasks. We intro…

    Submitted 10 February, 2021; originally announced February 2021.

    Comments: Accepted at ISCAS 2021 for publication

  18. arXiv:2012.01930  [pdf, ps, other]

    cs.LG cs.CY

    Learning Explainable Interventions to Mitigate HIV Transmission in Sex Workers Across Five States in India

    Authors: Raghav Awasthi, Prachi Patel, Vineet Joshi, Shama Karkal, Tavpritesh Sethi

    Abstract: Female sex workers (FSWs) are one of the most vulnerable and stigmatized groups in society. As a result, they often suffer from a lack of quality access to care. Grassroot organizations engaged in improving health services are often faced with the challenge of improving the effectiveness of interventions due to complex influences. This work combines structure learning, discriminative modeling, and…

    Submitted 30 November, 2020; originally announced December 2020.

    Comments: Presented at NeurIPS 2020 Workshop on Machine Learning for the Developing World

  19. arXiv:2008.05086  [pdf, other]

    eess.AS cs.CL cs.LG cs.SD

    Transfer Learning Approaches for Streaming End-to-End Speech Recognition System

    Authors: Vikas Joshi, Rui Zhao, Rupesh R. Mehta, Kshitiz Kumar, Jinyu Li

    Abstract: Transfer learning (TL) is widely used in conventional hybrid automatic speech recognition (ASR) system, to transfer the knowledge from source to target language. TL can be applied to end-to-end (E2E) ASR system such as recurrent neural network transducer (RNN-T) models, by initializing the encoder and/or prediction network of the target language with the pre-trained models from source language. In…

    Submitted 17 August, 2020; v1 submitted 11 August, 2020; originally announced August 2020.

  20. arXiv:2007.13417  [pdf, other]

    physics.app-ph cs.CV eess.IV

    Image-driven discriminative and generative machine learning algorithms for establishing microstructure-processing relationships

    Authors: Wufei Ma, Elizabeth Kautz, Arun Baskaran, Aritra Chowdhury, Vineet Joshi, Bülent Yener, Daniel Lewis

    Abstract: We investigate methods of microstructure representation for the purpose of predicting processing condition from microstructure image data. A binary alloy (uranium-molybdenum) that is currently under development as a nuclear fuel was studied for the purpose of developing an improved machine learning approach to image recognition, characterization, and building predictive capabilities linking micros…

    Submitted 27 July, 2020; originally announced July 2020.

    Comments: 14 pages, 15 figures

  21. arXiv:2006.05257  [pdf, other]

    eess.AS cs.CL cs.SD

    Learning not to Discriminate: Task Agnostic Learning for Improving Monolingual and Code-switched Speech Recognition

    Authors: Gurunath Reddy Madhumani, Sanket Shah, Basil Abraham, Vikas Joshi, Sunayana Sitaram

    Abstract: Recognizing code-switched speech is challenging for Automatic Speech Recognition (ASR) for a variety of reasons, including the lack of code-switched training data. Recently, we showed that monolingual ASR systems fine-tuned on code-switched data deteriorate in performance on monolingual speech recognition, which is not desirable as ASR systems deployed in multilingual scenarios should recognize bo…

    Submitted 9 June, 2020; originally announced June 2020.

    Comments: 5 pages (4 pages + 1 reference), 3 tables, 2 figures

  22. arXiv:2006.00782  [pdf, other]

    eess.AS cs.CL cs.SD

    Learning to Recognize Code-switched Speech Without Forgetting Monolingual Speech Recognition

    Authors: Sanket Shah, Basil Abraham, Gurunath Reddy M, Sunayana Sitaram, Vikas Joshi

    Abstract: Recently, there has been significant progress made in Automatic Speech Recognition (ASR) of code-switched speech, leading to gains in accuracy on code-switched datasets in many language pairs. Code-switched speech co-occurs with monolingual speech in one or both languages being mixed. In this work, we show that fine-tuning ASR models on code-switched speech harms performance on monolingual speech.…

    Submitted 1 June, 2020; originally announced June 2020.

    Comments: 5 pages (4 pages + 1 page references), 5 tables, 1 figure, 1 algorithm, 16 references

  23. arXiv:2003.11256  [pdf, other]

    cs.LG cs.NE

    ESSOP: Efficient and Scalable Stochastic Outer Product Architecture for Deep Learning

    Authors: Vinay Joshi, Geethan Karunaratne, Manuel Le Gallo, Irem Boybat, Christophe Piveteau, Abu Sebastian, Bipin Rajendran, Evangelos Eleftheriou

    Abstract: Deep neural networks (DNNs) have surpassed human-level accuracy in a variety of cognitive tasks but at the cost of significant memory/time requirements in DNN training. This limits their deployment in energy and memory limited applications that require real-time learning. Matrix-vector multiplications (MVM) and vector-vector outer product (VVOP) are the two most expensive operations associated wit…

    Submitted 25 March, 2020; originally announced March 2020.

    Comments: 5 pages. 5 figures. Accepted at ISCAS 2020 for publication

  24. Mixed-precision deep learning based on computational memory

    Authors: S. R. Nandakumar, Manuel Le Gallo, Christophe Piveteau, Vinay Joshi, Giovanni Mariani, Irem Boybat, Geethan Karunaratne, Riduan Khaddam-Aljameh, Urs Egger, Anastasios Petropoulos, Theodore Antonakopoulos, Bipin Rajendran, Abu Sebastian, Evangelos Eleftheriou

    Abstract: Deep neural networks (DNNs) have revolutionized the field of artificial intelligence and have achieved unprecedented success in cognitive tasks such as image and speech recognition. Training of large DNNs, however, is computationally intensive and this has motivated the search for novel computing architectures targeting this application. A computational memory unit with nanoscale resistive memory…

    Submitted 31 January, 2020; originally announced January 2020.

    Journal ref: Frontiers in Neuroscience 14:406 (2020)

  25. arXiv:1909.04164  [pdf, other]

    cs.CL

    Knowledge Enhanced Contextual Word Representations

    Authors: Matthew E. Peters, Mark Neumann, Robert L. Logan IV, Roy Schwartz, Vidur Joshi, Sameer Singh, Noah A. Smith

    Abstract: Contextual word representations, typically trained on unstructured, unlabeled text, do not contain any explicit grounding to real world entities and are often unable to remember facts about those entities. We propose a general method to embed multiple knowledge bases (KBs) into large scale models, and thereby enhance their representations with structured, human-curated knowledge. For each KB, we f…

    Submitted 30 October, 2019; v1 submitted 9 September, 2019; originally announced September 2019.

    Comments: EMNLP 2019

  26. arXiv:1906.05496  [pdf, other]

    physics.app-ph cs.CV cs.LG

    An image-driven machine learning approach to kinetic modeling of a discontinuous precipitation reaction

    Authors: Elizabeth Kautz, Wufei Ma, Saumyadeep Jana, Arun Devaraj, Vineet Joshi, Bülent Yener, Daniel Lewis

    Abstract: Micrograph quantification is an essential component of several materials science studies. Machine learning methods, in particular convolutional neural networks, have previously demonstrated performance in image recognition tasks across several disciplines (e.g. materials science, medical imaging, facial recognition). Here, we apply these well-established methods to develop an approach to microstru…

    Submitted 13 June, 2019; originally announced June 2019.

    Comments: 30 pages, 8 figures

  27. Accurate deep neural network inference using computational phase-change memory

    Authors: Vinay Joshi, Manuel Le Gallo, Simon Haefeli, Irem Boybat, S. R. Nandakumar, Christophe Piveteau, Martino Dazzi, Bipin Rajendran, Abu Sebastian, Evangelos Eleftheriou

    Abstract: In-memory computing is a promising non-von Neumann approach for making energy-efficient deep learning inference hardware. Crossbar arrays of resistive memory devices can be used to encode the network weights and perform efficient analog matrix-vector multiplications without intermediate movements of data. However, due to device variability and noise, the network needs to be trained in a specific w…

    Submitted 11 April, 2020; v1 submitted 7 June, 2019; originally announced June 2019.

    Comments: This is a pre-print of an article accepted for publication in Nature Communications

    Journal ref: Nature Communications 11, Article number: 2473 (2020)

  28. arXiv:1805.06556  [pdf, other]

    cs.CL

    Extending a Parser to Distant Domains Using a Few Dozen Partially Annotated Examples

    Authors: Vidur Joshi, Matthew Peters, Mark Hopkins

    Abstract: We revisit domain adaptation for parsers in the neural era. First we show that recent advances in word representations greatly diminish the need for domain adaptation when the target domain is syntactically similar to the source domain. As evidence, we train a parser on the Wall Street Journal alone that achieves over 90% F1 on the Brown corpus. For more syntactically distant domains, we provi…

    Submitted 16 May, 2018; originally announced May 2018.

    Comments: ACL 2018

  29. arXiv:1307.4048  [pdf, ps, other]

    cs.LG cs.CV stat.ML

    Modified SPLICE and its Extension to Non-Stereo Data for Noise Robust Speech Recognition

    Authors: D. S. Pavan Kumar, N. Vishnu Prasad, Vikas Joshi, S. Umesh

    Abstract: In this paper, a modification to the training process of the popular SPLICE algorithm has been proposed for noise robust speech recognition. The modification is based on feature correlations, and enables this stereo-based algorithm to improve the performance in all noise conditions, especially in unseen cases. Further, the modified framework is extended to work for non-stereo datasets where clean…

    Submitted 15 July, 2013; originally announced July 2013.

    Comments: Submitted to Automatic Speech Recognition and Understanding (ASRU) 2013 Workshop