Showing 1–24 of 24 results for author: M, G

Searching in archive cs.

  1. arXiv:2405.20735  [pdf, other]

    cs.CV

    Language Augmentation in CLIP for Improved Anatomy Detection on Multi-modal Medical Images

    Authors: Mansi Kakkar, Dattesh Shanbhag, Chandan Aladahalli, Gurunath Reddy M

    Abstract: Vision-language models have emerged as a powerful tool for previously challenging multi-modal classification problems in the medical domain. This development has led to the exploration of automated image description generation for multi-modal clinical scans, particularly for radiology report generation. Existing research has focused on clinical descriptions for specific modalities or body regions,…

    Submitted 31 May, 2024; originally announced May 2024.

    Comments: © 2024 IEEE. Accepted in 46th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC) 2024

  2. arXiv:2405.12018  [pdf, other]

    cs.CV

    Continuous Sign Language Recognition with Adapted Conformer via Unsupervised Pretraining

    Authors: Neena Aloysius, Geetha M, Prema Nedungadi

    Abstract: Conventional Deep Learning frameworks for continuous sign language recognition (CSLR) are comprised of a single or multi-modal feature extractor, a sequence-learning module, and a decoder for outputting the glosses. The sequence learning module is a crucial part wherein transformers have demonstrated their efficacy in the sequence-to-sequence tasks. Analyzing the research progress in the field of…

    Submitted 20 May, 2024; originally announced May 2024.

  3. arXiv:2401.13891  [pdf]

    cs.SE

    Text to speech synthesis

    Authors: Harini s, Manoj G M

    Abstract: Text-to-speech (TTS) synthesis is a technology that converts written text into spoken words, enabling a natural and accessible means of communication. This abstract explores the key aspects of TTS synthesis, encompassing its underlying technologies, applications, and implications for various sectors. The technology utilizes advanced algorithms and linguistic models to convert textual information i…

    Submitted 24 January, 2024; originally announced January 2024.

  4. arXiv:2401.02541  [pdf]

    cs.RO cond-mat.mtrl-sci

    Autonomous Multi-Rotor UAVs: A Holistic Approach to Design, Optimization, and Fabrication

    Authors: Aniruth A, Chirag Satpathy, Jothika K, Nitteesh M, Gokulraj M, Venkatram K, Harshith G, Shristi S, Anushka Vani, Jonathan Spurgeon

    Abstract: Unmanned Aerial Vehicles (UAVs) have become pivotal in domains spanning military, agriculture, surveillance, and logistics, revolutionizing data collection and environmental interaction. With the advancement in drone technology, there is a compelling need to develop a holistic methodology for designing UAVs. This research focuses on establishing a procedure encompassing conceptual design, use of c…

    Submitted 4 January, 2024; originally announced January 2024.

  5. arXiv:2310.18642  [pdf]

    cs.CV cs.AI

    One-shot Localization and Segmentation of Medical Images with Foundation Models

    Authors: Deepa Anand, Gurunath Reddy M, Vanika Singhal, Dattesh D. Shanbhag, Shriram KS, Uday Patil, Chitresh Bhushan, Kavitha Manickam, Dawei Gui, Rakesh Mullick, Avinash Gopal, Parminder Bhatia, Taha Kass-Hout

    Abstract: Recent advances in Vision Transformers (ViT) and Stable Diffusion (SD) models with their ability to capture rich semantic features of the image have been used for image correspondence tasks on natural images. In this paper, we examine the ability of a variety of pre-trained ViT (DINO, DINOv2, SAM, CLIP) and SD models, trained exclusively on natural images, for solving the correspondence problems o…

    Submitted 28 October, 2023; originally announced October 2023.

    Comments: Accepted at NeurIPS 2023 R0-FoMo Workshop

  6. arXiv:2301.10015  [pdf, other]

    cs.SD cs.AI eess.AS

    Deep Attention-Based Alignment Network for Melody Generation from Incomplete Lyrics

    Authors: Gurunath Reddy M, Zhe Zhang, Yi Yu, Florian Harscoet, Simon Canales, Suhua Tang

    Abstract: We propose a deep attention-based alignment network, which aims to automatically predict lyrics and melody with given incomplete lyrics as input in a way similar to the music creation of humans. Most importantly, a deep neural lyrics-to-melody net is trained in an encoder-decoder way to predict possible pairs of lyrics-melody when given incomplete lyrics (few keywords). The attention mechanism is…

    Submitted 22 January, 2023; originally announced January 2023.

    Comments: arXiv admin note: substantial text overlap with arXiv:2011.06380

  7. arXiv:2209.15186  [pdf, other]

    cs.ET

    Leveraging Probabilistic Switching in Superparamagnets for Temporal Information Encoding in Neuromorphic Systems

    Authors: Kezhou Yang, Dhuruva Priyan G M, Abhronil Sengupta

    Abstract: Brain-inspired computing - leveraging neuroscientific principles underpinning the unparalleled efficiency of the brain in solving cognitive tasks - is emerging to be a promising pathway to solve several algorithmic and computational challenges faced by deep learning today. Nonetheless, current research in neuromorphic computing is driven by our well-developed notions of running deep learning algor…

    Submitted 11 January, 2023; v1 submitted 29 September, 2022; originally announced September 2022.

  8. arXiv:2209.03149  [pdf, other]

    cs.SI

    MultiViz: A Gephi Plugin for Scalable Visualization of Multi-Layer Networks

    Authors: Jayamohan Pillai C. S., Ayan Chatterjee, Geetha M., Amitava Mukherjee

    Abstract: The process of visually presenting networks is an effective way to understand entity relationships within the networks since it reveals the overall structure and topology of the network. Real networks are extremely difficult to visualize due to their immense complexity, which includes vast amounts of data, several types of interactions, various subsystems and several levels of connectivity as well…

    Submitted 6 September, 2022; originally announced September 2022.

  9. arXiv:2206.07910  [pdf, ps, other]

    cs.CR cs.LG

    Introducing the Huber mechanism for differentially private low-rank matrix completion

    Authors: R Adithya Gowtham, Gokularam M, Thulasi Tholeti, Sheetal Kalyani

    Abstract: Performing low-rank matrix completion with sensitive user data calls for privacy-preserving approaches. In this work, we propose a novel noise addition mechanism for preserving differential privacy where the noise distribution is inspired by Huber loss, a well-known loss function in robust statistics. The proposed Huber mechanism is evaluated against existing differential privacy mechanisms while…

    Submitted 16 June, 2022; originally announced June 2022.

    Comments: 13 pages

  10. arXiv:2202.01078  [pdf, other]

    cs.SD eess.AS

    Melody Extraction from Polyphonic Music by Deep Learning Approaches: A Review

    Authors: Gurunath Reddy M, K. Sreenivasa Rao, Partha Pratim Das

    Abstract: Melody extraction is a vital music information retrieval task among music researchers for its potential applications in education pedagogy and the music industry. Melody extraction is a notoriously challenging task due to the presence of background instruments. Also, the melodic source often exhibits characteristics similar to those of the other instruments. The interfering background accompaniment wit…

    Submitted 2 February, 2022; originally announced February 2022.

    Comments: 72 pages

  11. arXiv:2106.06980  [pdf]

    eess.IV cs.CV

    An Approach Towards Physics Informed Lung Ultrasound Image Scoring Neural Network for Diagnostic Assistance in COVID-19

    Authors: Mahesh Raveendranatha Panicker, Yale Tung Chen, Gayathri M, Madhavanunni A N, Kiran Vishnu Narayan, C Kesavadas, A P Vinod

    Abstract: Ultrasound is fast becoming an inevitable diagnostic tool for regular and continuous monitoring of the lung with the recent outbreak of COVID-19. In this work, a novel approach is presented to extract acoustic propagation-based features to automatically highlight the region below pleura, which is an important landmark in lung ultrasound (LUS). Subsequently, a multichannel input formed by using the…

    Submitted 13 June, 2021; originally announced June 2021.

    Comments: 8 pages, 8 figures, 3 tables, submitted to Springer SIVP Special Issue for COVID19

  12. arXiv:2011.04297  [pdf, other]

    cs.SD cs.CL cs.LG eess.AS

    Knowledge Distillation for Singing Voice Detection

    Authors: Soumava Paul, Gurunath Reddy M, K Sreenivasa Rao, Partha Pratim Das

    Abstract: Singing Voice Detection (SVD) has been an active area of research in music information retrieval (MIR). Currently, two deep neural network-based methods, one based on CNN and the other on RNN, exist in literature that learn optimized features for the voice detection (VD) task and achieve state-of-the-art performance on common datasets. Both these models have a huge number of parameters (1.4M for C…

    Submitted 19 August, 2021; v1 submitted 9 November, 2020; originally announced November 2020.

    Comments: Accepted at INTERSPEECH 2021. 5 pages, 3 figures

  13. arXiv:2010.06142  [pdf, other]

    cs.LG

    Hindsight Experience Replay with Kronecker Product Approximate Curvature

    Authors: Dhuruva Priyan G M, Abhik Singla, Shalabh Bhatnagar

    Abstract: Hindsight Experience Replay (HER) is one of the efficient algorithms to solve Reinforcement Learning tasks related to sparse rewarded environments. But due to its reduced sample efficiency and slower convergence HER fails to perform effectively. Natural gradients solve these challenges by converging the model parameters better. It avoids taking bad actions that collapse the training performance. Ho…

    Submitted 9 October, 2020; originally announced October 2020.

    Comments: arXiv admin note: text overlap with arXiv:1708.05144 by other authors

  14. arXiv:2008.00106  [pdf, other]

    cs.CV

    Utilising Visual Attention Cues for Vehicle Detection and Tracking

    Authors: Feiyan Hu, Venkatesh G M, Noel E. O'Connor, Alan F. Smeaton, Suzanne Little

    Abstract: Advanced Driver-Assistance Systems (ADAS) have been attracting attention from many researchers. Vision-based sensors are the closest way to emulate human driver visual behavior while driving. In this paper, we explore possible ways to use visual attention (saliency) for object detection and tracking. We investigate: 1) How a visual attention map such as a subjectness attention or saliency m…

    Submitted 31 July, 2020; originally announced August 2020.

    Comments: Accepted in ICPR2020

  15. arXiv:2006.00782  [pdf, other]

    eess.AS cs.CL cs.SD

    Learning to Recognize Code-switched Speech Without Forgetting Monolingual Speech Recognition

    Authors: Sanket Shah, Basil Abraham, Gurunath Reddy M, Sunayana Sitaram, Vikas Joshi

    Abstract: Recently, there has been significant progress made in Automatic Speech Recognition (ASR) of code-switched speech, leading to gains in accuracy on code-switched datasets in many language pairs. Code-switched speech co-occurs with monolingual speech in one or both languages being mixed. In this work, we show that fine-tuning ASR models on code-switched speech harms performance on monolingual speech.…

    Submitted 1 June, 2020; originally announced June 2020.

    Comments: 5 pages (4 pages + 1 page references), 5 tables, 1 figure, 1 algorithm, 16 references

  16. arXiv:2003.12017  [pdf]

    q-bio.PE cs.LG

    Prediction of number of cases expected and estimation of the final size of coronavirus epidemic in India using the logistic model and genetic algorithm

    Authors: Ganesh Kumar M, Soman K. P, Gopalakrishnan E. A, Vijay Krishna Menon, Sowmya V

    Abstract: In this paper, we have applied the logistic growth regression model and genetic algorithm to predict the number of coronavirus infected cases that can be expected in upcoming days in India and also estimated the final size and its peak time of the coronavirus epidemic in India.

    Submitted 26 March, 2020; originally announced March 2020.

  17. arXiv:1909.04406  [pdf, ps, other]

    stat.ML cs.LG eess.SP

    Subspace clustering without knowing the number of clusters: A parameter free approach

    Authors: Vishnu Menon, Gokularam M, Sheetal Kalyani

    Abstract: Subspace clustering, the task of clustering high dimensional data when the data points come from a union of subspaces, is one of the fundamental tasks in unsupervised machine learning. Most of the existing algorithms for this task require prior knowledge of the number of clusters along with a few additional parameters which need to be set or tuned a priori according to the type of data to be clustered…

    Submitted 20 June, 2020; v1 submitted 10 September, 2019; originally announced September 2019.

  18. arXiv:1905.09231  [pdf, other]

    cs.CV eess.IV

    Separating Overlapping Tissue Layers from Microscopy Images

    Authors: Zahra Montazeri, Gopi M

    Abstract: Manual preparation of tissue slices for microscopy imaging can introduce tissue tears and overlaps. Typically, further digital processing algorithms such as registration and 3D reconstruction from tissue image stacks cannot handle images with tissue tear/overlap artifacts, and so such images are usually discarded. In this paper, we propose an imaging model and an algorithm to digitally separate ov…

    Submitted 22 May, 2019; originally announced May 2019.

  19. arXiv:1904.09765  [pdf, other]

    cs.SD cs.LG eess.AS stat.ML

    hf0: A hybrid pitch extraction method for multimodal voice

    Authors: Pradeep Rengaswamy, Gurunath Reddy M, Krothapalli Sreenivasa Rao

    Abstract: Pitch or fundamental frequency (f0) extraction is a fundamental problem studied extensively for its potential applications in speech and clinical applications. In literature, explicit mode specific (modal speech or singing voice or emotional/expressive speech or noisy speech) signal processing and deep learning f0 extraction methods that exploit the quasi periodic nature of the signal in time, ha…

    Submitted 22 April, 2019; originally announced April 2019.

    Comments: Pitch Extraction, F0 extraction, harmonic signals, speech, monophonic songs, Convolutional Neural Network, 5 pages, 5 figures

  20. arXiv:1811.09956  [pdf, other]

    cs.SD cs.LG eess.AS stat.ML

    Glottal Closure Instants Detection From Pathological Acoustic Speech Signal Using Deep Learning

    Authors: Gurunath Reddy M, Tanumay Mandal, Krothapalli Sreenivasa Rao

    Abstract: In this paper, we propose a classification based glottal closure instants (GCI) detection from pathological acoustic speech signal, which finds many applications in vocal disorder analysis. Till date, GCI for pathological disorder is extracted from laryngeal (glottal source) signal recorded from Electroglottograph, a dedicated device designed to measure the vocal folds vibration around the larynx.…

    Submitted 25 November, 2018; originally announced November 2018.

    Comments: Machine Learning for Health (ML4H) Workshop at NeurIPS 2018 arXiv:1811.07216

    Report number: ML4H/2018/39

  21. arXiv:1501.01364  [pdf]

    cs.CV

    Leader Follower Formation Control of Ground Vehicles Using Camshift Based Guidance

    Authors: S. M. Vaitheeswaran, Bharath M. K., Gokul M

    Abstract: Autonomous ground vehicles have been designed for formation control that relies on ranging and bearing information received from a forward-looking camera. A visual guidance control algorithm is designed where real time image processing is used to provide feedback signals. The vision subsystem and control subsystem work in parallel to accomplish formation control. A proportiona…

    Submitted 6 January, 2015; originally announced January 2015.

  22. arXiv:1410.7654  [pdf]

    cs.IR

    XML Information Retrieval: An overview

    Authors: Suma D., U. Dinesh Acharya, Geetha M., Raviraja Holla M

    Abstract: Locating and distilling the valuable relevant information continue to be the major challenges of Information Retrieval (IR) Systems owing to the explosive growth of online web information. These challenges can be considered the XML Information Retrieval challenges as XML has become a de facto standard over the Web. The research on XML IR starts with the classical IR strategies customized to XML I…

    Submitted 27 October, 2014; originally announced October 2014.

    Comments: 7 pages, 0 figures

    Journal ref: International Global Journal For Engineering Research, Volume 10 Issue 1, 2014 pg. 26-32

  23. arXiv:1312.3787  [pdf]

    cs.CV

    Analysis and Understanding of Various Models for Efficient Representation and Accurate Recognition of Human Faces

    Authors: Dharini S., Guru Prasad M., Hari haran. V., Kiran Tej J. L., Kunal Ghosh

    Abstract: In this paper we have tried to compare the various face recognition models against their classical problems. We look at the methods followed by these approaches and evaluate to what extent they are able to solve the problems. All methods proposed have some drawbacks under certain conditions. To overcome these drawbacks we propose a multi-model approach.

    Submitted 14 February, 2015; v1 submitted 13 December, 2013; originally announced December 2013.

    Comments: Proceedings of National Conference on "Emerging Trends in IT" - eit10, March 2010

  24. arXiv:1305.3213  [pdf]

    cs.CY

    The Product Promotion and Consumer Retention Gap in Online Shopping

    Authors: Senthur Balan S, Sowmyan Jegatheesan, Sakthi Ganesh M

    Abstract: As the number of online shopping websites increases day by day, so do the online advertisement strategies and promotional techniques. The number of people who use the internet keeps increasing daily, and it has become a vast marketplace to promote products; surely it will be a prime reason to drive any company's growth in the future. This paper primarily focuses on the areas on which online shoppin…

    Submitted 14 May, 2013; originally announced May 2013.

    Comments: 4 pages, 1 table. 2012 4th International Conference on Electronics Computer Technology (ICECT 2012), 978-1-4673-1850-1/12, 2012 IEEE, pages 158-161