Showing 1–50 of 60 results for author: M, R

Searching in archive cs.
  1. arXiv:2405.20735  [pdf, other]

    cs.CV

    Language Augmentation in CLIP for Improved Anatomy Detection on Multi-modal Medical Images

    Authors: Mansi Kakkar, Dattesh Shanbhag, Chandan Aladahalli, Gurunath Reddy M

    Abstract: Vision-language models have emerged as a powerful tool for previously challenging multi-modal classification problems in the medical domain. This development has led to the exploration of automated image description generation for multi-modal clinical scans, particularly for radiology report generation. Existing research has focused on clinical descriptions for specific modalities or body regions,…

    Submitted 31 May, 2024; originally announced May 2024.

    Comments: © 2024 IEEE. Accepted at the 46th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC) 2024

  2. arXiv:2402.14693  [pdf, ps, other]

    cs.NI cs.IT

    Joint AP-UE Association and Power Factor Optimization for Distributed Massive MIMO

    Authors: Mohd Saif Ali Khan, Samar Agnihotri, Karthik R. M

    Abstract: The uplink sum-throughput of distributed massive multiple-input-multiple-output (mMIMO) networks depends largely on access point (AP)-user equipment (UE) association and power control. AP-UE association and power control are both important problems in their own right in distributed mMIMO networks to improve scalability and reduce front-haul load of the network, and to enhance the system perfor…

    Submitted 28 June, 2024; v1 submitted 22 February, 2024; originally announced February 2024.

    Comments: Accepted at the IEEE International Symposium on Personal, Indoor and Mobile Radio Communications (PIMRC) 2024

  3. arXiv:2402.07912  [pdf, other]

    cs.HC cs.AI

    Spatial Computing: Concept, Applications, Challenges and Future Directions

    Authors: Gokul Yenduri, Ramalingam M, Praveen Kumar Reddy Maddikunta, Thippa Reddy Gadekallu, Rutvij H Jhaveri, Ajay Bandi, Junxin Chen, Wei Wang, Adarsh Arunkumar Shirawalmath, Raghav Ravishankar, Weizheng Wang

    Abstract: Spatial computing is a technological advancement that facilitates the seamless integration of devices into the physical environment, resulting in a more natural and intuitive digital world user experience. Spatial computing has the potential to become a significant advancement in the field of computing. From GPS and location-based services to healthcare, spatial computing technologies have influen…

    Submitted 30 January, 2024; originally announced February 2024.

    Comments: Submitted to peer review

  4. arXiv:2401.05422  [pdf, ps, other]

    eess.SP cs.AI

    Machine Learning (ML)-assisted Beam Management in millimeter (mm)Wave Distributed Multiple Input Multiple Output (D-MIMO) systems

    Authors: Karthik R M, Dhiraj Nagaraja Hegde, Muris Sarajlic, Abhishek Sarkar

    Abstract: Beam management (BM) protocols are critical for establishing and maintaining connectivity between network radio nodes and User Equipments (UEs). In Distributed Multiple Input Multiple Output systems (D-MIMO), a number of access points (APs), coordinated by a central processing unit (CPU), serve a number of UEs. At mmWave frequencies, the problem of finding the best AP and beam to serve the UEs is…

    Submitted 30 December, 2023; originally announced January 2024.

  5. arXiv:2401.02472  [pdf, ps, other]

    cs.DC

    Code Generation for a Variety of Accelerators for a Graph DSL

    Authors: Ashwina Kumar, M. Venkata Krishna, Prasanna Bartakke, Rahul Kumar, Rajesh Pandian M, Nibedita Behera, Rupesh Nasre

    Abstract: Sparse graphs are ubiquitous in real and virtual worlds. With the phenomenal growth in semi-structured and unstructured data, sizes of the underlying graphs have witnessed a rapid growth over the years. Analyzing such large structures necessitates parallel processing, which is challenged by the intrinsic irregularity of sparse computation, memory access, and communication. It would be ideal if pro…

    Submitted 4 January, 2024; originally announced January 2024.

    Comments: arXiv admin note: text overlap with arXiv:2305.03317

  6. arXiv:2312.11805  [pdf, other]

    cs.CL cs.AI cs.CV

    Gemini: A Family of Highly Capable Multimodal Models

    Authors: Gemini Team, Rohan Anil, Sebastian Borgeaud, Jean-Baptiste Alayrac, Jiahui Yu, Radu Soricut, Johan Schalkwyk, Andrew M. Dai, Anja Hauth, Katie Millican, David Silver, Melvin Johnson, Ioannis Antonoglou, Julian Schrittwieser, Amelia Glaese, Jilin Chen, Emily Pitler, Timothy Lillicrap, Angeliki Lazaridou, Orhan Firat, James Molloy, Michael Isard, Paul R. Barham, Tom Hennigan, Benjamin Lee, et al. (1325 additional authors not shown)

    Abstract: This report introduces a new family of multimodal models, Gemini, that exhibit remarkable capabilities across image, audio, video, and text understanding. The Gemini family consists of Ultra, Pro, and Nano sizes, suitable for applications ranging from complex reasoning tasks to on-device memory-constrained use-cases. Evaluation on a broad range of benchmarks shows that our most-capable Gemini Ultr…

    Submitted 17 June, 2024; v1 submitted 18 December, 2023; originally announced December 2023.

  7. arXiv:2312.05609  [pdf, other]

    quant-ph cs.CR cs.CY cs.ET cs.NI

    Comprehensive Analysis of BB84, A Quantum Key Distribution Protocol

    Authors: SujayKumar Reddy M, Chandra Mohan B

    Abstract: Quantum Key Distribution (QKD) is a technique that enables secure communication between two parties by sharing a secret key. One of the most well-known QKD protocols is the BB84 protocol, proposed by Charles Bennett and Gilles Brassard in 1984. In this protocol, Alice and Bob use a quantum channel to exchange qubits, allowing them to generate a shared key that is resistant to eavesdropping. This p…

    Submitted 9 December, 2023; originally announced December 2023.

    Comments: 16 pages, 17 figures

  8. arXiv:2310.18642  [pdf]

    cs.CV cs.AI

    One-shot Localization and Segmentation of Medical Images with Foundation Models

    Authors: Deepa Anand, Gurunath Reddy M, Vanika Singhal, Dattesh D. Shanbhag, Shriram KS, Uday Patil, Chitresh Bhushan, Kavitha Manickam, Dawei Gui, Rakesh Mullick, Avinash Gopal, Parminder Bhatia, Taha Kass-Hout

    Abstract: Recent advances in Vision Transformers (ViT) and Stable Diffusion (SD) models with their ability to capture rich semantic features of the image have been used for image correspondence tasks on natural images. In this paper, we examine the ability of a variety of pre-trained ViT (DINO, DINOv2, SAM, CLIP) and SD models, trained exclusively on natural images, for solving the correspondence problems o…

    Submitted 28 October, 2023; originally announced October 2023.

    Comments: Accepted at NeurIPS 2023 R0-FoMo Workshop

  9. arXiv:2310.07264  [pdf, other]

    cs.LG

    Classification of Dysarthria based on the Levels of Severity. A Systematic Review

    Authors: Afnan Al-Ali, Somaya Al-Maadeed, Moutaz Saleh, Rani Chinnappa Naidu, Zachariah C Alex, Prakash Ramachandran, Rajeev Khoodeeram, Rajesh Kumar M

    Abstract: Dysarthria is a neurological speech disorder that can significantly impact affected individuals' communication abilities and overall quality of life. The accurate and objective classification of dysarthria and the determination of its severity are crucial for effective therapeutic intervention. While traditional assessments by speech-language pathologists (SLPs) are common, they are often subjecti…

    Submitted 11 October, 2023; originally announced October 2023.

  10. arXiv:2309.15709  [pdf, ps, other]

    cs.NI cs.IT

    Distributed Pilot Assignment for Distributed Massive-MIMO Networks

    Authors: Mohd Saif Ali Khan, Samar Agnihotri, Karthik R. M

    Abstract: Pilot contamination is a critical issue in distributed massive MIMO networks, where the reuse of pilot sequences due to limited availability of orthogonal pilots for channel estimation leads to performance degradation. In this work, we propose a novel distributed pilot assignment scheme to effectively mitigate the impact of pilot contamination. Our proposed scheme not only reduces signaling overhe…

    Submitted 1 July, 2024; v1 submitted 27 September, 2023; originally announced September 2023.

    Comments: Presented at the IEEE Wireless Communications and Networking Conference (WCNC) 2024

  11. arXiv:2307.16745  [pdf, other]

    cs.CV cs.AI cs.CY cs.MM

    Advancing Smart Malnutrition Monitoring: A Multi-Modal Learning Approach for Vital Health Parameter Estimation

    Authors: Ashish Marisetty, Prathistith Raj M, Praneeth Nemani, Venkanna Udutalapally, Debanjan Das

    Abstract: Malnutrition poses a significant threat to global health, resulting from an inadequate intake of essential nutrients that adversely impacts vital organs and overall bodily functioning. Periodic examinations and mass screenings, incorporating both conventional and non-invasive techniques, have been employed to combat this challenge. However, these approaches suffer from critical limitations, such a…

    Submitted 31 July, 2023; originally announced July 2023.

  12. arXiv:2307.06659  [pdf, other]

    cs.CR cs.CV cs.CY

    A Comprehensive Analysis of Blockchain Applications for Securing Computer Vision Systems

    Authors: Ramalingam M, Chemmalar Selvi, Nancy Victor, Rajeswari Chengoden, Sweta Bhattacharya, Praveen Kumar Reddy Maddikunta, Duehee Lee, Md. Jalil Piran, Neelu Khare, Gokul Yendri, Thippa Reddy Gadekallu

    Abstract: Blockchain (BC) and Computer Vision (CV) are the two emerging fields with the potential to transform various sectors. The ability of BC can help in offering decentralized and secure data storage, while CV allows machines to learn and understand visual data. This integration of the two technologies holds massive promise for developing innovative applications that can provide solutions to the challen…

    Submitted 13 July, 2023; originally announced July 2023.

  13. arXiv:2305.18734  [pdf, other]

    cs.NE

    IcSDE+ -- An Indicator for Constrained Multi-Objective Optimization

    Authors: Oladayo S. Ajani, Rammohan Mallipeddi, Sri Srinivasa Raju M

    Abstract: The effectiveness of Constrained Multi-Objective Evolutionary Algorithms (CMOEAs) depends on their ability to reach the different feasible regions during evolution, by exploiting the information present in infeasible solutions, in addition to optimizing the several conflicting objectives. Over the years, researchers have proposed several CMOEAs to handle CMOPs. However, among the different CMOEAs…

    Submitted 30 May, 2023; originally announced May 2023.

    Comments: 13 pages, 2 main figures

  14. arXiv:2305.10435  [pdf, other]

    cs.CL cs.AI

    Generative Pre-trained Transformer: A Comprehensive Review on Enabling Technologies, Potential Applications, Emerging Challenges, and Future Directions

    Authors: Gokul Yenduri, Ramalingam M, Chemmalar Selvi G, Supriya Y, Gautam Srivastava, Praveen Kumar Reddy Maddikunta, Deepti Raj G, Rutvij H Jhaveri, Prabadevi B, Weizheng Wang, Athanasios V. Vasilakos, Thippa Reddy Gadekallu

    Abstract: The Generative Pre-trained Transformer (GPT) represents a notable breakthrough in the domain of natural language processing, which is propelling us toward the development of machines that can understand and communicate using language in a manner that closely resembles that of humans. GPT is based on the transformer architecture, a deep neural network designed for natural language processing tasks.…

    Submitted 21 May, 2023; v1 submitted 11 May, 2023; originally announced May 2023.

    Comments: Submitted to peer review

  15. arXiv:2305.03317  [pdf, other]

    cs.DC

    StarPlat: A Versatile DSL for Graph Analytics

    Authors: Nibedita Behera, Ashwina Kumar, Ebenezer Rajadurai T, Sai Nitish, Rajesh Pandian M, Rupesh Nasre

    Abstract: Graphs model several real-world phenomena. With the growth of unstructured and semi-structured data, parallelization of graph algorithms is inevitable. Unfortunately, due to inherent irregularity of computation, memory access, and communication, graph algorithms are traditionally challenging to parallelize. To tame this challenge, several libraries, frameworks, and domain-specific languages (DSLs)…

    Submitted 5 May, 2023; originally announced May 2023.

    Comments: 30 pages, 21 figures

  16. arXiv:2301.10015  [pdf, other]

    cs.SD cs.AI eess.AS

    Deep Attention-Based Alignment Network for Melody Generation from Incomplete Lyrics

    Authors: Gurunath Reddy M, Zhe Zhang, Yi Yu, Florian Harscoet, Simon Canales, Suhua Tang

    Abstract: We propose a deep attention-based alignment network, which aims to automatically predict lyrics and melody with given incomplete lyrics as input in a way similar to the music creation of humans. Most importantly, a deep neural lyrics-to-melody net is trained in an encoder-decoder way to predict possible pairs of lyrics-melody when given incomplete lyrics (few keywords). The attention mechanism is…

    Submitted 22 January, 2023; originally announced January 2023.

    Comments: arXiv admin note: substantial text overlap with arXiv:2011.06380

  17. arXiv:2211.01338  [pdf, other]

    eess.AS cs.CL cs.MM cs.SD eess.IV

    Technology Pipeline for Large Scale Cross-Lingual Dubbing of Lecture Videos into Multiple Indian Languages

    Authors: Anusha Prakash, Arun Kumar, Ashish Seth, Bhagyashree Mukherjee, Ishika Gupta, Jom Kuriakose, Jordan Fernandes, K V Vikram, Mano Ranjith Kumar M, Metilda Sagaya Mary, Mohammad Wajahat, Mohana N, Mudit Batra, Navina K, Nihal John George, Nithya Ravi, Pruthwik Mishra, Sudhanshu Srivastava, Vasista Sai Lodagala, Vandan Mujadia, Kada Sai Venkata Vineeth, Vrunda Sukhadia, Dipti Sharma, Hema Murthy, Pushpak Bhattacharya, et al. (2 additional authors not shown)

    Abstract: Cross-lingual dubbing of lecture videos requires the transcription of the original audio, correction and removal of disfluencies, domain term discovery, text-to-text translation into the target language, chunking of text using target language rhythm, text-to-speech synthesis followed by isochronous lipsyncing to the original video. This task becomes challenging when the source and target languages…

    Submitted 1 November, 2022; originally announced November 2022.

  18. arXiv:2210.08629  [pdf, other]

    math.CO cs.DM

    A Note On $\ell$-Rauzy Graphs for the Infinite Fibonacci Word

    Authors: Rajavel Praveen M, Rama R

    Abstract: The $\ell$-Rauzy graph of order $k$ for any infinite word is a directed graph in which an arc $(v_1,v_2)$ is formed if the concatenation of the word $v_1$ and the suffix of $v_2$ of length $k-\ell$ is a subword of the infinite word. In this paper, we consider one of the important aperiodic recurrent words, the infinite Fibonacci word for discussion. We prove a few basic properties of the $\ell$-Ra…

    Submitted 27 October, 2022; v1 submitted 16 October, 2022; originally announced October 2022.

    Comments: 10 pages, 4 figures

  19. arXiv:2210.03948  [pdf, other]

    cs.IT eess.SP

    Optimizing the Placement and Beamforming of RIS in Cellular Networks: A System-Level Modeling Perspective

    Authors: Pavan Reddy M., SaiDhiraj Amuru, Kiran Kuchi

    Abstract: In this letter, we present in detail the system-level modeling of reconfigurable intelligent surface (RIS)-assisted cellular systems by considering a 3-dimensional channel model between base station, RIS, and user. We prove that the optimal placement of RIS to achieve wider coverage is exactly opposite to the base station, under the constraint of single RIS in each sector. We propose a novel beamf…

    Submitted 2 May, 2023; v1 submitted 8 October, 2022; originally announced October 2022.

  20. Variability Analysis of Isolated Intersections Through Case Study

    Authors: Savithramma R M, R Sumathi, Sudhira H S

    Abstract: Population and economic growth of urban areas have led to intensive use of private vehicles, thereby increasing traffic volume and congestion on roads. The traffic management in the city is a challenge for concerned authorities, and the signalized intersections are the primary interest of traffic management. Interpreting traffic patterns and current traffic signal operations can provide thorough i…

    Submitted 8 October, 2022; originally announced October 2022.

  21. arXiv:2205.02645  [pdf, other]

    q-bio.QM cs.LG math.DS

    Discovering stochastic dynamical equations from biological time series data

    Authors: Arshed Nabeel, Ashwin Karichannavar, Shuaib Palathingal, Jitesh Jhawar, David B. Brückner, Danny Raj M., Vishwesha Guttal

    Abstract: Stochastic differential equations (SDEs) are an important framework to model dynamics with randomness, as is common in most biological systems. The inverse problem of integrating these models with empirical data remains a major challenge. Here, we present an equation discovery methodology that takes time series data as an input, analyses fine scale fluctuations and outputs an interpretable SDE tha…

    Submitted 17 February, 2024; v1 submitted 5 May, 2022; originally announced May 2022.

    Comments: Updates: v3: Significantly reorganized the paper and added a section analysis of a cell migration dataset. v4: Update arXiv title to match the updated title of the manuscript. v5: Added sections detailing the limitations of the approach

  22. arXiv:2202.01078  [pdf, other]

    cs.SD eess.AS

    Melody Extraction from Polyphonic Music by Deep Learning Approaches: A Review

    Authors: Gurunath Reddy M, K. Sreenivasa Rao, Partha Pratim Das

    Abstract: Melody extraction is a vital music information retrieval task among music researchers for its potential applications in education pedagogy and the music industry. Melody extraction is a notoriously challenging task due to the presence of background instruments. Also, the melodic source often exhibits characteristics similar to those of the other instruments. The interfering background accompaniment wit…

    Submitted 2 February, 2022; originally announced February 2022.

    Comments: 72 pages

  23. arXiv:2201.02129  [pdf, other]

    cs.IT eess.SP

    Spectral and Energy Efficient User Pairing for RIS-assisted Uplink NOMA Systems with Imperfect Phase Compensation

    Authors: Kusuma Priya P., Pavan Reddy M., Abhinav Kumar

    Abstract: Non-orthogonal multiple access (NOMA) is considered a key technology for improving the spectral efficiency of fifth-generation (5G) and beyond 5G cellular networks. NOMA is beneficial when the channel vectors of the users are in the same direction, which is not always possible in conventional wireless systems. With the help of a reconfigurable intelligent surface (RIS), the base station can contro…

    Submitted 6 January, 2022; originally announced January 2022.

  24. arXiv:2111.04003  [pdf]

    cs.LG

    Predictive Model for Gross Community Production Rate of Coral Reefs using Ensemble Learning Methodologies

    Authors: Umanandini S, Rishivardhan M, Aouthithiye Barathwaj SR Y, Jasline Augusta J, Shrirang Sapate, Reenasree S, Vigneash M

    Abstract: Coral reefs play a vital role in maintaining the ecological balance of the marine ecosystem. Various marine organisms depend on coral reefs for their existence and their natural processes. Coral reefs provide the necessary habitat for reproduction and growth for various exotic species of the marine ecosystem. In this article, we discuss the most important parameters which influence the lifecycle o…

    Submitted 23 January, 2023; v1 submitted 7 November, 2021; originally announced November 2021.

    Comments: 8 pages, 18 figures

    MSC Class: 68T20 ACM Class: I.2.8

  25. arXiv:2110.05864  [pdf, other]

    cs.LG nlin.AO physics.app-ph

    Disentangling intrinsic motion from neighbourhood effects in heterogeneous collective motion

    Authors: Arshed Nabeel, Danny Raj M

    Abstract: Most real world collectives, including active particles, living cells, and grains, are heterogeneous, where individuals with differing properties interact. The differences among individuals in their intrinsic properties have emergent effects at the group level. It is often of interest to infer how the intrinsic properties differ among the individuals, based on their observed movement patterns. How…

    Submitted 5 March, 2022; v1 submitted 12 October, 2021; originally announced October 2021.

    Comments: Supplementary movies can be found in: https://www.dannyraj.com/obsinf-supp-info

    Journal ref: Chaos 32, 063119 (2022)

  26. arXiv:2106.07938  [pdf, ps, other]

    cs.IT eess.SP

    User Pairing and Power Allocation for IRS-Assisted NOMA Systems with Imperfect Phase Compensation

    Authors: Pavan Reddy M., Abhinav Kumar

    Abstract: In this letter, we analyze the performance of the intelligent reflecting surface (IRS) assisted downlink non-orthogonal multiple access (NOMA) systems in the presence of imperfect phase compensation. We derive an upper bound on the imperfect phase compensation to achieve minimum required data rates for each user. Using this bound, we propose an adaptive user pairing algorithm to maximize the netwo…

    Submitted 15 June, 2021; originally announced June 2021.

  27. Toward Blockchain for Edge-of-Things: A New Paradigm, Opportunities, and Future Directions

    Authors: Prabadevi B, N Deepa, Quoc-Viet Pham, Dinh C. Nguyen, Praveen Kumar Reddy M, Thippa Reddy G, Pubudu N. Pathirana, Octavia Dobre

    Abstract: Blockchain is gaining momentum as a promising technology for many application domains, one of them being the Edge-of-Things (EoT) that is enabled by the integration of edge computing and the Internet-of-Things (IoT). Particularly, the amalgamation of blockchain and EoT leads to a new paradigm, called blockchain-enabled EoT (BEoT), that is crucial for enabling future low-latency and high-security s…

    Submitted 27 April, 2021; originally announced April 2021.

    Comments: Accepted at the IEEE Internet of Things Magazine

  28. arXiv:2101.00798  [pdf, other]

    cs.NI cs.AI

    Fusion of Federated Learning and Industrial Internet of Things: A Survey

    Authors: Parimala M, Swarna Priya R M, Quoc-Viet Pham, Kapal Dev, Praveen Kumar Reddy Maddikunta, Thippa Reddy Gadekallu, Thien Huynh-The

    Abstract: The Industrial Internet of Things (IIoT) lays out a new paradigm for the concept of Industry 4.0 and paves the way for a new industrial era. Nowadays smart machines and smart factories use machine learning/deep learning based models to acquire intelligence. However, storing and communicating the data to the cloud and end device leads to issues in preserving privacy. In order to address this issue, fed…

    Submitted 4 January, 2021; originally announced January 2021.

    Comments: This work has been submitted for possible publication. Any comments and suggestions are appreciated

  29. arXiv:2011.04297  [pdf, other]

    cs.SD cs.CL cs.LG eess.AS

    Knowledge Distillation for Singing Voice Detection

    Authors: Soumava Paul, Gurunath Reddy M, K Sreenivasa Rao, Partha Pratim Das

    Abstract: Singing Voice Detection (SVD) has been an active area of research in music information retrieval (MIR). Currently, two deep neural network-based methods, one based on CNN and the other on RNN, exist in literature that learn optimized features for the voice detection (VD) task and achieve state-of-the-art performance on common datasets. Both these models have a huge number of parameters (1.4M for C…

    Submitted 19 August, 2021; v1 submitted 9 November, 2020; originally announced November 2020.

    Comments: Accepted at INTERSPEECH 2021. 5 pages, 3 figures

  30. Multiclass Model for Agriculture development using Multivariate Statistical method

    Authors: N Deepa, Mohammad Zubair Khan, Prabadevi B, Durai Raj Vincent P M, Praveen Kumar Reddy Maddikunta, Thippa Reddy Gadekallu

    Abstract: The Mahalanobis Taguchi system (MTS) is a multi-variate statistical method extensively used for feature selection and binary classification problems. The calculation of the orthogonal array and signal-to-noise ratio in MTS makes the algorithm complicated when a larger number of factors is involved in the classification problem. Also the decision is based on the accuracy of normal and abnormal observations of…

    Submitted 7 October, 2020; v1 submitted 12 September, 2020; originally announced September 2020.

    Comments: in IEEE Access

  31. arXiv:2007.06021  [pdf, other]

    eess.AS cs.LG

    NISP: A Multi-lingual Multi-accent Dataset for Speaker Profiling

    Authors: Shareef Babu Kalluri, Deepu Vijayasenan, Sriram Ganapathy, Ragesh Rajan M, Prashant Krishnan

    Abstract: Many commercial and forensic applications of speech demand the extraction of information about the speaker characteristics, which falls into the broad category of speaker profiling. The speaker characteristics needed for profiling include physical traits of the speaker like height, age, and gender of the speaker along with the native language of the speaker. Many of the datasets available have onl…

    Submitted 12 July, 2020; originally announced July 2020.

    Comments: 5 pages, initial version submitted to Interspeech 2020

  32. arXiv:2007.05853  [pdf]

    cs.CV

    Complex Wavelet SSIM based Image Data Augmentation

    Authors: Ritin Raveendran, Aviral Singh, Rajesh Kumar M

    Abstract: One of the biggest problems in neural learning networks is the lack of training data available to train the network. Data augmentation techniques have therefore been developed over the past few years, aiming to increase the amount of artificial training data with the limited number of real world samples. In this paper, we look particularly at the MNIST handwritten dataset, an image dataset used fo…

    Submitted 11 July, 2020; originally announced July 2020.

  33. arXiv:2006.14107  [pdf, other]

    cs.CV

    Kinematic-Structure-Preserved Representation for Unsupervised 3D Human Pose Estimation

    Authors: Jogendra Nath Kundu, Siddharth Seth, Rahul M V, Mugalodi Rakesh, R. Venkatesh Babu, Anirban Chakraborty

    Abstract: Estimation of 3D human pose from monocular image has gained considerable attention, as a key step to several human-centric applications. However, generalizability of human pose estimation models developed using supervision on large-scale in-studio datasets remains questionable, as these models often perform unsatisfactorily on unseen in-the-wild environments. Though weakly-supervised models have b…

    Submitted 24 June, 2020; originally announced June 2020.

    Comments: AAAI 2020 (Oral)

  34. arXiv:2006.00782  [pdf, other]

    eess.AS cs.CL cs.SD

    Learning to Recognize Code-switched Speech Without Forgetting Monolingual Speech Recognition

    Authors: Sanket Shah, Basil Abraham, Gurunath Reddy M, Sunayana Sitaram, Vikas Joshi

    Abstract: Recently, there has been significant progress made in Automatic Speech Recognition (ASR) of code-switched speech, leading to gains in accuracy on code-switched datasets in many language pairs. Code-switched speech co-occurs with monolingual speech in one or both languages being mixed. In this work, we show that fine-tuning ASR models on code-switched speech harms performance on monolingual speech.…

    Submitted 1 June, 2020; originally announced June 2020.

    Comments: 5 pages (4 pages + 1 page references), 5 tables, 1 figure, 1 algorithm, 16 references

  35. Deepfake Forensics Using Recurrent Neural Networks

    Authors: Rahul U, Ragul M, Raja Vignesh K, Tejeswinee K

    Abstract: As of late an AI based free programming device has made it simple to make authentic face swaps in recordings that leaves barely any hints of control, in what are known as "deepfake" recordings. Situations where these realistic counterfeit recordings are utilized to make political pain, extort somebody or phony fear based oppression occasions are effectively imagined. This paper proposes a tran…

    Submitted 1 May, 2020; originally announced May 2020.

    Comments: This submission has been removed by arXiv administrators due to copyright infringement

  36. Deepfake Video Forensics based on Transfer Learning

    Authors: Rahul U, Ragul M, Raja Vignesh K, Tejeswinee K

    Abstract: Deep learning has been used to solve complex problems in various domains. As it advances, it also creates applications which become a major threat to our privacy, security and even to our democracy. One such recently developed application is the "Deepfake". Deepfake models can create fake images and videos that humans cannot differentiate from the genuine ones. Therefore, the cou…

    Submitted 29 April, 2020; originally announced April 2020.

    Comments: This submission has been removed by arXiv administrators due to copyright infringement

    Report number: F9747038620

  37. arXiv:2004.04393  [pdf, other]

    cs.CV cs.LG

    Universal Source-Free Domain Adaptation

    Authors: Jogendra Nath Kundu, Naveen Venkat, Rahul M V, R. Venkatesh Babu

    Abstract: There is a strong incentive to develop versatile learning techniques that can transfer the knowledge of class-separability from a labeled source domain to an unlabeled target domain in the presence of a domain-shift. Existing domain adaptation (DA) approaches are not equipped for practical DA scenarios as a result of their reliance on the knowledge of source-target label-set relationship (e.g. Clo…

    Submitted 9 April, 2020; originally announced April 2020.

    Comments: CVPR 2020. Code available at https://github.com/val-iisc/usfda

  38. arXiv:2004.04388  [pdf, other]

    cs.CV cs.LG

    Towards Inheritable Models for Open-Set Domain Adaptation

    Authors: Jogendra Nath Kundu, Naveen Venkat, Ambareesh Revanur, Rahul M V, R. Venkatesh Babu

    Abstract: There has been tremendous progress in Domain Adaptation (DA) for visual recognition tasks. Particularly, open-set DA has gained considerable attention wherein the target domain contains additional unseen categories. Existing open-set DA approaches demand access to a labeled source dataset along with unlabeled target instances. However, this reliance on co-existing source and target data is highl…

    Submitted 9 April, 2020; originally announced April 2020.

    Comments: CVPR 2020 (Oral). Code available at https://github.com/val-iisc/inheritune

  39. arXiv:2002.02370  [pdf]

    cs.MM

    Data hiding in speech signal using steganography and encryption

    Authors: Hanisha Chowdary N, Karan K, Bharath K P, Rajesh Kumar M

    Abstract: Data privacy and data security are always the highest priority in the world. We need a reliable method to encrypt the data so that it reaches the destination safely. Encryption is a simple yet effective way to protect our data while transmitting it to a destination. The proposed method combines state-of-the-art steganography and encryption. This paper puts forward a different approach for data…

    Submitted 13 January, 2020; originally announced February 2020.

  40. arXiv:2001.10094  [pdf]

    eess.AS cs.SD

    OMAP-L138 LCDK Development Kit

    Authors: Bharath K P, Sylash K, Pravina K, Rajesh Kumar M

    Abstract: Low-cost, low-power processors play a vital role in the field of Digital Signal Processing (DSP). The OMAP-L138 development kit offers low cost, low power consumption, ease of use and speed, with a wide variety of applications including digital signal processing, image processing and video processing. This paper presents a basic introduction to the OMAP-L138 processor and quick procedural…

    Submitted 13 January, 2020; originally announced January 2020.

  41. arXiv:2001.04215  [pdf]

    cs.CV

    Radial Based Analysis of GRNN in Non-Textured Image Inpainting

    Authors: Karthik R, Anvita Dwivedi, Haripriya M, Bharath K P, Rajesh Kumar M

    Abstract: Image inpainting algorithms are used to restore some damaged or missing information region of an image based on the surrounding information. The method proposed in this paper applies the radial based analysis of image inpainting on GRNN. The damaged areas are first isolated from rest of the areas and then arranged by their size and then inpainted using GRNN. The training of the neural network is d…

    Submitted 13 January, 2020; originally announced January 2020.

  42. arXiv:2001.04208  [pdf]

    cs.CV

    Handwritten Character Recognition Using Unique Feature Extraction Technique

    Authors: Sai Abhishikth Ayyadevara, P N V Sai Ram Teja, Bharath K P, Rajesh Kumar M

    Abstract: One of the most arduous and captivating domains under image processing is handwritten character recognition. In this paper we have proposed a feature extraction technique which is a combination of unique features of geometric, zone-based hybrid, gradient features extraction approaches and three different neural networks namely the Multilayer Perceptron network using Backpropagation algorithm (MLP…

    Submitted 13 January, 2020; originally announced January 2020.

  43. arXiv:1905.05520  [pdf, other]

    eess.SP cs.IT

    A Novel Beamformed Control Channel Design for LTE with Full Dimension-MIMO

    Authors: Pavan Reddy M., Harish Kumar D., Saidhiraj Amuru, Kiran Kuchi

    Abstract: The Full Dimension-MIMO (FD-MIMO) technology is capable of achieving huge improvements in network throughput with simultaneous connectivity of a large number of mobile wireless devices, unmanned aerial vehicles, and the Internet of Things (IoT). In FD-MIMO, with a large number of antennae at the base station and the ability to perform beamforming, the capacity of the physical downlink shared chann…

    Submitted 14 May, 2019; originally announced May 2019.

  44. arXiv:1904.09765  [pdf, other]

    cs.SD cs.LG eess.AS stat.ML

    hf0: A hybrid pitch extraction method for multimodal voice

    Authors: Pradeep Rengaswamy, Gurunath Reddy M, Krothapalli Sreenivasa Rao

    Abstract: Pitch or fundamental frequency (f0) extraction is a fundamental problem studied extensively for its potential applications in speech and clinical applications. In literature, explicit mode specific (modal speech or singing voice or emotional/ expressive speech or noisy speech) signal processing and deep learning f0 extraction methods that exploit the quasi periodic nature of the signal in time, ha…

    Submitted 22 April, 2019; originally announced April 2019.

    Comments: Pitch Extraction, F0 extraction, harmonic signals, speech, monophonic songs, Convolutional Neural Network, 5 pages, 5 figures

  45. arXiv:1903.01902  [pdf, other]

    cs.ET q-bio.QM

    BacSoft: A Tool to Archive Data on Bacteria

    Authors: Amay Agrawal, Dixita Limbachiya, Ravikumar M., Taslimarif Saiyed, Manish K. Gupta

    Abstract: Recently, DNA data storage systems have attracted many researchers worldwide. Motivated by the success stories of such systems, in this work we propose a software called BacSoft to clone the data in a bacterial plasmid by using the concept of genetic engineering. We consider the encoding schemes such that it satisfies constraints significant for bacterial data storage.

    Submitted 5 March, 2019; originally announced March 2019.

    Comments: 8 pages, 13 figures, poster abstract DNA Computing and Molecular Programming, DNA24 conference, **an, China, Oct 2018

  46. arXiv:1811.09956  [pdf, other]

    cs.SD cs.LG eess.AS stat.ML

    Glottal Closure Instants Detection From Pathological Acoustic Speech Signal Using Deep Learning

    Authors: Gurunath Reddy M, Tanumay Mandal, Krothapalli Sreenivasa Rao

    Abstract: In this paper, we propose a classification based glottal closure instants (GCI) detection from pathological acoustic speech signal, which finds many applications in vocal disorder analysis. Till date, GCI for pathological disorder is extracted from laryngeal (glottal source) signal recorded from Electroglottograph, a dedicated device designed to measure the vocal folds vibration around the larynx.…

    Submitted 25 November, 2018; originally announced November 2018.

    Comments: Machine Learning for Health (ML4H) Workshop at NeurIPS 2018 arXiv:1811.07216

    Report number: ML4H/2018/39

  47. arXiv:1810.06635  [pdf, other]

    cs.CL cs.CV cs.SD eess.AS

    Semi-supervised and Active-learning Scenarios: Efficient Acoustic Model Refinement for a Low Resource Indian Language

    Authors: Maharajan Chellapriyadharshini, Anoop Toffy, Srinivasa Raghavan K. M., V Ramasubramanian

    Abstract: We address the problem of efficient acoustic-model refinement (continuous retraining) using semi-supervised and active learning for a low resource Indian language, wherein the low resource constraints are having i) a small labeled corpus from which to train a baseline `seed' acoustic model and ii) a large training corpus without orthographic labeling or from which to perform a data selection for m…

    Submitted 2 October, 2018; originally announced October 2018.

    Journal ref: Proc. Interspeech 2018

  48. arXiv:1809.04154  [pdf]

    cs.CV

    Intensity and Rescale Invariant Copy Move Forgery Detection Techniques

    Authors: Tejas K, Swathi C, Rajesh Kumar M

    Abstract: In this contemporary world digital media such as videos and images behave as an active medium to carry valuable information across the globe on all fronts. However there are several techniques evolved to tamper the image which has made their authenticity untrustworthy. CopyMove Forgery CMF is one of the most common forgeries present in an image where a cluster of pixels are duplicated in the same…

    Submitted 11 September, 2018; originally announced September 2018.

    Comments: Further research is active on this paper in VIT University. Hence, the paper is yet not published

  49. arXiv:1806.02907  [pdf]

    cs.CV

    Copy Move Forgery using Hus Invariant Moments and Log Polar Transformations

    Authors: Tejas K, Swathi C, Rajesh Kumar M

    Abstract: With the increase in interchange of data, there is a growing necessity of security. Considering the volumes of digital data that is transmitted, they are in need to be secure. Among the many forms of tampering possible, one widespread technique is Copy Move Forgery CMF. This forgery occurs when parts of the image are copied and duplicated elsewhere in the same image. There exist a number of algori…

    Submitted 7 June, 2018; originally announced June 2018.

    Comments: This paper was submitted, accepted and presented in the 3rd International Conference on RTEICT, IEEE Conference

  50. arXiv:1802.06288  [pdf]

    cs.NE

    Implementation of Neural Network and feature extraction to classify ECG signals

    Authors: R Karthik, Dhruv Tyagi, Amogh Raut, Soumya Saxena, Rajesh Kumar M

    Abstract: This paper presents a suitable and efficient implementation of a feature extraction algorithm (Pan Tompkins algorithm) on electrocardiography (ECG) signals, for detection and classification of four cardiac diseases: Sleep Apnea, Arrhythmia, Supraventricular Arrhythmia and Long Term Atrial Fibrillation (AF) and differentiating them from the normal heart beat by using pan Tompkins RR detection follo…

    Submitted 17 February, 2018; originally announced February 2018.

    Comments: SPRINGER LNEE