Student Perspectives on Using a Large Language Model (LLM) for an Assignment on Professional Ethics

Authors: Virginia Grande, Natalie Kiesler, Maria Andreina Francisco R

Abstract: The advent of Large Language Models (LLMs) started a serious discussion among educators on how LLMs would affect, e.g., curricula, assessments, and students' competencies. Generative AI and LLMs also raised ethical questions and concerns for computing educators and professionals. This experience report presents an assignment within a course on professional competencies, including some related to ethics, that computing master's students need in their careers. For the assignment, student groups discussed the ethical process by Lennerfors et al. by analyzing a case: a fictional researcher considers whether to attend the real CHI 2024 conference in Hawaii. The tasks were (1) to participate in in-class discussions on the case, (2) to use an LLM of their choice as a discussion partner for said case, and (3) to document both discussions, reflecting on their use of the LLM. Students reported positive experiences with the LLM as a way to increase their knowledge and understanding, although some identified limitations. The LLM provided a wider set of options for action in the studied case, including unfeasible ones. The LLM would not select a course of action, so students had to choose themselves, which they saw as coherent. From the educators' perspective, there is a need for more instruction for students using LLMs: some students did not perceive the tools as such but rather as an authoritative knowledge base. Therefore, this work has implications for educators considering the use of LLMs as discussion partners or tools to practice critical thinking, especially in computing ethics education.

Submitted 9 April, 2024; originally announced June 2024.

Comments: accepted at ITiCSE 2024, Milan, Italy

arXiv:2405.03537 [pdf, other]

Exploring the Efficacy of Federated-Continual Learning Nodes with Attention-Based Classifier for Robust Web Phishing Detection: An Empirical Investigation

Authors: Jesher Joshua M, Adhithya R, Sree Dananjay S, M Revathi

Abstract: Web phishing poses a dynamic threat, requiring detection systems to quickly adapt to the latest tactics. Traditional approaches of accumulating data and periodically retraining models are outpaced. We propose a novel paradigm combining federated learning and continual learning, enabling distributed nodes to continually update models on streams of new phishing data, without accumulating data. These locally adapted models are then aggregated at a central server via federated learning. To enhance detection, we introduce a custom attention-based classifier model with residual connections, tailored for web phishing, leveraging attention mechanisms to capture intricate phishing patterns. We evaluate our hybrid learning paradigm across continual learning strategies (cumulative, replay, MIR, LwF) and model architectures through an empirical investigation. Our main contributions are: (1) a new hybrid federated-continual learning paradigm for robust web phishing detection, and (2) a novel attention + residual connections based model explicitly designed for this task, attaining 0.93 accuracy, 0.90 precision, 0.96 recall and 0.93 f1-score with the LwF strategy, outperforming traditional approaches in detecting emerging phishing threats while retaining past knowledge.

Submitted 16 June, 2024; v1 submitted 6 May, 2024; originally announced May 2024.

arXiv:2404.05765 [pdf]

A Novel Bi-LSTM And Transformer Architecture For Generating Tabla Music

Authors: Roopa Mayya, Vivekanand Venkataraman, Anwesh P R, Narayana Darapaneni

Abstract: Introduction: Music generation is a complex task that has received significant attention in recent years, and deep learning techniques have shown promising results in this field. Objectives: While extensive work has been carried out on generating Piano and other Western music, there is limited research on generating classical Indian music due to the scarcity of Indian music in machine-encoded formats. In this technical paper, methods for generating classical Indian music, specifically tabla music, is proposed. Initially, this paper explores piano music generation using deep learning architectures. Then the fundamentals are extended to generating tabla music. Methods: Tabla music in waveform (.wav) files are pre-processed using the librosa library in Python. A novel Bi-LSTM with an Attention approach and a transformer model are trained on the extracted features and labels. Results: The models are then used to predict the next sequences of tabla music. A loss of 4.042 and MAE of 1.0814 are achieved with the Bi-LSTM model. With the transformer model, a loss of 55.9278 and MAE of 3.5173 are obtained for tabla music generation. Conclusion: The resulting music embodies a harmonious fusion of novelty and familiarity, pushing the limits of music composition to new horizons.

Submitted 6 April, 2024; originally announced April 2024.

arXiv:2404.05764 [pdf]

Study of the effect of Sharpness on Blind Video Quality Assessment

Authors: Anantha Prabhu, David Pratap, Narayana Darapeni, Anwesh P R

Abstract: Introduction: Video Quality Assessment (VQA) is one of the important areas of study in this modern era, where video is a crucial component of communication with applications in every field. Rapid technology developments in mobile technology enabled anyone to create videos resulting in a varied range of video quality scenarios. Objectives: Though VQA was present for some time with the classical metrices like SSIM and PSNR, the advent of machine learning has brought in new techniques of VQAs which are built upon Convolutional Neural Networks (CNNs) or Deep Neural Networks (DNNs). Methods: Over the past years various research studies such as the BVQA which performed video quality assessment of nature-based videos using DNNs exposed the powerful capabilities of machine learning algorithms. BVQA using DNNs explored human visual system effects such as content dependency and time-related factors normally known as temporal effects. Results: This study explores the sharpness effect on models like BVQA. Sharpness is the measure of the clarity and details of the video image. Sharpness typically involves analyzing the edges and contrast of the image to determine the overall level of detail and sharpness. Conclusion: This study uses the existing video quality databases such as CVD2014. A comparative study of the various machine learning parameters such as SRCC and PLCC during the training and testing are presented along with the conclusion.

Submitted 6 April, 2024; originally announced April 2024.

arXiv:2403.01926 [pdf, other]

IndicVoices: Towards building an Inclusive Multilingual Speech Dataset for Indian Languages

Authors: Tahir Javed, Janki Atul Nawale, Eldho Ittan George, Sakshi Joshi, Kaushal Santosh Bhogale, Deovrat Mehendale, Ishvinder Virender Sethi, Aparna Ananthanarayanan, Hafsah Faquih, Pratiti Palit, Sneha Ravishankar, Saranya Sukumaran, Tripura Panchagnula, Sunjay Murali, Kunal Sharad Gandhi, Ambujavalli R, Manickam K M, C Venkata Vaijayanthi, Krishnan Srinivasa Raghavan Karunganni, Pratyush Kumar, Mitesh M Khapra

Abstract: We present INDICVOICES, a dataset of natural and spontaneous speech containing a total of 7348 hours of read (9%), extempore (74%) and conversational (17%) audio from 16237 speakers covering 145 Indian districts and 22 languages. Of these 7348 hours, 1639 hours have already been transcribed, with a median of 73 hours per language. Through this paper, we share our journey of capturing the cultural, linguistic and demographic diversity of India to create a one-of-its-kind inclusive and representative dataset. More specifically, we share an open-source blueprint for data collection at scale comprising of standardised protocols, centralised tools, a repository of engaging questions, prompts and conversation scenarios spanning multiple domains and topics of interest, quality control mechanisms, comprehensive transcription guidelines and transcription tools. We hope that this open source blueprint will serve as a comprehensive starter kit for data collection efforts in other multilingual regions of the world. Using INDICVOICES, we build IndicASR, the first ASR model to support all the 22 languages listed in the 8th schedule of the Constitution of India. All the data, tools, guidelines, models and other materials developed as a part of this work will be made publicly available.

Submitted 4 March, 2024; originally announced March 2024.

arXiv:2403.01186 [pdf]

Evault for legal records

Authors: Jeba N, Anas S, Anuragav S, Abhishek R, Sachin K

Abstract: Innovative solution for addressing the challenges in the legal records management system through a blockchain-based eVault platform. Our objective is to create a secure, transparent, and accessible ecosystem that caters to the needs of all stakeholders, including lawyers, judges, clients, and registrars. First and foremost, our solution is built on a robust blockchain platform like Ethereum harnessing the power of smart contracts to manage access, permissions, and transactions effectively. This ensures the utmost security and transparency in every interaction within the system. To make our eVault system user-friendly, we've developed intuitive interfaces for all stakeholders. Lawyers, judges, clients, and even registrars can effortlessly upload and retrieve legal documents, track changes, and share information within the platform. But that's not all; we've gone a step further by incorporating a document creation and saving feature within our app and website. This feature allows users to generate and securely store legal documents, streamlining the entire documentation process.

Submitted 8 March, 2024; v1 submitted 2 March, 2024; originally announced March 2024.

Comments: Blockchain, evault, legal records

arXiv:2403.00887 [pdf, other]

SEGAA: A Unified Approach to Predicting Age, Gender, and Emotion in Speech

Authors: Aron R, Indra Sigicharla, Chirag Periwal, Mohanaprasad K, Nithya Darisini P S, Sourabh Tiwari, Shivani Arora

Abstract: The interpretation of human voices holds importance across various applications. This study ventures into predicting age, gender, and emotion from vocal cues, a field with vast applications. Voice analysis tech advancements span domains, from improving customer interactions to enhancing healthcare and retail experiences. Discerning emotions aids mental health, while age and gender detection are vital in various contexts. Exploring deep learning models for these predictions involves comparing single, multi-output, and sequential models highlighted in this paper. Sourcing suitable data posed challenges, resulting in the amalgamation of the CREMA-D and EMO-DB datasets. Prior work showed promise in individual predictions, but limited research considered all three variables simultaneously. This paper identifies flaws in an individual model approach and advocates for our novel multi-output learning architecture Speech-based Emotion Gender and Age Analysis (SEGAA) model. The experiments suggest that Multi-output models perform comparably to individual models, efficiently capturing the intricate relationships between variables and speech inputs, all while achieving improved runtime.

Submitted 1 March, 2024; originally announced March 2024.

arXiv:2402.02811 [pdf, other]

Multi-scale fMRI time series analysis for understanding neurodegeneration in MCI

Authors: Ammu R., Debanjali Bhattacharya, Ameiy Acharya, Ninad Aithal, Neelam Sinha

Abstract: In this study, we present a technique that spans multi-scale views (global scale -- meaning brain network-level and local scale -- examining each individual ROI that constitutes the network) applied to resting-state fMRI volumes. Deep learning based classification is utilized in understanding neurodegeneration. The novelty of the proposed approach lies in utilizing two extreme scales of analysis. One branch considers the entire network within graph-analysis framework. Concurrently, the second branch scrutinizes each ROI within a network independently, focusing on evolution of dynamics. For each subject, graph-based approach employs partial correlation to profile the subject in a single graph where each ROI is a node, providing insights into differences in levels of participation. In contrast, non-linear analysis employs recurrence plots to profile a subject as a multichannel 2D image, revealing distinctions in underlying dynamics. The proposed approach is employed for classification of a cohort of 50 healthy control (HC) and 50 Mild Cognitive Impairment (MCI), sourced from ADNI dataset. Results point to: (1) reduced activity in ROIs such as PCC in MCI (2) greater activity in occipital in MCI, which is not seen in HC (3) when analysed for dynamics, all ROIs in MCI show greater predictability in time-series.

Submitted 5 February, 2024; originally announced February 2024.

Comments: 12 pages, 3 figures and 4 tables

arXiv:2312.05797 [pdf, other]

Multimodality in Online Education: A Comparative Study

Authors: Praneeta Immadisetty, Pooja Rajesh, Akshita Gupta, Anala M R, Soumya A, K. N. Subramanya

Abstract: The commencement of the decade brought along with it a grave pandemic and in response the movement of education forums predominantly into the online world. With a surge in the usage of online video conferencing platforms and tools to better gauge student understanding, there needs to be a mechanism to assess whether instructors can grasp the extent to which students understand the subject and their response to the educational stimuli. The current systems consider only a single cue with a lack of focus in the educational domain. Thus, there is a necessity for the measurement of an all-encompassing holistic overview of the students' reaction to the subject matter. This paper highlights the need for a multimodal approach to affect recognition and its deployment in the online classroom while considering four cues, posture and gesture, facial, eye tracking and verbal recognition. It compares the various machine learning models available for each cue and provides the most suitable approach given the available dataset and parameters of classroom footage. A multimodal approach derived from weighted majority voting is proposed by combining the most fitting models from this analysis of individual cues based on accuracy, ease of procuring data corpus, sensitivity and any major drawbacks.

Submitted 17 December, 2023; v1 submitted 10 December, 2023; originally announced December 2023.

arXiv:2310.06841 [pdf]

Malware Classification using Deep Neural Networks: Performance Evaluation and Applications in Edge Devices

Authors: Akhil M R, Adithya Krishna V Sharma, Harivardhan Swamy, Pavan A, Ashray Shetty, Anirudh B Sathyanarayana

Abstract: With the increasing extent of malware attacks in the present day along with the difficulty in detecting modern malware, it is necessary to evaluate the effectiveness and performance of Deep Neural Networks (DNNs) for malware classification. Multiple DNN architectures can be designed and trained to detect and classify malware binaries. Results demonstrate the potential of DNNs in accurately classifying malware with high accuracy rates observed across different malware types. Additionally, the feasibility of deploying these DNN models on edge devices to enable real-time classification, particularly in resource-constrained scenarios proves to be integral to large IoT systems. By optimizing model architectures and leveraging edge computing capabilities, the proposed methodologies achieve efficient performance even with limited resources. This study contributes to advancing malware detection techniques and emphasizes the significance of integrating cybersecurity measures for the early detection of malware and further preventing the adverse effects caused by such attacks. Optimal considerations regarding the distribution of security tasks to edge devices are addressed to ensure that the integrity and availability of large scale IoT systems are not compromised due to malware attacks, advocating for a more resilient and secure digital ecosystem.

Submitted 21 August, 2023; originally announced October 2023.

arXiv:2309.16654 [pdf, other]

Novel Deep Learning Pipeline for Automatic Weapon Detection

Authors: Haribharathi Sivakumar, Vijay Arvind. R, Pawan Ragavendhar V, G. Balamurugan

Abstract: Weapon and gun violence have recently become a pressing issue today. The degree of these crimes and activities has risen to the point of being termed as an epidemic. This prevalent misuse of weapons calls for an automatic system that detects weapons in real-time. Real-time surveillance video is captured and recorded in almost all public forums and places. These videos contain abundant raw data which can be extracted and processed into meaningful information. This paper proposes a novel pipeline consisting of an ensemble of convolutional neural networks with distinct architectures. Each neural network is trained with a unique mini-batch with little to no overlap in the training samples. This paper will present several promising results using multiple datasets associated with comparing the proposed architecture and state-of-the-art (SoA) models. The proposed pipeline produced an average increase of 5% in accuracy, specificity, and recall compared to the SoA systems.

Submitted 28 September, 2023; originally announced September 2023.

Comments: Accepted for presentation at the IEEE 2nd International Conference on Automation, Robotics and Computer Engineering

arXiv:2309.09191 [pdf, other]

End-to-End Optimized Pipeline for Prediction of Protein Folding Kinetics

Authors: Vijay Arvind. R, Haribharathi Sivakumar, Brindha. R

Abstract: Protein folding is the intricate process by which a linear sequence of amino acids self-assembles into a unique three-dimensional structure. Protein folding kinetics is the study of pathways and time-dependent mechanisms a protein undergoes when it folds. Understanding protein kinetics is essential as a protein needs to fold correctly for it to perform its biological functions optimally, and a misfolded protein can sometimes be contorted into shapes that are not ideal for a cellular environment giving rise to many degenerative, neuro-degenerative disorders and amyloid diseases. Monitoring at-risk individuals and detecting protein discrepancies in a protein's folding kinetics at the early stages could majorly result in public health benefits, as preventive measures can be taken. This research proposes an efficient pipeline for predicting protein folding kinetics with high accuracy and low memory footprint. The deployed machine learning (ML) model outperformed the state-of-the-art ML models by 4.8% in terms of accuracy while consuming 327x lesser memory and being 7.3% faster.

Submitted 17 September, 2023; originally announced September 2023.

Comments: Accepted for presentation at the 22nd International Conference on Machine Learning and Applications

arXiv:2309.09175

Imbalanced Data Stream Classification using Dynamic Ensemble Selection

Authors: Priya. S, Haribharathi Sivakumar, Vijay Arvind. R

Abstract: Modern streaming data categorization faces significant challenges from concept drift and class imbalanced data. This negatively impacts the output of the classifier, leading to improper classification. Furthermore, other factors such as the overlapping of multiple classes limit the extent of the correctness of the output. This work proposes a novel framework for integrating data pre-processing and dynamic ensemble selection, by formulating the classification framework for the nonstationary drifting imbalanced data stream, which employs the data pre-processing and dynamic ensemble selection techniques. The proposed framework was evaluated using six artificially generated data streams with differing imbalance ratios in combination with two different types of concept drifts. Each stream is composed of 200 chunks of 500 objects described by eight features and contains five concept drifts. Seven pre-processing techniques and two dynamic ensemble selection methods were considered. According to experimental results, data pre-processing combined with Dynamic Ensemble Selection techniques significantly delivers more accuracy when dealing with imbalanced data streams.

Submitted 28 September, 2023; v1 submitted 17 September, 2023; originally announced September 2023.

Comments: Made an error in the research and need to rectify it

arXiv:2307.16157 [pdf, other]

A Simple Robot Selection Criteria After Path Planning Using Wavefront Algorithm

Authors: Rajashekhar V S, Dhaya C, Dinakar Raj C K, Dharshan P, Mukesh Kumar S, Harish B, Ajith R, Kamaleshwaran K

Abstract: In this work we present a technique to select the best robot for accomplishing a task assuming that the map of the environment is known in advance. To do so, capabilities of the robots are listed and the environments where they can be used are mapped. There are five robots that included for doing the tasks. They are the robotic lizard, half-humanoid, robotic snake, biped and quadruped. Each of these robots are capable of performing certain activities and also they have their own limitations. The process of considering the robot performances and acting based on their limitations is the focus of this work. The wavefront algorithm is used to find the nature of terrain. Based on the terrain a suitable robot is selected from the list of five robots by the wavefront algorithm. Using this robot the mission is accomplished.

Submitted 30 July, 2023; originally announced July 2023.

Comments: 8 pages, 4 figures

arXiv:2307.14343 [pdf]

Pruning Distorted Images in MNIST Handwritten Digits

Authors: Amarnath R, Vinay Kumar V

Abstract: Recognizing handwritten digits is a challenging task primarily due to the diversity of writing styles and the presence of noisy images. The widely used MNIST dataset, which is commonly employed as a benchmark for this task, includes distorted digits with irregular shapes, incomplete strokes, and varying skew in both the training and testing datasets. Consequently, these factors contribute to reduced accuracy in digit recognition. To overcome this challenge, we propose a two-stage deep learning approach. In the first stage, we create a simple neural network to identify distorted digits within the training set. This model serves to detect and filter out such distorted and ambiguous images. In the second stage, we exclude these identified images from the training dataset and proceed to retrain the model using the filtered dataset. This process aims to improve the classification accuracy and confidence levels while mitigating issues of underfitting and overfitting. Our experimental results demonstrate the effectiveness of the proposed approach, achieving an accuracy rate of over 99.5% on the testing dataset. This significant improvement showcases the potential of our method in enhancing digit classification accuracy. In our future work, we intend to explore the scalability of this approach and investigate techniques to further enhance accuracy by reducing the size of the training data.

Submitted 26 May, 2023; originally announced July 2023.

Comments: 26 pages, 10 figures, 14 tables, 54 references

arXiv:2307.10005 [pdf, other]

Alzheimer's Disease Detection from Spontaneous Speech and Text: A review

Authors: Vrindha M. K., Geethu V., Anurenjan P. R., Deepak S., Sreeni K. G.

Abstract: In the past decade, there has been a surge in research examining the use of voice and speech analysis as a means of detecting neurodegenerative diseases such as Alzheimer's. Many studies have shown that certain acoustic features can be used to differentiate between normal aging and Alzheimer's disease, and speech analysis has been found to be a cost-effective method of detecting Alzheimer's dementia. The aim of this review is to analyze the various algorithms used in speech-based detection and classification of Alzheimer's disease. A literature survey was conducted using databases such as Web of Science, Google Scholar, and Science Direct, and articles published from January 2020 to the present were included based on keywords such as "Alzheimer's detection," "speech," and "natural language processing." The ADReSS, Pitt corpus, and CCC datasets are commonly used for the analysis of dementia from speech, and this review focuses on the various acoustic and linguistic feature engineering-based classification models drawn from 15 studies. Based on the findings of this study, it appears that a more accurate model for classifying Alzheimer's disease can be developed by considering both linguistic and acoustic data. The review suggests that speech signals can be a useful tool for detecting dementia and may serve as a reliable biomarker for efficiently identifying Alzheimer's disease.

Submitted 19 July, 2023; originally announced July 2023.

arXiv:2210.04218 [pdf, other]

Transformer-based Flood Scene Segmentation for Developing Countries

Authors: Ahan M R, Roshan Roy, Shreyas Sunil Kulkarni, Vaibhav Soni, Ashish Chittora

Abstract: Floods are large-scale natural disasters that often induce a massive number of deaths, extensive material damage, and economic turmoil. The effects are more extensive and longer-lasting in high-population and low-resource developing countries. Early Warning Systems (EWS) constantly assess water levels and other factors to forecast floods, to help minimize damage. Post-disaster, disaster response teams undertake a Post Disaster Needs Assessment (PDSA) to assess structural damage and determine optimal strategies to respond to highly affected neighbourhoods. However, even today in developing countries, EWS and PDSA analysis of large volumes of image and video data is largely a manual process undertaken by first responders and volunteers. We propose FloodTransformer, which to the best of our knowledge, is the first visual transformer-based model to detect and segment flooded areas from aerial images at disaster sites. We also propose a custom metric, Flood Capacity (FC) to measure the spatial extent of water coverage and quantify the segmented flooded area for EWS and PDSA analyses. We use the SWOC Flood segmentation dataset and achieve 0.93 mIoU, outperforming all other methods. We further show the robustness of this approach by validating across unseen flood images from other flood data sources.

Submitted 9 October, 2022; originally announced October 2022.

Comments: Presented at NeurIPS 2021 Workshop on Machine Learning for the Developing World

arXiv:2203.10194 [pdf, other]

Analysis and Adaptation of YOLOv4 for Object Detection in Aerial Images

Authors: Aryaman Singh Samyal, Akshatha K R, Soham Hans, Karunakar A K, Satish Shenoy B

Abstract: The recent and rapid growth in Unmanned Aerial Vehicles (UAVs) deployment for various computer vision tasks has paved the path for numerous opportunities to make them more effective and valuable. Object detection in aerial images is challenging due to variations in appearance, pose, and scale. Autonomous aerial flight systems with their inherited limited memory and computational power demand accurate and computationally efficient detection algorithms for real-time applications. Our work shows the adaptation of the popular YOLOv4 framework for predicting the objects and their locations in aerial images with high accuracy and inference speed. We utilized transfer learning for faster convergence of the model on the VisDrone DET aerial object detection dataset. The trained model resulted in a mean average precision (mAP) of 45.64% with an inference speed reaching 8.7 FPS on the Tesla K80 GPU and was highly accurate in detecting truncated and occluded objects. We experimentally evaluated the impact of varying network resolution sizes and training epochs on the performance. A comparative study with several contemporary aerial object detectors proved that YOLOv4 performed better, implying a more suitable detection algorithm to incorporate on aerial platforms.

Submitted 18 March, 2022; originally announced March 2022.

arXiv:2112.06712 [pdf, other]

A Case For Noisy Shallow Gate-Based Circuits In Quantum Machine Learning

Authors: Patrick Selig, Niall Murphy, Ashwin Sundareswaran R, David Redmond, Simon Caton

Abstract: There is increasing interest in the development of gate-based quantum circuits for the training of machine learning models. Yet, little is understood concerning the parameters of circuit design, and the effects of noise and other measurement errors on the performance of quantum machine learning models. In this paper, we explore the practical implications of key circuit design parameters (number of qubits, depth, etc.) using several standard machine learning datasets and IBM's Qiskit simulator. In total we evaluate over 6500 unique circuits with $n \approx 120700$ individual runs. We find that in general shallow (low depth) wide (more qubits) circuit topologies tend to outperform deeper ones in settings without noise. We also explore the implications and effects of different notions of noise and discuss circuit topologies that are more / less robust to noise for classification machine learning tasks. Based on the findings we define guidelines for circuit topologies that show near-term promise for the realisation of quantum machine learning algorithms using gate-based NISQ quantum computers.

Submitted 13 December, 2021; originally announced December 2021.

Comments: 15 pages, 11 figures, Published in the proceedings of International Conference on Rebooting Computing (ICRC). IEEE, 2021

arXiv:2109.03435 [pdf, other]

SSEGEP: Small SEGment Emphasized Performance evaluation metric for medical image segmentation

Authors: Ammu R, Neelam Sinha

Abstract: Automatic image segmentation is a critical component of medical image analysis, and hence quantifying segmentation performance is crucial. Challenges in medical image segmentation are mainly due to spatial variations of regions to be segmented and imbalance in distribution of classes. Commonly used metrics treat all detected pixels indiscriminately. However, pixels in smaller segments must be treated differently from pixels in larger segments, as detection of smaller ones aids in early treatment of associated disease and they are also easier to miss. To address this, we propose a novel evaluation metric for segmentation performance, emphasizing smaller segments, by assigning higher weightage to smaller segment pixels. Weighted false positives are also considered in deriving the new metric, named "SSEGEP" (Small SEGment Emphasized Performance evaluation metric), with range 0 (bad) to 1 (good). The experiments were performed on diverse anatomies (eye, liver, pancreas and breast) from publicly available datasets to show applicability of the proposed metric across different imaging techniques. Mean opinion score (MOS) and statistical significance testing are used to quantify the relevance of the proposed approach. Across 33 fundus images, where the largest exudate is 1.41%, and the smallest is 0.0002% of the image, the proposed metric is 30% closer to MOS, as compared to Dice Similarity Coefficient (DSC). Statistical significance testing resulted in a promising p-value of order $10^{-18}$ with SSEGEP for hepatic tumor compared to DSC.
The proposed metric is found to perform better for the images having multiple segments for a single label.

Submitted 8 September, 2021; originally announced September 2021.

arXiv:2109.01467 [pdf, other]

Semi-Implicit Neural Solver for Time-dependent Partial Differential Equations

Authors: Suprosanna Shit, Ivan Ezhov, Leon Mächler, Abinav R., Jana Lipkova, Johannes C. Paetzold, Florian Kofler, Marie Piraud, Bjoern H. Menze

Abstract: Fast and accurate solutions of time-dependent partial differential equations (PDEs) are of pivotal interest to many research fields, including physics, engineering, and biology. Generally, implicit/semi-implicit schemes are preferred over explicit ones to improve stability and correctness. However, existing semi-implicit methods are usually iterative and employ a general-purpose solver, which may be sub-optimal for a specific class of PDEs. In this paper, we propose a neural solver to learn an optimal iterative scheme in a data-driven fashion for any class of PDEs. Specifically, we modify a single iteration of a semi-implicit solver using a deep neural network. We provide theoretical guarantees for the correctness and convergence of neural solvers analogous to conventional iterative solvers. In addition to the commonly used Dirichlet boundary condition, we adopt a diffuse domain approach to incorporate diverse types of boundary conditions, e.g., Neumann. We show that the proposed neural solver can go beyond linear PDEs and applies to a class of non-linear PDEs, where the non-linear component is non-stiff. We demonstrate the efficacy of our method on 2D and 3D scenarios. To this end, we show how our model generalizes to parameter settings that are different from training and achieves faster convergence than semi-implicit schemes.

Submitted 3 September, 2021; originally announced September 2021.

arXiv:2012.13380 [pdf, other]

A Regret bound for Non-stationary Multi-Armed Bandits with Fairness Constraints

Authors: Shaarad A. R, Ambedkar Dukkipati

Abstract: The multi-armed bandits' framework is the most common platform to study strategies for sequential decision-making problems. Recently, the notion of fairness has attracted a lot of attention in the machine learning community. One can impose the fairness condition that at any given point of time, even during the learning phase, a poorly performing candidate should not be preferred over a better candidate. This fairness constraint is known to be one of the most stringent and has been studied in the stochastic multi-armed bandits' framework in a stationary setting for which regret bounds have been established. The main aim of this paper is to study this problem in a non-stationary setting. We present a new algorithm called Fair Upper Confidence Bound with Exploration (Fair-UCBe) for solving a slowly varying stochastic $k$-armed bandit problem. With this we present two results: (i) Fair-UCBe indeed satisfies the above mentioned fairness condition, and (ii) it achieves a regret bound of $O\left(k^{\frac{3}{2}} T^{1 - \frac{\alpha}{2}} \sqrt{\log T}\right)$, for some suitable $\alpha \in (0, 1)$, where $T$ is the time horizon. This is the first fair algorithm with a sublinear regret bound applicable to non-stationary bandits to the best of our knowledge. We show that the performance of our algorithm in the non-stationary case approaches that of its stationary counterpart as the variation in the environment tends to zero.
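To see why the bound in (ii) counts as sublinear, it helps to divide by the horizon. The following is a short worked step, assuming $k$ and $\alpha$ are held fixed as $T$ grows (an assumption made here for illustration, not stated in the abstract):

```latex
\[
\frac{1}{T}\, O\!\left(k^{\frac{3}{2}}\, T^{1-\frac{\alpha}{2}} \sqrt{\log T}\right)
= O\!\left(k^{\frac{3}{2}}\, \frac{\sqrt{\log T}}{T^{\frac{\alpha}{2}}}\right)
\longrightarrow 0 \quad \text{as } T \to \infty, \qquad \alpha \in (0, 1).
\]
```

Because $\sqrt{\log T}$ grows more slowly than any positive power of $T$, the average regret per round vanishes, which is what a sublinear regret bound means.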

Submitted 24 December, 2020; originally announced December 2020.

arXiv:2011.04988 [pdf, other]

AIM 2020 Challenge on Rendering Realistic Bokeh

Authors: Andrey Ignatov, Radu Timofte, Ming Qian, Congyu Qiao, Jiamin Lin, Zhenyu Guo, Chenghua Li, Cong Leng, Jian Cheng, Juewen Peng, Xianrui Luo, Ke Xian, Zi** Wu, Zhiguo Cao, Densen Puthussery, Jiji C V, Hrishikesh P S, Melvin Kuriakose, Saikat Dutta, Sourya Dipta Das, Nisarg A. Shah, Kuldeep Purohit, Praveen Kandula, Maitreya Suin, A. N. Rajagopalan , et al. (10 additional authors not shown)

Abstract: This paper reviews the second AIM realistic bokeh effect rendering challenge and provides the description of the proposed solutions and results. The participating teams were solving a real-world bokeh simulation problem, where the goal was to learn a realistic shallow focus technique using a large-scale EBB! bokeh dataset consisting of 5K shallow / wide depth-of-field image pairs captured using the Canon 7D DSLR camera. The participants had to render the bokeh effect based on only one single frame without any additional data from other cameras or sensors. The target metric used in this challenge combined the runtime and the perceptual quality of the solutions measured in the user study. To ensure the efficiency of the submitted models, we measured their runtime on standard desktop CPUs as well as on smartphone GPUs. The proposed solutions significantly improved the baseline results, defining the state-of-the-art for the practical bokeh effect rendering problem.

Submitted 10 November, 2020; originally announced November 2020.

Comments: Published in ECCV 2020 Workshop (Advances in Image Manipulation), https://data.vision.ee.ethz.ch/cvl/aim20/

arXiv:2010.13187 [pdf, other]

Improving the Reconstruction of Disentangled Representation Learners via Multi-Stage Modeling

Authors: Akash Srivastava, Yamini Bansal, Yukun Ding, Cole Lincoln Hurwitz, Kai Xu, Bernhard Egger, Prasanna Sattigeri, Joshua B. Tenenbaum, Phuong Le, Arun Prakash R, Nengfeng Zhou, Joel Vaughan, Yaquan Wang, Anwesha Bhattacharyya, Kristjan Greenewald, David D. Cox, Dan Gutfreund

Abstract: Current autoencoder-based disentangled representation learning methods achieve disentanglement by penalizing the (aggregate) posterior to encourage statistical independence of the latent factors. This approach introduces a trade-off between disentangled representation learning and reconstruction quality since the model does not have enough capacity to learn correlated latent variables that capture detail information present in most image data. To overcome this trade-off, we present a novel multi-stage modeling approach where the disentangled factors are first learned using a penalty-based disentangled representation learning method; then, the low-quality reconstruction is improved with another deep generative model that is trained to model the missing correlated latent variables, adding detail information while maintaining conditioning on the previously learned disentangled factors. Taken together, our multi-stage modelling approach results in a single, coherent probabilistic model that is theoretically justified by the principle of D-separation and can be realized with a variety of model classes including likelihood-based models such as variational autoencoders, implicit models such as generative adversarial networks, and tractable models like normalizing flows or mixtures of Gaussians. We demonstrate that our multi-stage model has higher reconstruction quality than current state-of-the-art methods with equivalent disentanglement performance across multiple standard benchmarks.
In addition, we apply the multi-stage model to generate synthetic tabular datasets, showcasing an enhanced performance over benchmark models across a variety of metrics. The interpretability analysis further indicates that the multi-stage model can effectively uncover distinct and meaningful features of variations from which the original distribution can be recovered.

Submitted 3 April, 2024; v1 submitted 25 October, 2020; originally announced October 2020.

arXiv:2006.08870 [pdf, other]

End-to-End Code Switching Language Models for Automatic Speech Recognition

Authors: Ahan M. R., Shreyas Sunil Kulkarni

Abstract: In this paper, we particularly work on the code-switched text, one of the most common occurrences in the bilingual communities across the world. Due to the discrepancies in the extraction of code-switched text from an Automated Speech Recognition (ASR) module, and thereby extracting the monolingual text from the code-switched text, we propose an approach for extracting monolingual text using Deep Bi-directional Language Models (LM) such as BERT and other Machine Translation models, and also explore different ways of extracting code-switched text from the ASR model. We also explain the robustness of the model by comparing the results of Perplexity and other different metrics like WER, to the standard bi-lingual text output without any external information.

Submitted 15 June, 2020; originally announced June 2020.

Comments: 5 pages, 2 figures, To appear in the proceedings of First Workshop on Speech Technologies for Code-switching in Multilingual Communities 2020

arXiv:1909.10471 [pdf, ps, other]

Subpacketization in Coded Caching with Demand Privacy

Authors: Aravind V R, Pradeep Sarvepalli, Andrew Thangaraj

Abstract: Coded caching is a technique where we utilize multi-casting opportunities to reduce rate in cached networks. One limitation of coded caching schemes is that they reveal the demands of all users to their peers. In this work, we consider coded caching schemes that assure privacy for user demands with a particular focus on reducing subpacketization. For the 2-user, 2-file case, we present a new linear demand-private scheme with the lowest possible subpacketization. This is done by presenting the scheme explicitly and proving impossibility results under lower subpacketization. Additionally, when only partial privacy is required, we show that subpacketization can be significantly reduced when there are a large number of files.

Submitted 23 September, 2019; originally announced September 2019.

Comments: 13 pages, 5 figures

arXiv:1909.05146 [pdf]

Word and character segmentation directly in run-length compressed handwritten document images

Authors: Amarnath R, P. Nagabhushan, Mohammed Javed

Abstract: From the literature, it is demonstrated that performing text-line segmentation directly in the run-length compressed handwritten document images significantly reduces the computational time and memory space. In this paper, we investigate the issues of word and character segmentation directly on the run-length compressed document images. Primarily, the spreads of the characters are intelligently extracted from the foreground runs of the compressed data and subsequently connected components are established. The spacing between the connected components would be larger between the adjacent words when compared to that of intra-words. With this knowledge, a threshold is empirically chosen for inter-word separation. Every connected component within a word is further analysed for character segmentation. Here, min-cut graph concept is used for separating the touching characters. Over-segmentation and under-segmentation issues are addressed by insertion and deletion operations respectively. The approach has been developed particularly for compressed handwritten English document images. However, the model has been tested on non-English document images.

Submitted 18 August, 2019; originally announced September 2019.

Comments: 17 pages, 19 figures

arXiv:1907.07270 [pdf, other]

Style Transfer Applied to Face Liveness Detection with User-Centered Models

Authors: Israel A. Laurensi R., Luciana T. Menon, Manoel Camillo O. Penna N., Alessandro L. Koerich, Alceu S. Britto Jr

Abstract: This paper proposes a face anti-spoofing user-centered model (FAS-UCM). The major difficulty, in this case, is obtaining fraudulent images from all users to train the models. To overcome this problem, the proposed method is divided into three main parts: generation of new spoof images, based on style transfer and spoof image representation models; training of a Convolutional Neural Network (CNN) for liveness detection; evaluation of the live and spoof testing images for each subject. The generalization of the CNN to perform style transfer has shown promising qualitative results. Preliminary results have shown that the proposed method is capable of distinguishing between live and spoof images on the SiW database, with an average classification error rate of 0.22.

Submitted 16 July, 2019; originally announced July 2019.

arXiv:1901.11477 [pdf]

doi 10.18201/ijisae.2018448451

Text line Segmentation in Compressed Representation of Handwritten Document using Tunneling Algorithm

Authors: Amarnath R, P Nagabhushan

Abstract: In this research work, we perform text line segmentation directly in compressed representation of an unconstrained handwritten document image. In this relation, we make use of text line terminal points which is the current state-of-the-art. The terminal points spotted along both margins (left and right) of a document image for every text line are considered as source and target respectively. The tunneling algorithm uses a single agent (or robot) to identify the coordinate positions in the compressed representation to perform text-line segmentation of the document. The agent starts at a source point and progressively tunnels a path routing in between two adjacent text lines and reaches the probable target. The agent's navigation path from source to the target bypassing obstacles, if any, results in segregating the two adjacent text lines. However, the target point would be known only when the agent reaches the destination; this is applicable for all source points and henceforth we could analyze the correspondence between source and target nodes. Artificial Intelligence in Expert systems, dynamic programming and greedy strategies are employed for every search space while tunneling. An exhaustive experimentation is carried out on various benchmark datasets including ICDAR13 and the performances are reported.

Submitted 3 January, 2019; originally announced January 2019.

Comments: Compressed Representation, Handwritten Document Image, Text-Line Terminal Point, Text-Line Segmentation, Search Space, Grid

Journal ref: International Journal of Intelligent Systems and Applications in Engineering, Vol 6, No 4 (2018)

arXiv:1812.11135 [pdf, other]

Online Decentralized Receding Horizon Trajectory Optimization for Multi-Robot systems

Authors: Govind Aadithya R, Shravan Krishnan, Vijay Arvindh, Sivanathan K

Abstract: A novel decentralised trajectory generation algorithm for Multi Agent systems is presented. Multi-robot systems have the capacity to transform lives in a variety of fields. However, trajectory generation for multi-robot systems is still in its nascent stage and limited to heavily controlled environments. To overcome that, an online trajectory optimization algorithm that generates collision-free trajectories for robots, when given initial state and desired end pose, is proposed. It utilizes a simple method for obstacle detection, local shape based maps for obstacles and communication of robots' current states. Using the local maps, safe regions are formulated. Based upon the communicated data, trajectories are predicted for other robots and incorporated for collision-avoidance by resizing the regions of free space that the robot can be in without colliding. A trajectory is then optimized constraining the robot to remain within the safe region with the trajectories represented by piecewise polynomials parameterized by time. The algorithm is implemented using a receding horizon principle. The proposed algorithm is extensively tested in simulations on Gazebo using ROS with fourth order differentially flat aerial robots and non-holonomic second order wheeled robots in structured and unstructured environments.

Submitted 28 December, 2018; originally announced December 2018.

Comments: Submitted to IEEE Transactions on Robotics

arXiv:1812.00868 [pdf, other]

Collision-Free Multi Robot Trajectory Optimization in Unknown Environments using Decentralized Trajectory Planning

Authors: Vijay Arvindh, Govind Aadithya R, Shravan Krishnan, Sivanathan K

Abstract: Multi robot systems have the potential to be utilized in a variety of applications. In most of the previous works, the trajectory generation for multi robot systems is implemented in known environments. To overcome that we present an online trajectory optimization algorithm that utilizes communication of robots' current states to account for the other robots while using local object based maps for identifying obstacles. Based upon this data, we predict the trajectory expected to be traversed by the robots and utilize that to avoid collisions by formulating regions of free space that the robot can be in without colliding with other robots and obstacles. A trajectory is optimized constraining the robot to remain within this region. The proposed method is tested in simulations on Gazebo using ROS.

Submitted 3 December, 2018; originally announced December 2018.

Comments: 6 pages, 6 figures. To be presented at 2018 IEEE 4th International Symposium in Robotics and Manufacturing Automation (ROMA)

arXiv:1811.07818 [pdf]

doi 10.18201/ijisae.2018644775

Novel approach to locate region of interest in mammograms for Breast cancer

Authors: BV Divyashree, Amarnath R, Naveen M, G Hemantha Kumar

Abstract: Locating the region of interest for breast cancer masses in a mammographic image is a challenging problem in medical image processing. In this research work, the key idea is to efficiently extract the suspected mass region for further examination. To this end, the paper proposes breast boundary segmentation on a sliced RGB image using a modified intensity-based approach, followed by quad-tree-based division to spot suspicious areas. The method is evaluated on the standard DDSM dataset and achieves acceptable accuracy.

Submitted 1 November, 2018; originally announced November 2018.

Comments: ROI, breast cancer, mammographic images, segmentation, entropy, quad tree

Journal ref: International Journal of Intelligent Systems and Applications in Engineering.(ISSN:2147-6799) Vol 6, No 3 (2018)

arXiv:1806.07834 [pdf, other]

A Look at Motion Planning for Autonomous Vehicles at an Intersection

Authors: Shravan Krishnan, Govind Aadithya R, Rahul Ramakrishnan, Vijay Arvindh, Sivanathan K

Abstract: Autonomous Vehicles are currently being tested in a variety of scenarios. As we move towards Autonomous Vehicles, how should intersections look? To answer that question, we break down intersection management into the different conundrums and scenarios involved in the trajectory planning and current approaches to solve them. Then, a brief analysis of current works in autonomous intersection is conducted. With a critical eye, we try to delve into the discrepancies of existing solutions while presenting some critical and important factors that have been addressed. Furthermore, open issues that have to be addressed are also emphasized. We also try to answer the question of how to benchmark intersection management algorithms by providing some factors that impact autonomous navigation at intersections.

Submitted 7 September, 2018; v1 submitted 20 June, 2018; originally announced June 2018.

Comments: Accepted for presentation at ITSC 2018, Final Version

arXiv:1804.00564 [pdf, other]

Codes with Combined Locality and Regeneration Having Optimal Rate, $d_{\text{min}}$ and Linear Field Size

Authors: M. Nikhil Krishnan, Anantha Narayanan R., P. Vijay Kumar

Abstract: In this paper, we study vector codes with all-symbol locality, where the local code is either a Minimum Bandwidth Regenerating (MBR) code or a Minimum Storage Regenerating (MSR) code. In the first part, we present vector codes with all-symbol MBR locality, for all parameters, that have both optimal minimum-distance and optimal rate. These codes combine ideas from two popular codes in the distributed storage literature, Product-Matrix codes and Tamo-Barg codes. In the second part which deals with codes having all-symbol MSR locality, we follow a Pairwise Coupling Transform-based approach to arrive at optimal minimum-distance and optimal rate, for a range of parameters. All the code constructions presented in this paper have a low field-size that grows linearly with the code-length $n$.

Submitted 2 April, 2018; originally announced April 2018.

Comments: Accepted for publication in 2018 IEEE International Symposium on Information Theory (ISIT)

arXiv:1708.05545 [pdf]

doi 10.5120/ijca2017915133

Spotting Separator Points at Line Terminals in Compressed Document Images for Text-line Segmentation

Authors: Amarnath R, P. Nagabhushan

Abstract: Line separators are used to segregate text-lines from one another in document image analysis. Finding the separator points at every line terminal in a document image would enable text-line segmentation. In particular, identifying the separators in handwritten text could be a thrilling exercise. Obviously it would be challenging to perform this in the compressed version of a document image and that is the proposed objective in this research. Such an effort would prevent the computational burden of decompressing a document for text-line segmentation. Since document images are generally compressed using run length encoding (RLE) technique as per the CCITT standards, the first column in the RLE will be a white column. The value (depth) in the white column is very low when a particular line is a text line and the depth could be larger at the point of text line separation. A longer consecutive sequence of such larger depth should indicate the gap between the text lines, which provides the separator region. In case of over separation and under separation issues, corrective actions such as deletion and insertion are suggested respectively. An extensive experimentation is conducted on the compressed images of the benchmark datasets of ICDAR13 and Alireza et al [17] to demonstrate the efficacy.

Submitted 18 August, 2017; originally announced August 2017.

Comments: Line separators, Document image analysis, Handwritten text, Compression and decompression, RLE, CCITT. Line separator points at every line terminal in a compressed handwritten document images enabling text line segmentation

Journal ref: International Journal of Computer Applications 172(4): 40-47 (2017)

arXiv:1707.01742 [pdf, other]

High Resilience Diverse Domain Multilevel Audio Watermarking with Adaptive Threshold

Authors: Jerrin Thomas Panachakel, Anurenjan P. R

Abstract: A novel diverse domain (DCT-SVD & DWT-SVD) watermarking scheme is proposed in this paper. Here, the watermark is embedded simultaneously onto the two domains. It is shown that an audio signal watermarked using this scheme has better subjective and objective quality when compared with other watermarking schemes. Also proposed are two novel watermark detection algorithms viz., AOT (Adaptively Optimised Threshold) and AOTx (AOT eXtended). The fundamental idea behind both is finding an optimum threshold for detecting a known character embedded along with the actual watermarks in a known location, with the constraint that the Bit Error Rate (BER) is minimum. This optimum threshold is used for detecting the other characters in the watermarks. This approach is shown to make the watermarking scheme less susceptible to various signal processing attacks, thus making the watermarks more robust.

Submitted 5 July, 2017; originally announced July 2017.

arXiv:1402.1245 [pdf]

doi 10.14445/22315381/IJETT-V7P240

A Survey on Delay-Aware Network Structure for Wireless Sensor Networks with Consecutive Data Collection Processes

Authors: Ms. Aruna. G. R, Mr. SivanArulSelvan

Abstract: A Wireless Sensor Network (WSN) consists of spatially distributed autonomous sensors to monitor physical or environmental conditions, such as temperature, sound, pressure, etc. In sensing applications, data packets are flowing from sensor nodes to base station. In data collection processes, bottom up approach is used. In bottom up approach, all nodes send their sensed data packets to base station directly. This approach will lead to increased delay, which will lead to higher energy consumption. To reduce the energy consumption of sensor nodes, Clustering Algorithm and Low-Energy Adaptive Clustering Hierarchy (LEACH) are being used. Efficient gathering in Wireless Sensor information systems, Power Efficient Gathering in Sensor Information System (PEGASIS) is being used. There are lot of research issues in Wireless Sensor Networks such as delay, lifetime of network, energy dissipation which needs to be resolved.

Submitted 6 February, 2014; originally announced February 2014.

Comments: 4 pages, 3 figures, "Published with International Journal of Engineering Trends and Technology (IJETT)"

Journal ref: IJETT, 7(4),198-201, 2014 published by seventh sense research group

arXiv:1109.3898 [pdf, ps, other]

Monitoring Breathing via Signal Strength in Wireless Networks

Authors: Neal Patwari, Joey Wilson, Sai Ananthanarayanan P. R., Sneha K. Kasera, Dwayne Westenskow

Abstract: This paper shows experimentally that standard wireless networks which measure received signal strength (RSS) can be used to reliably detect human breathing and estimate the breathing rate, an application we call "BreathTaking". We show that although an individual link cannot reliably detect breathing, the collective spectral content of a network of devices reliably indicates the presence and rate of breathing. We present a maximum likelihood estimator (MLE) of breathing rate, amplitude, and phase, which uses the RSS data from many links simultaneously. We show experimental results which demonstrate that reliable detection and frequency estimation is possible with 30 seconds of data, within 0.3 breaths per minute (bpm) RMS error. Use of directional antennas is shown to improve robustness to motion near the network.

Submitted 18 September, 2011; originally announced September 2011.

Showing 1–38 of 38 results for author: R., A