-
Comprehensive Dataset for Urban Streetlight Analysis
Authors:
Eliza Femi Sherley S,
Sanjay T,
Shri Kaanth P,
Jeffrey Samuel S
Abstract:
This article includes a comprehensive collection of over 800 high-resolution streetlight images taken systematically from India's major streets, primarily in the Chennai region. The images were methodically collected following standardized methods to assure uniformity and quality. Each image has been labelled and grouped into directories based on binary class labels, which indicate whether each streetlight is functional or not. This organized dataset is intended to make it easier to train and evaluate deep neural networks, allowing for the creation of pre-trained models that have robust feature representations. Such models have several potential uses, such as improving smart city surveillance systems, automating street infrastructure monitoring, and increasing urban management efficiency. The availability of this dataset is intended to inspire future research and development in computer vision and smart city technologies, supporting innovation and practical solutions to urban infrastructure concerns. The dataset can be accessed at https://github.com/Team16Project/Street-Light-Dataset/.
Submitted 1 July, 2024;
originally announced July 2024.
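The abstract above describes images grouped into directories by a binary class label. A minimal sketch of reading such a layout into (path, label) pairs might look like the following. The directory names `functional` and `non_functional` and the accepted file suffixes are assumptions for illustration only; the repository's actual names should be checked before use.

```python
from pathlib import Path

# Hypothetical directory names mapped to binary labels; not taken from the repository.
CLASS_DIRS = {"functional": 1, "non_functional": 0}
# Assumed image suffixes, compared case-insensitively.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def list_labelled_images(root):
    """Return (path, label) pairs for images found under root/<class_dir>/."""
    samples = []
    for dir_name, label in CLASS_DIRS.items():
        class_dir = Path(root) / dir_name
        if not class_dir.is_dir():
            continue  # tolerate a missing class directory
        for path in sorted(class_dir.iterdir()):
            if path.suffix.lower() in IMAGE_SUFFIXES:
                samples.append((path, label))
    return samples
```

The returned list can be shuffled and split into training and evaluation sets before being passed to whichever image-loading library is in use.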
-
Enhancing Long-Term Memory using Hierarchical Aggregate Tree for Retrieval Augmented Generation
Authors:
Aadharsh Aadhithya A,
Sachin Kumar S,
Soman K. P
Abstract:
Large language models have limited context capacity, hindering reasoning over long conversations. We propose the Hierarchical Aggregate Tree (HAT) memory structure to recursively aggregate relevant dialogue context through conditional tree traversals. HAT encapsulates information from child nodes, enabling broad coverage with depth control. We formulate finding the best context as an optimal tree traversal. Experiments show HAT improves dialog coherence and summary quality over baseline contexts, demonstrating the technique's effectiveness for multi-turn reasoning without exponential parameter growth. This memory augmentation enables more consistent, grounded long-form conversations from LLMs.
Submitted 10 June, 2024;
originally announced June 2024.
-
Learning (With) Distributed Optimization
Authors:
Aadharsh Aadhithya A,
Abinesh S,
Akshaya J,
Jayanth M,
Vishnu Radhakrishnan,
Sowmya V,
Soman K. P
Abstract:
This paper provides an overview of the historical progression of distributed optimization techniques, tracing their development from early duality-based methods pioneered by Dantzig, Wolfe, and Benders in the 1960s to the emergence of the Augmented Lagrangian Alternating Direction Inexact Newton (ALADIN) algorithm. The initial focus on Lagrangian relaxation for convex problems and decomposition strategies led to the refinement of methods like the Alternating Direction Method of Multipliers (ADMM). The resurgence of interest in distributed optimization in the late 2000s, particularly in machine learning and imaging, demonstrated ADMM's practical efficacy and its unifying potential. This overview also highlights the emergence of the proximal center method and its applications in diverse domains. Furthermore, the paper underscores the distinctive features of ALADIN, which offers convergence guarantees for non-convex scenarios without introducing auxiliary variables, differentiating it from traditional augmentation techniques. In essence, this work encapsulates the historical trajectory of distributed optimization and underscores the promising prospects of ALADIN in addressing non-convex optimization challenges.
Submitted 10 August, 2023;
originally announced August 2023.
-
Fashion-model pose recommendation and generation using Machine Learning
Authors:
Vijitha Kannumuru,
Santhosh Kannan S P,
Krithiga Shankar,
Joy Larnyoh,
Rohith Mahadevan,
Raja CSP Raman
Abstract:
Fashion-model pose is an important attribute in the fashion industry. Creative directors, modeling production houses, and top photographers always look for professional models who are able to pose. Without the skill to pose correctly, a model's chances of landing professional modeling employment are regrettably slim. There are occasions when models and photographers are unsure of the best pose to strike while taking photographs. This research concentrates on suggesting to fashion personnel a series of similar images based on an input image. The image is segmented into different parts and similar images are suggested to the user. This was achieved by calculating the color histogram of the input image, doing the same for all the images in the dataset, and comparing the histograms. Synthetic images have become popular as a way to avoid privacy concerns and to overcome the high cost of photoshoots. Hence, this paper also extends the work, to an extent, by generating synthetic images from the recommendation engine using StyleGAN.
Submitted 19 February, 2023;
originally announced March 2023.
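The histogram comparison described in the abstract above can be sketched roughly as follows. The abstract does not say which bin count or similarity measure was used, so the joint RGB histogram with 4 bins per channel and the histogram-intersection score are assumptions, as are all function names. Images are represented here as plain lists of (r, g, b) tuples to keep the sketch free of third-party dependencies.

```python
def color_histogram(pixels, bins=4):
    """Normalised joint RGB histogram of (r, g, b) tuples with values in 0..255."""
    hist = [0.0] * (bins ** 3)
    count = 0
    for r, g, b in pixels:
        # Map each channel value to a bin index in 0..bins-1.
        ri, gi, bi = r * bins // 256, g * bins // 256, b * bins // 256
        hist[ri * bins * bins + gi * bins + bi] += 1.0
        count += 1
    if count == 0:
        return hist
    return [value / count for value in hist]


def histogram_intersection(h1, h2):
    """Similarity in [0, 1] for two normalised histograms; 1 means identical."""
    return sum(min(a, b) for a, b in zip(h1, h2))


def most_similar(query_pixels, dataset, top_k=3):
    """Rank dataset images (name -> pixel list) by similarity to the query."""
    query_hist = color_histogram(query_pixels)
    scored = [
        (histogram_intersection(query_hist, color_histogram(pixels)), name)
        for name, pixels in dataset.items()
    ]
    # Highest score first; ties are broken alphabetically by name.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [name for _, name in scored[:top_k]]
```

To mirror the segmentation step mentioned in the abstract, the same functions could be applied per image segment rather than to the whole image.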
-
Explainable AI Framework for COVID-19 Prediction in Different Provinces of India
Authors:
Mredulraj S. Pandianchery,
Gopalakrishnan E. A,
Sowmya V,
Vinayakumar Ravi,
Soman K. P
Abstract:
In 2020, the COVID-19 virus had reached more than 200 countries. As of December 20, 2021, 221 nations had collectively reported 275M confirmed cases of COVID-19 and a total death toll of 5.37M. Many countries, including the United States, India, Brazil, the United Kingdom, and Russia, were badly affected by the COVID-19 pandemic due to their large populations. The total confirmed cases reported in these countries were 51.7M, 34.7M, 22.2M, 11.3M, and 10.2M respectively as of December 20, 2021. The pandemic can be controlled with the help of precautionary steps by the government and civilians of a country. Early prediction of COVID-19 cases helps to track the transmission dynamics and alerts the government to take the necessary precautions. Recurrent deep learning algorithms are data-driven models that play a key role in capturing the patterns present in time series data. In the literature, many Recurrent Neural Network (RNN) based models have been proposed for the efficient prediction of COVID-19 cases in different provinces. These studies do not address the interpretation of model behavior and robustness. In this study, an LSTM model is proposed for the efficient prediction of active cases in each province of India. The active cases dataset for each province in India is taken from the publicly available Johns Hopkins dataset for the period from 10th June, 2020 to 4th August, 2021. The proposed LSTM model is trained on one state, i.e., Maharashtra, and tested on the rest of the provinces in India. The concept of Explainable AI is used in this study for better interpretation and understanding of the model behavior. The proposed model is used to forecast the active cases in India from 16th December, 2021 to 5th March, 2022. It is noted that there will be an emergence of a third wave in India in January 2022.
Submitted 30 July, 2022; v1 submitted 12 January, 2022;
originally announced January 2022.
-
AI-Powered Semantic Segmentation and Fluid Volume Calculation of Lung CT images in Covid-19 Patients
Authors:
Sabeerali K. P,
Saleena T. S,
Dr. Muhamed Ilyas P,
Dr. Neha Mohan
Abstract:
COVID-19 is a deadly disease that is spreading very fast. People with a compromised immune system are susceptible to many health conditions. A highly significant condition is pneumonia, which is found to be the cause of death in the majority of patients. The main purpose of this study is to find the volume of ground-glass opacity (GGO) and consolidation in a COVID-19 patient so that physicians can prioritize patients. Here we use transfer learning techniques for the segmentation of lung CTs, with the latest libraries and techniques, which reduce training time and increase the accuracy of the AI model. The system is trained with the DeepLabV3+ network architecture and a ResNet50 model with ImageNet weights. We used different augmentation techniques such as Gaussian noise, horizontal shift, and color variation to get to the result. Intersection over Union (IoU) is used as the performance metric. The IoU is 99.78% for lung masks and 89.01% for infected masks. Our work effectively measures the volume of the infected region by calculating the volumes of the infected and lung mask regions of the patients.
Submitted 29 October, 2021;
originally announced October 2021.
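The Intersection over Union metric named in the abstract above is the standard ratio of overlapping to combined mask area. A minimal sketch on binary masks stored as nested lists is shown below. The `infected_fraction` helper is only one plausible reading of how the infected and lung mask volumes might be related (a pixel-count ratio); it is an assumption, not the authors' stated procedure.

```python
def iou(pred, target):
    """Intersection over Union of two equally sized binary masks (2D lists of 0/1)."""
    intersection = 0
    union = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if p and t:
                intersection += 1
            if p or t:
                union += 1
    # Convention assumed here: two empty masks count as a perfect match.
    return intersection / union if union else 1.0


def infected_fraction(infected_mask, lung_mask):
    """Assumed helper: share of lung pixels that are marked as infected."""
    lung_pixels = sum(sum(1 for v in row if v) for row in lung_mask)
    infected_pixels = sum(sum(1 for v in row if v) for row in infected_mask)
    if lung_pixels == 0:
        raise ValueError("lung mask is empty")
    return infected_pixels / lung_pixels
```

For a full CT volume, the same counts would be accumulated over all slices and multiplied by the physical voxel size to obtain a volume.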
-
BOLD: An Ontology-based Log Debugger for C Programs
Authors:
Dileep Kumar P,
Rupesh Nasre,
Sreenivasa Kumar P
Abstract:
The different activities related to debugging, such as program instrumentation, representation of the execution trace, and analysis of the trace, are not typically performed in a unified framework. We propose \textit{BOLD}, an Ontology-based Log Debugger, to unify and standardize the activities in debugging. The syntactical information of programs can be represented in the form of Resource Description Framework (RDF) triples. Using the BOLD framework, programs can be automatically instrumented by using declarative specifications over these triples. A salient feature of the framework is that it also stores the execution trace of the program as RDF triples, called \textit{trace triples}. These triples can be queried to implement the common debug operations. The novelty of the framework is to abstract these triples as \textit{spans} for high-level reasoning. A span gives a way of examining the values of a particular variable over a certain portion of the program execution. The properties of the spans are defined formally as a Web Ontology Language (OWL) ontology called the \textit{Program Debug (PD) Ontology}. Using the span abstraction and the PD ontology, end-users can debug a given buggy program in a standard manner. A notable feature of using an ontology is that users can accurately debug in some cases of missing information, which can be practically useful. To demonstrate the feasibility of the proposed framework, we have debugged the programs in a standard bug benchmark suite, the Software-artifact Infrastructure Repository (SIR). Experiments show that the querying time is almost the same as in \texttt{gdb}. The reasoning time depends on the sub-language of OWL. We find that the expressibility offered by the OWL-DL language is sufficient for the bugs in SIR programs, but to achieve scalability in reasoning, the restricted OWL-RL language is required.
Submitted 23 April, 2020;
originally announced April 2020.
-
Prediction of number of cases expected and estimation of the final size of coronavirus epidemic in India using the logistic model and genetic algorithm
Authors:
Ganesh Kumar M,
Soman K. P,
Gopalakrishnan E. A,
Vijay Krishna Menon,
Sowmya V
Abstract:
In this paper, we have applied the logistic growth regression model and genetic algorithm to predict the number of coronavirus infected cases that can be expected in upcoming days in India and also estimated the final size and its peak time of the coronavirus epidemic in India.
Submitted 26 March, 2020;
originally announced March 2020.
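The logistic growth model mentioned in the abstract above is commonly written as C(t) = K / (1 + exp(-r (t - t0))), where K is the final epidemic size, r the growth rate, and t0 the time of peak daily growth. The sketch below evaluates that curve; the parameterization and names are assumptions, since the abstract does not give the exact form used, and the genetic-algorithm fitting of K, r, and t0 is not shown.

```python
import math


def logistic_cases(t, final_size, growth_rate, peak_time):
    """Cumulative cases at time t under the logistic curve K / (1 + exp(-r (t - t0)))."""
    return final_size / (1.0 + math.exp(-growth_rate * (t - peak_time)))


def daily_growth(t, final_size, growth_rate, peak_time):
    """Derivative of the logistic curve, r * C * (1 - C / K); largest at t = peak_time."""
    cases = logistic_cases(t, final_size, growth_rate, peak_time)
    return growth_rate * cases * (1.0 - cases / final_size)
```

Under this form, half of the final size is reached at the peak time, and the maximum daily growth equals r * K / 4.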
-
Offensive Language Detection: A Comparative Analysis
Authors:
Vyshnav M T,
Sachin Kumar S,
Soman K P
Abstract:
Offensive behaviour has become pervasive in the Internet community. Individuals take advantage of anonymity in the cyber world and indulge in offensive communications which they may not consider in real life. Governments, online communities, companies, etc. are investing in the prevention of offensive content in social media. One of the most effective solutions for tackling this enigmatic problem is the use of computational techniques to identify offensive content and take action. The current work focuses on detecting offensive language in English tweets. The dataset used for the experiment is obtained from SemEval-2019 Task 6 on Identifying and Categorizing Offensive Language in Social Media (OffensEval). The dataset contains 14,460 annotated English tweets. The present paper provides a comparative analysis and a Random Kitchen Sink (RKS) based approach for offensive language detection. We explore the effectiveness of the Google sentence encoder, fastText, Dynamic Mode Decomposition (DMD) based features, and the RKS method for offensive language detection. From the experiments and evaluation, we observed that RKS with fastText achieved competitive results. The evaluation measures used are accuracy, precision, recall, and F1-score.
Submitted 9 January, 2020;
originally announced January 2020.
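Random Kitchen Sinks, named in the abstract above, is usually understood as the random Fourier feature map of Rahimi and Recht, which approximates an RBF kernel with an explicit feature vector. The sketch below shows that general technique for the kernel exp(-gamma * ||x - y||^2). The function name, the feature count, and the value of gamma are assumptions, and the paper's actual configuration on top of fastText embeddings is not given in the abstract.

```python
import math
import random


def make_rks_map(input_dim, num_features, gamma, seed=0):
    """Build a random Fourier feature map approximating exp(-gamma * ||x - y||^2)."""
    rng = random.Random(seed)
    # For this kernel, frequencies are drawn from N(0, 2 * gamma) per dimension.
    scale = math.sqrt(2.0 * gamma)
    weights = [
        [rng.gauss(0.0, scale) for _ in range(input_dim)]
        for _ in range(num_features)
    ]
    offsets = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(num_features)]
    norm = math.sqrt(2.0 / num_features)

    def transform(x):
        # One cosine feature per random (frequency, phase) pair.
        return [
            norm * math.cos(sum(w_i * x_i for w_i, x_i in zip(w, x)) + b)
            for w, b in zip(weights, offsets)
        ]

    return transform
```

The dot product of two transformed vectors approximates the kernel value, so a linear classifier trained on these features behaves approximately like a kernel classifier.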
-
Smart Summarizer for Blind People
Authors:
Mona teja K,
Mohan Sai. S,
H S S S Raviteja D,
Sai Kushagra P V
Abstract:
In today's world, time is a very important resource. In our busy lives, most of us hardly have time to read the complete news, so we just go through the headlines and satisfy ourselves with that. As a result, we might miss a part of the news or misinterpret the whole story. The situation is even worse for people who are visually impaired or have lost their ability to see. The inability of these people to read text has a huge impact on their lives. There are a number of methods for blind people to read text. Braille script is one example, but it is a highly inefficient method as it is time-consuming and requires a lot of practice. So, we present a method for visually impaired people based on the sense of sound, which is obviously better and more accurate than the sense of touch. This paper deals with an efficient method to summarize news into important keywords so as to save the effort of going through the complete text every single time. It covers several APIs and modules, such as Tesseract and gTTS, and several algorithms that are discussed and implemented in detail, such as Luhn's algorithm, the Latent Semantic Analysis algorithm, and the Text Ranking algorithm. The other functionality that this paper deals with is converting the summarized text to speech so that the system can aid even blind people.
Submitted 1 January, 2020;
originally announced January 2020.
-
Concatenated Feature Pyramid Network for Instance Segmentation
Authors:
Yongqing Sun,
Pranav Shenoy K P,
Jun Shimamura,
Atsushi Sagata
Abstract:
Low-level features like edges and textures play an important role in accurately localizing instances in neural networks. In this paper, we propose an architecture which improves the feature pyramid networks commonly used in instance segmentation networks by incorporating low-level features in all layers of the pyramid in an optimal and efficient way. Specifically, we introduce a new layer which learns new correlations from feature maps of multiple feature pyramid levels holistically and enhances the semantic information of the feature pyramid to improve accuracy. Our architecture is simple to implement in instance segmentation or object detection frameworks to boost accuracy. Using this method in Mask R-CNN, our model achieves consistent improvement in precision on the COCO dataset with the computational overhead compared to the original feature pyramid network.
Submitted 16 March, 2019;
originally announced April 2019.
-
Weakly Supervised Instance Segmentation Using Hybrid Network
Authors:
Shisha Liao,
Yongqing Sun,
Chenqiang Gao,
Pranav Shenoy K P,
Song Mu,
Jun Shimamura,
Atsushi Sagata
Abstract:
Weakly-supervised instance segmentation, which could greatly save the labor and time cost of pixel mask annotation, has attracted increasing attention in recent years. The commonly used pipeline first uses conventional image segmentation methods to automatically generate initial masks and then uses them to train an off-the-shelf segmentation network in an iterative way. However, the initially generated masks usually contain a notable proportion of invalid masks, which are mainly caused by small object instances. Directly using these initial masks to train a segmentation model is harmful to performance. To address this problem, we propose a hybrid network in this paper. In our architecture, there is a principal segmentation network which is used to handle the normal samples with valid generated masks. In addition, a complementary branch is added to handle the small and dim objects without valid masks. Experimental results indicate that our method can achieve significant performance improvement on both the small object instances and the large ones, and outperforms all state-of-the-art methods.
Submitted 12 December, 2018;
originally announced December 2018.
-
Ensemble of Convolutional Neural Networks for Automatic Grading of Diabetic Retinopathy and Macular Edema
Authors:
Avinash Kori,
Sai Saketh Chennamsetty,
Mohammed Safwan K. P.,
Varghese Alex
Abstract:
In this manuscript, we automate the procedure of grading diabetic retinopathy (DR) and macular edema from fundus images using an ensemble of convolutional neural networks. The limited availability of labeled data for supervised learning was circumvented by using a transfer learning approach. The models in the ensemble were pre-trained on a large dataset comprising natural images and were later fine-tuned with the limited data for the task of choice. For an image, the ensemble of classifiers generates multiple predictions, and a max-voting based approach was utilized to attain the final grade of the anomaly in the image. For the task of grading DR, on the test data (n=56), the ensemble achieved an accuracy of 83.9%, while for the task of grading macular edema the network achieved an accuracy of 95.45% (n=44).
Submitted 11 September, 2018;
originally announced September 2018.
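The max-voting step described in the abstract above can be sketched as a simple majority count over the grades predicted by the individual models. The tie-breaking rule used here (prefer the lowest grade) is an assumption made so the function is deterministic; the abstract does not say how ties were resolved.

```python
from collections import Counter


def max_vote(predictions):
    """Return the most frequent predicted grade; ties go to the lowest grade (assumed rule)."""
    if not predictions:
        raise ValueError("at least one prediction is required")
    counts = Counter(predictions)
    best_count = max(counts.values())
    return min(grade for grade, count in counts.items() if count == best_count)
```

For example, if three models in the ensemble predict grades 2, 2, and 1 for an image, the final grade is 2.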