-
Effective-LDAM: An Effective Loss Function To Mitigate Data Imbalance for Robust Chest X-Ray Disease Classification
Authors:
Sree Rama Vamsidhar S,
Bhargava Satya,
Rama Krishna Gorthi
Abstract:
Deep Learning (DL) approaches have gained prominence in medical imaging for disease diagnosis. Among these methodologies, Chest X-ray (CXR) classification has proven to be an effective approach for detecting and analyzing various diseases. However, the reliable performance of DL classification algorithms depends on access to large and balanced datasets, which are difficult to obtain in medical imaging because acquiring sufficient data for every disease category is impractical. To tackle this problem, we propose an algorithm-centric approach called Effective-Label Distribution Aware Margin (E-LDAM), which modifies the margin of the widely adopted Label Distribution Aware Margin (LDAM) loss function using the effective number of samples in each class. Experimental evaluations on the COVIDx CXR dataset focus on Normal, Pneumonia, and COVID-19 classification. The experimental results demonstrate the effectiveness of the proposed E-LDAM approach, which achieves a recall of 97.81% for the minority class (COVID-19) in CXR image prediction. Furthermore, the overall accuracy on the three-class classification task reaches 95.26%.
Submitted 6 July, 2024;
originally announced July 2024.
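The margin modification described in the E-LDAM abstract can be illustrated with a minimal sketch. It rests on two assumptions that the abstract does not spell out: the effective number of samples is taken as (1 - beta^n) / (1 - beta), as in class-balanced loss formulations, and the LDAM margin is taken as proportional to n^(-1/4), with n replaced by that effective number. The class counts, beta, max_margin, and scale values are hypothetical and are not the paper's settings.

```python
import math


def effective_number(n, beta=0.999):
    """Effective number of samples, (1 - beta**n) / (1 - beta) (assumed form)."""
    return (1.0 - beta ** n) / (1.0 - beta)


def class_margins(class_counts, max_margin=0.5, beta=None):
    """Per-class margins proportional to size**(-1/4).

    With beta=None the raw counts are used (LDAM-style margin); with a beta
    value the effective numbers are used instead (the assumed E-LDAM variant).
    Margins are rescaled so that the largest one equals max_margin.
    """
    if beta is None:
        sizes = [float(n) for n in class_counts]
    else:
        sizes = [effective_number(n, beta) for n in class_counts]
    raw = [s ** -0.25 for s in sizes]
    factor = max_margin / max(raw)
    return [r * factor for r in raw]


def margin_cross_entropy(logits, target, margins, scale=30.0):
    """Cross-entropy after subtracting the target class margin from its logit."""
    adjusted = list(logits)
    adjusted[target] -= margins[target]
    z = [scale * v for v in adjusted]
    z_max = max(z)  # subtract the max for a numerically stable log-sum-exp
    log_sum = z_max + math.log(sum(math.exp(v - z_max) for v in z))
    return log_sum - z[target]


# Hypothetical class counts for (Normal, Pneumonia, COVID-19).
counts = [8000, 5000, 500]
margins = class_margins(counts, max_margin=0.5, beta=0.999)
loss = margin_cross_entropy([2.0, 1.0, 0.5], target=2, margins=margins)
```

In this sketch the minority class receives the largest margin, so its logit must clear a wider gap before the loss becomes small.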
-
NSD-DIL: Null-Shot Deblurring Using Deep Identity Learning
Authors:
Sree Rama Vamsidhar S,
Rama Krishna Gorthi
Abstract:
In this paper, we propose to reformulate the blind image deblurring task to directly learn an inverse of the degradation model using a deep linear network. We introduce Deep Identity Learning (DIL), a novel learning strategy that includes a dedicated regularization term based on the properties of linear systems, to exploit the identity relation between the degradation and inverse degradation models. The salient aspect of our proposed framework is that it relies on neither a deblurring dataset nor a single input blurred image (unlike Polyblur, a self-supervised method). Since it is purely image-data-independent, we term our model Null-Shot Deblurring Using Deep Identity Learning (NSD-DIL). We also provide an explicit representation of the learned deep linear network in matrix form, called the Deep Restoration Kernel (DRK), for the deblurring task. The proposed framework bypasses the degradation kernel estimation step involved in most existing blind deblurring solutions by means of our proposed Random Kernel Gallery (RKG) dataset. In this work, we focus on the restoration of mildly blurred images, generated by small out-of-focus blur, lens blur, or slight camera motion, which often occur in real images. Our experiments show that the proposed method outperforms both traditional and deep learning based deblurring methods while requiring at least on the order of 100 times fewer computational resources. The proposed NSD-DIL method can also be readily extended to the Image Super-Resolution (ISR) task to restore low-resolution images with fine details. The NSD-DIL model and its kernel-form representation (DRK) are lightweight yet robust and restore a mildly blurred input in a fraction of a second. Hence, they are well suited to a wide range of real-time applications.
Submitted 5 July, 2024;
originally announced July 2024.
-
Narrow Transformer: Starcoder-Based Java-LM For Desktop
Authors:
Kamalkumar Rathinasamy,
Balaji A J,
Ankush Kumar,
Gagan Gayari,
Harshini K,
Rajab Ali Mondal,
Sreenivasa Raghavan K S,
Swayam Singh
Abstract:
This paper presents NT-Java-1.1B, an open-source specialized code language model built on StarCoderBase-1.1B, designed for coding tasks in Java programming. NT-Java-1.1B achieves state-of-the-art performance, surpassing its base model and the majority of other models of similar size on the MultiPL-E Java code benchmark. While there have been studies on extending large, generic pre-trained models to improve proficiency in specific programming languages like Python, similar investigations on small code models for other programming languages are lacking. Large code models require specialized hardware like GPUs for inference, highlighting the need for research into building small code models that can be deployed on developer desktops. This paper addresses this research gap by focusing on the development of a small Java code model, NT-Java-1.1B, and its quantized versions, which perform comparably to open models of around 1.1B parameters on the MultiPL-E Java code benchmark, making them ideal for desktop deployment. This paper establishes the foundation for specialized models across languages and sizes for a family of NT Models.
Submitted 4 July, 2024;
originally announced July 2024.
-
MIMIC: Masked Image Modeling with Image Correspondences
Authors:
Kalyani Marathe,
Mahtab Bigverdi,
Nishat Khan,
Tuhin Kundu,
Patrick Howe,
Sharan Ranjit S,
Anand Bhattad,
Aniruddha Kembhavi,
Linda G. Shapiro,
Ranjay Krishna
Abstract:
Dense pixel-specific representation learning at scale has been bottlenecked by the unavailability of large-scale multi-view datasets. Current methods for building effective pretraining datasets heavily rely on annotated 3D meshes, point clouds, and camera parameters from simulated environments, preventing them from building datasets from real-world data sources where such metadata is lacking. We propose a pretraining dataset-curation approach that does not require any additional annotations. Our method allows us to generate multi-view datasets from both real-world videos and simulated environments at scale. Specifically, we experiment with two scales: MIMIC-1M with 1.3M and MIMIC-3M with 3.1M multi-view image pairs. We train multiple models with different masked image modeling objectives to showcase the following findings: Representations trained on our automatically generated MIMIC-3M outperform those learned from expensive crowdsourced datasets (ImageNet-1K) and those learned from synthetic environments (MULTIVIEW-HABITAT) on two dense geometric tasks: depth estimation on NYUv2 (1.7%) and surface normals estimation on Taskonomy (2.05%). For dense tasks which also require object understanding, we outperform MULTIVIEW-HABITAT on semantic segmentation on ADE20K (3.89%) and pose estimation on MSCOCO (9.4%), and reduce the gap with models pre-trained on the expensive, object-centric ImageNet-1K. We outperform even when the representations are frozen, and when downstream training data is limited to few-shot. The larger dataset (MIMIC-3M) significantly improves performance, which is promising since our curation method can arbitrarily scale to produce even larger datasets. MIMIC code, dataset, and pretrained models are open-sourced at https://github.com/RAIVNLab/MIMIC.
Submitted 15 May, 2024; v1 submitted 26 June, 2023;
originally announced June 2023.
-
AdANNS: A Framework for Adaptive Semantic Search
Authors:
Aniket Rege,
Aditya Kusupati,
Sharan Ranjit S,
Alan Fan,
Qingqing Cao,
Sham Kakade,
Prateek Jain,
Ali Farhadi
Abstract:
Web-scale search systems learn an encoder to embed a given query, which is then hooked into an approximate nearest neighbor search (ANNS) pipeline to retrieve similar data points. To accurately capture tail queries and data points, learned representations typically are rigid, high-dimensional vectors that are generally used as-is in the entire ANNS pipeline and can lead to computationally expensive retrieval. In this paper, we argue that instead of rigid representations, different stages of ANNS can leverage adaptive representations of varying capacities to achieve significantly better accuracy-compute trade-offs, i.e., stages of ANNS that can get away with more approximate computation should use a lower-capacity representation of the same data point. To this end, we introduce AdANNS, a novel ANNS design framework that explicitly leverages the flexibility of Matryoshka Representations. We demonstrate state-of-the-art accuracy-compute trade-offs using novel AdANNS-based key ANNS building blocks like search data structures (AdANNS-IVF) and quantization (AdANNS-OPQ). For example, on ImageNet retrieval, AdANNS-IVF is up to 1.5% more accurate than the rigid representations-based IVF at the same compute budget, and matches accuracy while being up to 90x faster in wall-clock time. For Natural Questions, 32-byte AdANNS-OPQ matches the accuracy of the 64-byte OPQ baseline constructed using rigid representations -- same accuracy at half the cost! We further show that the gains from AdANNS translate to modern-day composite ANNS indices that combine search structures and quantization. Finally, we demonstrate that AdANNS can enable inference-time adaptivity for compute-aware search on ANNS indices built non-adaptively on Matryoshka Representations. Code is open-sourced at https://github.com/RAIVNLab/AdANNS.
Submitted 18 October, 2023; v1 submitted 30 May, 2023;
originally announced May 2023.
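The idea in the AdANNS abstract of giving different search stages representations of different capacity can be sketched with a brute-force two-stage search. This is only an illustration, not the AdANNS-IVF or AdANNS-OPQ implementation: it assumes embeddings with the nested (Matryoshka) property, so that the first low_dim coordinates already form a usable lower-capacity representation. The function names and parameters are hypothetical.

```python
import math


def l2_distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def adaptive_search(query, database, low_dim, shortlist_size, k):
    """Two-stage search over nested embeddings.

    Stage 1 ranks every item using only the first low_dim coordinates
    (cheap, approximate). Stage 2 reranks the shortlist with the full
    vectors (more expensive, more accurate). Returns the indices of the
    top-k items after reranking.
    """
    # Stage 1: coarse shortlist from the low-capacity prefix.
    coarse = sorted(
        range(len(database)),
        key=lambda i: l2_distance(query[:low_dim], database[i][:low_dim]),
    )[:shortlist_size]
    # Stage 2: rerank the shortlist with the full-capacity representation.
    reranked = sorted(coarse, key=lambda i: l2_distance(query, database[i]))
    return reranked[:k]
```

The shortlist size controls the trade-off: a larger shortlist costs more full-dimensional distance computations but is less likely to drop a true neighbor during the coarse stage.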
-
Control and Coordination of a SWARM of Unmanned Surface Vehicles using Deep Reinforcement Learning in ROS
Authors:
Shrudhi R S,
Sreyash Mohanty,
Dr. Susan Elias
Abstract:
An unmanned surface vehicle (USV) can perform complex missions by continuously observing the state of its surroundings and taking action toward a goal. A swarm of USVs working together can complete missions faster and more effectively than a single USV alone. In this paper, we propose an autonomous communication model for a swarm of USVs. The goal is to implement this model as a software system using the Robot Operating System (ROS) and Gazebo. With the main objective of coordinated task completion, a Markov decision process (MDP) provides the basis for formulating the task decision problem so as to achieve efficient localization and tracking in a highly dynamic water environment. To coordinate multiple USVs performing real-time target tracking, we propose an enhanced multi-agent reinforcement learning approach. Our proposed scheme uses Multi-Agent Deep Deterministic Policy Gradient (MA-DDPG), an extension of the Deep Deterministic Policy Gradient (DDPG) algorithm that allows for decentralized control of multiple agents in a cooperative environment. MA-DDPG's decentralized control allows each agent to make decisions based on its own observations and objectives, which can lead to better overall performance and improved stability. Additionally, it provides communication and coordination among agents through the use of collective readings and rewards.
Submitted 17 April, 2023;
originally announced April 2023.
-
SerPyTor: A distributed context-aware computational graph execution framework for durable execution
Authors:
Anuran Roy,
Sridhar Raj S
Abstract:
Distributed computation is always a tricky topic to deal with, especially given the differing requirements of different scenarios. A popular solution is to use Apache Spark with multiple systems forming a cluster. However, the prerequisite setup for a Spark cluster introduces additional overhead, which often limits its use in constrained scenarios, especially those requiring context propagation. In this paper, we explore a relatively lightweight computational graph execution framework that requires little setup and offers fast execution, coupled with context awareness.
Submitted 15 April, 2023;
originally announced April 2023.
-
An Autoencoder Based Approach to Simulate Sports Games
Authors:
Ashwin Vaswani,
Rijul Ganguly,
Het Shah,
Sharan Ranjit S,
Shrey Pandit,
Samruddhi Bothara
Abstract:
Sports data has become widely available in the recent past. With the improvement of machine learning techniques, there have been attempts to use sports data not only to analyze the outcome of individual games but also to improve insights and strategies. The outbreak of COVID-19 has interrupted sports leagues globally, giving rise to increasing questions and speculation about the outcome of this season's leagues. What if the season had not been interrupted and had concluded normally? Which teams would end up winning trophies? Which players would perform the best? Which teams would end their season on a high, and which teams would fail to keep up with the pressure? We aim to tackle this problem and develop a solution. In this paper, we propose UCLData, a dataset containing detailed information on UEFA Champions League games played over the past six years. We also propose a novel autoencoder-based machine learning pipeline that can come up with a story of how the rest of the season will pan out.
Submitted 16 July, 2020;
originally announced July 2020.
-
A Heuristic Algorithm for Network Optimization of OTN over DWDM Network
Authors:
Govardan C.,
Sri Krishna Chaitanya K.,
Krishna Kumar Naik B.,
Shreesha Rao D. S.,
Jagadeesh C.,
Gowrishankar R.,
Siva Sankara Sai S.,
Prabhat Behere,
Bhyri Sai Kishore
Abstract:
While network traffic has increased exponentially, revenues have not kept the same pace. New methods have to be explored to reduce this gap between traffic and revenue. One such method is convergence of networking layers. In this work, we study the convergence of the OTN and DWDM layers from a network planning perspective. We compare the costs of planning networks with and without convergence and show that multilayer planning offers the least cost for higher traffic volumes.
Submitted 31 December, 2018;
originally announced January 2019.
-
Lower Bounds for Interactive Function Computation via Wyner Common Information
Authors:
Shi** Rajakrishnan,
Sundara Rajan S,
Vinod Prabhakaran
Abstract:
The question of how much communication is required between collaborating parties to compute a function of their data is of fundamental importance in the fields of theoretical computer science and information theory. In this work, the focus is on deriving lower bounds on this quantity. The information cost of a protocol is the amount of information the protocol reveals to Alice and Bob about each other's inputs, and the information complexity of a function is the infimum of information costs over all valid protocols. For the amortized case, it is known that the optimal rate for the computation is equal to the information complexity. Computing this information complexity exactly is not straightforward, however. In this work, we lower bound the information complexity for independent inputs in terms of the Wyner common information of a certain pair of random variables. We show a structural property of the optimal auxiliary random variable of Wyner common information and exploit it to compute the Wyner common information exactly in certain cases. The lower bound obtained through this technique is shown to be tight for a non-trivial example: equality (EQ) over the ternary alphabet. We also give an example to show that the lower bound may, in general, not be tight.
Submitted 7 February, 2016;
originally announced February 2016.
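For reference, the two quantities named in this abstract are usually written as follows. These are the standard textbook definitions, stated under assumed notation (inputs X and Y, protocol transcript \Pi, auxiliary variable W, finite alphabets); the paper's own notation and the specific pair of random variables it uses may differ.

```latex
% Information cost of a protocol \pi with transcript \Pi on inputs (X, Y):
% what Bob (holding Y) learns about X, plus what Alice (holding X) learns about Y.
\mathrm{IC}(\pi) = I(X; \Pi \mid Y) + I(Y; \Pi \mid X)

% Information complexity of a function f: infimum over all valid protocols for f.
\mathrm{IC}(f) = \inf_{\pi \text{ computes } f} \mathrm{IC}(\pi)

% Wyner common information of a pair (U, V): the minimum is over auxiliary
% variables W that make U and V conditionally independent (U - W - V is a Markov chain).
C(U; V) = \min_{P_{W \mid U V} \,:\; U - W - V} I(U, V; W)
```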
-
Audio enabled information extraction system for cricket and hockey domains
Authors:
S. Saraswathi,
Narasimha Sravan. V,
Sai Vamsi Krishna. B. V,
Suresh Reddy. S
Abstract:
The proposed system aims to retrieve summarized information from documents collected from a web-based search engine in response to user queries related to the cricket and hockey domains. The system is designed to take voice commands as keywords for the search. The parts of speech in the query are extracted using a natural language extractor for English. Based on the keywords, the search is categorized into two types: (1) Concept-wise search, in which information is retrieved based on the keywords and the concept words related to them, and the retrieved information is summarized using a probabilistic approach and a weighted-means algorithm. (2) Keyword search, which extracts the result relevant to the query from the highly ranked documents returned by the search engine; the relevant search results are retrieved and the keywords are then used for summarization. During summarization, the system follows the weighted and probabilistic approaches to identify the data comparable to the extracted keywords. The extracted information is then refined repeatedly through an aggregation process to reduce redundancy. Finally, the resulting data is presented to the user as audio output.
Submitted 26 April, 2010;
originally announced April 2010.
-
Tracing Technique for Blaster Attack
Authors:
Siti Rahayu S.,
Robiah Y.,
Shahrin S.,
Faizal M. A.,
Mohd Zaki M,
Irda R
Abstract:
The Blaster worm of 2003 is still persistent. The infection appears to have successfully transitioned to new hosts as the original systems are cleaned or shut off, suggesting that the Blaster worm, and other similar worms, will remain significant Internet threats for many years after their initial release. This paper proposes a technique for tracing the Blaster attack from various logs in different OSI layers, based on the fingerprint of the Blaster attack in victim logs, attacker logs, and the IDS alert log. The researchers intend to conduct a preliminary investigation of this particular attack so that it can be used for further research in alert correlation and computer forensic investigation.
Submitted 25 August, 2009;
originally announced August 2009.