-
A Mamba-based Siamese Network for Remote Sensing Change Detection
Authors:
Jay N. Paranjape,
Celso de Melo,
Vishal M. Patel
Abstract:
Change detection in remote sensing images is an essential tool for analyzing a region at different times. It finds varied applications in monitoring environmental changes, man-made changes as well as corresponding decision-making and prediction of future trends. Deep learning methods like Convolutional Neural Networks (CNNs) and Transformers have achieved remarkable success in detecting significan…
▽ More
Change detection in remote sensing images is an essential tool for analyzing a region at different times. It finds varied applications in monitoring environmental changes, man-made changes as well as corresponding decision-making and prediction of future trends. Deep learning methods like Convolutional Neural Networks (CNNs) and Transformers have achieved remarkable success in detecting significant changes, given two images at different times. In this paper, we propose a Mamba-based Change Detector (M-CD) that segments out the regions of interest even better. Mamba-based architectures demonstrate linear-time training capabilities and an improved receptive field over transformers. Our experiments on four widely used change detection datasets demonstrate significant improvements over existing state-of-the-art (SOTA) methods. Our code and pre-trained models are available at https://github.com/JayParanjape/M-CD
△ Less
Submitted 8 July, 2024;
originally announced July 2024.
-
Blackbox Adaptation for Medical Image Segmentation
Authors:
Jay N. Paranjape,
Shameema Sikder,
S. Swaroop Vedula,
Vishal M. Patel
Abstract:
In recent years, various large foundation models have been proposed for image segmentation. There models are often trained on large amounts of data corresponding to general computer vision tasks. Hence, these models do not perform well on medical data. There have been some attempts in the literature to perform parameter-efficient finetuning of such foundation models for medical image segmentation.…
▽ More
In recent years, various large foundation models have been proposed for image segmentation. There models are often trained on large amounts of data corresponding to general computer vision tasks. Hence, these models do not perform well on medical data. There have been some attempts in the literature to perform parameter-efficient finetuning of such foundation models for medical image segmentation. However, these approaches assume that all the parameters of the model are available for adaptation. But, in many cases, these models are released as APIs or blackboxes, with no or limited access to the model parameters and data. In addition, finetuning methods also require a significant amount of compute, which may not be available for the downstream task. At the same time, medical data can't be shared with third-party agents for finetuning due to privacy reasons. To tackle these challenges, we pioneer a blackbox adaptation technique for prompted medical image segmentation, called BAPS. BAPS has two components - (i) An Image-Prompt decoder (IP decoder) module that generates visual prompts given an image and a prompt, and (ii) A Zero Order Optimization (ZOO) Method, called SPSA-GC that is used to update the IP decoder without the need for backpropagating through the foundation model. Thus, our method does not require any knowledge about the foundation model's weights or gradients. We test BAPS on four different modalities and show that our method can improve the original model's performance by around 4%.
△ Less
Submitted 17 May, 2024;
originally announced May 2024.
-
Cross-Dataset Adaptation for Instrument Classification in Cataract Surgery Videos
Authors:
Jay N. Paranjape,
Shameema Sikder,
Vishal M. Patel,
S. Swaroop Vedula
Abstract:
Surgical tool presence detection is an important part of the intra-operative and post-operative analysis of a surgery. State-of-the-art models, which perform this task well on a particular dataset, however, perform poorly when tested on another dataset. This occurs due to a significant domain shift between the datasets resulting from the use of different tools, sensors, data resolution etc. In thi…
▽ More
Surgical tool presence detection is an important part of the intra-operative and post-operative analysis of a surgery. State-of-the-art models, which perform this task well on a particular dataset, however, perform poorly when tested on another dataset. This occurs due to a significant domain shift between the datasets resulting from the use of different tools, sensors, data resolution etc. In this paper, we highlight this domain shift in the commonly performed cataract surgery and propose a novel end-to-end Unsupervised Domain Adaptation (UDA) method called the Barlow Adaptor that addresses the problem of distribution shift without requiring any labels from another domain. In addition, we introduce a novel loss called the Barlow Feature Alignment Loss (BFAL) which aligns features across different domains while reducing redundancy and the need for higher batch sizes, thus improving cross-dataset performance. The use of BFAL is a novel approach to address the challenge of domain shift in cataract surgery data. Extensive experiments are conducted on two cataract surgery datasets and it is shown that the proposed method outperforms the state-of-the-art UDA methods by 6%. The code can be found at https://github.com/JayParanjape/Barlow-Adaptor
△ Less
Submitted 31 July, 2023;
originally announced August 2023.
-
AdaptiveSAM: Towards Efficient Tuning of SAM for Surgical Scene Segmentation
Authors:
Jay N. Paranjape,
Nithin Gopalakrishnan Nair,
Shameema Sikder,
S. Swaroop Vedula,
Vishal M. Patel
Abstract:
Segmentation is a fundamental problem in surgical scene analysis using artificial intelligence. However, the inherent data scarcity in this domain makes it challenging to adapt traditional segmentation techniques for this task. To tackle this issue, current research employs pretrained models and finetunes them on the given data. Even so, these require training deep networks with millions of parame…
▽ More
Segmentation is a fundamental problem in surgical scene analysis using artificial intelligence. However, the inherent data scarcity in this domain makes it challenging to adapt traditional segmentation techniques for this task. To tackle this issue, current research employs pretrained models and finetunes them on the given data. Even so, these require training deep networks with millions of parameters every time new data becomes available. A recently published foundation model, Segment-Anything (SAM), generalizes well to a large variety of natural images, hence tackling this challenge to a reasonable extent. However, SAM does not generalize well to the medical domain as is without utilizing a large amount of compute resources for fine-tuning and using task-specific prompts. Moreover, these prompts are in the form of bounding-boxes or foreground/background points that need to be annotated explicitly for every image, making this solution increasingly tedious with higher data size. In this work, we propose AdaptiveSAM - an adaptive modification of SAM that can adjust to new datasets quickly and efficiently, while enabling text-prompted segmentation. For finetuning AdaptiveSAM, we propose an approach called bias-tuning that requires a significantly smaller number of trainable parameters than SAM (less than 2\%). At the same time, AdaptiveSAM requires negligible expert intervention since it uses free-form text as prompt and can segment the object of interest with just the label name as prompt. Our experiments show that AdaptiveSAM outperforms current state-of-the-art methods on various medical imaging datasets including surgery, ultrasound and X-ray. Code is available at https://github.com/JayParanjape/biastuning
△ Less
Submitted 7 August, 2023;
originally announced August 2023.
-
Characterizing the Weight Space for Different Learning Models
Authors:
Saurav Musunuru,
Jay N. Paranjape,
Rahul Kumar Dubey,
Vijendran G. Venkoparao
Abstract:
Deep Learning has become one of the primary research areas in develo** intelligent machines. Most of the well-known applications (such as Speech Recognition, Image Processing and NLP) of AI are driven by Deep Learning. Deep Learning algorithms mimic human brain using artificial neural networks and progressively learn to accurately solve a given problem. But there are significant challenges in De…
▽ More
Deep Learning has become one of the primary research areas in develo** intelligent machines. Most of the well-known applications (such as Speech Recognition, Image Processing and NLP) of AI are driven by Deep Learning. Deep Learning algorithms mimic human brain using artificial neural networks and progressively learn to accurately solve a given problem. But there are significant challenges in Deep Learning systems. There have been many attempts to make deep learning models imitate the biological neural network. However, many deep learning models have performed poorly in the presence of adversarial examples. Poor performance in adversarial examples leads to adversarial attacks and in turn leads to safety and security in most of the applications. In this paper we make an attempt to characterize the solution space of a deep neural network in terms of three different subsets viz. weights belonging to exact trained patterns, weights belonging to generalized pattern set and weights belonging to adversarial pattern sets. We attempt to characterize the solution space with two seemingly different learning paradigms viz. the Deep Neural Networks and the Dense Associative Memory Model, which try to achieve learning via quite different mechanisms. We also show that adversarial attacks are generally less successful against Associative Memory Models than Deep Neural Networks.
△ Less
Submitted 4 June, 2020;
originally announced June 2020.
-
Exploring the role of Input and Output Layers of a Deep Neural Network in Adversarial Defense
Authors:
Jay N. Paranjape,
Rahul Kumar Dubey,
Vijendran V Gopalan
Abstract:
Deep neural networks are learning models having achieved state of the art performance in many fields like prediction, computer vision, language processing and so on. However, it has been shown that certain inputs exist which would not trick a human normally, but may mislead the model completely. These inputs are known as adversarial inputs. These inputs pose a high security threat when such models…
▽ More
Deep neural networks are learning models having achieved state of the art performance in many fields like prediction, computer vision, language processing and so on. However, it has been shown that certain inputs exist which would not trick a human normally, but may mislead the model completely. These inputs are known as adversarial inputs. These inputs pose a high security threat when such models are used in real world applications. In this work, we have analyzed the resistance of three different classes of fully connected dense networks against the rarely tested non-gradient based adversarial attacks. These classes are created by manipulating the input and output layers. We have proven empirically that owing to certain characteristics of the network, they provide a high robustness against these attacks, and can be used in fine tuning other models to increase defense against adversarial attacks.
△ Less
Submitted 2 June, 2020;
originally announced June 2020.