Unsupervised and semi-supervised co-salient object detection via segmentation frequency statistics

Authors: Souradeep Chakraborty, Shujon Naha, Muhammet Bastan, Amit Kumar K C, Dimitris Samaras

Abstract: In this paper, we address the detection of co-occurring salient objects (CoSOD) in an image group using frequency statistics in an unsupervised manner, which further enables us to develop a semi-supervised method. While previous works have mostly focused on fully supervised CoSOD, less attention has been paid to detecting co-salient objects when limited segmentation annotations are available for training. Our simple yet effective unsupervised method US-CoSOD combines the object co-occurrence frequency statistics of unsupervised single-image semantic segmentations with salient foreground detections using self-supervised feature learning. For the first time, we show that a large unlabeled dataset, e.g., ImageNet-1k, can be effectively leveraged to significantly improve unsupervised CoSOD performance. Our unsupervised model is a great pre-training initialization for our semi-supervised model SS-CoSOD, especially when very limited labeled data is available for training. To avoid propagating erroneous signals from predictions on unlabeled data, we propose a confidence estimation module to guide our semi-supervised training. Extensive experiments on three CoSOD benchmark datasets show that both our unsupervised and semi-supervised models outperform the corresponding state-of-the-art models by a significant margin (e.g., on the Cosal2015 dataset, our US-CoSOD model has an 8.8% F-measure gain over a SOTA unsupervised co-segmentation model and our SS-CoSOD model has an 11.81% F-measure gain over a SOTA semi-supervised CoSOD model).

Submitted 11 November, 2023; originally announced November 2023.

Comments: Accepted at IEEE WACV 2024

arXiv:2304.07092 [pdf, other]

Obfuscation of Discrete Data

Authors: Saswata Naha, Sayantan Roy, Arkaprava Sanki, Diptanil Santra

Abstract: Data obfuscation deals with the problem of masking a data set in such a way that the utility of the data is maximized while the risk of disclosing sensitive information is minimized. To protect data, we consider some approaches that may also retain its statistical uses to some extent. One such approach is to mask the data with additive noise and recover certain desired parameters of the original distribution from knowledge of the noise distribution and the masked data. In this project, we discuss the estimation of any desired quantile and the range of a quantitative data set masked with additive noise.

Submitted 14 April, 2023; originally announced April 2023.

Comments: 16 pages, 32 figures

arXiv:2004.00060 [pdf, other]

HOPE-Net: A Graph-based Model for Hand-Object Pose Estimation

Authors: Bardia Doosti, Shujon Naha, Majid Mirbagheri, David Crandall

Abstract: Hand-object pose estimation (HOPE) aims to jointly detect the poses of both a hand and of a held object. In this paper, we propose a lightweight model called HOPE-Net which jointly estimates hand and object pose in 2D and 3D in real-time. Our network uses a cascade of two adaptive graph convolutional neural networks, one to estimate 2D coordinates of the hand joints and object corners, followed by another to convert 2D coordinates to 3D. Our experiments show that through end-to-end training of the full network, we achieve better accuracy for both the 2D and 3D coordinate estimation problems. The proposed 2D to 3D graph convolution-based model could be applied to other 3D landmark detection problems, where it is possible to first predict the 2D keypoints and then transform them to 3D.

Submitted 31 March, 2020; originally announced April 2020.

Comments: IEEE Conference on Computer Vision and Pattern Recognition (CVPR)

arXiv:1806.11266 [pdf, other]

Gated Feedback Refinement Network for Coarse-to-Fine Dense Semantic Image Labeling

Authors: Md Amirul Islam, Mrigank Rochan, Shujon Naha, Neil D. B. Bruce, Yang Wang

Abstract: Effective integration of local and global contextual information is crucial for semantic segmentation and dense image labeling. We develop two encoder-decoder based deep learning architectures to address this problem. We first propose a network architecture called Label Refinement Network (LRN) that predicts segmentation labels in a coarse-to-fine fashion at several spatial resolutions. In this network, we also define loss functions at several stages to provide supervision at different stages of training. However, there are limits to the quality of refinement possible if ambiguous information is passed forward. To address this limitation, we also propose the Gated Feedback Refinement Network (G-FRNet). Initially, G-FRNet makes a coarse-grained prediction, which it progressively refines to recover details by effectively integrating local and global contextual information during the refinement stages. This is achieved by gate units proposed in this work that control the information passed forward in order to resolve the ambiguity. Experiments were conducted on five challenging dense labeling datasets (CamVid, PASCAL VOC 2012, Horse-Cow Parsing, PASCAL-Person-Part, and SUN-RGBD). G-FRNet achieves state-of-the-art semantic segmentation results on the CamVid and Horse-Cow Parsing datasets and produces results competitive with the best performing approaches that appear in the literature for the other three datasets.

Submitted 29 June, 2018; originally announced June 2018.

arXiv:1803.06002 [pdf]

Ridge Regression Estimated Linear Probability Model Predictions of N-glycosylation in Proteins with Structural and Sequence Data

Authors: Rajaram Gana, Swagata Naha, Raja Mazumder, Radoslav Goldman, Sona Vasudevan

Abstract: Absent experimental evidence, a robust methodology to predict the likelihood of N-glycosylation in human proteins is essential for guiding experimental work. Based on the distribution of amino acids in the neighborhood of the NxS/T sequon (N-site); the structural attributes of the N-site that include Accessible Surface Area, secondary structural elements, main-chain phi-psi, turn types; the relative location of the N-site in the primary sequence; and the nature of the glycan bound, the ridge regression estimated linear probability model is used to predict this likelihood. This model yields a Kolmogorov-Smirnov (Gini coefficient) statistic value of about 74% (89%), which is reasonable.

Submitted 15 March, 2018; originally announced March 2018.

Comments: 20 pages

MSC Class: 62J05; 62J07

arXiv:1703.00551 [pdf, other]

Label Refinement Network for Coarse-to-Fine Semantic Segmentation

Authors: Md Amirul Islam, Shujon Naha, Mrigank Rochan, Neil Bruce, Yang Wang

Abstract: We consider the problem of semantic image segmentation using deep convolutional neural networks. We propose a novel network architecture called the label refinement network that predicts segmentation labels in a coarse-to-fine fashion at several resolutions. The segmentation labels at a coarse resolution are used together with convolutional features to obtain finer resolution segmentation labels. We define loss functions at several stages in the network to provide supervision at different stages. Our experimental results on several standard datasets demonstrate that the proposed model provides an effective way of producing pixel-wise dense image labeling.

Submitted 1 March, 2017; originally announced March 2017.

Comments: 9 pages

Showing 1–6 of 6 results for author: Naha, S