SICKLE: A Multi-Sensor Satellite Imagery Dataset Annotated with Multiple Key Cropping Parameters

Authors: Depanshu Sani, Sandeep Mahato, Sourabh Saini, Harsh Kumar Agarwal, Charu Chandra Devshali, Saket Anand, Gaurav Arora, Thiagarajan Jayaraman

Abstract: The availability of well-curated datasets has driven the success of Machine Learning (ML) models. Despite greater access to earth observation data in agriculture, there is a scarcity of curated and labelled datasets, which limits the potential of its use in training ML models for remote sensing (RS) in agriculture. To this end, we introduce a first-of-its-kind dataset called SICKLE, which constitutes a time-series of multi-resolution imagery from 3 distinct satellites: Landsat-8, Sentinel-1 and Sentinel-2. Our dataset constitutes multi-spectral, thermal and microwave sensors during January 2018 - March 2021 period. We construct each temporal sequence by considering the cropping practices followed by farmers primarily engaged in paddy cultivation in the Cauvery Delta region of Tamil Nadu, India; and annotate the corresponding imagery with key cropping parameters at multiple resolutions (i.e. 3m, 10m and 30m). Our dataset comprises 2,370 season-wise samples from 388 unique plots, having an average size of 0.38 acres, for classifying 21 crop types across 4 districts in the Delta, which amounts to approximately 209,000 satellite images. Out of the 2,370 samples, 351 paddy samples from 145 plots are annotated with multiple crop parameters; such as the variety of paddy, its growing season and productivity in terms of per-acre yields. Ours is also one among the first studies that consider the growing season activities pertinent to crop phenology (spans sowing, transplanting and harvesting dates) as parameters of interest.
We benchmark SICKLE on three tasks: crop type, crop phenology (sowing, transplanting, harvesting), and yield prediction.

Submitted 29 November, 2023; originally announced December 2023.

Comments: Accepted as an oral presentation at WACV 2024

arXiv:2311.11723 [pdf, other]

Leveraging Uncertainty Estimates To Improve Classifier Performance

Authors: Gundeep Arora, Srujana Merugu, Anoop Saladi, Rajeev Rastogi

Abstract: Binary classification involves predicting the label of an instance based on whether the model score for the positive class exceeds a threshold chosen based on the application requirements (e.g., maximizing recall for a precision bound). However, model scores are often not aligned with the true positivity rate. This is especially true when the training involves a differential sampling across classes or there is distributional drift between train and test settings. In this paper, we provide theoretical analysis and empirical evidence of the dependence of model score estimation bias on both uncertainty and score itself. Further, we formulate the decision boundary selection in terms of both model score and uncertainty, prove that it is NP-hard, and present algorithms based on dynamic programming and isotonic regression. Evaluation of the proposed algorithms on three real-world datasets yield 25%-40% gain in recall at high precision bounds over the traditional approach of using model score alone, highlighting the benefits of leveraging uncertainty.

Submitted 20 November, 2023; originally announced November 2023.

arXiv:2209.12238 [pdf, other]

High-Resolution Satellite Imagery for Modeling the Impact of Aridification on Crop Production

Authors: Depanshu Sani, Sandeep Mahato, Parichya Sirohi, Saket Anand, Gaurav Arora, Charu Chandra Devshali, Thiagarajan Jayaraman, Harsh Kumar Agarwal

Abstract: The availability of well-curated datasets has driven the success of Machine Learning (ML) models. Despite the increased access to earth observation data for agriculture, there is a scarcity of curated, labelled datasets, which limits the potential of its use in training ML models for remote sensing (RS) in agriculture. To this end, we introduce a first-of-its-kind dataset, SICKLE, having time-series images at different spatial resolutions from 3 different satellites, annotated with multiple key cropping parameters for paddy cultivation for the Cauvery Delta region in Tamil Nadu, India. The dataset comprises of 2,398 season-wise samples from 388 unique plots distributed across 4 districts of the Delta. The dataset covers multi-spectral, thermal and microwave data between the time period January 2018-March 2021. The paddy samples are annotated with 4 key cropping parameters, i.e. sowing date, transplanting date, harvesting date and crop yield. This is one of the first studies to consider the growing season (using sowing and harvesting dates) as part of a dataset. We also propose a yield prediction strategy that uses time-series data generated based on the observed growing season and the standard seasonal information obtained from Tamil Nadu Agricultural University for the region. The consequent performance improvement highlights the impact of ML techniques that leverage domain knowledge that are consistent with standard practices followed by farmers in a specific region.
We benchmark the dataset on 3 separate tasks, namely crop type, phenology date (sowing, transplanting, harvesting) and yield prediction, and develop an end-to-end framework for predicting key crop parameters in a real-world setting.

Submitted 25 September, 2022; originally announced September 2022.

Comments: Submitted as an End of Google AI4SG Workshop report

arXiv:2205.03104 [pdf, other]

Crop Type Identification for Smallholding Farms: Analyzing Spatial, Temporal and Spectral Resolutions in Satellite Imagery

Authors: Depanshu Sani, Sandeep Mahato, Parichya Sirohi, Saket Anand, Gaurav Arora, Charu Chandra Devshali, T. Jayaraman

Abstract: The integration of the modern Machine Learning (ML) models into remote sensing and agriculture has expanded the scope of the application of satellite images in the agriculture domain. In this paper, we present how the accuracy of crop type identification improves as we move from medium-spatiotemporal-resolution (MSTR) to high-spatiotemporal-resolution (HSTR) satellite images. We further demonstrate that high spectral resolution in satellite imagery can improve prediction performance for low spatial and temporal resolutions (LSTR) images. The F1-score is increased by 7% when using multispectral data of MSTR images as compared to the best results obtained from HSTR images. Similarly, when crop season based time series of multispectral data is used we observe an increase of 1.2% in the F1-score. The outcome motivates further advancements in the field of synthetic band generation.

Submitted 6 May, 2022; originally announced May 2022.

Comments: Supported by Google under AI4SG Workshop

arXiv:2107.13217 [pdf, other]

DeepTeeth: A Teeth-photo Based Human Authentication System for Mobile and Hand-held Devices

Authors: Geetika Arora, Rohit K Bharadwaj, Kamlesh Tiwari

Abstract: This paper proposes teeth-photo, a new biometric modality for human authentication on mobile and hand held devices. Biometrics samples are acquired using the camera mounted on mobile device with the help of a mobile application having specific markers to register the teeth area. Region of interest (RoI) is then extracted using the markers and the obtained sample is enhanced using contrast limited adaptive histogram equalization (CLAHE) for better visual clarity. We propose a deep learning architecture and novel regularization scheme to obtain highly discriminative embedding for small size RoI. Proposed custom loss function was able to achieve perfect classification for the tiny RoI of $75\times 75$ size. The model is end-to-end and few-shot and therefore is very efficient in terms of time and energy requirements. The system can be used in many ways including device unlocking and secure authentication. To the best of our understanding, this is the first work on teeth-photo based authentication for mobile device. Experiments have been conducted on an in-house teeth-photo database collected using our application. The database is made publicly available. Results have shown that the proposed system has perfect accuracy.

Submitted 28 July, 2021; originally announced July 2021.

arXiv:2105.09595 [pdf, other]

Training Software Engineers for Qualitative Evaluation of Software Architecture

Authors: Ritu Kapur, Sumit Kalra, Kamlesh Tiwari, Geetika Arora

Abstract: A software architect uses quality requirements to design the architecture of a system. However, it is essential to ensure that the system's final architectural design achieves the standard quality requirements. The existing architectural evaluation frameworks require basic skills and experience for practical usage, which novice software architects lack. We propose a framework that enables novice software architects to infer the system's quality requirements and tactics using the software architectural block-line diagram. The framework takes an image as input, extracts various components and connections, and maps them to viable architectural patterns, followed by identifying the system's corresponding quality attributes (QAs) and tactics. The framework includes a specifically trained machine learning model based on image processing and semantic similarity methods to assist software architects in evaluating a given design by a) evaluating an input architectural design based on the architectural patterns present in it, b) lists out the strengths and weaknesses of the design in terms of QAs, c) recommends the necessary architectural tactics that can be embedded in the design to achieve the lacking QAs. To train our framework, we developed a dataset of 2,035 architectural images from fourteen architectural patterns such as Client-Server, Microservices, and Model View Controller, available at https://www.doi.org/10.6084/m9.figshare.14156408. The framework achieves a Correct Recognition Rate of 98.71% in identifying the architectural patterns.
We evaluated the proposed framework's effectiveness and usefulness by using controlled and experimental groups, in which the experimental group performed approximately 150% better than the controlled group. The experiments were performed as a part of the Masters of Computer Science course in an Engineering Institution.

Submitted 20 May, 2021; originally announced May 2021.

Comments: 16 paged Springer Lecture Notes format as per ECSA 2021 (under single-blind review)

arXiv:2011.07735 [pdf, other]

iPerceive: Applying Common-Sense Reasoning to Multi-Modal Dense Video Captioning and Video Question Answering

Authors: Aman Chadha, Gurneet Arora, Navpreet Kaloty

Abstract: Most prior art in visual understanding relies solely on analyzing the "what" (e.g., event recognition) and "where" (e.g., event localization), which in some cases, fails to describe correct contextual relationships between events or leads to incorrect underlying visual attention. Part of what defines us as human and fundamentally different from machines is our instinct to seek causality behind any association, say an event Y that happened as a direct result of event X. To this end, we propose iPerceive, a framework capable of understanding the "why" between events in a video by building a common-sense knowledge base using contextual cues to infer causal relationships between objects in the video. We demonstrate the effectiveness of our technique using the dense video captioning (DVC) and video question answering (VideoQA) tasks. Furthermore, while most prior work in DVC and VideoQA relies solely on visual information, other modalities such as audio and speech are vital for a human observer's perception of an environment. We formulate DVC and VideoQA tasks as machine translation problems that utilize multiple modalities. By evaluating the performance of iPerceive DVC and iPerceive VideoQA on the ActivityNet Captions and TVQA datasets respectively, we show that our approach furthers the state-of-the-art. Code and samples are available at: iperceive.amanchadha.com.

Submitted 16 November, 2020; originally announced November 2020.

Comments: 13 pages, 6 figures, 4 tables, Project Page: https://iperceive.amanchadha.com

Journal ref: IEEE Winter Conference on Applications of Computer Vision (WACV) 2021

arXiv:2010.02094 [pdf, other]

Gauravarora@HASOC-Dravidian-CodeMix-FIRE2020: Pre-training ULMFiT on Synthetically Generated Code-Mixed Data for Hate Speech Detection

Authors: Gaurav Arora

Abstract: This paper describes the system submitted to Dravidian-Codemix-HASOC2020: Hate Speech and Offensive Content Identification in Dravidian languages (Tamil-English and Malayalam-English). The task aims to identify offensive language in code-mixed dataset of comments/posts in Dravidian languages collected from social media. We participated in both Sub-task A, which aims to identify offensive content in mixed-script (mixture of Native and Roman script) and Sub-task B, which aims to identify offensive content in Roman script, for Dravidian languages. In order to address these tasks, we proposed pre-training ULMFiT on synthetically generated code-mixed data, generated by modelling code-mixed data generation as a Markov process using Markov chains. Our model achieved 0.88 weighted F1-score for code-mixed Tamil-English language in Sub-task B and got 2nd rank on the leader-board. Additionally, our model achieved 0.91 weighted F1-score (4th Rank) for mixed-script Malayalam-English in Sub-task A and 0.74 weighted F1-score (5th Rank) for code-mixed Malayalam-English language in Sub-task B.

Submitted 19 October, 2020; v1 submitted 5 October, 2020; originally announced October 2020.

Comments: System description paper for 2nd ranked system in Sub-task B accepted at Dravidian-Codemix-HASOC2020@FIRE2020

arXiv:2009.13833 [pdf, other]

doi: 10.18653/v1/2020.insights-1.16

HINT3: Raising the bar for Intent Detection in the Wild

Authors: Gaurav Arora, Chirag Jain, Manas Chaturvedi, Krupal Modi

Abstract: Intent Detection systems in the real world are exposed to complexities of imbalanced datasets containing varying perception of intent, unintended correlations and domain-specific aberrations. To facilitate benchmarking which can reflect near real-world scenarios, we introduce 3 new datasets created from live chatbots in diverse domains. Unlike most existing datasets that are crowdsourced, our datasets contain real user queries received by the chatbots and facilitates penalising unwanted correlations grasped during the training process. We evaluate 4 NLU platforms and a BERT based classifier and find that performance saturates at inadequate levels on test sets because all systems latch on to unintended patterns in training data.

Submitted 10 October, 2020; v1 submitted 29 September, 2020; originally announced September 2020.

Comments: Accepted at EMNLP-2020's Insights workshop

Journal ref: Proceedings of the First Workshop on Insights from Negative Results in NLP @ EMNLP 2020

arXiv:2009.12534 [pdf, ps, other]

doi: 10.18653/v1/2020.nlposs-1.10

iNLTK: Natural Language Toolkit for Indic Languages

Authors: Gaurav Arora

Abstract: We present iNLTK, an open-source NLP library consisting of pre-trained language models and out-of-the-box support for Data Augmentation, Textual Similarity, Sentence Embeddings, Word Embeddings, Tokenization and Text Generation in 13 Indic Languages. By using pre-trained models from iNLTK for text classification on publicly available datasets, we significantly outperform previously reported results. On these datasets, we also show that by using pre-trained models and data augmentation from iNLTK, we can achieve more than 95% of the previous best performance by using less than 10% of the training data. iNLTK is already being widely used by the community and has 40,000+ downloads, 600+ stars and 100+ forks on GitHub. The library is available at https://github.com/goru001/inltk.

Submitted 10 October, 2020; v1 submitted 26 September, 2020; originally announced September 2020.

Comments: Accepted at EMNLP2020's NLP-OSS workshop

arXiv:1908.04842 [pdf, other]

SP-NET: One Shot Fingerprint Singular-Point Detector

Authors: Geetika Arora, Ranjeet Ranjan Jha, Akash Agrawal, Kamlesh Tiwari, Aditya Nigam

Abstract: Singular points of a fingerprint image are special locations having high curvature properties. They can play a pivotal role in fingerprint normalization and reliable feature extraction. Accurate and efficient extraction of a singular point plays a major role in successful fingerprint recognition and indexing. In this paper, a novel deep learning based architecture is proposed for one shot (end-to-end) singular point detection from an input fingerprint image. The model consists of a Macro-Localization Network and a Micro-Regression Network along with three stacked hourglass as a bottleneck. The proposed model has been tested on three databases viz. FVC2002 DB1_A, FVC2002 DB2_A and FPL30K and has been found to achieve true detection rate of 98.75%, 97.5% and 92.72% respectively, which is better than any other state-of-the-art technique.

Submitted 13 August, 2019; originally announced August 2019.

Comments: 10 pages, 6 figures

arXiv:1807.03570 [pdf, other]

Small-Variance Asymptotics for Nonparametric Bayesian Overlapping Stochastic Blockmodels

Authors: Gundeep Arora, Anupreet Porwal, Kanupriya Agarwal, Avani Samdariya, Piyush Rai

Abstract: The latent feature relational model (LFRM) is a generative model for graph-structured data to learn a binary vector representation for each node in the graph. The binary vector denotes the node's membership in one or more communities. At its core, the LFRM (Miller et al., 2009) is an overlapping stochastic blockmodel, which defines the link probability between any pair of nodes as a bilinear function of their community membership vectors. Moreover, using a nonparametric Bayesian prior (Indian Buffet Process) enables learning the number of communities automatically from the data. However, despite its appealing properties, inference in LFRM remains a challenge and is typically done via MCMC methods. This can be slow and may take a long time to converge. In this work, we develop a small-variance asymptotics based framework for the non-parametric Bayesian LFRM. This leads to an objective function that retains the nonparametric Bayesian flavor of LFRM, while enabling us to design deterministic inference algorithms for this model, that are easy to implement (using generic or specialized optimization routines) and are fast in practice. Our results on several benchmark datasets demonstrate that our algorithm is competitive to methods such as MCMC, while being much faster.

Submitted 10 July, 2018; originally announced July 2018.

Comments: Accepted For IJCAI'18

arXiv:1802.04675 [pdf, other]

Attention based Sentence Extraction from Scientific Articles using Pseudo-Labeled data

Authors: Parth Mehta, Gaurav Arora, Prasenjit Majumder

Abstract: In this work, we present a weakly supervised sentence extraction technique for identifying important sentences in scientific papers that are worthy of inclusion in the abstract. We propose a new attention based deep learning architecture that jointly learns to identify important content, as well as the cue phrases that are indicative of summary worthy sentences. We propose a new context embedding technique for determining the focus of a given paper using topic models and use it jointly with an LSTM based sequence encoder to learn attention weights across the sentence words. We use a collection of articles publicly available through ACL anthology for our experiments. Our system achieves a performance that is better, in terms of several ROUGE metrics, as compared to several state of art extractive techniques. It also generates more coherent summaries and preserves the overall structure of the document.

Submitted 13 February, 2018; originally announced February 2018.

arXiv:1712.03878 [pdf, other]

Generalized Zero-Shot Learning via Synthesized Examples

Authors: Vinay Kumar Verma, Gundeep Arora, Ashish Mishra, Piyush Rai

Abstract: We present a generative framework for generalized zero-shot learning where the training and test classes are not necessarily disjoint. Built upon a variational autoencoder based architecture, consisting of a probabilistic encoder and a probabilistic conditional decoder, our model can generate novel exemplars from seen/unseen classes, given their respective class attributes. These exemplars can subsequently be used to train any off-the-shelf classification model. One of the key aspects of our encoder-decoder architecture is a feedback-driven mechanism in which a discriminator (a multivariate regressor) learns to map the generated exemplars to the corresponding class attribute vectors, leading to an improved generator. Our model's ability to generate and leverage examples from unseen classes to train the classification model naturally helps to mitigate the bias towards predicting seen classes in generalized zero-shot learning settings. Through a comprehensive set of experiments, we show that our model outperforms several state-of-the-art methods, on several benchmark datasets, for both standard as well as generalized zero-shot learning.

Submitted 11 June, 2018; v1 submitted 11 December, 2017; originally announced December 2017.

Comments: Accepted in CVPR'18

arXiv:0912.4323 [pdf]

Modified Minimum Connected Dominating Set formation for Wireless Adhoc Networks

Authors: Mano Yadav, Vinay Rishiwal, G. Arora, S. Makka

Abstract: Nodes of minimum connected dominating set (MCDS) form a virtual backbone in a wireless adhoc network. In this paper, a modified approach is presented to determine MCDS of an underlying graph of a Wireless Adhoc network. Simulation results for a variety of graphs indicate that the approach is efficient in determining the MCDS as compared to other existing techniques.

Submitted 22 December, 2009; originally announced December 2009.

Journal ref: Journal of Computing, Volume 1, Issue 1, pp 200-203, December 2009

Showing 1–15 of 15 results for author: Arora, G