-
Efficacy of simple continuum models for diverse granular intrusions
Authors:
Shashank Agarwal,
Andras Karsai,
Daniel I Goldman,
Ken Kamrin
Abstract:
Granular intrusion is commonly observed in natural and human-made settings. Unlike typical solids and fluids, granular media can simultaneously display fluid-like and solid-like characteristics in a variety of intrusion scenarios. This multi-phase behavior increases the difficulty of accurately modeling these and other yielding (or flowable) materials. Micro-scale modeling methods, such as DEM (Discrete Element Method), capture this behavior by modeling the media at the grain scale, but there is often interest in the macro-scale characterizations of such systems. We examine the efficacy of a macro-scale continuum approach in modeling and understanding the physics of various macroscopic phenomena in a variety of granular intrusion cases using two basic frictional yielding constitutive models. We compare predicted granular force response and material flow to experimental data in four quasi-2D intrusion cases: (1) depth-dependent force response in horizontal submerged-intruder motion; (2) separation-dependent drag variation in parallel-plate vertical intrusion; (3) initial-density-dependent drag fluctuations in free-surface plowing; and (4) flow zone development during vertical plate intrusions in under-compacted granular media. Our continuum modeling approach captures the flow process and drag forces while providing key meso- and macro-scopic insights. Our study highlights how continuum modeling approaches provide an alternative for efficient modeling as well as a conceptual understanding of various granular intrusion phenomena.
Submitted 15 July, 2021; v1 submitted 25 January, 2021;
originally announced January 2021.
-
S-BEV: Semantic Birds-Eye View Representation for Weather and Lighting Invariant 3-DoF Localization
Authors:
Mokshith Voodarla,
Shubham Shrivastava,
Sagar Manglani,
Ankit Vora,
Siddharth Agarwal,
Punarjay Chakravarty
Abstract:
We describe a light-weight, weather and lighting invariant, Semantic Bird's Eye View (S-BEV) signature for vision-based vehicle re-localization. A topological map of S-BEV signatures is created during the first traversal of the route, which is used for coarse localization in subsequent route traversals. A fine-grained localizer is then trained to output the global 3-DoF pose of the vehicle using its S-BEV and its coarse localization. We conduct experiments on the vKITTI2 virtual dataset and show the potential of the S-BEV to be robust to weather and lighting. We also demonstrate results with 2 vehicles on a 22 km long highway route in the Ford AV dataset.
Submitted 23 January, 2021;
originally announced January 2021.
-
Few Shot Dialogue State Tracking using Meta-learning
Authors:
Saket Dingliwal,
Bill Gao,
Sanchit Agarwal,
Chien-Wei Lin,
Tagyoung Chung,
Dilek Hakkani-Tur
Abstract:
Dialogue State Tracking (DST) forms a core component of automated chatbot based systems designed for specific goals like hotel, taxi reservation, tourist information, etc. With the increasing need to deploy such systems in new domains, solving the problem of zero/few-shot DST has become necessary. There has been a rising trend for learning to transfer knowledge from resource-rich domains to unknown domains with minimal need for additional data. In this work, we explore the merits of meta-learning algorithms for this transfer and hence, propose a meta-learner D-REPTILE specific to the DST problem. With extensive experimentation, we provide clear evidence of benefits over conventional approaches across different domains, methods, base models, and datasets with significant (5-25%) improvement over the baseline in a low-data setting. Our proposed meta-learner is agnostic of the underlying model and hence any existing state-of-the-art DST system can improve its performance on unknown domains using our training strategy.
Submitted 5 April, 2021; v1 submitted 17 January, 2021;
originally announced January 2021.
-
An unrecognized force in inertial microfluidics
Authors:
Siddhansh Agarwal,
Fan Kiat Chan,
Mattia Gazzola,
Sascha Hilgenfeldt
Abstract:
Describing effects of small but finite inertia on suspended particles is a fundamental fluid dynamical problem that has never been solved in full generality. Modern microfluidics has turned this academic problem into a practical challenge through the use of high-frequency oscillatory flows, perhaps the most efficient way to take advantage of inertial effects at low Reynolds numbers, to precisely manipulate particles, cells and vesicles without the need for charges or chemistry. The theoretical understanding of flow forces on particles has so far hinged on the pioneering work of Maxey and Riley (MR in the following), almost 40 years ago. We demonstrate here theoretically and computationally that oscillatory flows exert previously unexplained, significant and persistent forces, that these emerge from a combination of particle inertia and spatial flow variation, and that they can be quantitatively predicted through a generalization of MR.
Submitted 9 January, 2021;
originally announced January 2021.
-
AILearn: An Adaptive Incremental Learning Model for Spoof Fingerprint Detection
Authors:
Shivang Agarwal,
Ajita Rattani,
C. Ravindranath Chowdary
Abstract:
Incremental learning enables the learner to accommodate new knowledge without retraining the existing model. It is a challenging task which requires learning from new data as well as preserving the knowledge extracted from the previously accessed data. This challenge is known as the stability-plasticity dilemma. We propose AILearn, a generic model for incremental learning which overcomes the stability-plasticity dilemma by carefully integrating the ensemble of base classifiers trained on new data with the current ensemble without retraining the model from scratch using the entire data. We demonstrate the efficacy of the proposed AILearn model on a spoof fingerprint detection application. One of the significant challenges associated with spoof fingerprint detection is the performance drop on spoofs generated using new fabrication materials. AILearn is an adaptive incremental learning model which adapts to the features of the "live" and "spoof" fingerprint images and efficiently recognizes the new spoof fingerprints as well as the known spoof fingerprints when the new data is available. To the best of our knowledge, AILearn is the first attempt in incremental learning algorithms that adapts to the properties of data for generating a diverse ensemble of base classifiers. From the experiments conducted on standard high-dimensional datasets LivDet 2011, LivDet 2013 and LivDet 2015, we show that the performance gain on new fake materials is significantly high. On average, we achieve a 49.57% improvement in accuracy between the consecutive learning phases.
Submitted 29 December, 2020;
originally announced December 2020.
-
Experimental study of decoherence of the two-mode squeezed vacuum state via second harmonic generation
Authors:
Fu Li,
Tian Li,
Girish S. Agarwal
Abstract:
Decoherence remains one of the most serious challenges to the implementation of quantum technology. It appears as a result of the transformation over time of a quantum superposition state into a classical mixture due to the quantum system interacting with the environment. Since quantum systems are never completely isolated from their environment, decoherence therefore cannot be avoided in realistic situations. Decoherence has been extensively studied, mostly theoretically, because it has many important implications in quantum technology, such as in the fields of quantum information processing, quantum communication and quantum computation. Here we report a novel experimental scheme on the study of decoherence of a two-mode squeezed vacuum state via its second harmonic generation signal. Our scheme can directly extract the decoherence of the phase-sensitive quantum correlation $\langle \hat{a}\hat{b}\rangle$ between two entangled modes $a$ and $b$. Such a correlation is the most important characteristic of a two-mode squeezed state. More importantly, this is an experimental study on the decoherence effect of a squeezed vacuum state, which has been rarely investigated.
Submitted 27 July, 2021; v1 submitted 22 December, 2020;
originally announced December 2020.
-
CHS-Net: A Deep learning approach for hierarchical segmentation of COVID-19 infected CT images
Authors:
Narinder Singh Punn,
Sonali Agarwal
Abstract:
The pandemic of novel SARS-CoV-2, also known as COVID-19, has been spreading worldwide, causing rampant loss of lives. Medical imaging such as CT, X-ray, etc., plays a significant role in diagnosing the patients by presenting the visual representation of the functioning of the organs. However, for any radiologist, analyzing such scans is a tedious and time-consuming task. The emerging deep learning technologies have displayed their strength in analyzing such scans to aid in the faster diagnosis of the diseases and viruses such as COVID-19. In the present article, an automated deep learning based model, COVID-19 hierarchical segmentation network (CHS-Net), is proposed that functions as a semantic hierarchical segmenter to identify the COVID-19 infected regions from the lungs contour via CT medical imaging using two cascaded residual attention inception U-Net (RAIU-Net) models. RAIU-Net comprises a residual inception U-Net model with a spectral spatial and depth attention network (SSD) that is developed with the contraction and expansion phases of depthwise separable convolutions and hybrid pooling (max and spectral pooling) to efficiently encode and decode the semantic and varying resolution information. The CHS-Net is trained with the segmentation loss function that is defined as the average of binary cross entropy loss and dice loss to penalize false negative and false positive predictions. The approach is compared with the recently proposed approaches and evaluated using the standard metrics like accuracy, precision, specificity, recall, dice coefficient and Jaccard similarity along with the visualized interpretation of the model prediction with GradCam++ and uncertainty maps. With extensive trials, it is observed that the proposed approach outperformed the recently proposed approaches and effectively segments the COVID-19 infected regions in the lungs.
Submitted 29 December, 2021; v1 submitted 13 December, 2020;
originally announced December 2020.
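The segmentation loss named in the CHS-Net abstract above (the average of binary cross entropy loss and dice loss) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the clamping constant `eps`, and the `smooth` term in the Dice loss are introduced here for the example.

```python
import math


def bce_loss(y_true, y_pred, eps=1e-7):
    """Mean binary cross entropy over flattened pixel lists."""
    total = 0.0
    for t, p in zip(y_true, y_pred):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0); eps is an assumption
        total += -(t * math.log(p) + (1 - t) * math.log(1 - p))
    return total / len(y_true)


def dice_loss(y_true, y_pred, smooth=1.0):
    """One minus the soft Dice coefficient; `smooth` is an assumed stabilizer."""
    intersection = sum(t * p for t, p in zip(y_true, y_pred))
    denominator = sum(y_true) + sum(y_pred)
    return 1.0 - (2.0 * intersection + smooth) / (denominator + smooth)


def segmentation_loss(y_true, y_pred):
    """Average of the two losses, as the abstract describes."""
    return 0.5 * (bce_loss(y_true, y_pred) + dice_loss(y_true, y_pred))


if __name__ == "__main__":
    mask = [1, 0, 1, 0]              # toy ground-truth mask
    good = [0.9, 0.1, 0.8, 0.2]      # mostly correct prediction
    bad = [0.1, 0.9, 0.2, 0.8]       # mostly wrong prediction
    print(segmentation_loss(mask, good), segmentation_loss(mask, bad))
```

The cross entropy term scores each pixel independently, while the Dice term scores region overlap, which is one common reason to combine the two for masks where the infected region is small.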
-
Application of Computer Vision Techniques for Segregation of Plastic Waste based on Resin Identification Code
Authors:
Shivaank Agarwal,
Ravindra Gudi,
Paresh Saxena
Abstract:
This paper presents methods to identify plastic waste based on its resin identification code to enable efficient recycling of post-consumer plastic waste. We propose the design, training and testing of different machine learning techniques to (i) identify a plastic waste that belongs to the known categories of plastic waste when the system is trained and (ii) identify a new plastic waste that does not belong to any known category of plastic waste while the system is trained. For the first case, we propose the use of one-shot learning techniques using Siamese and Triplet loss networks. Our proposed approach does not require any augmentation to increase the size of the database and achieved a high accuracy of 99.74%. For the second case, we propose the use of supervised and unsupervised dimensionality reduction techniques and achieved an accuracy of 95% to correctly identify a new plastic waste.
Submitted 16 November, 2020;
originally announced November 2020.
-
Probing the spectrum of the Jaynes-Cummings-Rabi model by its isomorphism to an atom inside a parametric amplifier cavity
Authors:
R. Gutiérrez-Jáuregui,
G. S. Agarwal
Abstract:
We show how the Jaynes-Cummings-Rabi model of cavity quantum electrodynamics can be realized via an isomorphism to the Hamiltonian of a qubit inside a parametric amplifier cavity. This realization clears the way to observe the full spectrum of the Rabi model via a probe applied to a parametric amplifier cavity containing a qubit and a parametric oscillator operating below threshold. An important outcome of the isomorphism is that the actual frequencies are replaced by detunings which make it feasible to reach the ultra-strong coupling regime. We find that inside this regime the probed spectrum displays a narrow resonance peak that is traced back to the transition between ground and first excited states. The exact form of these states is given at an energy crossing and then extended numerically. At the crossing, the eigenstates are entangled states of field and atom where the field is found inside squeezed cat states.
Submitted 31 January, 2021; v1 submitted 8 November, 2020;
originally announced November 2020.
-
Room temperature tunable coupling of single photon emitting quantum dots to localized and delocalized modes in plasmonic nanocavity array
Authors:
Ravindra Kumar Yadav,
Wenxiao Liu,
Ran Li,
Teri W. Odom,
Girish S. Agarwal,
Jaydeep K Basu
Abstract:
Single photon sources (SPS), especially those based on solid state quantum emitters, are key elements in future quantum technologies. What is required is the development of broadband, high quantum efficiency, room temperature SPS which can also be tunably coupled to optical cavities which could lead to development of all-optical quantum communication platforms. In this regard deterministic coupling of SPS to plasmonic nanocavity arrays has great advantage due to long propagation length and delocalized nature of surface lattice resonances (SLRs). Guided by these considerations, we report experiments on the room temperature tunable coupling of single photon emitting colloidal quantum dots (CQDs) to localised and delocalised modes in plasmonic nanocavity arrays. Using time-resolved photo-luminescence measurement on isolated CQD, we report significant advantage of SLRs in realizing much higher Purcell effect, despite large dephasing of CQDs, with values of ~22 and ~6 for coupling to the lattice and localised modes, respectively. We present measurements on the antibunching of CQDs coupled to these modes with $g^{(2)}(0)$ values in quantum domain providing evidence for an effective cooperative behavior. We present a density matrix treatment of the coupling of CQDs to plasmonic and lattice modes enabling us to model the experimental results on Purcell factors as well as on the antibunching. We also provide experimental evidence of indirect excitation of remote CQDs mediated by the lattice modes and propose a model to explain these observations. Our study demonstrates the possibility of developing nanophotonic platforms for single photon operations and communications with broadband quantum emitters and plasmonic nanocavity arrays since these arrays can generate entanglement between two spatially separated quantum emitters.
Submitted 31 October, 2020;
originally announced November 2020.
-
Accordion: Adaptive Gradient Communication via Critical Learning Regime Identification
Authors:
Saurabh Agarwal,
Hongyi Wang,
Kangwook Lee,
Shivaram Venkataraman,
Dimitris Papailiopoulos
Abstract:
Distributed model training suffers from communication bottlenecks due to frequent model updates transmitted across compute nodes. To alleviate these bottlenecks, practitioners use gradient compression techniques like sparsification, quantization, or low-rank updates. The techniques usually require choosing a static compression ratio, often requiring users to balance the trade-off between model accuracy and per-iteration speedup. In this work, we show that such performance degradation due to choosing a high compression ratio is not fundamental. An adaptive compression strategy can reduce communication while maintaining final test accuracy. Inspired by recent findings on critical learning regimes, in which small gradient errors can have irrecoverable impact on model performance, we propose Accordion, a simple yet effective adaptive compression algorithm. While Accordion maintains a high enough compression rate on average, it avoids over-compressing gradients whenever in critical learning regimes, detected by a simple gradient-norm based criterion. Our extensive experimental study over a number of machine learning tasks in distributed environments indicates that Accordion maintains similar model accuracy to uncompressed training, yet achieves up to 5.5x better compression and up to 4.1x end-to-end speedup over static approaches. We show that Accordion also works for adjusting the batch size, another popular strategy for alleviating communication bottlenecks.
Submitted 29 October, 2020;
originally announced October 2020.
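The Accordion abstract above describes switching compression levels using "a simple gradient-norm based criterion" but does not spell the criterion out. The sketch below shows one plausible form of such a rule, assuming a critical regime is flagged when the gradient norm changes by a large relative amount between two checkpoints. The function name, the threshold `eta`, and the two compression levels are hypothetical values chosen for illustration and are not taken from the paper.

```python
def choose_compression(prev_norm, curr_norm, eta=0.5, low_ratio=0.1, high_ratio=0.99):
    """Pick the fraction of gradient entries to drop for the next interval.

    All parameters are illustrative assumptions:
      eta        -- relative norm change treated as a critical regime
      low_ratio  -- light compression used inside critical regimes
      high_ratio -- aggressive compression used elsewhere
    """
    if prev_norm == 0.0:
        # No usable reference norm yet, so stay conservative.
        return low_ratio
    relative_change = abs(prev_norm - curr_norm) / prev_norm
    # A rapidly changing gradient norm is treated as a critical regime.
    return low_ratio if relative_change >= eta else high_ratio


if __name__ == "__main__":
    print(choose_compression(10.0, 4.0))   # large change: light compression
    print(choose_compression(10.0, 9.5))   # small change: aggressive compression
```

Keeping the rule to a single scalar comparison makes the check cheap compared with the training step, which fits the abstract's description of the criterion as simple.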
-
Enhanced sensing of weak anharmonicities through coherences in dissipatively coupled anti-PT symmetric systems
Authors:
Jayakrishnan M. P. Nair,
Debsuvra Mukhopadhyay,
G. S. Agarwal
Abstract:
In the last few years, the great utility of PT-symmetric systems in sensing small perturbations has been recognized. Here, we propose an alternate method relevant to dissipative systems, especially those coupled to the vacuum of the electromagnetic fields. In such systems, which typically show anti-PT symmetry and do not require the incorporation of gain, vacuum induces coherence between two modes. Owing to this coherence, the linear response acquires a pole on the real axis. We demonstrate how this coherence can be exploited for the enhanced sensing of very weak anharmonicities at low pumping rates. Higher drive powers ($\sim 0.1$ W), on the other hand, generate new domains of coherences. Our results are applicable to a wide class of systems, and we specifically illustrate the remarkable sensing capabilities in the context of a weakly anharmonic Yttrium Iron Garnet (YIG) sphere interacting with a cavity via a tapered fiber waveguide. A small change in the anharmonicity leads to a substantial change in the induced spin current.
Submitted 24 October, 2020;
originally announced October 2020.
-
Learning to Deceive Knowledge Graph Augmented Models via Targeted Perturbation
Authors:
Mrigank Raman,
Aaron Chan,
Siddhant Agarwal,
Peifeng Wang,
Hansen Wang,
Sungchul Kim,
Ryan Rossi,
Handong Zhao,
Nedim Lipka,
Xiang Ren
Abstract:
Knowledge graphs (KGs) have helped neural models improve performance on various knowledge-intensive tasks, like question answering and item recommendation. By using attention over the KG, such KG-augmented models can also "explain" which KG information was most relevant for making a given prediction. In this paper, we question whether these models are really behaving as we expect. We show that, through a reinforcement learning policy (or even simple heuristics), one can produce deceptively perturbed KGs, which maintain the downstream performance of the original KG while significantly deviating from the original KG's semantics and structure. Our findings raise doubts about KG-augmented models' ability to reason about KG information and give sensible explanations.
Submitted 3 May, 2021; v1 submitted 24 October, 2020;
originally announced October 2020.
-
Prediction of Rainfall in Rajasthan, India using Deep and Wide Neural Network
Authors:
Vikas Bajpai,
Anukriti Bansal,
Kshitiz Verma,
Sanjay Agarwal
Abstract:
Rainfall is a natural process which is of utmost importance in various areas including the water cycle, ground water recharging, disaster management and the economic cycle. Accurate prediction of rainfall intensity is a challenging task and its exact prediction helps in every aspect. In this paper, we propose a deep and wide rainfall prediction model (DWRPM) and evaluate its effectiveness to predict rainfall in the Indian state of Rajasthan using historical time-series data. For the wide network, instead of using rainfall intensity values directly, we use features obtained after applying a convolutional layer. For the deep part, a multi-layer perceptron (MLP) is used. Information on geographical parameters (latitude and longitude) is included in a unique way. It gives the model a generalization ability, which helps a single model to make rainfall predictions in different geographical conditions. We compare our results with various deep-learning approaches like MLP, LSTM and CNN, which are observed to work well in sequence-based predictions. Experimental analysis and comparison show the applicability of our proposed method for rainfall prediction in Rajasthan.
Submitted 22 October, 2020;
originally announced October 2020.
-
Extracting Procedural Knowledge from Technical Documents
Authors:
Shivali Agarwal,
Shubham Atreja,
Vikas Agarwal
Abstract:
Procedures are an important knowledge component of documents that can be leveraged by cognitive assistants for automation, question-answering or driving a conversation. It is a challenging problem to parse big dense documents like product manuals, user guides to automatically understand which parts are talking about procedures and subsequently extract them. Most of the existing research has focused on extracting flows in given procedures or understanding the procedures in order to answer conceptual questions. Identifying and extracting multiple procedures automatically from documents of diverse formats remains a relatively less addressed problem. In this work, we cover some of this ground by -- 1) Providing insights on how structural and linguistic properties of documents can be grouped to define types of procedures, 2) Analyzing documents to extract the relevant linguistic and structural properties, and 3) Formulating procedure identification as a classification problem that leverages the features of the document derived from the above analysis. We first implemented and deployed unsupervised techniques which were used in different use cases. Based on the evaluation in different use cases, we figured out the weaknesses of the unsupervised approach. We then designed an improved version which was supervised. We demonstrate that our technique is effective in identifying procedures from big and complex documents alike by achieving accuracy of 89%.
Submitted 20 October, 2020;
originally announced October 2020.
-
Poisoned classifiers are not only backdoored, they are fundamentally broken
Authors:
Mingjie Sun,
Siddhant Agarwal,
J. Zico Kolter
Abstract:
Under a commonly-studied backdoor poisoning attack against classification models, an attacker adds a small trigger to a subset of the training data, such that the presence of this trigger at test time causes the classifier to always predict some target class. It is often implicitly assumed that the poisoned classifier is vulnerable exclusively to the adversary who possesses the trigger. In this paper, we show empirically that this view of backdoored classifiers is incorrect. We describe a new threat model for poisoned classifiers, where someone without knowledge of the original trigger wants to control the poisoned classifier. Under this threat model, we propose a test-time, human-in-the-loop attack method to generate multiple effective alternative triggers without access to the initial backdoor and the training data. We construct these alternative triggers by first generating adversarial examples for a smoothed version of the classifier, created with a procedure called Denoised Smoothing, and then extracting colors or cropped portions of smoothed adversarial images with human interaction. We demonstrate the effectiveness of our attack through extensive experiments on high-resolution datasets: ImageNet and TrojAI. We also compare our approach to previous work on modeling trigger distributions and find that our method is more scalable and efficient in generating effective triggers. Last, we include a user study which demonstrates that our method allows users to easily determine the existence of such backdoors in existing poisoned classifiers. Thus, we argue that there is no such thing as a secret backdoor in poisoned classifiers: poisoning a classifier invites attacks not just by the party that possesses the trigger, but from anyone with access to the classifier.
Submitted 5 October, 2021; v1 submitted 18 October, 2020;
originally announced October 2020.
-
Pinball-OCSVM for early-stage COVID-19 diagnosis with limited posteroanterior chest X-ray images
Authors:
Sanjay Kumar Sonbhadra,
Sonali Agarwal,
P. Nagabhushan
Abstract:
The infection of respiratory coronavirus disease 2019 (COVID-19) starts with the upper respiratory tract and as the virus grows, the infection can progress to lungs and develop pneumonia. The conventional way of COVID-19 diagnosis is reverse transcription polymerase chain reaction (RT-PCR), which is less sensitive during early stages, especially if the patient is asymptomatic, which may further cause more severe pneumonia. In this context, several deep learning models have been proposed to identify pulmonary infections using publicly available chest X-ray (CXR) image datasets for early diagnosis, better treatment and quick cure. In these datasets, the presence of fewer COVID-19 positive samples compared to other classes (normal, pneumonia and Tuberculosis) raises the challenge for unbiased learning of deep learning models. All deep learning models opted for class balancing techniques to solve this issue, which, however, should be avoided in any medical diagnosis process. Moreover, the deep learning models are also data hungry and need massive computation resources. Therefore, for quicker diagnosis, this research proposes a novel pinball loss function based one-class support vector machine (PB-OCSVM) that can work in the presence of limited COVID-19 positive CXR samples with objectives to maximize the learning efficiency and to minimize the false predictions. The performance of the proposed model is compared with conventional OCSVM and existing deep learning models, and the experimental results prove that the proposed model outperforms state-of-the-art methods. To validate the robustness of the proposed model, experiments are also performed with noisy CXR images and UCI benchmark datasets.
Submitted 5 June, 2021; v1 submitted 15 October, 2020;
originally announced October 2020.
-
Can RNNs trained on harder subject-verb agreement instances still perform well on easier ones?
Authors:
Hritik Bansal,
Gantavya Bhatt,
Sumeet Agarwal
Abstract:
Previous work suggests that RNNs trained on natural language corpora can capture number agreement well for simple sentences but perform less well when sentences contain agreement attractors: intervening nouns between the verb and the main subject with grammatical number opposite to the latter. This suggests these models may not learn the actual syntax of agreement, but rather infer shallower heuristics such as `agree with the recent noun'. In this work, we investigate RNN models with varying inductive biases trained on selectively chosen `hard' agreement instances, i.e., sentences with at least one agreement attractor. For these the verb number cannot be predicted using a simple linear heuristic, and hence they might help provide the model additional cues for hierarchical syntax. If RNNs can learn the underlying agreement rules when trained on such hard instances, then they should generalize well to other sentences, including simpler ones. However, we observe that several RNN types, including the ONLSTM which has a soft structural inductive bias, surprisingly fail to perform well on sentences without attractors when trained solely on sentences with attractors. We analyze how these selectively trained RNNs compare to the baseline (training on a natural distribution of agreement attractors) along the dimensions of number agreement accuracy, representational similarity, and performance across different syntactic constructions. Our findings suggest that RNNs trained on our hard agreement instances still do not capture the underlying syntax of agreement, but rather tend to overfit the training distribution in a way which leads them to perform poorly on `easy' out-of-distribution instances. Thus, while RNNs are powerful models which can pick up non-trivial dependency patterns, inducing them to do so at the level of syntax rather than surface remains a challenge.
Submitted 9 April, 2021; v1 submitted 10 October, 2020;
originally announced October 2020.
-
A Variational Information Bottleneck Based Method to Compress Sequential Networks for Human Action Recognition
Authors:
Ayush Srivastava,
Oshin Dutta,
Prathosh AP,
Sumeet Agarwal,
Jigyasa Gupta
Abstract:
In the last few years, compression of deep neural networks has become an important strand of machine learning and computer vision research. Deep models require sizeable computational complexity and storage, when used for instance for Human Action Recognition (HAR) from videos, making them unsuitable to be deployed on edge devices. In this paper, we address this issue and propose a method to effectively compress Recurrent Neural Networks (RNNs) such as Gated Recurrent Units (GRUs) and Long-Short-Term-Memory Units (LSTMs) that are used for HAR. We use a Variational Information Bottleneck (VIB) theory-based pruning approach to limit the information flow through the sequential cells of RNNs to a small subset. Further, we combine our pruning method with a specific group-lasso regularization technique that significantly improves compression. The proposed techniques reduce model parameters and memory footprint from latent representations, with little or no reduction in the validation accuracy while increasing the inference speed several-fold. We perform experiments on the three widely used Action Recognition datasets, viz. UCF11, HMDB51, and UCF101, to validate our approach. It is shown that our method achieves over 70 times greater compression than the nearest competitor with comparable accuracy for the task of action recognition on UCF11.
Submitted 9 November, 2020; v1 submitted 3 October, 2020;
originally announced October 2020.
-
One-Shot learning based classification for segregation of plastic waste
Authors:
Shivaank Agarwal,
Ravindra Gudi,
Paresh Saxena
Abstract:
The problem of segregating recyclable waste is fairly daunting for many countries. This article presents an approach for image-based classification of plastic waste using one-shot learning techniques. The proposed approach exploits discriminative features generated via the siamese and triplet loss convolutional neural networks to help differentiate between 5 types of plastic waste based on their resin codes. The approach achieves an accuracy of 99.74% on the WaDaBa Database.
Submitted 29 September, 2020;
originally announced September 2020.
-
Face Mask Detection using Transfer Learning of InceptionV3
Authors:
G. Jignesh Chowdary,
Narinder Singh Punn,
Sanjay Kumar Sonbhadra,
Sonali Agarwal
Abstract:
The world is facing a huge health crisis due to the rapid transmission of coronavirus (COVID-19). Several guidelines were issued by the World Health Organization (WHO) for protection against the spread of coronavirus. According to WHO, the most effective preventive measure against COVID-19 is wearing a mask in public places and crowded areas. It is very difficult to monitor people manually in these areas. In this paper, a transfer learning model is proposed to automate the process of identifying people who are not wearing a mask. The proposed model is built by fine-tuning the pre-trained state-of-the-art deep learning model, InceptionV3. The proposed model is trained and tested on the Simulated Masked Face Dataset (SMFD). An image augmentation technique is adopted to address the limited availability of data for better training and testing of the model. The model outperformed other recently proposed approaches by achieving an accuracy of 99.9% during training and 100% during testing.
Submitted 20 October, 2020; v1 submitted 17 September, 2020;
originally announced September 2020.
-
Convex Calibrated Surrogates for the Multi-Label F-Measure
Authors:
Mingyuan Zhang,
Harish G. Ramaswamy,
Shivani Agarwal
Abstract:
The F-measure is a widely used performance measure for multi-label classification, where multiple labels can be active in an instance simultaneously (e.g. in image tagging, multiple tags can be active in any image). In particular, the F-measure explicitly balances recall (fraction of active labels predicted to be active) and precision (fraction of labels predicted to be active that are actually so), both of which are important in evaluating the overall performance of a multi-label classifier. As with most discrete prediction problems, however, directly optimizing the F-measure is computationally hard. In this paper, we explore the question of designing convex surrogate losses that are calibrated for the F-measure -- specifically, that have the property that minimizing the surrogate loss yields (in the limit of sufficient data) a Bayes optimal multi-label classifier for the F-measure. We show that the F-measure for an $s$-label problem, when viewed as a $2^s \times 2^s$ loss matrix, has rank at most $s^2+1$, and apply a result of Ramaswamy et al. (2014) to design a family of convex calibrated surrogates for the F-measure. The resulting surrogate risk minimization algorithms can be viewed as decomposing the multi-label F-measure learning problem into $s^2+1$ binary class probability estimation problems. We also provide a quantitative regret transfer bound for our surrogates, which allows any regret guarantees for the binary problems to be transferred to regret guarantees for the overall F-measure problem, and discuss a connection with the algorithm of Dembczynski et al. (2013). Our experiments confirm our theoretical findings.
Submitted 16 September, 2020;
originally announced September 2020.
-
Dependence of the dynamical properties of light-cone simulation dark matter halos on their environment
Authors:
Maria Chira,
Manolis Plionis,
Shankar Agarwal
Abstract:
Aims: We study the dependence of the dynamical properties of dark matter halos on their environment in a whole-sky $Λ$CDM light-cone simulation extending to $z\sim 0.65$. The properties of interest are halo shape (parametrized by its principal axes), spin and virialisation status, the alignment of halo spin and shape, as well as the shape-shape and spin-spin alignments among halo neighbours. Methods: We define the halo environment using the notion of halo isolation status determined by the distance to its nearest neighbor. This defines a maximum spherical region around each halo devoid of other halos, above the catalog threshold mass. We consider as 'close halo pairs', the pairs that are separated by a distance lower than a specific threshold. In order to decontaminate our results from the known dependence of halo dynamical properties on mass, we use a random sampling procedure in order to compare properties of similar halo abundance distributions. Results: (a) We find a strong dependence of halo properties on their environment, confirming that isolated halos are more aspherical and more prolate with lower spin values. (b) Correlations between halo properties exist and are mostly independent of halo environment. (c) Halo spins are aligned with the minor axis, regardless of halo shape. (d) Close halo neighbors have their major axes statistically aligned, while they show a slight but statistically significant preference for anti-parallel spin directions. The latter result is enhanced for the case of close halo pairs in low-density environments. Furthermore, we find a preference of the spin vectors to be oriented perpendicular to the line connecting such close halo pairs.
Submitted 25 July, 2021; v1 submitted 10 September, 2020;
originally announced September 2020.
-
Antimony Chalcogenide-based Solid State Sensitizers for Solar Cells: A Forgotten Hero or Low Potential Candidate
Authors:
Sumanshu Agarwal,
Harekrishna Yadav,
Kundan Kumar
Abstract:
The use of stibnite (Sb2S3) as sensitizers in the solid-state sensitized solar cells received considerable research interest during the transition of the millennium. However, the use of perovskite diminished the research in the field and the potential of antimony chalcogenide (Sb2(S,Se)3) was not explored thoroughly. Although these materials also provide bandgap tuning like perovskite by varying the composition of S and Se, it is not as popular as perovskite mainly because of the low efficiency of the solar cells based on it. In this paper, we present a landscape of the functional role of various device parameters on the performance of Sb2(S,Se)3 based solar cells. For the purpose, we first calibrate the optoelectronic model used for the simulation with the experimental results from the literature. The model is then subjected to parametric variations to explore the performance metrics for this class of solar cells. Our results show that despite the belief that open circuit voltage is independent of contact layers doping in proper band aligned sensitized solar cells, here we observe otherwise and the open circuit voltage is indeed dependent on the doping density of the contact layers. Using the detailed numerical simulation and analytical model we further identify the performance optimization map of Sb2(S,Se)3 based sensitized solar cells.
Submitted 5 September, 2020;
originally announced September 2020.
-
How Visualization PhD Students Cope with Paper Rejections
Authors:
Shivam Agarwal,
Shahid Latif,
Fabian Beck
Abstract:
We conducted a questionnaire study aimed towards PhD students in the field of visualization research to understand how they cope with paper rejections. We collected responses from 24 participants and performed a qualitative analysis of the data in relation to the provided support by collaborators, resubmission strategies, handling multiple rejects, and personal impression of the reviews. The results indicate that the PhD students in the visualization community generally cope well with the negative reviews and, with experience, learn how to act accordingly to improve and resubmit their work. Our results reveal the main coping strategies that can be applied for constructively handling rejected visualization papers. The most prominent strategies include: discussing reviews with collaborators and making a resubmission plan, doing a major revision to improve the work, shortening the work, and seeing rejection as a positive learning experience.
Submitted 25 September, 2020; v1 submitted 1 September, 2020;
originally announced September 2020.
-
A Chirality-Based Quantum Leap
Authors:
Clarice D. Aiello,
Muneer Abbas,
John M. Abendroth,
Andrei Afanasev,
Shivang Agarwal,
Amartya S. Banerjee,
David N. Beratan,
Jason N. Belling,
Bertrand Berche,
Antia Botana,
Justin R. Caram,
Giuseppe Luca Celardo,
Gianaurelio Cuniberti,
Aitzol Garcia-Etxarri,
Arezoo Dianat,
Ismael Diez-Perez,
Yuqi Guo,
Rafael Gutierrez,
Carmen Herrmann,
Joshua Hihath,
Suneet Kale,
Philip Kurian,
Ying-Cheng Lai,
Alexander Lopez,
Ernesto Medina
, et al. (19 additional authors not shown)
Abstract:
Chiral degrees of freedom occur in matter and in electromagnetic fields and constitute an area of research that is experiencing renewed interest driven by recent observations of the chiral-induced spin selectivity (CISS) effect in chiral molecules and engineered nanomaterials. The CISS effect underpins the fact that charge transport through nanoscopic chiral structures favors a particular electronic spin orientation, resulting in large room-temperature spin polarizations. Observations of the CISS effect suggest opportunities for spin control and for the design and fabrication of room-temperature quantum devices from the bottom up, with atomic-scale precision. Any technology that relies on optimal charge transport, including quantum devices for logic, sensing, and storage, may benefit from chiral quantum properties. These properties can be theoretically and experimentally investigated from a quantum information perspective, which is presently lacking. There are uncharted implications for the quantum sciences once chiral couplings can be engineered to control the storage, transduction, and manipulation of quantum information. This forward-looking perspective provides a survey of the experimental and theoretical fundamentals of chiral-influenced quantum effects, and presents a vision for their future roles in enabling room-temperature quantum technologies.
Submitted 11 November, 2021; v1 submitted 31 August, 2020;
originally announced September 2020.
-
Enhanced Normalized Mutual Information for Localization in Noisy Environments
Authors:
Samuel Todd Flanagan,
Drupad K. Khublani,
J. -F. Chamberland,
Siddharth Agarwal,
Ankit Vora
Abstract:
Fine localization is a crucial task for autonomous vehicles. Although many algorithms have been explored in the literature for this specific task, the goal of getting accurate results from commodity sensors remains a challenge. As autonomous vehicles make the transition from expensive prototypes to production items, the need for inexpensive, yet reliable solutions is increasing rapidly. This article considers scenarios where images are captured with inexpensive cameras and localization takes place using pre-loaded fine maps of local roads as side information. The techniques proposed herein extend schemes based on normalized mutual information by leveraging the likelihood of shades rather than exact sensor readings for localization in noisy environments. This algorithmic enhancement, rooted in statistical signal processing, offers substantial gains in performance. Numerical simulations are used to highlight the benefits of the proposed techniques in representative application scenarios. Analysis of a Ford image set is performed to validate the core findings of this work.
Submitted 24 August, 2020;
originally announced August 2020.
-
GraphReach: Position-Aware Graph Neural Network using Reachability Estimations
Authors:
Sunil Nishad,
Shubhangi Agarwal,
Arnab Bhattacharya,
Sayan Ranu
Abstract:
The majority of existing graph neural networks (GNNs) learn node embeddings that encode their local neighborhoods but not their positions. Consequently, two nodes that are vastly distant but located in similar local neighborhoods map to similar embeddings in those networks. This limitation prevents accurate performance in predictive tasks that rely on position information. In this paper, we develop GraphReach, a position-aware inductive GNN that captures the global positions of nodes through reachability estimations with respect to a set of anchor nodes. The anchors are strategically selected so that reachability estimations across all the nodes are maximized. We show that this combinatorial anchor selection problem is NP-hard and, consequently, develop a greedy (1-1/e) approximation heuristic. Empirical evaluation against state-of-the-art GNN architectures reveals that GraphReach provides up to 40% relative improvement in accuracy. In addition, it is more robust to adversarial attacks.
Submitted 20 August, 2021; v1 submitted 19 August, 2020;
originally announced August 2020.
-
Contextual Diversity for Active Learning
Authors:
Sharat Agarwal,
Himanshu Arora,
Saket Anand,
Chetan Arora
Abstract:
The requirement of large annotated datasets restricts the use of deep convolutional neural networks (CNNs) for many practical applications. The problem can be mitigated by using active learning (AL) techniques which, under a given annotation budget, allow selecting a subset of data that yields maximum accuracy upon fine tuning. State of the art AL approaches typically rely on measures of visual diversity or prediction uncertainty, which are unable to effectively capture the variations in spatial context. On the other hand, modern CNN architectures make heavy use of spatial context for achieving highly accurate predictions. Since the context is difficult to evaluate in the absence of ground-truth labels, we introduce the notion of contextual diversity that captures the confusion associated with spatially co-occurring classes. Contextual Diversity (CD) hinges on a crucial observation that the probability vector predicted by a CNN for a region of interest typically contains information from a larger receptive field. Exploiting this observation, we use the proposed CD measure within two AL frameworks: (1) a core-set based strategy and (2) a reinforcement learning based policy, for active frame selection. Our extensive empirical evaluation establishes state of the art results for active learning on benchmark datasets of Semantic Segmentation, Object Detection and Image Classification. Our ablation studies show clear advantages of using contextual diversity for active learning. The source code and additional results are available at https://github.com/sharat29ag/CDAL.
Submitted 13 August, 2020;
originally announced August 2020.
-
Quantum sensing of open systems: Estimation of damping constants and temperature
Authors:
Jiaxuan Wang,
Luiz Davidovich,
Girish Saran Agarwal
Abstract:
We determine quantum precision limits for estimation of damping constants and temperature of lossy bosonic channels. A direct application would be the use of light for estimation of the absorption and the temperature of a transparent slab. Analytic lower bounds are obtained for the uncertainty in the estimation, through a purification procedure that replaces the master equation description by a unitary evolution involving the system and ad hoc environments. For zero temperature, Fock states are shown to lead to the minimal uncertainty in the estimation of damping, with boson-counting being the best measurement procedure. In both damping and temperature estimates, sequential pre-thermalization measurements, through a stream of single bosons, may lead to huge gain in precision.
Submitted 6 August, 2020;
originally announced August 2020.
-
Optical Valley Hall Effect of 2D Excitons in Hyperbolic Metamaterial
Authors:
Sriram Guddala,
Mandeep Khatoniar,
Nicholas Yama,
W. Liu,
G. S. Agarwal,
Vinod M. Menon
Abstract:
The robust spin and momentum valley locking of electrons in two-dimensional semiconductors makes the valley degree of freedom of great utility for functional optoelectronic devices. Owing to the difference in optical selection rules for the different valleys, these valley electrons can be addressed optically. The electrons and excitons in these materials exhibit the valley Hall effect, where the carriers from specific valleys are directed to different directions under electrical or thermal bias. Here we report the optical valley Hall effect where the light emission from the valley polarized excitons in monolayer WS2 propagates in different directions owing to the preferential coupling of excitonic emission to the high momentum states of the hyperbolic metamaterial. The experimentally observed effects are corroborated with theoretical modeling of excitonic emission in the near field of hyperbolic media. The demonstration of the optical valley Hall effect using a bulk artificial photonic medium without the need for nanostructuring opens the possibility of realizing valley-based excitonic circuits operating at room temperature.
Submitted 1 August, 2020;
originally announced August 2020.
-
Imitative Planning using Conditional Normalizing Flow
Authors:
Shubhankar Agarwal,
Harshit Sikchi,
Cole Gulino,
Eric Wilkinson,
Shivam Gautam
Abstract:
A popular way to plan trajectories in dynamic urban scenarios for Autonomous Vehicles is to rely on explicitly specified and hand crafted cost functions, coupled with random sampling in the trajectory space to find the minimum cost trajectory. Such methods require a high number of samples to find a low-cost trajectory and might end up with a highly suboptimal trajectory given the planning time budget. We explore the application of normalizing flows for improving the performance of trajectory planning for autonomous vehicles (AVs). Our key insight is to learn a sampling policy in a low-dimensional latent space of expert-like trajectories, out of which the best sample is selected for execution. By modeling the trajectory planner's cost manifold as an energy function, we learn a scene conditioned mapping from the prior to a Boltzmann distribution over the AV control space. Finally, we demonstrate the effectiveness of our approach on real-world datasets over IL and hand-constructed trajectory sampling techniques.
Submitted 13 October, 2022; v1 submitted 31 July, 2020;
originally announced July 2020.
-
Cosmological Model Parameter Dependence of the Matter Power Spectrum Covariance from the DEUS-PUR $Cosmo$ Simulations
Authors:
Linda Blot,
Pier-Stefano Corasaniti,
Yann Rasera,
Shankar Agarwal
Abstract:
Future galaxy surveys will provide accurate measurements of the matter power spectrum across an unprecedented range of scales and redshifts. The analysis of these data will require one to accurately model the imprint of non-linearities of the matter density field. In particular, these induce a non-Gaussian contribution to the data covariance that needs to be properly taken into account to realise unbiased cosmological parameter inference analyses. Here, we study the cosmological dependence of the matter power spectrum covariance using a dedicated suite of N-body simulations, the Dark Energy Universe Simulation - Parallel Universe Runs (DEUS-PUR) {\it Cosmo}. These consist of 512 realizations for 10 different cosmologies where we vary the matter density $Ω_m$, the amplitude of density fluctuations $σ_8$, the reduced Hubble parameter $h$ and a constant dark energy equation of state $w$ by approximately $10\%$. We use these data to evaluate the first and second derivatives of the power spectrum covariance with respect to a fiducial $Λ$CDM cosmology. We find that the variations can be as large as $150\%$ depending on the scale, redshift and model parameter considered. By performing a Fisher matrix analysis we explore the impact of different choices in modelling the cosmological dependence of the covariance. Our results suggest that fixing the covariance to a fiducial cosmology can significantly affect the recovered parameter errors and that modelling the cosmological dependence of the variance while keeping the correlation coefficient fixed can alleviate the impact of this effect.
Submitted 2 November, 2020; v1 submitted 29 July, 2020;
originally announced July 2020.
-
Metasurfaces for Quantum Photonics
Authors:
Alexander S. Solntsev,
Girish S. Agarwal,
Yuri S. Kivshar
Abstract:
Rapid progress in the development of metasurfaces has made it possible to replace bulky optical assemblies with thin nanostructured films, often called metasurfaces, opening a broad range of novel and superior applications to the generation, manipulation, and detection of light in classical optics. Recently, these developments have started making headway in quantum photonics, where novel opportunities have arisen for the control of the nonclassical nature of light, including photon statistics, quantum state superposition, quantum entanglement, and single-photon detection. In this Perspective, we review recent progress in the field of quantum-photonics applications of metasurfaces, focusing on innovative and promising approaches to create, manipulate, and detect nonclassical light.
Submitted 29 July, 2020;
originally announced July 2020.
-
IITK at SemEval-2020 Task 8: Unimodal and Bimodal Sentiment Analysis of Internet Memes
Authors:
Vishal Keswani,
Sakshi Singh,
Suryansh Agarwal,
Ashutosh Modi
Abstract:
Social media is abundant in visual and textual information presented together or in isolation. Memes are the most popular form, belonging to the former class. In this paper, we present our approaches for the Memotion Analysis problem as posed in SemEval-2020 Task 8. The goal of this task is to classify memes based on their emotional content and sentiment. We leverage techniques from Natural Language Processing (NLP) and Computer Vision (CV) towards the sentiment classification of internet memes (Subtask A). We consider Bimodal (text and image) as well as Unimodal (text-only) techniques in our study, ranging from the Naïve Bayes classifier to Transformer-based approaches. Our results show that a text-only approach, a simple Feed Forward Neural Network (FFNN) with Word2vec embeddings as input, outperforms all the others. We stand first in the Sentiment analysis task with a relative improvement of 63% over the baseline macro-F1 score. Our work is relevant to any task concerned with the combination of different modalities.
Submitted 21 July, 2020;
originally announced July 2020.
-
Fruit classification using deep feature maps in the presence of deceptive similar classes
Authors:
Mohit Dandekar,
Narinder Singh Punn,
Sanjay Kumar Sonbhadra,
Sonali Agarwal
Abstract:
Autonomous detection and classification of objects is an admired area of research in many industrial applications. Although humans can easily distinguish objects with high multi-granular similarities, this is a very challenging task for machines. Convolutional neural networks (CNNs) have shown efficient performance in multi-level representations of objects for classification. Conventionally, existing deep learning models use the transformed features generated by the rearmost layer for training and testing. However, this does not work well with multi-granular data, especially in the presence of deceptively similar classes (almost similar but different classes). The objective of the present research is to address the challenge of classifying deceptively similar multi-granular objects with an ensemble approach that utilizes activations from multiple layers of a CNN (deep features). These multi-layer activations are further used to build multiple deep decision trees (known as a random forest) for classification of objects with similar appearance. The Fruits-360 dataset is used to evaluate the proposed approach. Extensive trials showed that the proposed model outperformed conventional deep learning approaches.
Submitted 12 July, 2020;
originally announced July 2020.
-
Enhanced Behavioral Cloning Based self-driving Car Using Transfer Learning
Authors:
Uppala Sumanth,
Narinder Singh Punn,
Sanjay Kumar Sonbhadra,
Sonali Agarwal
Abstract:
With the growth of artificial intelligence and autonomous learning, the self-driving car is one of the promising areas of research and is emerging as a center of focus for the automobile industry. Behavioral cloning is the process of replicating human behavior via visuomotor policies by means of machine learning algorithms. In recent years, several deep learning-based behavioral cloning approaches have been developed in the context of self-driving cars, specifically based on the concept of transfer learning. In this context, the present paper proposes a transfer learning approach using the VGG16 architecture, which is fine-tuned by retraining the last block while keeping the other blocks non-trainable. The performance of the proposed architecture is further compared with the existing NVIDIA architecture and its pruned variants (pruned by 22.2% and 33.85% using 1x1 filters to decrease the total number of parameters). Experimental results show that VGG16 with transfer learning outperforms the other discussed approaches, with faster convergence.
Submitted 11 July, 2020;
originally announced July 2020.
-
Quantum Advantage with Seeded Squeezed Light for Absorption Measurement
Authors:
Fu Li,
Tian Li,
Marlan O. Scully,
Girish S. Agarwal
Abstract:
Absorption measurement is an exceptionally versatile tool for many applications in science and engineering. For absorption measurements using laser beams of light, the sensitivity is theoretically limited by the shot noise due to the fundamental Poisson distribution of photon number in laser radiation. In practice, the shot-noise limit can only be achieved when all other sources of noise are eliminated. Here, we use seeded squeezed light to demonstrate that direct absorption measurement can be performed with sensitivity beyond the shot-noise limit. We present a practically realizable scheme, where intensity squeezed beams are generated by a seeded four-wave mixing process in an atomic rubidium vapor cell. More than 1.2 dB quantum advantage for the measurement sensitivity is obtained at faint absorption levels ($\leq 10\%$). We also present a detailed theoretical analysis to show that the observed quantum advantage when corrected for optical loss would be equivalent to 3 dB. Our experiment demonstrates a direct sub-shot-noise measurement of absorption that requires neither homodyne/lock-in nor logic coincidence detection schemes. It is therefore very applicable in many circumstances where sub-shot-noise-level absorption measurements are highly desirable.
Submitted 21 April, 2021; v1 submitted 10 July, 2020;
originally announced July 2020.
-
Attack of the Tails: Yes, You Really Can Backdoor Federated Learning
Authors:
Hongyi Wang,
Kartik Sreenivasan,
Shashank Rajput,
Harit Vishwakarma,
Saurabh Agarwal,
Jy-yong Sohn,
Kangwook Lee,
Dimitris Papailiopoulos
Abstract:
Due to its decentralized nature, Federated Learning (FL) lends itself to adversarial attacks in the form of backdoors during training. The goal of a backdoor is to corrupt the performance of the trained model on specific sub-tasks (e.g., by classifying green cars as frogs). A range of FL backdoor attacks have been introduced in the literature, but also methods to defend against them, and it is currently an open question whether FL systems can be tailored to be robust against backdoors. In this work, we provide evidence to the contrary. We first establish that, in the general case, robustness to backdoors implies model robustness to adversarial examples, a major open problem in itself. Furthermore, detecting the presence of a backdoor in a FL model is unlikely assuming first order oracles or polynomial time. We couple our theoretical results with a new family of backdoor attacks, which we refer to as edge-case backdoors. An edge-case backdoor forces a model to misclassify on seemingly easy inputs that are however unlikely to be part of the training, or test data, i.e., they live on the tail of the input distribution. We explain how these edge-case backdoors can lead to unsavory failures and may have serious repercussions on fairness, and exhibit that with careful tuning at the side of the adversary, one can insert them across a range of machine learning tasks (e.g., image classification, OCR, text prediction, sentiment analysis).
Submitted 9 July, 2020;
originally announced July 2020.
-
Observation of photonic spin-momentum locking due to coupling of achiral metamaterials and quantum dots
Authors:
Ravindra Kumar Yadav,
Wenxiao Liu,
SRK Chaitanya Indukuri,
Adarsh B. Vasista,
G. V. Pavan Kumar,
Girish S. Agarwal,
Jaydeep K Basu
Abstract:
Here, we report observations of photonic spin-momentum locking in the form of directional and chiral emission from achiral quantum dots (QDs) evanescently coupled to achiral hyperbolic metamaterials (HMM). Efficient coupling between QDs and the metamaterial leads to emergence of these photonic topological modes which can be detected in the far field. We provide theoretical explanation for the emergence of spin-momentum locking through rigorous modeling based on photon Green's function where pseudo spin of light arises from coupling of QDs to evanescent modes of HMM.
Submitted 31 October, 2020; v1 submitted 1 July, 2020;
originally announced July 2020.
-
BPGC at SemEval-2020 Task 11: Propaganda Detection in News Articles with Multi-Granularity Knowledge Sharing and Linguistic Features based Ensemble Learning
Authors:
Rajaswa Patil,
Somesh Singh,
Swati Agarwal
Abstract:
Propaganda spreads the ideology and beliefs of like-minded people, brainwashing their audiences, and sometimes leading to violence. SemEval 2020 Task-11 aims to design automated systems for news propaganda detection. Task-11 consists of two sub-tasks, namely, Span Identification - given any news article, the system tags those specific fragments which contain at least one propaganda technique; and Technique Classification - correctly classify a given propagandist statement amongst 14 propaganda techniques. For sub-task 1, we use contextual embeddings extracted from pre-trained transformer models to represent the text data at various granularities and propose a multi-granularity knowledge sharing approach. For sub-task 2, we use an ensemble of BERT and logistic regression classifiers with linguistic features. Our results reveal that the linguistic features are the strong indicators for covering minority classes in a highly imbalanced dataset.
Submitted 24 August, 2020; v1 submitted 31 May, 2020;
originally announced June 2020.
-
Language Models are Few-Shot Learners
Authors:
Tom B. Brown,
Benjamin Mann,
Nick Ryder,
Melanie Subbiah,
Jared Kaplan,
Prafulla Dhariwal,
Arvind Neelakantan,
Pranav Shyam,
Girish Sastry,
Amanda Askell,
Sandhini Agarwal,
Ariel Herbert-Voss,
Gretchen Krueger,
Tom Henighan,
Rewon Child,
Aditya Ramesh,
Daniel M. Ziegler,
Jeffrey Wu,
Clemens Winter,
Christopher Hesse,
Mark Chen,
Eric Sigler,
Mateusz Litwin,
Scott Gray,
Benjamin Chess
, et al. (6 additional authors not shown)
Abstract:
Recent work has demonstrated substantial gains on many NLP tasks and benchmarks by pre-training on a large corpus of text followed by fine-tuning on a specific task. While typically task-agnostic in architecture, this method still requires task-specific fine-tuning datasets of thousands or tens of thousands of examples. By contrast, humans can generally perform a new language task from only a few examples or from simple instructions - something which current NLP systems still largely struggle to do. Here we show that scaling up language models greatly improves task-agnostic, few-shot performance, sometimes even reaching competitiveness with prior state-of-the-art fine-tuning approaches. Specifically, we train GPT-3, an autoregressive language model with 175 billion parameters, 10x more than any previous non-sparse language model, and test its performance in the few-shot setting. For all tasks, GPT-3 is applied without any gradient updates or fine-tuning, with tasks and few-shot demonstrations specified purely via text interaction with the model. GPT-3 achieves strong performance on many NLP datasets, including translation, question-answering, and cloze tasks, as well as several tasks that require on-the-fly reasoning or domain adaptation, such as unscrambling words, using a novel word in a sentence, or performing 3-digit arithmetic. At the same time, we also identify some datasets where GPT-3's few-shot learning still struggles, as well as some datasets where GPT-3 faces methodological issues related to training on large web corpora. Finally, we find that GPT-3 can generate samples of news articles which human evaluators have difficulty distinguishing from articles written by humans. We discuss broader societal impacts of this finding and of GPT-3 in general.
Submitted 22 July, 2020; v1 submitted 28 May, 2020;
originally announced May 2020.
-
Deletion and contraction in configuration spaces of graphs
Authors:
Sanjana Agarwal,
Maya Banks,
Nir Gadish,
Dane Miyata
Abstract:
The aim of this article is to provide space level maps between configuration spaces of graphs that are predicted by algebraic manipulations of cellular chains. More explicitly, we consider edge contraction and half-edge deletion, and identify the homotopy cofibers in terms of configuration spaces of simpler graphs. The construction's main benefit lies in making the operations functorial - in particular, graph minors give rise to compatible maps at the level of fundamental groups as well as generalized (co)homology theories.
As applications we provide a long exact sequence for half-edge deletion in any generalized cohomology theory, compatible with cohomology operations such as the Steenrod and Adams operations, allowing for inductive calculations in this general context. We also show that the generalized homology of unordered configuration spaces is finitely generated as a representation of the opposite graph minor category.
Submitted 27 May, 2020;
originally announced May 2020.
-
Nonlinear Spin Currents
Authors:
Jayakrishnan M. P. Nair,
Zhedong Zhang,
Marlan O. Scully,
Girish S. Agarwal
Abstract:
The cavity mediated spin current between two ferrite samples has been reported by Bai et al. [Phys. Rev. Lett. 118, 217201 (2017)]. This experiment was done in the linear regime of the interaction in the presence of an external drive. In the current paper we develop a theory for the spin current in the nonlinear domain where the external drive is strong so that one needs to include the Kerr nonlinearity of the ferrite materials. In this manner the nonlinear polaritons are created and one can reach both bistable and multistable behavior of the spin current. The system is driven into a far-from-equilibrium steady state which is determined by the details of the driving field and various interactions. We present a variety of steady state results for the spin current. A spectroscopic detection of the nonlinear spin current is developed, revealing the key properties of the nonlinear polaritons. The transmission of a weak probe is used to obtain quantitative information on the multistable behavior of the spin current. The results and methods that we present are quite generic and can be used in many other contexts where cavities are used to transfer information from one system to another, e.g., two different molecular systems.
Submitted 26 May, 2020;
originally announced May 2020.
-
Unleashing the power of disruptive and emerging technologies amid COVID-19: A detailed review
Authors:
Sonali Agarwal,
Narinder Singh Punn,
Sanjay Kumar Sonbhadra,
M. Tanveer,
P. Nagabhushan,
K K Soundra Pandian,
Praveer Saxena
Abstract:
The unprecedented outbreak of the novel coronavirus (COVID-19), during early December 2019 in Wuhan, China, has quickly evolved into a global pandemic, become a matter of grave concern, and placed government agencies worldwide in a precarious position. The scarcity of resources and lack of experience to endure the COVID-19 pandemic, combined with the fear of future consequences, has established the need for adoption of emerging and future technologies to address the upcoming challenges. Over the last five months, the impact of the pandemic has reached a pinnacle that is altering everyone's life, and humans are now bound to adopt safe ways to survive under the risk of being affected. Technological advances are now accelerating faster than ever before to stay ahead of the consequences and acquire new capabilities to build a safer world. Thus, there is a rising need to unfold the power of emerging, future and disruptive technologies to explore all possible ways to fight against COVID-19. In this review article, we attempt to study all emerging, future, and disruptive technologies that can be utilized to mitigate the impact of COVID-19. Building on background insights, detailed technology-specific use cases to fight against COVID-19 are discussed in terms of their strengths, weaknesses, opportunities, and threats (SWOT). As concluding remarks, we highlight prioritized research areas and upcoming opportunities to blur the lines between the physical, digital, and biological domain-specific challenges and also illuminate collaborative research directions for moving towards a post-COVID-19 world.
Submitted 19 April, 2021; v1 submitted 23 May, 2020;
originally announced May 2020.
-
Surprising simplicity in the modeling of dynamic granular intrusion
Authors:
Shashank Agarwal,
Andras Karsai,
Daniel I Goldman,
Ken Kamrin
Abstract:
Granular intrusions, such as dynamic impact or wheel locomotion, are complex multiphase phenomena where the grains exhibit solid-like and fluid-like characteristics together with an ejected gas-like phase. Despite decades of modeling efforts, a unified description of the physics in such intrusions is as yet unknown. Here we show that a continuum model based on the simple notions of frictional flow and tension-free separation describes complex granular intrusions near free surfaces. This model captures dynamics in a variety of experiments including wheel locomotion, plate intrusions, and running legged robots. The model reveals that three effects (a static contribution and two dynamic ones) primarily give rise to intrusion forces in such scenarios. Identification of these effects enables the development of a further reduced-order technique (Dynamic Resistive Force Theory) for rapid modeling of granular locomotion of arbitrarily shaped intruders. The continuum-motivated strategy we propose for identifying physical mechanisms and corresponding reduced-order relations has potential use for a variety of other materials.
Submitted 21 January, 2021; v1 submitted 21 May, 2020;
originally announced May 2020.
-
How much complexity does an RNN architecture need to learn syntax-sensitive dependencies?
Authors:
Gantavya Bhatt,
Hritik Bansal,
Rishubh Singh,
Sumeet Agarwal
Abstract:
Long short-term memory (LSTM) networks and their variants are capable of encapsulating long-range dependencies, which is evident from their performance on a variety of linguistic tasks. On the other hand, simple recurrent networks (SRNs), which appear more biologically grounded in terms of synaptic connections, have generally been less successful at capturing long-range dependencies as well as the loci of grammatical errors in an unsupervised setting. In this paper, we seek to develop models that bridge the gap between biological plausibility and linguistic competence. We propose a new architecture, the Decay RNN, which incorporates the decaying nature of neuronal activations and models the excitatory and inhibitory connections in a population of neurons. Besides its biological inspiration, our model also shows competitive performance relative to LSTMs on subject-verb agreement, sentence grammaticality, and language modeling tasks. These results provide some pointers towards probing the nature of the inductive biases required for RNN architectures to model linguistic phenomena successfully.
Submitted 25 May, 2020; v1 submitted 17 May, 2020;
originally announced May 2020.
-
History for Visual Dialog: Do we really need it?
Authors:
Shubham Agarwal,
Trung Bui,
Joon-Young Lee,
Ioannis Konstas,
Verena Rieser
Abstract:
Visual Dialog involves "understanding" the dialog history (what has been discussed previously) and the current question (what is asked), in addition to grounding information in the image, to generate the correct response. In this paper, we show that co-attention models which explicitly encode dialog history outperform models that don't, achieving state-of-the-art performance (72% NDCG on val set). However, we also expose shortcomings of the crowd-sourcing dataset collection procedure by showing that history is indeed only required for a small amount of the data and that the current evaluation metric encourages generic replies. To that end, we propose a challenging subset (VisDialConv) of the VisDial val set and provide a benchmark of 63% NDCG.
Submitted 8 May, 2020;
originally announced May 2020.
-
Using Computer Vision to enhance Safety of Workforce in Manufacturing in a Post COVID World
Authors:
Prateek Khandelwal,
Anuj Khandelwal,
Snigdha Agarwal,
Deep Thomas,
Naveen Xavier,
Arun Raghuraman
Abstract:
The COVID-19 pandemic forced governments across the world to impose lockdowns to prevent virus transmissions. This resulted in the shutdown of all economic activity and accordingly the production at manufacturing plants across most sectors was halted. While there is an urgency to resume production, there is an even greater need to ensure the safety of the workforce at the plant site. Reports indicate that maintaining social distancing and wearing face masks while at work clearly reduces the risk of transmission. We decided to use computer vision on CCTV feeds to monitor worker activity and detect violations which trigger real time voice alerts on the shop floor. This paper describes an efficient and economic approach of using AI to create a safe environment in a manufacturing setup. We demonstrate our approach to build a robust social distancing measurement algorithm using a mix of modern-day deep learning and classic projective geometry techniques. We have deployed our solution at manufacturing plants across the Aditya Birla Group (ABG). We have also described our face mask detection approach which provides a high accuracy across a range of customized masks.
Submitted 25 May, 2020; v1 submitted 11 May, 2020;
originally announced May 2020.
-
Monitoring COVID-19 social distancing with person detection and tracking via fine-tuned YOLO v3 and Deepsort techniques
Authors:
Narinder Singh Punn,
Sanjay Kumar Sonbhadra,
Sonali Agarwal,
Gaurav Rai
Abstract:
The rampant coronavirus disease 2019 (COVID-19) has brought a global crisis with its deadly spread to more than 180 countries, with about 3,519,901 confirmed cases and 247,630 deaths globally as of May 4, 2020. The absence of any active therapeutic agents and the lack of immunity against COVID-19 increase the vulnerability of the population. Since there are no vaccines available, social distancing is the only feasible approach to fight against this pandemic. Motivated by this notion, this article proposes a deep learning based framework for automating the task of monitoring social distancing using surveillance video. The proposed framework utilizes the YOLO v3 object detection model to segregate humans from the background and the Deepsort approach to track the identified people with the help of bounding boxes and assigned IDs. The results of the YOLO v3 model are further compared with other popular state-of-the-art models, e.g. faster region-based CNN (convolution neural network) and single shot detector (SSD), in terms of mean average precision (mAP), frames per second (FPS) and loss values defined by object classification and localization. Later, the pairwise vectorized L2 norm is computed based on the three-dimensional feature space obtained by using the centroid coordinates and dimensions of the bounding box. The violation index term is proposed to quantize the non-adoption of the social distancing protocol. From the experimental analysis, it is observed that YOLO v3 with the Deepsort tracking scheme displayed the best results, with balanced mAP and FPS scores, for monitoring social distancing in real time.
Submitted 27 April, 2021; v1 submitted 4 May, 2020;
originally announced May 2020.