-
Utilizing Adversarial Examples for Bias Mitigation and Accuracy Enhancement
Authors:
Pushkar Shukla,
Dhruv Srikanth,
Lee Cohen,
Matthew Turk
Abstract:
We propose a novel approach to mitigate biases in computer vision models by utilizing counterfactual generation and fine-tuning. While counterfactuals have been used to analyze and address biases in DNN models, the counterfactuals themselves are often generated from biased generative models, which can introduce additional biases or spurious correlations. To address this issue, we propose using adversarial images, that is images that deceive a deep neural network but not humans, as counterfactuals for fair model training. Our approach leverages a curriculum learning framework combined with a fine-grained adversarial loss to fine-tune the model using adversarial examples. By incorporating adversarial images into the training data, we aim to prevent biases from propagating through the pipeline. We validate our approach through both qualitative and quantitative assessments, demonstrating improved bias mitigation and accuracy compared to existing methods. Qualitatively, our results indicate that post-training, the decisions made by the model are less dependent on the sensitive attribute and our model better disentangles the relationship between sensitive attributes and classification variables.
Submitted 27 June, 2024; v1 submitted 17 April, 2024;
originally announced April 2024.
-
DiffRed: Dimensionality Reduction guided by stable rank
Authors:
Prarabdh Shukla,
Gagan Raj Gupta,
Kunal Dutta
Abstract:
In this work, we propose a novel dimensionality reduction technique, DiffRed, which first projects the data matrix, $A$, along the first $k_1$ principal components and the residual matrix $A^{*}$ (left after subtracting its rank-$k_1$ approximation) along $k_2$ Gaussian random vectors. We evaluate M1, the distortion of mean-squared pairwise distance, and Stress, the normalized value of the RMS of distortion of the pairwise distances. We rigorously prove that DiffRed achieves a general upper bound of $O\left(\sqrt{\frac{1-p}{k_2}}\right)$ on Stress and $O\left(\frac{1-p}{\sqrt{k_2 \cdot \rho(A^{*})}}\right)$ on M1, where $p$ is the fraction of variance explained by the first $k_1$ principal components and $\rho(A^{*})$ is the stable rank of $A^{*}$. These bounds are tighter than the currently known results for random maps. Our extensive experiments on a variety of real-world datasets demonstrate that DiffRed achieves near-zero M1 and much lower values of Stress than well-known dimensionality reduction techniques. In particular, DiffRed can map a 6-million-dimensional dataset to 10 dimensions with 54% lower Stress than PCA.
Submitted 9 March, 2024;
originally announced March 2024.
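The quantities named in the DiffRed abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it covers only a plain Gaussian random map (the second stage described in the abstract), a Stress metric, and the stable rank $\|A\|_F^2 / \|A\|_2^2$. The function names, the exact Stress normalization, and the power-iteration estimate of the spectral norm are assumptions made for illustration.

```python
import math
import random

def pairwise_distances(X):
    """Euclidean distances between all unordered pairs of rows."""
    n = len(X)
    return [math.dist(X[i], X[j]) for i in range(n) for j in range(i + 1, n)]

def stress(X, Y):
    """Normalized RMS distortion of pairwise distances (assumed form)."""
    d_hi = pairwise_distances(X)
    d_lo = pairwise_distances(Y)
    num = sum((a - b) ** 2 for a, b in zip(d_hi, d_lo))
    den = sum(a ** 2 for a in d_hi)
    return math.sqrt(num / den)

def gaussian_random_map(X, k, seed=0):
    """Project rows of X onto k Gaussian random vectors, scaled by 1/sqrt(k)."""
    rng = random.Random(seed)
    d = len(X[0])
    G = [[rng.gauss(0.0, 1.0) / math.sqrt(k) for _ in range(k)] for _ in range(d)]
    return [[sum(x[i] * G[i][j] for i in range(d)) for j in range(k)] for x in X]

def stable_rank(A, iters=100, seed=0):
    """Squared Frobenius norm over squared spectral norm (power iteration)."""
    rng = random.Random(seed)
    n_rows, n_cols = len(A), len(A[0])
    v = [rng.gauss(0.0, 1.0) for _ in range(n_cols)]
    for _ in range(iters):
        Av = [sum(A[i][j] * v[j] for j in range(n_cols)) for i in range(n_rows)]
        w = [sum(A[i][j] * Av[i] for i in range(n_rows)) for j in range(n_cols)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    Av = [sum(A[i][j] * v[j] for j in range(n_cols)) for i in range(n_rows)]
    spectral_sq = sum(x * x for x in Av)
    frob_sq = sum(x * x for row in A for x in row)
    return frob_sq / spectral_sq

# Hypothetical toy data: 4 points in 5 dimensions, mapped to 2 dimensions.
X = [[1.0, 0.0, 2.0, 0.5, 1.5],
     [0.0, 1.0, 1.0, 2.5, 0.5],
     [2.0, 2.0, 0.0, 1.0, 1.0],
     [0.5, 1.5, 1.0, 0.0, 2.0]]
Y = gaussian_random_map(X, k=2)
print("Stress of random map:", stress(X, Y))
print("Stable rank of X:", stable_rank(X))
```

The stable rank is at most the ordinary rank, so a residual matrix with a high stable rank spreads its energy over many directions, which is the regime in which the abstract's M1 bound becomes small.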
-
Navigating the Unknown: Uncertainty-Aware Compute-in-Memory Autonomy of Edge Robotics
Authors:
Nastaran Darabi,
Priyesh Shukla,
Dinithi Jayasuriya,
Divake Kumar,
Alex C. Stutts,
Amit Ranjan Trivedi
Abstract:
This paper addresses the challenging problem of energy-efficient and uncertainty-aware pose estimation in insect-scale drones, which is crucial for tasks such as surveillance in constricted spaces and for enabling non-intrusive spatial intelligence in smart homes. Since tiny drones operate in highly dynamic environments, where factors like lighting and human movement impact their predictive accuracy, it is crucial to deploy uncertainty-aware prediction algorithms that can account for environmental variations and express not only the prediction but also confidence in the prediction. We address both of these challenges with Compute-in-Memory (CIM) which has become a pivotal technology for deep learning acceleration at the edge. While traditional CIM techniques are promising for energy-efficient deep learning, to bring in the robustness of uncertainty-aware predictions at the edge, we introduce a suite of novel techniques: First, we discuss CIM-based acceleration of Bayesian filtering methods uniquely by leveraging the Gaussian-like switching current of CMOS inverters along with co-design of kernel functions to operate with extreme parallelism and with extreme energy efficiency. Secondly, we discuss the CIM-based acceleration of variational inference of deep learning models through probabilistic processing while unfolding iterative computations of the method with a compute reuse strategy to significantly minimize the workload. Overall, our co-design methodologies demonstrate the potential of CIM to improve the processing efficiency of uncertainty-aware algorithms by orders of magnitude, thereby enabling edge robotics to access the robustness of sophisticated prediction frameworks within their extremely stringent area/power resources.
Submitted 30 January, 2024;
originally announced January 2024.
-
A New Dataflow Implementation to Improve Energy Efficiency of Monolithic 3D Systolic Arrays
Authors:
Prachi Shukla,
Vasilis F. Pavlidis,
Emre Salman,
Ayse K. Coskun
Abstract:
Systolic arrays are popular for executing deep neural networks (DNNs) at the edge. Low latency and energy efficiency are key requirements in edge devices such as drones and autonomous vehicles. Monolithic 3D (MONO3D) is an emerging 3D integration technique that offers ultra-high bandwidth among processing and memory elements with a negligible area overhead. Such high bandwidth can help meet the ever-growing latency and energy efficiency demands for DNNs. This paper presents a novel implementation for weight stationary (WS) dataflow in MONO3D systolic arrays, called WS-MONO3D. WS-MONO3D utilizes multiple resistive RAM layers and SRAM with high-density vertical interconnects to multicast inputs and perform high-bandwidth weight pre-loading while maintaining the same order of multiply-and-accumulate operations as in native WS dataflow. Consequently, WS-MONO3D eliminates input and weight forwarding cycles and, thus, provides up to 40% improvement in energy-delay-product (EDP) over the native WS implementation in 2D at iso-configuration. WS-MONO3D also provides 10X improvement in inference per second per watt per footprint due to multiple vertical tiers. Finally, we also show that temperature impacts the energy efficiency benefits in WS-MONO3D.
Submitted 7 January, 2024;
originally announced January 2024.
-
TIBET: Identifying and Evaluating Biases in Text-to-Image Generative Models
Authors:
Aditya Chinchure,
Pushkar Shukla,
Gaurav Bhatt,
Kiri Salij,
Kartik Hosanagar,
Leonid Sigal,
Matthew Turk
Abstract:
Text-to-Image (TTI) generative models have shown great progress in the past few years in terms of their ability to generate complex and high-quality imagery. At the same time, these models have been shown to suffer from harmful biases, including exaggerated societal biases (e.g., gender, ethnicity), as well as incidental correlations that limit such models' ability to generate more diverse imagery. In this paper, we propose a general approach to study and quantify a broad spectrum of biases, for any TTI model and for any prompt, using counterfactual reasoning. Unlike other works that evaluate generated images on a predefined set of bias axes, our approach automatically identifies potential biases that might be relevant to the given prompt, and measures those biases. In addition, our paper extends quantitative scores with post-hoc explanations in terms of semantic concepts in the images generated. We show that our method is uniquely capable of explaining complex multi-dimensional biases through semantic concepts, as well as the intersectionality between different biases for any given prompt. We perform extensive user studies to illustrate that the results of our method and analysis are consistent with human judgements.
Submitted 2 December, 2023;
originally announced December 2023.
-
Forecasting Tropical Cyclones with Cascaded Diffusion Models
Authors:
Pritthijit Nath,
Pancham Shukla,
Shuai Wang,
César Quilodrán-Casas
Abstract:
As tropical cyclones become more intense due to climate change, the rise of AI-based modelling provides a more affordable and accessible approach compared to traditional methods based on mathematical models. This work leverages generative diffusion models to forecast cyclone trajectories and precipitation patterns by integrating satellite imaging, remote sensing, and atmospheric data. It employs a cascaded approach that incorporates three main tasks: forecasting, super-resolution, and precipitation modelling. The training dataset includes 51 cyclones from six major tropical cyclone basins from January 2019 to March 2023. Experiments demonstrate that the final forecasts from the cascaded models show accurate predictions up to a 36-hour rollout, with excellent Structural Similarity (SSIM) and Peak Signal-to-Noise Ratio (PSNR) values exceeding 0.5 and 20 dB, respectively, for all three tasks. The 36-hour forecasts can be produced in as little as 30 minutes on a single Nvidia A30/RTX 2080 Ti. This work also highlights the promising efficiency of AI methods such as diffusion models for high-performance needs in weather forecasting, such as tropical cyclone forecasting, while remaining computationally affordable, making them ideal for highly vulnerable regions with critical forecasting needs and financial limitations. Code accessible at https://github.com/nathzi1505/forecast-diffmodels.
Submitted 7 April, 2024; v1 submitted 2 October, 2023;
originally announced October 2023.
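For readers unfamiliar with the PSNR figure quoted in the cyclone forecasting abstract, the standard definition is $10 \log_{10}(\mathrm{MAX}^2 / \mathrm{MSE})$. The sketch below computes it for small nested-list images; the function name, the toy arrays, and the assumption that pixel values lie in [0, 1] are illustrative and not taken from the paper's code.

```python
import math

def psnr(reference, prediction, max_value=1.0):
    """Peak signal-to-noise ratio in dB between two equally sized 2D arrays."""
    flat_ref = [v for row in reference for v in row]
    flat_pred = [v for row in prediction for v in row]
    mse = sum((r - p) ** 2 for r, p in zip(flat_ref, flat_pred)) / len(flat_ref)
    if mse == 0:
        return math.inf  # identical images: PSNR is unbounded
    return 10 * math.log10(max_value ** 2 / mse)

# Hypothetical 2x2 example: a uniform error of 0.1 on a [0, 1] scale.
reference = [[0.0, 0.0], [0.0, 0.0]]
prediction = [[0.1, 0.1], [0.1, 0.1]]
print(psnr(reference, prediction))
```

A uniform error of 0.1 on a unit scale gives an MSE of 0.01 and therefore a PSNR of about 20 dB, which is the threshold the abstract reports exceeding.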
-
Music Source Separation Based on a Lightweight Deep Learning Framework (DTTNET: DUAL-PATH TFC-TDF UNET)
Authors:
Junyu Chen,
Susmitha Vekkot,
Pancham Shukla
Abstract:
Music source separation (MSS) aims to extract 'vocals', 'drums', 'bass' and 'other' tracks from a piece of mixed music. While deep learning methods have shown impressive results, there is a trend toward larger models. In our paper, we introduce a novel and lightweight architecture called DTTNet, which is based on Dual-Path Module and Time-Frequency Convolutions Time-Distributed Fully-connected UNet (TFC-TDF UNet). DTTNet achieves 10.12 dB cSDR on 'vocals' compared to 10.01 dB reported for Bandsplit RNN (BSRNN) but with 86.7% fewer parameters. We also assess pattern-specific performance and model generalization for intricate audio patterns.
Submitted 19 March, 2024; v1 submitted 15 September, 2023;
originally announced September 2023.
-
Vision-Based Intelligent Robot Grasping Using Sparse Neural Network
Authors:
Priya Shukla,
Vandana Kushwaha,
G C Nandi
Abstract:
In the modern era of deep learning, network parameters play a vital role in a model's efficiency, but they bring limitations such as extensive computation and memory requirements, which may not be suitable for real-time intelligent robot grasping tasks. Current research focuses on how model efficiency can be maintained by introducing sparsity without compromising the accuracy of the model in the robot grasping domain. More specifically, in this research two lightweight neural networks are introduced, namely Sparse-GRConvNet and Sparse-GINNet, which leverage sparsity in the robotic grasping domain for grasp pose generation by integrating the Edge-PopUp algorithm. This algorithm facilitates the identification of the top K% of edges by considering their respective score values. Both the Sparse-GRConvNet and Sparse-GINNet models are designed to generate high-quality grasp poses in real time at every pixel location, enabling robots to effectively manipulate unfamiliar objects. We extensively trained our models using two benchmark datasets: the Cornell Grasping Dataset (CGD) and the Jacquard Grasping Dataset (JGD). Both Sparse-GRConvNet and Sparse-GINNet outperform the current state-of-the-art methods, achieving an accuracy of 97.75% with only 10% of the weight of GR-ConvNet and 50% of the weight of GI-NNet, respectively, on CGD. Additionally, Sparse-GRConvNet achieves an accuracy of 85.77% with 30% of the weight of GR-ConvNet, and Sparse-GINNet achieves an accuracy of 81.11% with 10% of the weight of GI-NNet, on JGD. To validate the performance of our proposed models, we conducted extensive experiments using the Anukul (Baxter) hardware cobot.
Submitted 22 August, 2023;
originally announced August 2023.
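The top-K% edge selection mentioned in the sparse grasping abstract can be sketched as a simple masking step. This is an illustration only, assuming that each edge already has a score; how the scores are learned is omitted, and the function names and toy values are hypothetical rather than taken from the paper.

```python
import math

def top_k_mask(scores, keep_fraction):
    """Return a 0/1 mask that keeps the highest-scoring fraction of edges."""
    n_keep = math.ceil(keep_fraction * len(scores))
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    keep = set(ranked[:n_keep])
    return [1 if i in keep else 0 for i in range(len(scores))]

def apply_mask(weights, mask):
    """Zero out the weights of edges that were not selected."""
    return [w * m for w, m in zip(weights, mask)]

# Hypothetical layer with four edges; keep the top 50% by score.
scores = [0.1, 0.9, 0.5, 0.3]
weights = [0.7, -1.2, 0.4, 2.0]
mask = top_k_mask(scores, keep_fraction=0.5)
sparse_weights = apply_mask(weights, mask)
print(mask, sparse_weights)
```

Note that selection depends on the scores rather than on weight magnitude: the largest weight (2.0) is dropped here because its score is low.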
-
Blockchain-Based Transferable Digital Rights of Land
Authors:
Ras Dwivedi,
Sumit Patel,
Prof. Sandeep Shukla
Abstract:
Land, being a scarce and valuable resource, is in high demand, especially in densely populated areas of older cities. Development authorities require land for infrastructure projects and other amenities, while landowners hold onto their land for both its usage and its financial value. Transferable Development Rights (TDRs) serve as a mechanism to separate the development rights associated with the land from the physical land itself. Development authorities acquire the land by offering compensation in the form of TDRs, which hold monetary value. In this paper, we present the tokenization of development rights, focusing on the implementation in collaboration with a development authority. While there have been previous implementations of land tokenization, we believe our approach is the first to tokenize development rights specifically. Our implementation addresses practical challenges related to record-keeping, ground verification of land, and the unique identification of stakeholders. We ensure the accurate evaluation of development rights by incorporating publicly available circle rates, which consider the ground development of the land and its surrounding areas.
Submitted 11 August, 2023;
originally announced August 2023.
-
Metal Oxide-based Gas Sensor Array for the VOCs Analysis in Complex Mixtures using Machine Learning
Authors:
Shivam Singh,
Sajana S,
Poornima,
Gajje Sreelekha,
Chandranath Adak,
Rajendra P. Shukla,
Vinayak Kamble
Abstract:
Detection of Volatile Organic Compounds (VOCs) from the breath is becoming a viable route for the early, non-invasive detection of diseases. This paper presents a sensor array with three metal oxide electrodes that can use machine learning methods to identify four distinct VOCs in a mixture. The metal oxide sensor array was subjected to various concentrations of VOCs, including ethanol, acetone, toluene, and chloroform. The dataset obtained from individual gases and their mixtures was analyzed using multiple machine learning algorithms, such as Random Forest (RF), K-Nearest Neighbor (KNN), Decision Tree, Linear Regression, Logistic Regression, Naive Bayes, Linear Discriminant Analysis, Artificial Neural Network, and Support Vector Machine. KNN and RF have shown more than 99% accuracy in classifying the different chemicals in the gas mixtures. In regression analysis, KNN has delivered the best results, with an R2 value of more than 0.99 and LOD of 0.012, 0.015, 0.014, and 0.025 PPM for predicting the concentrations of acetone, toluene, ethanol, and chloroform, respectively, in complex mixtures. Therefore, it is demonstrated that the array utilizing the provided algorithms can classify and predict the concentrations of the four gases simultaneously for disease diagnosis and treatment monitoring.
Submitted 14 February, 2024; v1 submitted 13 July, 2023;
originally announced July 2023.
-
Robot Patrol: Using Crowdsourcing and Robotic Systems to Provide Indoor Navigation Guidance to The Visually Impaired
Authors:
Ike Obi,
Ruiqi Wang,
Prakash Shukla,
Byung-Cheol Min
Abstract:
Indoor navigation is a challenging activity for persons with disabilities, particularly, for those with low vision and visual impairment. Researchers have explored numerous solutions to resolve these challenges; however, several issues remain unsolved, particularly around providing dynamic and contextual information about potential obstacles in indoor environments. In this paper, we developed Robot Patrol, an integrated system that employs a combination of crowdsourcing, computer vision, and robotic frameworks to provide contextual information to the visually impaired to empower them to navigate indoor spaces safely. In particular, the system is designed to provide information to the visually impaired about 1) potential obstacles on the route to their indoor destination, 2) information about indoor events on their route which they may wish to avoid or attend, and 3) any other contextual information that might support them to navigate to their indoor destinations safely and effectively. Findings from the Wizard of Oz experiment of our demo system provide insights into the benefits and limitations of the system. We provide a concise discussion on the implications of our findings.
Submitted 5 June, 2023;
originally announced June 2023.
-
Context-aware 6D Pose Estimation of Known Objects using RGB-D data
Authors:
Ankit Kumar,
Priya Shukla,
Vandana Kushwaha,
G. C. Nandi
Abstract:
6D object pose estimation has been a research topic in the field of computer vision and robotics. Many modern applications, such as robot grasping, manipulation, and autonomous navigation, require the correct pose of objects present in a scene to perform their specific task. The problem becomes even harder when the objects are placed in a cluttered scene and the level of occlusion is high. Prior works have tried to overcome this problem but could not achieve accuracy that can be considered reliable in real-world applications. In this paper, we present an architecture that, unlike prior work, is context-aware. It utilizes the context information available to us about the objects. Our proposed architecture treats the objects separately according to their types, i.e., symmetric and non-symmetric. A deeper estimator and refiner network pair is used for non-symmetric objects than for symmetric objects, owing to their intrinsic differences. Our experiments show an enhancement in accuracy of about 3.2% on the LineMOD dataset, which is considered a benchmark for pose estimation in occluded and cluttered scenes, against the prior state-of-the-art DenseFusion. Our results also show that the inference time we obtained is sufficient for real-time usage.
Submitted 11 December, 2022;
originally announced December 2022.
-
Robust Monocular Localization of Drones by Adapting Domain Maps to Depth Prediction Inaccuracies
Authors:
Priyesh Shukla,
Sureshkumar S.,
Alex C. Stutts,
Sathya Ravi,
Theja Tulabandhula,
Amit R. Trivedi
Abstract:
We present a novel monocular localization framework by jointly training deep learning-based depth prediction and Bayesian filtering-based pose reasoning. The proposed cross-modal framework significantly outperforms deep learning-only predictions with respect to model scalability and tolerance to environmental variations. Specifically, we show little-to-no degradation of pose accuracy even with extremely poor depth estimates from a lightweight depth predictor. Our framework also maintains high pose accuracy in extreme lighting variations compared to standard deep learning, even without explicit domain adaptation. By openly representing the map and intermediate feature maps (such as depth estimates), our framework also allows for faster updates and reusing intermediate predictions for other tasks, such as obstacle avoidance, resulting in much higher resource efficiency.
Submitted 27 October, 2022;
originally announced October 2022.
-
Audio Barlow Twins: Self-Supervised Audio Representation Learning
Authors:
Jonah Anton,
Harry Coppock,
Pancham Shukla,
Bjorn W. Schuller
Abstract:
The Barlow Twins self-supervised learning objective requires neither negative samples nor asymmetric learning updates, achieving results on a par with the current state-of-the-art within Computer Vision. As such, we present Audio Barlow Twins, a novel self-supervised audio representation learning approach, adapting Barlow Twins to the audio domain. We pre-train on the large-scale audio dataset AudioSet, and evaluate the quality of the learnt representations on 18 tasks from the HEAR 2021 Challenge, achieving results which outperform, or otherwise are on a par with, the current state-of-the-art for instance discrimination self-supervised learning approaches to audio representation learning. Code at https://github.com/jonahanton/SSL_audio.
Submitted 28 September, 2022;
originally announced September 2022.
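The Barlow Twins objective referred to in the abstract above is usually written as $\sum_i (1 - C_{ii})^2 + \lambda \sum_{i \neq j} C_{ij}^2$, where $C$ is the cross-correlation matrix between the batch-standardized embeddings of two augmented views. The sketch below evaluates that loss on plain Python lists. It is not the Audio Barlow Twins code; the function names, the toy embeddings, and the value of `lam` are assumptions chosen for illustration.

```python
import math

def standardize(Z):
    """Standardize each embedding dimension to zero mean and unit variance."""
    n, d = len(Z), len(Z[0])
    cols = []
    for j in range(d):
        col = [row[j] for row in Z]
        mean = sum(col) / n
        std = math.sqrt(sum((v - mean) ** 2 for v in col) / n)
        cols.append([(v - mean) / std for v in col])
    return [[cols[j][i] for j in range(d)] for i in range(n)]

def barlow_twins_loss(Z1, Z2, lam=0.005):
    """Invariance term on the diagonal plus weighted redundancy term off it."""
    A, B = standardize(Z1), standardize(Z2)
    n, d = len(A), len(A[0])
    loss = 0.0
    for i in range(d):
        for j in range(d):
            c = sum(A[b][i] * B[b][j] for b in range(n)) / n  # C[i][j]
            loss += (1 - c) ** 2 if i == j else lam * c ** 2
    return loss

# Hypothetical embeddings: 4 samples, 2 decorrelated dimensions.
view_a = [[1.0, 1.0], [1.0, -1.0], [-1.0, 1.0], [-1.0, -1.0]]
view_b = [[row[1], row[0]] for row in view_a]  # same data, dimensions swapped
print(barlow_twins_loss(view_a, view_a))  # identical views, decorrelated dims
print(barlow_twins_loss(view_a, view_b))  # mismatched dimensions are penalized
```

No negative pairs appear anywhere in the computation: the loss is driven to zero simply by making the cross-correlation matrix approach the identity.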
-
Attacking Compressed Vision Transformers
Authors:
Swapnil Parekh,
Devansh Shah,
Pratyush Shukla
Abstract:
Vision Transformers are increasingly embedded in industrial systems due to their superior performance, but their memory and power requirements make deploying them to edge devices a challenging task. Hence, model compression techniques are now widely used to deploy models on edge devices, as they decrease the resource requirements and make model inference very fast and efficient. However, their reliability and robustness from a security perspective are another major concern in safety-critical applications. Adversarial attacks are like optical illusions for ML algorithms, and they can severely impact the accuracy and reliability of models. In this work, we investigate the transferability of adversarial samples between SOTA Vision Transformer models and three SOTA compressed versions, and infer the effects that different compression techniques have on adversarial attacks.
Submitted 27 September, 2022;
originally announced September 2022.
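One common way to quantify the transferability discussed in the abstract above is the fraction of adversarial samples that fool a source model and also fool a target model. The sketch below uses that definition as an assumption; the abstract does not state which metric the authors use, and the threshold "models" and inputs are hypothetical stand-ins for an uncompressed and a compressed network.

```python
def transfer_rate(source_predict, target_predict, adversarial_inputs, labels):
    """Fraction of samples that fool the source model and also fool the target."""
    fooled_source = [
        (x, y) for x, y in zip(adversarial_inputs, labels) if source_predict(x) != y
    ]
    if not fooled_source:
        return 0.0
    also_fooled = sum(1 for x, y in fooled_source if target_predict(x) != y)
    return also_fooled / len(fooled_source)

# Hypothetical one-dimensional classifiers with slightly different thresholds.
def source_model(x):
    return int(x > 0.5)

def target_model(x):
    return int(x > 0.4)

adversarial_inputs = [0.45, 0.30, 0.55, 0.48]
labels = [1, 1, 0, 1]
print(transfer_rate(source_model, target_model, adversarial_inputs, labels))
```

In this toy case all four samples fool the source model but only two of them also fool the target, so the transfer rate is 0.5.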
-
Block Ciphers Substitution Box Generation Based on Natural Randomness in Underwater Acoustics and Knights Tour Chain
Authors:
Muhammad Fahad Khan,
Khalid Saleem,
Tariq Shah,
Mohammad Mazyad Hazzazi,
Ismail Bahkali,
Piyush Kumar Shukla
Abstract:
The protection of confidential information is a global issue, and block encryption algorithms are the most reliable option. The famous information theorist Claude Shannon identified two desirable characteristics of a strong cipher, substitution and permutation, in his foundational work, Communication Theory of Secrecy Systems. Block ciphers strictly follow the substitution and permutation principle to generate a ciphertext. The actual strength of block ciphers against several attacks is entirely based on their substitution characteristic, which is obtained by using the S-box. In the current literature, algebraic structure-based and chaos-based techniques are widely used for the construction of S-boxes because both have favourable features for S-box construction, but various attacks on these techniques have also been identified. True randomness has been universally recognized as the ideal basis for cipher primitive design because true random numbers are unpredictable, irreversible, and unreproducible. The basic concept of the proposed technique is to extract true random bits from underwater acoustic waves and to design a novel technique for the dynamic generation of S-boxes using the chain of knight's tours. The proposed method satisfies all standard evaluation tests of S-box construction and true random number generation. Two million bits have been analyzed using the NIST randomness test suite, and the results show that underwater sound waves are an impeccable entropy source for true randomness. Additionally, our dynamically generated S-boxes have better or equal strength compared with the latest published S-boxes (2020 to 2021). To our knowledge, this is the first research in which the natural randomness of underwater acoustic waves has been used to construct a block cipher's S-box.
Submitted 26 May, 2022;
originally announced May 2022.
-
Temperature-Aware Monolithic 3D DNN Accelerators for Biomedical Applications
Authors:
Prachi Shukla,
Vasilis F. Pavlidis,
Emre Salman,
Ayse K. Coskun
Abstract:
In this paper, we focus on temperature-aware Monolithic 3D (Mono3D) deep neural network (DNN) inference accelerators for biomedical applications. We develop an optimizer that tunes aspect ratios and footprint of the accelerator under user-defined performance and thermal constraints, and generates near-optimal configurations. Using the proposed Mono3D optimizer, we demonstrate up to 61% improvement in energy efficiency for biomedical applications over a performance-optimized accelerator.
Submitted 29 March, 2022;
originally announced March 2022.
-
Robust and Scalable Game-theoretic Security Investment Methods for Voltage Stability of Power Systems
Authors:
Lu An,
Pratishtha Shukla,
Aranya Chakrabortty,
Alexandra Duel-Hallen
Abstract:
We develop investment approaches to secure electric power systems against load attacks where a malicious intruder (the attacker) covertly changes reactive power setpoints of loads to push the grid towards voltage instability while the system operator (the defender) employs reactive power compensation (RPC) to prevent instability. Extending our previously reported Stackelberg game formulation for this problem, we develop a robust-defense sequential algorithm and a novel genetic algorithm that provides scalability to large-scale power system models. The proposed methods are validated using IEEE prototype power system models with time-varying load uncertainties, demonstrating that reliable and robust defense is feasible unless the operator's RPC investment resources are severely limited relative to the attacker's resources.
Submitted 4 September, 2023; v1 submitted 10 March, 2022;
originally announced March 2022.
-
Generating Quality Grasp Rectangle using Pix2Pix GAN for Intelligent Robot Grasping
Authors:
Vandana Kushwaha,
Priya Shukla,
G C Nandi
Abstract:
Intelligent robot grasping is a very challenging task due to its inherent complexity and the non-availability of sufficient labelled data. Since making suitable labelled data available for effective training of any deep learning based model, including deep reinforcement learning, is crucial for successful grasp learning, in this paper we propose to solve the problem of generating grasping poses/rectangles using a Pix2Pix Generative Adversarial Network (Pix2Pix GAN), which takes an image of an object as input and produces the grasping rectangle tagged with the object as output. Here, we propose an end-to-end methodology for generating the grasping rectangle and embedding it at an appropriate place on the object to be grasped. We have developed two modules to obtain an optimal grasping rectangle. With the help of the first module, the pose (position and orientation) of the generated grasping rectangle is extracted from the output of the Pix2Pix GAN, and then the extracted grasp pose is translated to the centroid of the object, since we hypothesize that, as in the human way of grasping regular shaped objects, the center of mass/centroid is the best place for a stable grasp. For other, irregular shaped objects, we allow the generated grasping rectangles to be fed as they are to the robot for grasp execution. The accuracy of generating the grasping rectangle has significantly improved, to 87.79%, with a limited number of Cornell Grasping Dataset samples augmented by our proposed approach. Experiments show that our proposed generative model based approach gives promising results in terms of executing successful grasps for seen as well as unseen objects.
Submitted 20 February, 2022;
originally announced February 2022.
-
PageRank Algorithm using Eigenvector Centrality -- New Approach
Authors:
Suvarna Saumya Chandrashekhar,
Mashrin Srivastava,
B. Jaganathan,
Pankaj Shukla
Abstract:
The purpose of this research is to find a centrality measure that can be used in place of PageRank and to identify the conditions under which it can be used. After analysis and comparison of graphs with a large number of nodes using Spearman's rank correlation coefficient, we conclude that eigenvector centrality can safely be used in place of PageRank in directed networks to improve performance in terms of time complexity.
Submitted 20 January, 2022; v1 submitted 13 January, 2022;
originally announced January 2022.
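The comparison described in the abstract above can be sketched in a few lines. The following is a minimal illustration, not the authors' implementation: the function names, the damping factor of 0.85, the identity shift used to stabilise the power iteration, and the four-node toy graph are all assumptions made for this example. It computes PageRank and in-edge eigenvector centrality by power iteration and compares the two rankings with Spearman's rank correlation coefficient.

```python
from math import sqrt


def pagerank(adj, damping=0.85, iters=100):
    """Power-iteration PageRank. adj[u] lists the nodes that u links to."""
    n = len(adj)
    rank = [1.0 / n] * n
    for _ in range(iters):
        new = [(1.0 - damping) / n] * n
        for u, outs in enumerate(adj):
            # A dangling node spreads its rank uniformly over all nodes.
            targets = outs if outs else range(n)
            share = damping * rank[u] / len(targets)
            for v in targets:
                new[v] += share
        rank = new
    return rank


def eigenvector_centrality(adj, iters=100):
    """In-edge eigenvector centrality via power iteration on (A + I).

    Adding the identity (an assumption of this sketch) avoids oscillation
    on periodic graphs without changing the dominant eigenvector.
    """
    n = len(adj)
    score = [1.0] * n
    for _ in range(iters):
        new = list(score)  # the "+ I" term
        for u, outs in enumerate(adj):
            for v in outs:
                new[v] += score[u]
        norm = sqrt(sum(x * x for x in new))
        score = [x / norm for x in new]
    return score


def ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            result[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return result


def spearman(a, b):
    """Spearman's rho: the Pearson correlation of the two rank vectors."""
    ra, rb = ranks(a), ranks(b)
    n = len(a)
    mean_a, mean_b = sum(ra) / n, sum(rb) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(ra, rb))
    var_a = sum((x - mean_a) ** 2 for x in ra)
    var_b = sum((y - mean_b) ** 2 for y in rb)
    return cov / sqrt(var_a * var_b)


# Hypothetical toy graph: 0 -> 1, 0 -> 2, 1 -> 2, 2 -> 0, 3 -> 2.
toy_graph = [[1, 2], [2], [0], [2]]
rho = spearman(pagerank(toy_graph), eigenvector_centrality(toy_graph))
print(f"Spearman rho between the two rankings: {rho:.3f}")
```

A rho close to 1 means the two measures order the nodes almost identically, which is the kind of evidence the abstract refers to. A four-node graph says nothing about time complexity or about large networks.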
-
Development of a robust cascaded architecture for intelligent robot grasping using limited labelled data
Authors:
Priya Shukla,
Vandana Kushwaha,
G. C. Nandi
Abstract:
Grasping objects intelligently is a challenging task even for humans, and we spend a considerable amount of time during our childhood learning how to grasp objects correctly. In the case of robots, we cannot afford to spend that much time teaching them how to grasp objects effectively. Therefore, in the present research we propose an efficient learning architecture based on VQVAE so that robots can be taught with sufficient data corresponding to correct grasping. However, getting sufficient labelled data is extremely difficult in the robot grasping domain. To help solve this problem, a semi-supervised learning based model, which has much more generalization capability even with a limited labelled data set, has been investigated. Its performance shows a 6% improvement when compared with existing state-of-the-art models, including our earlier model. During experimentation, it has been observed that our proposed model, RGGCNN2, performs significantly better, both in grasping isolated objects and objects in a cluttered environment, compared to existing approaches which do not use unlabelled data for generating grasping rectangles. To the best of our knowledge, developing an intelligent robot grasping model (based on semi-supervised learning) trained through representation learning and exploiting the high-quality learning ability of the GGCNN2 architecture with a limited labelled dataset together with the learned latent embeddings can be used as a de facto training method, which has been established and validated in this paper through rigorous hardware experiments using the Baxter (Anukul) research robot.
Submitted 6 November, 2021;
originally announced December 2021.
-
MC-CIM: Compute-in-Memory with Monte-Carlo Dropouts for Bayesian Edge Intelligence
Authors:
Priyesh Shukla,
Shamma Nasrin,
Nastaran Darabi,
Wilfred Gomes,
Amit Ranjan Trivedi
Abstract:
We propose MC-CIM, a compute-in-memory (CIM) framework for robust, yet low-power, Bayesian edge intelligence. Deep neural networks (DNNs) with deterministic weights cannot express their prediction uncertainties, thereby posing critical risks for applications where the consequences of mispredictions are fatal, such as surgical robotics. To address this limitation, Bayesian inference of a DNN has gained attention. Using Bayesian inference, not only the prediction itself but also the prediction confidence can be extracted for planning risk-aware actions. However, Bayesian inference of a DNN is computationally expensive and ill-suited for real-time and/or edge deployment. An approximation to Bayesian DNN using Monte Carlo Dropout (MC-Dropout) has shown high robustness along with low computational complexity. To enhance the computational efficiency of the method, we discuss a novel CIM module that can perform in-memory probabilistic dropout in addition to the in-memory weight-input scalar product. We also propose a compute-reuse reformulation of MC-Dropout where each successive instance can utilize the product-sum computations from the previous iteration. Furthermore, we discuss how the random instances can be optimally ordered to minimize the overall MC-Dropout workload by exploiting combinatorial optimization methods. Application of the proposed CIM-based MC-Dropout execution is discussed for MNIST character recognition and visual odometry (VO) of autonomous drones. The framework reliably gives prediction confidence amidst non-idealities imposed by MC-CIM to a good extent. The proposed MC-CIM with a 16x31 SRAM array, 0.85 V supply, and 16nm low-standby power (LSTP) technology consumes 27.8 pJ for 30 MC-Dropout instances of probabilistic inference in its optimal computing and peripheral configuration, saving 43% energy compared to typical execution.
Submitted 13 November, 2021;
originally announced November 2021.
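For readers unfamiliar with MC-Dropout, the idea in the abstract above can be sketched in plain software. This is a minimal illustration only and does not model the in-memory hardware, the compute-reuse reformulation, or the instance ordering described in the paper. The one-hidden-layer network, the weights, the helper names, and the drop probability of 0.5 are assumptions made for the example; the default of 30 instances simply mirrors the figure quoted in the abstract.

```python
import random
import statistics


def forward(x, w_hidden, w_out, mask):
    """One-hidden-layer ReLU network; mask scales or zeroes each hidden unit."""
    hidden = [max(0.0, w * x) * m for w, m in zip(w_hidden, mask)]
    return sum(h * w for h, w in zip(hidden, w_out))


def mc_dropout_predict(x, w_hidden, w_out, n_instances=30, drop_prob=0.5, seed=0):
    """Run several stochastic forward passes, each with a fresh dropout mask.

    Returns the mean prediction and its standard deviation across passes;
    the spread serves as a rough confidence estimate.
    """
    rng = random.Random(seed)
    keep_scale = 1.0 / (1.0 - drop_prob)  # inverted-dropout scaling
    outputs = []
    for _ in range(n_instances):
        mask = [keep_scale if rng.random() >= drop_prob else 0.0 for _ in w_hidden]
        outputs.append(forward(x, w_hidden, w_out, mask))
    return statistics.mean(outputs), statistics.pstdev(outputs)


# Hypothetical weights, chosen only to make the example runnable.
w_hidden = [0.5, -1.0, 2.0, 1.5]
w_out = [1.0, 0.3, -0.7, 0.9]
mean, spread = mc_dropout_predict(1.0, w_hidden, w_out)
print(f"prediction {mean:.3f} +/- {spread:.3f}")
```

A larger spread across instances signals lower confidence in the prediction, which is the quantity a risk-aware planner would consume.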
-
Fixation and Creativity in Data Visualization Design: Experiences and Perspectives of Practitioners
Authors:
Paul Parsons,
Prakash Shukla,
Chorong Park
Abstract:
Data visualization design often requires creativity, and research is needed to understand its nature and means for promoting it. The current visualization literature on creativity is not well developed, especially with respect to the experiences of professional data visualization designers. We conducted semi-structured interviews with 15 data visualization practitioners, focusing on a specific aspect of creativity known as design fixation. Fixation occurs when designers adhere blindly or prematurely to a set of ideas that limit creative outcomes. We present practitioners' experiences and perspectives from their own design practice, specifically focusing on their views of (i) the nature of fixation, (ii) factors encouraging fixation, and (iii) factors discouraging fixation. We identify opportunities for future research related to chart recommendations, inspiration, and perspective shifts in data visualization design.
Submitted 13 August, 2021;
originally announced August 2021.
-
GI-NNet & RGI-NNet: Development of Robotic Grasp Pose Models, Trainable with Large as well as Limited Labelled Training Datasets, under supervised and semi supervised paradigms
Authors:
Priya Shukla,
Nilotpal Pramanik,
Deepesh Mehta,
G. C. Nandi
Abstract:
The way humans grasp objects is challenging for COBOTs to replicate as an efficient, intelligent and optimal grasp. To streamline the process, here we use deep learning techniques to help robots learn to generate and execute appropriate grasps quickly. We developed a Generative Inception Neural Network (GI-NNet) model, capable of generating antipodal robotic grasps on seen as well as unseen objects. It is trained on the Cornell Grasping Dataset (CGD) and attains 98.87% grasp pose accuracy for detecting both regular and irregular shaped objects from RGB-Depth (RGB-D) images while requiring only one third of the network trainable parameters compared to existing approaches. However, to attain this level of performance the model requires 90% of the available labelled data of CGD, keeping only 10% of the labelled data for testing, which makes it vulnerable to poor generalization. Furthermore, obtaining a sufficient, high-quality labelled dataset is becoming increasingly difficult as the requirements of gigantic networks grow. To address these issues, we attach our model as a decoder to a semi-supervised learning based architecture known as Vector Quantized Variational Auto Encoder (VQVAE), which works efficiently when trained with both the available labelled and unlabelled data. The proposed model, which we name Representation based GI-NNet (RGI-NNet), has been trained with various splits of labelled data on CGD, from as little as 10% labelled data together with latent embeddings generated from VQVAE up to 50% labelled data with latent embeddings obtained from VQVAE. The grasp pose accuracy of RGI-NNet varies between 92.13% and 95.6%, which is far better than several existing models trained with only labelled data. For the performance verification of both the GI-NNet and RGI-NNet models, we use the Anukul (Baxter) hardware cobot.
Submitted 15 July, 2021;
originally announced July 2021.
-
Probabilistic Localization of Insect-Scale Drones on Floating-Gate Inverter Arrays
Authors:
Priyesh Shukla,
Ankith Muralidhar,
Nick Iliev,
Theja Tulabandhula,
Sawyer B. Fuller,
Amit Ranjan Trivedi
Abstract:
We propose a novel compute-in-memory (CIM)-based ultra-low-power framework for probabilistic localization of insect-scale drones. Conventional probabilistic localization approaches rely on a three-dimensional (3D) Gaussian Mixture Model (GMM)-based representation of a 3D map. A GMM with hundreds of mixture functions is typically needed to adequately learn and represent the intricacies of the map. Meanwhile, localization using complex GMM map models is computationally intensive. Since insect-scale drones operate under an extremely limited area/power budget, continuous localization using GMM models entails much higher operating energy, thereby limiting the flying duration and/or increasing the size of the drone due to a larger battery. Addressing the computational challenges of localization in an insect-scale drone using a CIM approach, we propose a novel framework of 3D map representation using a harmonic mean of "Gaussian-like" mixture (HMGM) model. The likelihood function useful for drone localization can be efficiently implemented by connecting many multi-input inverters in parallel, each programmed with the parameters of the 3D map model represented as HMGM. When the depth measurements are projected to the input of the implementation, the summed current of the inverters emulates the likelihood of the measurement. We have characterized our approach on an RGB-D indoor localization dataset. The average localization error in our approach is $\sim$0.1125 m, which is only slightly worse than that of the software-based evaluation ($\sim$0.08 m). Meanwhile, our localization framework is ultra-low-power, consuming as little as $\sim$17 $μ$W of power while processing a depth frame in 1.33 ms over a hundred pose hypotheses in the particle-filtering (PF) algorithm used to localize the drone.
Submitted 24 May, 2021; v1 submitted 16 February, 2021;
originally announced February 2021.
-
Text-based RL Agents with Commonsense Knowledge: New Challenges, Environments and Baselines
Authors:
Keerthiram Murugesan,
Mattia Atzeni,
Pavan Kapanipathi,
Pushkar Shukla,
Sadhana Kumaravel,
Gerald Tesauro,
Kartik Talamadupula,
Mrinmaya Sachan,
Murray Campbell
Abstract:
Text-based games have emerged as an important test-bed for Reinforcement Learning (RL) research, requiring RL agents to combine grounded language understanding with sequential decision making. In this paper, we examine the problem of infusing RL agents with commonsense knowledge. Such knowledge would allow agents to efficiently act in the world by pruning out implausible actions, and to perform look-ahead planning to determine how current actions might affect future world states. We design a new text-based gaming environment called TextWorld Commonsense (TWC) for training and evaluating RL agents with a specific kind of commonsense knowledge about objects, their attributes, and affordances. We also introduce several baseline RL agents which track the sequential context and dynamically retrieve the relevant commonsense knowledge from ConceptNet. We show that agents which incorporate commonsense knowledge in TWC perform better, while acting more efficiently. We conduct user-studies to estimate human performance on TWC and show that there is ample room for future improvement.
Submitted 8 October, 2020;
originally announced October 2020.
-
Data Visualization Practitioners' Perspectives on Chartjunk
Authors:
Paul Parsons,
Prakash Shukla
Abstract:
Chartjunk is a popular yet contentious topic. Previous studies have shown that extreme minimalism is not always best, and that visual embellishments can be useful depending on the context. While more knowledge is being developed regarding the effects of embellishments on users, less attention has been given to the perspectives of practitioners regarding how they design with embellishments. We conducted semi-structured interviews with 20 data visualization practitioners, investigating how they understand chartjunk and the factors that influence how and when they make use of embellishments. Our investigation uncovers a broad and pluralistic understanding of chartjunk among practitioners, and foregrounds a variety of personal and situated factors that influence the use of chartjunk beyond context. We highlight the personal nature of design practice, and discuss the need for more practice-led research to better understand the ways in which concepts like chartjunk are interpreted and used by practitioners.
Submitted 5 September, 2020;
originally announced September 2020.
-
Enhancing Text-based Reinforcement Learning Agents with Commonsense Knowledge
Authors:
Keerthiram Murugesan,
Mattia Atzeni,
Pushkar Shukla,
Mrinmaya Sachan,
Pavan Kapanipathi,
Kartik Talamadupula
Abstract:
In this paper, we consider the recent trend of evaluating progress on reinforcement learning technology by using text-based environments and games as evaluation environments. This reliance on text brings advances in natural language processing into the ambit of these agents, with a recurring thread being the use of external knowledge to mimic and better human-level performance. We present one such instantiation of agents that use commonsense knowledge from ConceptNet to show promising performance on two text-based environments.
Submitted 2 May, 2020;
originally announced May 2020.
-
$MC^2RAM$: Markov Chain Monte Carlo Sampling in SRAM for Fast Bayesian Inference
Authors:
Priyesh Shukla,
Ahish Shylendra,
Theja Tulabandhula,
Amit Ranjan Trivedi
Abstract:
This work discusses the implementation of Markov Chain Monte Carlo (MCMC) sampling from an arbitrary Gaussian mixture model (GMM) within SRAM. We show a novel SRAM architecture that embeds random number generators (RNGs), digital-to-analog converters (DACs), and analog-to-digital converters (ADCs) so that SRAM arrays can be used for high-performance Metropolis-Hastings (MH) algorithm-based MCMC sampling. Most of the expensive computations are performed within the SRAM and can be parallelized for high-speed sampling. Our iterative compute flow minimizes data movement during sampling. We characterize the power-performance trade-off of our design by simulating on 45 nm CMOS technology. For a two-dimensional, two-mixture GMM, the implementation consumes ~91 micro-Watts of power per sampling iteration and produces 500 samples in 2000 clock cycles on average at a 1 GHz clock frequency. Our study highlights interesting insights on how low-level hardware non-idealities can affect high-level sampling characteristics, and recommends ways to optimally operate SRAM within area/power constraints for high-performance sampling.
Submitted 28 February, 2020;
originally announced March 2020.
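The sampling task named in the abstract above, Metropolis-Hastings on a two-dimensional, two-mixture GMM, can be sketched in software as follows. This is a plain random-walk MH sampler under stated assumptions and not the in-SRAM design: the isotropic components, their parameters, the Gaussian proposal with step size 1.0, and the function names are all chosen for illustration.

```python
import math
import random


def gmm_density(x, y, components):
    """Density of an isotropic 2D GMM.

    components is a list of (weight, mean_x, mean_y, sigma) tuples.
    """
    total = 0.0
    for weight, mean_x, mean_y, sigma in components:
        dist_sq = (x - mean_x) ** 2 + (y - mean_y) ** 2
        norm = 2.0 * math.pi * sigma * sigma
        total += weight * math.exp(-dist_sq / (2.0 * sigma * sigma)) / norm
    return total


def metropolis_hastings(components, n_samples, step=1.0, seed=0):
    """Random-walk MH with a symmetric Gaussian proposal.

    Returns the chain of (x, y) states and the fraction of accepted proposals.
    """
    rng = random.Random(seed)
    x, y = 0.0, 0.0  # start where the example density is nonzero
    current = gmm_density(x, y, components)
    samples, accepted = [], 0
    for _ in range(n_samples):
        cand_x = x + rng.gauss(0.0, step)
        cand_y = y + rng.gauss(0.0, step)
        candidate = gmm_density(cand_x, cand_y, components)
        # Symmetric proposal: accept with probability min(1, candidate / current).
        if rng.random() < candidate / current:
            x, y, current = cand_x, cand_y, candidate
            accepted += 1
        samples.append((x, y))
    return samples, accepted / n_samples


# Hypothetical mixture: two equally weighted modes at (-2, 0) and (2, 0).
mixture = [(0.5, -2.0, 0.0, 1.0), (0.5, 2.0, 0.0, 1.0)]
chain, acceptance = metropolis_hastings(mixture, 5000)
print(f"acceptance rate: {acceptance:.2f}")
```

Each iteration needs one density evaluation and one random comparison. A rejected proposal repeats the current state in the chain, so the number of distinct samples is smaller than the number of iterations.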
-
Semi-Supervised Grasp Detection by Representation Learning in a Vector Quantized Latent Space
Authors:
Mridul Mahajan,
Tryambak Bhattacharjee,
Arya Krishnan,
Priya Shukla,
G C Nandi
Abstract:
For a robot to perform complex manipulation tasks, it must have a good grasping ability. However, vision based robotic grasp detection is hindered by the unavailability of sufficient labelled data. Furthermore, the application of semi-supervised learning techniques to grasp detection is under-explored. In this paper, a semi-supervised learning based grasp detection approach is presented, which models a discrete latent space using a Vector Quantized Variational AutoEncoder (VQ-VAE). To the best of our knowledge, this is the first time a Variational AutoEncoder (VAE) has been applied in the domain of robotic grasp detection. The VAE helps the model generalize beyond the Cornell Grasping Dataset (CGD), despite a limited amount of labelled data, by also utilizing the unlabelled data. This claim has been validated by testing the model on images which are not available in the CGD. Along with this, we augment the Generative Grasping Convolutional Neural Network (GGCNN) architecture with the decoder structure used in the VQ-VAE model, with the intuition that it should help to regress in the vector-quantized latent space. As a result, the model performs significantly better than existing approaches which do not make use of unlabelled images to improve the grasp.
Submitted 30 January, 2020; v1 submitted 23 January, 2020;
originally announced January 2020.
-
Robotic Grasp Manipulation Using Evolutionary Computing and Deep Reinforcement Learning
Authors:
Priya Shukla,
Hitesh Kumar,
G. C. Nandi
Abstract:
Intelligent object manipulation for grasping is a challenging problem for robots. Unlike robots, humans almost immediately know how to manipulate objects for grasping due to learning over the years. A grown woman can grasp objects more skilfully than a child because of learning skills developed over years, the absence of which in present-day robotic grasping compels it to perform well below the human object grasping benchmarks. In this paper we take up the challenge of developing learning based pose estimation by decomposing the problem into position learning and orientation learning. More specifically, for grasp position estimation, we explore three different methods: a Genetic Algorithm (GA) based optimization method to minimize the error between calculated image points and the predicted end-effector (EE) position; a regression based method (RM), where collected data points of the robot EE and image points are regressed with a linear model; and a PseudoInverse (PI) model, formulated as a mapping matrix between robot EE positions and image points for several observations. Further, for grasp orientation learning, we develop a deep reinforcement learning (DRL) model, which we name Grasp Deep Q-Network (GDQN), and benchmark our results against Modified VGG16 (MVGG16). Rigorous experiments show that, due to its inherent capability of producing very high-quality solutions for optimization and search problems, the GA based predictor performs much better than the other two models for position estimation. For orientation learning, the results indicate that off-policy learning through GDQN outperforms MVGG16, since the GDQN architecture is specifically designed for reinforcement learning. Based on our proposed architectures and algorithms, the robot is capable of grasping all rigid body objects having regular shapes.
Submitted 15 January, 2020;
originally announced January 2020.
-
What Should I Ask? Using Conversationally Informative Rewards for Goal-Oriented Visual Dialog
Authors:
Pushkar Shukla,
Carlos Elmadjian,
Richika Sharan,
Vivek Kulkarni,
Matthew Turk,
William Yang Wang
Abstract:
The ability to engage in goal-oriented conversations has allowed humans to gain knowledge, reduce uncertainty, and perform tasks more efficiently. Artificial agents, however, are still far behind humans in having goal-driven conversations. In this work, we focus on the task of goal-oriented visual dialogue, aiming to automatically generate a series of questions about an image with a single objective. This task is challenging since these questions must not only be consistent with a strategy to achieve a goal, but also consider the contextual information in the image. We propose an end-to-end goal-oriented visual dialogue system, that combines reinforcement learning with regularized information gain. Unlike previous approaches that have been proposed for the task, our work is motivated by the Rational Speech Act framework, which models the process of human inquiry to reach a goal. We test the two versions of our model on the GuessWhat?! dataset, obtaining significant results that outperform the current state-of-the-art models in the task of generating questions to find an undisclosed object in an image.
Submitted 28 July, 2019;
originally announced July 2019.
-
Crowd Capital in Governance Contexts
Authors:
J. Prpic,
P. Shukla
Abstract:
To begin to understand the implications of the implementation of IT-mediated Crowds for Politics and Policy purposes, this research builds the first-known dataset of IT-mediated Crowd applications currently in use in the governance context. Using Crowd Capital theory and governance theory as frameworks to organize our data collection, we undertake an exploratory data analysis of some fundamental factors defining this emerging field. Specific factors outlined and discussed include the type of actors implementing IT-mediated Crowds in the governance context, the global geographic distribution of the applications, and the nature of the Crowd-derived resources being generated for governance purposes. The findings from our dataset of 209 ongoing endeavours indicate that a wide diversity of actors are engaging IT-mediated Crowds in the governance context, both jointly and severally, that these endeavours can be found on all continents, and that these actors are generating Crowd-derived resources in at least ten distinct governance sectors. We discuss the ramifications of these and our other findings in comparison to the research literature on the private-sector use of IT-mediated Crowds, while highlighting some unique future research opportunities stemming from our work.
Submitted 10 February, 2017;
originally announced February 2017.
-
A Geography of Participation in IT-Mediated Crowds
Authors:
J. Prpic,
P. Shukla,
Y. Roth,
J. F. Lemoine
Abstract:
In this work we seek to understand how differences in location affect participation outcomes in IT-mediated crowds. To do so, we operationalize Crowd Capital Theory with data from a popular international creative crowdsourcing site, to determine whether regional differences exist in crowdsourcing participation outcomes. We present the early results of our investigation from data encompassing 1,858,202 observations from 28,214 crowd members on 94 different projects in 2012. Using probit regressions to isolate geographic effects by continental region, we find significant variation across regions in crowdsourcing participation. In doing so, we contribute to the literature by illustrating that geography matters in respect to crowd participation. Further, our work illustrates an initial validation of Crowd Capital Theory as a useful theoretical model to guide empirical inquiry in the fast-growing domain of IT-mediated crowds.
Submitted 12 February, 2017;
originally announced February 2017.
-
The Contours of Crowd Capability
Authors:
J. Prpic,
P. Shukla
Abstract:
In this work we use the theory of Crowd Capital as a lens to compare and contrast a number of IS tools currently in use by organizations for crowd-engagement purposes. In doing so, we contribute to both the practitioner and research domains. For the practitioner community, we provide decision-makers with a convenient and useful resource, in table form, outlining in detail some of the differing potentialities of crowd-engaging IS. For the research community, we begin to unpack some of the key properties of crowd-engaging IS, including some of the differing qualities of the crowds that these IS applications engage.
Submitted 12 February, 2017;
originally announced February 2017.
-
How to Work a Crowd: Developing Crowd Capital Through Crowdsourcing
Authors:
J. Prpic,
P. P. Shukla,
J. H. Kietzmann,
I. P. McCarthy
Abstract:
Traditionally, the term crowd was used almost exclusively in the context of people who self-organized around a common purpose, emotion or experience. Today, however, firms often refer to crowds in discussions of how collections of individuals can be engaged for organizational purposes. Crowdsourcing, the use of information technologies to outsource business responsibilities to crowds, can now significantly influence a firm's ability to leverage previously unattainable resources to build competitive advantage. Nonetheless, many managers are hesitant to consider crowdsourcing because they do not understand how its various types can add value to the firm. In response, we explain what crowdsourcing is, the advantages it offers and how firms can pursue crowdsourcing. We begin by formulating a crowdsourcing typology and show how its four categories (crowd-voting, micro-task, idea and solution crowdsourcing) can help firms develop crowd capital, an organizational-level resource harnessed from the crowd. We then present a three-step process model for generating crowd capital. Step one includes important considerations that shape how a crowd is to be constructed. Step two outlines the capabilities firms need to develop to acquire and assimilate resources (knowledge, labor, funds) from the crowd. Step three addresses key decision-areas that executives need to address to effectively engage crowds.
Submitted 12 February, 2017;
originally announced February 2017.
-
The Theory of Crowd Capital
Authors:
John Prpic,
Prashant Shukla
Abstract:
We are seeing more and more organizations undertaking activities to engage dispersed populations through IS. Using the knowledge-based view of the organization, this work conceptualizes a theory of Crowd Capital to explain this phenomenon. Crowd Capital is a heterogeneous knowledge resource generated by an organization, through its use of Crowd Capability, which is defined by the structure, content, and process by which an organization engages with the dispersed knowledge of individuals (the Crowd). Our work draws upon a diverse literature and builds upon numerous examples of practitioner implementations to support our theorizing. We present a model of Crowd Capital generation in organizations and discuss the implications of Crowd Capital on organizational boundary and on IS research.
Submitted 6 October, 2012;
originally announced October 2012.
-
Incorporating Agile with MDA Case Study: Online Polling System
Authors:
Pritha Guha,
Kinjal Shah,
Shiv Shankar Prasad Shukla,
Shweta Singh
Abstract:
Agile software development is now used to a greater extent, but only in small organizations, whereas MDA is suitable for large organizations but is not yet standardized. This paper discusses the pros and cons of Model Driven Architecture (MDA) and Extreme Programming. Because both have limitations and neither can be used in both large-scale and small-scale organizations, a new architecture is proposed. The model attempts to adopt the advantages and important values of both software development procedures in order to overcome their limitations. In support of the proposed architecture, its implementation on an Online Polling System is discussed and all the phases of software development are explained.
Submitted 31 October, 2011;
originally announced October 2011.
-
Problem Reduction in Online Payment System Using Hybrid Model
Authors:
Sandeep Pratap Singh,
Shiv Shankar P. Shukla,
Nitin Rakesh,
Vipin Tyagi
Abstract:
Online auctions, shopping, electronic billing, and similar applications all involve the problem of fraudulent transactions. The occurrence and detection of online fraud is one of the challenging fields for web development and online phantom transactions. Because no secure specification of online fraud exists in the research database, the techniques to evaluate and stop it are also still under study. We provide an approach that uses a Hidden Markov Model (HMM) and mobile implicit authentication to determine whether the user interacting online is fraudulent. We propose a model based on these approaches to counter fraud that has occurred and prevent loss to the customer. Our technique is more parameterized than traditional approaches, so the chance of flagging a legitimate user as fraudulent is reduced.
Submitted 4 September, 2011;
originally announced September 2011.
-
Design and Analysis of an Attack Resilient and Adaptive Medium access Control Protocol for Computer Networks
Authors:
Piyush Kumar Shukla,
Dr. S. Silakari,
Dr. Sarita Singh Bhadoria
Abstract:
The challenge of designing and analyzing an efficient Medium Access Control (MAC) protocol has been an important research topic for over 30 years. This paper focuses on the performance analysis (through simulation) and modification of a well-known MAC protocol, CSMA/CD. The existing protocol does not consider the bandwidth wasted during unutilized periods of the channel; accounting for this can enhance the performance of the MAC protocol. The purpose of this work is to modify the existing protocol so that it adapts to the state of the network. The modified protocol takes appropriate action whenever unutilized periods are detected, with the aim of increasing effective bandwidth utilization, and the work determines how it behaves under increasing load and varying packet sizes. It also covers the effects of attacks that degrade the performance of the MAC protocol: Denial of Service attacks, Replay attacks, Continuous Channel Access or Exhaustion attacks, Flooding attacks, Jamming (radio interference) attacks, and Selective Forwarding attacks. In a Continuous Channel Access or Exhaustion attack, a malicious node disrupts the MAC protocol by continuously requesting or transmitting over the channel. This eventually leads to starvation of the other nodes in the network with respect to channel access. One remedy may be for the network to ignore excessive requests without sending expensive radio transmissions. This limit, however, cannot drop below the expected maximum data rate the network has to support. It is usually coded into the protocol during the design phase and also requires additional logic. Repeated application of these exhaustion or collision-based MAC layer attacks can lead to unfairness.
Submitted 3 July, 2009;
originally announced July 2009.