-
The Narrow Depth and Breadth of Corporate Responsible AI Research
Authors:
Nur Ahmed,
Amit Das,
Kirsten Martin,
Kawshik Banerjee
Abstract:
The transformative potential of AI presents remarkable opportunities, but also significant risks, underscoring the importance of responsible AI development and deployment. Despite a growing emphasis on this area, there is limited understanding of industry's engagement in responsible AI research, i.e., the critical examination of AI's ethical, social, and legal dimensions. To address this gap, we analyzed over 6 million peer-reviewed articles and 32 million patent citations using multiple methods across five distinct datasets to quantify industry's engagement. Our findings reveal that the majority of AI firms show limited or no engagement in this critical subfield of AI. We show a stark disparity between industry's dominant presence in conventional AI research and its limited engagement in responsible AI. Leading AI firms exhibit significantly lower output in responsible AI research compared to their conventional AI research and the contributions of leading academic institutions. Our linguistic analysis documents a narrower scope of responsible AI research within industry, with a lack of diversity in key topics addressed. Our large-scale patent citation analysis uncovers a pronounced disconnect between responsible AI research and the commercialization of AI technologies, suggesting that industry patents rarely build upon insights generated by the responsible AI literature. This gap highlights the potential for AI development to diverge from a socially optimal path, risking unintended consequences due to insufficient consideration of ethical and societal implications. Our results highlight the urgent need for industry to publicly engage in responsible AI research to absorb academic knowledge, cultivate public trust, and proactively mitigate AI-induced societal harms.
Submitted 20 May, 2024;
originally announced May 2024.
-
Static and Dynamic Synthesis of Bengali and Devanagari Signatures
Authors:
Miguel A. Ferrer,
Sukalpa Chanda,
Moises Diaz,
Chayan Kr. Banerjee,
Anirban Majumdar,
Cristina Carmona-Duarte,
Parikshit Acharya,
Umapada Pal
Abstract:
Developing an automatic signature verification system is challenging and demands a large number of training samples. This is why synthetic handwriting generation is an emerging topic in document image analysis. Some handwriting synthesizers use the motor equivalence model, a well-established hypothesis from neuroscience, which analyses how a human being accomplishes movement. Specifically, a motor equivalence model divides human actions into two steps: 1) the effector-independent step at the cognitive level and 2) the effector-dependent step at the motor level. Recent work reports the successful application of a handwriting synthesizer based on this theory to Western scripts. This paper aims to adapt this scheme for the generation of synthetic signatures in two Indic scripts, Bengali (Bangla) and Devanagari (Hindi). For this purpose, we use two different online and offline databases for both Bengali and Devanagari signatures. This paper reports an effective synthesizer for static and dynamic signatures written in Devanagari or Bengali scripts. We obtain promising results with artificially generated signatures in terms of appearance and performance when we compare the results with those for real signatures.
Submitted 30 January, 2024;
originally announced January 2024.
-
Using Motion Forecasting for Behavior-Based Virtual Reality (VR) Authentication
Authors:
Mingjun Li,
Natasha Kholgade Banerjee,
Sean Banerjee
Abstract:
Task-based behavioral biometric authentication of users interacting in virtual reality (VR) environments enables seamless continuous authentication by using only the motion trajectories of the person's body as a unique signature. Deep learning-based approaches for behavioral biometrics show high accuracy when using complete or near complete portions of the user trajectory, but show lower performance when using smaller segments from the start of the task. Thus, any systems designed with existing techniques are vulnerable while waiting for future segments of motion trajectories to become available. In this work, we present the first approach that predicts future user behavior using Transformer-based forecasting and uses the forecasted trajectory to perform user authentication. Our work leverages the notion that, given the current trajectory of a user in a task-based environment, we can predict the future trajectory of the user, as they are unlikely to dramatically shift their behavior since doing so would preclude them from successfully completing their task goal. Using the publicly available 41-subject ball throwing dataset of Miller et al., we show improvement in user authentication when using forecasted data. When compared to no forecasting, our approach reduces the authentication equal error rate (EER) by an average of 23.85% and by a maximum of 36.14%.
Submitted 29 January, 2024;
originally announced January 2024.
-
Evaluating Deep Networks for Detecting User Familiarity with VR from Hand Interactions
Authors:
Mingjun Li,
Numan Zafar,
Natasha Kholgade Banerjee,
Sean Banerjee
Abstract:
As VR devices become more prevalent in the consumer space, VR applications are likely to be increasingly used by users unfamiliar with VR. Detecting the familiarity level of a user with VR as an interaction medium makes it possible to provide on-demand training for acclimatization and prevents the user from being burdened by the VR environment in accomplishing their tasks. In this work, we present preliminary results of using deep classifiers to conduct automatic detection of familiarity with VR by using hand tracking of the user as they interact with a numeric passcode entry panel to unlock a VR door. We use a VR door as we envision it to be the first point of entry to collaborative virtual spaces, such as meeting rooms, offices, or clinics. Users who are unfamiliar with VR will have used their hands to open doors with passcode entry panels in the real world. Thus, while the user may not be familiar with VR, they would be familiar with the task of opening the door. Using a pilot dataset consisting of 7 users familiar with VR and 7 not familiar with VR, we achieve a highest accuracy of 88.03% when 6 test users, 3 familiar and 3 not familiar, are evaluated with classifiers trained using data from the remaining 8 users. Our results indicate potential for using user movement data to detect familiarity for the simple yet important task of secure passcode-based access.
Submitted 27 January, 2024;
originally announced January 2024.
-
HOH: Markerless Multimodal Human-Object-Human Handover Dataset with Large Object Count
Authors:
Noah Wiederhold,
Ava Megyeri,
DiMaggio Paris,
Sean Banerjee,
Natasha Kholgade Banerjee
Abstract:
We present the HOH (Human-Object-Human) Handover Dataset, a large object count dataset with 136 objects, to accelerate data-driven research on handover studies, human-robot handover implementation, and artificial intelligence (AI) on handover parameter estimation from 2D and 3D data of person interactions. HOH contains multi-view RGB and depth data, skeletons, fused point clouds, grasp type and handedness labels, object, giver hand, and receiver hand 2D and 3D segmentations, giver and receiver comfort ratings, and paired object metadata and aligned 3D models for 2,720 handover interactions spanning 136 objects and 20 giver-receiver pairs (40 with role reversal), organized from 40 participants. We also show experimental results of neural networks trained using HOH to perform grasp, orientation, and trajectory prediction. As the only fully markerless handover capture dataset, HOH represents natural human-human handover interactions, overcoming challenges with markered datasets that require specific suiting for body tracking and lack high-resolution hand tracking. To date, HOH is the largest handover dataset in number of objects, participants, pairs with role reversal accounted for, and total interactions captured.
Submitted 3 May, 2024; v1 submitted 1 October, 2023;
originally announced October 2023.
-
Pix2Repair: Implicit Shape Restoration from Images
Authors:
Xinchao Song,
Nikolas Lamb,
Sean Banerjee,
Natasha Kholgade Banerjee
Abstract:
We present Pix2Repair, an automated shape repair approach that generates restoration shapes from images to repair fractured objects. Prior repair approaches require a high-resolution watertight 3D mesh of the fractured object as input. Input 3D meshes must be obtained using expensive 3D scanners, and scanned meshes require manual cleanup, limiting accessibility and scalability. Pix2Repair takes an image of the fractured object as input and automatically generates a 3D printable restoration shape. We contribute a novel shape function that deconstructs a latent code representing the fractured object into a complete shape and a break surface. We also introduce Fantastic Breaks Imaged, the first large-scale dataset of 11,653 real-world images of fractured objects for training and evaluating image-based shape repair approaches. Our dataset contains images of objects from Fantastic Breaks, complete with rich annotations. We show restorations for real fractures from our dataset, and for synthetic fractures from the Geometric Breaks and Breaking Bad datasets. Our approach outperforms shape completion approaches adapted for shape repair in terms of chamfer distance, normal consistency, and percent restorations generated.
Submitted 20 December, 2023; v1 submitted 29 May, 2023;
originally announced May 2023.
-
Fantastic Breaks: A Dataset of Paired 3D Scans of Real-World Broken Objects and Their Complete Counterparts
Authors:
Nikolas Lamb,
Cameron Palmer,
Benjamin Molloy,
Sean Banerjee,
Natasha Kholgade Banerjee
Abstract:
Automated shape repair approaches currently lack access to datasets that describe real-world damaged geometry. We present Fantastic Breaks (and Where to Find Them: https://terascale-all-sensing-research-studio.github.io/FantasticBreaks), a dataset containing scanned, waterproofed, and cleaned 3D meshes for 150 broken objects, paired and geometrically aligned with complete counterparts. Fantastic Breaks contains class and material labels, proxy repair parts that join to broken meshes to generate complete meshes, and manually annotated fracture boundaries. Through a detailed analysis of fracture geometry, we reveal differences between Fantastic Breaks and synthetic fracture datasets generated using geometric and physics-based methods. We show experimental shape repair evaluation with Fantastic Breaks using multiple learning-based approaches pre-trained with synthetic datasets and re-trained with a subset of Fantastic Breaks.
Submitted 1 May, 2023; v1 submitted 24 March, 2023;
originally announced March 2023.
-
Simultaneous prediction of hand gestures, handedness, and hand keypoints using thermal images
Authors:
Sichao Li,
Sean Banerjee,
Natasha Kholgade Banerjee,
Soumyabrata Dey
Abstract:
Hand gesture detection is a well-explored area in computer vision with applications in various forms of human-computer interaction. In this work, we propose a technique for simultaneous hand gesture classification, handedness detection, and hand keypoint localization using thermal data captured by an infrared camera. Our method uses a novel deep multi-task learning architecture that includes shared encoder-decoder layers followed by three branches, one dedicated to each of these tasks. We performed extensive experimental validation of our model on an in-house dataset consisting of data from 24 users. The results confirm higher than 98 percent accuracy for gesture classification, handedness detection, and fingertip localization, and more than 91 percent accuracy for wrist point localization.
Submitted 2 March, 2023;
originally announced March 2023.
-
DeepJoin: Learning a Joint Occupancy, Signed Distance, and Normal Field Function for Shape Repair
Authors:
Nikolas Lamb,
Sean Banerjee,
Natasha Kholgade Banerjee
Abstract:
We introduce DeepJoin, an automated approach to generate high-resolution repairs for fractured shapes using deep neural networks. Existing approaches to perform automated shape repair operate exclusively on symmetric objects, require a complete proxy shape, or predict restoration shapes using low-resolution voxels which are too coarse for physical repair. We generate a high-resolution restoration shape by inferring a corresponding complete shape and a break surface from an input fractured shape. We present a novel implicit shape representation for fractured shape repair that combines the occupancy function, signed distance function, and normal field. We demonstrate repairs using our approach for synthetically fractured objects from ShapeNet, 3D scans from the Google Scanned Objects dataset, objects in the style of ancient Greek pottery from the QP Cultural Heritage dataset, and real fractured objects. We outperform three baseline approaches in terms of chamfer distance and normal consistency. Unlike existing approaches and restorations using subtraction, DeepJoin restorations do not exhibit surface artifacts and join closely to the fractured region of the fractured shape. Our code is available at: https://github.com/Terascale-All-sensing-Research-Studio/DeepJoin.
Submitted 22 November, 2022;
originally announced November 2022.
-
FoSR: First-order spectral rewiring for addressing oversquashing in GNNs
Authors:
Kedar Karhadkar,
Pradeep Kr. Banerjee,
Guido Montúfar
Abstract:
Graph neural networks (GNNs) are able to leverage the structure of graph data by passing messages along the edges of the graph. While this allows GNNs to learn features depending on the graph structure, for certain graph topologies it leads to inefficient information propagation and a problem known as oversquashing. This has recently been linked with the curvature and spectral gap of the graph. On the other hand, adding edges to the message-passing graph can lead to increasingly similar node representations and a problem known as oversmoothing. We propose a computationally efficient algorithm that prevents oversquashing by systematically adding edges to the graph based on spectral expansion. We combine this with a relational architecture, which lets the GNN preserve the original graph structure and provably prevents oversmoothing. We find experimentally that our algorithm outperforms existing graph rewiring methods in several graph classification tasks.
Submitted 15 February, 2023; v1 submitted 21 October, 2022;
originally announced October 2022.
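The abstract above links oversquashing to the spectral gap of the graph. As a minimal sketch of that quantity only (this is not the FoSR algorithm itself), the snippet below computes the spectral gap as the second-smallest eigenvalue of the symmetric normalized Laplacian and shows that adding one edge to a path graph increases it. The function name `spectral_gap` and the toy graphs are illustrative assumptions, not taken from the paper.

```python
import numpy as np


def spectral_gap(adj: np.ndarray) -> float:
    """Second-smallest eigenvalue of the symmetric normalized Laplacian.

    Assumes an undirected graph with no isolated nodes (hypothetical helper).
    """
    deg = adj.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    # L = I - D^{-1/2} A D^{-1/2}
    lap = np.eye(len(adj)) - d_inv_sqrt[:, None] * adj * d_inv_sqrt[None, :]
    eigvals = np.linalg.eigvalsh(lap)  # returned in ascending order
    return float(eigvals[1])


n = 6
# Toy graph: a path on 6 nodes, which has a bottleneck-prone topology.
path = np.zeros((n, n))
for i in range(n - 1):
    path[i, i + 1] = path[i + 1, i] = 1.0

# Add a single edge joining the two endpoints, turning the path into a cycle.
cycle = path.copy()
cycle[0, n - 1] = cycle[n - 1, 0] = 1.0

gap_before = spectral_gap(path)
gap_after = spectral_gap(cycle)
print(f"path gap = {gap_before:.3f}, cycle gap = {gap_after:.3f}")
```

A rewiring method guided by spectral expansion would choose which edges to add so that this gap grows quickly; how FoSR selects them is described in the paper, not here.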
-
DeepMend: Learning Occupancy Functions to Represent Shape for Repair
Authors:
Nikolas Lamb,
Sean Banerjee,
Natasha Kholgade Banerjee
Abstract:
We present DeepMend, a novel approach to reconstruct restorations to fractured shapes using learned occupancy functions. Existing shape repair approaches predict low-resolution voxelized restorations, or require symmetries or access to a pre-existing complete oracle. We represent the occupancy of a fractured shape as the conjunction of the occupancy of an underlying complete shape and the fracture surface, which we model as functions of latent codes using neural networks. Given occupancy samples from an input fractured shape, we estimate latent codes using an inference loss augmented with novel penalty terms that avoid empty or voluminous restorations. We use inferred codes to reconstruct the restoration shape. We show results with simulated fractures on synthetic and real-world scanned objects, and with scanned real fractured mugs. Compared to the existing voxel approach and two baseline methods, our work shows state-of-the-art results in accuracy and avoiding restoration artifacts over non-fracture regions of the fractured shape.
Submitted 11 October, 2022;
originally announced October 2022.
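The abstract above represents the occupancy of a fractured shape as the conjunction of the occupancy of a complete shape and of the fracture surface. A toy sketch of that set logic on a 1D grid follows; the shapes are hand-written boolean masks rather than learned functions, and treating the restoration as the part of the complete shape on the other side of the break is an assumption made here for illustration.

```python
import numpy as np

# Sample points on a line segment (a stand-in for 3D query points).
x = np.linspace(-1.0, 1.0, 9)

# Hypothetical complete shape: the interval [-0.75, 0.75].
complete = np.abs(x) <= 0.75
# Hypothetical break occupancy: the half-line on the fractured-shape side.
break_side = x <= 0.25

# Fractured shape = complete AND break side (the conjunction in the abstract).
fractured = complete & break_side
# Assumed restoration = complete AND NOT break side.
restoration = complete & ~break_side

print("fractured  :", fractured.astype(int))
print("restoration:", restoration.astype(int))
```

Under this assumption the fractured shape and the restoration are disjoint and together recover the complete shape, which is the property a physical repair part needs.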
-
Oversquashing in GNNs through the lens of information contraction and graph expansion
Authors:
Pradeep Kr. Banerjee,
Kedar Karhadkar,
Yu Guang Wang,
Uri Alon,
Guido Montúfar
Abstract:
The quality of signal propagation in message-passing graph neural networks (GNNs) strongly influences their expressivity as has been observed in recent works. In particular, for prediction tasks relying on long-range interactions, recursive aggregation of node features can lead to an undesired phenomenon called "oversquashing". We present a framework for analyzing oversquashing based on information contraction. Our analysis is guided by a model of reliable computation due to von Neumann that lends a new insight into oversquashing as signal quenching in noisy computation graphs. Building on this, we propose a graph rewiring algorithm aimed at alleviating oversquashing. Our algorithm employs a random local edge flip primitive motivated by an expander graph construction. We compare the spectral expansion properties of our algorithm with that of an existing curvature-based non-local rewiring strategy. Synthetic experiments show that while our algorithm in general has a slower rate of expansion, it is overall computationally cheaper, preserves the node degrees exactly and never disconnects the graph.
Submitted 6 August, 2022;
originally announced August 2022.
-
Detecting Concept Drift in the Presence of Sparsity -- A Case Study of Automated Change Risk Assessment System
Authors:
Vishwas Choudhary,
Binay Gupta,
Anirban Chatterjee,
Subhadip Paul,
Kunal Banerjee,
Vijay Agneeswaran
Abstract:
Missing values, widely called sparsity in the literature, are a common characteristic of many real-world datasets. Many imputation methods have been proposed to address this problem of data incompleteness or sparsity. However, the accuracy of a data imputation method for a given feature or a set of features in a dataset is highly dependent on the distribution of the feature values and their correlation with other features. Another problem that plagues industry deployments of machine learning (ML) solutions is concept drift detection, which becomes more challenging in the presence of missing values. Although data imputation and concept drift detection have been studied extensively, little work has attempted a combined study of the two phenomena, i.e., concept drift detection in the presence of sparsity. In this work, we carry out a systematic study of the following: (i) different patterns of missing values, (ii) various statistical and ML-based data imputation methods for different kinds of sparsity, (iii) several concept drift detection methods, (iv) practical analysis of the various drift detection metrics, and (v) selecting the best concept drift detector given a dataset with missing values based on the different metrics. We first conduct the study on synthetic data and publicly available datasets, and finally extend the findings to our deployed automated change risk assessment system. One of the major findings from our empirical study is that no single concept drift detection method is superior across all the relevant metrics. Therefore, we adopt a majority-voting-based ensemble of concept drift detectors for abrupt and gradual concept drifts. Our experiments show that optimal or near-optimal performance can be achieved for this ensemble method across all the metrics.
Submitted 27 July, 2022;
originally announced July 2022.
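The abstract above adopts a majority-voting ensemble of concept drift detectors. A minimal sketch of the voting step alone is given below; the detectors are represented by pre-computed boolean verdicts, and the names `majority_vote`, `ensemble_drift`, and the three example detectors are hypothetical rather than taken from the paper.

```python
from typing import List, Sequence


def majority_vote(flags: Sequence[bool]) -> bool:
    """Return True when strictly more than half of the detectors flag drift."""
    return sum(flags) * 2 > len(flags)


def ensemble_drift(detector_flags: Sequence[Sequence[bool]]) -> List[bool]:
    """Combine per-detector verdicts; detector_flags[d][t] is detector d at step t."""
    return [majority_vote(step) for step in zip(*detector_flags)]


# Verdicts of three hypothetical detectors over four time steps.
flags = [
    [False, False, True, True],   # hypothetical detector A
    [False, True, True, False],   # hypothetical detector B
    [False, False, True, True],   # hypothetical detector C
]

combined = ensemble_drift(flags)
print(combined)  # [False, False, True, True]
```

Requiring a strict majority means a single detector's false alarm (step 1 above) is suppressed, while drift flagged by two of three detectors (step 3) is still reported.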
-
These Deals Won't Last! Longevity, Uniformity and Bias in Product Badge Assignment in E-Commerce Platforms
Authors:
Archit Bansal,
Kunal Banerjee,
Abhijnan Chakraborty
Abstract:
Product badges are ubiquitous in e-commerce platforms, acting as effective psychological triggers to nudge customers to buy specific products, boosting revenues. However, to the best of our knowledge, there has been no attempt to systematically study these badges and their several idiosyncrasies; we intend to close this gap in our current work. Specifically, we try to answer questions such as: How long does a product retain a badge on a given platform? If a product is sold on different platforms, does it receive similar badges? How do the products that receive badges differ from those that do not, in terms of price, customer rating, etc.? We collect longitudinal data from several e-commerce platforms over 45 days, and find that although most of the badges are short-lived, there are several permanent badge assignments, even for badges meant to denote urgency or scarcity. Furthermore, it is unclear how the badge assignments are done, and we find evidence that highly rated products are missing out on badges compared to lower-quality ones. Our work calls for greater transparency in the badge assignment process to inform customers, as well as to reduce dissatisfaction among the sellers dependent on the platforms for their revenues.
Submitted 26 April, 2022;
originally announced April 2022.
-
Continuity and Additivity Properties of Information Decompositions
Authors:
Johannes Rauh,
Pradeep Kr. Banerjee,
Eckehard Olbrich,
Guido Montúfar,
Jürgen Jost
Abstract:
Information decompositions quantify how the Shannon information about a given random variable is distributed among several other random variables. Various requirements have been proposed that such a decomposition should satisfy, leading to different candidate solutions. Curiously, however, only two of the original requirements that determined the Shannon information have been considered, namely monotonicity and normalization. Two other important properties, continuity and additivity, have not been considered. In this contribution, we focus on the mutual information of two finite variables $Y,Z$ about a third finite variable $S$ and check which of the decompositions satisfy these two properties. While most of them satisfy continuity, only one of them is both continuous and additive.
Submitted 9 July, 2023; v1 submitted 22 April, 2022;
originally announced April 2022.
-
Learning curves for Gaussian process regression with power-law priors and targets
Authors:
Hui Jin,
Pradeep Kr. Banerjee,
Guido Montúfar
Abstract:
We characterize the power-law asymptotics of learning curves for Gaussian process regression (GPR) under the assumption that the eigenspectrum of the prior and the eigenexpansion coefficients of the target function follow a power law. Under similar assumptions, we leverage the equivalence between GPR and kernel ridge regression (KRR) to show the generalization error of KRR. Infinitely wide neural networks can be related to GPR with respect to the neural network GP kernel and the neural tangent kernel, which in several cases is known to have a power-law spectrum. Hence our methods can be applied to study the generalization error of infinitely wide neural networks. We present toy experiments demonstrating the theory.
Submitted 27 November, 2021; v1 submitted 23 October, 2021;
originally announced October 2021.
-
Look Before You Leap! Designing a Human-Centered AI System for Change Risk Assessment
Authors:
Binay Gupta,
Anirban Chatterjee,
Harika Matha,
Kunal Banerjee,
Lalitdutt Parsai,
Vijay Agneeswaran
Abstract:
Reducing the number of failures in a production system is one of the most challenging problems in technology-driven industries, such as the online retail industry. To address this challenge, change management has emerged as a promising sub-field in operations that manages and reviews the changes to be deployed in production in a systematic manner. However, it is practically impossible to manually review a large number of changes on a daily basis and assess the risk associated with them. This warrants the development of an automated system to assess the risk associated with a large number of changes. There are a few commercial solutions available to address this problem, but those solutions lack the ability to incorporate domain knowledge and continuous feedback from domain experts into the risk assessment process. As part of this work, we aim to bridge the gap between model-driven risk assessment of change requests and the assessment of domain experts by building a continuous feedback loop into the risk assessment process. Here we present our work to build an end-to-end machine learning system, along with a discussion of some of the practical challenges we faced related to extreme skewness in class distribution, concept drift, estimation of the uncertainty associated with the model's prediction, and the overall scalability of the system.
Submitted 17 August, 2021;
originally announced August 2021.
-
Information Complexity and Generalization Bounds
Authors:
Pradeep Kr. Banerjee,
Guido Montúfar
Abstract:
We present a unifying picture of PAC-Bayesian and mutual information-based upper bounds on the generalization error of randomized learning algorithms. As we show, Tong Zhang's information exponential inequality (IEI) gives a general recipe for constructing bounds of both flavors. We show that several important results in the literature can be obtained as simple corollaries of the IEI under different assumptions on the loss function. Moreover, we obtain new bounds for data-dependent priors and unbounded loss functions. Optimizing the bounds gives rise to variants of the Gibbs algorithm, for which we discuss two practical examples for learning with neural networks, namely, Entropy- and PAC-Bayes- SGD. Further, we use an Occam's factor argument to show a PAC-Bayesian bound that incorporates second-order curvature information of the training loss.
△ Less
Submitted 23 October, 2021; v1 submitted 4 May, 2021;
originally announced May 2021.
-
Designing a Bot for Efficient Distribution of Service Requests
Authors:
Arkadip Basu,
Kunal Banerjee
Abstract:
The tracking and timely resolution of service requests is one of the major challenges in agile project management. Having an efficient solution to this problem is a key requirement for Walmart to facilitate seamless collaboration across its different business units. The Jira software is one of the popular choices in industries for monitoring such service requests. A service request once logged into the system by a reporter is referred to as a (Jira) ticket which is assigned to an engineer for servicing. In this work, we explore how the tickets which may arise in any of the Walmart stores and offices distributed over several countries can be assigned to engineers efficiently. Specifically, we will discuss how the introduction of a bot for automated ticket assignment has helped in reducing the disparity in ticket assignment to engineers by human managers and also decreased the average ticket resolution time - thereby improving the experience for both the reporters and the engineers. Additionally, the bot sends reminders and status updates over different business communication platforms for timely tracking of tickets; it can be suitably modified to provision for human intervention in case of special needs by some teams. The current study conducted over data collected from various teams within Walmart shows the efficacy of our bot.
Submitted 10 March, 2021;
originally announced March 2021.
-
Exploring Alternatives to Softmax Function
Authors:
Kunal Banerjee,
Vishak Prasad C,
Rishi Raj Gupta,
Karthik Vyas,
Anushree H,
Biswajit Mishra
Abstract:
The softmax function is widely used in artificial neural networks for multiclass classification, multilabel classification, attention mechanisms, etc. However, its efficacy is often questioned in the literature. The log-softmax loss has been shown to belong to a more generic class of loss functions, called the spherical family, and its member, the log-Taylor softmax loss, is arguably the best alternative in this class. In another approach, which tries to enhance the discriminative nature of the softmax function, soft-margin softmax (SM-softmax) has been proposed as the most suitable alternative. In this work, we investigate Taylor softmax, SM-softmax and our proposed SM-Taylor softmax, an amalgamation of the earlier two functions, as alternatives to the softmax function. Furthermore, we explore the effect of expanding Taylor softmax up to ten terms (the original work proposed expanding only to two terms), along with the ramifications of considering Taylor softmax to be a finite or infinite series during backpropagation. Our experiments for the image classification task on different datasets reveal that there is always a configuration of the SM-Taylor softmax function that outperforms the normal softmax function and its other alternatives.
Submitted 23 November, 2020;
originally announced November 2020.
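The Taylor softmax mentioned in the abstract above can be illustrated with a minimal sketch. It assumes the usual definition, in which exp(z) is replaced by its Taylor polynomial truncated at a chosen order; the function name `taylor_softmax` and its `order` parameter are illustrative choices, not taken from the paper, and the soft-margin (SM) variant is not reproduced here.

```python
import math
import numpy as np

def taylor_softmax(logits, order=2):
    """Softmax with exp(z) replaced by its Taylor polynomial of the given order.

    Assumption: `order` is even, which keeps every numerator strictly positive
    (for order 2, 1 + z + z**2 / 2 = ((z + 1)**2 + 1) / 2 > 0).
    """
    z = np.asarray(logits, dtype=np.float64)
    # Truncated series: sum_{k=0}^{order} z**k / k!
    numer = sum(z**k / math.factorial(k) for k in range(order + 1))
    # Normalize along the class axis so the outputs sum to 1.
    return numer / numer.sum(axis=-1, keepdims=True)

probs = taylor_softmax([0.0, 1.0], order=2)
print(probs)
```

With a higher even order the result approaches the ordinary softmax, which is one way to read the abstract's comparison between a two-term and a ten-term expansion.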
-
K-TanH: Efficient TanH For Deep Learning
Authors:
Abhisek Kundu,
Alex Heinecke,
Dhiraj Kalamkar,
Sudarshan Srinivasan,
Eric C. Qin,
Naveen K. Mellempudi,
Dipankar Das,
Kunal Banerjee,
Bharat Kaul,
Pradeep Dubey
Abstract:
We propose K-TanH, a novel, highly accurate, hardware-efficient approximation of the popular activation function TanH for Deep Learning. K-TanH consists of parameterized low-precision integer operations, such as shift and add/subtract (no floating point operation needed), where parameters are stored in very small look-up tables that can fit in CPU registers. K-TanH can work on various numerical formats, such as Float32 and BFloat16. High quality approximations to other activation functions, e.g., Sigmoid, Swish and GELU, can be derived from K-TanH. Our AVX512 implementation of K-TanH demonstrates $>5\times$ speed up over Intel SVML, and it is consistently superior in efficiency over other approximations that use floating point arithmetic. Finally, we achieve state-of-the-art BLEU score and convergence results for training the language translation model GNMT on WMT16 data sets with approximate TanH obtained via K-TanH on BFloat16 inputs.
Submitted 7 June, 2020; v1 submitted 17 September, 2019;
originally announced September 2019.
-
High-Performance Deep Learning via a Single Building Block
Authors:
Evangelos Georganas,
Kunal Banerjee,
Dhiraj Kalamkar,
Sasikanth Avancha,
Anand Venkat,
Michael Anderson,
Greg Henry,
Hans Pabst,
Alexander Heinecke
Abstract:
Deep learning (DL) is one of the most prominent branches of machine learning. Due to the immense computational cost of DL workloads, industry and academia have developed DL libraries with highly-specialized kernels for each workload/architecture, leading to numerous, complex code-bases that strive for performance, yet they are hard to maintain and do not generalize. In this work, we introduce the batch-reduce GEMM kernel and show how the most popular DL algorithms can be formulated with this kernel as the basic building-block. Consequently, DL library development degenerates to mere (potentially automatic) tuning of loops around this sole optimized kernel. By exploiting our new kernel we implement Recurrent Neural Network, Convolutional Neural Network and Multilayer Perceptron training and inference primitives in just 3K lines of high-level code. Our primitives outperform vendor-optimized libraries on multi-node CPU clusters, and we also provide proof-of-concept CNN kernels targeting GPUs. Finally, we demonstrate that the batch-reduce GEMM kernel within a tensor compiler yields high-performance CNN primitives, further amplifying the viability of our approach.
Submitted 17 June, 2019; v1 submitted 14 June, 2019;
originally announced June 2019.
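As a rough illustration of the batch-reduce GEMM idea named in the abstract above, the sketch below assumes the kernel accumulates a sum of block products into a single output, C = beta * C + alpha * sum_i A_i B_i. The abstract does not spell out this form, so the signature, the `alpha`/`beta` scaling and the NumPy formulation are assumptions made for illustration, not the authors' optimized kernel.

```python
import numpy as np

def batch_reduce_gemm(a_blocks, b_blocks, c, alpha=1.0, beta=1.0):
    """Reference semantics only: C = beta * C + alpha * sum_i A_i @ B_i.

    a_blocks has shape (batch, m, k), b_blocks has shape (batch, k, n),
    and c has shape (m, n). The batch axis is reduced away.
    """
    acc = np.einsum("bmk,bkn->mn", a_blocks, b_blocks)
    return beta * c + alpha * acc

# A plain matrix product, tiled along the shared dimension, is one instance:
rng = np.random.default_rng(0)
a = rng.standard_normal((4, 6))
b = rng.standard_normal((6, 5))
a_tiles = np.stack([a[:, 0:3], a[:, 3:6]])   # two (4, 3) tiles
b_tiles = np.stack([b[0:3, :], b[3:6, :]])   # two (3, 5) tiles
c = batch_reduce_gemm(a_tiles, b_tiles, np.zeros((4, 5)))
print(np.allclose(c, a @ b))
```

The point of the sketch is the shape of the computation: layers that decompose into sums of small block products can share one such kernel, with only the surrounding loops changing.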
-
A Study of BFLOAT16 for Deep Learning Training
Authors:
Dhiraj Kalamkar,
Dheevatsa Mudigere,
Naveen Mellempudi,
Dipankar Das,
Kunal Banerjee,
Sasikanth Avancha,
Dharma Teja Vooturi,
Nataraj Jammalamadaka,
Jianyu Huang,
Hector Yuen,
Jiyan Yang,
Jongsoo Park,
Alexander Heinecke,
Evangelos Georganas,
Sudarshan Srinivasan,
Abhisek Kundu,
Misha Smelyanskiy,
Bharat Kaul,
Pradeep Dubey
Abstract:
This paper presents the first comprehensive empirical study demonstrating the efficacy of the Brain Floating Point (BFLOAT16) half-precision format for Deep Learning training across image classification, speech recognition, language modeling, generative networks and industrial recommendation systems. BFLOAT16 is attractive for Deep Learning training for two reasons: the range of values it can represent is the same as that of IEEE 754 floating-point format (FP32) and conversion to/from FP32 is simple. Maintaining the same range as FP32 is important to ensure that no hyper-parameter tuning is required for convergence; e.g., IEEE 754 compliant half-precision floating point (FP16) requires hyper-parameter tuning. In this paper, we discuss the flow of tensors and various key operations in mixed precision training, and delve into details of operations, such as the rounding modes for converting FP32 tensors to BFLOAT16. We have implemented a method to emulate BFLOAT16 operations in Tensorflow, Caffe2, IntelCaffe, and Neon for our experiments. Our results show that deep learning training using BFLOAT16 tensors achieves the same state-of-the-art (SOTA) results across domains as FP32 tensors in the same number of iterations and with no changes to hyper-parameters.
Submitted 13 June, 2019; v1 submitted 29 May, 2019;
originally announced May 2019.
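The abstract above notes that BFLOAT16 keeps the FP32 value range and that conversion to and from FP32 is simple. A minimal sketch of why: BFLOAT16 is the upper 16 bits of the FP32 bit pattern, so conversion is a 16-bit shift plus a rounding step. The round-to-nearest-even mode used here is an assumption (the paper discusses rounding modes, but the abstract does not say which it uses), the function names are illustrative, and NaN inputs are not handled specially.

```python
import numpy as np

def fp32_to_bf16_bits(x):
    """Return BFLOAT16 bit patterns (uint16) for FP32 inputs, round-to-nearest-even."""
    bits = np.atleast_1d(np.asarray(x, dtype=np.float32)).view(np.uint32)
    # Adding 0x7FFF plus the lowest kept bit rounds ties to an even mantissa.
    lsb = (bits >> 16) & 1
    rounded = bits + np.uint32(0x7FFF) + lsb
    return (rounded >> 16).astype(np.uint16)

def bf16_bits_to_fp32(b):
    """Widen BFLOAT16 bit patterns back to FP32 by zero-filling the low 16 bits."""
    wide = np.asarray(b, dtype=np.uint16).astype(np.uint32) << 16
    return wide.view(np.float32)

x = np.array([1.0, 3.14159, 1e38], dtype=np.float32)
print(bf16_bits_to_fp32(fp32_to_bf16_bits(x)))
```

Because the 8 exponent bits are untouched, a value such as 1e38 stays finite after the round trip, while only about 8 bits of significand precision remain.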
-
A Quick Introduction to Functional Verification of Array-Intensive Programs
Authors:
Kunal Banerjee,
Chandan Karfa
Abstract:
Array-intensive programs are often amenable to parallelization across many cores on a single machine as well as scaling across multiple machines and hence are well explored, especially in the domain of high-performance computing. These programs typically undergo loop transformations and arithmetic transformations in addition to parallelizing transformations. Although a lot of effort has been invested in improving parallelizing compilers, experienced programmers still resort to hand-optimized transformations, which are typically followed by careful tuning of the transformed program to finally obtain the optimized program. Therefore, it is critical to verify that the functional correctness of an original sequential program is not sacrificed during the process of optimization. In this paper, we cover important literature on functional verification of array-intensive programs, which we believe can be a good starting point for anyone interested in this field.
Submitted 22 May, 2019;
originally announced May 2019.
-
Unique Information and Secret Key Decompositions
Authors:
Johannes Rauh,
Pradeep Kr. Banerjee,
Eckehard Olbrich,
Jürgen Jost
Abstract:
The unique information ($UI$) is an information measure that quantifies a deviation from the Blackwell order. We have recently shown that this quantity is an upper bound on the one-way secret key rate. In this paper, we prove a triangle inequality for the $UI$, which implies that the $UI$ is never greater than one of the best known upper bounds on the two-way secret key rate. We conjecture that the $UI$ lower bounds the two-way rate and discuss implications of the conjecture.
Submitted 23 January, 2019;
originally announced January 2019.
-
The Variational Deficiency Bottleneck
Authors:
Pradeep Kr. Banerjee,
Guido Montúfar
Abstract:
We introduce a bottleneck method for learning data representations based on information deficiency, rather than the more traditional information sufficiency. A variational upper bound allows us to implement this method efficiently. The bound itself is bounded above by the variational information bottleneck objective, and the two methods coincide in the regime of single-shot Monte Carlo approximations. The notion of deficiency provides a principled way of approximating complicated channels by relatively simpler ones. We show that the deficiency of one channel with respect to another has an operational interpretation in terms of the optimal risk gap of decision problems, capturing classification as a special case. Experiments demonstrate that the deficiency bottleneck can provide advantages in terms of minimal sufficiency as measured by information bottleneck curves, while retaining robust test performance in classification tasks.
Submitted 4 November, 2020; v1 submitted 27 October, 2018;
originally announced October 2018.
-
Anatomy Of High-Performance Deep Learning Convolutions On SIMD Architectures
Authors:
Evangelos Georganas,
Sasikanth Avancha,
Kunal Banerjee,
Dhiraj Kalamkar,
Greg Henry,
Hans Pabst,
Alexander Heinecke
Abstract:
Convolution layers are prevalent in many classes of deep neural networks, including Convolutional Neural Networks (CNNs) which provide state-of-the-art results for tasks like image recognition, neural machine translation and speech recognition. The computationally expensive nature of a convolution operation has led to the proliferation of implementations including matrix-matrix multiplication formulation, and direct convolution primarily targeting GPUs. In this paper, we introduce direct convolution kernels for x86 architectures, in particular for Xeon and XeonPhi systems, which are implemented via a dynamic compilation approach. Our JIT-based implementation shows close to theoretical peak performance, depending on the setting and the CPU architecture at hand. We additionally demonstrate how these JIT-optimized kernels can be integrated into a lightweight multi-node graph execution model. This illustrates that single- and multi-node runs yield high efficiencies and high image-throughputs when executing state-of-the-art image recognition tasks on CPUs.
Submitted 20 August, 2018; v1 submitted 16 August, 2018;
originally announced August 2018.
-
Unique Informations and Deficiencies
Authors:
Pradeep Kr. Banerjee,
Eckehard Olbrich,
Jürgen Jost,
Johannes Rauh
Abstract:
Given two channels that convey information about the same random variable, we introduce two measures of the unique information of one channel with respect to the other. The two quantities are based on the notion of generalized weighted Le Cam deficiencies and differ on whether one channel can approximate the other by a randomization at either its input or output. We relate the proposed quantities to an existing measure of unique information which we call the minimum-synergy unique information. We give an operational interpretation of the latter in terms of an upper bound on the one-way secret key rate and discuss the role of the unique informations in the context of nonnegative mutual information decompositions into unique, redundant and synergistic components.
Submitted 9 December, 2019; v1 submitted 13 July, 2018;
originally announced July 2018.
-
Mixed Precision Training of Convolutional Neural Networks using Integer Operations
Authors:
Dipankar Das,
Naveen Mellempudi,
Dheevatsa Mudigere,
Dhiraj Kalamkar,
Sasikanth Avancha,
Kunal Banerjee,
Srinivas Sridharan,
Karthik Vaidyanathan,
Bharat Kaul,
Evangelos Georganas,
Alexander Heinecke,
Pradeep Dubey,
Jesus Corbal,
Nikita Shustrov,
Roma Dubtsov,
Evarist Fomenko,
Vadim Pirogov
Abstract:
The state-of-the-art (SOTA) for mixed precision training is dominated by variants of low precision floating point operations, and in particular, FP16 accumulating into FP32 (Micikevicius et al., 2017). On the other hand, while a lot of research has also happened in the domain of low and mixed-precision Integer training, these works either present results for non-SOTA networks (for instance only AlexNet for ImageNet-1K), or relatively small datasets (like CIFAR-10). In this work, we train state-of-the-art visual understanding neural networks on the ImageNet-1K dataset, with Integer operations on General Purpose (GP) hardware. In particular, we focus on Integer Fused-Multiply-and-Accumulate (FMA) operations which take two pairs of INT16 operands and accumulate results into an INT32 output. We propose a shared exponent representation of tensors and develop a Dynamic Fixed Point (DFP) scheme suitable for common neural network operations. The nuances of developing an efficient integer convolution kernel are examined, including methods to handle overflow of the INT32 accumulator. We implement CNN training for ResNet-50, GoogLeNet-v1, VGG-16 and AlexNet; and these networks achieve or exceed SOTA accuracy within the same number of iterations as their FP32 counterparts without any change in hyper-parameters and with a 1.8X improvement in end-to-end training throughput. To the best of our knowledge these results represent the first INT16 training results on GP hardware for the ImageNet-1K dataset using SOTA CNNs and achieve the highest reported accuracy using half-precision.
Submitted 23 February, 2018; v1 submitted 3 February, 2018;
originally announced February 2018.
-
Computing the Unique Information
Authors:
Pradeep Kr. Banerjee,
Johannes Rauh,
Guido Montúfar
Abstract:
Given a pair of predictor variables and a response variable, how much information do the predictors have about the response, and how is this information distributed between unique, redundant, and synergistic components? Recent work has proposed to quantify the unique component of the decomposition as the minimum value of the conditional mutual information over a constrained set of information channels. We present an efficient iterative divergence minimization algorithm to solve this optimization problem with convergence guarantees and evaluate its performance against other techniques.
Submitted 29 May, 2018; v1 submitted 21 September, 2017;
originally announced September 2017.
-
Ternary Residual Networks
Authors:
Abhisek Kundu,
Kunal Banerjee,
Naveen Mellempudi,
Dheevatsa Mudigere,
Dipankar Das,
Bharat Kaul,
Pradeep Dubey
Abstract:
Sub-8-bit representations of DNNs incur some discernible loss of accuracy despite rigorous (re)training at low precision. Such loss of accuracy essentially makes them equivalent to a much shallower counterpart, diminishing the power of being deep networks. To address this problem of accuracy drop we introduce the notion of residual networks, where we add more low-precision edges to sensitive branches of the sub-8-bit network to compensate for the lost accuracy. Further, we present a perturbation theory to identify such sensitive edges. Aided by such an elegant trade-off between accuracy and compute, the 8-2 model (8-bit activations, ternary weights), enhanced by ternary residual edges, turns out to be sophisticated enough to achieve very high accuracy ($\sim 1\%$ drop from our FP32 baseline), despite $\sim 1.6\times$ reduction in model size, $\sim 26\times$ reduction in number of multiplications, and potentially $\sim 2\times$ power-performance gain compared to the 8-8 representation, on the state-of-the-art deep network ResNet-101 pre-trained on the ImageNet dataset. Moreover, depending on the varying accuracy requirements in a dynamic environment, the deployed low-precision model can be upgraded/downgraded on-the-fly by partially enabling/disabling residual connections. For example, disabling the least important residual connections in the above enhanced network, the accuracy drop is $\sim 2\%$ (from FP32), despite $\sim 1.9\times$ reduction in model size, $\sim 32\times$ reduction in number of multiplications, and potentially $\sim 2.3\times$ power-performance gain compared to the 8-8 representation. Finally, all the ternary connections are sparse in nature, and the ternary residual conversion can be done in a resource-constrained setting with no low-precision (re)training.
Submitted 31 October, 2017; v1 submitted 14 July, 2017;
originally announced July 2017.
-
On extractable shared information
Authors:
Johannes Rauh,
Pradeep Kr. Banerjee,
Eckehard Olbrich,
Jürgen Jost,
Nils Bertschinger
Abstract:
We consider the problem of quantifying the information shared by a pair of random variables $X_{1},X_{2}$ about another variable $S$. We propose a new measure of shared information, called extractable shared information, that is left monotonic; that is, the information shared about $S$ is bounded from below by the information shared about $f(S)$ for any function $f$. We show that our measure leads to a new nonnegative decomposition of the mutual information $I(S;X_1X_2)$ into shared, complementary and unique components. We study properties of this decomposition and show that a left monotonic shared information is not compatible with a Blackwell interpretation of unique information. We also discuss whether it is possible to have a decomposition in which both shared and unique information are left monotonic.
Submitted 10 November, 2017; v1 submitted 26 January, 2017;
originally announced January 2017.
-
Coarse-graining and the Blackwell order
Authors:
Johannes Rauh,
Pradeep Kr. Banerjee,
Eckehard Olbrich,
Jürgen Jost,
Nils Bertschinger,
David Wolpert
Abstract:
Suppose we have a pair of information channels, $κ_{1},κ_{2}$, with a common input. The Blackwell order is a partial order over channels that compares $κ_{1}$ and $κ_{2}$ by the maximal expected utility an agent can obtain when decisions are based on the channel outputs. Equivalently, $κ_{1}$ is said to be Blackwell-inferior to $κ_{2}$ if and only if $κ_{1}$ can be constructed by garbling the output of $κ_{2}$. A related partial order stipulates that $κ_{2}$ is more capable than $κ_{1}$ if the mutual information between the input and output is larger for $κ_{2}$ than for $κ_{1}$ for any distribution over inputs. A Blackwell-inferior channel is necessarily less capable. However, examples are known where $κ_{1}$ is less capable than $κ_{2}$ but not Blackwell-inferior. We show that this may even happen when $κ_{1}$ is constructed by coarse-graining the inputs of $κ_{2}$. Such a coarse-graining is a special kind of "pre-garbling" of the channel inputs. This example directly establishes that the expected value of the shared utility function for the coarse-grained channel is larger than it is for the non-coarse-grained channel. This contradicts the intuition that coarse-graining can only destroy information and lead to inferior channels. We also discuss our results in the context of information decompositions.
Submitted 10 November, 2017; v1 submitted 26 January, 2017;
originally announced January 2017.
-
Categorization of Tablas by Wavelet Analysis
Authors:
Anirban Patranabis,
Kaushik Banerjee,
Vishal Midya,
Shankha Sanyal,
Archi Banerjee,
Ranjan Sengupta,
Dipak Ghosh
Abstract:
Tabla, a percussion instrument, is mainly used in India to accompany vocalists, instrumentalists and dancers in every style of music from classical to light, and to keep rhythm. This percussion instrument consists of two drums played by two hands; the drums are structurally different and produce different harmonic sounds. Earlier work has labeled tabla strokes from real-time performances by testing neural networks and tree-based classification methods. The current work extends previous work by C. V. Raman and S. Kumar in 1920 on spectrum modeling of tabla strokes. In this paper we have studied the spectral characteristics (by wavelet analysis using the sub-band coding method and the Torrence wavelet tool) of nine strokes from each of five tablas using the wavelet transform. Wavelet analysis is now a common tool for analyzing localized variations of power within a time series and for finding the frequency distribution in time-frequency space. Statistically, we look into the patterns depicted by the harmonics of different sub-bands and tablas. The distribution of dominant frequencies in different sub-bands of the stroke signals, the distribution of power and the behavior of harmonics are the important features that lead to the categorization of tablas.
Submitted 3 January, 2016;
originally announced January 2016.
-
Harmonic and Timbre Analysis of Tabla Strokes
Authors:
Anirban Patranabis,
Kaushik Banerjee,
Vishal Midya,
Sneha Chakraborty,
Shankha Sanyal,
Archi Banerjee,
Ranjan Sengupta,
Dipak Ghosh
Abstract:
The Indian twin drums, bayan and dayan (tabla), are the most important percussion instruments in India and are popularly used for keeping rhythm. The tabla is a twin percussion/drum instrument of which the right-hand drum is called dayan and the left-hand drum is called bayan. Tabla strokes, commonly called 'bol', constitute a series of syllables. In this study we have studied the timbre characteristics of nine strokes from each of five different tablas. Timbre parameters were calculated from the LTAS of each stroke signal. The study of timbre characteristics is one of the most important deterministic approaches for analyzing the tabla and its stroke characteristics. Statistical correlations among timbre parameters were measured, and factor analysis was used to identify which timbre parameters are closely related. Tabla strokes have unique harmonic and timbral characteristics in the mid-frequency range and have no uniqueness in the low-frequency range.
Submitted 15 October, 2015;
originally announced October 2015.
-
Synergy, Redundancy and Common Information
Authors:
Pradeep Kr. Banerjee,
Virgil Griffith
Abstract:
We consider the problem of decomposing the total mutual information conveyed by a pair of predictor random variables about a target random variable into redundant, unique and synergistic contributions. We focus on the relationship between "redundant information" and the more familiar information-theoretic notions of "common information". Our main contribution is an impossibility result. We show that for independent predictor random variables, any common information based measure of redundancy cannot induce a nonnegative decomposition of the total mutual information. Interestingly, this entails that any reasonable measure of redundant information cannot be derived by optimization over a single random variable.
Submitted 12 September, 2015;
originally announced September 2015.
-
Noise Sensitivity of Teager-Kaiser Energy Operators and Their Ratios
Authors:
Pradeep Kr. Banerjee,
Nirmal B. Chakrabarti
Abstract:
The Teager-Kaiser energy operator (TKO) belongs to a class of autocorrelators and their linear combinations that can track the instantaneous energy of a nonstationary sinusoidal signal source. TKO-based monocomponent AM-FM demodulation algorithms work under the basic assumption that the operator outputs are always positive. In the absence of noise, this is assured for pure sinusoidal inputs and the instantaneous property is also guaranteed. Noise invalidates both of these, particularly under small-signal conditions. Post-detection filtering and thresholding are of use to reestablish these at the cost of some time to acquire. Key questions are: (a) how many samples must one use and (b) how much noise power at the detector input can one tolerate. Results of a study of the role of delay and the limits imposed by additive Gaussian noise are presented along with the computation of the cumulants and probability density functions of the individual quadratic forms and their ratios.
Submitted 29 May, 2015; v1 submitted 30 April, 2015;
originally announced April 2015.
-
Some new insights into information decomposition in complex systems based on common information
Authors:
Pradeep Kr. Banerjee
Abstract:
We take a closer look at the structure of bivariate dependency induced by a pair of predictor random variables $(X_1, X_2)$ trying to synergistically, redundantly or uniquely encode a target random variable $Y$. We evaluate a recently proposed measure of redundancy based on the Gács-Körner common information (Griffith et al., Entropy 2014) and show that the measure, in spite of its elegance, is degenerate for most non-trivial distributions. We show that Wyner's common information also fails to capture the notion of redundancy, as it violates an intuitive monotonically non-increasing property. We identify a set of conditions under which a conditional version of Gács and Körner's common information is an ideal measure of unique information. Finally, we show how the notions of approximately sufficient statistics and conditional information bottleneck can be used to quantify unique information.
Submitted 2 March, 2015;
originally announced March 2015.
-
A Secret Common Information Duality for Tripartite Noisy Correlations
Authors:
Pradeep Kr. Banerjee
Abstract:
We explore the duality between the simulation and extraction of secret correlations in light of a similar well-known operational duality between the two notions of common information due to Wyner, and Gács and Körner. For the inverse problem of simulating a tripartite noisy correlation from noiseless secret key and unlimited public communication, we show that Winter's (2005) result for the key cost in terms of a conditional version of Wyner's common information can be simply reexpressed in terms of the existence of a bipartite protocol monotone. For the forward problem of key distillation from noisy correlations, we construct simple distributions for which the conditional Gács and Körner common information achieves a tight bound on the secret key rate. We conjecture that this holds in general for non-communicative key agreement models. We also comment on the interconvertibility of secret correlations under local operations and public communication.
Submitted 30 May, 2015; v1 submitted 20 February, 2015;
originally announced February 2015.
-
Multipartite Monotones for Secure Sampling by Public Discussion From Noisy Correlations
Authors:
Pradeep Kr. Banerjee
Abstract:
We address the problem of quantifying the cryptographic content of probability distributions, in relation to an application to secure multi-party sampling against a passive t-adversary. We generalize a recently introduced notion of assisted common information of a pair of correlated sources to that of K sources and define a family of monotone rate regions indexed by K. This allows for a simple characterization of all t-private distributions that can be statistically securely sampled without any auxiliary setup of pre-shared noisy correlations. We also give a new monotone called the residual total correlation that admits a simple operational interpretation. Interestingly, for sampling with non-trivial setups (K > 2) in the public discussion model, our definition of a monotone region differs from the one by Prabhakaran and Prabhakaran (ITW 2012).
Submitted 20 April, 2015; v1 submitted 19 February, 2015;
originally announced February 2015.