-
The Cosmic Dispersion Measure in the EAGLE Simulations
Authors:
Adam J. Batten,
Alan R. Duffy,
Nastasha Wijers,
Vivek Gupta,
Chris Flynn,
Joop Schaye,
Emma Ryan-Weber
Abstract:
The dispersion measure (DM) of fast radio bursts (FRBs) provides a unique way to probe ionised baryons in the intergalactic medium (IGM). Cosmological models with different parameters lead to different DM-redshift ($\mathrm{DM}-z$) relations. Additionally, the over/under-dense regions in the IGM and the circumgalactic medium of intervening galaxies lead to scatter around the mean $\mathrm{DM}-z$ relations. We have used the Evolution and Assembly of GaLaxies and their Environments (EAGLE) simulations to measure the mean $\mathrm{DM}-z$ relation and the scatter around it using over one billion lines-of-sight between redshifts $0<z<3$. We investigated two techniques to estimate line-of-sight DM: `pixel scrambling' and `box transformations'. We find that using box transformations (a technique from the literature) causes strong correlations due to repeated replication of structure. Comparing a linear and a non-linear model, we find that the non-linear model with cosmological parameters provides a better fit to the $\mathrm{DM}-z$ relation. The differences between these models are most significant at low redshifts ($z<0.5$). The scatter around the $\mathrm{DM}-z$ relation is highly asymmetric, especially at low redshift $\left(z<0.5\right)$, and becomes more Gaussian as redshift approaches $z\sim3$, the limit of this study. The increase in Gaussianity with redshift is indicative of the large-scale structure that is better probed with longer lines-of-sight. The minimum simulation size suitable for investigations into the scatter around the $\mathrm{DM}-z$ relation is 100~comoving~Mpc. The $\mathrm{DM}-z$ relation measured in EAGLE is available with an easy-to-use python interface in the open-source FRB redshift estimation package FRUITBAT.
Submitted 29 November, 2020;
originally announced November 2020.
-
Reinforcement Learning based Distributed Control of Dissipative Networked Systems
Authors:
K. C. Kosaraju,
S. Sivaranjani,
W. Suttle,
V. Gupta,
J. Liu
Abstract:
We consider the problem of designing distributed controllers to stabilize a class of networked systems, where each subsystem is dissipative and designs a reinforcement learning based local controller to maximize an individual cumulative reward function. We develop an approach that enforces dissipativity conditions on these local controllers at each subsystem to guarantee stability of the entire networked system. The proposed approach is illustrated on a DC microgrid example, where the objective is to maintain voltage stability of the network using local distributed controllers at each generation unit.
Submitted 28 November, 2020;
originally announced November 2020.
-
Balance Regularized Neural Network Models for Causal Effect Estimation
Authors:
Mehrdad Farajtabar,
Andrew Lee,
Yuanjian Feng,
Vishal Gupta,
Peter Dolan,
Harish Chandran,
Martin Szummer
Abstract:
Estimating individual and average treatment effects from observational data is an important problem in many domains such as healthcare and e-commerce. In this paper, we advocate balance regularization of multi-head neural network architectures. Our work is motivated by representation learning techniques to reduce differences between treated and untreated distributions that potentially arise due to confounding factors. We further regularize the model by encouraging it to predict control outcomes for individuals in the treatment group that are similar to control outcomes in the control group. We empirically study the bias-variance trade-off between different weightings of the regularizers, as well as between inductive and transductive inference.
Submitted 22 November, 2020;
originally announced November 2020.
-
End-to-End Differentiable 6DoF Object Pose Estimation with Local and Global Constraints
Authors:
Anshul Gupta,
Joydeep Medhi,
Aratrik Chattopadhyay,
Vikram Gupta
Abstract:
Inferring the 6DoF pose of an object from a single RGB image is an important but challenging task, especially under heavy occlusion. While recent approaches improve upon the two-stage approaches by training an end-to-end pipeline, they do not leverage local and global constraints. In this paper, we propose pairwise feature extraction to integrate local constraints, and triplet regularization to integrate global constraints for improved 6DoF object pose estimation. Coupled with better augmentation, our approach achieves state-of-the-art results on the challenging Occlusion Linemod dataset, with a 9% improvement over the previous state of the art, and achieves competitive results on the Linemod dataset.
Submitted 22 November, 2020;
originally announced November 2020.
-
Estimating fast transient detection pipeline efficiencies at UTMOST via real-time injection of mock FRBs
Authors:
Vivek Gupta,
Chris Flynn,
Wael Farah,
Andrew Jameson,
Vivek Venkatraman Krishnan,
Matthew Bailes,
Timothy Bateman,
Adam T. Deller,
Ayushi Mandlik,
Angus Sutherland
Abstract:
Dedicated surveys using different detection pipelines are being carried out at multiple observatories to find more Fast Radio Bursts (FRBs). Understanding the efficiency of detection algorithms and the survey completeness function is important to enable unbiased estimation of the underlying FRB population properties. One method to achieve end-to-end testing of the system is by injecting mock FRBs in the live data-stream and searching for them blindly. Mock FRB injection is particularly effective for machine-learning-based classifiers, for which analytic characterisation is impractical. We describe a first-of-its-kind implementation of a real-time mock FRB injection system at the upgraded Molonglo Observatory Synthesis Telescope (UTMOST) and present our results for a set of 20,000 mock FRB injections. The injections have yielded clear insight into the detection efficiencies and have provided a survey completeness function for pulse width, fluence and DM. Mock FRBs are recovered with uniform efficiency over the full range of injected DMs; however, the recovery fraction is found to be a strong function of the width and signal-to-noise ratio (SNR). For low widths ($\lesssim 20$ ms) and high SNR ($\gtrsim$ 9) the recovery is highly effective, with recovery fractions exceeding 90%. We find that the presence of radio frequency interference causes the recovered SNR values to be systematically lower by up to 20% compared to the injected values. We find that wider FRBs become increasingly hard to recover for the machine-learning-based classifier employed at UTMOST. We encourage other observatories to implement live injection set-ups for similar testing of their surveys.
Submitted 19 November, 2020;
originally announced November 2020.
-
Stealthy hacking and secrecy of controlled state estimation systems with random dropouts
Authors:
Jingyi Lu,
Daniel Quevedo,
Vijay Gupta,
Subhrakanti Dey
Abstract:
We study the maximum information gain that an adversary may obtain through hacking without being detected. Consider a dynamical process observed by a sensor that transmits a local estimate of the system state to a remote estimator according to some reference transmission policy across a packet-dropping wireless channel equipped with acknowledgments (ACK). An adversary overhears the transmissions and proactively hijacks the sensor to reprogram its transmission policy. We define perfect secrecy as keeping the averaged expected error covariance bounded at the legitimate estimator and unbounded at the adversary. By analyzing the stationary distribution of the expected error covariance, we show that perfect secrecy can be attained for unstable systems only if the ACK channel has no packet dropouts. In other situations, we prove that independent of the reference policy and the detection methods, perfect secrecy is not attainable. For this scenario, we formulate a constrained Markov decision process to derive the optimal transmission policy that the adversary should implement at the sensor, and devise a Stackelberg game to derive the optimal reference policy for the legitimate estimator.
Submitted 7 November, 2020;
originally announced November 2020.
-
Voting Rights, Markov Chains, and Optimization by Short Bursts
Authors:
Sarah Cannon,
Ari Goldbloom-Helzner,
Varun Gupta,
JN Matthews,
Bhushan Suwal
Abstract:
Finding outlying elements in probability distributions can be a hard problem. Taking a real example from Voting Rights Act enforcement, we consider the problem of maximizing the number of simultaneous majority-minority districts in a political districting plan. An unbiased random walk on districting plans is unlikely to find plans that approach this maximum. A common search approach is to use a biased random walk: preferentially select districting plans with more majority-minority districts. Here, we present a third option, called short bursts, in which an unbiased random walk is performed for a small number of steps (called the burst length), then re-started from the most extreme plan that was encountered in the last burst. We give empirical evidence that short-burst runs outperform biased random walks for the problem of maximizing the number of majority-minority districts, and that there are many values of burst length for which we see this improvement. Abstracting from our use case, we also consider short bursts where the underlying state space is a line with various probability distributions, and then explore some features of more complicated state spaces and how these impact the effectiveness of short bursts.
Submitted 22 June, 2022; v1 submitted 23 October, 2020;
originally announced November 2020.
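The short-bursts procedure summarized in this abstract can be sketched on the simplest state space it mentions, a line. The following is a minimal illustration and not the authors' code: the names `short_bursts` and `line_step`, the choice to let the starting state of a burst count as a candidate, and the rule that ties go to the later state are all assumptions made for the example.

```python
import random


def short_bursts(score, propose, start, burst_length, num_bursts, rng):
    """Run unbiased bursts, restarting each one from the best state of the previous burst.

    score:   function mapping a state to a number to be maximized
    propose: function (state, rng) -> next state of the unbiased walk
    """
    current = start
    for _ in range(num_bursts):
        best = current   # assumption: the burst's starting state is a candidate
        state = current
        for _ in range(burst_length):
            state = propose(state, rng)       # unbiased step, always accepted
            if score(state) >= score(best):   # assumption: ties go to the later state
                best = state
        current = best   # restart the next burst from the most extreme state seen
    return current


def line_step(x, rng):
    """Unbiased simple random walk on the integers (hypothetical toy state space)."""
    return x + rng.choice((-1, 1))


if __name__ == "__main__":
    rng = random.Random(0)
    final = short_bursts(lambda x: x, line_step, start=0,
                         burst_length=10, num_bursts=50, rng=rng)
    print("final state after 50 bursts of length 10:", final)
```

Because each burst restarts from the best state it saw, the score of the restart point never decreases, even though every individual step is unbiased.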
-
BEAR: Sketching BFGS Algorithm for Ultra-High Dimensional Feature Selection in Sublinear Memory
Authors:
Amirali Aghazadeh,
Vipul Gupta,
Alex DeWeese,
O. Ozan Koyluoglu,
Kannan Ramchandran
Abstract:
We consider feature selection for applications in machine learning where the dimensionality of the data is so large that it exceeds the working memory of the (local) computing machine. Unfortunately, current large-scale sketching algorithms show a poor memory-accuracy trade-off due to the irreversible collision and accumulation of the stochastic gradient noise in the sketched domain. Here, we develop a second-order ultra-high dimensional feature selection algorithm, called BEAR, which avoids the extra collisions by storing the second-order gradients in the celebrated Broyden-Fletcher-Goldfarb-Shanno (BFGS) algorithm in Count Sketch, a sublinear memory data structure from the streaming literature. Experiments on real-world data sets demonstrate that BEAR requires up to three orders of magnitude less memory space to achieve the same classification accuracy compared to the first-order sketching algorithms. Theoretical analysis proves convergence of BEAR with rate O(1/t) in t iterations of the sketched algorithm. Our algorithm reveals an unexplored advantage of second-order optimization for memory-constrained sketching of models trained on ultra-high dimensional data sets.
Submitted 26 May, 2021; v1 submitted 26 October, 2020;
originally announced October 2020.
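For readers unfamiliar with Count Sketch, the data structure named in this abstract, a generic textbook-style version can be sketched as follows. This is not BEAR itself and says nothing about how the paper stores BFGS quantities; the class name, the hash family, and the parameters `depth`, `width`, and `seed` are assumptions chosen for illustration.

```python
import random
from statistics import median


class CountSketch:
    """Generic Count Sketch over non-negative integer keys (illustrative only)."""

    PRIME = 2_147_483_647  # Mersenne prime used by the toy hash family

    def __init__(self, depth, width, seed=0):
        rng = random.Random(seed)
        self.width = width
        # One (a, b) pair per row for the bucket hash and one for the sign hash.
        self.bucket_params = [(rng.randrange(1, self.PRIME), rng.randrange(self.PRIME))
                              for _ in range(depth)]
        self.sign_params = [(rng.randrange(1, self.PRIME), rng.randrange(self.PRIME))
                            for _ in range(depth)]
        self.table = [[0.0] * width for _ in range(depth)]

    def _bucket(self, row, key):
        a, b = self.bucket_params[row]
        return ((a * key + b) % self.PRIME) % self.width

    def _sign(self, row, key):
        a, b = self.sign_params[row]
        return 1 if ((a * key + b) % self.PRIME) % 2 == 0 else -1

    def update(self, key, value):
        """Add `value` to the coordinate `key` in every row."""
        for row in range(len(self.table)):
            self.table[row][self._bucket(row, key)] += self._sign(row, key) * value

    def estimate(self, key):
        """Median over rows of the signed bucket contents."""
        return median(self._sign(row, key) * self.table[row][self._bucket(row, key)]
                      for row in range(len(self.table)))


if __name__ == "__main__":
    sketch = CountSketch(depth=5, width=64)
    sketch.update(42, 3.5)
    sketch.update(42, -1.5)
    print(sketch.estimate(42))
```

The memory used is `depth * width` floats regardless of how many distinct keys are updated, which is the sense in which the structure is sublinear; collisions between keys are what introduce estimation error.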
-
Deep Clustering of Text Representations for Supervision-free Probing of Syntax
Authors:
Vikram Gupta,
Haoyue Shi,
Kevin Gimpel,
Mrinmaya Sachan
Abstract:
We explore deep clustering of text representations for unsupervised model interpretation and induction of syntax. As these representations are high-dimensional, out-of-the-box methods like KMeans do not work well. Thus, our approach jointly transforms the representations into a lower-dimensional cluster-friendly space and clusters them. We consider two notions of syntax: Part of speech Induction (POSI) and constituency labelling (CoLab) in this work. Interestingly, we find that Multilingual BERT (mBERT) contains a surprising amount of syntactic knowledge of English; possibly even as much as English BERT (EBERT). Our model can be used as a supervision-free probe which is arguably a less-biased way of probing. We find that unsupervised probes show benefits from higher layers as compared to supervised probes. We further note that our unsupervised probe utilizes EBERT and mBERT representations differently, especially for POSI. We validate the efficacy of our probe by demonstrating its capabilities as an unsupervised syntax induction technique. Our probe works well for both syntactic formalisms by simply adapting the input representations. We report competitive performance of our probe on 45-tag English POSI, state-of-the-art performance on 12-tag POSI across 10 languages, and competitive results on CoLab. We also perform zero-shot syntax induction on resource-impoverished languages and report strong results.
Submitted 1 December, 2021; v1 submitted 24 October, 2020;
originally announced October 2020.
-
Training Recommender Systems at Scale: Communication-Efficient Model and Data Parallelism
Authors:
Vipul Gupta,
Dhruv Choudhary,
Ping Tak Peter Tang,
Xiaohan Wei,
Xing Wang,
Yuzhen Huang,
Arun Kejariwal,
Kannan Ramchandran,
Michael W. Mahoney
Abstract:
In this paper, we consider hybrid parallelism -- a paradigm that employs both Data Parallelism (DP) and Model Parallelism (MP) -- to scale distributed training of large recommendation models. We propose a compression framework called Dynamic Communication Thresholding (DCT) for communication-efficient hybrid training. DCT filters the entities to be communicated across the network through a simple hard-thresholding function, allowing only the most relevant information to pass through. For communication-efficient DP, DCT compresses the parameter gradients sent to the parameter server during model synchronization. The threshold is updated only once every few thousand iterations to reduce the computational overhead of compression. For communication-efficient MP, DCT incorporates a novel technique to compress the activations and gradients sent across the network during the forward and backward propagation, respectively. This is done by identifying and updating only the most relevant neurons of the neural network for each training sample in the data. We evaluate DCT on publicly available natural language processing and recommender models and datasets, as well as recommendation systems used in production at Facebook. DCT reduces communication by at least $100\times$ and $20\times$ during DP and MP, respectively. The algorithm has been deployed in production, and it improves end-to-end training time for a state-of-the-art industrial recommender model by 37\%, without any loss in performance.
Submitted 21 May, 2021; v1 submitted 17 October, 2020;
originally announced October 2020.
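The hard-thresholding idea in this abstract, with a threshold that is refreshed only occasionally, might look roughly like the sketch below. The abstract does not say how the threshold is chosen, so the rule used here (the magnitude of the k-th largest entry, with k set by a hypothetical `keep_fraction`) is an assumption, as are the class name `ThresholdCompressor` and the parameter `update_every`.

```python
def magnitude_threshold(values, keep_fraction):
    """Magnitude of the k-th largest entry, where k = max(1, int(n * keep_fraction)).

    Assumption: a top-k magnitude rule; the abstract does not specify the rule.
    """
    k = max(1, int(len(values) * keep_fraction))
    return sorted((abs(v) for v in values), reverse=True)[k - 1]


def hard_threshold(values, threshold):
    """Zero out every entry whose magnitude is below the threshold."""
    return [v if abs(v) >= threshold else 0.0 for v in values]


class ThresholdCompressor:
    """Applies hard thresholding, recomputing the threshold only every `update_every` calls."""

    def __init__(self, keep_fraction, update_every):
        self.keep_fraction = keep_fraction
        self.update_every = update_every
        self.threshold = None
        self.step = 0

    def compress(self, values):
        # The sort inside magnitude_threshold is the expensive part, so it is amortized.
        if self.threshold is None or self.step % self.update_every == 0:
            self.threshold = magnitude_threshold(values, self.keep_fraction)
        self.step += 1
        return hard_threshold(values, self.threshold)


if __name__ == "__main__":
    compressor = ThresholdCompressor(keep_fraction=0.5, update_every=1000)
    print(compressor.compress([0.1, -2.0, 0.3, 1.5]))
    print(compressor.compress([5.0, 1.0, -1.6, 0.2]))  # reuses the stale threshold
```

Reusing a stale threshold trades some selection accuracy for speed: between refreshes the number of surviving entries can drift away from the target fraction, but each call costs only a single pass over the values.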
-
On Long-Tailed Phenomena in Neural Machine Translation
Authors:
Vikas Raunak,
Siddharth Dalmia,
Vivek Gupta,
Florian Metze
Abstract:
State-of-the-art Neural Machine Translation (NMT) models struggle with generating low-frequency tokens, tackling which remains a major challenge. The analysis of long-tailed phenomena in the context of structured prediction tasks is further hindered by the added complexities of search during inference. In this work, we quantitatively characterize such long-tailed phenomena at two levels of abstraction, namely, token classification and sequence generation. We propose a new loss function, the Anti-Focal loss, to better adapt model training to the structural dependencies of conditional text generation by incorporating the inductive biases of beam search in the training process. We show the efficacy of the proposed technique on a number of Machine Translation (MT) datasets, demonstrating that it leads to significant gains over cross-entropy across different language pairs, especially on the generation of low-frequency words. We have released the code to reproduce our results.
Submitted 10 October, 2020;
originally announced October 2020.
-
Deep Learning-Based Automatic Detection of Poorly Positioned Mammograms to Minimize Patient Return Visits for Repeat Imaging: A Real-World Application
Authors:
Vikash Gupta,
Clayton Taylor,
Sarah Bonnet,
Luciano M. Prevedello,
Jeffrey Hawley,
Richard D White,
Mona G Flores,
Barbaros Selnur Erdal
Abstract:
Screening mammograms are a routine imaging exam performed to detect breast cancer in its early stages to reduce morbidity and mortality attributed to this disease. In order to maximize the efficacy of breast cancer screening programs, proper mammographic positioning is paramount. Proper positioning ensures adequate visualization of breast tissue and is necessary for effective breast cancer detection. Therefore, breast-imaging radiologists must assess each mammogram for the adequacy of positioning before providing a final interpretation of the examination; this often necessitates return patient visits for additional imaging. In this paper, we propose a deep learning-based method that mimics and automates this decision-making process to identify poorly positioned mammograms. Our objective for this algorithm is to assist mammography technologists in recognizing inadequately positioned mammograms in real time, improving the quality of mammographic positioning and performance, and ultimately reducing repeat visits for patients with initially inadequate imaging. The proposed model showed a true positive rate for detecting correct positioning of 91.35% in the mediolateral oblique view and 95.11% in the craniocaudal view. In addition to these results, we also present an automatically generated report which can aid the mammography technologist in taking corrective measures during the patient visit.
Submitted 28 September, 2020;
originally announced September 2020.
-
Federated Learning for Breast Density Classification: A Real-World Implementation
Authors:
Holger R. Roth,
Ken Chang,
Praveer Singh,
Nir Neumark,
Wenqi Li,
Vikash Gupta,
Sharut Gupta,
Liangqiong Qu,
Alvin Ihsani,
Bernardo C. Bizzo,
Yuhong Wen,
Varun Buch,
Meesam Shah,
Felipe Kitamura,
Matheus Mendonça,
Vitor Lavor,
Ahmed Harouni,
Colin Compas,
Jesse Tetreault,
Prerna Dogra,
Yan Cheng,
Selnur Erdal,
Richard White,
Behrooz Hashemian,
Thomas Schultz
, et al. (18 additional authors not shown)
Abstract:
Building robust deep learning-based models requires large quantities of diverse training data. In this study, we investigate the use of federated learning (FL) to build medical imaging classification models in a real-world collaborative setting. Seven clinical institutions from across the world joined this FL effort to train a model for breast density classification based on Breast Imaging, Reporting & Data System (BI-RADS). We show that despite substantial differences among the datasets from all sites (mammography system, class distribution, and data set size) and without centralizing data, we can successfully train AI models in federation. The results show that models trained using FL perform on average 6.3% better than their counterparts trained on an institute's local data alone. Furthermore, we show a 45.8% relative improvement in the models' generalizability when evaluated on the other participating sites' testing data.
Submitted 20 October, 2020; v1 submitted 3 September, 2020;
originally announced September 2020.
-
Semantics Preserving Hierarchy based Retrieval of Indian heritage monuments
Authors:
Ronak Gupta,
Prerana Mukherjee,
Brejesh Lall,
Varshul Gupta
Abstract:
Monument classification can be performed on the basis of appearance and shape, from coarse to fine categories. However, monuments also carry much semantic information, reflected in the era in which they were built, their type or purpose, the dynasty that established them, etc. In particular, the Indian subcontinent exhibits a great deal of variation in architectural styles owing to its rich cultural heritage. In this paper, we propose a framework that utilizes hierarchy to preserve semantic information while performing image classification or image retrieval. We encode the learnt deep embeddings to construct a dictionary of images and then utilize a re-ranking framework on the retrieved results using DeLF features. The semantic information preserved in these embeddings helps to classify unknown monuments at a higher level of granularity in the hierarchy. We have curated a large, novel Indian heritage monuments dataset comprising images of historical, cultural and religious importance with subtypes of eras, dynasties and architectural styles. We demonstrate the performance of the proposed framework in image classification and retrieval tasks and compare it with other competing methods on this dataset.
Submitted 28 August, 2020;
originally announced August 2020.
-
Utility-based Resource Allocation and Pricing for Serverless Computing
Authors:
Vipul Gupta,
Soham Phade,
Thomas Courtade,
Kannan Ramchandran
Abstract:
Serverless computing platforms currently rely on basic pricing schemes that are static and do not reflect customer feedback. This leads to significant inefficiencies from a total utility perspective. As one of the fastest-growing cloud services, serverless computing provides an opportunity to better serve both users and providers through the incorporation of market-based strategies for pricing and resource allocation. With the help of utility functions to model the delay-sensitivity of customers, we propose a novel scheduler to allocate resources for serverless computing. The resulting resource allocation scheme is optimal in the sense that it maximizes the aggregate utility of all users across the system, thus maximizing social welfare. Our approach gives rise to a natural dynamic pricing scheme that is obtained by solving an optimization problem in its dual form. We further develop feedback mechanisms that allow the cloud provider to converge to optimal resource allocation, even when the users' utilities are private and unknown to the service provider. Simulations show that our approach can track market demand and achieve significantly higher social welfare (or, equivalently, cost savings for customers) compared to existing schemes.
Submitted 24 January, 2022; v1 submitted 18 August, 2020;
originally announced August 2020.
-
Implementing partisan symmetry: Problems and paradoxes
Authors:
Daryl DeFord,
Natasha Dhamankar,
Moon Duchin,
Varun Gupta,
Mackenzie McPike,
Gabe Schoenbach,
Ki Wan Sim
Abstract:
We consider the measures of partisan symmetry proposed for practical use in the political science literature, as clarified and developed in Katz, King, and Rosenblatt (2020). Elementary mathematical manipulation shows the symmetry metrics to have surprising properties that call their meaningfulness into question. To accompany the general analysis, we study measures of partisan symmetry with respect to recent voting patterns in Utah, Texas, and North Carolina, flagging problems in each case. Taken together, these observations should raise major concerns about the available techniques for quantitative scores of partisan symmetry -- including the mean-median score, the partisan bias score, and the more general "partisan symmetry standard" -- as the decennial redistricting begins.
Submitted 3 March, 2021; v1 submitted 16 August, 2020;
originally announced August 2020.
-
A magnetar parallax
Authors:
H. Ding,
A. T. Deller,
M. E. Lower,
C. Flynn,
S. Chatterjee,
W. Brisken,
N. Hurley-Walker,
F. Camilo,
J. Sarkissian,
V. Gupta
Abstract:
XTE J1810-197 (J1810) was the first magnetar identified to emit radio pulses, and has been extensively studied during a radio-bright phase in 2003$-$2008. It is estimated to be relatively nearby compared to other Galactic magnetars, and provides a useful prototype for the physics of high magnetic fields, magnetar velocities, and the plausible connection to extragalactic fast radio bursts. Upon the re-brightening of the magnetar at radio wavelengths in late 2018, we resumed an astrometric campaign on J1810 with the Very Long Baseline Array, and sampled 14 new positions of J1810 over 1.3 years. The phase calibration for the new observations was performed with two phase calibrators that are quasi-colinear on the sky with J1810, enabling substantial improvement of the resultant astrometric precision. Combining our new observations with two archival observations from 2006, we have refined the proper motion and reference position of the magnetar and have measured its annual geometric parallax, the first such measurement for a magnetar. The parallax of $0.40\pm0.05\,$mas corresponds to a most probable distance $2.5^{+0.4}_{-0.3}\,$kpc for J1810. Our new astrometric results confirm an unremarkable transverse peculiar velocity of $\approx200\,\mathrm{km~s^{-1}}$ for J1810, which is only at the average level among the pulsar population. The magnetar proper motion vector points back to the central region of a supernova remnant (SNR) at a compatible distance at $\approx70\,$kyr ago, but a direct association is disfavored by the estimated SNR age of ~3 kyr.
Submitted 14 August, 2020;
originally announced August 2020.
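The parallax and distance quoted in the XTE J1810-197 abstract above can be sanity-checked with the textbook relation $d\,[\mathrm{kpc}] = 1/\varpi\,[\mathrm{mas}]$. The sketch below is only a naive inversion of the quoted $0.40\pm0.05\,$mas; the function name is hypothetical, and the abstract's "most probable distance" may come from a fuller probabilistic treatment that this does not reproduce.

```python
def parallax_to_distance_kpc(parallax_mas: float) -> float:
    """Naive inversion: distance in kpc is 1 / (parallax in mas)."""
    if parallax_mas <= 0:
        raise ValueError("parallax must be positive for a naive inversion")
    return 1.0 / parallax_mas


# Values quoted in the abstract: 0.40 +/- 0.05 mas.
central = parallax_to_distance_kpc(0.40)
far = parallax_to_distance_kpc(0.40 - 0.05)   # smaller parallax -> larger distance
near = parallax_to_distance_kpc(0.40 + 0.05)  # larger parallax -> smaller distance

upper_err = far - central
lower_err = central - near
print(f"d = {central:.1f} +{upper_err:.1f} -{lower_err:.1f} kpc")
```

The asymmetric error bars follow from the reciprocal: a symmetric uncertainty in parallax maps to a larger uncertainty on the far side than on the near side.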
-
Optical Identification of Materials Transformations in Oxide Thin Films
Authors:
Duncan R. Sutherland,
Aine Boyer Connolly,
Maximilian Amsler,
Ming-Chiang Chang,
Katie Rose Gann,
Vidit Gupta,
Sebastian Ament,
Dan Guevarra,
John M. Gregoire,
Carla P. Gomes,
R. B. van Dover,
Michael O. Thompson
Abstract:
Recent advances in high-throughput experimentation for combinatorial studies have accelerated the discovery and analysis of materials across a wide range of compositions and synthesis conditions. However, many of the more powerful characterization methods are limited by speed, cost, availability, and/or resolution. To make efficient use of these methods, there is value in developing approaches for identifying critical compositions and conditions to be used as a priori knowledge for follow-up characterization with high-precision techniques, such as micron-scale synchrotron based X-ray diffraction (XRD). Here we demonstrate the use of optical microscopy and reflectance spectroscopy to identify likely phase-change boundaries in thin film libraries. These methods are used to delineate possible metastable phase boundaries following lateral-gradient Laser Spike Annealing (lg-LSA) of oxide materials. The set of boundaries is then compared with definitive determinations of structural transformations obtained using high-resolution XRD. We demonstrate that the optical methods detect more than 95% of the structural transformations in a composition-gradient La-Mn-O library and a Ga$_2$O$_3$ sample, both subject to an extensive set of lg-LSA anneals. Our results provide quantitative support for the value of optically-detected transformations as a priori data to guide subsequent structural characterization, ultimately accelerating and enhancing the efficient implementation of $μ$m-resolution XRD experiments.
Submitted 14 August, 2020;
originally announced August 2020.
-
On closed Lie ideals and center of generalized group algebras
Authors:
Ved Prakash Gupta,
Ranjana Jain,
Bharat Talwar
Abstract:
For any locally compact group $G$ and any Banach algebra $A$, a characterization of the closed Lie ideals of the generalized group algebra $L^1(G,A)$ is obtained in terms of left and right actions by $G$ and $A$. In addition, when $A$ is unital and $G$ is an ${\bf [SIN]}$ group, we show that the center of $L^1(G,A)$ is precisely the collection of all center valued functions which are constant on the conjugacy classes of $G$. As an application, we establish that $\mathcal{Z}(L^1(G) \otimes^γ A)= \mathcal{Z}(L^1(G)) \otimes^γ \mathcal{Z}(A)$, for a class of groups and Banach algebras. And, prior to these, for any finite group $G$, the Lie ideals of the group algebra $\mathbb{C}[G]$ are identified in terms of some canonical spaces determined by the irreducible characters of $G$.
Submitted 13 August, 2020;
originally announced August 2020.
-
Artificial Intelligence to Assist in Exclusion of Coronary Atherosclerosis during CCTA Evaluation of Chest-Pain in the Emergency Department: Preparing an Application for Real-World Use
Authors:
Richard D. White,
Barbaros S. Erdal,
Mutlu Demirer,
Vikash Gupta,
Matthew T. Bigelow,
Engin Dikici,
Sema Candemir,
Mauricio S. Galizia,
Jessica L. Carpenter,
Thomas P. O Donnell,
Abdul H. Halabi,
Luciano M. Prevedello
Abstract:
Coronary Computed Tomography Angiography (CCTA) evaluation of chest-pain patients in an Emergency Department (ED) is considered appropriate. While a negative CCTA interpretation supports direct patient discharge from an ED, labor-intensive analyses are required, with accuracy in jeopardy from distractions. We describe the development of an Artificial Intelligence (AI) algorithm and workflow for assisting interpreting physicians in CCTA screening for the absence of coronary atherosclerosis. The two-phase approach consisted of (1) Phase 1 - focused on the development and preliminary testing of an algorithm for vessel-centerline extraction classification in a balanced study population (n = 500 with 50% disease prevalence) derived by retrospective random case selection; and (2) Phase 2 - concerned with simulated-clinical Trialing of the developed algorithm on a per-case basis in a more real-world study population (n = 100 with 28% disease prevalence) from an ED chest-pain series. This allowed pre-deployment evaluation of the AI-based CCTA screening application which provides a vessel-by-vessel graphic display of algorithm inference results integrated into a clinically capable viewer. Algorithm performance evaluation used Area Under the Receiver-Operating-Characteristic Curve (AUC-ROC); confusion matrices reflected ground-truth vs AI determinations. The vessel-based algorithm demonstrated strong performance with AUC-ROC = 0.96. In both Phase 1 and Phase 2, independent of disease prevalence differences, negative predictive values at the case level were very high at 95%. The rate of completion of the algorithm workflow process (96% with inference results in 55-80 seconds) in Phase 2 depended on adequate image quality. There is potential for this AI application to assist in CCTA interpretation to help extricate atherosclerosis from chest-pain presentations.
Submitted 10 August, 2020;
originally announced August 2020.
-
On BPS Strings in ${\mathcal N}=4$ Yang-Mills Theory
Authors:
Sujay K. Ashok,
Varun Gupta,
Nemani V. Suryanarayana
Abstract:
We study singular time-dependent $\frac{1}{8}$-BPS configurations in the abelian sector of ${{\mathcal N}= 4}$ supersymmetric Yang-Mills theory that represent BPS string-like defects in ${{\mathbb R}\times S^3}$ spacetime. Such BPS strings can be described as intersections of the zeros of holomorphic functions in two complex variables with a 3-sphere. We argue that these BPS strings map to $\frac{1}{8}$-BPS surface operators under the state-operator correspondence of the CFT. We show that the string defects are holographically dual to noncompact probe D3-branes in global $AdS_5\times S^5$ that share supersymmetries with a class of dual-giant gravitons. For simple configurations, we demonstrate how to define a good variational problem and propose a regularization scheme that leads to finite energy and global charges on both sides of the holographic correspondence.
Submitted 22 November, 2020; v1 submitted 3 August, 2020;
originally announced August 2020.
-
Self-supervised learning through the eyes of a child
Authors:
A. Emin Orhan,
Vaibhav V. Gupta,
Brenden M. Lake
Abstract:
Within months of birth, children develop meaningful expectations about the world around them. How much of this early knowledge can be explained through generic learning mechanisms applied to sensory data, and how much of it requires more substantive innate inductive biases? Addressing this fundamental question in its full generality is currently infeasible, but we can hope to make real progress in more narrowly defined domains, such as the development of high-level visual categories, thanks to improvements in data collecting technology and recent progress in deep learning. In this paper, our goal is precisely to achieve such progress by utilizing modern self-supervised deep learning methods and a recent longitudinal, egocentric video dataset recorded from the perspective of three young children (Sullivan et al., 2020). Our results demonstrate the emergence of powerful, high-level visual representations from developmentally realistic natural videos using generic self-supervised learning objectives.
Submitted 15 December, 2020; v1 submitted 31 July, 2020;
originally announced July 2020.
-
An Insurance Contract Design to Boost Storage Participation in the Electricity Market
Authors:
Nayara Aguiar,
Vijay Gupta
Abstract:
Energy storage technologies are key to improving grid flexibility in the presence of increasing amounts of intermittent renewable generation. We propose an insurance contract that suitably compensates energy storage systems for providing flexibility. Such a contract provides a wider range of market opportunities for these systems while also incentivizing higher renewable penetration in the grid. We consider a day-ahead market in which generators, including renewables and storage owners, bid to be scheduled for the next operating day. Due to production uncertainty, renewable generators may be unable to meet their day-ahead production schedule, and thus be subject to a penalty. As a hedge against these penalties, we propose an insurance contract between a renewable producer and a storage owner, in which the storage reserves some energy to be used in case of renewable shortfalls. We show that such a contract incentivizes the renewable player to bid higher, thus increasing renewable participation in the electricity mix. It also provides an extra source of revenue for storage owners that may not be profitable with a purely arbitrage-based strategy in the day-ahead market. Further, we prove this contract is economically beneficial for both players. We validate our analysis through two case studies.
Submitted 29 July, 2020;
originally announced July 2020.
-
Radiation Damage Study of SensL J-Series Silicon Photomultipliers Using 101.4 MeV Protons
Authors:
Alexei Ulyanov,
David Murphy,
Joseph Mangan,
Viyas Gupta,
Wojciech Hajdas,
Daithi de Faoite,
Brian Shortt,
Lorraine Hanlon,
Sheila McBreen
Abstract:
Radiation damage of J-series silicon photomultipliers (SiPMs) has been studied in the context of using these photodetectors in future space-borne scintillation detectors. Several SiPM samples were exposed to 101.4 MeV protons, with 1 MeV neutron equivalent fluence ranging from 1.27*10^8 n/cm^2 to 1.23*10^10 n/cm^2. After the irradiation, the SiPMs experienced a large increase in the dark current and noise, which may pose problems for long-running space missions in terms of power consumption, thermal control and detection of low-energy events. Measurements performed with a CeBr3 scintillator crystal showed that after exposure to 1.23*10^10 n/cm^2 and following room-temperature annealing, the dark noise of a single 6 mm square SiPM at room temperature increased from 0.1 keV to 2 keV. Because of the large SiPM noise, the gamma-ray detection threshold increased to approximately 20 keV for a CeBr3 detector using a 4-SiPM array and 40 keV for a detector using a 16-SiPM array. Only a small effect of the proton irradiation on the average detector signal was observed, suggesting little or no change to the SiPM gain and photon detection efficiency.
Submitted 21 July, 2020;
originally announced July 2020.
-
On the Complexity of Sequential Incentive Design
Authors:
Yagiz Savas,
Vijay Gupta,
Ufuk Topcu
Abstract:
In many scenarios, a principal dynamically interacts with an agent and offers a sequence of incentives to align the agent's behavior with a desired objective. This paper focuses on the problem of synthesizing an incentive sequence that, once offered, induces the desired agent behavior even when the agent's intrinsic motivation is unknown to the principal. We model the agent's behavior as a Markov decision process, express its intrinsic motivation as a reward function, which belongs to a finite set of possible reward functions, and consider the incentives as additional rewards offered to the agent. We first show that the behavior modification problem (BMP), i.e., the problem of synthesizing an incentive sequence that induces a desired agent behavior at minimum total cost to the principal, is PSPACE-hard. Moreover, we show that by imposing certain restrictions on the incentive sequences available to the principal, one can obtain two NP-complete variants of the BMP. We also provide a sufficient condition on the set of possible reward functions under which the BMP can be solved via linear programming. Finally, we propose two algorithms to compute globally and locally optimal solutions to the NP-complete variants of the BMP.
Submitted 16 July, 2020;
originally announced July 2020.
-
Handling Variable-Dimensional Time Series with Graph Neural Networks
Authors:
Vibhor Gupta,
Jyoti Narwariya,
Pankaj Malhotra,
Lovekesh Vig,
Gautam Shroff
Abstract:
Several applications of Internet of Things (IoT) technology involve capturing data from multiple sensors resulting in multi-sensor time series. Existing neural networks based approaches for such multi-sensor or multivariate time series modeling assume fixed input dimension or number of sensors. Such approaches can struggle in the practical setting where different instances of the same device or equipment such as mobiles, wearables, engines, etc. come with different combinations of installed sensors. We consider training neural network models from such multi-sensor time series, where the time series have varying input dimensionality owing to availability or installation of a different subset of sensors at each source of time series. We propose a novel neural network architecture suitable for zero-shot transfer learning allowing robust inference for multivariate time series with previously unseen combination of available dimensions or sensors at test time. Such a combinatorial generalization is achieved by conditioning the layers of a core neural network-based time series model with a "conditioning vector" that carries information of the available combination of sensors for each time series. This conditioning vector is obtained by summarizing the set of learned "sensor embedding vectors" corresponding to the available sensors in a time series via a graph neural network. We evaluate the proposed approach on publicly available activity recognition and equipment prognostics datasets, and show that the proposed approach allows for better generalization in comparison to a deep gated recurrent neural network baseline.
Submitted 20 July, 2020; v1 submitted 1 July, 2020;
originally announced July 2020.
-
Generative models for sampling and phase transition indication in spin systems
Authors:
Japneet Singh,
Vipul Arora,
Vinay Gupta,
Mathias S. Scheurer
Abstract:
Recently, generative machine-learning models have gained popularity in physics, driven by the goal of improving the efficiency of Markov chain Monte Carlo techniques and of exploring their potential in capturing experimental data distributions. Motivated by their ability to generate images that look realistic to the human eye, we here study generative adversarial networks (GANs) as tools to learn the distribution of spin configurations and to generate samples, conditioned on external tuning parameters, such as temperature. We propose ways to efficiently represent the physical states, e.g., by exploiting symmetries, and to minimize the correlations between generated samples. We present a detailed evaluation of the various modifications, using the two-dimensional XY model as an example, and find considerable improvements in our proposed implicit generative model. It is also shown that the model can reliably generate samples in the vicinity of the phase transition, even when it has not been trained in the critical region. On top of using the samples generated by the model to capture the phase transition via evaluation of observables, we show how the model itself can be employed as an unsupervised indicator of transitions, by constructing measures of the model's susceptibility to changes in tuning parameters.
Submitted 21 June, 2020;
originally announced June 2020.
-
Modeling Implicit Communities using Spatio-Temporal Point Processes from Geo-tagged Event Traces
Authors:
Ankita Likhyani,
Vinayak Gupta,
Srijith P. K.,
Deepak P.,
Srikanta Bedathur
Abstract:
The location check-ins of users through various location-based services such as Foursquare, Twitter, and Facebook Places, etc., generate large traces of geo-tagged events. These event-traces often manifest in hidden (possibly overlapping) communities of users with similar interests. Inferring these implicit communities is crucial for forming user profiles for improvements in recommendation and prediction tasks. Given only time-stamped geo-tagged traces of users, can we find out these implicit communities, and characteristics of the underlying influence network? Can we use this network to improve the next location prediction task? In this paper, we focus on the problem of community detection as well as capturing the underlying diffusion process and propose a model COLAB based on Spatio-temporal point processes in continuous time but discrete space of locations that simultaneously models the implicit communities of users based on their check-in activities, without making use of their social network connections. COLAB captures the semantic features of the location, user-to-user influence along with spatial and temporal preferences of users. To learn the latent community of users and model parameters, we propose an algorithm based on stochastic variational inference. To the best of our knowledge, this is the first attempt at jointly modeling the diffusion process with activity-driven implicit communities. We demonstrate COLAB achieves up to 27% improvements in location prediction task over recent deep point-process based methods on geo-tagged event traces collected from Foursquare check-ins.
Submitted 13 June, 2020;
originally announced June 2020.
-
SiEVE: Semantically Encoded Video Analytics on Edge and Cloud
Authors:
Tarek Elgamal,
Shu Shi,
Varun Gupta,
Rittwik Jana,
Klara Nahrstedt
Abstract:
Recent advances in computer vision and neural networks have made it possible for more surveillance videos to be automatically searched and analyzed by algorithms rather than humans. This happened in parallel with advances in edge computing where videos are analyzed over hierarchical clusters that contain edge devices, close to the video source. However, the current video analysis pipeline has several disadvantages when dealing with such advances. For example, video encoders have been designed for a long time to please human viewers and be agnostic of the downstream analysis task (e.g., object detection). Moreover, most of the video analytics systems leverage 2-tier architecture where the encoded video is sent to either a remote cloud or a private edge server but does not efficiently leverage both of them. In response to these advances, we present SIEVE, a 3-tier video analytics system to reduce the latency and increase the throughput of analytics over video streams. In SIEVE, we present a novel technique to detect objects in compressed video streams. We refer to this technique as semantic video encoding because it allows video encoders to be aware of the semantics of the downstream task (e.g., object detection). Our results show that by leveraging semantic video encoding, we achieve close to 100% object detection accuracy with decompressing only 3.5% of the video frames which results in more than 100x speedup compared to classical approaches that decompress every video frame.
Submitted 1 June, 2020;
originally announced June 2020.
-
Renewable Power Trades and Network Congestion Externalities
Authors:
Nayara Aguiar,
Indraneel Chakraborty,
Vijay Gupta
Abstract:
Integrating renewable energy production into the electricity grid is an important policy goal to address climate change. However, such an integration faces economic and technological challenges. As power generation by renewable sources increases, power transmission patterns over the electric grid change. Due to physical laws, these new transmission patterns lead to non-intuitive grid congestion externalities. We derive the conditions under which negative network externalities due to power trades occur. Calibration using a stylized framework and data from Europe shows that each additional unit of power traded between northern and western Europe reduces transmission capacity for the southern and eastern regions by 27% per unit traded. Such externalities suggest that new investments in the electric grid infrastructure cannot be made piecemeal. In our example, power infrastructure investment in northern and western Europe needs an accompanying investment in southern and eastern Europe as well. An economic challenge is that regions facing externalities do not always have the financial ability to invest in infrastructure. Power transit fares can help finance power infrastructure investment in regions facing network congestion externalities. The resulting investment in the overall electricity grid facilitates integration of renewable energy production.
Submitted 14 January, 2021; v1 submitted 28 May, 2020;
originally announced June 2020.
-
P-SIF: Document Embeddings Using Partition Averaging
Authors:
Vivek Gupta,
Ankit Saw,
Pegah Nokhiz,
Praneeth Netrapalli,
Piyush Rai,
Partha Talukdar
Abstract:
Simple weighted averaging of word vectors often yields effective representations for sentences which outperform sophisticated seq2seq neural models in many tasks. While it is desirable to use the same method to represent documents as well, unfortunately, the effectiveness is lost when representing long documents involving multiple sentences. One of the key reasons is that a longer document is likely to contain words from many different topics; hence, creating a single vector while ignoring all the topical structure is unlikely to yield an effective document representation. This problem is less acute in single sentences and other short text fragments where the presence of a single topic is most likely. To alleviate this problem, we present P-SIF, a partitioned word averaging model to represent long documents. P-SIF retains the simplicity of simple weighted word averaging while taking a document's topical structure into account. In particular, P-SIF learns topic-specific vectors from a document and finally concatenates them all to represent the overall document. We provide theoretical justifications on the correctness of P-SIF. Through a comprehensive set of experiments, we demonstrate P-SIF's effectiveness compared to simple weighted averaging and many other baselines.
Submitted 18 May, 2020;
originally announced May 2020.
-
A Large-Scale, Open-Domain, Mixed-Interface Dialogue-Based ITS for STEM
Authors:
Iulian Vlad Serban,
Varun Gupta,
Ekaterina Kochmar,
Dung D. Vu,
Robert Belfer,
Joelle Pineau,
Aaron Courville,
Laurent Charlin,
Yoshua Bengio
Abstract:
We present Korbit, a large-scale, open-domain, mixed-interface, dialogue-based intelligent tutoring system (ITS). Korbit uses machine learning, natural language processing and reinforcement learning to provide interactive, personalized learning online. Korbit has been designed to easily scale to thousands of subjects, by automating, standardizing and simplifying the content creation process. Unlike other ITS, a teacher can develop new learning modules for Korbit in a matter of hours. To facilitate learning across a wide range of STEM subjects, Korbit uses a mixed-interface, which includes videos, interactive dialogue-based exercises, question-answering, conceptual diagrams, mathematical exercises and gamification elements. Korbit has been built to scale to millions of students, by utilizing a state-of-the-art cloud-based micro-service architecture. Korbit launched its first course in 2019 on machine learning, and since then over 7,000 students have enrolled. Although Korbit was designed to be open-domain and highly scalable, A/B testing experiments with real-world students demonstrate that both student learning outcomes and student motivation are substantially improved compared to typical online courses.
Submitted 5 May, 2020;
originally announced May 2020.
-
INFOTABS: Inference on Tables as Semi-structured Data
Authors:
Vivek Gupta,
Maitrey Mehta,
Pegah Nokhiz,
Vivek Srikumar
Abstract:
In this paper, we observe that semi-structured tabulated text is ubiquitous; understanding it requires not only comprehending the meaning of text fragments, but also implicit relationships between them. We argue that such data can serve as a testing ground for understanding how we reason about information. To study this, we introduce a new dataset called INFOTABS, comprising human-written textual hypotheses based on premises that are tables extracted from Wikipedia info-boxes. Our analysis shows that the semi-structured, multi-domain and heterogeneous nature of the premises admits complex, multi-faceted reasoning. Experiments reveal that, while human annotators agree on the relationships between a table-hypothesis pair, several standard modeling strategies are unsuccessful at the task, suggesting that reasoning about tables can pose a difficult modeling challenge.
Submitted 12 May, 2020;
originally announced May 2020.
-
Automated Personalized Feedback Improves Learning Gains in an Intelligent Tutoring System
Authors:
Ekaterina Kochmar,
Dung Do Vu,
Robert Belfer,
Varun Gupta,
Iulian Vlad Serban,
Joelle Pineau
Abstract:
We investigate how automated, data-driven, personalized feedback in a large-scale intelligent tutoring system (ITS) improves student learning outcomes. We propose a machine learning approach to generate personalized feedback, which takes individual needs of students into account. We utilize state-of-the-art machine learning and natural language processing techniques to provide the students with personalized hints, Wikipedia-based explanations, and mathematical hints. Our model is used in Korbit, a large-scale dialogue-based ITS with thousands of students launched in 2019, and we demonstrate that the personalized feedback leads to considerable improvement in student learning outcomes and in the subjective evaluation of the feedback.
Submitted 7 May, 2020; v1 submitted 5 May, 2020;
originally announced May 2020.
-
Lattice of intermediate subalgebras
Authors:
Keshab Chandra Bakshi,
Ved Prakash Gupta
Abstract:
Analogous to subfactor theory, employing Watatani's notions of index and $C^*$-basic construction of certain inclusions of $C^*$-algebras, (a) we develop a Fourier theory (consisting of Fourier transforms, rotation maps and shift operators) on the relative commutants of any inclusion of simple unital $C^*$-algebras with finite Watatani index, and (b) we introduce the notions of interior and exterior angles between intermediate $C^*$-subalgebras of any inclusion of unital $C^*$-algebras admitting a finite index conditional expectation. Then, on the lines of [2], we apply these concepts to obtain a bound for the cardinality of the lattice of intermediate $C^*$-subalgebras of any irreducible inclusion as in (a), and improve Longo's bound for the cardinality of intermediate subfactors of an inclusion of type $III$ factors with finite index. Moreover, we also show that for a fairly large class of inclusions of finite von Neumann algebras, the lattice of intermediate von Neumann subalgebras is always finite.
Submitted 9 May, 2020; v1 submitted 3 May, 2020;
originally announced May 2020.
-
Detecting and Characterizing Extremist Reviewer Groups in Online Product Reviews
Authors:
Viresh Gupta,
Aayush Aggarwal,
Tanmoy Chakraborty
Abstract:
Online marketplaces often witness opinion spam in the form of reviews. People are often hired to target specific brands for promoting or impeding them by writing highly positive or negative reviews. This often is done collectively in groups. Although some previous studies attempted to identify and analyze such opinion spam groups, little has been explored to spot those groups who target a brand as a whole, instead of just products.
In this paper, we collected reviews from the Amazon product review site and manually labelled a set of 923 candidate reviewer groups. The groups are extracted using frequent itemset mining over brand similarities such that users are clustered together if they have mutually reviewed (products of) a lot of brands. We hypothesize that the nature of the reviewer groups is dependent on 8 features specific to a (group, brand) pair. We develop a feature-based supervised model to classify candidate groups as extremist entities. We run multiple classifiers for the task of classifying a group based on the reviews written by the users of that group, to determine if the group shows signs of extremity. A 3-layer Perceptron based classifier turns out to be the best classifier. We further study the behaviours of such groups in detail to understand the dynamics of brand-level opinion fraud better. These behaviours include consistency in ratings, review sentiment, verified purchase, review dates and helpful votes received on reviews. Surprisingly, we observe that there are a lot of verified reviewers showing extreme sentiment, which on further investigation leads to ways to circumvent existing mechanisms in place to prevent unofficial incentives on Amazon.
Submitted 13 April, 2020;
originally announced April 2020.
-
DeepSumm -- Deep Code Summaries using Neural Transformer Architecture
Authors:
Vivek Gupta
Abstract:
Source code summarizing is the task of writing short, natural language descriptions of source code behavior during run time. Such summaries are extremely useful for software development and maintenance but are expensive to author manually; hence this is done for only a small fraction of the code that is produced and is often ignored. Automatic code documentation can possibly solve this at a low cost. This is thus an emerging research field with further applications to program comprehension and software maintenance. Traditional methods often relied on cognitive models that were built in the form of templates and by heuristics, and had varying degrees of adoption by the developer community. But with recent advancements, end-to-end data-driven approaches based on neural techniques have largely overtaken the traditional techniques. Much of the current landscape employs neural translation based architectures with recurrence and attention, which make for a resource- and time-intensive training procedure. In this paper, we employ neural techniques to solve the task of source code summarizing and specifically compare NMT based techniques to the simpler and more appealing Transformer architecture on a dataset of Java methods and comments. We bring forth an argument to dispense with the need for recurrence in the training procedure. To the best of our knowledge, transformer based models have not been used for the task before. With supervised samples of more than 2.1m comments and code, we reduce the training time by more than 50% and achieve a BLEU score of 17.99 on the test set of examples.
Submitted 31 March, 2020;
originally announced April 2020.
-
Size-Stretched Exponential Relaxation in a Model with Arrested States
Authors:
Vaibhav Gupta,
Saroj Kumar Nandi,
Mustansir Barma
Abstract:
We study the effect of a rapid quench to zero temperature in a model with competing interactions, evolving through conserved spin dynamics. In a certain regime of model parameters, we find that the model belongs to the broader class of kinetically constrained models; however, the dynamics is different from that of a glass. The system shows stretched exponential relaxation with the unusual feature that the relaxation time diverges as a power of the system size. Explicitly, we find that the spatial correlation function decays as $\exp(-2r/\sqrt{L})$ as a function of spatial separation $r$ in a system with $L$ sites in steady state, while the temporal auto-correlation function follows $\exp(-(t/\tau_L)^{1/2})$, where $t$ is the time and $\tau_L$ is proportional to $L$. In the coarsening regime, after time $t_w$, there are two growing length scales, namely $\mathcal{L}(t_w) \sim t_w^{1/2}$ and $\mathcal{R}(t_w) \sim t_w^{1/4}$; the spatial correlation function decays as $\exp(-r/ \mathcal{R}(t_w))$. Interestingly, the stretched exponential form of the auto-correlation function of a single typical sample in steady state differs markedly from that averaged over an ensemble of initial conditions resulting from different quenches; the latter shows a slow power law decay at large times.
Submitted 8 August, 2020; v1 submitted 1 April, 2020;
originally announced April 2020.
-
The UTMOST pulsar timing programme II: Timing noise across the pulsar population
Authors:
Marcus E. Lower,
Matthew Bailes,
Ryan M. Shannon,
Simon Johnston,
Chris Flynn,
Stefan Osłowski,
Vivek Gupta,
Wael Farah,
Timothy Bateman,
Anne J. Green,
Richard Hunstead,
Andrew Jameson,
Fabian Jankowski,
Aditya Parthasarathy,
Daniel C. Price,
Angus Sutherland,
David Temby,
Vivek Venkatraman Krishnan
Abstract:
While pulsars possess exceptional rotational stability, large scale timing studies have revealed at least two distinct types of irregularities in their rotation: red timing noise and glitches. Using modern Bayesian techniques, we investigated the timing noise properties of 300 bright southern-sky radio pulsars that have been observed over 1.0-4.8 years by the upgraded Molonglo Observatory Synthesis Telescope (MOST). We reanalysed the spin and spin-down changes associated with nine previously reported pulsar glitches, and report the discovery of three new glitches and four unusual glitch-like events in the rotational evolution of PSR J1825$-$0935. We develop a refined Bayesian framework for determining how red noise strength scales with pulsar spin frequency ($\nu$) and spin-down frequency ($\dot{\nu}$), which we apply to a sample of 280 non-recycled pulsars. With this new method and a simple power-law scaling relation, we show that red noise strength scales across the non-recycled pulsar population as $\nu^{a} |\dot{\nu}|^{b}$, where $a = -0.84^{+0.47}_{-0.49}$ and $b = 0.97^{+0.16}_{-0.19}$. This method can be easily adapted to utilise more complex, astrophysically motivated red noise models. Lastly, we highlight our timing of the double neutron star PSR J0737$-$3039, and the rediscovery of a bright radio pulsar originally found during the first Molonglo pulsar surveys with an incorrectly catalogued position.
Submitted 27 February, 2020;
originally announced February 2020.
-
Back to the Future: Joint Aware Temporal Deep Learning 3D Human Pose Estimation
Authors:
Vikas Gupta
Abstract:
We propose a new deep learning network that introduces a deeper CNN channel filter and constraints as losses to reduce joint position and motion errors for 3D video human body pose estimation. Our model outperforms the previous best result from the literature based on mean per-joint position error, velocity error, and acceleration error on the Human 3.6M benchmark, corresponding to a new state-of-the-art mean error reduction in all protocols and motion metrics. Mean per-joint error is reduced by 1%, velocity error by 7% and acceleration error by 13% compared to the best results from the literature. Our contribution, which increases positional accuracy and motion smoothness in video, can be integrated with future end-to-end networks without increasing network complexity. Our model and code are available at https://vnmr.github.io/
Keywords: 3D, human, image, pose, action, detection, object, video, visual, supervised, joint, kinematic
Submitted 22 February, 2020;
originally announced February 2020.
-
Scalable Second Order Optimization for Deep Learning
Authors:
Rohan Anil,
Vineet Gupta,
Tomer Koren,
Kevin Regan,
Yoram Singer
Abstract:
Optimization in machine learning, both theoretical and applied, is presently dominated by first-order gradient methods such as stochastic gradient descent. Second-order optimization methods, which involve second derivatives and/or second order statistics of the data, are far less prevalent despite strong theoretical properties, due to their prohibitive computation, memory and communication costs. In an attempt to bridge this gap between theoretical and practical optimization, we present a scalable implementation of a second-order preconditioned method (concretely, a variant of full-matrix Adagrad) that, along with several critical algorithmic and numerical improvements, provides significant convergence and wall-clock time improvements compared to conventional first-order methods on state-of-the-art deep models. Our novel design effectively utilizes the prevalent heterogeneous hardware architecture for training deep models, consisting of a multicore CPU coupled with multiple accelerator units. We demonstrate superior performance compared to state-of-the-art on very large learning tasks such as machine translation with Transformers, language modeling with BERT, click-through rate prediction on Criteo, and image classification on ImageNet with ResNet-50.
Submitted 5 March, 2021; v1 submitted 20 February, 2020;
originally announced February 2020.
-
Serverless Straggler Mitigation using Local Error-Correcting Codes
Authors:
Vipul Gupta,
Dominic Carrano,
Yaoqing Yang,
Vaishaal Shankar,
Thomas Courtade,
Kannan Ramchandran
Abstract:
Inexpensive cloud services, such as serverless computing, are often vulnerable to straggling nodes that increase end-to-end latency for distributed computation. We propose and implement simple yet principled approaches for straggler mitigation in serverless systems for matrix multiplication and evaluate them on several common applications from machine learning and high-performance computing. The proposed schemes are inspired by error-correcting codes and employ parallel encoding and decoding over the data stored in the cloud using serverless workers. This creates a fully distributed computing framework without using a master node to conduct encoding or decoding, which removes the computation, communication and storage bottleneck at the master. On the theory side, we establish that our proposed scheme is asymptotically optimal in terms of decoding time and provide a lower bound on the number of stragglers it can tolerate with high probability. Through extensive experiments, we show that our scheme outperforms existing schemes such as speculative execution and other coding theoretic methods by at least 25%.
Submitted 21 January, 2020;
originally announced January 2020.
-
A Game-Theoretic Approach to a Task Delegation Problem
Authors:
Donya G. Dobakhshari,
Lav R. Varshney,
Vijay Gupta
Abstract:
We study a setting in which a principal selects an agent to execute a collection of tasks according to a specified priority sequence. Agents, however, have their own individual priority sequences according to which they wish to execute the tasks. There is information asymmetry since each priority sequence is private knowledge for the individual agent. We design a mechanism for selecting the agent and incentivizing the selected agent to realize a priority sequence for executing the tasks that achieves socially optimal performance. Our proposed mechanism consists of two parts. First, the principal runs an auction to select, for task allocation, the agent with the minimum declared priority sequence misalignment. Then, the principal rewards the agent according to the realized priority sequence with which the tasks were performed. We show that the proposed mechanism is individually rational and incentive compatible. Further, it is also socially optimal for the case of linear cost of priority sequence modification for the agents.
Submitted 24 May, 2020; v1 submitted 11 January, 2020;
originally announced January 2020.
-
Stochastic Weight Averaging in Parallel: Large-Batch Training that Generalizes Well
Authors:
Vipul Gupta,
Santiago Akle Serrano,
Dennis DeCoste
Abstract:
We propose Stochastic Weight Averaging in Parallel (SWAP), an algorithm to accelerate DNN training. Our algorithm uses large mini-batches to compute an approximate solution quickly and then refines it by averaging the weights of multiple models computed independently and in parallel. The resulting models generalize as well as those trained with small mini-batches but are produced in a substantially shorter time. We demonstrate the reduction in training time and the good generalization performance of the resulting models on the computer vision datasets CIFAR10, CIFAR100, and ImageNet.
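The weight-averaging step mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `average_weights`, the dictionary-of-lists parameter layout, and the two toy "worker" models are all assumptions made for illustration, and the training phases that would produce the models are omitted.

```python
from typing import Dict, List

# Hypothetical layout: each model is a mapping from parameter name to a flat list of floats.
Weights = Dict[str, List[float]]


def average_weights(models: List[Weights]) -> Weights:
    """Return the element-wise mean of each parameter across the given models."""
    if not models:
        raise ValueError("need at least one model to average")
    averaged: Weights = {}
    for name in models[0]:
        # Group the i-th entry of this parameter from every model, then take the mean.
        columns = zip(*(model[name] for model in models))
        averaged[name] = [sum(column) / len(models) for column in columns]
    return averaged


# Toy stand-ins for models refined independently by two workers (illustrative values only).
workers = [
    {"w": [1.0, 2.0], "b": [0.0]},
    {"w": [3.0, 4.0], "b": [1.0]},
]
swap_weights = average_weights(workers)
```

In a real setting the averaged parameters would be loaded back into a single network; details such as recomputing normalization statistics after averaging are outside what this sketch covers.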
Submitted 7 January, 2020;
originally announced January 2020.
-
"Hinglish" Language -- Modeling a Messy Code-Mixed Language
Authors:
Vivek Kumar Gupta
Abstract:
With the sharp rise in fluency in, and the number of users of, "Hinglish" in India, a linguistically diverse country, it has become increasingly important to analyze social content written in this language on platforms such as Twitter, Reddit, and Facebook. This project focuses on using deep learning techniques to tackle a classification problem in categorizing social content written in Hindi-English into Abusive, Hate-Inducing and Not offensive categories. We utilize bi-directional sequence models with easy text augmentation techniques such as synonym replacement, random insertion, random swap, and random deletion to produce a state-of-the-art classifier that outperforms the previous work done on analyzing this dataset.
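Two of the augmentation operations named in this abstract, random swap and random deletion, can be sketched as follows. This is an illustrative assumption about how such operations are commonly written, not the project's code: the function names, the deletion probability `p`, and the example tokens are hypothetical. Synonym replacement and random insertion are left out because they need a synonym lexicon.

```python
import random
from typing import List


def random_swap(tokens: List[str], n_swaps: int, rng: random.Random) -> List[str]:
    """Swap two randomly chosen positions, n_swaps times, on a copy of tokens."""
    out = list(tokens)
    if len(out) < 2:
        return out
    for _ in range(n_swaps):
        i, j = rng.sample(range(len(out)), 2)
        out[i], out[j] = out[j], out[i]
    return out


def random_deletion(tokens: List[str], p: float, rng: random.Random) -> List[str]:
    """Drop each token independently with probability p, keeping at least one token."""
    if not tokens:
        return []
    kept = [token for token in tokens if rng.random() >= p]
    # If every token was dropped, fall back to a single randomly chosen token.
    return kept if kept else [rng.choice(tokens)]


rng = random.Random(0)  # fixed seed so the augmentation is repeatable
sentence = ["this", "film", "was", "really", "good"]  # hypothetical example tokens
swapped = random_swap(sentence, n_swaps=2, rng=rng)
shortened = random_deletion(sentence, p=0.2, rng=rng)
```

Each augmented variant keeps the label of the sentence it was derived from, which is how such operations enlarge a small labelled dataset.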
Submitted 30 December, 2019;
originally announced December 2019.
-
Achieving Arbitrary Throughput-Fairness Trade-offs in the Inter Cell Interference Coordination with Fixed Transmit Power Problem
Authors:
Vaibhav Kumar Gupta,
Gaurav S. Kasbekar
Abstract:
We study the problem of inter cell interference coordination (ICIC) with fixed transmit power in OFDMA-based cellular networks, in which each base station (BS) needs to decide which subchannel, if any, to allocate to each of its associated mobile stations (MS) for data transmission. In general, there exists a trade-off between the total throughput (sum of throughputs of all the MSs) and fairness under the allocations found by resource allocation schemes. We introduce the concept of $\tau$-$\alpha$-fairness by modifying the concept of $\alpha$-fairness, which was earlier proposed in the context of designing fair end-to-end window-based congestion control protocols for packet-switched networks. The concept of $\tau$-$\alpha$-fairness allows us to achieve arbitrary trade-offs between the total throughput and degree of fairness by selecting an appropriate value of $\alpha$ in $[0,\infty)$. We show that for every $\alpha \in [0,\infty)$ and every $\tau > 0$, the problem of finding a $\tau$-$\alpha$-fair allocation is NP-Complete. Further, we show that for every $\alpha \in [0, \infty)$, there exist thresholds such that if the potential interference levels experienced by each MS on every subchannel are above the threshold values, then the problem can be optimally solved in polynomial time by reducing it to the bipartite graph matching problem. Also, we propose a simple, distributed subchannel allocation algorithm for the ICIC problem, which is flexible, requires a small amount of time to operate, and requires information exchange among only neighboring BSs. We investigate via simulations how the algorithm parameters should be selected so as to achieve any desired trade-off between the total throughput and fairness.
Submitted 27 December, 2019;
originally announced December 2019.
-
Detection of a Glitch in PSR J0908$-$4913 by UTMOST
Authors:
Marcus E. Lower,
Matthew Bailes,
Ryan M. Shannon,
Simon Johnston,
Chris Flynn,
Timothy Bateman,
Duncan Campbell-Wilson,
Cherie K. Day,
Adam Deller,
Wael Farah,
Anne J. Green,
Vivek Gupta,
Richard W. Hunstead,
Andrew Jameson,
Ayushi Mandlik,
Stefan Osłowski,
Aditya Parthasarathy,
Daniel C. Price,
Angus Sutherland,
David Temby,
Glen Torr,
Glenn Urquhart,
Vivek Venkatraman Krishnan
Abstract:
We report the first detection of a glitch in the radio pulsar PSR J0908$-$4913 (PSR B0906$-$49) during regular timing observations by the Molonglo Observatory Synthesis Telescope (MOST) as part of the UTMOST project.
Submitted 17 December, 2019;
originally announced December 2019.
-
Automated Coronary Artery Atherosclerosis Detection and Weakly Supervised Localization on Coronary CT Angiography with a Deep 3-Dimensional Convolutional Neural Network
Authors:
Sema Candemir,
Richard D. White,
Mutlu Demirer,
Vikash Gupta,
Matthew T. Bigelow,
Luciano M. Prevedello,
Barbaros S. Erdal
Abstract:
We propose a fully automated algorithm based on a deep learning framework enabling screening of a coronary computed tomography angiography (CCTA) examination for confident detection of the presence or absence of coronary artery atherosclerosis. The system starts with extracting the coronary arteries and their branches from CCTA datasets and representing them with multi-planar reformatted volumes; pre-processing and augmentation techniques are then applied to increase the robustness and generalization ability of the system. A 3-dimensional convolutional neural network (3D-CNN) is utilized to model pathological changes (e.g., atherosclerotic plaques) in coronary vessels. The system learns the discriminatory features between vessels with and without atherosclerosis. The discriminative features at the final convolutional layer are visualized with a saliency map approach to provide visual clues related to atherosclerosis likelihood and location. We have evaluated the system on a reference dataset representing 247 patients with atherosclerosis and 246 patients free of atherosclerosis. With five-fold cross-validation, an accuracy of 90.9%, a positive predictive value (PPV) of 58.8%, a sensitivity of 68.9%, a specificity of 93.6%, and a negative predictive value (NPV) of 96.1% are achieved at the artery/branch level with threshold 0.5. The average area under the receiver operating characteristic curve is 0.91. The system indicates a high NPV, which may be potentially useful for assisting interpreting physicians in excluding coronary atherosclerosis in patients with acute chest pain.
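For readers unfamiliar with the five figures quoted in this abstract, the sketch below shows the standard definitions in terms of confusion-matrix counts. The function name `screening_metrics` is hypothetical, and the counts passed to it are made-up illustrative values, not data from the paper; the sketch also assumes every denominator is non-zero.

```python
from typing import Dict


def screening_metrics(tp: int, fp: int, tn: int, fn: int) -> Dict[str, float]:
    """Standard binary-classification metrics from confusion-matrix counts."""
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "ppv": tp / (tp + fp),          # positive predictive value (precision)
        "sensitivity": tp / (tp + fn),  # true positive rate (recall)
        "specificity": tn / (tn + fp),  # true negative rate
        "npv": tn / (tn + fn),          # negative predictive value
    }


# Illustrative counts only; they are NOT taken from the study.
metrics = screening_metrics(tp=30, fp=10, tn=50, fn=10)
```

A high NPV alongside a moderate PPV, as reported in the abstract, is the typical profile of a rule-out test: a negative result is rarely wrong, while a positive result still needs confirmation.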
Submitted 7 June, 2020; v1 submitted 26 November, 2019;
originally announced November 2019.
-
PERMUTATION Strikes Back: The Power of Recourse in Online Metric Matching
Authors:
Varun Gupta,
Ravishankar Krishnaswamy,
Sai Sandeep
Abstract:
In the classical Online Metric Matching problem, we are given a metric space with $k$ servers. A collection of clients arrive in an online fashion, and upon arrival, a client should irrevocably be matched to an as-yet-unmatched server. The goal is to find an online matching which minimizes the total cost, i.e., the sum of distances between each client and the server it is matched to. We know deterministic algorithms~\cite{KP93,khuller1994line} that achieve a competitive ratio of $2k-1$, and this bound is tight for deterministic algorithms. The problem has also long been considered in specialized metrics such as the line metric or metrics of bounded doubling dimension, with the current best result on a line metric being a deterministic $O(\log k)$ competitive algorithm~\cite{raghvendra2018optimal}. Obtaining (or refuting) $O(\log k)$-competitive algorithms in general metrics and constant-competitive algorithms on the line metric have been long-standing open questions in this area.
In this paper, we investigate the robustness of these lower bounds by considering the Online Metric Matching with Recourse problem where we are allowed to change a small number of previous assignments upon arrival of a new client. Indeed, we show that a small logarithmic amount of recourse can significantly improve the quality of matchings we can maintain. For general metrics, we show a simple \emph{deterministic} $O(\log k)$-competitive algorithm with $O(\log k)$-amortized recourse, an exponential improvement over the $2k-1$ lower bound when no recourse is allowed. We next consider the line metric, and present a deterministic algorithm which is $3$-competitive and has $O(\log k)$-recourse, again a substantial improvement over the best known $O(\log k)$-competitive algorithm when no recourse is allowed.
Submitted 28 November, 2019;
originally announced November 2019.
-
Universal EEG Encoder for Learning Diverse Intelligent Tasks
Authors:
Baani Leen Kaur Jolly,
Palash Aggrawal,
Surabhi S Nath,
Viresh Gupta,
Manraj Singh Grover,
Rajiv Ratn Shah
Abstract:
Brain Computer Interfaces (BCI) have become very popular, with Electroencephalography (EEG) being one of the most commonly used signal acquisition techniques. A major challenge in BCI studies is the individualistic analysis required for each task. Thus, task-specific feature extraction and classification are performed, which fail to generalize to other tasks with similar time-series EEG input data. To this end, we design a GRU-based universal deep encoding architecture to extract meaningful features from publicly available datasets for five diverse EEG-based classification tasks. Our network can generate task- and format-independent data representations and outperforms the state-of-the-art EEGNet architecture on most experiments. We also compare our results with CNN-based and autoencoder networks, in turn performing local, spatial, temporal and unsupervised analysis on the data.
Submitted 26 November, 2019;
originally announced November 2019.