-
Detecting Frames in News Headlines and Lead Images in U.S. Gun Violence Coverage
Authors:
Isidora Chara Tourni,
Lei Guo,
Hengchang Hu,
Edward Halim,
Prakash Ishwar,
Taufiq Daryanto,
Mona Jalal,
Boqi Chen,
Margrit Betke,
Fabian Zhafransyah,
Sha Lai,
Derry Tanti Wijaya
Abstract:
News media structure their reporting of events or issues using certain perspectives.
When describing an incident involving gun violence, for example, some journalists may focus on mental health or gun regulation, while others may emphasize the discussion of gun rights. Such perspectives are called "frames" in communication research. We study, for the first time, the value of combining lead images and their contextual information with text to identify the frame of a given news article. We observe that using multiple modes of information (article- and image-derived features) improves prediction of news frames over any single mode of information when the images are relevant to the frames of the headlines. We also observe that frame-image relevance is related to the ease of conveying frames via images, which we call frame concreteness. Additionally, we release the first multimodal news framing dataset related to gun violence in the U.S., curated and annotated by communication researchers. The dataset will allow researchers to further examine the use of multiple information modalities for studying media framing.
Submitted 24 June, 2024;
originally announced June 2024.
-
Complexity Evaluation of Parallel Execution of the RAPiD Deep-Learning Algorithm on Intel CPU
Authors:
Dominic Konrad,
Zhihao Duan,
Mertcan Cokbas,
Prakash Ishwar
Abstract:
Knowing how many people are in various indoor spaces, and where they are, is critical for reducing HVAC energy waste, for space management and spatial analytics, and in emergency scenarios. While a range of technologies has been proposed to detect and track people in large indoor spaces, ceiling-mounted fisheye cameras have recently emerged as strong contenders. Currently, RAPiD is the SOTA algorithm for people detection in images captured by fisheye cameras. However, in large spaces several overhead fisheye cameras are needed to assure high counting accuracy, and thus multiple instances of RAPiD must be executed simultaneously. This report evaluates inference time when multiple instances of RAPiD run in parallel on an Ubuntu NUC PC with an Intel i7-8559U CPU. We consider three mechanisms of CPU-resource allocation to handle multiple instances of RAPiD: 1) managed by Ubuntu, 2) managed by the user via operating-system calls to assign logical cores, and 3) managed by the user via PyTorch-library calls to limit the number of threads used by PyTorch. Each scenario was evaluated on 300 images. The experimental results show that when one or two instances of RAPiD are executed in parallel, all three approaches result in similar inference times of 1.8 sec and 3.2 sec, respectively. However, when three or more instances of RAPiD run in parallel, limiting the number of threads used by PyTorch results in the shortest inference times. On average, RAPiD completes inference of 2 images simultaneously in about 3 sec, 4 images in 6 sec, and 8 images in less than 14 sec. This is important for real-time system design. In HVAC-application scenarios, with a typical reaction time of 10-15 min, a latency of 14 sec is negligible, so a single 8559U CPU can support 8 camera streams, thus reducing the system cost. However, in emergency scenarios, when time is of the essence, a single CPU may be needed for each camera to reduce the latency to 1.8 sec.
Submitted 11 December, 2023;
originally announced December 2023.
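The two user-managed allocation mechanisms named in the abstract above (assigning logical cores through operating-system calls, and capping PyTorch threads) might look roughly like the sketch below. It is not the authors' benchmark code: the helper names `split_cores`, `pin_to_cores`, and `limit_torch_threads` are hypothetical, the round-robin split and the count of 8 logical cores are assumptions, and the affinity call is only available on Linux.

```python
import os


def split_cores(n_cores, n_instances):
    """Round-robin partition of logical core ids among instances (illustrative)."""
    return [list(range(i, n_cores, n_instances)) for i in range(n_instances)]


def pin_to_cores(cores):
    """Restrict the current process to the given logical cores (Linux only).

    Returns the resulting affinity set, or None where the OS call is missing.
    """
    if not hasattr(os, "sched_setaffinity"):
        return None
    wanted = set(cores) & os.sched_getaffinity(0)
    if wanted:  # only pin to cores this process is actually allowed to use
        os.sched_setaffinity(0, wanted)
    return os.sched_getaffinity(0)


def limit_torch_threads(n_threads):
    """Cap PyTorch intra-op threads; returns False if PyTorch is not installed."""
    try:
        import torch
    except ImportError:
        return False
    torch.set_num_threads(n_threads)
    return True


# Assumed layout: 8 logical cores shared by 3 detector instances.
plan = split_cores(8, 3)
```

Each detector process would call `pin_to_cores(plan[k])` or `limit_torch_threads(n)` once at start-up, before running inference.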
-
On neural and dimensional collapse in supervised and unsupervised contrastive learning with hard negative sampling
Authors:
Ruijie Jiang,
Thuan Nguyen,
Shuchin Aeron,
Prakash Ishwar
Abstract:
For a widely-studied data model and general loss and sample-hardening functions, we prove that the Supervised Contrastive Learning (SCL), Hard-SCL (HSCL), and Unsupervised Contrastive Learning (UCL) risks are minimized by representations that exhibit Neural Collapse (NC), i.e., the class means form an Equiangular Tight Frame (ETF) and data from the same class are mapped to the same representation. We also prove that for any representation mapping, the HSCL and Hard-UCL (HUCL) risks are lower bounded by the corresponding SCL and UCL risks. Although the optimality of ETF is known for SCL, albeit only for InfoNCE loss, its optimality for HSCL and UCL under general loss and hardening functions is novel. Moreover, our proofs are simpler, more compact, and more transparent. We empirically demonstrate, for the first time, that ADAM optimization of HSCL and HUCL risks with random initialization and suitable hardness levels can indeed converge to the NC geometry if we incorporate unit-ball or unit-sphere feature normalization. Without incorporating hard negatives or feature normalization, however, the representations learned via ADAM suffer from dimensional collapse (DC) and fail to attain the NC geometry.
Submitted 8 November, 2023;
originally announced November 2023.
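To make the Neural Collapse geometry in the abstract above concrete, here is a minimal sketch that builds a simplex equiangular tight frame for k classes and exposes its two defining properties: unit-norm class means and a common pairwise inner product of -1/(k-1). The construction is the standard textbook one and is assumed here, since the abstract does not spell it out; `simplex_etf` and `dot` are illustrative names.

```python
import math


def simplex_etf(k):
    """Return k unit vectors in R^k forming a simplex equiangular tight frame.

    Row i is sqrt(k / (k - 1)) * (e_i - (1/k) * ones), i.e. the centered and
    rescaled standard basis vector.
    """
    scale = math.sqrt(k / (k - 1))
    return [
        [scale * ((1.0 if i == j else 0.0) - 1.0 / k) for j in range(k)]
        for i in range(k)
    ]


def dot(u, v):
    """Plain Euclidean inner product of two equal-length lists."""
    return sum(a * b for a, b in zip(u, v))


means = simplex_etf(4)
norms = [dot(m, m) for m in means]   # each equals 1 up to rounding
cross = dot(means[0], means[1])      # equals -1/3 up to rounding
```

Under NC, every sample of class i would additionally map to exactly `means[i]` (up to a common rotation and scale).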
-
A principled approach to model validation in domain generalization
Authors:
Boyang Lyu,
Thuan Nguyen,
Matthias Scheutz,
Prakash Ishwar,
Shuchin Aeron
Abstract:
Domain generalization aims to learn a model with good generalization ability, that is, the learned model should not only perform well on several seen domains but also on unseen domains with different data distributions. State-of-the-art domain generalization methods typically train a representation function followed by a classifier jointly to minimize both the classification risk and the domain discrepancy. However, when it comes to model selection, most of these methods rely on traditional validation routines that select models solely based on the lowest classification risk on the validation set. In this paper, we theoretically demonstrate a trade-off between minimizing classification risk and mitigating domain discrepancy, i.e., it is impossible to achieve the minimum of these two objectives simultaneously. Motivated by this theoretical result, we propose a novel model selection method suggesting that the validation process should account for both the classification risk and the domain discrepancy. We validate the effectiveness of the proposed method by numerical results on several domain generalization datasets.
Submitted 2 April, 2023;
originally announced April 2023.
-
Estimating Distances Between People using a Single Overhead Fisheye Camera with Application to Social-Distancing Oversight
Authors:
Zhangchi Lu,
Mertcan Cokbas,
Prakash Ishwar,
Janusz Konrad
Abstract:
Unobtrusive monitoring of distances between people indoors is a useful tool in the fight against pandemics. Surveillance cameras are a natural resource for this task. Unlike previous distance estimation methods, we use a single, overhead, fisheye camera with wide area coverage and propose two approaches. One method leverages a geometric model of the fisheye lens, whereas the other method uses a neural network to predict the 3D-world distance from people-locations in a fisheye image. To evaluate our algorithms, we collected a first-of-its-kind dataset using a single fisheye camera; the dataset comprises a wide range of distances between people (1-58 ft) and will be made publicly available. The algorithms achieve 1-2 ft distance error and over 95% accuracy in detecting social-distance violations.
Submitted 20 March, 2023;
originally announced March 2023.
-
Spatio-Visual Fusion-Based Person Re-Identification for Overhead Fisheye Images
Authors:
Mertcan Cokbas,
Prakash Ishwar,
Janusz Konrad
Abstract:
Person re-identification (PRID) has been thoroughly researched in typical surveillance scenarios where various scenes are monitored by side-mounted, rectilinear-lens cameras. To date, few methods have been proposed for fisheye cameras mounted overhead, and their performance is lacking. In order to close this performance gap, we propose a multi-feature framework for fisheye PRID where we combine deep-learning, color-based and location-based features by means of novel feature fusion. We evaluate the performance of our framework for various feature combinations on FRIDA, a public fisheye PRID dataset. The results demonstrate that our multi-feature approach outperforms recent appearance-based deep-learning methods by almost 18 percentage points and location-based methods by almost 3 percentage points in matching accuracy. We also demonstrate the potential application of the proposed PRID framework to people counting in large, crowded indoor spaces.
Submitted 25 April, 2023; v1 submitted 21 December, 2022;
originally announced December 2022.
-
Trade-off between reconstruction loss and feature alignment for domain generalization
Authors:
Thuan Nguyen,
Boyang Lyu,
Prakash Ishwar,
Matthias Scheutz,
Shuchin Aeron
Abstract:
Domain generalization (DG) is a branch of transfer learning that aims to train learning models on several seen domains and subsequently apply these pre-trained models to other unseen (unknown but related) domains. To deal with challenging settings in DG where neither the data nor the labels of the unseen domain are available at training time, the most common approach is to design the classifiers based on the domain-invariant representation features, i.e., the latent representations that are unchanged and transferable between domains. Contrary to popular belief, we show that designing classifiers based on invariant representation features alone is necessary but insufficient in DG. Our analysis indicates the necessity of imposing a constraint on the reconstruction loss induced by representation functions to preserve most of the relevant information about the label in the latent space. More importantly, we point out the trade-off between minimizing the reconstruction loss and achieving domain alignment in DG. Our theoretical results motivate a new DG framework that jointly optimizes the reconstruction loss and the domain discrepancy. Both theoretical and numerical results are provided to justify our approach.
Submitted 26 October, 2022;
originally announced October 2022.
-
FRIDA: Fisheye Re-Identification Dataset with Annotations
Authors:
Mertcan Cokbas,
John Bolognino,
Janusz Konrad,
Prakash Ishwar
Abstract:
Person re-identification (PRID) from side-mounted rectilinear-lens cameras is a well-studied problem. On the other hand, PRID from overhead fisheye cameras is new and largely unstudied, primarily due to the lack of suitable image datasets. To fill this void, we introduce the "Fisheye Re-IDentification Dataset with Annotations" (FRIDA), with 240k+ bounding-box annotations of people, captured by 3 time-synchronized, ceiling-mounted fisheye cameras in a large indoor space. Due to a field-of-view overlap, PRID in this case differs from a typical PRID problem, which we discuss in depth. We also evaluate the performance of 10 state-of-the-art PRID algorithms on FRIDA. We show that for 6 CNN-based algorithms, training on FRIDA boosts the performance by up to 11.64 percentage points in mAP compared to training on a common rectilinear-camera PRID dataset.
Submitted 19 October, 2022; v1 submitted 4 October, 2022;
originally announced October 2022.
-
Supervised Contrastive Learning with Hard Negative Samples
Authors:
Ruijie Jiang,
Thuan Nguyen,
Prakash Ishwar,
Shuchin Aeron
Abstract:
Through minimization of an appropriate loss function such as the InfoNCE loss, contrastive learning (CL) learns a useful representation function by pulling positive samples close to each other while pushing negative samples far apart in the embedding space. The positive samples are typically created using "label-preserving" augmentations, i.e., domain-specific transformations of a given datum or anchor. In the absence of class information, in unsupervised CL (UCL), the negative samples are typically chosen randomly and independently of the anchor from a preset negative sampling distribution over the entire dataset. This leads to class collisions in UCL. Supervised CL (SCL) avoids this class collision by conditioning the negative sampling distribution on samples having labels different from that of the anchor. In hard-UCL (H-UCL), which has been shown to be an effective method to further enhance UCL, the negative sampling distribution is conditionally tilted, by means of a hardening function, towards samples that are closer to the anchor. Motivated by this, in this paper we propose hard-SCL (H-SCL), wherein the class conditional negative sampling distribution is tilted via a hardening function. Our simulation results confirm the utility of H-SCL over SCL with significant performance gains in downstream classification tasks. Analytically, we show that in the limit of infinite negative samples per anchor and under a suitable assumption, the H-SCL loss is upper bounded by the H-UCL loss, thereby justifying the utility of H-UCL for controlling the H-SCL loss in the absence of label information. Through experiments on several datasets, we verify the assumption as well as the claimed inequality between H-UCL and H-SCL losses. We also provide a plausible scenario where H-SCL loss is lower bounded by UCL loss, indicating the limited utility of UCL in controlling the H-SCL loss.
Submitted 10 May, 2024; v1 submitted 31 August, 2022;
originally announced September 2022.
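The tilting of the negative sampling distribution described in the abstract above can be illustrated with a small sketch. It assumes an exponential hardening function exp(beta * similarity) and a dot-product similarity, neither of which is specified in the abstract; `hard_negative_weights` and its parameters are hypothetical names. Passing labels gives the supervised (H-SCL-style) variant, which excludes same-class candidates; omitting them gives the unsupervised (H-UCL-style) variant.

```python
import math


def hard_negative_weights(anchor, candidates, labels=None, anchor_label=None, beta=1.0):
    """Sampling distribution over candidate negatives, tilted toward the anchor.

    Assumes at least one candidate is eligible (different class, if labels are
    given). With beta = 0 the tilt vanishes and eligible candidates are uniform.
    """
    scores = []
    for idx, x in enumerate(candidates):
        same_class = labels is not None and labels[idx] == anchor_label
        if same_class:
            scores.append(0.0)  # supervised case: same-class points are never negatives
        else:
            similarity = sum(a * b for a, b in zip(anchor, x))
            scores.append(math.exp(beta * similarity))  # assumed hardening function
    total = sum(scores)
    return [s / total for s in scores]


anchor = [1.0, 0.0]
candidates = [[0.9, 0.1], [0.0, 1.0], [-1.0, 0.0]]
supervised = hard_negative_weights(anchor, candidates, labels=[0, 1, 1], anchor_label=0)
unsupervised = hard_negative_weights(anchor, candidates)
```

In the unsupervised call the first candidate receives the largest weight even though it shares the anchor's class, which is the class-collision effect the abstract mentions.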
-
Joint covariate-alignment and concept-alignment: a framework for domain generalization
Authors:
Thuan Nguyen,
Boyang Lyu,
Prakash Ishwar,
Matthias Scheutz,
Shuchin Aeron
Abstract:
In this paper, we propose a novel domain generalization (DG) framework based on a new upper bound to the risk on the unseen domain. Particularly, our framework proposes to jointly minimize both the covariate-shift as well as the concept-shift between the seen domains for a better performance on the unseen domain. While the proposed approach can be implemented via an arbitrary combination of covariate-alignment and concept-alignment modules, in this work we use well-established approaches for distributional alignment namely, Maximum Mean Discrepancy (MMD) and covariance Alignment (CORAL), and use an Invariant Risk Minimization (IRM)-based approach for concept alignment. Our numerical results show that the proposed methods perform as well as or better than the state-of-the-art for domain generalization on several data sets.
Submitted 1 August, 2022;
originally announced August 2022.
-
Conditional entropy minimization principle for learning domain invariant representation features
Authors:
Thuan Nguyen,
Boyang Lyu,
Prakash Ishwar,
Matthias Scheutz,
Shuchin Aeron
Abstract:
Invariance-principle-based methods such as Invariant Risk Minimization (IRM) have recently emerged as promising approaches for Domain Generalization (DG). Despite promising theory, such approaches fail in common classification tasks due to the mixing of true invariant features and spurious invariant features. To address this, we propose a framework based on the conditional entropy minimization (CEM) principle to filter out the spurious invariant features, leading to a new algorithm with a better generalization capability. We show that our proposed approach is closely related to the well-known Information Bottleneck (IB) framework and prove that under certain assumptions, entropy minimization can exactly recover the true invariant features. Our approach provides competitive classification accuracy compared to recent theoretically-principled state-of-the-art alternatives across several DG datasets.
Submitted 9 July, 2022; v1 submitted 25 January, 2022;
originally announced January 2022.
-
Hard Negative Sampling via Regularized Optimal Transport for Contrastive Representation Learning
Authors:
Ruijie Jiang,
Prakash Ishwar,
Shuchin Aeron
Abstract:
We study the problem of designing hard negative sampling distributions for unsupervised contrastive representation learning. We propose and analyze a novel min-max framework that seeks a representation which minimizes the maximum (worst-case) generalized contrastive learning loss over all couplings (joint distributions between positive and negative samples subject to marginal constraints) and prove that the resulting min-max optimum representation will be degenerate. This provides the first theoretical justification for incorporating additional regularization constraints on the couplings. We re-interpret the min-max problem through the lens of Optimal Transport (OT) theory and utilize regularized transport couplings to control the degree of hardness of negative examples. Through experiments we demonstrate that the negative samples generated from our designed negative distribution are more similar to the anchor than those generated from the baseline negative distribution. We also demonstrate that entropic regularization yields negative sampling distributions with parametric form similar to that in a recent state-of-the-art negative sampling design and has similar performance in multiple datasets. Utilizing the uncovered connection with OT, we propose a new ground cost for designing the negative distribution and show improved performance of the learned representation on downstream tasks compared to the representation learned when using squared Euclidean cost.
Submitted 14 December, 2023; v1 submitted 4 November, 2021;
originally announced November 2021.
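The entropically regularized transport couplings mentioned in the abstract above are commonly computed with Sinkhorn iterations. The sketch below is a generic textbook Sinkhorn solver in plain Python, not the paper's implementation; the toy cost matrix, marginals, and the regularization strength `eps` are assumptions chosen for illustration.

```python
import math


def sinkhorn(cost, a, b, eps=0.5, n_iters=200):
    """Entropic-regularized optimal transport coupling via Sinkhorn scaling.

    cost: n x m ground-cost matrix; a, b: marginals of length n and m.
    Returns an n x m coupling whose row sums approach a and column sums match b.
    """
    n, m = len(a), len(b)
    kernel = [[math.exp(-cost[i][j] / eps) for j in range(m)] for i in range(n)]
    u = [1.0] * n
    v = [1.0] * m
    for _ in range(n_iters):
        # Alternately rescale rows and columns to match the two marginals.
        u = [a[i] / sum(kernel[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [b[j] / sum(kernel[i][j] * u[i] for i in range(n)) for j in range(m)]
    return [[u[i] * kernel[i][j] * v[j] for j in range(m)] for i in range(n)]


cost = [[0.0, 1.0], [1.0, 0.0]]
uniform = [0.5, 0.5]
sharp = sinkhorn(cost, uniform, uniform, eps=0.5)     # mass concentrates on cheap pairs
diffuse = sinkhorn(cost, uniform, uniform, eps=5.0)   # closer to the independent coupling
```

A smaller `eps` concentrates mass on low-cost pairs while a larger `eps` spreads it toward the independent coupling; how this knob maps to negative-sample hardness depends on the ground cost used in the paper, which this sketch does not reproduce.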
-
Ergodic Limits, Relaxations, and Geometric Properties of Random Walk Node Embeddings
Authors:
Christy Lin,
Daniel Sussman,
Prakash Ishwar
Abstract:
Random walk based node embedding algorithms learn vector representations of nodes by optimizing an objective function of node embedding vectors and skip-bigram statistics computed from random walks on the network. They have been applied to many supervised learning problems such as link prediction and node classification and have demonstrated state-of-the-art performance. Yet, their properties remain poorly understood. This paper studies properties of random walk based node embeddings in the unsupervised setting of discovering hidden block structure in the network, i.e., learning node representations whose cluster structure in Euclidean space reflects their adjacency structure within the network. We characterize the ergodic limits of the embedding objective, its generalization, and related convex relaxations to derive corresponding non-randomized versions of the node embedding objectives. We also characterize the optimal node embedding Grammians of the non-randomized objectives for the expected graph of a two-community Stochastic Block Model (SBM). We prove that the solution Grammian has rank $1$ for a suitable nuclear norm relaxation of the non-randomized objective. Comprehensive experimental results on SBM random networks reveal that our non-randomized ergodic objectives yield node embeddings whose distribution is Gaussian-like, centered at the node embeddings of the expected network within each community, and concentrate in the linear degree-scaling regime as the number of nodes increases.
Submitted 9 September, 2021;
originally announced September 2021.
-
Barycentric-alignment and reconstruction loss minimization for domain generalization
Authors:
Boyang Lyu,
Thuan Nguyen,
Prakash Ishwar,
Matthias Scheutz,
Shuchin Aeron
Abstract:
This paper advances the theory and practice of Domain Generalization (DG) in machine learning. We consider the typical DG setting where the hypothesis is composed of a representation mapping followed by a labeling function. Within this setting, the majority of popular DG methods aim to jointly learn the representation and the labeling functions by minimizing a well-known upper bound for the classification risk in the unseen domain. In practice, however, methods based on this theoretical upper bound ignore a term that cannot be directly optimized due to its dual dependence on both the representation mapping and the unknown optimal labeling function in the unseen domain. To bridge this gap between theory and practice, we introduce a new upper bound that is free of terms having such dual dependence, resulting in a fully optimizable risk upper bound for the unseen domain. Our derivation leverages classical and recent transport inequalities that link optimal transport metrics with information-theoretic measures. Compared to previous bounds, our bound introduces two new terms: (i) the Wasserstein-2 barycenter term that aligns distributions between domains, and (ii) the reconstruction loss term that assesses the quality of representation in reconstructing the original data. Based on this new upper bound, we propose a novel DG algorithm named Wasserstein Barycenter Auto-Encoder (WBAE) that simultaneously minimizes the classification loss, the barycenter loss, and the reconstruction loss. Numerical results demonstrate that the proposed method outperforms current state-of-the-art DG algorithms on several datasets.
Submitted 21 May, 2023; v1 submitted 4 September, 2021;
originally announced September 2021.
-
BSUV-Net 2.0: Spatio-Temporal Data Augmentations for Video-Agnostic Supervised Background Subtraction
Authors:
M. Ozan Tezcan,
Prakash Ishwar,
Janusz Konrad
Abstract:
Background subtraction (BGS) is a fundamental video processing task which is a key component of many applications. Deep learning-based supervised algorithms achieve very good performance in BGS; however, most of these algorithms are optimized for either a specific video or a group of videos, and their performance decreases dramatically when applied to unseen videos. Recently, several papers addressed this problem and proposed video-agnostic supervised BGS algorithms. However, nearly all of the data augmentations used in these algorithms are limited to the spatial domain and do not account for temporal variations that naturally occur in video data. In this work, we introduce spatio-temporal data augmentations and apply them to one of the leading video-agnostic BGS algorithms, BSUV-Net. We also introduce a new cross-validation training and evaluation strategy for the CDNet-2014 dataset that makes it possible to fairly and easily compare the performance of various video-agnostic supervised BGS algorithms. Our new model trained using the proposed data augmentations, named BSUV-Net 2.0, significantly outperforms state-of-the-art algorithms evaluated on unseen videos of CDNet-2014. We also evaluate the cross-dataset generalization capacity of BSUV-Net 2.0 by training it solely on CDNet-2014 videos and evaluating its performance on the LASIESTA dataset. Overall, BSUV-Net 2.0 provides a ~5% improvement in the F-score over state-of-the-art methods on unseen videos of the CDNet-2014 and LASIESTA datasets. Furthermore, we develop a real-time variant of our model, which we call Fast BSUV-Net 2.0, whose performance is close to the state of the art.
Submitted 24 February, 2021; v1 submitted 23 January, 2021;
originally announced January 2021.
-
OpenFraming: We brought the ML; you bring the data. Interact with your data and discover its frames
Authors:
Alyssa Smith,
David Assefa Tofu,
Mona Jalal,
Edward Edberg Halim,
Yimeng Sun,
Vidya Akavoor,
Margrit Betke,
Prakash Ishwar,
Lei Guo,
Derry Wijaya
Abstract:
When journalists cover a news story, they can cover the story from multiple angles or perspectives. A news article written about COVID-19, for example, might focus on personal preventative actions such as mask-wearing, while another might focus on COVID-19's impact on the economy. These perspectives are called "frames," which, when used, may influence public perception and opinion of the issue. We introduce a Web-based system for analyzing and classifying frames in text documents. Our goal is to make effective tools for automatic frame discovery and labeling based on topic modeling and deep learning widely accessible to researchers from a diverse array of disciplines. To this end, we provide both state-of-the-art pre-trained frame classification models on various issues as well as a user-friendly pipeline for training novel classification models on user-provided corpora. Researchers can submit their documents and obtain frames of the documents. The degree of user involvement is flexible: they can run models that have been pre-trained on select issues; submit labeled documents and train a new model for frame classification; or submit unlabeled documents and obtain potential frames of the documents. The code making up our system is also open-sourced and well-documented, making the system transparent and expandable. The system is available on-line at http://www.openframing.org and via our GitHub page https://github.com/davidatbu/openFraming .
Submitted 16 August, 2020;
originally announced August 2020.
-
RAPiD: Rotation-Aware People Detection in Overhead Fisheye Images
Authors:
Zhihao Duan,
M. Ozan Tezcan,
Hayato Nakamura,
Prakash Ishwar,
Janusz Konrad
Abstract:
Recent methods for people detection in overhead, fisheye images either use radially-aligned bounding boxes to represent people, assuming people always appear along the image radius, or require significant pre-/post-processing, which radically increases computational complexity. In this work, we develop an end-to-end rotation-aware people detection method, named RAPiD, that detects people using arbitrarily-oriented bounding boxes. Our fully-convolutional neural network directly regresses the angle of each bounding box using a periodic loss function, which accounts for angle periodicities. We have also created a new dataset with spatio-temporal annotations of rotated bounding boxes, for people detection as well as other vision tasks in overhead fisheye videos. We show that our simple, yet effective method outperforms state-of-the-art results on three fisheye-image datasets. Code and dataset are available at http://vip.bu.edu/rapid .
Submitted 23 May, 2020;
originally announced May 2020.
-
Low-Resolution Overhead Thermal Tripwire for Occupancy Estimation
Authors:
Mertcan Cokbas,
Prakash Ishwar,
Janusz Konrad
Abstract:
Smart buildings use occupancy sensing for various tasks ranging from energy-efficient HVAC and lighting to space-utilization analysis and emergency response. We propose a people counting system which uses a low-resolution thermal sensor. Unlike previous people-counting systems based on thermal sensors, we use an overhead tripwire configuration at entryways to detect and track transient entries or exits. We develop two distinct people counting algorithms for this configuration. To evaluate our algorithms, we have collected and labeled a low-resolution thermal video dataset using the proposed system. The dataset, the first of its kind, is public and available for download. We also propose new evaluation metrics that are more suitable for systems that are subject to drift and jitter.
Submitted 5 May, 2020; v1 submitted 12 April, 2020;
originally announced April 2020.
-
VAE/WGAN-Based Image Representation Learning For Pose-Preserving Seamless Identity Replacement In Facial Images
Authors:
Hiroki Kawai,
Jiawei Chen,
Prakash Ishwar,
Janusz Konrad
Abstract:
We present a novel variational generative adversarial network (VGAN) based on Wasserstein loss to learn a latent representation from a face image that is invariant to identity but preserves head-pose information. This facilitates synthesis of a realistic face image with the same head pose as a given input image, but with a different identity. One application of this network is in privacy-sensitive scenarios; after identity replacement in an image, utility, such as head pose, can still be recovered. Extensive experimental validation on synthetic and real human-face image datasets performed under 3 threat scenarios confirms the ability of the proposed network to preserve head pose of the input image, mask the input identity, and synthesize a good-quality realistic face image of a desired identity. We also show that our network can be used to perform pose-preserving identity morphing and identity-preserving pose morphing. The proposed method improves over a recent state-of-the-art method in terms of quantitative metrics as well as synthesized image quality.
Submitted 1 March, 2020;
originally announced March 2020.
-
BSUV-Net: A Fully-Convolutional Neural Network for Background Subtraction of Unseen Videos
Authors:
M. Ozan Tezcan,
Prakash Ishwar,
Janusz Konrad
Abstract:
Background subtraction is a basic task in computer vision and video processing often applied as a pre-processing step for object tracking, people recognition, etc. Recently, a number of successful background-subtraction algorithms have been proposed, however nearly all of the top-performing ones are supervised. Crucially, their success relies upon the availability of some annotated frames of the test video during training. Consequently, their performance on completely "unseen" videos is undocumented in the literature. In this work, we propose a new, supervised, background-subtraction algorithm for unseen videos (BSUV-Net) based on a fully-convolutional neural network. The input to our network consists of the current frame and two background frames captured at different time scales along with their semantic segmentation maps. In order to reduce the chance of overfitting, we also introduce a new data-augmentation technique which mitigates the impact of illumination difference between the background frames and the current frame. On the CDNet-2014 dataset, BSUV-Net outperforms state-of-the-art algorithms evaluated on unseen videos in terms of several metrics including F-measure, recall and precision.
Submitted 14 January, 2020; v1 submitted 25 July, 2019;
originally announced July 2019.
-
A Cyclically-Trained Adversarial Network for Invariant Representation Learning
Authors:
Jiawei Chen,
Janusz Konrad,
Prakash Ishwar
Abstract:
Recent studies show that deep neural networks are vulnerable to adversarial examples which can be generated via certain types of transformations. Being robust to a desired family of adversarial attacks is then equivalent to being invariant to a family of transformations. Learning invariant representations then naturally emerges as an important goal to achieve which we explore in this paper within specific application contexts. Specifically, we propose a cyclically-trained adversarial network to learn a mapping from image space to latent representation space and back such that the latent representation is invariant to a specified factor of variation (e.g., identity). The learned mapping assures that the synthesized image is not only realistic, but has the same values for unspecified factors (e.g., pose and illumination) as the original image and a desired value of the specified factor. Unlike disentangled representation learning, which requires two latent spaces, one for specified and another for unspecified factors, invariant representation learning needs only one such space. We encourage invariance to a specified factor by applying adversarial training using a variational autoencoder in the image space as opposed to the latent space. We strengthen this invariance by introducing a cyclic training process (forward and backward cycle). We also propose a new method to evaluate conditional generative networks. It compares how well different factors of variation can be predicted from the synthesized, as opposed to real, images. In quantitative terms, our approach attains state-of-the-art performance in experiments spanning three datasets with factors such as identity, pose, illumination or style. Our method produces sharp, high-quality synthetic images with few visible artefacts compared to previous approaches.
Submitted 16 April, 2020; v1 submitted 21 June, 2019;
originally announced June 2019.
-
BUOCA: Budget-Optimized Crowd Worker Allocation
Authors:
Mehrnoosh Sameki,
Sha Lai,
Kate K. Mays,
Lei Guo,
Prakash Ishwar,
Margrit Betke
Abstract:
Due to concerns about human error in crowdsourcing, it is standard practice to collect labels for the same data point from multiple internet workers. We here show that the resulting budget can be used more effectively with a flexible worker assignment strategy that asks fewer workers to analyze easy-to-label data and more workers to analyze data that requires extra scrutiny. Our main contribution is to show how the allocations of the number of workers to a task can be computed optimally based on task features alone, without using worker profiles. Our target tasks are delineating cells in microscopy images and analyzing the sentiment toward the 2016 U.S. presidential candidates in tweets. We first propose an algorithm that computes budget-optimized crowd worker allocation (BUOCA). We next train a machine learning system (BUOCA-ML) that predicts an optimal number of crowd workers needed to maximize the accuracy of the labeling. We show that the computed allocation can yield large savings in the crowdsourcing budget (up to 49 percent points) while maintaining labeling accuracy. Finally, we envisage a human-machine system for performing budget-optimized data analysis at a scale beyond the feasibility of crowdsourcing.
Submitted 11 January, 2019;
originally announced January 2019.
-
Collaborative Privacy for Web Applications
Authors:
Yihao Hu,
Ari Trachtenberg,
Prakash Ishwar
Abstract:
Real-time, online-editing web apps provide free and convenient services for collaboratively editing, sharing and storing files. The benefits of these web applications do not come for free: not only do service providers have full access to the users' files, but they also control access, transmission, and storage mechanisms for them. As a result, user data may be at risk of data mining, third-party interception, or even manipulation. To combat this, we propose a new system for helping to preserve the privacy of user data within collaborative environments. There are several distinct challenges in producing such a system, including developing an encryption mechanism that does not interfere with the back-end (and often proprietary) control mechanisms utilized by the service, and identifying transparent code hooks through which to obfuscate user data. Toward the first challenge, we develop a character-level encryption scheme that is more resilient to the types of attacks that plague classical substitution ciphers. For the second challenge, we design a browser extension that robustly demonstrates the feasibility of our approach, and show a concrete implementation for Google Chrome and the widely-used Google Docs platform. Our example tangibly demonstrates how several users with a shared key can collaboratively and transparently edit a Google Docs document without revealing the plaintext directly to Google.
Submitted 17 November, 2019; v1 submitted 10 January, 2019;
originally announced January 2019.
-
VGAN-Based Image Representation Learning for Privacy-Preserving Facial Expression Recognition
Authors:
Jiawei Chen,
Janusz Konrad,
Prakash Ishwar
Abstract:
Reliable facial expression recognition plays a critical role in human-machine interactions. However, most of the facial expression analysis methodologies proposed to date pay little or no attention to the protection of a user's privacy. In this paper, we propose a Privacy-Preserving Representation-Learning Variational Generative Adversarial Network (PPRL-VGAN) to learn an image representation that is explicitly disentangled from the identity information. At the same time, this representation is discriminative from the standpoint of facial expression recognition and generative as it allows expression-equivalent face image synthesis. We evaluate the proposed model on two public datasets under various threat scenarios. Quantitative and qualitative results demonstrate that our approach strikes a balance between the preservation of privacy and data utility. We further demonstrate that our model can be effectively applied to other tasks such as expression morphing and image completion.
Submitted 7 September, 2018; v1 submitted 19 March, 2018;
originally announced March 2018.
-
Privacy-Preserving Adversarial Networks
Authors:
Ardhendu Tripathy,
Ye Wang,
Prakash Ishwar
Abstract:
We propose a data-driven framework for optimizing privacy-preserving data release mechanisms to attain the information-theoretically optimal tradeoff between minimizing distortion of useful data and concealing specific sensitive information. Our approach employs adversarially-trained neural networks to implement randomized mechanisms and to perform a variational approximation of mutual information privacy. We validate our Privacy-Preserving Adversarial Networks (PPAN) framework via proof-of-concept experiments on discrete and continuous synthetic data, as well as the MNIST handwritten digits dataset. For synthetic data, our model-agnostic PPAN approach achieves tradeoff points very close to the optimal tradeoffs that are analytically-derived from model knowledge. In experiments with the MNIST data, we visually demonstrate a learned tradeoff between minimizing the pixel-level distortion versus concealing the written digit.
Submitted 12 June, 2019; v1 submitted 19 December, 2017;
originally announced December 2017.
-
Privacy-Utility Tradeoffs under Constrained Data Release Mechanisms
Authors:
Ye Wang,
Yuksel Ozan Basciftci,
Prakash Ishwar
Abstract:
Privacy-preserving data release mechanisms aim to simultaneously minimize information-leakage with respect to sensitive data and distortion with respect to useful data. Dependencies between sensitive and useful data result in a privacy-utility tradeoff that has strong connections to generalized rate-distortion problems. In this work, we study how the optimal privacy-utility tradeoff region is affected by constraints on the data that is directly available as input to the release mechanism. In particular, we consider the availability of only sensitive data, only useful data, and both (full data). We show that a general hierarchy holds: the tradeoff region given only the sensitive data is no larger than the region given only the useful data, which in turn is clearly no larger than the region given both sensitive and useful data. In addition, we determine conditions under which the tradeoff region given only the useful data coincides with that given full data. These are based on the common information between the sensitive and useful data. We establish these results for general families of privacy and utility measures that satisfy certain natural properties required of any reasonable measure of privacy or utility. We also uncover a new, subtler aspect of the data processing inequality for general non-symmetric privacy measures and discuss its operational relevance and implications. Finally, we derive exact closed-analytic-form expressions for the privacy-utility tradeoffs for symmetrically dependent sensitive and useful data under mutual information and Hamming distortion as the respective privacy and utility measures.
Submitted 25 October, 2017;
originally announced October 2017.
-
Node Embedding via Word Embedding for Network Community Discovery
Authors:
Weicong Ding,
Christy Lin,
Prakash Ishwar
Abstract:
Neural node embeddings have recently emerged as a powerful representation for supervised learning tasks involving graph-structured data. We leverage this recent advance to develop a novel algorithm for unsupervised community discovery in graphs. Through extensive experimental studies on simulated and real-world data, we demonstrate that the proposed approach consistently improves over the current state-of-the-art. Specifically, our approach empirically attains the information-theoretic limits for community recovery under the benchmark Stochastic Block Models for graph generation and exhibits better stability and accuracy over both Spectral Clustering and Acyclic Belief Propagation in the community recovery limits.
Submitted 28 June, 2017; v1 submitted 9 November, 2016;
originally announced November 2016.
-
Semi-Coupled Two-Stream Fusion ConvNets for Action Recognition at Extremely Low Resolutions
Authors:
Jiawei Chen,
Jonathan Wu,
Janusz Konrad,
Prakash Ishwar
Abstract:
Deep convolutional neural networks (ConvNets) have been recently shown to attain state-of-the-art performance for action recognition on standard-resolution videos. However, less attention has been paid to recognition performance at extremely low resolutions (eLR) (e.g., 16 x 12 pixels). Reliable action recognition using eLR cameras would address privacy concerns in various application environments such as private homes, hospitals, nursing/rehabilitation facilities, etc. In this paper, we propose a semi-coupled filter-sharing network that leverages high resolution (HR) videos during training in order to assist an eLR ConvNet. We also study methods for fusing spatial and temporal ConvNets customized for eLR videos in order to take advantage of appearance and motion information. Our method outperforms state-of-the-art methods at extremely low resolutions on IXMAS (93.7%) and HMDB (29.2%) datasets.
Submitted 5 October, 2018; v1 submitted 12 October, 2016;
originally announced October 2016.
-
Necessary and Sufficient Conditions and a Provably Efficient Algorithm for Separable Topic Discovery
Authors:
Weicong Ding,
Prakash Ishwar,
Venkatesh Saligrama
Abstract:
We develop necessary and sufficient conditions and a novel provably consistent and efficient algorithm for discovering topics (latent factors) from observations (documents) that are realized from a probabilistic mixture of shared latent factors that have certain properties. Our focus is on the class of topic models in which each shared latent factor contains a novel word that is unique to that factor, a property that has come to be known as separability. Our algorithm is based on the key insight that the novel words correspond to the extreme points of the convex hull formed by the row-vectors of a suitably normalized word co-occurrence matrix. We leverage this geometric insight to establish polynomial computation and sample complexity bounds based on a few isotropic random projections of the rows of the normalized word co-occurrence matrix. Our proposed random-projections-based algorithm is naturally amenable to an efficient distributed implementation and is attractive for modern web-scale distributed data mining applications.
Submitted 4 December, 2015; v1 submitted 22 August, 2015;
originally announced August 2015.
-
Learning Mixed Membership Mallows Models from Pairwise Comparisons
Authors:
Weicong Ding,
Prakash Ishwar,
Venkatesh Saligrama
Abstract:
We propose a novel parameterized family of Mixed Membership Mallows Models (M4) to account for variability in pairwise comparisons generated by a heterogeneous population of noisy and inconsistent users. M4 models individual preferences as a user-specific probabilistic mixture of shared latent Mallows components. Our key algorithmic insight for estimation is to establish a statistical connection between M4 and topic models by viewing pairwise comparisons as words, and users as documents. This key insight leads us to explore Mallows components with a separable structure and leverage recent advances in separable topic discovery. While separability appears to be overly restrictive, we nevertheless show that it is an inevitable outcome of a relatively small number of latent Mallows components in a world with a large number of items. We then develop an algorithm, based on robust extreme-point identification of convex polygons, that learns the reference rankings and is provably consistent with polynomial sample complexity guarantees. We demonstrate that our new model is empirically competitive with the current state-of-the-art approaches in predicting real-world preferences.
Submitted 3 April, 2015;
originally announced April 2015.
-
A Topic Modeling Approach to Ranking
Authors:
Weicong Ding,
Prakash Ishwar,
Venkatesh Saligrama
Abstract:
We propose a topic modeling approach to the prediction of preferences in pairwise comparisons. We develop a new generative model for pairwise comparisons that accounts for multiple shared latent rankings that are prevalent in a population of users. This new model also captures inconsistent user behavior in a natural way. We show how the estimation of latent rankings in the new generative model can be formally reduced to the estimation of topics in a statistically equivalent topic modeling problem. We leverage recent advances in the topic modeling literature to develop an algorithm that can learn shared latent rankings with provable consistency as well as sample and computational complexity guarantees. We demonstrate that the new approach is empirically competitive with the current state-of-the-art approaches in predicting preferences on some semi-synthetic and real world datasets.
Submitted 25 January, 2015; v1 submitted 11 December, 2014;
originally announced December 2014.
-
A note on the sum-rate-distortion function of some lossy source coding problems involving infinite-valued distortion functions
Authors:
Prakash Ishwar
Abstract:
For a number of lossy source coding problems it is shown that even if the usual single-letter sum-rate-distortion expressions may become invalid for non-infinite distortion functions, they can be approached, to any desired accuracy, via the usual valid expressions for appropriately truncated finite versions of the distortion functions.
Submitted 4 August, 2014;
originally announced August 2014.
-
An Elementary Completeness Proof for Secure Two-Party Computation Primitives
Authors:
Ye Wang,
Prakash Ishwar,
Shantanu Rane
Abstract:
In the secure two-party computation problem, two parties wish to compute a (possibly randomized) function of their inputs via an interactive protocol, while ensuring that neither party learns more than what can be inferred from only their own input and output. For semi-honest parties and information-theoretic security guarantees, it is well-known that, if only noiseless communication is available, only a limited set of functions can be securely computed; however, if interaction is also allowed over general communication primitives (multi-input/output channels), there are "complete" primitives that enable any function to be securely computed. The general set of complete primitives was characterized recently by Maji, Prabhakaran, and Rosulek leveraging an earlier specialized characterization by Kilian. Our contribution in this paper is a simple, self-contained, alternative derivation using elementary information-theoretic tools.
Submitted 12 December, 2014; v1 submitted 18 February, 2014;
originally announced February 2014.
-
Sensing-Aware Kernel SVM
Authors:
Weicong Ding,
Prakash Ishwar,
Venkatesh Saligrama,
W. Clem Karl
Abstract:
We propose a novel approach for designing kernels for support vector machines (SVMs) when the class label is linked to the observation through a latent state and the likelihood function of the observation given the state (the sensing model) is available. We show that the Bayes-optimum decision boundary is a hyperplane under a mapping defined by the likelihood function. Combining this with the maximum margin principle yields kernels for SVMs that leverage knowledge of the sensing model in an optimal way. We derive the optimum kernel for the bag-of-words (BoWs) sensing model and demonstrate its superior performance over other kernels in document and image classification tasks. These results indicate that such optimum sensing-aware kernel SVMs can match the performance of rather sophisticated state-of-the-art approaches.
Submitted 13 March, 2014; v1 submitted 2 December, 2013;
originally announced December 2013.
-
On Unconditionally Secure Multiparty Computation for Realizing Correlated Equilibria in Games
Authors:
Ye Wang,
Shantanu Rane,
Prakash Ishwar
Abstract:
In game theory, a trusted mediator acting on behalf of the players can enable the attainment of correlated equilibria, which may provide better payoffs than those available from the Nash equilibria alone. We explore the approach of replacing the trusted mediator with an unconditionally secure sampling protocol that jointly generates the players' actions. We characterize the joint distributions that can be securely sampled by malicious players via protocols using error-free communication. This class of distributions depends on whether players may speak simultaneously ("cheap talk") or must speak in turn ("polite talk"). In applying sampling protocols toward attaining correlated equilibria with rational players, we observe that security against malicious parties may be much stronger than necessary. We propose the concept of secure sampling by rational players, and show that many more distributions are feasible given certain utility functions. However, the payoffs attainable via secure sampling by malicious players are a dominant subset of the rationally attainable payoffs.
Submitted 6 November, 2013;
originally announced November 2013.
-
Necessary and Sufficient Conditions for Novel Word Detection in Separable Topic Models
Authors:
Weicong Ding,
Prakash Ishwar,
Mohammad H. Rohban,
Venkatesh Saligrama
Abstract:
The simplicial condition and other stronger conditions that imply it have recently played a central role in developing polynomial time algorithms with provable asymptotic consistency and sample complexity guarantees for topic estimation in separable topic models. Of these algorithms, those that rely solely on the simplicial condition are impractical while the practical ones need stronger conditions. In this paper, we demonstrate, for the first time, that the simplicial condition is a fundamental, algorithm-independent, information-theoretic necessary condition for consistent separable topic estimation. Furthermore, under solely the simplicial condition, we present a practical quadratic-complexity algorithm based on random projections which consistently detects all novel words of all topics using only up to second-order empirical word moments. This algorithm is amenable to distributed implementation making it attractive for 'big-data' scenarios involving a network of large distributed databases.
Submitted 29 October, 2013;
originally announced October 2013.
-
Secure Biometrics: Concepts, Authentication Architectures and Challenges
Authors:
Shantanu Rane,
Ye Wang,
Stark C. Draper,
Prakash Ishwar
Abstract:
Biometrics are an important and widely used class of methods for identity verification and access control. Biometrics are attractive because they are inherent properties of an individual. They need not be remembered like passwords, and are not easily lost or forged like identifying documents. At the same time, biometrics are fundamentally noisy and irreplaceable. There are always slight variations among the measurements of a given biometric, and, unlike passwords or identification numbers, biometrics are derived from physical characteristics that cannot easily be changed. The proliferation of biometric usage raises critical privacy and security concerns that, due to the noisy nature of biometrics, cannot be addressed using standard cryptographic methods. In this article we present an overview of "secure biometrics", also referred to as "biometric template protection", an emerging class of methods that address these concerns.
Submitted 21 May, 2013;
originally announced May 2013.
-
Topic Discovery through Data Dependent and Random Projections
Authors:
Weicong Ding,
Mohammad H. Rohban,
Prakash Ishwar,
Venkatesh Saligrama
Abstract:
We present algorithms for topic modeling based on the geometry of cross-document word-frequency patterns. This perspective gains significance under the so-called separability condition, which requires the existence of novel words that are unique to each topic. We present a suite of highly efficient algorithms based on data-dependent and random projections of word-frequency patterns to identify novel words and associated topics. We also discuss the statistical guarantees of the data-dependent projections method based on two mild assumptions on the prior density of the topic-document matrix. Our key insight here is that the maximum and minimum values of cross-document frequency patterns projected along any direction are associated with novel words. While our sample complexity bounds for topic recovery are similar to the state of the art, the computational complexity of our random projection scheme scales linearly with the number of documents and the number of words per document. We present several experiments on synthetic and real-world datasets to demonstrate qualitative and quantitative merits of our scheme.
Submitted 18 March, 2013; v1 submitted 14 March, 2013;
originally announced March 2013.
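The key insight stated in the abstract above (extreme values of projected word-frequency patterns are attained at novel words) can be illustrated with a minimal sketch. This is not the authors' algorithm: it assumes noise-free, row-normalized word-frequency patterns, a synthetic topic matrix with one novel word per topic, and a hypothetical helper name `novel_word_candidates`.

```python
import numpy as np

def novel_word_candidates(x_bar, num_projections, rng):
    """Collect the words attaining the max or min along random directions."""
    candidates = set()
    for _ in range(num_projections):
        direction = rng.standard_normal(x_bar.shape[1])
        projection = x_bar @ direction
        candidates.add(int(np.argmax(projection)))
        candidates.add(int(np.argmin(projection)))
    return candidates

rng = np.random.default_rng(0)
num_topics, num_words, num_docs = 3, 20, 200

# Synthetic separable topic matrix (assumption): word k < num_topics is
# novel, i.e. it has nonzero probability only under topic k.
beta = rng.random((num_words, num_topics)) + 0.05
beta[:num_topics, :] = 0.0
beta[np.arange(num_topics), np.arange(num_topics)] = 1.0
beta /= beta.sum(axis=0, keepdims=True)  # each column is a word distribution

# Topic weights per document and the ideal (noise-free) word frequencies.
theta = rng.dirichlet(np.ones(num_topics), size=num_docs).T
x = beta @ theta

# Row-normalize so each word's cross-document pattern sums to one. Rows of
# non-novel words are then convex combinations of the novel-word rows.
x_bar = x / x.sum(axis=1, keepdims=True)

found = novel_word_candidates(x_bar, num_projections=50, rng=rng)
print(sorted(found))
```

In this idealized setting every index in `found` should be one of the novel words (indices below `num_topics`), because a linear functional over a convex hull is maximized and minimized at its extreme points. With empirical frequencies from finite documents the extremes are only approximately at novel words, which is where the statistical guarantees mentioned in the abstract come in.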
-
An Impossibility Result for High Dimensional Supervised Learning
Authors:
Mohammad Hossein Rohban,
Prakash Ishwar,
Birant Orten,
William C. Karl,
Venkatesh Saligrama
Abstract:
We study high-dimensional asymptotic performance limits of binary supervised classification problems where the class conditional densities are Gaussian with unknown means and covariances and the number of signal dimensions scales faster than the number of labeled training samples. We show that the Bayes error, namely the minimum attainable error probability with complete distributional knowledge and equally likely classes, can be arbitrarily close to zero and yet the limiting minimax error probability of every supervised learning algorithm is no better than a random coin toss. In contrast to related studies where the classification difficulty (Bayes error) is made to vanish, we hold it constant when taking high-dimensional limits. In contrast to VC-dimension based minimax lower bounds that consider the worst case error probability over all distributions that have a fixed Bayes error, our worst case is over the family of Gaussian distributions with constant Bayes error. We also show that a nontrivial asymptotic minimax error probability can only be attained for parametric subsets of zero measure (in a suitable measure space). These results expose the fundamental importance of prior knowledge and suggest that unless we impose strong structural constraints, such as sparsity, on the parametric space, supervised learning may be ineffective in high dimensional small sample settings.
Submitted 25 April, 2013; v1 submitted 29 January, 2013;
originally announced January 2013.
-
A New Geometric Approach to Latent Topic Modeling and Discovery
Authors:
Weicong Ding,
Mohammad H. Rohban,
Prakash Ishwar,
Venkatesh Saligrama
Abstract:
A new geometrically-motivated algorithm for nonnegative matrix factorization is developed and applied to the discovery of latent "topics" for text and image "document" corpora. The algorithm is based on robustly finding and clustering extreme points of empirical cross-document word-frequencies that correspond to novel "words" unique to each topic. In contrast to related approaches that are based on solving non-convex optimization problems using suboptimal approximations, locally-optimal methods, or heuristics, the new algorithm is convex, has polynomial complexity, and has competitive qualitative and quantitative performance compared to the current state-of-the-art approaches on synthetic and real-world datasets.
Submitted 4 January, 2013;
originally announced January 2013.
-
Information-Theoretically Secure Three-Party Computation with One Corrupted Party
Authors:
Ye Wang,
Prakash Ishwar,
Shantanu Rane
Abstract:
The problem in which one of three pairwise interacting parties is required to securely compute a function of the inputs held by the other two, when one party may arbitrarily deviate from the computation protocol (active behavioral model), is studied. An information-theoretic characterization of unconditionally secure computation protocols under the active behavioral model is provided. A protocol for Hamming distance computation is provided and shown to be unconditionally secure under both active and passive behavioral models using the information-theoretic characterization. The difference between the notions of security under the active and passive behavioral models is illustrated through the BGW protocol for computing quadratic and Hamming distances; this protocol is secure under the passive model, but is shown to be not secure under the active model.
Submitted 4 February, 2013; v1 submitted 12 June, 2012;
originally announced June 2012.
-
A Theoretical Analysis of Authentication, Privacy and Reusability Across Secure Biometric Systems
Authors:
Ye Wang,
Shantanu Rane,
Stark C. Draper,
Prakash Ishwar
Abstract:
We present a theoretical framework for the analysis of privacy and security tradeoffs in secure biometric authentication systems. We use this framework to conduct a comparative information-theoretic analysis of two biometric systems that are based on linear error correction codes, namely fuzzy commitment and secure sketches. We derive upper bounds for the probability of false rejection ($P_{FR}$) and false acceptance ($P_{FA}$) for these systems. We use mutual information to quantify the information leaked about a user's biometric identity, in the scenario where one or multiple biometric enrollments of the user are fully or partially compromised. We also quantify the probability of successful attack ($P_{SA}$) based on the compromised information. Our analysis reveals that fuzzy commitment and secure sketch systems have identical $P_{FR}, P_{FA}, P_{SA}$ and information leakage, but secure sketch systems have lower storage requirements. We analyze both single-factor (keyless) and two-factor (key-based) variants of secure biometrics, and consider the most general scenarios in which a single user may provide noisy biometric enrollments at several access control devices, some of which may be subsequently compromised by an attacker. Our analysis highlights the revocability and reusability properties of key-based systems and exposes a subtle design tradeoff between reducing information leakage from compromised systems and preventing successful attacks on systems whose data have not been compromised.
Submitted 23 December, 2011;
originally announced December 2011.
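For readers unfamiliar with fuzzy commitment, one of the two code-based systems compared in the abstract above, the following is a minimal sketch of the general idea. It is not the construction analyzed in the paper: it assumes a simple repetition code as the linear error-correcting code, binary biometric feature vectors, and SHA-256 as the stored key digest, and all function names are hypothetical.

```python
import hashlib
import numpy as np

REP = 5  # assumption: each key bit is repeated 5 times (corrects 2 flips per block)

def encode(key_bits):
    """Repetition-code encoder: repeat every key bit REP times."""
    return np.repeat(key_bits, REP)

def decode(noisy_codeword):
    """Majority vote within each block of REP bits."""
    blocks = noisy_codeword.reshape(-1, REP)
    return (blocks.sum(axis=1) > REP // 2).astype(np.uint8)

def digest(bits):
    """Hash of the key bits; only this digest is stored, not the key."""
    return hashlib.sha256(bits.astype(np.uint8).tobytes()).hexdigest()

def enroll(biometric, rng):
    """Bind a random key to the biometric; store helper data and key digest."""
    key = rng.integers(0, 2, size=biometric.size // REP, dtype=np.uint8)
    helper = encode(key) ^ biometric
    return helper, digest(key)

def authenticate(probe, helper, key_digest):
    """Recover a key estimate from the probe and compare digests."""
    key_estimate = decode(helper ^ probe)
    return digest(key_estimate) == key_digest

rng = np.random.default_rng(1)
enrolled = rng.integers(0, 2, size=40, dtype=np.uint8)
helper, key_digest = enroll(enrolled, rng)

genuine = enrolled.copy()
genuine[[0, 7, 13]] ^= 1   # one bit flip in each of three different blocks
impostor = enrolled ^ 1    # every bit differs from the enrolled vector

print(authenticate(genuine, helper, key_digest))
print(authenticate(impostor, helper, key_digest))
```

The noisy genuine probe is accepted because the code corrects the measurement noise, while the impostor probe decodes to a different key. The quantities studied in the abstract ($P_{FR}$, $P_{FA}$, $P_{SA}$, and information leakage) describe how such a system behaves when the helper data or digest is compromised.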
-
Unconditionally Secure Computation on Large Distributed Databases with Vanishing Cost
Authors:
Ye Wang,
Shantanu Rane,
Prakash Ishwar,
Wei Sun
Abstract:
Consider a network of k parties, each holding a long sequence of n entries (a database), with minimum vertex-cut greater than t. We show that any empirical statistic across the network of databases can be computed by each party with perfect privacy, against any set of t < k/2 passively colluding parties, such that the worst-case distortion and communication cost (in bits per database entry) both go to zero as n, the number of entries in the databases, goes to infinity. This is based on combining a striking dimensionality reduction result for random sampling with unconditionally secure multi-party computation protocols.
Submitted 18 February, 2014; v1 submitted 4 October, 2010;
originally announced October 2010.
-
Interaction Strictly Improves the Wyner-Ziv Rate-distortion function
Authors:
Nan Ma,
Prakash Ishwar
Abstract:
In 1985 Kaspi provided a single-letter characterization of the sum-rate-distortion function for a two-way lossy source coding problem in which two terminals send multiple messages back and forth with the goal of reproducing each other's sources. Yet, the question remained whether more messages can strictly improve the sum-rate-distortion function. Viewing the sum-rate as a functional of the distortions and the joint source distribution and leveraging its convex-geometric properties, we construct an example which shows that two messages can strictly improve the one-message (Wyner-Ziv) rate-distortion function. The example also shows that the ratio of the one-message rate to the two-message sum-rate can be arbitrarily large and simultaneously the ratio of the backward rate to the forward rate in the two-message sum-rate can be arbitrarily small.
Submitted 1 June, 2010; v1 submitted 15 January, 2010;
originally announced January 2010.
-
Infinite-message Interactive Function Computation in Collocated Networks
Authors:
Nan Ma,
Prakash Ishwar
Abstract:
An interactive function computation problem in a collocated network is studied in a distributed block source coding framework. With the goal of computing a desired function at the sink, the source nodes exchange messages through a sequence of error-free broadcasts. The infinite-message minimum sum-rate is viewed as a functional of the joint source pmf and is characterized as the least element in a partially ordered family of functionals having certain convex-geometric properties. This characterization leads to a family of lower bounds for the infinite-message minimum sum-rate and a simple optimality test for any achievable infinite-message sum-rate. An iterative algorithm for evaluating the infinite-message minimum sum-rate functional is proposed and is demonstrated through an example of computing the minimum function of three sources.
Submitted 1 June, 2010; v1 submitted 11 January, 2010;
originally announced January 2010.
-
The Infinite-message Limit of Two-terminal Interactive Source Coding
Authors:
Nan Ma,
Prakash Ishwar
Abstract:
A two-terminal interactive function computation problem with alternating messages is studied within the framework of distributed block source coding theory. For any finite number of messages, a single-letter characterization of the sum-rate-distortion function was established in previous works using standard information-theoretic techniques. This, however, does not provide a satisfactory characterization of the infinite-message limit, which is a new, unexplored dimension for asymptotic analysis in distributed block source coding involving potentially an infinite number of infinitesimal-rate messages. In this paper, the infinite-message sum-rate-distortion function, viewed as a functional of the joint source pmf and the distortion levels, is characterized as the least element of a partially ordered family of functionals having certain convex-geometric properties. The new characterization does not involve evaluating the infinite-message limit of a finite-message sum-rate-distortion expression. This characterization leads to a family of lower bounds for the infinite-message sum-rate-distortion expression and a simple criterion to test the optimality of any achievable infinite-message sum-rate-distortion expression. For computing the samplewise Boolean AND function, the infinite-message minimum sum-rates are characterized in closed analytic form. These sum-rates are shown to be achievable using infinitely many infinitesimal-rate messages. The new convex-geometric characterization is used to develop an iterative algorithm for evaluating any finite-message sum-rate-distortion function. It is also used to construct the first examples which demonstrate that for lossy source reproduction, two messages can strictly improve the one-message Wyner-Ziv rate-distortion function, settling an unresolved question from a 1985 paper.
Submitted 2 August, 2012; v1 submitted 24 August, 2009;
originally announced August 2009.
-
Bootstrapped Oblivious Transfer and Secure Two-Party Function Computation
Authors:
Ye Wang,
Prakash Ishwar
Abstract:
We propose an information theoretic framework for the secure two-party function computation (SFC) problem and introduce the notion of SFC capacity. We study and extend string oblivious transfer (OT) to sample-wise OT. We propose an efficient, perfectly private OT protocol utilizing the binary erasure channel or source. We also propose the bootstrap string OT protocol which provides disjoint (weakened) privacy while achieving a multiplicative increase in rate, thus trading off security for rate. Finally, leveraging our OT protocol, we construct a protocol for SFC and establish a general lower bound on SFC capacity of the binary erasure channel and source.
Submitted 4 February, 2009;
originally announced February 2009.
-
Information-Theoretic Bounds for Multiround Function Computation in Collocated Networks
Authors:
Nan Ma,
Prakash Ishwar,
Piyush Gupta
Abstract:
We study the limits of communication efficiency for function computation in collocated networks within the framework of multi-terminal block source coding theory. With the goal of computing a desired function of sources at a sink, nodes interact with each other through a sequence of error-free, network-wide broadcasts of finite-rate messages. For any function of independent sources, we derive a computable characterization of the set of all feasible message coding rates - the rate region - in terms of single-letter information measures. We show that when computing symmetric functions of binary sources, the sink will inevitably learn certain additional information which is not demanded in computing the function. This conceptual understanding leads to new improved bounds for the minimum sum-rate. The new bounds are shown to be orderwise better than those based on cut-sets as the network scales. The scaling law of the minimum sum-rate is explored for different classes of symmetric functions and source parameters.
Submitted 28 January, 2009; v1 submitted 15 January, 2009;
originally announced January 2009.
-
Distributed Source Coding for Interactive Function Computation
Authors:
Nan Ma,
Prakash Ishwar
Abstract:
A two-terminal interactive distributed source coding problem with alternating messages for function computation at both locations is studied. For any number of messages, a computable characterization of the rate region is provided in terms of single-letter information measures. While interaction is useless in terms of the minimum sum-rate for lossless source reproduction at one or both locations, the gains can be arbitrarily large for function computation even when the sources are independent. For a class of sources and functions, interaction is shown to be useless, even with infinite messages, when a function has to be computed at only one location, but is shown to be useful if functions have to be computed at both locations. For computing the Boolean AND function of two independent Bernoulli sources at both locations, an achievable infinite-message sum-rate with infinitesimal-rate messages is derived in terms of a two-dimensional definite integral and a rate-allocation curve. A general framework for multiterminal interactive function computation based on an information exchange protocol which successively switches among different distributed source coding configurations is developed. For networks with a star topology, multiple rounds of interactive coding are shown to decrease the scaling law of the total network rate by an order of magnitude as the network grows.
Submitted 12 November, 2008; v1 submitted 4 January, 2008;
originally announced January 2008.
-
Benefit of Delay on the Diversity-Multiplexing Tradeoffs of MIMO Channels with Partial CSI
Authors:
Masoud Sharif,
Prakash Ishwar
Abstract:
This paper re-examines the well-known fundamental tradeoffs between rate and reliability for the multi-antenna, block Rayleigh fading channel in the high signal-to-noise ratio (SNR) regime when (i) the transmitter has access to (noiseless) one bit per coherence interval of causal channel state information (CSI) and (ii) soft decoding delays together with worst-case delay guarantees are acceptable. A key finding of this work is that substantial improvements in reliability can be realized with a very short expected delay and a slightly longer (but bounded) worst-case decoding delay guarantee in communication systems where the transmitter has access to even one bit per coherence interval of causal CSI. While similar in spirit to the recent work on communication systems based on automatic repeat requests (ARQ), where decoding failure is known at the transmitter and leads to re-transmission, here transmit side-information is purely based on CSI. The findings reported here also lend further support to an emerging understanding that decoding delay (related to throughput) and codeword blocklength (related to coding complexity and delays) are distinctly different design parameters which can be tuned to control reliability.
Submitted 14 July, 2007;
originally announced July 2007.