-
Navigating the Web of Misinformation: A Framework for Misinformation Domain Detection Using Browser Traffic
Authors:
Mayana Pereira,
Kevin Greene,
Nilima Pisharody,
Rahul Dodhia,
Jacob N. Shapiro,
Juan Lavista
Abstract:
The proliferation of misinformation and propaganda is a global challenge, with profound effects during major crises such as the COVID-19 pandemic and the Russian invasion of Ukraine. Understanding the spread of misinformation and its social impacts requires identifying the news sources spreading false information. While machine learning (ML) techniques have been proposed to address this issue, ML models have failed to provide an efficient implementation scenario that yields useful results. In prior research, precision deteriorates significantly when models are deployed on real traffic, dropping by up to a factor of ten compared with the results obtained on benchmark data sets. Our research addresses this gap by proposing a graph-based approach to capture navigational patterns and generate traffic-based features, which are used to train a classification model. These navigational and traffic-based features result in classifiers that present outstanding performance when evaluated against real traffic. We also propose graph-based filtering techniques to filter out models to be classified by our framework. These filtering techniques increase the signal-to-noise ratio of the models to be classified, greatly reducing false positives and the computational cost of deploying the model. Our proposed framework for the detection of misinformation domains achieves a precision of 0.78 when evaluated in real traffic. This outcome represents an improvement of more than a factor of ten over the results achieved in previous studies.
Submitted 24 July, 2023;
originally announced July 2023.
-
Multishot Adversarial Network Decoding
Authors:
Giuseppe Cotardo,
Gretchen L. Matthews,
Alberto Ravagnani,
Julia Shapiro
Abstract:
We investigate adversarial network coding and decoding focusing on the multishot regime. Errors can occur on a proper subset of the network edges and are modeled via an adversarial channel. The paper contains both bounds and capacity-achieving schemes for the Diamond Network and the Mirrored Diamond Network. We also initiate the study of the generalizations of these networks.
Submitted 17 July, 2023;
originally announced July 2023.
-
Analyzing Intentional Behavior in Autonomous Agents under Uncertainty
Authors:
Filip Cano Córdoba,
Samuel Judson,
Timos Antonopoulos,
Katrine Bjørner,
Nicholas Shoemaker,
Scott J. Shapiro,
Ruzica Piskac,
Bettina Könighofer
Abstract:
Principled accountability for autonomous decision-making in uncertain environments requires distinguishing intentional outcomes from negligent designs from actual accidents. We propose analyzing the behavior of autonomous agents through a quantitative measure of the evidence of intentional behavior. We model an uncertain environment as a Markov Decision Process (MDP). For a given scenario, we rely on probabilistic model checking to compute the ability of the agent to influence reaching a certain event. We call this the scope of agency. We say that there is evidence of intentional behavior if the scope of agency is high and the decisions of the agent are close to being optimal for reaching the event. Our method applies counterfactual reasoning to automatically generate relevant scenarios that can be analyzed to increase the confidence of our assessment. In a case study, we show how our method can distinguish between 'intentional' and 'accidental' traffic collisions.
Submitted 4 July, 2023;
originally announced July 2023.
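The abstract above relies on probabilistic model checking to measure how strongly an agent can influence reaching an event. A minimal sketch of that kind of computation is value iteration for the maximum and minimum probability of reaching a target state in a small MDP. The toy MDP, its probabilities, and the function name `reach_probability` are hypothetical, and reading the gap between the two values as a proxy for the scope of agency is an assumption, not the authors' definition.

```python
def reach_probability(transitions, targets, optimise, iterations=100):
    """Value iteration for the probability of eventually reaching `targets`.

    transitions[state][action] is a list of (probability, next_state) pairs.
    `optimise` is `max` (best case for reaching) or `min` (worst case).
    Starting from 0 for non-target states converges to the least fixed point,
    which is the reachability probability in both cases.
    """
    value = {s: 1.0 if s in targets else 0.0 for s in transitions}
    for _ in range(iterations):
        for state, actions in transitions.items():
            if state in targets or not actions:
                continue  # target and absorbing states keep their value
            value[state] = optimise(
                sum(p * value[nxt] for p, nxt in outcomes)
                for outcomes in actions.values()
            )
    return value


# Hypothetical toy MDP: numbers are illustrative only.
toy_mdp = {
    "start": {
        "brake": [(0.1, "crash"), (0.9, "safe")],
        "accelerate": [(0.7, "crash"), (0.3, "safe")],
    },
    "crash": {},  # absorbing target
    "safe": {},   # absorbing non-target
}

p_max = reach_probability(toy_mdp, {"crash"}, max)["start"]
p_min = reach_probability(toy_mdp, {"crash"}, min)["start"]
gap = p_max - p_min  # assumed proxy for how much the agent's choices matter
```

A large gap means the agent's decisions strongly affect whether the event occurs; a gap near zero means the outcome is mostly outside its control.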
-
Towards Early Prediction of Human iPSC Reprogramming Success
Authors:
Abhineet Singh,
Ila Jasra,
Omar Mouhammed,
Nidheesh Dadheech,
Nilanjan Ray,
James Shapiro
Abstract:
This paper presents advancements in automated early-stage prediction of the success of reprogramming human induced pluripotent stem cells (iPSCs) as a potential source for regenerative cell therapies. The minuscule success rate of iPSC-reprogramming of around $0.01\%$ to $0.1\%$ makes it labor-intensive, time-consuming, and exorbitantly expensive to generate a stable iPSC line, since that requires culturing of millions of cells and intense biological scrutiny of multiple clones to identify a single optimal clone. The ability to reliably predict which cells are likely to establish as an optimal iPSC line at an early stage of pluripotency would therefore be ground-breaking in rendering this a practical and cost-effective approach to personalized medicine. Temporal information about changes in cellular appearance over time is crucial for predicting its future growth outcomes. In order to generate this data, we first performed continuous time-lapse imaging of iPSCs in culture using an ultra-high resolution microscope. We then annotated the locations and identities of cells in late-stage images where reliable manual identification is possible. Next, we propagated these labels backwards in time using a semi-automated tracking system to obtain labels for early stages of growth. Finally, we used this data to train deep neural networks to perform automatic cell segmentation and classification. Our code and data are available at https://github.com/abhineet123/ipsc_prediction.
Submitted 11 November, 2023; v1 submitted 23 May, 2023;
originally announced May 2023.
-
'Put the Car on the Stand': SMT-based Oracles for Investigating Decisions
Authors:
Samuel Judson,
Matthew Elacqua,
Filip Cano,
Timos Antonopoulos,
Bettina Könighofer,
Scott J. Shapiro,
Ruzica Piskac
Abstract:
Principled accountability in the aftermath of harms is essential to the trustworthy design and governance of algorithmic decision making. Legal theory offers a paramount method for assessing culpability: putting the agent 'on the stand' to subject their actions and intentions to cross-examination. We show that under minimal assumptions automated reasoning can rigorously interrogate algorithmic behaviors as in the adversarial process of legal fact finding. We model accountability processes, such as trials or review boards, as Counterfactual-Guided Logic Exploration and Abstraction Refinement (CLEAR) loops. We use the formal methods of symbolic execution and satisfiability modulo theories (SMT) solving to discharge queries about agent behavior in factual and counterfactual scenarios, as adaptively formulated by a human investigator. In order to do so, for a decision algorithm $\mathcal{A}$ we use symbolic execution to represent its logic as a statement $\Pi$ in the decidable theory $\texttt{QF\_FPBV}$. We implement our framework and demonstrate its utility on an illustrative car crash scenario.
Submitted 29 January, 2024; v1 submitted 9 May, 2023;
originally announced May 2023.
-
Machine Learning-Based Soft Sensors for Vacuum Distillation Unit
Authors:
Kamil Oster,
Stefan Güttel,
Lu Chen,
Jonathan L. Shapiro,
Megan Jobson
Abstract:
Product quality assessment in the petroleum processing industry can be difficult and time-consuming, e.g. due to manual collection of liquid samples from the plant and subsequent chemical laboratory analysis of the samples. The product quality is an important property that indicates whether the products of the process are within the specifications. In particular, the delays caused by sample processing (collection, laboratory measurements, results analysis, reporting) can lead to detrimental economic effects. One of the strategies to deal with this problem is soft sensors. Soft sensors are a collection of models that can be used to predict and forecast some infrequently measured properties (such as laboratory measurements of petroleum products) based on more frequent measurements of quantities like temperature, pressure and flow rate provided by physical sensors. Soft sensors short-cut the pathway to obtain relevant information about the product quality, often providing measurements as frequently as every minute. One of the applications of soft sensors is the real-time optimization of a chemical process by a targeted adaptation of operating parameters. Models used for soft sensors can take various forms; among the most common are those based on artificial neural networks (ANNs). While soft sensors can deal with some of the issues in refinery processes, their development and deployment can pose other challenges, which are addressed in this paper. Firstly, it is important to enhance the quality of both sets of data (laboratory measurements and physical sensors) in a data pre-processing stage (as described in the Methodology section). Secondly, once the data sets are pre-processed, different models need to be tested against prediction error and the model's interpretability. In this work, we present a framework for soft sensor development from raw data to ready-to-use models.
Submitted 19 November, 2021;
originally announced November 2021.
-
Pre-treatment of outliers and anomalies in plant data: Methodology and case study of a Vacuum Distillation Unit
Authors:
Kamil Oster,
Stefan Güttel,
Jonathan L. Shapiro,
Lu Chen,
Megan Jobson
Abstract:
Data pre-treatment plays a significant role in improving data quality, thus allowing extraction of accurate information from raw data. One commonly used data pre-treatment technique is outlier detection. The so-called $3\sigma$ method is common practice for identifying outliers. As shown in the manuscript, it does not identify all outliers, resulting in possible distortion of the overall statistics of the data. This problem can have a significant impact on further data analysis and can lead to a reduction in the accuracy of predictive models. There is a plethora of techniques for outlier detection; however, aside from theoretical work, they all require case study work. Two types of outliers were considered: short-term (erroneous data, noise) and long-term outliers (e.g. malfunctioning for longer periods). The data used were taken from the vacuum distillation unit (VDU) of an Asian refinery and included 40 physical sensors (temperature, pressure and flow rate). We used a modified method for $3\sigma$ thresholds to identify the short-term outliers, i.e. sensor data are divided into chunks determined by change points, and $3\sigma$ thresholds are calculated within each chunk, which represents a near-normal distribution. We have shown that the piecewise $3\sigma$ method offers a better approach to short-term outlier detection than the $3\sigma$ method applied to the entire time series. Nevertheless, this does not perform well for long-term outliers (which can represent another state in the data). In this case, we used principal component analysis (PCA) with Hotelling's $T^2$ statistics to identify the long-term outliers. The results obtained with PCA were then subjected to the DBSCAN clustering method. The outliers (which were visually obvious and correctly detected by the PCA method) were also correctly identified by DBSCAN, which supported the consistency and accuracy of the PCA method.
Submitted 17 June, 2021;
originally announced June 2021.
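The piecewise thresholding idea in the abstract above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `piecewise_three_sigma` is hypothetical, the change points are assumed to be already known (the abstract does not say how they are detected), and the population standard deviation is an assumed choice.

```python
from statistics import mean, pstdev


def piecewise_three_sigma(values, change_points):
    """Flag values more than 3 standard deviations from their chunk mean.

    `change_points` are assumed, pre-computed indices at which a new chunk
    starts. With an empty list, the whole series is one chunk, which is the
    ordinary 3-sigma rule.
    """
    bounds = [0] + sorted(change_points) + [len(values)]
    flags = [False] * len(values)
    for start, end in zip(bounds[:-1], bounds[1:]):
        chunk = values[start:end]
        if len(chunk) < 2:
            continue  # too few points to estimate a spread
        mu, sigma = mean(chunk), pstdev(chunk)
        for i in range(start, end):
            flags[i] = sigma > 0 and abs(values[i] - mu) > 3 * sigma
    return flags


# Synthetic series with a level shift at index 20 and one spike at index 5.
series = [10.0] * 20 + [100.0] * 20
series[5] = 15.0

piecewise_flags = piecewise_three_sigma(series, [20])
global_flags = piecewise_three_sigma(series, [])
```

In this synthetic series the level shift inflates the global standard deviation, so the spike is only flagged when thresholds are computed per chunk.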
-
Low-Resource Language Modelling of South African Languages
Authors:
Stuart Mesham,
Luc Hayward,
Jared Shapiro,
Jan Buys
Abstract:
Language models are the foundation of current neural network-based models for natural language understanding and generation. However, research on the intrinsic performance of language models on African languages has been extremely limited, which is made more challenging by the lack of large or standardised training and evaluation sets that exist for English and other high-resource languages. In this paper, we evaluate the performance of open-vocabulary language models on low-resource South African languages, using byte-pair encoding to handle the rich morphology of these languages. We evaluate different variants of n-gram models, feedforward neural networks, recurrent neural networks (RNNs), and Transformers on small-scale datasets. Overall, well-regularized RNNs give the best performance across two isiZulu and one Sepedi datasets. Multilingual training further improves performance on these datasets. We hope that this research will open new avenues for research into multilingual and low-resource language modelling for African languages.
Submitted 1 April, 2021;
originally announced April 2021.
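The abstract above uses byte-pair encoding to build an open vocabulary. As a rough sketch of how BPE merges are learned, the following applies the standard merge-learning loop to a toy corpus. The function name `learn_bpe`, the end-of-word marker `</w>`, and the tie-breaking rule are assumptions for illustration, and the corpus is synthetic rather than drawn from the paper's data.

```python
from collections import Counter


def learn_bpe(words, num_merges):
    """Learn up to `num_merges` BPE merge rules from a list of words."""
    # Each word starts as a tuple of characters plus an end-of-word marker.
    vocab = Counter(tuple(word) + ("</w>",) for word in words)
    merges = []
    for _ in range(num_merges):
        # Count adjacent symbol pairs, weighted by word frequency.
        pairs = Counter()
        for symbols, freq in vocab.items():
            for pair in zip(symbols, symbols[1:]):
                pairs[pair] += freq
        if not pairs:
            break
        # Most frequent pair wins; ties broken lexicographically (assumed).
        best = max(pairs, key=lambda p: (pairs[p], p))
        merges.append(best)
        # Rewrite every word with the chosen pair fused into one symbol.
        new_vocab = Counter()
        for symbols, freq in vocab.items():
            merged, i = [], 0
            while i < len(symbols):
                if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == best:
                    merged.append(symbols[i] + symbols[i + 1])
                    i += 2
                else:
                    merged.append(symbols[i])
                    i += 1
            new_vocab[tuple(merged)] += freq
        vocab = new_vocab
    return merges


toy_merges = learn_bpe(["abab", "abab", "ab"], 2)
```

Frequent character sequences become single tokens while rare words still decompose into smaller units, which is why the technique suits morphologically rich languages.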
-
Eliminating Catastrophic Interference with Biased Competition
Authors:
Amelia Elizabeth Pollard,
Jonathan L. Shapiro
Abstract:
We present here a model to take advantage of the multi-task nature of complex datasets by learning to separate tasks and subtasks in an end-to-end manner by biasing competitive interactions in the network. This method does not require additional labelling or reformatting of data in a dataset. We propose an alternate view to the monolithic one-task-fits-all learning of multi-task problems, and describe a model based on a theory of neuronal attention from neuroscience, proposed by Desimone. We create and exhibit a new toy dataset, based on the MNIST dataset, which we call MNIST-QA, for testing Visual Question Answering architectures in a low-dimensional environment while preserving the more difficult components of the Visual Question Answering task, and demonstrate the proposed network architecture on this new dataset, as well as on COCO-QA and DAQUAR-FULL. We then demonstrate that this model eliminates catastrophic interference between tasks on a newly created toy dataset and provides competitive results in the Visual Question Answering space. We provide further evidence that Visual Question Answering can be approached as a multi-task problem, and demonstrate that this new architecture based on the Biased Competition model is capable of learning to separate and learn the tasks in an end-to-end fashion without the need for task labels.
Submitted 3 July, 2020;
originally announced July 2020.
-
Visual Question Answering as a Multi-Task Problem
Authors:
Amelia Elizabeth Pollard,
Jonathan L. Shapiro
Abstract:
Visual Question Answering (VQA) is a highly complex problem set, relying on many sub-problems to produce reasonable answers. In this paper, we present the hypothesis that Visual Question Answering should be viewed as a multi-task problem, and provide evidence to support this hypothesis. We demonstrate this by reformatting two commonly used Visual Question Answering datasets, COCO-QA and DAQUAR, into a multi-task format and train these reformatted datasets on two baseline networks, with one designed specifically to eliminate other possible causes for performance changes as a result of the reformatting. Though the networks demonstrated in this paper do not achieve strongly competitive results, we find that the multi-task approach to Visual Question Answering results in increases in performance of 5-9% against the single-task formatting, and that the networks reach convergence much faster than in the single-task case. Finally, we discuss possible reasons for the observed difference in performance, and perform additional experiments which rule out causes not associated with the learning of the dataset as a multi-task problem.
Submitted 3 July, 2020;
originally announced July 2020.
-
ASAPP-ASR: Multistream CNN and Self-Attentive SRU for SOTA Speech Recognition
Authors:
Jing Pan,
Joshua Shapiro,
Jeremy Wohlwend,
Kyu J. Han,
Tao Lei,
Tao Ma
Abstract:
In this paper we present state-of-the-art (SOTA) performance on the LibriSpeech corpus with two novel neural network architectures, a multistream CNN for acoustic modeling and a self-attentive simple recurrent unit (SRU) for language modeling. In the hybrid ASR framework, the multistream CNN acoustic model processes an input of speech frames in multiple parallel pipelines where each stream has a unique dilation rate for diversity. Trained with the SpecAugment data augmentation method, it achieves relative word error rate (WER) improvements of 4% on test-clean and 14% on test-other. We further improve the performance via N-best rescoring using a 24-layer self-attentive SRU language model, achieving WERs of 1.75% on test-clean and 4.46% on test-other.
Submitted 21 May, 2020;
originally announced May 2020.
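The results above are reported as word error rates and relative WER improvements. The sketch below shows how those two quantities are commonly computed, assuming the usual definition of WER as word-level edit distance divided by reference length. The function names and example sentences are hypothetical, and the scoring pipeline used in the paper may normalise text differently.

```python
def word_error_rate(reference, hypothesis):
    """Word-level Levenshtein distance divided by the reference length.

    Assumes a non-empty reference and whitespace tokenisation.
    """
    ref, hyp = reference.split(), hypothesis.split()
    # prev[j] holds the edit distance between ref[:i-1] and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            curr.append(min(
                prev[j] + 1,                           # deletion
                curr[j - 1] + 1,                       # insertion
                prev[j - 1] + (ref_word != hyp_word),  # substitution or match
            ))
        prev = curr
    return prev[-1] / len(ref)


def relative_improvement(baseline_wer, new_wer):
    """Fraction by which `new_wer` improves on `baseline_wer`."""
    return (baseline_wer - new_wer) / baseline_wer


example_wer = word_error_rate("the cat sat on the mat", "the cat sit on mat")
example_gain = relative_improvement(2.0, 1.5)
```

A relative improvement is a ratio of error rates, so a 14% relative gain on a small absolute WER corresponds to a much smaller absolute change.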
-
Illusion of Causality in Visualized Data
Authors:
Cindy Xiong,
Joel Shapiro,
Jessica Hullman,
Steven Franconeri
Abstract:
Students who eat breakfast more frequently tend to have a higher grade point average. From this data, many people might confidently state that a before-school breakfast program would lead to higher grades. This is a reasoning error, because correlation does not necessarily indicate causation -- X and Y can be correlated without one directly causing the other. While this error is pervasive, its prevalence might be amplified or mitigated by the way that the data is presented to a viewer. Across three crowdsourced experiments, we examined whether how simple data relations are presented would mitigate this reasoning error. The first experiment tested examples similar to the breakfast-GPA relation, varying in the plausibility of the causal link. We asked participants to rate their level of agreement that the relation was correlated, which they rated appropriately as high. However, participants also expressed high agreement with a causal interpretation of the data. Levels of support for the causal interpretation were not equally strong across visualization types: causality ratings were highest for text descriptions and bar graphs, but weaker for scatter plots. But is this effect driven by bar graphs aggregating data into two groups or by the visual encoding type? We isolated data aggregation versus visual encoding type and examined their individual effect on perceived causality. Overall, different visualization designs afford different cognitive reasoning affordances across the same data. High levels of data aggregation by graphs tend to be associated with higher perceived causality in data. Participants perceived line and dot visual encodings as more causal than bar encodings. Our results demonstrate how some visualization designs trigger stronger causal links while choosing others can help mitigate unwarranted perceptions of causality.
Submitted 1 August, 2019;
originally announced August 2019.
-
Analyzing Polarization in Social Media: Method and Application to Tweets on 21 Mass Shootings
Authors:
Dorottya Demszky,
Nikhil Garg,
Rob Voigt,
James Zou,
Matthew Gentzkow,
Jesse Shapiro,
Dan Jurafsky
Abstract:
We provide an NLP framework to uncover four linguistic dimensions of political polarization in social media: topic choice, framing, affect and illocutionary force. We quantify these aspects with existing lexical methods, and propose clustering of tweet embeddings as a means to identify salient topics for analysis across events; human evaluations show that our approach generates more cohesive topics than traditional LDA-based models. We apply our methods to study 4.4M tweets on 21 mass shootings. We provide evidence that the discussion of these events is highly polarized politically and that this polarization is primarily driven by partisan differences in framing rather than topic choice. We identify framing devices, such as grounding and the contrasting use of the terms "terrorist" and "crazy", that contribute to polarization. Results pertaining to topic choice, affect and illocutionary force suggest that Republicans focus more on the shooter and event-specific facts (news) while Democrats focus more on the victims and call for policy changes. Our work contributes to a deeper understanding of the way group divisions manifest in language and to computational methods for studying them.
Submitted 3 April, 2019; v1 submitted 2 April, 2019;
originally announced April 2019.
-
Secret key distillation across a quantum wiretap channel under restricted eavesdropping
Authors:
Ziwen Pan,
Kaushik P. Seshadreesan,
William Clark,
Mark R. Adcock,
Ivan B. Djordjevic,
Jeffrey H. Shapiro,
Saikat Guha
Abstract:
The theory of quantum cryptography aims to guarantee unconditional information-theoretic security against an omnipotent eavesdropper. In many practical scenarios, however, the assumption of an all-powerful adversary is excessive and can be relaxed considerably. In this paper we study secret key distillation across a lossy and noisy quantum wiretap channel between Alice and Bob, with a separately parameterized realistically lossy quantum channel to the eavesdropper Eve. We show that under such restricted eavesdropping, the key rates achievable can exceed the secret key distillation capacity against an unrestricted eavesdropper in the quantum wiretap channel. Further, we show upper bounds on the key rates based on the relative entropy of entanglement. This simple restricted eavesdropping model is widely applicable, e.g., to free-space quantum optical communication, where realistic collection of light by Eve is limited by the finite size of her optical aperture. Future work will include calculating bounds on the amount of light Eve can collect under various realistic scenarios.
Submitted 18 April, 2019; v1 submitted 7 March, 2019;
originally announced March 2019.
-
Thinning, photonic beamsplitting, and a general discrete entropy power inequality
Authors:
Saikat Guha,
Jeffrey H. Shapiro,
Raul Garcia-Patron Sanchez
Abstract:
Many partially-successful attempts have been made to find the most natural discrete-variable version of Shannon's entropy power inequality (EPI). We develop an axiomatic framework from which we deduce the natural form of a discrete-variable EPI and an associated entropic monotonicity in a discrete-variable central limit theorem. In this discrete EPI, the geometric distribution, which has the maximum entropy among all discrete distributions with a given mean, assumes a role analogous to the Gaussian distribution in Shannon's EPI. The entropy power of $X$ is defined as the mean of a geometric random variable with entropy $H(X)$. The crux of our construction is a discrete-variable version of Lieb's scaled addition $X \boxplus_\eta Y$ of two discrete random variables $X$ and $Y$ with $\eta \in (0, 1)$. We discuss the relationship of our discrete EPI with recent work of Yu and Johnson who developed an EPI for a restricted class of random variables that have ultra-log-concave (ULC) distributions. Even though we leave open the proof of the aforesaid natural form of the discrete EPI, we show that this discrete EPI holds true for variables with arbitrary discrete distributions when the entropy power is redefined as $e^{H(X)}$ in analogy with the continuous version. Finally, we show that our conjectured discrete EPI is a special case of the yet-unproven Entropy Photon-number Inequality (EPnI), which assumes a role analogous to Shannon's EPI in capacity proofs for Gaussian bosonic (quantum) channels.
Submitted 25 May, 2016;
originally announced May 2016.
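The definition of entropy power in the abstract above can be made concrete with a short worked equation. This sketch assumes the geometric distribution is supported on $\{0, 1, 2, \dots\}$ and that entropy is measured in nats; the symbols $p_G$, $g$ and $N$ are labels introduced here for illustration and are not taken from the abstract.

```latex
\begin{align}
  % Geometric distribution with mean \mu on the non-negative integers
  p_G(k) &= \frac{1}{1+\mu}\left(\frac{\mu}{1+\mu}\right)^{k},
    \qquad k = 0, 1, 2, \dots \\
  % Its Shannon entropy, from H(G) = -\ln(1-q) - \mu \ln q with q = \mu/(1+\mu)
  g(\mu) &:= H(G) = (1+\mu)\ln(1+\mu) - \mu\ln\mu \\
  % Entropy power of X: the mean of the geometric variable with entropy H(X)
  N(X) &:= g^{-1}\bigl(H(X)\bigr)
\end{align}
```

Since $g'(\mu) = \ln\bigl((1+\mu)/\mu\bigr) > 0$ for $\mu > 0$, the function $g$ is strictly increasing and the inverse $g^{-1}$ is well defined on $[0, \infty)$.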
-
Quantum enigma machines and the locking capacity of a quantum channel
Authors:
Saikat Guha,
Patrick Hayden,
Hari Krovi,
Seth Lloyd,
Cosmo Lupo,
Jeffrey H. Shapiro,
Masahiro Takeoka,
Mark M. Wilde
Abstract:
The locking effect is a phenomenon which is unique to quantum information theory and represents one of the strongest separations between the classical and quantum theories of information. The Fawzi-Hayden-Sen (FHS) locking protocol harnesses this effect in a cryptographic context, whereby one party can encode n bits into n qubits while using only a constant-size secret key. The encoded message is then secure against any measurement that an eavesdropper could perform in an attempt to recover the message, but the protocol does not necessarily meet the composability requirements needed in quantum key distribution applications. In any case, the locking effect represents an extreme violation of Shannon's classical theorem, which states that information-theoretic security holds in the classical case if and only if the secret key is the same size as the message. Given this intriguing phenomenon, it is of practical interest to study the effect in the presence of noise, which can occur in the systems of both the legitimate receiver and the eavesdropper. This paper formally defines the locking capacity of a quantum channel as the maximum amount of locked information that can be reliably transmitted to a legitimate receiver by exploiting many independent uses of a quantum channel and an amount of secret key sublinear in the number of channel uses. We provide general operational bounds on the locking capacity in terms of other well-known capacities from quantum Shannon theory. We also study the important case of bosonic channels, finding limitations on these channels' locking capacity when coherent-state encodings are employed and particular locking protocols for these channels that might be physically implementable.
Submitted 9 November, 2013; v1 submitted 19 July, 2013;
originally announced July 2013.
-
Thompson Sampling in Switching Environments with Bayesian Online Change Point Detection
Authors:
Joseph Mellor,
Jonathan Shapiro
Abstract:
Thompson Sampling has recently been shown to be optimal in the Bernoulli Multi-Armed Bandit setting [Kaufmann et al., 2012]. This bandit problem assumes stationary distributions for the rewards. It is often unrealistic to model the real world as a stationary distribution. In this paper we derive and evaluate algorithms using Thompson Sampling for a Switching Multi-Armed Bandit Problem. We propose a Thompson Sampling strategy equipped with a Bayesian change point mechanism to tackle this problem. We develop algorithms for a variety of cases with a constant switching rate: when all arms change whenever a switch occurs (Global Switching), when switching occurs independently for each arm (Per-Arm Switching), when the switching rate is known, and when it must be inferred from data. This leads to a family of algorithms we collectively term Change-Point Thompson Sampling (CTS). We show empirical results of the algorithms in 4 artificial environments and 2 derived from real world data: news click-through [Yahoo!, 2011] and foreign exchange data [Dukascopy, 2012], comparing them to some other bandit algorithms. In real world data CTS is the most effective.
Submitted 15 February, 2013;
originally announced February 2013.
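As a rough illustration of the stationary starting point that the abstract above builds on, the following is a minimal sketch of Beta-Bernoulli Thompson Sampling. It is not the paper's CTS algorithm: it has no change point mechanism, and the function name, the uniform Beta(1, 1) prior, and the example reward probabilities are assumptions made for this sketch.

```python
import random


def thompson_sampling(true_probs, n_rounds, seed=0):
    """Run Beta-Bernoulli Thompson Sampling on a stationary bandit.

    true_probs: hypothetical per-arm success probabilities (unknown to the agent).
    Returns the number of times each arm was pulled.
    """
    rng = random.Random(seed)
    n_arms = len(true_probs)
    successes = [0] * n_arms  # observed rewards of 1 per arm
    failures = [0] * n_arms   # observed rewards of 0 per arm
    pulls = [0] * n_arms

    for _ in range(n_rounds):
        # Draw one sample from each arm's Beta posterior (Beta(1, 1) prior assumed).
        samples = [
            rng.betavariate(successes[a] + 1, failures[a] + 1)
            for a in range(n_arms)
        ]
        # Play the arm whose sampled mean is largest.
        arm = max(range(n_arms), key=samples.__getitem__)
        reward = 1 if rng.random() < true_probs[arm] else 0
        successes[arm] += reward
        failures[arm] += 1 - reward
        pulls[arm] += 1
    return pulls


if __name__ == "__main__":
    print(thompson_sampling([0.2, 0.8], n_rounds=2000))
```

In a switching environment the posterior counts above would go stale after a change. The abstract addresses this with a Bayesian change point mechanism, which this sketch deliberately leaves out.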
-
Bayesian Mixture Models for Frequent Itemset Discovery
Authors:
Ruefei He,
Jonathan Shapiro
Abstract:
In binary-transaction data-mining, traditional frequent itemset mining often produces results which are not straightforward to interpret. To overcome this problem, probability models are often used to produce more compact and conclusive results, albeit with some loss of accuracy. Bayesian statistics have been widely used in the development of probability models in machine learning in recent years and these methods have many advantages, including their ability to avoid overfitting. In this paper, we develop two Bayesian mixture models with the Dirichlet distribution prior and the Dirichlet process (DP) prior to improve the previous non-Bayesian mixture model developed for transaction dataset mining. We implement the inference of both mixture models using two methods: a collapsed Gibbs sampling scheme and a variational approximation algorithm. Experiments in several benchmark problems have shown that both mixture models achieve better performance than a non-Bayesian mixture model. The variational algorithm is the faster of the two approaches while the Gibbs sampling method achieves more accurate results. The Dirichlet process mixture model can automatically grow to a proper complexity for a better approximation. Once the model is built, it can be very fast to query and run analysis on (typically 10 times faster than Eclat, as we will show in the experiment section). However, these approaches also show that mixture models underestimate the probabilities of frequent itemsets. Consequently, these models have a higher sensitivity but a lower specificity.
Submitted 26 September, 2012;
originally announced September 2012.
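To make concrete why a fitted mixture model can be fast to query, here is a minimal sketch of an itemset-probability lookup. It assumes each mixture component treats items as independent Bernoulli variables, which the abstract above does not spell out. The function name and the toy parameters are hypothetical, and no inference (Gibbs or variational) is performed here.

```python
from math import prod


def itemset_probability(weights, item_probs, itemset):
    """Probability that all items in `itemset` occur in one transaction.

    weights:    mixing proportions of the components (should sum to 1).
    item_probs: item_probs[k][i] is the assumed probability of item i
                under component k.
    itemset:    iterable of item indices.
    """
    items = list(itemset)
    # Sum over components of (weight * product of per-item probabilities).
    return sum(
        w * prod(theta[i] for i in items)
        for w, theta in zip(weights, item_probs)
    )


if __name__ == "__main__":
    weights = [0.5, 0.5]
    item_probs = [
        [0.9, 0.8, 0.1],  # component 0 favours items 0 and 1
        [0.1, 0.2, 0.9],  # component 1 favours item 2
    ]
    print(itemset_probability(weights, item_probs, [0, 1]))
```

Under this assumption a query costs one pass over the components and the queried items, and it does not touch the transaction data at all.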
-
Capacity of optical reading, Part 1: Reading boundless error-free bits using a single photon
Authors:
Saikat Guha,
Jeffrey H. Shapiro
Abstract:
We show that nature imposes no fundamental upper limit to the number of information bits per expended photon that can, in principle, be read reliably when classical data is encoded in a medium that can only passively modulate the amplitude and phase of the probe light. We show that with a coherent-state (laser) source, an on-off (amplitude-modulation) pixel encoding, and shot-noise-limited direct detection (an overly-optimistic model for commercial CD/DVD drives), the highest photon information efficiency achievable in principle is about 0.5 bit per transmitted photon. We then show that a coherent-state probe can read unlimited bits per photon when the receiver is allowed to make joint (inseparable) measurements on the reflected light from a large block of phase-modulated memory pixels. Finally, we show an example of a spatially-entangled non-classical light probe and a receiver design---constructable using a single-photon source, beam splitters, and single-photon detectors---that can in principle read any number of error-free bits of information. The probe is a single photon prepared in a uniform coherent superposition of multiple orthogonal spatial modes, i.e., a W-state. The code, target, and joint-detection receiver complexity required by a coherent-state transmitter to achieve comparable photon efficiency performance is shown to be much higher in comparison to that required by the W-state transceiver.
Submitted 15 February, 2013; v1 submitted 26 July, 2012;
originally announced July 2012.
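For reference, the single-photon W-state mentioned in the abstract above can be written in the standard textbook form below. The symbol $M$ for the number of orthogonal spatial modes is an assumption of this note, and the paper's own notation may differ.

```latex
\[
  |W_M\rangle \;=\; \frac{1}{\sqrt{M}} \sum_{k=1}^{M}
  |0\rangle_1 \cdots |0\rangle_{k-1}\, |1\rangle_k\, |0\rangle_{k+1} \cdots |0\rangle_M
\]
```

Each term places the single photon in mode $k$ and leaves every other mode in vacuum, so the superposition is uniform over the $M$ modes.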
-
On quantum limit of optical communications: concatenated codes and joint-detection receivers
Authors:
Saikat Guha,
Zachary Dutton,
Jeffrey H. Shapiro
Abstract:
When classical information is sent over a channel with quantum-state modulation alphabet, such as the free-space optical (FSO) channel, attaining the ultimate (Holevo) limit to channel capacity requires the receiver to make joint measurements over long codeword blocks. In recent work, we showed a receiver for a pure-state channel that can attain the ultimate capacity by applying a single-shot optical (unitary) transformation on the received codeword state followed by simultaneous (but separable) projective measurements on the single-modulation-symbol state spaces. In this paper, we study the ultimate tradeoff between photon efficiency and spectral efficiency for the FSO channel. Based on our general results for the pure-state quantum channel, we show some of the first concrete examples of codes and laboratory-realizable joint-detection optical receivers that can achieve fundamentally higher (superadditive) channel capacity than receivers that physically detect each modulation symbol one at a time, as is done by all conventional (coherent or direct-detection) optical receivers.
Submitted 9 February, 2011;
originally announced February 2011.
-
The Poisson Channel at Low Input Powers
Authors:
Amos Lapidoth,
Jeffrey H. Shapiro,
Vinodh Venkatesan,
Ligong Wang
Abstract:
The asymptotic capacity at low input powers of an average-power limited or an average- and peak-power limited discrete-time Poisson channel is considered. For a Poisson channel whose dark current is zero or decays to zero linearly with its average input power $E$, capacity scales like $E\log\frac{1}{E}$ for small $E$. For a Poisson channel whose dark current is a nonzero constant, capacity scales, to within a constant, like $E\log\log\frac{1}{E}$ for small $E$.
Submitted 20 October, 2008;
originally announced October 2008.
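The two low-power scaling laws in the abstract above can be compared numerically. The sketch below only evaluates the leading-order expressions $E\log\frac{1}{E}$ and $E\log\log\frac{1}{E}$. It omits all constants, uses natural logarithms as an assumption, and does not compute an actual channel capacity. The function names are hypothetical.

```python
import math


def zero_dark_current_scale(E):
    """Leading-order scaling E * log(1/E) (dark current zero or decaying linearly in E)."""
    return E * math.log(1.0 / E)


def constant_dark_current_scale(E):
    """Leading-order scaling E * log(log(1/E)) (nonzero constant dark current)."""
    return E * math.log(math.log(1.0 / E))


if __name__ == "__main__":
    for E in (1e-2, 1e-4, 1e-6):
        a = zero_dark_current_scale(E)
        b = constant_dark_current_scale(E)
        print(f"E={E:g}  E*log(1/E)={a:.3e}  E*loglog(1/E)={b:.3e}  ratio={a / b:.2f}")
```

The ratio of the two expressions grows as $E$ shrinks, which shows how much a constant dark current costs at low input power under these scaling laws.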
-
Capacity of the Bosonic Wiretap Channel and the Entropy Photon-Number Inequality
Authors:
Saikat Guha,
Jeffrey H. Shapiro,
Baris I. Erkmen
Abstract:
Determining the ultimate classical information carrying capacity of electromagnetic waves requires quantum-mechanical analysis to properly account for the bosonic nature of these waves. Recent work has established capacity theorems for bosonic single-user and broadcast channels, under the presumption of two minimum output entropy conjectures. Despite considerable accumulated evidence that supports the validity of these conjectures, they have yet to be proven. In this paper, it is shown that the second conjecture suffices to prove the classical capacity of the bosonic wiretap channel, which in turn would also prove the quantum capacity of the lossy bosonic channel. The preceding minimum output entropy conjectures are then shown to be simple consequences of an Entropy Photon-Number Inequality (EPnI), which is a conjectured quantum-mechanical analog of the Entropy Power Inequality (EPI) from classical information theory.
Submitted 6 January, 2008;
originally announced January 2008.
-
The Entropy Photon-Number Inequality and its Consequences
Authors:
Saikat Guha,
Baris I. Erkmen,
Jeffrey H. Shapiro
Abstract:
Determining the ultimate classical information carrying capacity of electromagnetic waves requires quantum-mechanical analysis to properly account for the bosonic nature of these waves. Recent work has established capacity theorems for bosonic single-user, broadcast, and wiretap channels, under the presumption of two minimum output entropy conjectures. Despite considerable accumulated evidence that supports the validity of these conjectures, they have yet to be proven. Here we show that the preceding minimum output entropy conjectures are simple consequences of an Entropy Photon-Number Inequality, which is a conjectured quantum-mechanical analog of the Entropy Power Inequality (EPI) from classical information theory.
Submitted 9 November, 2007; v1 submitted 30 October, 2007;
originally announced October 2007.
-
An associative memory for the on-line recognition and prediction of temporal sequences
Authors:
J. Bose,
S. B. Furber,
J. L. Shapiro
Abstract:
This paper presents the design of an associative memory with feedback that is capable of on-line temporal sequence learning. A framework for on-line sequence learning has been proposed, and different sequence learning models have been analysed according to this framework. The network model is an associative memory with a separate store for the sequence context of a symbol. A sparse distributed memory is used to gain scalability. The context store combines the functionality of a neural layer with a shift register. The sensitivity of the machine to the sequence context is controllable, resulting in different characteristic behaviours. The model can store and predict on-line sequences of various types and length. Numerical simulations on the model have been carried out to determine its properties.
Submitted 4 November, 2006;
originally announced November 2006.
-
Novelty Detection on a Mobile Robot Using Habituation
Authors:
Stephen Marsland,
Ulrich Nehmzow,
Jonathan Shapiro
Abstract:
In this paper a novelty filter is introduced which allows a robot operating in an unstructured environment to produce a self-organised model of its surroundings and to detect deviations from the learned model. The environment is perceived using the robot's 16 sonar sensors. The algorithm produces a novelty measure for each sensor scan relative to the model it has learned. This means that it highlights stimuli which have not been previously experienced. The novelty filter proposed uses a model of habituation. Habituation is a decrement in behavioural response when a stimulus is presented repeatedly. Robot experiments are presented which demonstrate the reliable operation of the filter in a number of environments.
Submitted 2 June, 2000;
originally announced June 2000.
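The habituation behaviour described in the abstract above (a response that decays under repeated stimulation) can be sketched with a simple first-order model. The update rule below is an Euler discretisation of tau * dy/dt = alpha * (y0 - y) - S, a commonly used habituation equation. Whether the paper uses exactly this form is an assumption, and all parameter values and names are hypothetical.

```python
def habituate(stimulus, y0=1.0, tau=5.0, alpha=1.0, dt=1.0):
    """Simulate response strength y under a stimulus sequence.

    stimulus: sequence of stimulus intensities S(t), e.g. 1 (present) or 0 (absent).
    y0:       resting (un-habituated) response strength.
    tau:      time constant; larger values give slower habituation.
    alpha:    recovery rate towards y0.
    Returns the list of response values after each time step.
    """
    y = y0
    trace = []
    for s in stimulus:
        # Euler step: recovery term pulls y back to y0, stimulus term pushes it down.
        y += (dt / tau) * (alpha * (y0 - y) - s)
        trace.append(y)
    return trace


if __name__ == "__main__":
    # Ten presentations of the stimulus, then ten steps without it.
    trace = habituate([1] * 10 + [0] * 10)
    print([round(v, 3) for v in trace])
```

With these parameters the response falls while the stimulus is repeated and climbs back towards its resting value once the stimulus is withdrawn. A novelty filter can use such a value to weight how strongly a familiar input is reported.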
-
A Real-Time Novelty Detector for a Mobile Robot
Authors:
Stephen Marsland,
Ulrich Nehmzow,
Jonathan Shapiro
Abstract:
Recognising new or unusual features of an environment is an ability which is potentially very useful to a robot. This paper demonstrates an algorithm which achieves this task by learning an internal representation of `normality' from sonar scans taken as a robot explores the environment. This model of the environment is used to evaluate the novelty of each sonar scan presented to it with relation to the model. Stimuli which have not been seen before, and therefore have more novelty, are highlighted by the filter. The filter has the ability to forget about features which have been learned, so that stimuli which are seen only rarely recover their response over time. A number of robot experiments are presented which demonstrate the operation of the filter.
Submitted 2 June, 2000;
originally announced June 2000.
-
Novelty Detection for Robot Neotaxis
Authors:
Stephen Marsland,
Ulrich Nehmzow,
Jonathan Shapiro
Abstract:
The ability of a robot to detect and respond to changes in its environment is potentially very useful, as it draws attention to new and potentially important features. We describe an algorithm for learning to filter out previously experienced stimuli to allow further concentration on novel features. The algorithm uses a model of habituation, a biological process which causes a decrement in response with repeated presentation. Experiments with a mobile robot are presented in which the robot detects the most novel stimulus and turns towards it (`neotaxis').
Submitted 2 June, 2000;
originally announced June 2000.