Search | arXiv e-print repository

Navigating the Web of Misinformation: A Framework for Misinformation Domain Detection Using Browser Traffic

Authors: Mayana Pereira, Kevin Greene, Nilima Pisharody, Rahul Dodhia, Jacob N. Shapiro, Juan Lavista

Abstract: The proliferation of misinformation and propaganda is a global challenge, with profound effects during major crises such as the COVID-19 pandemic and the Russian invasion of Ukraine. Understanding the spread of misinformation and its social impacts requires identifying the news sources spreading false information. While machine learning (ML) techniques have been proposed to address this issue, ML models have failed to provide an efficient implementation scenario that yields useful results. In prior research, the precision of deployment in real traffic deteriorates significantly, experiencing a decrement up to ten times compared to the results derived from benchmark data sets. Our research addresses this gap by proposing a graph-based approach to capture navigational patterns and generate traffic-based features which are used to train a classification model. These navigational and traffic-based features result in classifiers that present outstanding performance when evaluated against real traffic. Moreover, we also propose graph-based filtering techniques to filter out models to be classified by our framework. These filtering techniques increase the signal-to-noise ratio of the models to be classified, greatly reducing false positives and the computational cost of deploying the model. Our proposed framework for the detection of misinformation domains achieves a precision of 0.78 when evaluated in real traffic. This outcome represents an improvement factor of over ten times over those achieved in previous studies.

Submitted 24 July, 2023; originally announced July 2023.

arXiv:2307.08839 [pdf, ps, other]

Multishot Adversarial Network Decoding

Authors: Giuseppe Cotardo, Gretchen L. Matthews, Alberto Ravagnani, Julia Shapiro

Abstract: We investigate adversarial network coding and decoding focusing on the multishot regime. Errors can occur on a proper subset of the network edges and are modeled via an adversarial channel. The paper contains both bounds and capacity-achieving schemes for the Diamond Network and the Mirrored Diamond Network. We also initiate the study of the generalizations of these networks.

Submitted 17 July, 2023; originally announced July 2023.

arXiv:2307.01532 [pdf, other]

Analyzing Intentional Behavior in Autonomous Agents under Uncertainty

Authors: Filip Cano Córdoba, Samuel Judson, Timos Antonopoulos, Katrine Bjørner, Nicholas Shoemaker, Scott J. Shapiro, Ruzica Piskac, Bettina Könighofer

Abstract: Principled accountability for autonomous decision-making in uncertain environments requires distinguishing intentional outcomes from negligent designs from actual accidents. We propose analyzing the behavior of autonomous agents through a quantitative measure of the evidence of intentional behavior. We model an uncertain environment as a Markov Decision Process (MDP). For a given scenario, we rely on probabilistic model checking to compute the ability of the agent to influence reaching a certain event. We call this the scope of agency. We say that there is evidence of intentional behavior if the scope of agency is high and the decisions of the agent are close to being optimal for reaching the event. Our method applies counterfactual reasoning to automatically generate relevant scenarios that can be analyzed to increase the confidence of our assessment. In a case study, we show how our method can distinguish between 'intentional' and 'accidental' traffic collisions.

Submitted 4 July, 2023; originally announced July 2023.

Comments: 10 pages. Accepted for publication at IJCAI 2023 (Main Track)

arXiv:2305.14575 [pdf, other]

doi 10.59275/j.melba.2023-3d9d

Towards Early Prediction of Human iPSC Reprogramming Success

Authors: Abhineet Singh, Ila Jasra, Omar Mouhammed, Nidheesh Dadheech, Nilanjan Ray, James Shapiro

Abstract: This paper presents advancements in automated early-stage prediction of the success of reprogramming human induced pluripotent stem cells (iPSCs) as a potential source for regenerative cell therapies. The minuscule success rate of iPSC-reprogramming of around $0.01\%$ to $0.1\%$ makes it labor-intensive, time-consuming, and exorbitantly expensive to generate a stable iPSC line, since that requires culturing of millions of cells and intense biological scrutiny of multiple clones to identify a single optimal clone. The ability to reliably predict which cells are likely to establish as an optimal iPSC line at an early stage of pluripotency would therefore be ground-breaking in rendering this a practical and cost-effective approach to personalized medicine. Temporal information about changes in cellular appearance over time is crucial for predicting its future growth outcomes. In order to generate this data, we first performed continuous time-lapse imaging of iPSCs in culture using an ultra-high resolution microscope. We then annotated the locations and identities of cells in late-stage images where reliable manual identification is possible. Next, we propagated these labels backwards in time using a semi-automated tracking system to obtain labels for early stages of growth. Finally, we used this data to train deep neural networks to perform automatic cell segmentation and classification. Our code and data are available at https://github.com/abhineet123/ipsc_prediction.

Submitted 11 November, 2023; v1 submitted 23 May, 2023; originally announced May 2023.

Comments: Accepted for publication at the Journal of Machine Learning for Biomedical Imaging (MELBA) https://melba-journal.org/2023:014

Journal ref: Machine.Learning.for.Biomedical.Imaging. 2 (2023)

arXiv:2305.05731 [pdf, other]

'Put the Car on the Stand': SMT-based Oracles for Investigating Decisions

Authors: Samuel Judson, Matthew Elacqua, Filip Cano, Timos Antonopoulos, Bettina Könighofer, Scott J. Shapiro, Ruzica Piskac

Abstract: Principled accountability in the aftermath of harms is essential to the trustworthy design and governance of algorithmic decision making. Legal theory offers a paramount method for assessing culpability: putting the agent 'on the stand' to subject their actions and intentions to cross-examination. We show that under minimal assumptions automated reasoning can rigorously interrogate algorithmic behaviors as in the adversarial process of legal fact finding. We model accountability processes, such as trials or review boards, as Counterfactual-Guided Logic Exploration and Abstraction Refinement (CLEAR) loops. We use the formal methods of symbolic execution and satisfiability modulo theories (SMT) solving to discharge queries about agent behavior in factual and counterfactual scenarios, as adaptively formulated by a human investigator. In order to do so, for a decision algorithm $\mathcal{A}$ we use symbolic execution to represent its logic as a statement $\Pi$ in the decidable theory $\texttt{QF_FPBV}$. We implement our framework and demonstrate its utility on an illustrative car crash scenario.

Submitted 29 January, 2024; v1 submitted 9 May, 2023; originally announced May 2023.

arXiv:2111.11251 [pdf]

Machine Learning-Based Soft Sensors for Vacuum Distillation Unit

Authors: Kamil Oster, Stefan Güttel, Lu Chen, Jonathan L. Shapiro, Megan Jobson

Abstract: Product quality assessment in the petroleum processing industry can be difficult and time-consuming, e.g. due to a manual collection of liquid samples from the plant and subsequent chemical laboratory analysis of the samples. The product quality is an important property that informs whether the products of the process are within the specifications. In particular, the delays caused by sample processing (collection, laboratory measurements, results analysis, reporting) can lead to detrimental economic effects. One of the strategies to deal with this problem is soft sensors. Soft sensors are a collection of models that can be used to predict and forecast some infrequently measured properties (such as laboratory measurements of petroleum products) based on more frequent measurements of quantities like temperature, pressure and flow rate provided by physical sensors. Soft sensors short-cut the pathway to obtain relevant information about the product quality, often providing measurements as frequently as every minute. One of the applications of soft sensors is for the real-time optimization of a chemical process by a targeted adaptation of operating parameters. Models used for soft sensors can have various forms; however, among the most common are those based on artificial neural networks (ANNs). While soft sensors can deal with some of the issues in the refinery processes, their development and deployment can pose other challenges that are addressed in this paper.
Firstly, it is important to enhance the quality of both sets of data (laboratory measurements and physical sensors) in a data pre-processing stage (as described in the Methodology section). Secondly, once the data sets are pre-processed, different models need to be tested against prediction error and the model's interpretability. In this work, we present a framework for soft sensor development from raw data to ready-to-use models.

Submitted 19 November, 2021; originally announced November 2021.

Comments: 9 pages; 7 figures; Conference Proceedings of 2021 AIChE Annual Meeting (7th - 19th November 2021)

arXiv:2106.14641 [pdf]

Pre-treatment of outliers and anomalies in plant data: Methodology and case study of a Vacuum Distillation Unit

Authors: Kamil Oster, Stefan Güttel, Jonathan L. Shapiro, Lu Chen, Megan Jobson

Abstract: Data pre-treatment plays a significant role in improving data quality, thus allowing extraction of accurate information from raw data. One of the data pre-treatment techniques commonly used is outlier detection. The so-called $3\sigma$ method is a common practice to identify the outliers. As shown in the manuscript, it does not identify all outliers, resulting in possible distortion of the overall statistics of the data. This problem can have a significant impact on further data analysis and can lead to reduction in the accuracy of predictive models. There is a plethora of various techniques for outlier detection; however, aside from theoretical work, they all require case study work. Two types of outliers were considered: short-term (erroneous data, noise) and long-term outliers (e.g. malfunctioning for longer periods). The data used were taken from the vacuum distillation unit (VDU) of an Asian refinery and included 40 physical sensors (temperature, pressure and flow rate). We used a modified method for $3\sigma$ thresholds to identify the short-term outliers, i.e. sensor data are divided into chunks determined by change points and $3\sigma$ thresholds are calculated within each chunk representing near-normal distribution. We have shown that the piecewise $3\sigma$ method offers a better approach to short-term outlier detection than the $3\sigma$ method applied to the entire time series. Nevertheless, this does not perform well for long-term outliers (which can represent another state in the data).
In this case, we used principal component analysis (PCA) with Hotelling's $T^2$ statistics to identify the long-term outliers. The results obtained with PCA were then subjected to the DBSCAN clustering method. The outliers (which were visually obvious and correctly detected by the PCA method) were also correctly identified by DBSCAN, which supported the consistency and accuracy of the PCA method.
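The piecewise $3\sigma$ idea described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `piecewise_three_sigma`, the synthetic signal, and the assumption that change points are already known (rather than detected) are all hypothetical choices made for illustration.

```python
import numpy as np

def piecewise_three_sigma(values, change_points):
    """Flag points more than 3 standard deviations from the mean of their own chunk.

    `change_points` are indices where a new chunk starts; they are assumed to be
    given here, whereas a real pipeline would have to detect them first.
    """
    values = np.asarray(values, dtype=float)
    # Chunk boundaries: start of the series, each change point, end of the series.
    bounds = [0, *sorted(change_points), len(values)]
    outlier = np.zeros(len(values), dtype=bool)
    for start, stop in zip(bounds[:-1], bounds[1:]):
        chunk = values[start:stop]
        if len(chunk) < 2:
            continue  # too short to estimate a spread
        mean, std = chunk.mean(), chunk.std()
        if std == 0:
            continue  # constant chunk, nothing to flag
        outlier[start:stop] = np.abs(chunk - mean) > 3 * std
    return outlier

# Synthetic signal: a level shift at index 50 and one spike in each regime.
rng = np.random.default_rng(0)
signal = np.concatenate([rng.normal(10.0, 0.1, 50), rng.normal(20.0, 0.1, 50)])
signal[20] += 2.0
signal[70] -= 2.0

flags = piecewise_three_sigma(signal, change_points=[50])
# Thresholds computed over the whole series, for comparison.
global_flags = np.abs(signal - signal.mean()) > 3 * signal.std()
```

In this toy signal the level shift inflates the global standard deviation so much that the whole-series thresholds miss both spikes, while the per-chunk thresholds flag them.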

Submitted 17 June, 2021; originally announced June 2021.

Comments: 33 pages, 20 figures, submitted to the Journal of Process Control (ref: JPROCONT-D-21-00332)

arXiv:2104.00772 [pdf, other]

Low-Resource Language Modelling of South African Languages

Authors: Stuart Mesham, Luc Hayward, Jared Shapiro, Jan Buys

Abstract: Language models are the foundation of current neural network-based models for natural language understanding and generation. However, research on the intrinsic performance of language models on African languages has been extremely limited, which is made more challenging by the lack of large or standardised training and evaluation sets that exist for English and other high-resource languages. In this paper, we evaluate the performance of open-vocabulary language models on low-resource South African languages, using byte-pair encoding to handle the rich morphology of these languages. We evaluate different variants of n-gram models, feedforward neural networks, recurrent neural networks (RNNs), and Transformers on small-scale datasets. Overall, well-regularized RNNs give the best performance across two isiZulu and one Sepedi datasets. Multilingual training further improves performance on these datasets. We hope that this research will open new avenues for research into multilingual and low-resource language modelling for African languages.

Submitted 1 April, 2021; originally announced April 2021.

Comments: AfricaNLP workshop at EACL 2021

arXiv:2007.02833 [pdf, other]

Eliminating Catastrophic Interference with Biased Competition

Authors: Amelia Elizabeth Pollard, Jonathan L. Shapiro

Abstract: We present here a model to take advantage of the multi-task nature of complex datasets by learning to separate tasks and subtasks in an end-to-end manner by biasing competitive interactions in the network. This method does not require additional labelling or reformatting of data in a dataset. We propose an alternate view to the monolithic one-task-fits-all learning of multi-task problems, and describe a model based on a theory of neuronal attention from neuroscience, proposed by Desimone. We create and exhibit a new toy dataset, based on the MNIST dataset, which we call MNIST-QA, for testing Visual Question Answering architectures in a low-dimensional environment while preserving the more difficult components of the Visual Question Answering task, and demonstrate the proposed network architecture on this new dataset, as well as on COCO-QA and DAQUAR-FULL. We then demonstrate that this model eliminates catastrophic interference between tasks on a newly created toy dataset and provides competitive results in the Visual Question Answering space. We provide further evidence that Visual Question Answering can be approached as a multi-task problem, and demonstrate that this new architecture based on the Biased Competition model is capable of learning to separate and learn the tasks in an end-to-end fashion without the need for task labels.

Submitted 3 July, 2020; originally announced July 2020.

arXiv:2007.01780 [pdf, other]

Visual Question Answering as a Multi-Task Problem

Authors: Amelia Elizabeth Pollard, Jonathan L. Shapiro

Abstract: Visual Question Answering (VQA) is a highly complex problem set, relying on many sub-problems to produce reasonable answers. In this paper, we present the hypothesis that Visual Question Answering should be viewed as a multi-task problem, and provide evidence to support this hypothesis. We demonstrate this by reformatting two commonly used Visual Question Answering datasets, COCO-QA and DAQUAR, into a multi-task format and training two baseline networks on these reformatted datasets, with one designed specifically to eliminate other possible causes for performance changes as a result of the reformatting. Though the networks demonstrated in this paper do not achieve strongly competitive results, we find that the multi-task approach to Visual Question Answering results in increases in performance of 5-9% against the single-task formatting, and that the networks reach convergence much faster than in the single-task case. Finally, we discuss possible reasons for the observed difference in performance, and perform additional experiments which rule out causes not associated with the learning of the dataset as a multi-task problem.

Submitted 3 July, 2020; originally announced July 2020.

arXiv:2005.10469 [pdf, other]

ASAPP-ASR: Multistream CNN and Self-Attentive SRU for SOTA Speech Recognition

Authors: Jing Pan, Joshua Shapiro, Jeremy Wohlwend, Kyu J. Han, Tao Lei, Tao Ma

Abstract: In this paper we present state-of-the-art (SOTA) performance on the LibriSpeech corpus with two novel neural network architectures, a multistream CNN for acoustic modeling and a self-attentive simple recurrent unit (SRU) for language modeling. In the hybrid ASR framework, the multistream CNN acoustic model processes an input of speech frames in multiple parallel pipelines where each stream has a unique dilation rate for diversity. Trained with the SpecAugment data augmentation method, it achieves relative word error rate (WER) improvements of 4% on test-clean and 14% on test-other. We further improve the performance via N-best rescoring using a 24-layer self-attentive SRU language model, achieving WERs of 1.75% on test-clean and 4.46% on test-other.

Submitted 21 May, 2020; originally announced May 2020.

Comments: Submitted to Interspeech 2020

arXiv:1908.00215 [pdf, other]

Illusion of Causality in Visualized Data

Authors: Cindy Xiong, Joel Shapiro, Jessica Hullman, Steven Franconeri

Abstract: Students who eat breakfast more frequently tend to have a higher grade point average. From this data, many people might confidently state that a before-school breakfast program would lead to higher grades. This is a reasoning error, because correlation does not necessarily indicate causation -- X and Y can be correlated without one directly causing the other. While this error is pervasive, its prevalence might be amplified or mitigated by the way that the data is presented to a viewer. Across three crowdsourced experiments, we examined whether how simple data relations are presented would mitigate this reasoning error. The first experiment tested examples similar to the breakfast-GPA relation, varying in the plausibility of the causal link. We asked participants to rate their level of agreement that the relation was correlated, which they rated appropriately as high. However, participants also expressed high agreement with a causal interpretation of the data. Levels of support for the causal interpretation were not equally strong across visualization types: causality ratings were highest for text descriptions and bar graphs, but weaker for scatter plots. But is this effect driven by bar graphs aggregating data into two groups or by the visual encoding type? We isolated data aggregation versus visual encoding type and examined their individual effect on perceived causality. Overall, different visualization designs afford different cognitive reasoning affordances across the same data.
High levels of data aggregation by graphs tend to be associated with higher perceived causality in data. Participants perceived line and dot visual encodings as more causal than bar encodings. Our results demonstrate how some visualization designs trigger stronger causal links while choosing others can help mitigate unwarranted perceptions of causality.

Submitted 1 August, 2019; originally announced August 2019.

arXiv:1904.01596 [pdf, other]

Analyzing Polarization in Social Media: Method and Application to Tweets on 21 Mass Shootings

Authors: Dorottya Demszky, Nikhil Garg, Rob Voigt, James Zou, Matthew Gentzkow, Jesse Shapiro, Dan Jurafsky

Abstract: We provide an NLP framework to uncover four linguistic dimensions of political polarization in social media: topic choice, framing, affect and illocutionary force. We quantify these aspects with existing lexical methods, and propose clustering of tweet embeddings as a means to identify salient topics for analysis across events; human evaluations show that our approach generates more cohesive topics than traditional LDA-based models. We apply our methods to study 4.4M tweets on 21 mass shootings. We provide evidence that the discussion of these events is highly polarized politically and that this polarization is primarily driven by partisan differences in framing rather than topic choice. We identify framing devices, such as grounding and the contrasting use of the terms "terrorist" and "crazy", that contribute to polarization. Results pertaining to topic choice, affect and illocutionary force suggest that Republicans focus more on the shooter and event-specific facts (news) while Democrats focus more on the victims and call for policy changes. Our work contributes to a deeper understanding of the way group divisions manifest in language and to computational methods for studying them.

Submitted 3 April, 2019; v1 submitted 2 April, 2019; originally announced April 2019.

Comments: NAACL 2019; code and data available at https://github.com/ddemszky/framing-twitter

arXiv:1903.03136 [pdf, other]

doi 10.1103/PhysRevApplied.14.024044

Secret key distillation across a quantum wiretap channel under restricted eavesdropping

Authors: Ziwen Pan, Kaushik P. Seshadreesan, William Clark, Mark R. Adcock, Ivan B. Djordjevic, Jeffrey H. Shapiro, Saikat Guha

Abstract: The theory of quantum cryptography aims to guarantee unconditional information-theoretic security against an omnipotent eavesdropper. In many practical scenarios, however, the assumption of an all-powerful adversary is excessive and can be relaxed considerably. In this paper we study secret key distillation across a lossy and noisy quantum wiretap channel between Alice and Bob, with a separately parameterized realistically lossy quantum channel to the eavesdropper Eve. We show that under such restricted eavesdropping, the key rates achievable can exceed the secret key distillation capacity against an unrestricted eavesdropper in the quantum wiretap channel. Further, we show upper bounds on the key rates based on the relative entropy of entanglement. This simple restricted eavesdropping model is widely applicable, e.g., to free-space quantum optical communication, where realistic collection of light by Eve is limited by the finite size of her optical aperture. Future work will include calculating bounds on the amount of light Eve can collect under various realistic scenarios.

Submitted 18 April, 2019; v1 submitted 7 March, 2019; originally announced March 2019.

Comments: 14 pages, 19 figures. We welcome comments and suggestions

Journal ref: Phys. Rev. Applied 14, 024044 (2020)

arXiv:1605.07853 [pdf, other]

doi 10.1109/ISIT.2016.7541390

Thinning, photonic beamsplitting, and a general discrete entropy power inequality

Authors: Saikat Guha, Jeffrey H. Shapiro, Raul Garcia-Patron Sanchez

Abstract: Many partially-successful attempts have been made to find the most natural discrete-variable version of Shannon's entropy power inequality (EPI). We develop an axiomatic framework from which we deduce the natural form of a discrete-variable EPI and an associated entropic monotonicity in a discrete-variable central limit theorem. In this discrete EPI, the geometric distribution, which has the maximum entropy among all discrete distributions with a given mean, assumes a role analogous to the Gaussian distribution in Shannon's EPI. The entropy power of $X$ is defined as the mean of a geometric random variable with entropy $H(X)$. The crux of our construction is a discrete-variable version of Lieb's scaled addition $X \boxplus_\eta Y$ of two discrete random variables $X$ and $Y$ with $\eta \in (0, 1)$. We discuss the relationship of our discrete EPI with recent work of Yu and Johnson who developed an EPI for a restricted class of random variables that have ultra-log-concave (ULC) distributions. Even though we leave open the proof of the aforesaid natural form of the discrete EPI, we show that this discrete EPI holds true for variables with arbitrary discrete distributions when the entropy power is redefined as $e^{H(X)}$ in analogy with the continuous version. Finally, we show that our conjectured discrete EPI is a special case of the yet-unproven Entropy Photon-number Inequality (EPnI), which assumes a role analogous to Shannon's EPI in capacity proofs for Gaussian bosonic (quantum) channels.
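For orientation, the entropy power defined in this abstract can be written out explicitly. This is a sketch under stated assumptions: the geometric distribution is taken to be supported on $\{0, 1, 2, \dots\}$ and parameterized by its mean $\mu$, entropies are in nats, and the symbols $p_\mu$, $g$ and $N$ are introduced here for illustration rather than taken from the listing.

```latex
\begin{align}
  p_\mu(n) &= \frac{1}{1+\mu}\left(\frac{\mu}{1+\mu}\right)^{n},
             \qquad n = 0, 1, 2, \dots \\
  g(\mu)   &= -\sum_{n \ge 0} p_\mu(n) \ln p_\mu(n)
            = (1+\mu)\ln(1+\mu) - \mu \ln \mu \\
  N(X)     &= g^{-1}\bigl(H(X)\bigr)
\end{align}
```

Since $g'(\mu) = \ln\bigl((1+\mu)/\mu\bigr) > 0$ for $\mu > 0$, $g$ is strictly increasing, so the inverse and hence $N(X)$ are well defined whenever $H(X) > 0$.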

Submitted 25 May, 2016; originally announced May 2016.

Comments: 6 pages, 1 figure. To be presented at the IEEE International Symposium on Information Theory 2016

arXiv:1307.5368 [pdf, ps, other]

doi 10.1103/PhysRevX.4.011016

Quantum enigma machines and the locking capacity of a quantum channel

Authors: Saikat Guha, Patrick Hayden, Hari Krovi, Seth Lloyd, Cosmo Lupo, Jeffrey H. Shapiro, Masahiro Takeoka, Mark M. Wilde

Abstract: The locking effect is a phenomenon which is unique to quantum information theory and represents one of the strongest separations between the classical and quantum theories of information. The Fawzi-Hayden-Sen (FHS) locking protocol harnesses this effect in a cryptographic context, whereby one party can encode n bits into n qubits while using only a constant-size secret key. The encoded message is then secure against any measurement that an eavesdropper could perform in an attempt to recover the message, but the protocol does not necessarily meet the composability requirements needed in quantum key distribution applications. In any case, the locking effect represents an extreme violation of Shannon's classical theorem, which states that information-theoretic security holds in the classical case if and only if the secret key is the same size as the message. Given this intriguing phenomenon, it is of practical interest to study the effect in the presence of noise, which can occur in the systems of both the legitimate receiver and the eavesdropper. This paper formally defines the locking capacity of a quantum channel as the maximum amount of locked information that can be reliably transmitted to a legitimate receiver by exploiting many independent uses of a quantum channel and an amount of secret key sublinear in the number of channel uses. We provide general operational bounds on the locking capacity in terms of other well-known capacities from quantum Shannon theory.
We also study the important case of bosonic channels, finding limitations on these channels' locking capacity when coherent-state encodings are employed and particular locking protocols for these channels that might be physically implementable.

Submitted 9 November, 2013; v1 submitted 19 July, 2013; originally announced July 2013.

Comments: 37 pages

Journal ref: Physical Review X vol. 4, no. 1, page 011016 (January 2014)

arXiv:1302.3721 [pdf, other]

Thompson Sampling in Switching Environments with Bayesian Online Change Point Detection

Authors: Joseph Mellor, Jonathan Shapiro

Abstract: Thompson Sampling has recently been shown to be optimal in the Bernoulli Multi-Armed Bandit setting [Kaufmann et al., 2012]. This bandit problem assumes stationary distributions for the rewards. It is often unrealistic to model the real world as a stationary distribution. In this paper we derive and evaluate algorithms using Thompson Sampling for a Switching Multi-Armed Bandit Problem. We propose a Thompson Sampling strategy equipped with a Bayesian change point mechanism to tackle this problem. We develop algorithms for a variety of cases with constant switching rate: when switching changes all arms at once (Global Switching), when switching occurs independently for each arm (Per-Arm Switching), when the switching rate is known, and when it must be inferred from data. This leads to a family of algorithms we collectively term Change-Point Thompson Sampling (CTS). We show empirical results of the algorithms in 4 artificial environments and 2 derived from real-world data (news click-through [Yahoo!, 2011] and foreign exchange data [Dukascopy, 2012]), comparing them to some other bandit algorithms. On the real-world data, CTS is the most effective.

Submitted 15 February, 2013; originally announced February 2013.

Comments: A version will appear in the Sixteenth international conference on Artificial Intelligence and Statistics (AIStats 2013)

arXiv:1209.6001 [pdf, ps, other]

Bayesian Mixture Models for Frequent Itemset Discovery

Authors: Ruefei He, Jonathan Shapiro

Abstract: In binary-transaction data-mining, traditional frequent itemset mining often produces results which are not straightforward to interpret. To overcome this problem, probability models are often used to produce more compact and conclusive results, albeit with some loss of accuracy. Bayesian statistics have been widely used in the development of probability models in machine learning in recent years and these methods have many advantages, including the ability to avoid overfitting. In this paper, we develop two Bayesian mixture models with the Dirichlet distribution prior and the Dirichlet process (DP) prior to improve the previous non-Bayesian mixture model developed for transaction dataset mining. We implement the inference of both mixture models using two methods: a collapsed Gibbs sampling scheme and a variational approximation algorithm. Experiments on several benchmark problems have shown that both mixture models achieve better performance than a non-Bayesian mixture model. The variational algorithm is the faster of the two approaches while the Gibbs sampling method achieves more accurate results. The Dirichlet process mixture model can automatically grow to a proper complexity for a better approximation. Once the model is built, it can be very fast to query and run analysis on (typically 10 times faster than Eclat, as we will show in the experiment section). However, these approaches also show that mixture models underestimate the probabilities of frequent itemsets. Consequently, these models have a higher sensitivity but a lower specificity.

Submitted 26 September, 2012; originally announced September 2012.

ACM Class: H.2.8; H.3.3; I.2.6

arXiv:1207.6435 [pdf, other]

doi 10.1103/PhysRevA.87.062306

Capacity of optical reading, Part 1: Reading boundless error-free bits using a single photon

Authors: Saikat Guha, Jeffrey H. Shapiro

Abstract: We show that nature imposes no fundamental upper limit to the number of information bits per expended photon that can, in principle, be read reliably when classical data is encoded in a medium that can only passively modulate the amplitude and phase of the probe light. We show that with a coherent-state (laser) source, an on-off (amplitude-modulation) pixel encoding, and shot-noise-limited direct detection (an overly-optimistic model for commercial CD/DVD drives), the highest photon information efficiency achievable in principle is about 0.5 bit per transmitted photon. We then show that a coherent-state probe can read unlimited bits per photon when the receiver is allowed to make joint (inseparable) measurements on the reflected light from a large block of phase-modulated memory pixels. Finally, we show an example of a spatially-entangled non-classical light probe and a receiver design (constructable using a single-photon source, beam splitters, and single-photon detectors) that can in principle read any number of error-free bits of information. The probe is a single photon prepared in a uniform coherent superposition of multiple orthogonal spatial modes, i.e., a W-state. The code, target, and joint-detection receiver complexity required by a coherent-state transmitter to achieve comparable photon efficiency performance is shown to be much higher in comparison to that required by the W-state transceiver.

Submitted 15 February, 2013; v1 submitted 26 July, 2012; originally announced July 2012.

Comments: 11 pages, 12 figures, v3 includes a new plot characterizing the photon efficiency vs. encoding efficiency tradeoff for optical reading. The main technical body of the paper remains unaltered

Journal ref: Phys. Rev. A 87, 062306 (2013)

arXiv:1102.1963 [pdf, other]

doi 10.1109/ISIT.2011.6034073

On quantum limit of optical communications: concatenated codes and joint-detection receivers

Authors: Saikat Guha, Zachary Dutton, Jeffrey H. Shapiro

Abstract: When classical information is sent over a channel with quantum-state modulation alphabet, such as the free-space optical (FSO) channel, attaining the ultimate (Holevo) limit to channel capacity requires the receiver to make joint measurements over long codeword blocks. In recent work, we showed a receiver for a pure-state channel that can attain the ultimate capacity by applying a single-shot optical (unitary) transformation on the received codeword state followed by simultaneous (but separable) projective measurements on the single-modulation-symbol state spaces. In this paper, we study the ultimate tradeoff between photon efficiency and spectral efficiency for the FSO channel. Based on our general results for the pure-state quantum channel, we show some of the first concrete examples of codes and laboratory-realizable joint-detection optical receivers that can achieve fundamentally higher (superadditive) channel capacity than receivers that physically detect each modulation symbol one at a time, as is done by all conventional (coherent or direct-detection) optical receivers.

Submitted 9 February, 2011; originally announced February 2011.

Comments: 5 pages, 7 figures, submitted to IEEE International Symposium on Information Theory (ISIT), 2011

arXiv:0810.3564 [pdf, ps, other]

The Poisson Channel at Low Input Powers

Authors: Amos Lapidoth, Jeffrey H. Shapiro, Vinodh Venkatesan, Ligong Wang

Abstract: The asymptotic capacity at low input powers of an average-power limited or an average- and peak-power limited discrete-time Poisson channel is considered. For a Poisson channel whose dark current is zero or decays to zero linearly with its average input power $E$, capacity scales like $E\log\frac{1}{E}$ for small $E$. For a Poisson channel whose dark current is a nonzero constant, capacity scales, to within a constant, like $E\log\log\frac{1}{E}$ for small $E$.

Submitted 20 October, 2008; originally announced October 2008.

Comments: To be presented at IEEEI 2008, December 3-5 2008, Eilat, Israel

arXiv:0801.0841 [pdf, ps, other]

Capacity of the Bosonic Wiretap Channel and the Entropy Photon-Number Inequality

Authors: Saikat Guha, Jeffrey H. Shapiro, Baris I. Erkmen

Abstract: Determining the ultimate classical information carrying capacity of electromagnetic waves requires quantum-mechanical analysis to properly account for the bosonic nature of these waves. Recent work has established capacity theorems for bosonic single-user and broadcast channels, under the presumption of two minimum output entropy conjectures. Despite considerable accumulated evidence that supports the validity of these conjectures, they have yet to be proven. In this paper, it is shown that the second conjecture suffices to prove the classical capacity of the bosonic wiretap channel, which in turn would also prove the quantum capacity of the lossy bosonic channel. The preceding minimum output entropy conjectures are then shown to be simple consequences of an Entropy Photon-Number Inequality (EPnI), which is a conjectured quantum-mechanical analog of the Entropy Power Inequality (EPI) from classical information theory.

Submitted 6 January, 2008; originally announced January 2008.

Comments: 5 pages, 1 figure, submitted to ISIT 2008

arXiv:0710.5666 [pdf, ps, other]

The Entropy Photon-Number Inequality and its Consequences

Authors: Saikat Guha, Baris I. Erkmen, Jeffrey H. Shapiro

Abstract: Determining the ultimate classical information carrying capacity of electromagnetic waves requires quantum-mechanical analysis to properly account for the bosonic nature of these waves. Recent work has established capacity theorems for bosonic single-user, broadcast, and wiretap channels, under the presumption of two minimum output entropy conjectures. Despite considerable accumulated evidence that supports the validity of these conjectures, they have yet to be proven. Here we show that the preceding minimum output entropy conjectures are simple consequences of an Entropy Photon-Number Inequality, which is a conjectured quantum-mechanical analog of the Entropy Power Inequality (EPI) from classical information theory.

Submitted 9 November, 2007; v1 submitted 30 October, 2007; originally announced October 2007.

Comments: 3 pages, submitted to "Open Problems Session, ITA 2008, UCSD"

arXiv:cs/0611020 [pdf, ps, other]

doi 10.1109/IJCNN.2005.1556028

An associative memory for the on-line recognition and prediction of temporal sequences

Authors: J. Bose, S. B. Furber, J. L. Shapiro

Abstract: This paper presents the design of an associative memory with feedback that is capable of on-line temporal sequence learning. A framework for on-line sequence learning has been proposed, and different sequence learning models have been analysed according to this framework. The network model is an associative memory with a separate store for the sequence context of a symbol. A sparse distributed memory is used to gain scalability. The context store combines the functionality of a neural layer with a shift register. The sensitivity of the machine to the sequence context is controllable, resulting in different characteristic behaviours. The model can store and predict on-line sequences of various types and length. Numerical simulations on the model have been carried out to determine its properties.

Submitted 4 November, 2006; originally announced November 2006.

Comments: Published in IJCNN 2005, Montreal, Canada

arXiv:cs/0006007 [pdf, ps, other]

Novelty Detection on a Mobile Robot Using Habituation

Authors: Stephen Marsland, Ulrich Nehmzow, Jonathan Shapiro

Abstract: In this paper a novelty filter is introduced which allows a robot operating in an unstructured environment to produce a self-organised model of its surroundings and to detect deviations from the learned model. The environment is perceived using the robot's 16 sonar sensors. The algorithm produces a novelty measure for each sensor scan relative to the model it has learned. This means that it highlights stimuli which have not been previously experienced. The novelty filter proposed uses a model of habituation. Habituation is a decrement in behavioural response when a stimulus is presented repeatedly. Robot experiments are presented which demonstrate the reliable operation of the filter in a number of environments.

Submitted 2 June, 2000; originally announced June 2000.

Comments: 10 pages, 6 figures. In From Animals to Animats, The Sixth International Conference on Simulation of Adaptive Behaviour, Paris, 2000

ACM Class: I.2.6

arXiv:cs/0006006 [pdf, ps, other]

A Real-Time Novelty Detector for a Mobile Robot

Authors: Stephen Marsland, Ulrich Nehmzow, Jonathan Shapiro

Abstract: Recognising new or unusual features of an environment is an ability which is potentially very useful to a robot. This paper demonstrates an algorithm which achieves this task by learning an internal representation of 'normality' from sonar scans taken as a robot explores the environment. This model of the environment is used to evaluate the novelty of each sonar scan presented to it with relation to the model. Stimuli which have not been seen before, and therefore have more novelty, are highlighted by the filter. The filter has the ability to forget about features which have been learned, so that stimuli which are seen only rarely recover their response over time. A number of robot experiments are presented which demonstrate the operation of the filter.

Submitted 2 June, 2000; originally announced June 2000.

Comments: 8 pages, 6 figures. In Proceedings of EUREL European Advanced Robotics Systems Masterclass and Conference, 2000

ACM Class: I.2.6

arXiv:cs/0006005 [pdf, ps, other]

Novelty Detection for Robot Neotaxis

Authors: Stephen Marsland, Ulrich Nehmzow, Jonathan Shapiro

Abstract: The ability of a robot to detect and respond to changes in its environment is potentially very useful, as it draws attention to new and potentially important features. We describe an algorithm for learning to filter out previously experienced stimuli to allow further concentration on novel features. The algorithm uses a model of habituation, a biological process which causes a decrement in response with repeated presentation. Experiments with a mobile robot are presented in which the robot detects the most novel stimulus and turns towards it ('neotaxis').

Submitted 2 June, 2000; originally announced June 2000.

Comments: 7 pages, 5 figures. In Proceedings of the Second International Conference on Neural Computation, 2000

ACM Class: I.2.6

Showing 1–27 of 27 results for author: Shapiro, J