-
Timely Multi-Process Estimation Over Erasure Channels With and Without Feedback: Signal-Independent Policies
Authors:
Karim Banawan,
Ahmed Arafa,
Karim G. Seddik
Abstract:
We consider a multi-process remote estimation system observing $K$ independent Ornstein-Uhlenbeck processes. In this system, a shared sensor samples the $K$ processes in such a way that the long-term average sum mean square error (MSE) is minimized using signal-independent sampling policies, in which sampling instances are chosen independently of the processes' values. The sensor operates under a total sampling frequency constraint $f_{\max}$. The samples from all processes consume random processing delays in a shared queue and then are transmitted over an erasure channel with erasure probability $ε$. We study two variants of the problem: first, when the samples are scheduled according to a Maximum-Age-First (MAF) policy, and the receiver provides an erasure status feedback; and second, when samples are scheduled according to a Round-Robin (RR) policy, and there is no erasure status feedback from the receiver. Aided by optimal structural results, we show that the optimal sampling policy for both settings, under some conditions, is a \emph{threshold policy}. We characterize the optimal threshold and the corresponding optimal long-term average sum MSE as a function of $K$, $f_{\max}$, $ε$, and the statistical properties of the observed processes. Our results show that, with an exponentially distributed service rate, the optimal threshold $τ^*$ increases as the number of processes $K$ increases, for both settings. Additionally, we show that the optimal threshold is an \emph{increasing} function of $ε$ in the case of \emph{available} erasure status feedback, while it exhibits the \emph{opposite behavior}, i.e., $τ^*$ is a \emph{decreasing} function of $ε$, in the case of \emph{absent} erasure status feedback.
Submitted 30 October, 2023; v1 submitted 23 March, 2023;
originally announced March 2023.
-
Private Status Updating with Erasures: A Case for Retransmission Without Resampling
Authors:
Ahmed Arafa,
Karim Banawan
Abstract:
A status updating system is considered in which a source updates a destination over an erasure channel. The utility of the updates is measured through a function of their age-of-information (AoI), which assesses their freshness. Correlated with the status updates is another process that needs to be kept private from the destination. Privacy is measured through a leakage function that depends on the amount and time of the status updates received: stale updates are more private than fresh ones. Different from most of the current AoI literature, a post-sampling waiting time is introduced in order to provide a privacy cover at the expense of AoI. More importantly, it is also shown that, depending on the leakage budget and the channel statistics, it can be useful to retransmit stale status updates following erasure events without resampling fresh ones.
Submitted 23 February, 2023;
originally announced February 2023.
-
Timely Multi-Process Estimation with Erasures
Authors:
Karim Banawan,
Ahmed Arafa,
Karim G. Seddik
Abstract:
We consider a multi-process remote estimation system observing $K$ independent Ornstein-Uhlenbeck processes. In this system, a shared sensor samples the $K$ processes in such a way that the long-term average sum mean square error (MSE) is minimized. The sensor operates under a total sampling frequency constraint $f_{\max}$ and samples the processes according to a Maximum-Age-First (MAF) schedule. The samples from all processes consume random processing delays, and then are transmitted over an erasure channel with erasure probability $ε$. Aided by optimal structural results, we show that the optimal sampling policy, under some conditions, is a \emph{threshold policy}. We characterize the optimal threshold and the corresponding optimal long-term average sum MSE as a function of $K$, $f_{\max}$, $ε$, and the statistical properties of the observed processes.
Submitted 22 September, 2022;
originally announced September 2022.
-
Timely Private Information Retrieval
Authors:
Karim Banawan,
Ahmed Arafa,
Sennur Ulukus
Abstract:
We introduce the problem of \emph{timely} private information retrieval (PIR) from $N$ non-colluding and replicated servers. In this problem, a user desires to retrieve a message out of $M$ messages from the servers, whose contents are continuously updating. The retrieval process should be executed in a timely manner such that no information is leaked about the identity of the message. To assess the timeliness, we use the \emph{age of information} (AoI) metric. Interestingly, the timely PIR problem reduces to an AoI minimization subject to PIR constraints under \emph{asymmetric traffic}. We explicitly characterize the optimal tradeoff between the PIR rate and the AoI metric (peak AoI or average AoI) for the case of $N=2$, $M=3$. Further, we provide some structural insights on the general problem with arbitrary $N$, $M$.
Submitted 18 May, 2021;
originally announced May 2021.
-
Download Cost of Private Updating
Authors:
Bryttany Herren,
Ahmed Arafa,
Karim Banawan
Abstract:
We consider the problem of privately updating a message out of $K$ messages from $N$ replicated and non-colluding databases. In this problem, a user has an outdated version of the message $\hat{W}_θ$ of length $L$ bits that differ from the current version $W_θ$ in at most $f$ bits. The user needs to retrieve $W_θ$ correctly using a private information retrieval (PIR) scheme with the least number of downloads without leaking any information about the message index $θ$ to any individual database. To that end, we propose a novel achievable scheme based on \emph{syndrome decoding}. Specifically, the user downloads the syndrome corresponding to $W_θ$, according to a linear block code with carefully designed parameters, using the optimal PIR scheme for messages with a length constraint. We derive lower and upper bounds for the optimal download cost that match if the term $\log_2\left(\sum_{i=0}^f \binom{L}{i}\right)$ is an integer. Our results imply that there is a significant reduction in the download cost if $f < \frac{L}{2}$ compared with downloading $W_θ$ directly using classical PIR approaches without taking the correlation between $W_θ$ and $\hat{W}_θ$ into consideration.
Submitted 25 February, 2021;
originally announced February 2021.
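As a small illustration of the matching condition stated in the abstract above, the following sketch evaluates the term $\log_2\left(\sum_{i=0}^f \binom{L}{i}\right)$ and checks whether it is an integer. The function names are hypothetical and chosen only for this example; the code does not compute the paper's download-cost bounds, only the quantity the abstract says they hinge on.

```python
from math import comb, log2


def hamming_ball_size(L, f):
    """Number of length-L binary vectors with at most f ones: sum_{i=0}^{f} C(L, i)."""
    return sum(comb(L, i) for i in range(f + 1))


def bounds_match_condition(L, f):
    """Return (ball size, log2 of ball size, whether that log2 is an integer).

    The integrality test is done exactly on integers (power-of-two check)
    rather than on the floating-point logarithm.
    """
    size = hamming_ball_size(L, f)
    is_power_of_two = size & (size - 1) == 0
    return size, log2(size), is_power_of_two
```

For instance, `L = 7, f = 1` gives a ball of size 8 and `L = 23, f = 3` gives 2048, so the term is an integer in both cases; `L = 10, f = 2` gives 56, for which it is not.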
-
Multi-Party Private Set Intersection: An Information-Theoretic Approach
Authors:
Zhusheng Wang,
Karim Banawan,
Sennur Ulukus
Abstract:
We investigate the problem of multi-party private set intersection (MP-PSI). In MP-PSI, there are $M$ parties, each storing a data set $\mathcal{P}_i$ over $N_i$ replicated and non-colluding databases, and we want to calculate the intersection of the data sets $\cap_{i=1}^M \mathcal{P}_i$ without leaking any information beyond the set intersection to any of the parties. We consider a specific communication protocol where one of the parties, called the leader party, initiates the MP-PSI protocol by sending queries to the remaining parties which are called client parties. The client parties are not allowed to communicate with each other. We propose an information-theoretic scheme that privately calculates the intersection $\cap_{i=1}^M \mathcal{P}_i$ with a download cost of $D = \min_{t \in \{1, \cdots, M\}} \sum_{i \in \{1, \cdots, M\}\setminus \{t\}} \left\lceil \frac{|\mathcal{P}_t|N_i}{N_i-1}\right\rceil$. Similar to the 2-party PSI problem, our scheme builds on the connection between the PSI problem and the multi-message symmetric private information retrieval (MM-SPIR) problem. Our scheme is a non-trivial generalization of the 2-party PSI scheme as it needs an intricate design of the shared common randomness. Interestingly, in terms of the download cost, our scheme does not incur any penalty due to the more stringent privacy constraints in the MP-PSI problem compared to the 2-party PSI problem.
Submitted 17 August, 2020;
originally announced August 2020.
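The download-cost expression in the abstract above can be evaluated numerically. The sketch below is only an illustration of that formula, assuming each party $i$ is described by its set size $|\mathcal{P}_i|$ and its number of databases $N_i \geq 2$; the function name and argument layout are hypothetical, and the code does not implement the retrieval scheme itself.

```python
def mp_psi_download_cost(set_sizes, num_databases):
    """Evaluate D = min_t sum_{i != t} ceil(|P_t| * N_i / (N_i - 1)).

    set_sizes[i]     -- cardinality |P_i| of party i's data set
    num_databases[i] -- number of databases N_i of party i (must be >= 2)
    Returns (D, index of the minimizing leader party t), 0-indexed.
    """
    if len(set_sizes) != len(num_databases):
        raise ValueError("one set size and one database count per party")
    if any(n < 2 for n in num_databases):
        raise ValueError("the formula needs N_i >= 2 for every party")

    M = len(set_sizes)
    best_cost, best_leader = None, None
    for t in range(M):
        # Integer ceiling division avoids floating-point rounding issues.
        cost = sum(
            -(-(set_sizes[t] * num_databases[i]) // (num_databases[i] - 1))
            for i in range(M)
            if i != t
        )
        if best_cost is None or cost < best_cost:
            best_cost, best_leader = cost, t
    return best_cost, best_leader
```

With set sizes `[3, 5, 4]` and database counts `[2, 3, 4]`, the candidate leaders give costs 9, 17, and 14, so the minimum is attained by the first party.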
-
Sample, Quantize and Encode: Timely Estimation Over Noisy Channels
Authors:
Ahmed Arafa,
Karim Banawan,
Karim G. Seddik,
H. Vincent Poor
Abstract:
The effects of quantization and coding on the estimation quality of Gauss-Markov processes are considered, with special attention to the Ornstein-Uhlenbeck process. Samples are acquired from the process, quantized, and then encoded for transmission using either infinite incremental redundancy (IIR) or fixed redundancy (FR) coding schemes. A fixed processing time is consumed at the receiver for decoding and sending feedback to the transmitter. Decoded messages are used to construct a minimum mean square error (MMSE) estimate of the process as a function of time. This is shown to be an increasing functional of the age-of-information (AoI), defined as the time elapsed since the sampling time pertaining to the latest successfully decoded message. Such a functional depends on the quantization bits, codeword lengths and receiver processing time. The goal, for each coding scheme, is to optimize sampling times such that the long-term average MMSE is minimized. This is then characterized in the setting of general increasing functionals of AoI, not necessarily corresponding to MMSE, which may be of independent interest in other contexts.
We first show that the optimal sampling policy for IIR is such that a new sample is generated only if the AoI exceeds a certain threshold, while for FR it is such that a new sample is delivered just-in-time as the receiver finishes processing the previous one. Enhanced transmission schemes are then developed in order to exploit the processing times to make new data available at the receiver sooner. For both IIR and FR, it is shown that there exists an optimal number of quantization bits that balances AoI and quantization errors, and hence minimizes the MMSE. It is also shown that for longer receiver processing times, the relatively simpler FR scheme outperforms IIR.
Submitted 21 June, 2021; v1 submitted 16 July, 2020;
originally announced July 2020.
-
Timely Estimation Using Coded Quantized Samples
Authors:
Ahmed Arafa,
Karim Banawan,
Karim G. Seddik,
H. Vincent Poor
Abstract:
The effects of quantization and coding on the estimation quality of a Gauss-Markov, namely Ornstein-Uhlenbeck, process are considered. Samples are acquired from the process, quantized, and then encoded for transmission using either infinite incremental redundancy or fixed redundancy coding schemes. A fixed processing time is consumed at the receiver for decoding and sending feedback to the transmitter. Decoded messages are used to construct a minimum mean square error (MMSE) estimate of the process as a function of time. This is shown to be an increasing functional of the age-of-information, defined as the time elapsed since the sampling time pertaining to the latest successfully decoded message. Such (age-penalty) functional depends on the quantization bits, codeword lengths and receiver processing time. The goal, for each coding scheme, is to optimize sampling times such that the long term average MMSE is minimized. This is then characterized in the setting of general increasing age-penalty functionals, not necessarily corresponding to MMSE, which may be of independent interest in other contexts.
Submitted 27 April, 2020;
originally announced April 2020.
-
Semantic Private Information Retrieval
Authors:
Sajani Vithana,
Karim Banawan,
Sennur Ulukus
Abstract:
We investigate the problem of semantic private information retrieval (semantic PIR). In semantic PIR, a user retrieves a message out of $K$ independent messages stored in $N$ replicated and non-colluding databases without revealing the identity of the desired message to any individual database. The messages come with \emph{different semantics}, i.e., the messages are allowed to have \emph{non-uniform a priori probabilities} denoted by $(p_i>0,\: i \in [K])$, which are a proxy for their respective popularity of retrieval, and \emph{arbitrary message sizes} $(L_i,\: i \in [K])$. This is a generalization of the classical private information retrieval (PIR) problem, where messages are assumed to have equal a priori probabilities and equal message sizes. We derive the semantic PIR capacity for general $K$, $N$. The results show that the semantic PIR capacity depends on the number of databases $N$, the number of messages $K$, the a priori probability distribution of messages $p_i$, and the message sizes $L_i$. We present two achievable semantic PIR schemes: The first one is a deterministic scheme which is based on message asymmetry. This scheme employs non-uniform subpacketization. The second scheme is probabilistic and is based on choosing one query set out of multiple options at random to retrieve the required message without the need for exponential subpacketization. We derive necessary and sufficient conditions for the semantic PIR capacity to exceed the classical PIR capacity with equal priors and sizes. Our results show that the semantic PIR capacity can be larger than the classical PIR capacity when longer messages have higher popularities. However, when messages are equal-length, the non-uniform priors cannot be exploited to improve the retrieval rate over the classical PIR capacity.
Submitted 30 March, 2020;
originally announced March 2020.
-
Private Set Intersection: A Multi-Message Symmetric Private Information Retrieval Perspective
Authors:
Zhusheng Wang,
Karim Banawan,
Sennur Ulukus
Abstract:
We study the problem of private set intersection (PSI). In this problem, there are two entities $E_i$, for $i=1, 2$, each storing a set $\mathcal{P}_i$, whose elements are picked from a finite field $\mathbb{F}_K$, on $N_i$ replicated and non-colluding databases. It is required to determine the set intersection $\mathcal{P}_1 \cap \mathcal{P}_2$ without leaking any information about the remaining elements to the other entity with the least amount of downloaded bits. We first show that the PSI problem can be recast as a multi-message symmetric private information retrieval (MM-SPIR) problem. Next, as a stand-alone result, we derive the information-theoretic sum capacity of MM-SPIR, $C_{MM-SPIR}$. We show that with $K$ messages, $N$ databases, and the size of the desired message set $P$, the exact capacity of MM-SPIR is $C_{MM-SPIR} = 1 - \frac{1}{N}$ when $P \leq K-1$, provided that the entropy of the common randomness $S$ satisfies $H(S) \geq \frac{P}{N-1}$ per desired symbol. This result implies that there is no gain for MM-SPIR over successive single-message SPIR (SM-SPIR). For the MM-SPIR problem, we present a novel capacity-achieving scheme that builds on the near-optimal scheme of Banawan-Ulukus originally proposed for the multi-message PIR (MM-PIR) problem without database privacy constraints. Surprisingly, our scheme here is exactly optimal for the MM-SPIR problem for any $P$, in contrast to the scheme for the MM-PIR problem, which was proved only to be near-optimal. Our scheme is an alternative to the SM-SPIR scheme of Sun-Jafar. Based on this capacity result for MM-SPIR, and after addressing the added requirements in its conversion to the PSI problem, we show that the optimal download cost for the PSI problem is $\min\left\{\left\lceil\frac{P_1 N_2}{N_2-1}\right\rceil, \left\lceil\frac{P_2 N_1}{N_1-1}\right\rceil\right\}$, where $P_i$ is the cardinality of set $\mathcal{P}_i$.
Submitted 29 November, 2020; v1 submitted 31 December, 2019;
originally announced December 2019.
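The two closed-form expressions quoted in the abstract above, the MM-SPIR capacity $1 - \frac{1}{N}$ and the two-party PSI download cost, are easy to tabulate. The following is a minimal sketch of that arithmetic only, with hypothetical function names; it assumes $N, N_1, N_2 \geq 2$ and does not check the side conditions ($P \leq K-1$, sufficient common randomness) under which the abstract states the results.

```python
from fractions import Fraction


def mm_spir_capacity(N):
    """Capacity expression 1 - 1/N from the abstract, as an exact fraction."""
    if N < 2:
        raise ValueError("the expression is only meaningful for N >= 2")
    return 1 - Fraction(1, N)


def psi_download_cost(P1, P2, N1, N2):
    """min{ ceil(P1*N2/(N2-1)), ceil(P2*N1/(N1-1)) } using integer ceilings."""
    if N1 < 2 or N2 < 2:
        raise ValueError("the expression needs N1 >= 2 and N2 >= 2")
    cost_via_entity_2 = -(-(P1 * N2) // (N2 - 1))  # query entity 2's databases
    cost_via_entity_1 = -(-(P2 * N1) // (N1 - 1))  # query entity 1's databases
    return min(cost_via_entity_2, cost_via_entity_1)
```

For example, with $P_1 = 3$, $P_2 = 5$, $N_1 = 2$, $N_2 = 3$ the two candidates are 5 and 10, so the expression evaluates to 5.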
-
Improved Storage for Efficient Private Information Retrieval
Authors:
Karim Banawan,
Batuhan Arasli,
Sennur Ulukus
Abstract:
We consider the problem of private information retrieval from $N$ \emph{storage-constrained} databases. In this problem, a user wishes to retrieve a single message out of $M$ messages (of size $L$) without revealing any information about the identity of the message to individual databases. Each database stores $μML$ symbols, i.e., a $μ$ fraction of the entire library, where $\frac{1}{N} \leq μ\leq 1$. Our goal is to characterize the optimal tradeoff curve for the storage cost (captured by $μ$) and the normalized download cost ($D/L$). We show that the download cost can be reduced by employing a hybrid storage scheme that combines \emph{MDS coding} ideas with \emph{uncoded partial replication} ideas. When there is no coding, our scheme reduces to Attia-Kumar-Tandon storage scheme, which was initially introduced by Maddah-Ali-Niesen in the context of the caching problem, and when there is no uncoded partial replication, our scheme reduces to Banawan-Ulukus storage scheme; in general, our scheme outperforms both.
Submitted 29 August, 2019;
originally announced August 2019.
-
On Timely Channel Coding with Hybrid ARQ
Authors:
Ahmed Arafa,
Karim Banawan,
Karim G. Seddik,
H. Vincent Poor
Abstract:
A status updating communication system is examined, in which a transmitter communicates with a receiver over a noisy channel. The goal is to realize timely delivery of fresh data over time, which is assessed by an age-of-information (AoI) metric. Channel coding is used to combat the channel errors, and feedback is sent to acknowledge updates' reception. In case decoding is unsuccessful, a hybrid ARQ protocol is employed, in which incremental redundancy (IR) bits are transmitted to enhance the decoding ability. This continues for some amount of time in case decoding remains unsuccessful, after which a new (fresh) status update is transmitted instead. In case decoding is successful, the transmitter has the option to idly wait for a certain amount of time before sending a new update. A general problem is formulated that optimizes the codeword and IR lengths for each update, and the waiting times, such that the long term average AoI is minimized. Stationary deterministic policies are investigated, in which the codeword and IR lengths are fixed for each update, and the waiting time is a deterministic function of the AoI. The optimal waiting policy is then derived, and is shown to have a threshold structure, in which the transmitter sends a new update only if the AoI grows above a certain threshold that is a function of the codeword and IR lengths. Choosing the codeword and IR lengths is discussed in the context of binary symmetric channels.
Submitted 8 May, 2019;
originally announced May 2019.
-
Privacy-Preserving Smart Parking System Using Blockchain and Private Information Retrieval
Authors:
Wesam Al Amiri,
Mohamed Baza,
Karim Banawan,
Mohamed Mahmoud,
Waleed Alasmary,
Kemal Akkaya
Abstract:
Searching for available parking spaces is a major problem for drivers, especially in big crowded cities, causing traffic congestion and air pollution, and wasting drivers' time. Smart parking systems are a novel solution to enable drivers to have real-time parking information for pre-booking. However, current smart parking requires drivers to disclose their private information, such as desired destinations. Moreover, the existing schemes are centralized and vulnerable to the bottleneck of the single point of failure and data breaches. In this paper, we propose a distributed privacy-preserving smart parking system using blockchain. We propose a consortium blockchain, created by different parking lot owners, that stores their parking offers and ensures security, transparency, and availability. To preserve drivers' location privacy, we adopt a private information retrieval (PIR) technique to enable drivers to retrieve parking offers from blockchain nodes privately, without revealing which parking offers are retrieved. Furthermore, a short randomizable signature is used to enable drivers to reserve available parking slots in an anonymous manner. In addition, we introduce an anonymous payment system that cannot link drivers to specific parking locations. Finally, our performance evaluations demonstrate that the proposed scheme can preserve drivers' privacy with low communication and computation overhead.
Submitted 27 January, 2021; v1 submitted 21 April, 2019;
originally announced April 2019.
-
The Capacity of Private Information Retrieval from Heterogeneous Uncoded Caching Databases
Authors:
Karim Banawan,
Batuhan Arasli,
Yi-Peng Wei,
Sennur Ulukus
Abstract:
We consider private information retrieval (PIR) of a single file out of $K$ files from $N$ non-colluding databases with heterogeneous storage constraints $\mathbf{m}=(m_1, \cdots, m_N)$. The aim of this work is to jointly design the content placement phase and the information retrieval phase in order to minimize the download cost in the PIR phase. We characterize the optimal PIR download cost as a linear program. By analyzing the structure of the optimal solution of this linear program, we show that, surprisingly, the optimal download cost in our heterogeneous case matches its homogeneous counterpart where all databases have the same average storage constraint $μ=\frac{1}{N} \sum_{n=1}^{N} m_n$. Thus, we show that there is no loss in the PIR capacity due to heterogeneity of storage spaces of the databases. We provide the optimum content placement explicitly for $N=3$.
Submitted 25 February, 2019;
originally announced February 2019.
-
Private Information Retrieval from Non-Replicated Databases
Authors:
Karim Banawan,
Sennur Ulukus
Abstract:
We consider the problem of private information retrieval (PIR) of a single message out of $K$ messages from $N$ non-colluding and non-replicated databases. Different from the majority of the existing literature, which considers the case of replicated databases where all databases store the same content in the form of all $K$ messages, here, we consider the case of non-replicated databases under a special non-replication structure where each database stores $M$ out of $K$ messages and each message is stored across $R$ different databases. This generates an $R$-regular graph structure for the storage system where the vertices of the graph are the messages and the edges are the databases. We derive a general upper bound for $M=2$ that depends on the graph structure. We then specialize the problem to storage systems described by two special types of graph structures: cyclic graphs and \emph{fully-connected graphs}. We prove that the PIR capacity for the case of cyclic graphs is $\frac{2}{K+1}$, and the PIR capacity for the case of fully-connected graphs is $\min\{\frac{2}{K},\frac{1}{2}\}$. To that end, we propose novel achievable schemes for both graph structures that are capacity-achieving. The central insight in both schemes is to introduce dependency in the queries submitted to databases that do not contain the desired message, such that the requests can be compressed. In both cases, the results show severe degradation in PIR capacity due to non-replication.
Submitted 31 December, 2018;
originally announced January 2019.
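The two capacity expressions stated in the abstract above, $\frac{2}{K+1}$ for cyclic graphs and $\min\{\frac{2}{K},\frac{1}{2}\}$ for fully-connected graphs, can be compared side by side for a given number of messages $K$. This is only a sketch of that arithmetic with hypothetical function names, assuming $K \geq 3$ so that both graph structures are well defined; it does not model the storage graph or the achievable schemes.

```python
from fractions import Fraction


def cyclic_graph_pir_capacity(K):
    """Capacity expression 2/(K+1) quoted for cyclic graphs."""
    return Fraction(2, K + 1)


def fully_connected_graph_pir_capacity(K):
    """Capacity expression min{2/K, 1/2} quoted for fully-connected graphs."""
    return min(Fraction(2, K), Fraction(1, 2))


def capacity_table(max_K):
    """List (K, cyclic capacity, fully-connected capacity) for K = 3..max_K."""
    return [
        (K, cyclic_graph_pir_capacity(K), fully_connected_graph_pir_capacity(K))
        for K in range(3, max_K + 1)
    ]
```

Both expressions decrease with $K$ once $K \geq 4$, which matches the abstract's remark that capacity degrades under non-replication.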
-
The Capacity of Private Information Retrieval from Decentralized Uncoded Caching Databases
Authors:
Yi-Peng Wei,
Batuhan Arasli,
Karim Banawan,
Sennur Ulukus
Abstract:
We consider the private information retrieval (PIR) problem from decentralized uncoded caching databases. There are two phases in our problem setting, a caching phase, and a retrieval phase. In the caching phase, a data center containing all the $K$ files, where each file is of size $L$ bits, and several databases with storage size constraint $μK L$ bits exist in the system. Each database independently chooses $μK L$ bits out of the total $KL$ bits from the data center to cache through the same probability distribution in a decentralized manner. In the retrieval phase, a user (retriever) accesses $N$ databases in addition to the data center, and wishes to retrieve a desired file privately. We characterize the optimal normalized download cost to be $\frac{D}{L} = \sum_{n=1}^{N+1} \binom{N}{n-1} μ^{n-1} (1-μ)^{N+1-n} \left( 1+ \frac{1}{n} + \dots+ \frac{1}{n^{K-1}} \right)$. We show that the uniform and random caching scheme, originally proposed for decentralized coded caching by Maddah-Ali and Niesen, together with the Sun and Jafar retrieval scheme, originally proposed for PIR from replicated databases, surprisingly results in the lowest normalized download cost. This is the decentralized counterpart of the recent result of Attia, Kumar and Tandon for the centralized case. The converse proof contains several ingredients such as interference lower bound, induction lemma, replacing queries and answering string random variables with the content of distributed databases, the nature of decentralized uncoded caching databases, and bit marginalization of joint caching distributions.
Submitted 27 November, 2018;
originally announced November 2018.
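The normalized download cost $\frac{D}{L}$ given in the abstract above is a finite sum and can be evaluated exactly. The sketch below does only that, using exact rational arithmetic; the function name and the choice to pass $μ$ as a `Fraction` are assumptions made for this illustration, not part of the paper.

```python
from fractions import Fraction
from math import comb


def decentralized_pir_download_cost(N, K, mu):
    """Evaluate D/L = sum_{n=1}^{N+1} C(N, n-1) mu^(n-1) (1-mu)^(N+1-n)
                              * (1 + 1/n + ... + 1/n^(K-1)).

    N  -- number of caching databases accessed (the data center is extra)
    K  -- number of files
    mu -- storage fraction per database, 0 <= mu <= 1
    """
    mu = Fraction(mu)
    if not 0 <= mu <= 1:
        raise ValueError("mu must lie in [0, 1]")

    total = Fraction(0)
    for n in range(1, N + 2):
        # Binomial weight of the n-th term of the sum.
        weight = comb(N, n - 1) * mu ** (n - 1) * (1 - mu) ** (N + 1 - n)
        # Geometric-type sum 1 + 1/n + ... + 1/n^(K-1).
        per_term_cost = sum(Fraction(1, n ** k) for k in range(K))
        total += weight * per_term_cost
    return total
```

Two sanity checks follow directly from the formula: at $μ = 0$ only the $n = 1$ term survives and the cost is $K$, and at $μ = 1$ only the $n = N+1$ term survives and the cost is $1 + \frac{1}{N+1} + \dots + \frac{1}{(N+1)^{K-1}}$.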
-
Noisy Private Information Retrieval: On Separability of Channel Coding and Information Retrieval
Authors:
Karim Banawan,
Sennur Ulukus
Abstract:
We consider the problem of noisy private information retrieval (NPIR) from $N$ non-communicating databases, each storing the same set of $M$ messages. In this model, the answer strings are not returned through noiseless bit pipes, but rather through \emph{noisy} memoryless channels. We aim at characterizing the PIR capacity for this model as a function of the statistical information measures of the noisy channels such as entropy and mutual information. We derive a general upper bound for the retrieval rate in the form of a max-min optimization. We use the achievable schemes for the PIR problem under asymmetric traffic constraints and random coding arguments to derive a general lower bound for the retrieval rate. The upper and lower bounds match for $M=2$ and $M=3$, for any $N$, and any noisy channel. The results imply that separation between channel coding and retrieval is optimal except for adapting the traffic ratio from the databases. We refer to this as \emph{almost separation}. Next, we consider the private information retrieval problem from multiple access channels (MAC-PIR). In MAC-PIR, the database responses reach the user through a multiple access channel (MAC) that mixes the responses together in a stochastic way. We show that for the additive MAC and the conjunction/disjunction MAC, channel coding and retrieval scheme are \emph{inseparable} unlike in NPIR. We show that the retrieval scheme depends on the properties of the MAC, in particular on the linearity aspect. For both cases, we provide schemes that achieve the full capacity without any loss due to the privacy constraint, which implies that the user can exploit the nature of the channel to improve privacy. Finally, we show that the full unconstrained capacity is not always attainable by determining the capacity of the selection channel.
Submitted 16 July, 2018;
originally announced July 2018.
-
Private Information Retrieval Through Wiretap Channel II: Privacy Meets Security
Authors:
Karim Banawan,
Sennur Ulukus
Abstract:
We consider the problem of private information retrieval through wiretap channel II (PIR-WTC-II). In PIR-WTC-II, a user wants to retrieve a single message (file) privately out of $M$ messages, which are stored in $N$ replicated and non-communicating databases. An external eavesdropper observes a fraction $μ_n$ (of its choice) of the traffic exchanged between the $n$th database and the user. In addition to the privacy constraint, the databases should encode the returned answer strings such that the eavesdropper learns absolutely nothing about the \emph{contents} of the databases. We aim at characterizing the capacity of the PIR-WTC-II under the combined privacy and security constraints. We obtain a general upper bound for the problem in the form of a max-min optimization problem, which extends the converse proof of the PIR problem under asymmetric traffic constraints. We propose an achievability scheme that satisfies the security constraint by encoding a secret key, which is generated securely at each database, into an artificial noise vector using an MDS code. The user and the databases operate at one of the corner points of the achievable scheme for the PIR under asymmetric traffic constraints such that the retrieval rate is maximized under the imposed security constraint. The upper bound and the lower bound match for the case of $M=2$ and $M=3$ messages, for any $N$, and any $\boldsymbol{μ}=(μ_1, \cdots, μ_N)$.
Submitted 18 January, 2018;
originally announced January 2018.
-
Asymmetry Hurts: Private Information Retrieval Under Asymmetric Traffic Constraints
Authors:
Karim Banawan,
Sennur Ulukus
Abstract:
We consider the classical setting of private information retrieval (PIR) of a single message (file) out of $M$ messages from $N$ distributed databases under the new constraint of \emph{asymmetric traffic} from databases. In this problem, the \emph{ratios between the traffic} from the databases are constrained, i.e., the ratio of the length of the answer string that the user (retriever) receives from the $n$th database to the total length of all answer strings from all databases is constrained to be $τ_n$. This may happen if the user's access to the databases is restricted due to database availability, channel quality to the databases, and other factors. For this problem, for fixed $M$, $N$, we develop a general upper bound $\bar{C}(\boldsymbol{τ})$, which generalizes the converse proof of Sun-Jafar, where database symmetry was inherently used. Our converse bound is a piece-wise affine function in the traffic ratio vector $\boldsymbol{τ}=(τ_1, \cdots, τ_N)$. For the lower bound, we explicitly show the achievability of $\binom{M+N-1}{M}$ corner points. For the remaining traffic ratio vectors, we perform time-sharing between these corner points. The recursive structure of our achievability scheme is captured via a system of difference equations. The upper and lower bounds exactly match for $M=2$ and $M=3$ for any $N$ and any $\boldsymbol{τ}$. The results show strict loss of PIR capacity due to the asymmetric traffic constraints compared with the symmetric case of Sun-Jafar which implicitly uses $τ_n=\frac{1}{N}$ for all $n$.
Submitted 9 January, 2018;
originally announced January 2018.
-
Cache-Aided Private Information Retrieval with Partially Known Uncoded Prefetching: Fundamental Limits
Authors:
Yi-Peng Wei,
Karim Banawan,
Sennur Ulukus
Abstract:
We consider the problem of private information retrieval (PIR) from $N$ non-colluding and replicated databases, when the user is equipped with a cache that holds an uncoded fraction $r$ of the symbols from each of the $K$ stored messages in the databases. This model operates in a two-phase scheme, namely, the prefetching phase where the user acquires side information and the retrieval phase where the user privately downloads the desired message. In the prefetching phase, the user receives $\frac{r}{N}$ uncoded fraction of each message from the $n$th database. This side information is known only to the $n$th database and unknown to the remaining databases, i.e., the user possesses \emph{partially known} side information. We investigate the optimal normalized download cost $D^*(r)$ in the retrieval phase as a function of $K$, $N$, $r$. We develop lower and upper bounds for the optimal download cost. The bounds match in general for the cases of very low caching ratio ($r \leq \frac{1}{N^{K-1}}$) and very high caching ratio ($r \geq \frac{K-2}{N^2-3N+KN}$). We fully characterize the optimal download cost caching ratio tradeoff for $K=3$. For general $K$, $N$, and $r$, we show that the largest gap between the achievability and the converse bounds is $\frac{5}{32}$.
Submitted 18 December, 2017;
originally announced December 2017.
-
The Capacity of Private Information Retrieval with Partially Known Private Side Information
Authors:
Yi-Peng Wei,
Karim Banawan,
Sennur Ulukus
Abstract:
We consider the problem of private information retrieval (PIR) of a single message out of $K$ messages from $N$ replicated and non-colluding databases where a cache-enabled user (retriever) of cache-size $M$ possesses side information in the form of full messages that are partially known to the databases. In this model, the user and the databases engage in a two-phase scheme, namely, the prefetching phase where the user acquires side information and the retrieval phase where the user downloads desired information. In the prefetching phase, the user receives $m_n$ full messages from the $n$th database, under the cache memory size constraint $\sum_{n=1}^N m_n \leq M$. In the retrieval phase, the user wishes to retrieve a message such that no individual database learns anything about the identity of the desired message. In addition, the identities of the side information messages that the user did not prefetch from a database must remain private against that database. Since the side information provided by each database in the prefetching phase is known by the providing database and the side information must be kept private against the remaining databases, we coin this model as \textit{partially known private side information}. We characterize the capacity of the PIR with partially known private side information to be $C=\left(1+\frac{1}{N}+\cdots+\frac{1}{N^{K-M-1}}\right)^{-1}=\frac{1-\frac{1}{N}}{1-(\frac{1}{N})^{K-M}}$. Interestingly, this result is the same if none of the databases knows any of the prefetched side information, i.e., when the side information is obtained externally, a problem posed by Kadhe et al. and settled by Chen-Wang-Jafar recently. Thus, our result implies that there is no loss in using the same databases for both prefetching and retrieval phases.
Submitted 26 November, 2017; v1 submitted 2 October, 2017;
originally announced October 2017.
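The capacity expression quoted in the abstract above is given in two forms, a finite geometric series and a closed form. The following is a minimal sketch, not taken from the paper, that evaluates both with exact rational arithmetic and checks that they agree. The function names and the domain check ($N \geq 2$, $0 \leq M < K$) are assumptions made for illustration.

```python
from fractions import Fraction


def _check_domain(N: int, K: int, M: int) -> None:
    # Assumed domain for this sketch: at least two databases, cache smaller than the library.
    if N < 2 or not 0 <= M < K:
        raise ValueError("assumed domain: N >= 2 and 0 <= M < K")


def capacity_series(N: int, K: int, M: int) -> Fraction:
    """Series form from the abstract: (1 + 1/N + ... + 1/N^(K-M-1))^(-1)."""
    _check_domain(N, K, M)
    x = Fraction(1, N)
    return 1 / sum(x ** i for i in range(K - M))


def capacity_closed_form(N: int, K: int, M: int) -> Fraction:
    """Closed form from the abstract: (1 - 1/N) / (1 - (1/N)^(K-M))."""
    _check_domain(N, K, M)
    x = Fraction(1, N)
    return (1 - x) / (1 - x ** (K - M))


if __name__ == "__main__":
    # Hypothetical parameters: N = 2 databases, K = 3 messages, cache size M = 1.
    print(capacity_series(2, 3, 1), capacity_closed_form(2, 3, 1))
```

For the hypothetical parameters $N=2$, $K=3$, $M=1$, both forms evaluate to $\frac{2}{3}$.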
-
Fundamental Limits of Cache-Aided Private Information Retrieval with Unknown and Uncoded Prefetching
Authors:
Yi-Peng Wei,
Karim Banawan,
Sennur Ulukus
Abstract:
We consider the problem of private information retrieval (PIR) from $N$ non-colluding and replicated databases when the user is equipped with a cache that holds an uncoded fraction $r$ from each of the $K$ stored messages in the databases. We assume that the databases are unaware of the cache content. We investigate $D^*(r)$, the optimal download cost normalized by the message size, as a function of $K$, $N$, $r$. For fixed $K$, $N$, we develop an inner bound (converse bound) for the $D^*(r)$ curve. The inner bound is a piece-wise linear function in $r$ that consists of $K$ line segments. For the achievability, we develop explicit schemes that exploit the cached bits as side information to achieve $K-1$ non-degenerate corner points. These corner points differ in the number of cached bits that are used to generate one side information equation. We obtain an outer bound (achievability) for any caching ratio by memory-sharing between these corner points. Thus, the outer bound is also a piece-wise linear function in $r$ that consists of $K$ line segments. The inner and the outer bounds match in general for the cases of very low caching ratio ($r \leq \frac{1}{1+N+N^2+\cdots+N^{K-1}}$) and very high caching ratio ($r \geq \frac{K-2}{(N+1)K+N^2-2N-2}$). As a corollary, we fully characterize the optimal download cost caching ratio tradeoff for $K=3$. For general $K$, $N$, and $r$, we show that the largest gap between the achievability and the converse bounds is $\frac{1}{6}$. Our results show that the download cost can be reduced beyond memory-sharing if the databases are unaware of the cached content.
Submitted 4 September, 2017;
originally announced September 2017.
-
The Capacity of Private Information Retrieval from Byzantine and Colluding Databases
Authors:
Karim Banawan,
Sennur Ulukus
Abstract:
We consider the problem of single-round private information retrieval (PIR) from $N$ replicated databases. We consider the case when $B$ databases are outdated (unsynchronized), or even worse, adversarial (Byzantine), and therefore, can return incorrect answers. In the PIR problem with Byzantine databases (BPIR), a user wishes to retrieve a specific message from a set of $M$ messages with zero-error, irrespective of the actions performed by the Byzantine databases. We consider the $T$-privacy constraint in this paper, where any $T$ databases can collude, and exchange the queries submitted by the user. We derive the information-theoretic capacity of this problem, which is the maximum number of \emph{correct symbols} that can be retrieved privately (under the $T$-privacy constraint) for every symbol of the downloaded data. We determine the exact BPIR capacity to be $C=\frac{N-2B}{N}\cdot\frac{1-\frac{T}{N-2B}}{1-(\frac{T}{N-2B})^M}$, if $2B+T < N$. This capacity expression shows that the effect of Byzantine databases on the retrieval rate is equivalent to removing $2B$ databases from the system, with a penalty factor of $\frac{N-2B}{N}$, which signifies that even though the number of databases needed for PIR is effectively $N-2B$, the user still needs to access the entire $N$ databases. The result shows that for the unsynchronized PIR problem, if the user does not have any knowledge about the fraction of the messages that are mis-synchronized, the single-round capacity is the same as the BPIR capacity. Our achievable scheme extends the optimal achievable scheme for the robust PIR (RPIR) problem to correct the \emph{errors} introduced by the Byzantine databases as opposed to \emph{erasures} in the RPIR problem. Our converse proof uses the idea of the cut-set bound in the network coding problem against adversarial nodes.
Submitted 5 June, 2017;
originally announced June 2017.
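To make the BPIR capacity expression in the abstract above concrete, here is a minimal sketch, not taken from the paper, that evaluates $C=\frac{N-2B}{N}\cdot\frac{1-\frac{T}{N-2B}}{1-(\frac{T}{N-2B})^M}$ exactly. The function name and the input checks ($M \geq 1$, $B \geq 0$, $T \geq 1$) are assumptions for illustration; the condition $2B+T<N$ is the one stated in the abstract.

```python
from fractions import Fraction


def bpir_capacity(N: int, B: int, T: int, M: int) -> Fraction:
    """Evaluate the BPIR capacity expression quoted in the abstract.

    N: databases, B: Byzantine databases, T: colluding databases, M: messages.
    """
    if M < 1 or B < 0 or T < 1:
        raise ValueError("assumed domain: M >= 1, B >= 0, T >= 1")
    if 2 * B + T >= N:
        raise ValueError("the expression is stated only for 2B + T < N")
    n_eff = N - 2 * B              # effective number of databases after removing 2B
    rho = Fraction(T, n_eff)       # ratio T / (N - 2B), strictly below 1 here
    penalty = Fraction(n_eff, N)   # penalty factor (N - 2B) / N
    return penalty * (1 - rho) / (1 - rho ** M)


if __name__ == "__main__":
    # Hypothetical parameters: N = 5, one Byzantine database, T = 1, M = 2 messages.
    print(bpir_capacity(5, 1, 1, 2))
```

With the hypothetical parameters $N=5$, $B=1$, $T=1$, $M=2$, the expression evaluates to $\frac{9}{20}$. Setting $B=0$ and $T=1$ reduces it to $\frac{1-\frac{1}{N}}{1-(\frac{1}{N})^M}$, since the penalty factor becomes 1.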
-
Multi-Message Private Information Retrieval: Capacity Results and Near-Optimal Schemes
Authors:
Karim Banawan,
Sennur Ulukus
Abstract:
We consider the problem of multi-message private information retrieval (MPIR) from $N$ non-communicating replicated databases. In MPIR, the user is interested in retrieving $P$ messages out of $M$ stored messages without leaking the identity of the retrieved messages. The information-theoretic sum capacity of MPIR $C_s^P$ is the maximum number of desired message symbols that can be retrieved privately per downloaded symbol. For the case $P \geq \frac{M}{2}$, we determine the exact sum capacity of MPIR as $C_s^P=\frac{1}{1+\frac{M-P}{PN}}$. The achievable scheme in this case is based on downloading MDS-coded mixtures of all messages. For $P \leq \frac{M}{2}$, we develop lower and upper bounds for all $M,P,N$. These bounds match if the total number of messages $M$ is an integer multiple of the number of desired messages $P$, i.e., $\frac{M}{P} \in \mathbb{N}$. In this case, $C_s^P=\frac{1-\frac{1}{N}}{1-(\frac{1}{N})^{M/P}}$. The achievable scheme in this case generalizes the single-message capacity achieving scheme to have an unbalanced number of stages per round of download. For all the remaining cases, the difference between the lower and upper bounds is at most $0.0082$, which occurs for $M=5$, $P=2$, $N=2$. Our results indicate that joint retrieval of desired messages is more efficient than successive use of single-message retrieval schemes.
Submitted 6 February, 2017;
originally announced February 2017.
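The two exact sum-capacity expressions in the abstract above apply in different regimes, which a short sketch can make explicit. The code below is an illustration, not the authors' method: the function name and the input checks are assumptions, and it returns `None` in the regime where the abstract reports only bounds rather than inventing a value.

```python
from fractions import Fraction
from typing import Optional


def mpir_sum_capacity(M: int, P: int, N: int) -> Optional[Fraction]:
    """Exact MPIR sum capacity where the abstract states it, else None.

    M: stored messages, P: desired messages, N: databases.
    """
    if N < 2 or not 1 <= P <= M:
        raise ValueError("assumed domain: N >= 2 and 1 <= P <= M")
    if 2 * P >= M:
        # Regime P >= M/2: C = 1 / (1 + (M - P) / (P N)).
        return 1 / (1 + Fraction(M - P, P * N))
    if M % P == 0:
        # Regime M/P integer: C = (1 - 1/N) / (1 - (1/N)^(M/P)).
        x = Fraction(1, N)
        return (1 - x) / (1 - x ** (M // P))
    # Remaining cases: only lower and upper bounds are reported.
    return None


if __name__ == "__main__":
    print(mpir_sum_capacity(3, 2, 2))  # regime P >= M/2
    print(mpir_sum_capacity(4, 1, 2))  # regime M/P integer
    print(mpir_sum_capacity(5, 2, 2))  # bounds only
```

At the boundary $P=\frac{M}{2}$ both expressions give $\frac{N}{N+1}$, so the order in which the two regimes are checked does not affect the result.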
-
The Capacity of Private Information Retrieval from Coded Databases
Authors:
Karim Banawan,
Sennur Ulukus
Abstract:
We consider the problem of private information retrieval (PIR) over a distributed storage system. The storage system consists of $N$ non-colluding databases, each storing a coded version of $M$ messages. In the PIR problem, the user wishes to retrieve one of the available messages without revealing the message identity to any individual database. We derive the information-theoretic capacity of this problem, which is defined as the maximum number of bits of the desired message that can be privately retrieved per one bit of downloaded information. We show that the PIR capacity in this case is $C=\left(1+\frac{K}{N}+\frac{K^2}{N^2}+\cdots+\frac{K^{M-1}}{N^{M-1}}\right)^{-1}=(1+R_c+R_c^2+\cdots+R_c^{M-1})^{-1}=\frac{1-R_c}{1-R_c^M}$, where $R_c$ is the rate of the $(N,K)$ code used. The capacity is a function of the code rate and the number of messages only regardless of the explicit structure of the storage code. The result implies a fundamental tradeoff between the optimal retrieval cost and the storage cost. The result generalizes the achievability and converse results for the classical PIR with replicating databases to the case of coded databases.
Submitted 26 September, 2016;
originally announced September 2016.
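The statement in the abstract above that the capacity depends only on the code rate $R_c=\frac{K}{N}$ and the number of messages $M$ can be checked numerically. The following is a minimal sketch under stated assumptions: the function name and the domain check ($1 \leq K \leq N$, $M \geq 1$) are hypothetical, and the series form is used so that the case $R_c=1$ does not divide by zero.

```python
from fractions import Fraction


def coded_pir_capacity(N: int, K: int, M: int) -> Fraction:
    """Series form from the abstract: (1 + R_c + ... + R_c^(M-1))^(-1), with R_c = K/N."""
    if M < 1 or not 1 <= K <= N:
        raise ValueError("assumed domain: M >= 1 and 1 <= K <= N")
    rate = Fraction(K, N)  # rate of the (N, K) storage code
    return 1 / sum(rate ** i for i in range(M))


if __name__ == "__main__":
    # Two hypothetical codes with the same rate 2/3 and the same M give the same capacity.
    print(coded_pir_capacity(3, 2, 2), coded_pir_capacity(6, 4, 2))
```

A $(3,2)$ code and a $(6,4)$ code both have $R_c=\frac{2}{3}$ and give $\frac{3}{5}$ for $M=2$. Taking $K=1$ gives $R_c=\frac{1}{N}$, for which the expression becomes $\frac{1-\frac{1}{N}}{1-(\frac{1}{N})^M}$.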
-
MIMO Wiretap Channel under Receiver Side Power Constraints with Applications to Wireless Power Transfer and Cognitive Radio
Authors:
Karim Banawan,
Sennur Ulukus
Abstract:
We consider the multiple-input multiple-output (MIMO) wiretap channel under a minimum receiver-side power constraint in addition to the usual maximum transmitter-side power constraint. This problem is motivated by energy harvesting communications with wireless energy transfer, where an added goal is to deliver a minimum amount of energy to a receiver in addition to delivering secure data to another receiver. In this paper, we characterize the exact secrecy capacity of the MIMO wiretap channel under transmitter and receiver-side power constraints. We first show that solving this problem is equivalent to solving the secrecy capacity of the wiretap channel under a double-sided correlation matrix constraint on the channel input. We show the converse by extending the channel enhancement technique to our case. We present two achievable schemes that achieve the secrecy capacity: the first achievable scheme uses a Gaussian codebook with a fixed mean, and the second achievable scheme uses artificial noise (or cooperative jamming) together with a Gaussian codebook. The role of the mean or the artificial noise is to enable energy transfer without sacrificing the secure rate. This is the first instance of a channel model where either the use of a mean signal or the use of channel prefixing via artificial noise is strictly necessary for the MIMO wiretap channel. We then extend our work to consider a maximum receiver-side power constraint. This problem is motivated by cognitive radio applications, where an added goal is to decrease the received signal energy (interference temperature) at a receiver. We further extend our results to: requiring receiver-side power constraints at both receivers; considering secrecy constraints at both receivers to study broadcast channels with confidential messages; and removing the secrecy constraints to study the classical broadcast channel.
Submitted 5 July, 2016;
originally announced July 2016.