Search | arXiv e-print repository

Off-Policy Evaluation in Markov Decision Processes under Weak Distributional Overlap

Abstract: Doubly robust methods hold considerable promise for off-policy evaluation in Markov decision processes (MDPs) under sequential ignorability: They have been shown to converge as $1/\sqrt{T}$ with the horizon $T$, to be statistically efficient in large samples, and to allow for modular implementation where preliminary estimation tasks can be executed using standard reinforcement learning techniques. Existing results, however, make heavy use of a strong distributional overlap assumption whereby the stationary distributions of the target policy and the data-collection policy are within a bounded factor of each other -- and this assumption is typically only credible when the state space of the MDP is bounded. In this paper, we re-visit the task of off-policy evaluation in MDPs under a weaker notion of distributional overlap, and introduce a class of truncated doubly robust (TDR) estimators which we find to perform well in this setting. When the distribution ratio of the target and data-collection policies is square-integrable (but not necessarily bounded), our approach recovers the large-sample behavior previously established under strong distributional overlap. When this ratio is not square-integrable, TDR is still consistent but with a slower-than-$1/\sqrt{T}$ rate; furthermore, this rate of convergence is minimax over a class of MDPs defined only using mixing conditions.
We validate our approach numerically and find that, in our experiments, appropriate truncation plays a major role in enabling accurate off-policy evaluation when strong distributional overlap does not hold.

Submitted 12 February, 2024; originally announced February 2024.

Comments: 50 pages, 4 figures

arXiv:2306.11855 [pdf, other]

A Model-free Closeness-of-influence Test for Features in Supervised Learning

Authors: Mohammad Mehrabi, Ryan A. Rossi

Abstract: Understanding the effect of a feature vector $x \in \mathbb{R}^d$ on the response value (label) $y \in \mathbb{R}$ is the cornerstone of many statistical learning problems. Ideally, it is desired to understand how a set of collected features combine together and influence the response value, but this problem is notoriously difficult, due to the high-dimensionality of data and limited number of labeled data points, among many others. In this work, we take a new perspective on this problem, and we study the question of assessing the difference of influence that two given features have on the response value. We first propose a notion of closeness for the influence of features, and show that our definition recovers the familiar notion of the magnitude of coefficients in the parametric model. We then propose a novel method to test for the closeness of influence in general model-free supervised learning problems. Our proposed test can be used with a finite number of samples with control on the type I error rate, no matter the ground truth conditional law $\mathcal{L}(Y |X)$. We analyze the power of our test for two general learning problems, i) linear regression, and ii) binary classification under mixture of Gaussian models, and show that under the proper choice of score function, an internal component of our test, the test with a sufficient number of samples will achieve full statistical power.
We evaluate our findings through extensive numerical simulations; specifically, we adopt the datamodel framework (Ilyas et al., 2022) for the CIFAR-10 dataset to identify pairs of training samples with different influence on the trained model via optional black box training mechanisms.

Submitted 20 June, 2023; originally announced June 2023.

arXiv:2301.03412 [pdf, other]

Neighbor Auto-Grouping Graph Neural Networks for Handover Parameter Configuration in Cellular Network

Authors: Mehrtash Mehrabi, Walid Masoudimansour, Yingxue Zhang, Jie Chuai, Zhitang Chen, Mark Coates, Jianye Hao, Yanhui Geng

Abstract: The mobile communication enabled by cellular networks is one of the main foundations of our modern society. Optimizing the performance of cellular networks and providing massive connectivity with improved coverage and user experience has a considerable social and economic impact on our daily life. This performance relies heavily on the configuration of the network parameters. However, with the massive increase in both the size and complexity of cellular networks, network management, especially parameter configuration, is becoming complicated. The current practice, which relies largely on experts' prior knowledge, is not adequate and will require lots of domain experts and high maintenance costs. In this work, we propose a learning-based framework for handover parameter configuration. The key challenge, in this case, is to tackle the complicated dependencies between neighboring cells and jointly optimize the whole network. Our framework addresses this challenge in two ways. First, we introduce a novel approach to imitate how the network responds to different network states and parameter values, called auto-grouping graph convolutional network (AG-GCN). Second, during the parameter configuration stage, instead of solving the global optimization problem, we design a local multi-objective optimization strategy where each cell considers several local performance metrics to balance its own performance and its neighbors. We evaluate our proposed algorithm via a simulator constructed using real network data.
We demonstrate that the handover parameters our model can find achieve better average network throughput compared to those recommended by experts as well as alternative baselines, which can bring better network quality and stability. It has the potential to massively reduce costs arising from human expert intervention and maintenance.

Submitted 27 February, 2023; v1 submitted 29 December, 2022; originally announced January 2023.

arXiv:2301.01274 [pdf, other]

Activity Detection for Grant-Free NOMA in Massive IoT Networks

Authors: Mehrtash Mehrabi, Mostafa Mohammadkarimi, Masoud Ardakani

Abstract: Recently, the grant-free transmission paradigm has been introduced for massive Internet of Things (IoT) networks to save both time and bandwidth and transmit the message with low latency. In order to accurately decode the message of each device at the base station (BS), first, the active devices at each transmission frame must be identified. In this work, we first investigate the problem of activity detection as a threshold comparing problem. We show the convexity of the activity detection method through analyzing its probability of error, which makes it possible to find the optimal threshold for minimizing the activity detection error. Consequently, to achieve an optimum solution, we propose a deep learning (DL)-based method called convolutional neural network (CNN)-activity detection (AD). In order to make it more practical, we consider unknown and time-varying activity rate for the IoT devices. Our simulations verify that our proposed CNN-AD method can achieve higher performance compared to the existing non-Bayesian greedy-based methods. Moreover, existing methods need to know the activity rate of IoT devices, whereas our method works for unknown and even time-varying activity rates.

Submitted 22 December, 2022; originally announced January 2023.

Comments: Accepted in International Conference on Computing, Networking and Communications (ICNC 2023)

arXiv:2211.09740 [pdf, other]

Sub-Graph Learning for Spatiotemporal Forecasting via Knowledge Distillation

Authors: Mehrtash Mehrabi, Yingxue Zhang

Abstract: One of the challenges in studying the interactions in large graphs is to learn their diverse pattern and various interaction types. Hence, considering only one distribution and model to study all nodes and ignoring their diversity and local features in their neighborhoods might severely affect the overall performance. Based on the structural information of the nodes in the graph and the interactions between them, the main graph can be divided into multiple sub-graphs. This graph partitioning can tremendously affect the learning process; however, the overall performance is highly dependent on the clustering method to avoid misleading the model. In this work, we present a new framework called KD-SGL to effectively learn the sub-graphs, where we define one global model to learn the overall structure of the graph and multiple local models for each sub-graph. We assess the performance of the proposed framework and evaluate it on public datasets. Based on the achieved results, it can improve the performance of the state-of-the-art spatiotemporal models with comparable results compared to ensemble of models with less complexity.

Submitted 17 November, 2022; originally announced November 2022.

arXiv:2209.02064 [pdf, other]

GRASP: A Goodness-of-Fit Test for Classification Learning

Authors: Adel Javanmard, Mohammad Mehrabi

Abstract: Performance of classifiers is often measured in terms of average accuracy on test data. Despite being a standard measure, average accuracy fails in characterizing the fit of the model to the underlying conditional law of labels given the features vector ($Y|X$), e.g. due to model misspecification, overfitting, and high-dimensionality. In this paper, we consider the fundamental problem of assessing the goodness-of-fit for a general binary classifier. Our framework does not make any parametric assumption on the conditional law $Y|X$, and treats that as a black box oracle model which can be accessed only through queries. We formulate the goodness-of-fit assessment problem as a tolerance hypothesis testing of the form \[ H_0: \mathbb{E}\Big[D_f\Big({\sf Bern}(\eta(X))\,\|\,{\sf Bern}(\hat{\eta}(X))\Big)\Big]\leq \tau\,, \] where $D_f$ represents an $f$-divergence function, and $\eta(x)$, $\hat{\eta}(x)$ respectively denote the true and an estimated likelihood for a feature vector $x$ admitting a positive label. We propose a novel test, called GRASP, for testing $H_0$, which works in finite sample settings, no matter the features (distribution-free). We also propose model-X GRASP designed for model-X settings where the joint distribution of the features vector is known. Model-X GRASP uses this distributional information to achieve better power. We evaluate the performance of our tests through extensive numerical experiments.

Submitted 30 August, 2023; v1 submitted 5 September, 2022; originally announced September 2022.

Comments: 54 pages, 4 tables and 5 figures

arXiv:2111.00027 [pdf, other]

Pearson Chi-squared Conditional Randomization Test

Authors: Adel Javanmard, Mohammad Mehrabi

Abstract: Conditional independence (CI) testing arises naturally in many scientific problems and application domains. The goal of this problem is to investigate the conditional independence between a response variable $Y$ and another variable $X$, while controlling for the effect of a high-dimensional confounding variable $Z$. In this paper, we introduce a novel test, called `Pearson Chi-squared Conditional Randomization' (PCR) test, which uses the distributional information on covariates $X,Z$ and constructs randomizations to test conditional independence. Our proposal is motivated by some of the hard alternatives for the vanilla conditional randomization test (Candès et al., 2018). We also provide a power analysis of the PCR test, which captures the effect of various parameters of the test, the sample size and the distance of the alternative from the set of null distributions, measured in terms of a notion called `conditional relative density'. In addition, we propose two extensions of the PCR test, with important practical implications: $(i)$ parameter-free PCR, which uses Bonferroni's correction to decide on a tuning parameter in the test; $(ii)$ robust PCR, which avoids inflations in the size of the test when there is slight error in estimating the conditional law $P_{X|Z}$.
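For orientation, the vanilla conditional randomization test that the abstract cites as its starting point can be sketched as follows. This is a minimal illustration of the generic resampling idea, not the PCR test itself: the Gaussian form of $P_{X|Z}$, the residual-correlation statistic, and the names `crt_p_value` and `simulate` are all assumptions made for the example.

```python
import random


def crt_p_value(x, y, z, cond_mean, cond_sd, n_resamples=199, seed=0):
    """Vanilla CRT p-value, assuming X | Z ~ N(cond_mean(z), cond_sd^2) is known."""
    rng = random.Random(seed)
    mu = [cond_mean(zi) for zi in z]

    def statistic(x_vals):
        # Illustrative statistic: |inner product of the residual X - E[X|Z] with Y|.
        return abs(sum((xi - mi) * yi for xi, mi, yi in zip(x_vals, mu, y)))

    observed = statistic(x)
    exceed = 0
    for _ in range(n_resamples):
        # Redraw X from its known conditional law; Y and Z stay fixed.
        x_tilde = [rng.gauss(mi, cond_sd) for mi in mu]
        if statistic(x_tilde) >= observed:
            exceed += 1
    # The +1 terms count the observed statistic among the resamples.
    return (1 + exceed) / (1 + n_resamples)


def simulate(effect, n=200, seed=1):
    """Toy data (hypothetical): X = 0.5 Z + noise, Y = effect * X + Z + noise."""
    rng = random.Random(seed)
    z = [rng.gauss(0.0, 1.0) for _ in range(n)]
    x = [0.5 * zi + rng.gauss(0.0, 1.0) for zi in z]
    y = [effect * xi + zi + rng.gauss(0.0, 1.0) for xi, zi in zip(x, z)]
    return x, y, z


x, y, z = simulate(effect=3.0)
p_alt = crt_p_value(x, y, z, cond_mean=lambda zi: 0.5 * zi, cond_sd=1.0)

x0, y0, z0 = simulate(effect=0.0)
p_null = crt_p_value(x0, y0, z0, cond_mean=lambda zi: 0.5 * zi, cond_sd=1.0)
```

With a strong dependence of $Y$ on $X$ the p-value should be small, while under conditional independence it is approximately uniform on the grid $\{1/200, \dots, 1\}$.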

Submitted 29 October, 2021; originally announced November 2021.

arXiv:2110.11950 [pdf, other]

Adversarial robustness for latent models: Revisiting the robust-standard accuracies tradeoff

Authors: Adel Javanmard, Mohammad Mehrabi

Abstract: Over the past few years, several adversarial training methods have been proposed to improve the robustness of machine learning models against adversarial perturbations in the input. Despite remarkable progress in this regard, adversarial training is often observed to drop the standard test accuracy. This phenomenon has intrigued the research community to investigate the potential tradeoff between standard accuracy (a.k.a. generalization) and robust accuracy (a.k.a. robust generalization) as two performance measures. In this paper, we revisit this tradeoff for latent models and argue that this tradeoff is mitigated when the data enjoys a low-dimensional structure. In particular, we consider binary classification under two data generative models, namely Gaussian mixture model and generalized linear model, where the features data lie on a low-dimensional manifold. We develop a theory to show that the low-dimensional manifold structure allows one to obtain models that are nearly optimal with respect to both the standard accuracy and the robust accuracy measures. We further corroborate our theory with several numerical experiments, including a Mixture of Factor Analyzers (MFA) model trained on the MNIST dataset.

Submitted 31 March, 2022; v1 submitted 22 October, 2021; originally announced October 2021.

Comments: 30 pages, 7 figures

arXiv:2103.03801 [pdf, other]

Error-Correction for Sparse Support Recovery Algorithms

Authors: Mohammad Mehrabi, Aslan Tchamkerten

Abstract: Consider the compressed sensing setup where the support $s^*$ of an $m$-sparse $d$-dimensional signal $x$ is to be recovered from $n$ linear measurements with a given algorithm. Suppose that the measurements are such that the algorithm does not guarantee perfect support recovery and that true features may be missed. Can they efficiently be retrieved? This paper addresses this question through a simple error-correction module referred to as LiRE. LiRE takes as input an estimate $s_{in}$ of the true support $s^*$, and outputs a refined support estimate $s_{out}$. In the noiseless measurement setup, sufficient conditions are established under which LiRE is guaranteed to recover the entire support, that is $s_{out}$ contains $s^*$. These conditions imply, for instance, that in the high-dimensional regime LiRE can correct a sublinear in $m$ number of errors made by Orthogonal Matching Pursuit (OMP). The computational complexity of LiRE is $O(mnd)$. Experimental results with random Gaussian design matrices show that LiRE substantially reduces the number of measurements needed for perfect support recovery via Compressive Sampling Matching Pursuit, Basis Pursuit (BP), and OMP. Interestingly, adding LiRE to OMP yields a support recovery procedure that is more accurate and significantly faster than BP. This observation carries over in the noisy measurement setup. Finally, as a standalone support recovery algorithm with a random initialization, experiments show that LiRE's reconstruction performance lies between OMP and BP.
These results suggest that LiRE may be used generically, on top of any suboptimal baseline support recovery algorithm, to improve support recovery or to operate with a smaller number of measurements, at the cost of a relatively small computational overhead. Alternatively, LiRE may be used as a standalone support recovery algorithm that is competitive with respect to OMP.

Submitted 5 March, 2021; originally announced March 2021.

Comments: Submitted for publication

arXiv:2101.06309 [pdf, other]

Fundamental Tradeoffs in Distributionally Adversarial Training

Authors: Mohammad Mehrabi, Adel Javanmard, Ryan A. Rossi, Anup Rao, Tung Mai

Abstract: Adversarial training is among the most effective techniques to improve the robustness of models against adversarial perturbations. However, the full effect of this approach on models is not well understood. For example, while adversarial training can reduce the adversarial risk (prediction error against an adversary), it sometimes increases standard risk (generalization error when there is no adversary). Even more, such behavior is impacted by various elements of the learning problem, including the size and quality of training data, specific forms of adversarial perturbations in the input, model overparameterization, and adversary's power, among others. In this paper, we focus on the distribution perturbing adversary framework wherein the adversary can change the test distribution within a neighborhood of the training data distribution. The neighborhood is defined via Wasserstein distance between distributions and the radius of the neighborhood is a measure of adversary's manipulative power. We study the tradeoff between standard risk and adversarial risk and derive the Pareto-optimal tradeoff, achievable over specific classes of models, in the infinite data limit with features dimension kept fixed. We consider three learning settings: 1) Regression with the class of linear models; 2) Binary classification under the Gaussian mixtures data model, with the class of linear classifiers; 3) Regression with the class of random features model (which can be equivalently represented as two-layer neural network with random first-layer weights).
We show that a tradeoff between standard and adversarial risk is manifested in all three settings. We further characterize the Pareto-optimal tradeoff curves and discuss how a variety of factors, such as features correlation, adversary's power or the width of two-layer neural network would affect this tradeoff.

Submitted 15 January, 2021; originally announced January 2021.

Comments: 23 pages, 3 figures

arXiv:1911.01040 [pdf, other]

Online Debiasing for Adaptively Collected High-dimensional Data with Applications to Time Series Analysis

Authors: Yash Deshpande, Adel Javanmard, Mohammad Mehrabi

Abstract: Adaptive collection of data is commonplace in applications throughout science and engineering. From the point of view of statistical inference, however, adaptive data collection induces memory and correlation in the samples, and poses a significant challenge. We consider high-dimensional linear regression, where the samples are collected adaptively, and the sample size $n$ can be smaller than $p$, the number of covariates. In this setting, there are two distinct sources of bias: the first due to regularization imposed for consistent estimation, e.g. using the LASSO, and the second due to adaptivity in collecting the samples. We propose "online debiasing", a general procedure for estimators such as the LASSO, which addresses both sources of bias. In two concrete contexts $(i)$ time series analysis and $(ii)$ batched data collection, we demonstrate that online debiasing optimally debiases the LASSO estimate when the underlying parameter $\theta_0$ has sparsity of order $o(\sqrt{n}/\log p)$. In this regime, the debiased estimator can be used to compute $p$-values and confidence intervals of optimal size.

Submitted 5 May, 2020; v1 submitted 4 November, 2019; originally announced November 2019.

Comments: 66 pages, 2 tables, 11 figures; updated with minor fixes and reorganization

arXiv:1901.03435 [pdf, other]

Decision Directed Channel Estimation Based on Deep Neural Network k-step Predictor for MIMO Communications in 5G

Authors: Mehrtash Mehrabi, Mostafa Mohammadkarimi, Masoud Ardakani, Yindi Jing

Abstract: We consider the use of a deep neural network (DNN) to develop a decision-directed (DD)-channel estimation (CE) algorithm for multiple-input multiple-output (MIMO)-space-time block coded systems in highly dynamic vehicular environments. We propose the use of DNN for k-step channel prediction for space-time block codes (STBCs), and show that deep learning (DL)-based DD-CE can remove the need for Doppler spread estimation in fast time-varying quasi stationary channels, where the Doppler spread varies from one packet to another. Doppler spread estimation in this kind of vehicular channel is remarkably challenging and requires a large number of pilots and preambles, leading to lower power and spectral efficiency. We train two DNNs which learn real and imaginary parts of the MIMO fading channels over a wide range of Doppler spreads. We demonstrate that by those DNNs, DD-CE can be realized with only rough a priori knowledge about the Doppler spread range. For the proposed DD-CE algorithm, we also analytically derive the maximum likelihood (ML) decoding algorithm for STBC transmission. The proposed DL-based DD-CE is a promising solution for reliable communication over the vehicular MIMO fading channels without accurate mathematical models. This is because DNN can intelligently learn the statistics of the fading channels. Our simulation results show that the proposed DL-based DD-CE algorithm exhibits lower propagation error compared to existing DD-CE algorithms, while the latter require perfect knowledge of the Doppler rate.

Submitted 10 January, 2019; originally announced January 2019.

arXiv:1809.00377 [pdf, ps, other]

On Some Integral Means

Authors: Fariba Khoshnasib-Zeinabad, Mohammadhossein Mehrabi

Abstract: Harmonic, Geometric, Arithmetic, Heronian and Contraharmonic means have been studied by many mathematicians. In 2003, H. Eves studied these means from a geometrical point of view and established some of the inequalities between them using a circle and its radius. In 1961, E. Beckenbach and R. Bellman introduced several inequalities corresponding to means. In this paper, we will introduce the concept of mean functions and integral means and give bounds on some of these mean functions and integral means.
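As a small numerical illustration of the five classical means named in the abstract, the sketch below uses their standard textbook definitions for two positive numbers and checks the usual ordering (harmonic ≤ geometric ≤ Heronian ≤ arithmetic ≤ contraharmonic) on one example. The function name `classical_means` is hypothetical, and the paper's mean functions and integral means are not reproduced here.

```python
import math


def classical_means(a, b):
    """Standard two-variable means for positive a and b."""
    if a <= 0 or b <= 0:
        raise ValueError("a and b must be positive")
    g = math.sqrt(a * b)
    return {
        "harmonic": 2 * a * b / (a + b),
        "geometric": g,
        "heronian": (a + g + b) / 3,
        "arithmetic": (a + b) / 2,
        "contraharmonic": (a * a + b * b) / (a + b),
    }


m = classical_means(1.0, 4.0)
order = ["harmonic", "geometric", "heronian", "arithmetic", "contraharmonic"]
values = [m[name] for name in order]
is_ordered = all(u <= v for u, v in zip(values, values[1:]))
```

For $a = 1$, $b = 4$ the values are $1.6$, $2$, $7/3$, $2.5$ and $3.4$, so the chain of inequalities holds with strict inequality, as it does whenever $a \neq b$.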

Submitted 3 January, 2020; v1 submitted 2 September, 2018; originally announced September 2018.

Comments: 16 pages

MSC Class: 26E60; 26D15

arXiv:1807.03162 [pdf, other]

doi 10.1109/TWC.2019.2924220

Deep Learning Based Sphere Decoding

Authors: Mostafa Mohammadkarimi, Mehrtash Mehrabi, Masoud Ardakani, Yindi Jing

Abstract: In this paper, a deep learning (DL)-based sphere decoding algorithm is proposed, where the radius of the decoding hypersphere is learned by a deep neural network (DNN). The performance achieved by the proposed algorithm is very close to the optimal maximum likelihood decoding (MLD) over a wide range of signal-to-noise ratios (SNRs), while the computational complexity, compared to existing sphere decoding variants, is significantly reduced. This improvement is attributed to the DNN's ability to intelligently learn the radius of the hypersphere used in decoding. The expected complexity of the proposed DL-based algorithm is analytically derived and compared with existing ones. It is shown that the number of lattice points inside the decoding hypersphere drastically reduces in the DL-based algorithm in both the average and worst-case senses. The effectiveness of the proposed algorithm is shown through simulation for high-dimensional multiple-input multiple-output (MIMO) systems, using high-order modulations.

Submitted 25 March, 2024; v1 submitted 5 July, 2018; originally announced July 2018.

Journal ref: IEEE Trans. Wireless Commun. vol. 18, no. 9, pp. 4368-4378, June. 2019

arXiv:1806.11416 [pdf, ps, other]

Bounds on the Approximation Power of Feedforward Neural Networks

Authors: Mohammad Mehrabi, Aslan Tchamkerten, Mansoor I. Yousefi

Abstract: The approximation power of general feedforward neural networks with piecewise linear activation functions is investigated. First, lower bounds on the size of a network are established in terms of the approximation error and network depth and width. These bounds improve upon state-of-the-art bounds for certain classes of functions, such as strongly convex functions. Second, an upper bound is established on the difference of two neural networks with identical weights but different activation functions.

Submitted 29 June, 2018; originally announced June 2018.

arXiv:1703.10084 [pdf, ps, other]

Cooperative Abnormality Detection via Diffusive Molecular Communications

Authors: Reza Mosayebi, Vahid Jamali, Nafiseh Ghoroghchian, Robert Schober, Masoumeh Nasiri-Kenari, Mahdieh Mehrabi

Abstract: In this paper, we consider abnormality detection via diffusive molecular communications (MCs) for a network consisting of several sensors and a fusion center (FC). If a sensor detects an abnormality, it injects into the medium a number of molecules which is proportional to the sensed value. Two transmission schemes for releasing molecules into the medium are considered. In the first scheme, referred to as DTM, each sensor releases a different type of molecule, whereas in the second scheme, referred to as STM, all sensors release the same type of molecule. The molecules released by the sensors propagate through the MC channel and some may reach the FC where the final decision regarding whether or not an abnormality has occurred is made. We derive the optimal decision rules for both DTM and STM. However, the optimal detectors entail high computational complexity as log-likelihood ratios (LLRs) have to be computed. To overcome this issue, we show that the optimal decision rule for STM can be transformed into an equivalent low-complexity decision rule. Since a similar transformation is not possible for DTM, we propose simple low-complexity sub-optimal detectors based on different approximations of the LLR. The proposed low-complexity detectors are more suitable for practical MC systems than the original complex optimal decision rule, particularly when the FC is a nano-machine with limited computational capabilities. Furthermore, we analyze the performance of the proposed detectors in terms of their false alarm and missed detection probabilities.
Simulation results verify our analytical derivations and reveal interesting insights regarding the trade-off between complexity and performance of the proposed detectors and the considered DTM and STM schemes.

Submitted 29 March, 2017; originally announced March 2017.

Comments: 30 pages, 9 figures

arXiv:1702.02642

doi 10.1109/CWIT.2017.7994819

On minimum distance of locally repairable codes

Authors: Mehrtash Mehrabi, Massoud Ardakani

Abstract: Distributed and cloud storage systems are used to reliably store large-scale data. Erasure codes have been recently proposed and used in real-world distributed and cloud storage systems such as Google File System, Microsoft Azure Storage, and Facebook HDFS-RAID, to enhance the reliability. In order to decrease the repair bandwidth and disk I/O, a class of erasure codes called locally repairable codes (LRCs) has been proposed, which have small locality compared to other erasure codes. Although LRCs have small locality, they have lower minimum distance compared to the Singleton bound. Hence, seeking the largest possible minimum distance for LRCs has been the topic of many recent studies. In this paper, we study the largest possible minimum distance of a class of LRCs and evaluate them in terms of achievability. Furthermore, we compare our results with the existence bounds in the literature.

Submitted 17 February, 2017; v1 submitted 8 February, 2017; originally announced February 2017.

Comments: This paper has been withdrawn by the author due to some typos

arXiv:1606.09463 [pdf, other]

Optimal Locally Repairable Codes with Improved Update Complexity

Authors: Mehrtash Mehrabi, Mostafa Shahabinejad, Masoud Ardakani, Majid Khabbazian

Abstract: For a systematic erasure code, update complexity (UC) is defined as the maximum number of parity blocks needed to be changed when some information blocks are updated. Locally repairable codes (LRCs) have been recently proposed and used in real-world distributed storage systems. In this paper, update complexity for optimal LRC is studied and both lower and upper bounds on UC are established in terms of length (n), dimension (k), minimum distance (d), and locality (r) of the code, when (r+1)|n. Furthermore, a class of optimal LRCs with small UC is proposed. Our proposed LRCs could be of interest as they improve UC without sacrificing optimality of the code.
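To make the parameters (n, k, d, r) concrete, the sketch below evaluates the well-known Singleton-like bound for LRCs with information locality r, $d \le n - k - \lceil k/r \rceil + 2$, next to the classical Singleton bound $d \le n - k + 1$. This bound comes from the general LRC literature and is an assumption about what "optimal" refers to here; the update-complexity bounds of the paper are not reproduced, and the function names are hypothetical.

```python
def singleton_bound(n, k):
    """Classical Singleton bound on minimum distance: d <= n - k + 1."""
    return n - k + 1


def lrc_distance_bound(n, k, r):
    """Singleton-like bound for locality r: d <= n - k - ceil(k / r) + 2."""
    if not (1 <= r <= k <= n):
        raise ValueError("expected 1 <= r <= k <= n")
    ceil_k_over_r = -(-k // r)  # integer ceiling of k / r
    return n - k - ceil_k_over_r + 2


# Example parameters chosen so that (r + 1) divides n, as in the abstract.
n, k, r = 12, 6, 3
assert n % (r + 1) == 0
d_lrc = lrc_distance_bound(n, k, r)
d_singleton = singleton_bound(n, k)
```

For these parameters the locality constraint costs one unit of distance relative to the Singleton bound (6 versus 7); setting r = k recovers the Singleton bound exactly.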

Submitted 14 July, 2016; v1 submitted 30 June, 2016; originally announced June 2016.

Showing 1–18 of 18 results for author: Mehrabi, M