Search | arXiv e-print repository

Interventional Causal Discovery in a Mixture of DAGs

Authors: Burak Varıcı, Dmitriy Katz-Rogozhnikov, Dennis Wei, Prasanna Sattigeri, Ali Tajer

Abstract: Causal interactions among a group of variables are often modeled by a single causal graph. In some domains, however, these interactions are best described by multiple co-existing causal graphs, e.g., in dynamical systems or genomics. This paper addresses the hitherto unknown role of interventions in learning causal interactions among variables governed by a mixture of causal systems, each modeled by one directed acyclic graph (DAG). Causal discovery from mixtures is fundamentally more challenging than single-DAG causal discovery. Two major difficulties stem from (i) inherent uncertainty about the skeletons of the component DAGs that constitute the mixture and (ii) possibly cyclic relationships across these component DAGs. This paper addresses these challenges and aims to identify edges that exist in at least one component DAG of the mixture, referred to as true edges. First, it establishes matching necessary and sufficient conditions on the size of interventions required to identify the true edges. Next, guided by the necessity results, an adaptive algorithm is designed that learns all true edges using ${\cal O}(n^2)$ interventions, where $n$ is the number of nodes. Remarkably, the size of the interventions is optimal if the underlying mixture model does not contain cycles across its components. More generally, the gap between the intervention size used by the algorithm and the optimal size is quantified. It is shown to be bounded by the cyclic complexity number of the mixture model, defined as the size of the minimal intervention that can break the cycles in the mixture, which is upper bounded by the number of cycles among the ancestors of a node.

Submitted 12 June, 2024; originally announced June 2024.

arXiv:2406.05937 [pdf, other]

Linear Causal Representation Learning from Unknown Multi-node Interventions

Authors: Burak Varıcı, Emre Acartürk, Karthikeyan Shanmugam, Ali Tajer

Abstract: Despite the multifaceted recent advances in interventional causal representation learning (CRL), they primarily focus on the stylized assumption of single-node interventions. This assumption is not valid in a wide range of applications, and generally, the subset of nodes intervened in an interventional environment is fully unknown. This paper focuses on interventional CRL under unknown multi-node (UMN) interventional environments and establishes the first identifiability results for general latent causal models (parametric or nonparametric) under stochastic interventions (soft or hard) and linear transformation from the latent to observed space. Specifically, it is established that given sufficiently diverse interventional environments, (i) identifiability up to ancestors is possible using only soft interventions, and (ii) perfect identifiability is possible using hard interventions. Remarkably, these guarantees match the best-known results for more restrictive single-node interventions. Furthermore, CRL algorithms are also provided that achieve the identifiability guarantees. A central step in designing these algorithms is establishing the relationships between UMN interventional CRL and score functions associated with the statistical models of different interventional environments. Establishing these relationships also serves as a constructive proof of the identifiability guarantees.

Submitted 9 June, 2024; originally announced June 2024.

arXiv:2405.07795 [pdf, other]

Improved Bound for Robust Causal Bandits with Linear Models

Authors: Zirui Yan, Arpan Mukherjee, Burak Varıcı, Ali Tajer

Abstract: This paper investigates the robustness of causal bandits (CBs) in the face of temporal model fluctuations. This setting deviates from the existing literature's widely-adopted assumption of constant causal models. The focus is on causal systems with linear structural equation models (SEMs). The SEMs and the time-varying pre- and post-interventional statistical models are all unknown and subject to variations over time. The goal is to design a sequence of interventions that incur the smallest cumulative regret compared to an oracle aware of the entire causal model and its fluctuations. A robust CB algorithm is proposed, and its cumulative regret is analyzed by establishing both upper and lower bounds on the regret. It is shown that in a graph with maximum in-degree $d$, length of the largest causal path $L$, and an aggregate model deviation $C$, the regret is upper bounded by $\tilde{\mathcal{O}}(d^{L-\frac{1}{2}}(\sqrt{T} + C))$ and lower bounded by $\Omega(d^{\frac{L}{2}-2}\max\{\sqrt{T}, d^2C\})$. The proposed algorithm achieves nearly optimal $\tilde{\mathcal{O}}(\sqrt{T})$ regret when $C$ is $o(\sqrt{T})$, maintaining sub-linear regret for a broad range of $C$.
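To make the scaling in these bounds concrete, the following sketch evaluates the two expressions with all constants and logarithmic factors dropped. The function names and the sample values of $d$, $L$, $T$, and $C$ are illustrative assumptions and do not come from the paper.

```python
import math

def upper_bound(d, L, T, C):
    """d^(L - 1/2) * (sqrt(T) + C), with constants and log factors dropped."""
    return d ** (L - 0.5) * (math.sqrt(T) + C)

def lower_bound(d, L, T, C):
    """d^(L/2 - 2) * max(sqrt(T), d^2 * C), with constants dropped."""
    return d ** (L / 2 - 2) * max(math.sqrt(T), d ** 2 * C)

d, L = 2, 4  # hypothetical in-degree and longest-path length
for T in (10**4, 10**6, 10**8):
    C = T ** 0.25  # a deviation budget that grows slower than sqrt(T)
    up, low = upper_bound(d, L, T, C), lower_bound(d, L, T, C)
    # Dividing by sqrt(T) shows the upper bound settling near d^(L - 1/2).
    print(T, up / math.sqrt(T), low / math.sqrt(T))
```

With $C$ growing as $T^{1/4}$, the normalized upper bound approaches $d^{L-\frac{1}{2}}$ as $T$ increases, which is the sense in which the $\sqrt{T}$ term dominates when $C$ is $o(\sqrt{T})$.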

Submitted 13 May, 2024; originally announced May 2024.

Comments: 11 pages, 3 figures. arXiv admin note: substantial text overlap with arXiv:2310.19794

arXiv:2403.16297 [pdf, other]

Round Robin Active Sequential Change Detection for Dependent Multi-Channel Data

Authors: Anamitra Chaudhuri, Georgios Fellouris, Ali Tajer

Abstract: This paper considers the problem of sequentially detecting a change in the joint distribution of multiple data sources under a sampling constraint. Specifically, the channels or sources generate observations that are independent over time, but not necessarily independent at any given time instant. The sources follow an initial joint distribution, and at an unknown time instant, the joint distribution of an unknown subset of sources changes. Importantly, there is a hard constraint that only a fixed number of sources are allowed to be sampled at each time instant. The goal is to sequentially observe the sources according to the constraint, and stop sampling as quickly as possible after the change while controlling the false alarm rate below a user-specified level. The sources can be selected dynamically based on the already collected data, and thus, a policy for this problem consists of a joint sampling and change-detection rule. A non-randomized policy is studied, and an upper bound is established on its worst-case conditional expected detection delay with respect to both the change point and the observations from the affected sources before the change.

Submitted 24 March, 2024; originally announced March 2024.

MSC Class: 62L10; 62L05 (Primary) 62P30 (Secondary)

arXiv:2403.00233 [pdf, other]

Causal Bandits with General Causal Models and Interventions

Authors: Zirui Yan, Dennis Wei, Dmitriy Katz-Rogozhnikov, Prasanna Sattigeri, Ali Tajer

Abstract: This paper considers causal bandits (CBs) for the sequential design of interventions in a causal system. The objective is to optimize a reward function via minimizing a measure of cumulative regret with respect to the best sequence of interventions in hindsight. The paper advances the results on CBs in three directions. First, the structural causal models (SCMs) are assumed to be unknown and drawn arbitrarily from a general class $\mathcal{F}$ of Lipschitz-continuous functions. Existing results are often focused on (generalized) linear SCMs. Second, the interventions are assumed to be generalized soft with any desired level of granularity, resulting in an infinite number of possible interventions. The existing literature, in contrast, generally adopts atomic and hard interventions. Third, we provide general upper and lower bounds on regret. The upper bounds subsume (and improve) known bounds for special cases. The lower bounds are generally hitherto unknown. These bounds are characterized as functions of the (i) graph parameters, (ii) eluder dimension of the space of SCMs, denoted by $\operatorname{dim}(\mathcal{F})$, and (iii) the covering number of the function space, denoted by ${\rm cn}(\mathcal{F})$. Specifically, the cumulative achievable regret over horizon $T$ is $\mathcal{O}(K d^{L-1}\sqrt{T\operatorname{dim}(\mathcal{F}) \log({\rm cn}(\mathcal{F}))})$, where $K$ is related to the Lipschitz constants, $d$ is the graph's maximum in-degree, and $L$ is the length of the longest causal path. The upper bound is further refined for special classes of SCMs (neural network, polynomial, and linear), and their corresponding lower bounds are provided.

Submitted 29 February, 2024; originally announced March 2024.

Comments: 37 pages, 13 figures, conference

arXiv:2402.00849 [pdf, other]

Score-based Causal Representation Learning: Linear and General Transformations

Authors: Burak Varıcı, Emre Acartürk, Karthikeyan Shanmugam, Abhishek Kumar, Ali Tajer

Abstract: This paper addresses intervention-based causal representation learning (CRL) under a general nonparametric latent causal model and an unknown transformation that maps the latent variables to the observed variables. Linear and general transformations are investigated. The paper addresses both the identifiability and achievability aspects. Identifiability refers to determining algorithm-agnostic conditions that ensure recovering the true latent causal variables and the latent causal graph underlying them. Achievability refers to the algorithmic aspects and addresses designing algorithms that achieve identifiability guarantees. By drawing novel connections between score functions (i.e., the gradients of the logarithm of density functions) and CRL, this paper designs a score-based class of algorithms that ensures both identifiability and achievability. First, the paper focuses on linear transformations and shows that one stochastic hard intervention per node suffices to guarantee identifiability. It also provides partial identifiability guarantees for soft interventions, including identifiability up to ancestors for general causal models and perfect latent graph recovery for sufficiently non-linear causal models. Secondly, it focuses on general transformations and shows that two stochastic hard interventions per node suffice for identifiability. Notably, one does not need to know which pair of interventional environments have the same node intervened.
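As a rough illustration of why score functions carry information about interventions, the sketch below works in the latent space of a linear Gaussian model, where the score has the closed form $-\Theta z$ for precision matrix $\Theta$. It is a toy example under assumed values (the 4-node graph, its weights, the noise variances, and the helper names `precision` and `score` are all hypothetical) and is not the algorithm proposed in the paper.

```python
def precision(A, noise_var):
    """Precision matrix (I - A)^T D^{-1} (I - A) of the SEM Z = A Z + noise.

    A[i][j] is the weight of parent j in the equation of node i, and
    noise_var[i] is the variance of the zero-mean Gaussian noise of node i.
    """
    n = len(A)
    M = [[(1.0 if i == j else 0.0) - A[i][j] for j in range(n)] for i in range(n)]
    return [[sum(M[k][i] * M[k][j] / noise_var[k] for k in range(n))
             for j in range(n)] for i in range(n)]

def score(theta, z):
    """Score of a zero-mean Gaussian: gradient of the log density, -Theta z."""
    return [-sum(t * zj for t, zj in zip(row, z)) for row in theta]

# Hypothetical DAG: 0 -> 1, 0 -> 2, 1 -> 2, 2 -> 3
A = [[0.0, 0.0, 0.0, 0.0],
     [0.8, 0.0, 0.0, 0.0],
     [-0.5, 1.2, 0.0, 0.0],
     [0.0, 0.0, 0.7, 0.0]]
var = [1.0, 1.0, 1.0, 1.0]

# Stochastic hard intervention on node 2: remove its parents, change its noise.
A_int = [row[:] for row in A]
A_int[2] = [0.0, 0.0, 0.0, 0.0]
var_int = var[:]
var_int[2] = 2.0

z = [0.3, -1.0, 0.5, 2.0]  # an arbitrary latent point
s_obs = score(precision(A, var), z)
s_int = score(precision(A_int, var_int), z)
diff = [a - b for a, b in zip(s_obs, s_int)]
changed = [i for i, d in enumerate(diff) if abs(d) > 1e-12]
print(changed)
```

In this toy model the score difference is nonzero only at the intervened node and its parents, so the pattern of score changes across environments reflects the local graph structure.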

Submitted 26 February, 2024; v1 submitted 1 February, 2024; originally announced February 2024.

Comments: (updated literature review) Linear transformations: stronger results than our previous paper Score-based Causal Representation Learning with Interventions (arXiv:2301.08230). General transformations: results also appear in our paper General Identifiability and Achievability for Causal Representation Learning (arXiv:2310.15450) accepted to AISTATS 2024 (oral). arXiv admin note: text overlap with arXiv:2310.15450

arXiv:2401.09640 [pdf, ps, other]

Blackout Mitigation via Physics-guided RL

Authors: Anmol Dwivedi, Santiago Paternain, Ali Tajer

Abstract: This paper considers the sequential design of remedial control actions in response to system anomalies for the ultimate objective of preventing blackouts. A physics-guided reinforcement learning (RL) framework is designed to identify effective sequences of real-time remedial look-ahead decisions accounting for the long-term impact on the system's stability. The paper considers a space of control actions that involve both discrete-valued transmission line-switching decisions (line reconnections and removals) and continuous-valued generator adjustments. To identify an effective blackout mitigation policy, a physics-guided approach is designed that uses power-flow sensitivity factors associated with the power transmission network to guide the RL exploration during agent training. Comprehensive empirical evaluations using the open-source Grid2Op platform demonstrate the notable advantages of incorporating physical signals into RL decisions, establishing the gains of the proposed physics-guided approach compared to its black box counterparts. One important observation is that strategically removing transmission lines, in conjunction with multiple real-time generator adjustments, often renders effective long-term decisions that are likely to prevent or delay blackouts.

Submitted 17 January, 2024; originally announced January 2024.

arXiv:2310.19794 [pdf, other]

Robust Causal Bandits for Linear Models

Authors: Zirui Yan, Arpan Mukherjee, Burak Varıcı, Ali Tajer

Abstract: Sequential design of experiments for optimizing a reward function in causal systems can be effectively modeled by the sequential design of interventions in causal bandits (CBs). In the existing literature on CBs, a critical assumption is that the causal models remain constant over time. However, this assumption does not necessarily hold in complex systems, which constantly undergo temporal model fluctuations. This paper addresses the robustness of CBs to such model fluctuations. The focus is on causal systems with linear structural equation models (SEMs). The SEMs and the time-varying pre- and post-interventional statistical models are all unknown. Cumulative regret is adopted as the design criterion, based on which the objective is to design a sequence of interventions that incur the smallest cumulative regret with respect to an oracle aware of the entire causal model and its fluctuations. First, it is established that the existing approaches fail to maintain regret sub-linearity with even a few instances of model deviation. Specifically, when the number of instances with model deviation is as few as $T^\frac{1}{2L}$, where $T$ is the time horizon and $L$ is the longest causal path in the graph, the existing algorithms will have linear regret in $T$. Next, a robust CB algorithm is designed, and its regret is analyzed, where upper and information-theoretic lower bounds on the regret are established. Specifically, in a graph with $N$ nodes and maximum degree $d$, under a general measure of model deviation $C$, the cumulative regret is upper bounded by $\tilde{\mathcal{O}}(d^{L-\frac{1}{2}}(\sqrt{NT} + NC))$ and lower bounded by $\Omega(d^{\frac{L}{2}-2}\max\{\sqrt{T},d^2C\})$. Comparing these bounds establishes that the proposed algorithm achieves nearly optimal $\tilde{\mathcal{O}}(\sqrt{T})$ regret when $C$ is $o(\sqrt{T})$ and maintains sub-linear regret for a broader range of $C$.

Submitted 4 March, 2024; v1 submitted 30 October, 2023; originally announced October 2023.

arXiv:2310.15450 [pdf, other]

General Identifiability and Achievability for Causal Representation Learning

Authors: Burak Varıcı, Emre Acartürk, Karthikeyan Shanmugam, Ali Tajer

Abstract: This paper focuses on causal representation learning (CRL) under a general nonparametric latent causal model and a general transformation model that maps the latent data to the observational data. It establishes identifiability and achievability results using two hard uncoupled interventions per node in the latent causal graph. Notably, one does not know which pair of intervention environments have the same node intervened (hence, uncoupled). For identifiability, the paper establishes that perfect recovery of the latent causal model and variables is guaranteed under uncoupled interventions. For achievability, an algorithm is designed that uses observational and interventional data and recovers the latent causal model and variables with provable guarantees. This algorithm leverages score variations across different environments to estimate the inverse of the transformer and, subsequently, the latent variables. The analysis, additionally, recovers the identifiability result for two hard coupled interventions, that is, when metadata about the pair of environments that have the same node intervened is known. This paper also shows that when observational data is available, additional faithfulness assumptions that are adopted by the existing literature are unnecessary.

Submitted 14 February, 2024; v1 submitted 23 October, 2023; originally announced October 2023.

Comments: Accepted to AISTATS 2024 (oral presentation). Also appeared at CRL Workshop @ NeurIPS 2023 (oral presentation) titled as "Score-based Causal Representation Learning: Nonparametric Identifiability"

arXiv:2310.13393 [pdf, ps, other]

Optimal Best Arm Identification with Fixed Confidence in Restless Bandits

Authors: P. N. Karthik, Vincent Y. F. Tan, Arpan Mukherjee, Ali Tajer

Abstract: We study best arm identification in a restless multi-armed bandit setting with finitely many arms. The discrete-time data generated by each arm forms a homogeneous Markov chain taking values in a common, finite state space. The state transitions in each arm are captured by an ergodic transition probability matrix (TPM) that is a member of a single-parameter exponential family of TPMs. The real-valued parameters of the arm TPMs are unknown and belong to a given space. Given a function $f$ defined on the common state space of the arms, the goal is to identify the best arm -- the arm with the largest average value of $f$ evaluated under the arm's stationary distribution -- with the fewest number of samples, subject to an upper bound on the decision's error probability (i.e., the fixed-confidence regime). A lower bound on the growth rate of the expected stopping time is established in the asymptote of a vanishing error probability. Furthermore, a policy for best arm identification is proposed, and its expected stopping time is proved to have an asymptotic growth rate that matches the lower bound. It is demonstrated that tracking the long-term behavior of a certain Markov decision process and its state-action visitation proportions are the key ingredients in analyzing the converse and achievability bounds. It is shown that under every policy, the state-action visitation proportions satisfy a specific approximate flow conservation constraint and that these proportions match the optimal proportions dictated by the lower bound under any asymptotically optimal policy. The prior studies on best arm identification in restless bandits focus on independent observations from the arms, rested Markov arms, and restless Markov arms with known arm TPMs. In contrast, this work is the first to study best arm identification in restless bandits with unknown arm TPMs.
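The definition of the best arm used here can be sketched as follows. The sketch assumes the TPMs are known, which is exactly what the paper does not assume, so it only illustrates the target quantity and not the proposed policy. The two example matrices, the function $f$, and the helper names are hypothetical.

```python
def stationary(P, iters=1000):
    """Approximate the stationary distribution of an ergodic chain.

    Repeatedly applies pi <- pi P starting from the uniform distribution.
    """
    n = len(P)
    pi = [1.0 / n] * n
    for _ in range(iters):
        pi = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
    return pi

def best_arm(tpms, f):
    """Return (index, values): the arm maximizing the stationary mean of f."""
    values = [sum(p * fx for p, fx in zip(stationary(P), f)) for P in tpms]
    best = max(range(len(tpms)), key=lambda a: values[a])
    return best, values

# Two hypothetical arms on the common state space {0, 1}
tpms = [
    [[0.9, 0.1], [0.5, 0.5]],  # stationary distribution (5/6, 1/6)
    [[0.5, 0.5], [0.2, 0.8]],  # stationary distribution (2/7, 5/7)
]
f = [0.0, 1.0]  # f picks out the probability of state 1

arm, values = best_arm(tpms, f)
print(arm, values)
```

In the setting of the abstract, these stationary means must be inferred from sampled trajectories of arms whose states keep evolving, which is what makes the unknown-TPM problem hard.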

Submitted 23 June, 2024; v1 submitted 20 October, 2023; originally announced October 2023.

Comments: Accepted to the IEEE Transactions on Information Theory

arXiv:2309.01207 [pdf, other]

Spectral Adversarial MixUp for Few-Shot Unsupervised Domain Adaptation

Authors: Jiajin Zhang, Hanqing Chao, Amit Dhurandhar, Pin-Yu Chen, Ali Tajer, Yangyang Xu, Pingkun Yan

Abstract: Domain shift is a common problem in clinical applications, where the training images (source domain) and the test images (target domain) follow different distributions. Unsupervised Domain Adaptation (UDA) techniques have been proposed to adapt models trained in the source domain to the target domain. However, those methods require a large number of images from the target domain for model training. In this paper, we propose a novel method for Few-Shot Unsupervised Domain Adaptation (FSUDA), where only a limited number of unlabeled target domain samples are available for training. To accomplish this challenging task, first, a spectral sensitivity map is introduced to characterize the generalization weaknesses of models in the frequency domain. We then develop a Sensitivity-guided Spectral Adversarial MixUp (SAMix) method to generate target-style images that effectively suppress the model sensitivity, which leads to improved model generalizability in the target domain. We demonstrate the proposed method and rigorously evaluate its performance on multiple tasks using several public datasets.

Submitted 3 September, 2023; originally announced September 2023.

Comments: Accepted by MICCAI 2023

arXiv:2303.08864 [pdf, other]

GRNN-based Real-time Fault Chain Prediction

Authors: Anmol Dwivedi, Ali Tajer

Abstract: This paper proposes a data-driven graphical framework for the real-time search of risky cascading fault chains (FCs). While identifying risky FCs is pivotal to alleviating cascading failures, the complex spatio-temporal dependencies among the components of the power system pose challenges to modeling and analyzing FCs. Furthermore, the real-time search of risky FCs faces an inherent combinatorial complexity that grows exponentially with the size of the system. The proposed framework leverages the recent advances in graph recurrent neural networks to circumvent the computational complexities of the real-time search of FCs. The search process is formalized as a partially observable Markov decision process (POMDP), which is subsequently solved via a time-varying graph recurrent neural network (GRNN) that judiciously accounts for the inherent temporal and spatial structures of the data generated by the system. The key features of this structure include (i) leveraging the spatial structure of the data induced by the system topology, (ii) leveraging the temporal structure of data induced by system dynamics, and (iii) efficiently summarizing the system's history in the latent space of the GRNN. The proposed framework's efficiency is compared to the relevant literature on the IEEE 39-bus New England system and the IEEE 118-bus system.

Submitted 15 March, 2023; originally announced March 2023.

arXiv:2301.08230 [pdf, other]

Score-based Causal Representation Learning with Interventions

Authors: Burak Varici, Emre Acarturk, Karthikeyan Shanmugam, Abhishek Kumar, Ali Tajer

Abstract: This paper studies the causal representation learning problem when the latent causal variables are observed indirectly through an unknown linear transformation. The objectives are: (i) recovering the unknown linear transformation (up to scaling) and (ii) determining the directed acyclic graph (DAG) underlying the latent variables. Sufficient conditions for DAG recovery are established, and it is shown that a large class of non-linear models in the latent space (e.g., causal mechanisms parameterized by two-layer neural networks) satisfy these conditions. These sufficient conditions ensure that the effect of an intervention can be detected correctly from changes in the score. Capitalizing on this property, recovering a valid transformation is facilitated by the following key property: any valid transformation renders latent variables' score function to necessarily have the minimal variations across different interventional environments. This property is leveraged for perfect recovery of the latent DAG structure using only soft interventions. For the special case of stochastic hard interventions, with an additional hypothesis testing step, one can also uniquely recover the linear transformation up to scaling and a valid causal ordering.

Submitted 1 May, 2023; v1 submitted 19 January, 2023; originally announced January 2023.

Comments: This version outlines large classes of non-linear causal models in the latent space for which our assumptions hold. It also discusses the latest updates of related literature

arXiv:2301.03785 [pdf, other]

Best Arm Identification in Stochastic Bandits: Beyond $\beta$-optimality

Authors: Arpan Mukherjee, Ali Tajer

Abstract: This paper investigates a hitherto unaddressed aspect of best arm identification (BAI) in stochastic multi-armed bandits in the fixed-confidence setting. Two key metrics for assessing bandit algorithms are computational efficiency and performance optimality (e.g., in sample complexity). In stochastic BAI literature, there have been advances in designing algorithms to achieve optimal performance, but they are generally computationally expensive to implement (e.g., optimization-based methods). There also exist approaches with high computational efficiency, but they have provable gaps to the optimal performance (e.g., the $\beta$-optimal approaches in top-two methods). This paper introduces a framework and an algorithm for BAI that achieves optimal performance with a computationally efficient set of decision rules. The central process that facilitates this is a routine for sequentially estimating the optimal allocations up to sufficient fidelity. Specifically, these estimates are accurate enough for identifying the best arm (hence, achieving optimality) but not overly accurate to an unnecessary extent that creates excessive computational complexity (hence, maintaining efficiency). Furthermore, the existing relevant literature focuses on the family of exponential distributions. This paper considers a more general setting of any arbitrary family of distributions parameterized by their mean values (under mild regularity conditions). The optimality is established analytically, and numerical evaluations are provided to assess the analytical guarantees and compare the performance with those of the existing ones.

Submitted 22 June, 2023; v1 submitted 10 January, 2023; originally announced January 2023.

arXiv:2212.00850 [pdf, other]

When Neural Networks Fail to Generalize? A Model Sensitivity Perspective

Authors: Jiajin Zhang, Hanqing Chao, Amit Dhurandhar, Pin-Yu Chen, Ali Tajer, Yangyang Xu, Pingkun Yan

Abstract: Domain generalization (DG) aims to train a model to perform well in unseen domains under different distributions. This paper considers a more realistic yet more challenging scenario, namely Single Domain Generalization (Single-DG), where only a single source domain is available for training. To tackle this challenge, we first try to understand when neural networks fail to generalize. We empirically ascertain a property of a model that correlates strongly with its generalization, which we term "model sensitivity". Based on our analysis, we propose a novel strategy of Spectral Adversarial Data Augmentation (SADA) to generate augmented images targeted at the highly sensitive frequencies. Models trained with these hard-to-learn samples can effectively suppress the sensitivity in the frequency space, which leads to improved generalization performance. Extensive experiments on multiple public datasets demonstrate the superiority of our approach, which surpasses the state-of-the-art single-DG methods.

Submitted 1 December, 2022; originally announced December 2022.

Comments: Accepted by AAAI 2023

arXiv:2208.12764 [pdf, other]

Causal Bandits for Linear Structural Equation Models

Authors: Burak Varici, Karthikeyan Shanmugam, Prasanna Sattigeri, Ali Tajer

Abstract: This paper studies the problem of designing an optimal sequence of interventions in a causal graphical model to minimize cumulative regret with respect to the best intervention in hindsight. This is, naturally, posed as a causal bandit problem. The focus is on causal bandits for linear structural equation models (SEMs) and soft interventions. It is assumed that the graph's structure is known and has $N$ nodes. Two linear mechanisms, one soft intervention and one observational, are assumed for each node, giving rise to $2^N$ possible interventions. The majority of existing causal bandit algorithms assume that at least the interventional distributions of the reward node's parents are fully specified. However, there are $2^N$ such distributions (one corresponding to each intervention), acquiring which becomes prohibitive even in moderate-sized graphs. This paper dispenses with the assumption of knowing these distributions or their marginals. Two algorithms are proposed for the frequentist (UCB-based) and Bayesian (Thompson Sampling-based) settings. The key idea of these algorithms is to avoid directly estimating the $2^N$ reward distributions and instead estimate the parameters that fully specify the SEMs (linear in $N$) and use them to compute the rewards. In both algorithms, under boundedness assumptions on noise and the parameter space, the cumulative regrets scale as $\tilde{\cal O} (d^{L+\frac{1}{2}} \sqrt{NT})$, where $d$ is the graph's maximum degree, and $L$ is the length of its longest causal path. Additionally, a minimax lower bound of $Ω(d^{\frac{L}{2}-2}\sqrt{T})$ is presented, which suggests that the achievable and lower bounds conform in their scaling behavior with respect to the horizon $T$ and graph parameters $d$ and $L$.

Submitted 31 March, 2023; v1 submitted 26 August, 2022; originally announced August 2022.

Comments: 61 pages; new to this version: added lower bounds and relaxed assumptions

arXiv:2208.05406 [pdf, other]

doi 10.1109/TSP.2022.3187655

Active Sampling of Multiple Sources for Sequential Estimation

Authors: Arpan Mukherjee, Ali Tajer, Pin-Yu Chen, Payel Das

Abstract: Consider $K$ processes, each generating a sequence of independent and identically distributed random variables. The probability measures of these processes have random parameters that must be estimated. Specifically, they share a parameter $θ$ common to all probability measures. Additionally, each process $i\in\{1, \dots, K\}$ has a private parameter $α_i$. The objective is to design an active sampling algorithm for sequentially estimating these parameters in order to form reliable estimates for all shared and private parameters with the fewest number of samples. This sampling algorithm has three key components: (i) data-driven sampling decisions, which dynamically specify over time which of the $K$ processes should be selected for sampling; (ii) a stopping time for the process, which specifies when the accumulated data is sufficient to form reliable estimates and terminate the sampling process; and (iii) estimators for all shared and private parameters. Owing to the sequential estimation being known to be analytically intractable, this paper adopts \emph{conditional} estimation cost functions, leading to a sequential estimation approach that was recently shown to render tractable analysis. Asymptotically optimal decision rules (sampling, stopping, and estimation) are delineated, and numerical experiments are provided to compare the efficacy and quality of the proposed procedure with those of the relevant approaches.

Submitted 10 August, 2022; originally announced August 2022.

arXiv:2207.11158 [pdf, ps, other]

SPRT-based Efficient Best Arm Identification in Stochastic Bandits

Authors: Arpan Mukherjee, Ali Tajer

Abstract: This paper investigates the best arm identification (BAI) problem in stochastic multi-armed bandits in the fixed confidence setting. The general class of the exponential family of bandits is considered. The existing algorithms for the exponential family of bandits face computational challenges. To mitigate these challenges, the BAI problem is viewed and analyzed as a sequential composite hypothesis testing task, and a framework is proposed that adopts the likelihood ratio-based tests known to be effective for sequential testing. Based on this test statistic, a BAI algorithm is designed that leverages the canonical sequential probability ratio tests for arm selection and is amenable to tractable analysis for the exponential family of bandits. This algorithm has two key features: (1) its sample complexity is asymptotically optimal, and (2) it is guaranteed to be $δ$-PAC. Existing efficient approaches focus on the Gaussian setting and require Thompson sampling for the arm deemed the best and the challenger arm. Additionally, this paper analytically quantifies the computational expense of identifying the challenger in an existing approach. Finally, numerical experiments are provided to support the analysis.

Submitted 22 June, 2023; v1 submitted 22 July, 2022; originally announced July 2022.
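
For readers unfamiliar with the sequential probability ratio test mentioned in the abstract above, the following is a minimal sketch of the classical two-hypothesis SPRT for a Gaussian mean with known variance. It is not the paper's BAI algorithm; the function name `sprt_gaussian`, its parameters, and the use of Wald's approximate thresholds are assumptions made for illustration only.

```python
import math


def sprt_gaussian(samples, mu0, mu1, sigma, alpha, beta):
    """Illustrative Wald SPRT for H0: mean = mu0 vs. H1: mean = mu1.

    sigma is the known standard deviation; alpha and beta are the target
    false-alarm and miss probabilities (hypothetical parameter names).
    Returns (decision, number_of_samples_used).
    """
    # Wald's approximate thresholds on the cumulative log-likelihood ratio.
    upper = math.log((1 - beta) / alpha)   # decide H1 at or above this value
    lower = math.log(beta / (1 - alpha))   # decide H0 at or below this value

    llr = 0.0
    n = 0
    for n, x in enumerate(samples, start=1):
        # Log-likelihood ratio of one Gaussian sample under H1 vs. H0.
        llr += (mu1 - mu0) * (x - (mu0 + mu1) / 2) / sigma ** 2
        if llr >= upper:
            return "H1", n
        if llr <= lower:
            return "H0", n
    # Samples ran out before either threshold was crossed.
    return "undecided", n
```

The test keeps sampling only while the accumulated evidence lies strictly between the two thresholds, which is why the number of samples used adapts to the data.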

arXiv:2111.07512 [pdf, other]

Scalable Intervention Target Estimation in Linear Models

Authors: Burak Varici, Karthikeyan Shanmugam, Prasanna Sattigeri, Ali Tajer

Abstract: This paper considers the problem of estimating the unknown intervention targets in a causal directed acyclic graph from observational and interventional data. The focus is on soft interventions in linear structural equation models (SEMs). Current approaches to causal structure learning either work with known intervention targets or use hypothesis testing to discover the unknown intervention targets even for linear SEMs. This severely limits their scalability and sample complexity. This paper proposes a scalable and efficient algorithm that consistently identifies all intervention targets. The pivotal idea is to estimate the intervention sites from the difference between the precision matrices associated with the observational and interventional datasets. It involves repeatedly estimating such sites in different subsets of variables. The proposed algorithm can be used to also update a given observational Markov equivalence class into the interventional Markov equivalence class. Consistency, Markov equivalency, and sample complexity are established analytically. Finally, simulation results on both real and synthetic data demonstrate the gains of the proposed approach for scalable causal structure recovery. Implementation of the algorithm and the code to reproduce the simulation results are available at \url{https://github.com/bvarici/intervention-estimation}.

Submitted 14 November, 2021; originally announced November 2021.

Comments: 23 pages, 4 figures, NeurIPS 2021

arXiv:2111.07458 [pdf, ps, other]

Mean-based Best Arm Identification in Stochastic Bandits under Reward Contamination

Authors: Arpan Mukherjee, Ali Tajer, Pin-Yu Chen, Payel Das

Abstract: This paper investigates the problem of best arm identification in $\textit{contaminated}$ stochastic multi-arm bandits. In this setting, the rewards obtained from any arm are replaced by samples from an adversarial model with probability $\varepsilon$. A fixed confidence (infinite-horizon) setting is considered, where the goal of the learner is to identify the arm with the largest mean. Owing to the adversarial contamination of the rewards, each arm's mean is only partially identifiable. This paper proposes two algorithms, a gap-based algorithm and one based on successive elimination, for best arm identification in sub-Gaussian bandits. These algorithms involve mean estimates that achieve the optimal error guarantee on the deviation of the true mean from the estimate asymptotically. Furthermore, these algorithms asymptotically achieve the optimal sample complexity. Specifically, for the gap-based algorithm, the sample complexity is asymptotically optimal up to constant factors, while for the successive elimination-based algorithm, it is optimal up to logarithmic factors. Finally, numerical experiments are provided to illustrate the gains of the algorithms compared to the existing baselines.

Submitted 14 November, 2021; originally announced November 2021.

arXiv:2105.07725 [pdf, other]

Distributed Computation over MAC via Kolmogorov-Arnold Representation

Authors: Derya Malak, Ali Tajer

Abstract: Kolmogorov's representation theorem provides a framework for decomposing any arbitrary real-valued, multivariate, and continuous function into a two-layer nested superposition of a finite number of functions. The functions at these two layers are referred to as the inner and outer functions, with the key property that the design of the inner functions is independent of that of the original function of interest to be computed. This brings modularity and universality to the design of the inner function, and subsequently, a part of computation. This paper capitalizes on such modularity and universality in functional representation to propose two frameworks for distributed computation over the additive multiple access channels (MACs). In the first framework, each source encodes the inner representations and sends them over the additive MAC. Subsequently, the receiver computes the outer functions to compute the function of interest. Transmitting the values of the inner functions instead of the messages directly leads to compression gains. In the second approach, in order to further increase the compression rate, the framework aims to also bring computing the outer functions to the source sites. Specifically, each source employs a graph-coloring-based approach to perform joint functional compression of the inner and the outer functions, which may attain further compression savings over the former. These modular encoding schemes provide an exact representation in the asymptotic regime and the non-asymptotic regime. Contrasting these with the baseline model where sources directly transmit data over MAC, we observe gains. To showcase the gains of these two frameworks and their discrepancies, they are applied to a number of commonly used computations in distributed systems, e.g., computing products, $\ell_m$-norms, polynomial functions, extremum values of functions, and affine transformations.

Submitted 17 May, 2021; originally announced May 2021.

arXiv:2101.07173 [pdf, other]

doi 10.3390/e23010120

The Broadcast Approach in Communication Networks

Authors: Ali Tajer, Avi Steiner, Shlomo Shamai

Abstract: This paper reviews the theoretical and practical principles of the broadcast approach to communication over state-dependent channels and networks in which the transmitters have access to only the probabilistic description of the time-varying states while remaining oblivious to their instantaneous realizations. When the temporal variations are frequent enough, an effective long-term strategy is adapting the transmission strategies to the system's ergodic behavior. However, when the variations are infrequent, their temporal average can deviate significantly from the channel's ergodic mode, rendering a lack of instantaneous performance guarantees. To circumvent a lack of short-term guarantees, the {\em broadcast approach} provides principles for designing transmission schemes that benefit from both short- and long-term performance guarantees. This paper provides an overview of how to apply the broadcast approach to various channels and network models under various operational constraints.

Submitted 18 January, 2021; originally announced January 2021.

Comments: 149 pages, 37 figures

arXiv:1812.10569 [pdf, other]

Secure Estimation under Causative Attacks

Authors: Saurabh Sihag, Ali Tajer

Abstract: This paper considers the problem of secure parameter estimation when the estimation algorithm is prone to causative attacks. Causative attacks, in principle, target decision-making algorithms to alter their decisions by making them oblivious to specific attacks. Such attacks influence inference algorithms by tampering with the mechanism through which the algorithm is provided with the statistical model of the population about which an inferential decision is made. Causative attacks are viable, for instance, by contaminating the historical or training data, or by compromising an expert who provides the model. In the presence of causative attacks, the inference algorithms operate under a distorted statistical model for the population from which they collect data samples. This paper introduces specific notions of secure estimation and provides a framework under which secure estimation under causative attacks can be formulated. A central premise underlying the secure estimation framework is that forming secure estimates introduces a new dimension to the estimation objective, which pertains to detecting attacks and isolating the true model. Since detection and isolation decisions are imperfect, their inclusion induces an inherent coupling between the desired secure estimation objective and the auxiliary detection and isolation decisions that need to be formed in conjunction with the estimates. This paper establishes the fundamental interplay among the decisions involved and characterizes the general decision rules in closed-form for any desired estimation cost function. Furthermore, to circumvent the computational complexity associated with growing parameter dimension or attack complexity, a scalable estimation algorithm and its attendant optimality guarantees are provided. The theory developed is applied to secure parameter estimation in a sensor network.

Submitted 26 December, 2018; originally announced December 2018.

arXiv:1804.09657 [pdf, ps, other]

Quickest Search for a Change Point

Authors: Javad Heydari, Ali Tajer

Abstract: This paper considers a sequence of random variables generated according to a common distribution. The distribution might undergo periods of transient changes at an unknown set of time instants, referred to as change-points. The objective is to sequentially collect measurements from the sequence and design a dynamic decision rule for the quickest identification of one change-point in real time, while, in parallel, the rate of false alarms is controlled. This setting is different from the conventional change-point detection settings in which there exists at most one change-point that can be either persistent or transient. The problem is considered under the minimax setting with a constraint on the false alarm rate before the first change occurs. It is proved that the Shewhart test achieves exact optimality under worst-case change points and also worst-case data realization. Numerical evaluations are also provided to assess the performance of the decision rule characterized.

Submitted 25 April, 2018; originally announced April 2018.

Comments: 6 pages, 3 figures
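
The Shewhart test named in the abstract above stops the first time a statistic computed from the current sample alone crosses a threshold. The sketch below illustrates only that generic stopping rule; the function name `shewhart_stopping_time`, the choice of statistic, and the threshold value are hypothetical and are not taken from the paper.

```python
def shewhart_stopping_time(samples, statistic, threshold):
    """Return the 1-based index of the first sample whose statistic
    reaches the threshold, or None if no sample does.

    `statistic` maps a single observation to a real number, e.g. a
    single-sample log-likelihood ratio (an illustrative assumption).
    """
    for t, x in enumerate(samples, start=1):
        # Only the current observation is used; there is no memory of the past.
        if statistic(x) >= threshold:
            return t
    return None


# Example: log-likelihood ratio of N(1, 1) against N(0, 1) for one sample.
def gaussian_shift_llr(x):
    return x - 0.5


alarm_time = shewhart_stopping_time(
    [0.1, -0.3, 0.2, 2.5, 0.0], gaussian_shift_llr, threshold=1.5
)
```

Because the rule has no memory, it reacts to a single strongly deviating sample, which suits changes that may be transient.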

arXiv:1712.08720 [pdf, other]

Multiaccess Communication via a Broadcast Approach Adapted to the Multiuser Channel

Authors: Samia Kazemi, Ali Tajer

Abstract: A broadcast strategy for multiple access communication over slowly fading channels is introduced, in which the channel state information is known to only the receiver. In this strategy, the transmitters split their information streams into multiple independent information streams, each adapted to a specific actual channel realization. The major distinction between the proposed strategy and the existing ones is that in the existing approaches, each transmitter adapts its transmission strategy only to the fading process of its direct channel to the receiver, hence directly adopting a single-user strategy previously designed for the single-user channels. However, the contribution of each user to a network-wide measure (e.g., sum-rate capacity) depends not only on the user's direct channel to the receiver, but also on the qualities of other channels. Driven by this premise, this paper proposes an alternative broadcast strategy in which the transmitters adapt their transmissions to the combined states resulting from all users' channels. This leads to generating a larger number of information streams by each transmitter and adopting a different decoding strategy by the receiver. An achievable rate region and an outer bound that capture the trade-off among the rates of different information layers are established, and it is shown that the achievable rate region subsumes the existing known capacity regions obtained based on adapting the broadcast approach to the single-user channels.

Submitted 23 December, 2017; originally announced December 2017.

arXiv:1711.04268 [pdf, other]

Active Sampling for the Quickest Detection of Markov Networks

Authors: Javad Heydari, Ali Tajer, H. Vincent Poor

Abstract: Consider $n$ random variables forming a Markov random field (MRF). The true model of the MRF is unknown, and it is assumed to belong to a binary set. The objective is to sequentially sample the random variables (one-at-a-time) such that the true MRF model can be detected with the fewest number of samples, while in parallel, the decision reliability is controlled. The core element of an optimal decision process is a rule for selecting and sampling the random variables over time. Such a process, at every time instant and adaptively to the collected data, selects the random variable that is expected to be most informative about the model, rendering an overall minimized number of samples required for reaching a reliable decision. The existing studies on detecting MRF structures generally sample the entire network at the same time and focus on designing optimal detection rules without regard to the data-acquisition process. This paper characterizes the sampling process for general MRFs, which, in conjunction with the sequential probability ratio test, is shown to be optimal in the asymptote of large $n$. The critical insight in designing the sampling process is devising an information measure that captures the decisions' inherent statistical dependence over time. Furthermore, when the MRFs can be modeled by acyclic probabilistic graphical models, the sampling rule is shown to take a computationally simple form. Performance analysis for the general case is provided, and the results are interpreted in several special cases: Gaussian MRFs, non-asymptotic regimes, connection to Chernoff's rule for controlled (active) sensing, and the problem of cluster detection.

Submitted 2 August, 2020; v1 submitted 12 November, 2017; originally announced November 2017.

Comments: 50 pages, 12 figures

arXiv:1702.01576 [pdf, other]

Quickest Localization of Anomalies in Power Grids: A Stochastic Graphical Framework

Authors: Javad Heydari, Ali Tajer

Abstract: Agile localization of anomalous events plays a pivotal role in enhancing the overall reliability of the grid and avoiding cascading failures. This is of paramount significance in large-scale grids due to their geographical expansions and the large volume of data generated. This paper proposes a stochastic graphical framework, which it leverages to localize the anomalies with the minimum amount of data. This framework capitalizes on the strong correlation structures observed among the measurements collected from different buses. The proposed approach, at its core, collects the measurements sequentially and progressively updates its decision about the location of the anomaly. The process continues until the location of the anomaly can be identified with desired reliability. We provide a general theory for the quickest anomaly localization and also investigate its application for quickest line outage localization. Simulations in the IEEE 118-bus model are provided to establish the gains of the proposed approach.

Submitted 6 February, 2017; originally announced February 2017.

Comments: 31 pages, 7 figures

arXiv:1507.08800 [pdf, ps, other]

doi 10.1109/TSG.2015.2466078

A Stochastic Sizing Approach for Sharing-based Energy Storage Applications

Authors: Islam Safak Bayram, Mohamed Abdallah, Ali Tajer, Khalid Qaraqe

Abstract: In order to foster renewable energy integration, improve power quality and reliability, and reduce hydrocarbon emissions, there is a strong need to deploy energy storage systems (ESSs), which can provide a control medium for peak hour utility operations. ESSs are especially desired at the residential level, as this sector has the most untapped demand response potential. However, considering their high acquisition, operation, and maintenance costs, individual ESS deployment is not economically viable. Hence, in this paper, we propose a \emph{sharing-based} ESS architecture, in which the demand of each customer is modeled stochastically and the aggregate demand is accommodated by a combination of power drawn from the grid and the storage unit when the demand exceeds grid capacity. A stochastic framework for analyzing the optimal size of energy storage systems is provided. An analytical method is developed for a group of customers with a \emph{single} type of appliance. Then, this framework is extended to any network size with an arbitrary number of customers and appliance types. The analytical method provides a tractable solution to the ESS sizing problem. Finally, a detailed cost-benefit analysis is provided, and the results indicate that sharing-based ESSs are practical and significant savings in terms of ESS size can be achieved.

Submitted 8 August, 2015; v1 submitted 31 July, 2015; originally announced July 2015.

Comments: Accepted by IEEE Transactions on Smart Grid

arXiv:1502.01524 [pdf, ps, other]

Capacity Planning Frameworks for Electric Vehicle Charging Stations with Multi-Class Customers

Authors: Islam Safak Bayram, Ali Tajer, Mohamed Abdallah, Khalid Qaraqe

Abstract: In order to foster electric vehicle (EV) adoption, there is a strong need for designing and developing charging stations that can accommodate different customer classes, distinguished by their charging preferences, needs, and technologies. By growing such charging station networks, the power grid becomes more congested and, therefore, the control of charging demands should be carefully aligned with the available resources. This paper focuses on an EV charging network equipped with different charging technologies and proposes two frameworks. In the first framework, appropriate for large networks, the EV population is expected to constitute a sizable portion of the light duty fleets. This necessitates controlling the EV charging operations to prevent potential grid failures and distribute the resources efficiently. This framework leverages pricing dynamics in order to control the EV customer request rates and to provide a charging service with the best level of quality of service. The second framework, on the other hand, is more appropriate for smaller networks, in which the objective is to compute the minimum amount of resources required to provide certain levels of quality of service to each class. The results show that the proposed frameworks ensure grid reliability and lead to significant savings in capacity planning.

Submitted 5 February, 2015; originally announced February 2015.

Comments: Accepted Transactions on Smart Grid

arXiv:1406.4726 [pdf, ps, other]

Energy Storage System Sizing for Peak Hour Utility Applications

Authors: I. Safak Bayram, Mohamed Abdallah, Ali Tajer, Khalid Qaraqe

Abstract: In future smart grids, energy storage systems (ESSs) are expected to play a key role in reducing peak hour electricity generation cost and the associated level of carbon emissions. Considering their high acquisition, operation, and maintenance costs, ESSs are likely to serve a large number of users. Hence, optimal sizing of ESSs plays a critical role, as over-provisioning ESS size leads to under-utilizing costly assets and under-provisioning it taxes operation lifetime. This paper proposes a stochastic framework for analyzing the optimal size of energy storage systems. In this framework, the demand of each customer is modeled stochastically and the aggregate demand is accommodated by a combination of power drawn from the grid and the storage unit when the demand exceeds grid capacity. In this framework, an analytical method is developed, which provides a tractable solution to the ESS sizing problem of interest. The results indicate that significant savings in terms of ESS size can be achieved.

Submitted 7 January, 2015; v1 submitted 18 June, 2014; originally announced June 2014.

Comments: ACCEPTED BY IEEE ICC 2015

arXiv:1210.2406 [pdf, other]

Quick Search for Rare Events

Authors: Ali Tajer, H. Vincent Poor

Abstract: Rare events can potentially occur in many applications. When manifested as opportunities to be exploited, risks to be ameliorated, or certain features to be extracted, such events become of paramount significance. Due to their sporadic nature, the information-bearing signals associated with rare events often lie in a large set of irrelevant signals and are not easily accessible. This paper provides a statistical framework for detecting such events so that an optimal balance between detection reliability and agility, as two opposing performance measures, is established. The core component of this framework is a sampling procedure that adaptively and quickly focuses the information-gathering resources on the segments of the dataset that bear the information pertinent to the rare events. Particular focus is placed on Gaussian signals with the aim of detecting signals with rare mean and variance values.

Submitted 8 October, 2012; originally announced October 2012.

arXiv:1206.1438 [pdf, ps, other]

Adaptive Sensing of Congested Spectrum Bands

Authors: Ali Tajer, Rui M. Castro, Xiaodong Wang

Abstract: Cognitive radios process their sensed information collectively in order to opportunistically identify and access under-utilized spectrum segments (spectrum holes). Due to the transient and rapidly-varying nature of the spectrum occupancy, the cognitive radios (secondary users) must be agile in identifying the spectrum holes in order to enhance their spectral efficiency. We propose a novel {\em adaptive} procedure to reinforce the agility of the secondary users for identifying {\em multiple} spectrum holes simultaneously over a wide spectrum band. This is accomplished by successively {\em exploring} the set of potential spectrum holes and {\em progressively} allocating the sensing resources to the most promising areas of the spectrum. Such exploration and resource allocation results in conservative spending of the sensing resources and translates into very agile spectrum monitoring. The proposed successive and adaptive sensing procedure is in contrast to the more conventional approaches that distribute the sampling resources equally over the entire spectrum. Besides improved agility, the adaptive procedure requires less-stringent constraints on the power of the primary users to guarantee that they remain distinguishable from the environment noise and renders more reliable spectrum hole detection.

Submitted 7 June, 2012; originally announced June 2012.

Comments: 16 pages, 5 figures

arXiv:1104.3911 [pdf, ps, other]

doi 10.1109/TSP.2011.2122259

Information Exchange Limits in Cooperative MIMO Networks

Authors: Ali Tajer, Xiaodong Wang

Abstract: Concurrent presence of inter-cell and intra-cell interferences constitutes a major impediment to reliable downlink transmission in multi-cell multiuser networks. Harnessing such interferences largely hinges on two levels of information exchange in the network: one from the users to the base-stations (feedback) and the other one among the base-stations (cooperation). We demonstrate that exchanging a finite number of bits across the network, in the form of feedback and cooperation, is adequate for achieving the optimal capacity scaling. We also show that the average level of information exchange is independent of the number of users in the network. This level of information exchange is considerably less than that required by the existing coordination strategies which necessitate exchanging infinite bits across the network for achieving the optimal sum-rate capacity scaling. The results provided rely on a constructive proof.

Submitted 20 April, 2011; v1 submitted 19 April, 2011; originally announced April 2011.

Comments: 35 pages, 5 figures

arXiv:1104.2784 [pdf, ps, other]

Diversity Analysis of Symbol-by-Symbol Linear Equalizers

Authors: Ali Tajer, Aria Nosratinia, Naofal Al-Dhahir

Abstract: In frequency-selective channels linear receivers enjoy significantly-reduced complexity compared with maximum likelihood receivers at the cost of performance degradation which can be in the form of a loss of the inherent frequency diversity order or reduced coding gain. This paper demonstrates that the minimum mean-square error symbol-by-symbol linear equalizer incurs no diversity loss compared to the maximum likelihood receivers. In particular, for a channel with memory $ν$, it achieves the full diversity order of ($ν+1$) while the zero-forcing symbol-by-symbol linear equalizer always achieves a diversity order of one.

Submitted 14 April, 2011; originally announced April 2011.

arXiv:1101.5084 [pdf, other]

Joint Detection and Estimation: Optimum Tests and Applications

Authors: George V. Moustakides, Guido H. Jajamovich, Ali Tajer, Xiaodong Wang

Abstract: We consider a well-defined joint detection and parameter estimation problem. By combining the Bayesian formulation of the estimation subproblem with suitable constraints on the detection subproblem we develop optimum one- and two-step tests for the joint detection/estimation case. The proposed combined strategies have the very desirable characteristic of allowing for a trade-off between detection power and estimation efficiency. Our theoretical developments are then applied to the problems of retrospective changepoint detection and MIMO radar. In the former case we are interested in detecting a change in the statistics of a set of available data and providing an estimate for the time of change, while in the latter in detecting a target and estimating its location. Intense simulations demonstrate that by using the jointly optimum schemes, we can experience significant improvement in estimation quality with small sacrifice in detection power.

Submitted 26 January, 2011; originally announced January 2011.

MSC Class: 62C10; 91B06

arXiv:1009.5146 [pdf, ps, other]

doi 10.1109/TSP.2010.2082537

Robust Linear Precoder Design for Multi-cell Downlink Transmission

Authors: Ali Tajer, Narayan Prasad, Xiaodong Wang

Abstract: Coordinated information processing by the base stations of multi-cell wireless networks enhances the overall quality of communication in the network. Such coordinations for optimizing any desired network-wide quality of service (QoS) necessitate the base stations to acquire and share some channel state information (CSI). With perfect knowledge of channel states, the base stations can adjust their transmissions for achieving a network-wise QoS optimality. In practice, however, the CSI can be obtained only imperfectly. As a result, due to the uncertainties involved, the network is not guaranteed to benefit from a globally optimal QoS. Nevertheless, if the channel estimation perturbations are confined within bounded regions, the QoS measure will also lie within a bounded region. Therefore, by exploiting the notion of robustness in the worst-case sense some worst-case QoS guarantees for the network can be asserted. We adopt a popular model for noisy channel estimates that assumes that estimation noise terms lie within known hyper-spheres. We aim to design linear transceivers that optimize a worst-case QoS measure in downlink transmissions. In particular, we focus on maximizing the worst-case weighted sum-rate of the network and the minimum worst-case rate of the network. For obtaining such transceiver designs, we offer several centralized (fully cooperative) and distributed (limited cooperation) algorithms which entail different levels of complexity and information exchange among the base stations.

Submitted 26 September, 2010; originally announced September 2010.

Comments: 38 Pages, 7 Figures, To appear in the IEEE Transactions on Signal Processing

arXiv:1007.3676 [pdf, ps, other]

(n,K)-user Interference Channels: Degrees of Freedom

Authors: Ali Tajer, Xiaodong Wang

Abstract: We analyze the gains of opportunistic communication in multiuser interference channels. Consider a fully connected $n$-user Gaussian interference channel. At each time instance only $K\leq n$ transmitters are allowed to be communicating with their respective receivers and the remaining $(n-K)$ transmitter-receiver pairs remain inactive. For finite $n$, if the transmitters can acquire channel state information (CSI) and if all channel gains are bounded away from zero and infinity, the seminal results on interference alignment establish that for any $K$ {\em arbitrary} active pairs the total number of spatial degrees of freedom per orthogonal time and frequency domain is $\frac{K}{2}$. Also it is noteworthy that without transmit-side CSI the interference channel becomes interference-limited and the degrees of freedom is 0. In {\em dense} networks ($n\rightarrow\infty$), however, as the size of the network increases, it becomes less likely to sustain the bounding conditions on the channel gains. By exploiting this fact, we show that when $n$ obeys certain scaling laws, by {\em opportunistically} and {\em dynamically} selecting the $K$ active pairs at each time instance, the number of degrees of freedom can exceed $\frac{K}{2}$ and in fact can be made arbitrarily close to $K$. More specifically, when all transmitters and receivers are equipped with one antenna, the network size scaling as $n\in ω(\mathrm{SNR}^{d(K-1)})$ is a {\em sufficient} condition for achieving $d\in[0,K]$ degrees of freedom. Moreover, achieving these degrees of freedom does not necessitate the transmitters to acquire channel state information. Hence, invoking opportunistic communication in the context of interference channels leads to achieving higher degrees of freedom that are not achievable otherwise.

Submitted 21 July, 2010; originally announced July 2010.

Comments: 34 pages

arXiv:1004.0383 [pdf, ps, other]

Multiuser Diversity Gain in Cognitive Networks

Authors: Ali Tajer, Xiaodong Wang

Abstract: Dynamic allocation of resources to the \emph{best} link in large multiuser networks offers considerable improvement in spectral efficiency. This gain, often referred to as \emph{multiuser diversity gain}, can be cast as double-logarithmic growth of the network throughput with the number of users. In this paper we consider large cognitive networks granted concurrent spectrum access with license-holding users. The primary network affords to share its under-utilized spectrum bands with the secondary users. We assess the optimal multiuser diversity gain in the cognitive networks by quantifying how the sum-rate throughput of the network scales with the number of secondary users. For this purpose we look at the optimal pairing of spectrum bands and secondary users, which is supervised by a central entity fully aware of the instantaneous channel conditions, and show that the throughput of the cognitive network scales double-logarithmically with the number of secondary users ($N$) and linearly with the number of available spectrum bands ($M$), i.e., $M\log\log N$. We then propose a \emph{distributed} spectrum allocation scheme, which does not necessitate a central controller or any information exchange between different secondary users and still obeys the optimal throughput scaling law. This scheme requires that \emph{some} secondary transmitter-receiver pairs exchange $\log M$ information bits among themselves. We also show that the aggregate amount of information exchange between secondary transmitter-receiver pairs is {\em asymptotically} equal to $M\log M$. Finally, we show that our distributed scheme guarantees fairness among the secondary users, meaning that they are equally likely to get access to an available spectrum band.

Submitted 16 April, 2010; v1 submitted 2 April, 2010; originally announced April 2010.

Comments: 32 pages, 3 figures, to appear in the IEEE/ACM Transactions on Networking

arXiv:0911.4896 [pdf, ps, other]

Diversity Order in ISI Channels with Single-Carrier Frequency-Domain Equalizers

Authors: Ali Tajer, Aria Nosratinia

Abstract: This paper analyzes the diversity gain achieved by single-carrier frequency-domain equalizer (SC-FDE) in frequency selective channels, and uncovers the interplay between diversity gain $d$, channel memory length $ν$, transmission block length $L$, and the spectral efficiency $R$. We specifically show that for the class of minimum mean-square error (MMSE) SC-FDE receivers, for rates $R\leq\log\frac{L}{ν}$ full diversity of $d=ν+1$ is achievable, while for higher rates the diversity is given by $d=\lfloor2^{-R}L\rfloor+1$. In other words, the achievable diversity gain depends not only on the channel memory length, but also on the desired spectral efficiency and the transmission block length. A similar analysis reveals that for zero forcing SC-FDE, the diversity order is always one irrespective of channel memory length and spectral efficiency. These results are supported by simulations.

Submitted 25 November, 2009; originally announced November 2009.

Comments: 30 pages, 6 figures, to appear in the IEEE Transactions on Wireless Communications

arXiv:0908.1077 [pdf, ps, other]

doi 10.1109/TSP.2009.2031280

Beamforming and Rate Allocation in MISO Cognitive Radio Networks

Authors: Ali Tajer, Narayan Prasad, Xiaodong Wang

Abstract: We consider decentralized multi-antenna cognitive radio networks where secondary (cognitive) users are granted simultaneous spectrum access along with license-holding (primary) users. We treat the problem of distributed beamforming and rate allocation for the secondary users such that the minimum weighted secondary rate is maximized. Such an optimization is subject to (1) a limited weighted sum-power budget for the secondary users and (2) guaranteed protection for the primary users in the sense that the interference level imposed on each primary receiver does not exceed a specified level. Based on the decoding method deployed by the secondary receivers, we consider three scenarios for solving this problem. In the first scenario each secondary receiver decodes only its designated transmitter while suppressing the rest as Gaussian interferers (single-user decoding). In the second case each secondary receiver employs the maximum likelihood decoder (MLD) to jointly decode all secondary transmissions, and in the third one each secondary receiver uses the unconstrained group decoder (UGD). By deploying the UGD, each secondary user is allowed to decode any arbitrary subset of users (which contains its designated user) after suppressing or canceling the remaining users.

Submitted 7 August, 2009; originally announced August 2009.

Comments: 32 pages, 6 figures

arXiv:0908.1071 [pdf, ps, other]

doi 10.1109/JSTSP.2010.2040104

Optimal Joint Target Detection and Parameter Estimation By MIMO Radar

Authors: Ali Tajer, Guido H. Jajamovich, Xiaodong Wang, George V. Moustakides

Abstract: We consider multiple-input multiple-output (MIMO) radar systems with widely-spaced antennas. Such antenna configuration facilitates capturing the inherent diversity gain due to independent signal dispersion by the target scatterers. We consider a new MIMO radar framework for detecting a target that lies in an unknown location. This is in contrast with conventional MIMO radars which break the space into small cells and aim at detecting the presence of a target in a specified cell. We treat this problem through offering a novel composite hypothesis testing framework for target detection when (i) one or more parameters of the target are unknown and we are interested in estimating them, and (ii) only a finite number of observations are available. The test offered optimizes a metric which accounts for both detection and estimation accuracies. In this paper as the parameter of interest we focus on the vector of time-delays that the waveforms undergo from being emitted by the transmit antennas until being observed by the receive antennas. The analytical and empirical results establish that for the proposed joint target detection and time-delay estimation framework, MIMO radars exhibit significant gains over phased-array radars for extended targets which consist of multiple independent scatterers. For point targets modeled as single scatterers, however, the detection/estimation accuracies of MIMO and phased-array radars for this specific setup (joint target detection and time-delay estimation) are comparable.

Submitted 7 August, 2009; originally announced August 2009.

Comments: 37 pages, 8 figures

arXiv:0905.4476 [pdf, ps, other]

Beacon-Assisted Spectrum Access with Cooperative Cognitive Transmitter and Receiver

Authors: Ali Tajer, Xiaodong Wang

Abstract: Spectrum access is an important function of cognitive radios for detecting and utilizing spectrum holes without interfering with the legacy systems. In this paper we propose novel cooperative communication models and show how deploying such cooperations between a pair of secondary transmitter and receiver assists them in identifying spectrum opportunities more reliably. These cooperations are facilitated by dynamically and opportunistically assigning one of the secondary users as a relay to assist the other one which results in more efficient spectrum hole detection. Also, we investigate the impact of erroneous detection of spectrum holes and thereof missing communication opportunities on the capacity of the secondary channel. The capacity of the secondary users with interference-avoiding spectrum access is affected by 1) how effectively the availability of vacant spectrum is sensed by the secondary transmitter-receiver pair, and 2) how correlated are the perceptions of the secondary transmitter-receiver pair about network spectral activity. We show that both factors are improved by using the proposed cooperative protocols. One of the proposed protocols requires explicit information exchange in the network. Such information exchange in practice is prone to wireless channel errors (i.e., is imperfect) and costs bandwidth loss. We analyze the effects of such imperfect information exchange on the capacity as well as the effect of bandwidth cost on the achievable throughput. The protocols are also extended to multiuser secondary networks.

Submitted 27 May, 2009; originally announced May 2009.

Comments: 36 pages, 6 figures, To appear in IEEE Transactions on Mobile Computing

Showing 1–42 of 42 results for author: Tajer, A