Bayesian Inverse Reinforcement Learning for Non-Markovian Rewards

Authors: Noah Topper, Alvaro Velasquez, George Atia

Abstract: Inverse reinforcement learning (IRL) is the problem of inferring a reward function from expert behavior. There are several approaches to IRL, but most are designed to learn a Markovian reward. However, a reward function might be non-Markovian, depending on more than just the current state, such as a reward machine (RM). Although there has been recent work on inferring RMs, it assumes access to the reward signal, absent in IRL. We propose a Bayesian IRL (BIRL) framework for inferring RMs directly from expert behavior, requiring significant changes to the standard framework. We define a new reward space, adapt the expert demonstration to include history, show how to compute the reward posterior, and propose a novel modification to simulated annealing to maximize this posterior. We demonstrate that our method performs well when optimizing according to its inferred reward and compares favorably to an existing method that learns exclusively binary non-Markovian rewards.

Submitted 20 June, 2024; originally announced June 2024.

arXiv:2310.19137 [pdf, other]

Automaton Distillation: Neuro-Symbolic Transfer Learning for Deep Reinforcement Learning

Authors: Suraj Singireddy, Andre Beckus, George Atia, Sumit Jha, Alvaro Velasquez

Abstract: Reinforcement learning (RL) is a powerful tool for finding optimal policies in sequential decision processes. However, deep RL methods suffer from two weaknesses: collecting the amount of agent experience required for practical RL problems is prohibitively expensive, and the learned policies exhibit poor generalization on tasks outside of the training distribution. To mitigate these issues, we introduce automaton distillation, a form of neuro-symbolic transfer learning in which Q-value estimates from a teacher are distilled into a low-dimensional representation in the form of an automaton. We then propose two methods for generating Q-value estimates: static transfer, which reasons over an abstract Markov Decision Process constructed based on prior knowledge, and dynamic transfer, where symbolic information is extracted from a teacher Deep Q-Network (DQN). The resulting Q-value estimates from either method are used to bootstrap learning in the target environment via a modified DQN loss function. We list several failure modes of existing automaton-based transfer methods and demonstrate that both static and dynamic automaton distillation decrease the time required to find optimal policies for various decision tasks.

Submitted 29 October, 2023; originally announced October 2023.

arXiv:2305.10504 [pdf, other]

Model-Free Robust Average-Reward Reinforcement Learning

Authors: Yue Wang, Alvaro Velasquez, George Atia, Ashley Prater-Bennette, Shaofeng Zou

Abstract: Robust Markov decision processes (MDPs) address the challenge of model uncertainty by optimizing the worst-case performance over an uncertainty set of MDPs. In this paper, we focus on the robust average-reward MDPs under the model-free setting. We first theoretically characterize the structure of solutions to the robust average-reward Bellman equation, which is essential for our later convergence analysis. We then design two model-free algorithms, robust relative value iteration (RVI) TD and robust RVI Q-learning, and theoretically prove their convergence to the optimal solution. We provide several widely used uncertainty sets as examples, including those defined by the contamination model, total variation, Chi-squared divergence, Kullback-Leibler (KL) divergence and Wasserstein distance.

Submitted 17 May, 2023; originally announced May 2023.

Comments: ICML 2023

arXiv:2305.09044 [pdf, ps, other]

Scalable and Robust Tensor Ring Decomposition for Large-scale Data

Authors: Yicong He, George K. Atia

Abstract: Tensor ring (TR) decomposition has recently received increased attention due to its superior expressive performance for high-order tensors. However, the applicability of traditional TR decomposition algorithms to real-world applications is hindered by prevalent large data sizes, missing entries, and corruption with outliers. In this work, we propose a scalable and robust TR decomposition algorithm capable of handling large-scale tensor data with missing entries and gross corruptions. We first develop a novel auto-weighted steepest descent method that can adaptively fill the missing entries and identify the outliers during the decomposition process. Further, taking advantage of the tensor ring model, we develop a novel fast Gram matrix computation (FGMC) approach and a randomized subtensor sketching (RStS) strategy which yield significant reduction in storage and computational complexity. Experimental results demonstrate that the proposed method outperforms existing TR decomposition methods in the presence of outliers, and runs significantly faster than existing robust tensor completion algorithms.

Submitted 15 May, 2023; originally announced May 2023.

arXiv:2301.04093 [pdf, other]

On the Robustness of AlphaFold: A COVID-19 Case Study

Authors: Ismail Alkhouri, Sumit Jha, Andre Beckus, George Atia, Alvaro Velasquez, Rickard Ewetz, Arvind Ramanathan, Susmit Jha

Abstract: Protein folding neural networks (PFNNs) such as AlphaFold predict remarkably accurate structures of proteins compared to other approaches. However, the robustness of such networks has heretofore not been explored. This is particularly relevant given the broad social implications of such technologies and the fact that biologically small perturbations in the protein sequence do not generally lead to drastic changes in the protein structure. In this paper, we demonstrate that AlphaFold does not exhibit such robustness despite its high accuracy. This raises the challenge of detecting and quantifying the extent to which these predicted protein structures can be trusted. To measure the robustness of the predicted structures, we utilize (i) the root-mean-square deviation (RMSD) and (ii) the Global Distance Test (GDT) similarity measure between the predicted structure of the original sequence and the structure of its adversarially perturbed version. We prove that the problem of minimally perturbing protein sequences to fool protein folding neural networks is NP-complete. Based on the well-established BLOSUM62 sequence alignment scoring matrix, we generate adversarial protein sequences and show that the RMSD between the predicted protein structure and the structure of the original sequence is very large when the adversarial changes are bounded by (i) 20 units in the BLOSUM62 distance, and (ii) five residues (out of hundreds or thousands of residues) in the given protein sequence.
In our experimental evaluation, we consider 111 COVID-19 proteins in the Universal Protein resource (UniProt), a central resource for protein data managed by the European Bioinformatics Institute, Swiss Institute of Bioinformatics, and the US Protein Information Resource. These result in an overall GDT similarity test score average of around 34%, demonstrating a substantial drop in the performance of AlphaFold.

Submitted 12 January, 2023; v1 submitted 10 January, 2023; originally announced January 2023.

Comments: arXiv admin note: text overlap with arXiv:2109.04460

arXiv:2301.00858 [pdf, other]

Robust Average-Reward Markov Decision Processes

Authors: Yue Wang, Alvaro Velasquez, George Atia, Ashley Prater-Bennette, Shaofeng Zou

Abstract: In robust Markov decision processes (MDPs), the uncertainty in the transition kernel is addressed by finding a policy that optimizes the worst-case performance over an uncertainty set of MDPs. While much of the literature has focused on discounted MDPs, robust average-reward MDPs remain largely unexplored. In this paper, we focus on robust average-reward MDPs, where the goal is to find a policy that optimizes the worst-case average reward over an uncertainty set. We first take an approach that approximates average-reward MDPs using discounted MDPs. We prove that the robust discounted value function converges to the robust average-reward as the discount factor $γ$ goes to $1$, and moreover, when $γ$ is large, any optimal policy of the robust discounted MDP is also an optimal policy of the robust average-reward. We further design a robust dynamic programming approach, and theoretically characterize its convergence to the optimum. Then, we investigate robust average-reward MDPs directly without using discounted MDPs as an intermediate step. We derive the robust Bellman equation for robust average-reward MDPs, prove that the optimal policy can be derived from its solution, and further design a robust relative value iteration algorithm that provably finds its solution, or equivalently, the optimal robust policy.

Submitted 1 March, 2023; v1 submitted 2 January, 2023; originally announced January 2023.

Comments: AAAI 2023

arXiv:2210.16894 [pdf, ps, other]

Distributionally Robust Domain Adaptation

Authors: Akram S. Awad, George K. Atia

Abstract: Domain Adaptation (DA) has recently received significant attention due to its potential to adapt a learning model across source and target domains with mismatched distributions. Since DA methods rely exclusively on the given source and target domain samples, they generally yield models that are vulnerable to noise and unable to adapt to unseen samples from the target domain, which calls for DA methods that guarantee the robustness and generalization of the learned models. In this paper, we propose DRDA, a distributionally robust domain adaptation method. DRDA leverages a distributionally robust optimization (DRO) framework to learn a robust decision function that minimizes the worst-case target domain risk and generalizes to any sample from the target domain by transferring knowledge from a given labeled source domain sample. We utilize the Maximum Mean Discrepancy (MMD) metric to construct an ambiguity set of distributions that provably contains the source and target domain distributions with high probability. Hence, the risk is shown to upper bound the out-of-sample target domain loss. Our experimental results demonstrate that our formulation outperforms existing robust learning approaches.

Submitted 30 October, 2022; originally announced October 2022.

arXiv:2203.08209 [pdf, other]

A Differentiable Approach to Combinatorial Optimization using Dataless Neural Networks

Authors: Ismail R. Alkhouri, George K. Atia, Alvaro Velasquez

Abstract: The success of machine learning solutions for reasoning about discrete structures has brought attention to its adoption within combinatorial optimization algorithms. Such approaches generally rely on supervised learning by leveraging datasets of the combinatorial structures of interest drawn from some distribution of problem instances. Reinforcement learning has also been employed to find such structures. In this paper, we propose a radically different approach in that no data is required for training the neural networks that produce the solution. In particular, we reduce the combinatorial optimization problem to a neural network and employ a dataless training scheme to refine the parameters of the network such that those parameters yield the structure of interest. We consider the combinatorial optimization problems of finding maximum independent sets and maximum cliques in a graph. In principle, since these problems belong to the NP-hard complexity class, our proposed approach can be used to solve any other NP-hard problem. Additionally, we propose a universal graph reduction procedure to handle large scale graphs. The reduction exploits community detection for graph partitioning and is applicable to any graph type and/or density. Experimental evaluation on both synthetic graphs and real-world benchmarks demonstrates that our method performs on par with or outperforms state-of-the-art heuristic, reinforcement learning, and machine learning based methods without requiring any data.

Submitted 15 March, 2022; originally announced March 2022.

arXiv:2110.13200 [pdf, ps, other]

Support Recovery of Periodic Mixtures with Nested Periodic Dictionaries

Authors: Pouria Saidi, George K. Atia

Abstract: Periodic signals composed of periodic mixtures admit sparse representations in nested periodic dictionaries (NPDs). Therefore, their underlying hidden periods can be estimated by recovering the exact support of said representations. In this paper, support recovery guarantees of such signals are derived both in noise-free and noisy settings. While exact recovery conditions have been studied in the theory of compressive sensing, existing conditions fall short of yielding meaningful achievability regions in the context of periodic signals with sparse representations in NPDs, in part since existing bounds do not capture structures intrinsic to these dictionaries. We leverage known properties of NPDs to derive several conditions for exact sparse recovery of periodic mixtures in the noise-free setting. These conditions rest on newly introduced notions of nested periodic coherence and restricted coherence, which can be efficiently computed. In the presence of noise, we obtain improved conditions for recovering the exact support set of the sparse representation of the periodic mixture via orthogonal matching pursuit based on the introduced notions of coherence. The theoretical findings are corroborated using numerical experiments for different families of NPDs. Our results show significant improvement over generic recovery bounds as the conditions hold over a larger range of sparsity levels.

Submitted 3 June, 2024; v1 submitted 25 October, 2021; originally announced October 2021.

Comments: 32 pages, 10 figures

arXiv:2108.02756 [pdf, other]

BOSS: Bidirectional One-Shot Synthesis of Adversarial Examples

Authors: Ismail R. Alkhouri, Alvaro Velasquez, George K. Atia

Abstract: The design of additive imperceptible perturbations to the inputs of deep classifiers to maximize their misclassification rates is a central focus of adversarial machine learning. An alternative approach is to synthesize adversarial examples from scratch using GAN-like structures, albeit with the use of large amounts of training data. By contrast, this paper considers one-shot synthesis of adversarial examples; the inputs are synthesized from scratch to induce arbitrary soft predictions at the output of pre-trained models, while simultaneously maintaining high similarity to specified inputs. To this end, we present a problem that encodes objectives on the distance between the desired and output distributions of the trained model and the similarity between such inputs and the synthesized examples. We prove that the formulated problem is NP-complete. Then, we advance a generative approach to the solution in which the adversarial examples are obtained as the output of a generative network whose parameters are iteratively updated by optimizing surrogate loss functions for the dual-objective. We demonstrate the generality and versatility of the framework and approach proposed through applications to the design of targeted adversarial attacks, generation of decision boundary samples, and synthesis of low confidence classification inputs. The approach is further extended to an ensemble of models with different soft output specifications.
The experimental results verify that the targeted and confidence reduction attack methods developed perform on par with state-of-the-art algorithms.

Submitted 16 July, 2022; v1 submitted 5 August, 2021; originally announced August 2021.

arXiv:2107.04633 [pdf, other]

Inferring Probabilistic Reward Machines from Non-Markovian Reward Processes for Reinforcement Learning

Authors: Taylor Dohmen, Noah Topper, George Atia, Andre Beckus, Ashutosh Trivedi, Alvaro Velasquez

Abstract: The success of reinforcement learning in typical settings is predicated on Markovian assumptions on the reward signal by which an agent learns optimal policies. In recent years, the use of reward machines has relaxed this assumption by enabling a structured representation of non-Markovian rewards. In particular, such representations can be used to augment the state space of the underlying decision process, thereby facilitating non-Markovian reinforcement learning. However, these reward machines cannot capture the semantics of stochastic reward signals. In this paper, we make progress on this front by introducing probabilistic reward machines (PRMs) as a representation of non-Markovian stochastic rewards. We present an algorithm to learn PRMs from the underlying decision process and prove results around its correctness and convergence.

Submitted 27 March, 2022; v1 submitted 9 July, 2021; originally announced July 2021.

arXiv:2106.10422 [pdf, ps, other]

Coarse to Fine Two-Stage Approach to Robust Tensor Completion of Visual Data

Authors: Yicong He, George K. Atia

Abstract: Tensor completion is the problem of estimating the missing values of high-order data from partially observed entries. Data corruption due to prevailing outliers poses major challenges to traditional tensor completion algorithms, which catalyzed the development of robust algorithms that alleviate the effect of outliers. However, existing robust methods largely presume that the corruption is sparse, which may not hold in practice. In this paper, we develop a two-stage robust tensor completion approach to deal with tensor completion of visual data with a large amount of gross corruption. A novel coarse-to-fine framework is proposed which uses a global coarse completion result to guide a local patch refinement process. To efficiently mitigate the effect of a large number of outliers on tensor recovery, we develop a new M-estimator-based robust tensor ring recovery method which can adaptively identify the outliers and alleviate their negative effect in the optimization. The experimental results demonstrate the superior performance of the proposed approach over state-of-the-art robust algorithms for tensor completion.

Submitted 11 August, 2022; v1 submitted 19 June, 2021; originally announced June 2021.

arXiv:2106.02951 [pdf, other]

Controller Synthesis for Omega-Regular and Steady-State Specifications

Authors: Alvaro Velasquez, Ismail Alkhouri, Andre Beckus, Ashutosh Trivedi, George Atia

Abstract: Given a Markov decision process (MDP) and a linear-time ($ω$-regular or LTL) specification, the controller synthesis problem aims to compute the optimal policy that satisfies the specification. More recently, problems that reason over the asymptotic behavior of systems have been proposed through the lens of steady-state planning. This entails finding a control policy for an MDP such that the Markov chain induced by the solution policy satisfies a given set of constraints on its steady-state distribution. This paper studies a generalization of the controller synthesis problem for a linear-time specification under steady-state constraints on the asymptotic behavior. We present an algorithm to find a deterministic policy satisfying $ω$-regular and steady-state constraints by characterizing the solutions as an integer linear program, and experimentally evaluate our approach.

Submitted 7 February, 2022; v1 submitted 5 June, 2021; originally announced June 2021.

arXiv:2105.14620 [pdf, ps, other]

Patch Tracking-based Streaming Tensor Ring Completion for Visual Data Recovery

Authors: Yicong He, George K. Atia

Abstract: Tensor completion aims to recover the missing entries of a partially observed tensor by exploiting its low-rank structure, and has been applied to visual data recovery. In applications where the data arrives sequentially such as streaming video completion, the missing entries of the tensor need to be dynamically recovered in a streaming fashion. Traditional streaming tensor completion algorithms treat the entire visual data as a tensor, which may not work satisfactorily when there is a big change in the tensor subspace along the temporal dimension, such as due to strong motion across the video frames. In this paper, we develop a novel patch tracking-based streaming tensor ring completion framework for visual data recovery. Given a newly incoming frame, small patches are tracked from the previous frame. Meanwhile, for each tracked patch, a patch tensor is constructed by stacking similar patches from the new frame. Patch tensors are then completed using a streaming tensor ring completion algorithm, and the incoming frame is recovered using the completed patch tensors. We propose a new patch tracking strategy that can accurately and efficiently track the patches with missing data. Further, a new streaming tensor ring completion algorithm is proposed which can efficiently and accurately update the latent core tensors and complete the missing entries of the patch tensors. Extensive experimental results demonstrate the superior performance of the proposed algorithms compared with both batch and streaming state-of-the-art tensor completion methods.

Submitted 12 August, 2022; v1 submitted 30 May, 2021; originally announced May 2021.

arXiv:2012.02178 [pdf, other]

doi 10.1613/jair.1.12611

Steady-State Planning in Expected Reward Multichain MDPs

Authors: George K. Atia, Andre Beckus, Ismail Alkhouri, Alvaro Velasquez

Abstract: The planning domain has experienced increased interest in the formal synthesis of decision-making policies. This formal synthesis typically entails finding a policy which satisfies formal specifications in the form of some well-defined logic. While many such logics have been proposed with varying degrees of expressiveness and complexity in their capacity to capture desirable agent behavior, their value is limited when deriving decision-making policies which satisfy certain types of asymptotic behavior in general system models. In particular, we are interested in specifying constraints on the steady-state behavior of an agent, which captures the proportion of time an agent spends in each state as it interacts for an indefinite period of time with its environment. This is sometimes called the average or expected behavior of the agent and the associated planning problem is faced with significant challenges unless strong restrictions are imposed on the underlying model in terms of the connectivity of its graph structure. In this paper, we explore this steady-state planning problem that consists of deriving a decision-making policy for an agent such that constraints on its steady-state behavior are satisfied. A linear programming solution for the general case of multichain Markov Decision Processes (MDPs) is proposed and we prove that optimal solutions to the proposed programs yield stationary policies with rigorous guarantees of behavior.

Submitted 23 October, 2021; v1 submitted 3 December, 2020; originally announced December 2020.

Journal ref: Journal of Artificial Intelligence Research 72 (2021) 1029-1082

arXiv:2010.11740 [pdf, ps, other]

Robust Low-tubal-rank Tensor Completion based on Tensor Factorization and Maximum Correntropy Criterion

Authors: Yicong He, George K. Atia

Abstract: The goal of tensor completion is to recover a tensor from a subset of its entries, often by exploiting its low-rank property. Among several useful definitions of tensor rank, the low-tubal-rank was shown to give a valuable characterization of the inherent low-rank structure of a tensor. While some low-tubal-rank tensor completion algorithms with favorable performance have been recently proposed, these algorithms utilize second-order statistics to measure the error residual, which may not work well when the observed entries contain large outliers. In this paper, we propose a new objective function for low-tubal-rank tensor completion, which uses correntropy as the error measure to mitigate the effect of the outliers. To efficiently optimize the proposed objective, we leverage a half-quadratic minimization technique whereby the optimization is transformed to a weighted low-tubal-rank tensor factorization problem. Subsequently, we propose two simple and efficient algorithms to obtain the solution and provide their convergence and complexity analysis. Numerical results using both synthetic and real data demonstrate the robust and superior performance of the proposed algorithms.

Submitted 14 October, 2022; v1 submitted 22 October, 2020; originally announced October 2020.

arXiv:2009.11835 [pdf, other]

doi 10.1103/PhysRevE.106.044306

Sketch-based community detection in evolving networks

Authors: Andre Beckus, George K. Atia

Abstract: We consider an approach for community detection in time-varying networks. At its core, this approach maintains a small sketch graph to capture the essential community structure found in each snapshot of the full network. We demonstrate how the sketch can be used to explicitly identify six key community events which typically occur during network evolution: growth, shrinkage, merging, splitting, birth and death. Based on these detection techniques, we formulate a community detection algorithm which can process a network concurrently exhibiting all processes. One advantage afforded by the sketch-based algorithm is the efficient handling of large networks. Whereas detecting events in the full graph may be computationally expensive, the small size of the sketch allows changes to be quickly assessed. A second advantage occurs in networks containing clusters of disproportionate size. The sketch is constructed such that there is equal representation of each cluster, thus reducing the possibility that the small clusters are lost in the estimate. We present a new standardized benchmark based on the stochastic block model which models the addition and deletion of nodes, as well as the birth and death of communities. When coupled with existing benchmarks, this new benchmark provides a comprehensive suite of tests encompassing all six community events. We provide analysis and a set of numerical results demonstrating the advantages of our approach both in run time and in the handling of small clusters.

Submitted 3 December, 2022; v1 submitted 24 September, 2020; originally announced September 2020.

Journal ref: Physical Review E, vol. 106, p. 044306, Oct 2022

arXiv:2003.05989 [pdf, other]

A Multi-criteria Approach for Fast and Outlier-aware Representative Selection from Manifolds

Authors: Mahlagha Sedghi, George Atia, Michael Georgiopoulos

Abstract: The problem of representative selection amounts to sampling few informative exemplars from large datasets. This paper presents MOSAIC, a novel representative selection approach from high-dimensional data that may exhibit non-linear structures. Resting upon a novel quadratic formulation, our method advances a multi-criteria selection approach that maximizes the global representation power of the sampled subset, ensures diversity, and rejects disruptive information by effectively detecting outliers. Through theoretical analyses we characterize the obtained sketch and reveal that the sampled representatives maximize a well-defined notion of data coverage in a transformed space. In addition, we present a highly scalable randomized implementation of the proposed algorithm shown to bring about substantial speedups. MOSAIC's superiority in achieving the desired characteristics of a representative subset all at once while exhibiting remarkable robustness to various outlier types is demonstrated via extensive experiments conducted on both real and synthetic data with comparisons to state-of-the-art algorithms.

Submitted 12 March, 2020; originally announced March 2020.

arXiv:1809.10073 [pdf, other]

Rediscovering Deep Neural Networks Through Finite-State Distributions

Authors: Amir Emad Marvasti, Ehsan Emad Marvasti, George Atia, Hassan Foroosh

Abstract: We propose a new way of thinking about deep neural networks, in which the linear and non-linear components of the network are naturally derived and justified in terms of principles in probability theory. In particular, the models constructed in our framework assign probabilities to uncertain realizations, leading to Kullback-Leibler Divergence (KLD) as the linear layer. In our model construction, we also arrive at a structure similar to ReLU activation supported with Bayes' theorem. The non-linearities in our framework are normalization layers with ReLU and Sigmoid as element-wise approximations. Additionally, the pooling function is derived as a marginalization of spatial random variables according to the mechanics of the framework. As such, Max Pooling is an approximation to the aforementioned marginalization process. Since our models are comprised of finite state distributions (FSD) as variables and parameters, exact computation of information-theoretic quantities such as entropy and KLD is possible, thereby providing more objective measures to analyze networks. Unlike existing designs that rely on heuristics, the proposed framework restricts subjective interpretations of CNNs and sheds light on the functionality of neural networks from a completely new perspective.

Submitted 9 October, 2019; v1 submitted 26 September, 2018; originally announced September 2018.

arXiv:1807.03622 [pdf, ps, other]

doi 10.1364/JOSAA.35.001880

Interferometry-based modal analysis with finite aperture effects

Authors: Davood Mardani, Ayman F. Abouraddy, George K. Atia

Abstract: We analyze the effects of aperture finiteness on interferograms recorded to unveil the modal content of optical beams in arbitrary basis using generalized interferometry. We develop a scheme for modal reconstruction from interferometric measurements that accounts for the ensuing clipping effects. Clipping-cognizant reconstruction is shown to yield significant performance gains over traditional schemes that overlook such effects that do arise in practice. Our work can inspire further research on reconstruction schemes and algorithms that account for practical hardware limitations in a variety of contexts.

Submitted 4 July, 2018; originally announced July 2018.

arXiv:1807.02444 [pdf, other]

doi 10.1109/TIP.2019.2896517

Multi-modal Non-line-of-sight Passive Imaging

Authors: Andre Beckus, Alexandru Tamasan, George K. Atia

Abstract: We consider the non-line-of-sight (NLOS) imaging of an object using the light reflected off a diffusive wall. The wall scatters incident light such that a lens is no longer useful to form an image. Instead, we exploit the 4D spatial coherence function to reconstruct a 2D projection of the obscured object. The approach is completely passive in the sense that no control over the light illuminating the object is assumed and is compatible with the partially coherent fields ubiquitous in both the indoor and outdoor environments. We formulate a multi-criteria convex optimization problem for reconstruction, which fuses the reflected field's intensity and spatial coherence information at different scales. Our formulation leverages established optics models of light propagation and scattering and exploits the sparsity common to many images in different bases. We also develop an algorithm based on the alternating direction method of multipliers to efficiently solve the convex program proposed. A means for analyzing the null space of the measurement matrices is provided as well as a means for weighting the contribution of individual measurements to the reconstruction. This paper holds promise to advance passive imaging in the challenging NLOS regimes in which the intensity does not necessarily retain distinguishable features and provides a framework for multi-modal information fusion for efficient scene reconstruction.

Submitted 2 March, 2019; v1 submitted 6 July, 2018; originally announced July 2018.

Journal ref: IEEE Transactions on Image Processing, vol. 28, no. 7, pp. 3372-3382, July 2019

arXiv:1805.10927 [pdf, other]

doi 10.1109/TSP.2020.2965818

Scalable and Robust Community Detection with Randomized Sketching

Authors: Mostafa Rahmani, Andre Beckus, Adel Karimian, George Atia

Abstract: This article explores and analyzes the unsupervised clustering of large partially observed graphs. We propose a scalable and provable randomized framework for clustering graphs generated from the stochastic block model. The clustering is first applied to a sub-matrix of the graph's adjacency matrix associated with a reduced graph sketch constructed using random sampling. Then, the clusters of the full graph are inferred based on the clusters extracted from the sketch using a correlation-based retrieval step. Uniform random node sampling is shown to improve the computational complexity over clustering of the full graph when the cluster sizes are balanced. A new random degree-based node sampling algorithm is presented which significantly improves upon the performance of the clustering algorithm even when clusters are unbalanced. This framework improves the phase transitions for matrix-decomposition-based clustering with regard to computational complexity and minimum cluster size, which are shown to be nearly dimension-free in the low inter-cluster connectivity regime. A third sampling technique is shown to improve balance by randomly sampling nodes based on spatial distribution. We provide analysis and numerical results using a convex clustering algorithm based on matrix completion.

Submitted 3 December, 2022; v1 submitted 25 May, 2018; originally announced May 2018.

Journal ref: IEEE Transactions on Signal Processing, vol. 68, pp. 962-977, 2020

arXiv:1803.05542 [pdf, ps, other]

A Game-Theoretic Framework for the Virtual Machines Migration Timing Problem

Authors: Ahmed H. Anwar, George Atia, Mina Guirguis

Abstract: In a multi-tenant cloud, a number of Virtual Machines (VMs) are collocated on the same physical machine to optimize performance, power consumption and maximize profit. This, however, increases the risk of a malicious VM performing side-channel attacks and leaking sensitive information from neighboring VMs. To this end, this paper develops and analyzes a game-theoretic framework for the VM migration timing problem in which the cloud provider decides when to migrate a VM to a different physical machine to reduce the risk of being compromised by a collocated malicious VM. The adversary decides the rate at which she launches new VMs to collocate with the victim VMs. Our formulation captures a data leakage model in which the cost incurred by the cloud provider depends on the duration of collocation with malicious VMs. It also captures costs incurred by the adversary in launching new VMs and by the defender in migrating VMs. We establish sufficient conditions for the existence of Nash equilibria for general cost functions, as well as for specific instantiations, and characterize the best response for both players. Furthermore, we extend our model to characterize its impact on the attacker's payoff when the cloud utilizes intrusion detection systems that detect side-channel attacks. Our theoretical findings are corroborated with extensive numerical results in various settings.

Submitted 14 March, 2018; originally announced March 2018.

arXiv:1712.00891 [pdf, other]

Data Dropout in Arbitrary Basis for Deep Network Regularization

Authors: Mostafa Rahmani, George Atia

Abstract: An important problem in training deep networks with high capacity is to ensure that the trained network works well when presented with new inputs outside the training dataset. Dropout is an effective regularization technique to boost the network generalization in which a random subset of the elements of the given data and the extracted features are set to zero during the training process. In this paper, a new randomized regularization technique in which we withhold a random part of the data without necessarily turning off the neurons/data-elements is proposed. In the proposed method, of which the conventional dropout is shown to be a special case, random data dropout is performed in an arbitrary basis, hence the designation Generalized Dropout. We also present a framework whereby the proposed technique can be applied efficiently to convolutional neural networks. The presented numerical experiments demonstrate that the proposed technique yields notable performance gain. Generalized Dropout provides new insight into the idea of dropout, shows that we can achieve different performance gains by using different bases matrices, and opens up a new research question as to how to choose optimal bases matrices that achieve maximal performance gain.

Submitted 4 December, 2017; v1 submitted 3 December, 2017; originally announced December 2017.

arXiv:1706.10275 [pdf, ps, other]

Signal Reconstruction from Interferometric Measurements under Sensing Constraints

Authors: Davood Mardani, George K. Atia, Ayman F. Abouraddy

Abstract: This paper develops a unifying framework for signal reconstruction from interferometric measurements that is broadly applicable to various applications of interferometry. In this framework, the problem of signal reconstruction in interferometry amounts to one of basis analysis. Its applicability is shown to extend beyond conventional temporal interferometry, which leverages the relative delay between the two arms of an interferometer, to arbitrary degrees of freedom of the input signal. This allows for reconstruction of signals supported in other domains (e.g., spatial) with no modification to the underlying structure except for replacing the standard temporal delay with a generalized delay, that is, a practically realizable unitary transformation for which the basis elements are eigenfunctions. Under the proposed model, the interferometric measurements are shown to be linear in the basis coefficients, thereby enabling efficient and fast recovery of the desired information. While the corresponding linear transformation has only a limited number of degrees of freedom set by the structure of the interferometer giving rise to a highly constrained sensing structure, we show that the problem of signal recovery from such measurements can still be carried out compressively. This signifies significant reduction in sample complexity without introducing any additional randomization as is typically done in prior work leveraging compressive sensing techniques. We provide performance guarantees under constrained sensing by proving that the transformation satisfies sufficient conditions for successful reconstruction of sparse signals using concentration arguments. We showcase the effectiveness of the proposed approach using simulation results, as well as actual experimental results in the context of optical modal analysis of spatial beams.

Submitted 30 June, 2017; originally announced June 2017.

arXiv:1706.06166 [pdf, other]

doi 10.1364/OE.26.005225

Compressive optical interferometry

Authors: Davood Mardani, H. Esat Kondakci, Lane Martin, Ayman F. Abouraddy, George K. Atia

Abstract: Compressive sensing (CS) combines data acquisition with compression coding to reduce the number of measurements required to reconstruct a sparse signal. In optics, this usually takes the form of projecting the field onto sequences of random spatial patterns that are selected from an appropriate random ensemble. We show here that CS can be exploited in 'native' optics hardware without introducing added components. Specifically, we show that random sub-Nyquist sampling of an interferogram helps reconstruct the field modal structure. The distribution of reduced sensing matrices corresponding to random measurements is provably incoherent and isotropic, which helps us carry out CS successfully.

Submitted 19 June, 2017; originally announced June 2017.

arXiv:1706.03860 [pdf, other]

doi 10.1109/LSP.2017.2757901

Subspace Clustering via Optimal Direction Search

Authors: Mostafa Rahmani, George Atia

Abstract: This letter presents a new spectral-clustering-based approach to the subspace clustering problem. Underpinning the proposed method is a convex program for optimal direction search, which for each data point d finds an optimal direction in the span of the data that has minimum projection on the other data points and non-vanishing projection on d. The obtained directions are subsequently leveraged to identify a neighborhood set for each data point. An alternating direction method of multipliers framework is provided to efficiently solve for the optimal directions. The proposed method is shown to notably outperform the existing subspace clustering methods, particularly for unwieldy scenarios involving high levels of noise and close subspaces, and yields the state-of-the-art results for the problem of face clustering using subspace segmentation.

Submitted 26 November, 2017; v1 submitted 12 June, 2017; originally announced June 2017.

Journal ref: IEEE Signal Processing Letters, vol. 24, no. 12, Dec. 2017

arXiv:1705.03566 [pdf, other]

doi 10.1109/LSP.2017.2723472

Spatial Random Sampling: A Structure-Preserving Data Sketching Tool

Authors: Mostafa Rahmani, George Atia

Abstract: Random column sampling is not guaranteed to yield data sketches that preserve the underlying structures of the data and may not sample sufficiently from less-populated data clusters. Also, adaptive sampling can often provide accurate low rank approximations, yet may fall short of producing descriptive data sketches, especially when the cluster centers are linearly dependent. Motivated by that, this paper introduces a novel randomized column sampling tool dubbed Spatial Random Sampling (SRS), in which data points are sampled based on their proximity to randomly sampled points on the unit sphere. The most compelling feature of SRS is that the corresponding probability of sampling from a given data cluster is proportional to the surface area the cluster occupies on the unit sphere, independently from the size of the cluster population. Although it is fully randomized, SRS is shown to provide descriptive and balanced data representations. The proposed idea addresses a pressing need in data science and holds potential to inspire many novel approaches for analysis of big data.

Submitted 12 July, 2017; v1 submitted 9 May, 2017; originally announced May 2017.
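
The sampling rule described in the abstract above can be illustrated with a minimal, standard-library Python sketch. It is not the authors' implementation. The abstract only says that points are sampled "based on their proximity to randomly sampled points on the unit sphere"; measuring proximity by the largest inner product with the normalized data, sampling with replacement, and the names `spatial_random_sample` and `make_clusters` with their toy sizes are all assumptions made for illustration.

```python
import math
import random


def unit(v):
    """Scale v to unit Euclidean norm."""
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]


def spatial_random_sample(points, n_samples, rng):
    """Return indices of the data points closest to random directions.

    Each draw picks a uniformly random direction on the unit sphere and
    selects the (normalized) data point with the largest inner product.
    Indices may repeat because draws are independent (an assumption).
    """
    dim = len(points[0])
    normalized = [unit(p) for p in points]
    chosen = []
    for _ in range(n_samples):
        # A normalized Gaussian vector is uniform on the unit sphere.
        direction = unit([rng.gauss(0, 1) for _ in range(dim)])
        best = max(
            range(len(normalized)),
            key=lambda i: sum(a * b for a, b in zip(normalized[i], direction)),
        )
        chosen.append(best)
    return chosen


def make_clusters(rng, n_big=500, n_small=20, noise=0.05):
    """Toy data: a large cluster near e1 and a small cluster near e2 in 3-D."""
    centers = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
    points, labels = [], []
    for label, count in enumerate([n_big, n_small]):
        for _ in range(count):
            points.append([c + rng.gauss(0, noise) for c in centers[label]])
            labels.append(label)
    return points, labels


rng = random.Random(1)
points, labels = make_clusters(rng)
indices = spatial_random_sample(points, 400, rng)
small_share = sum(labels[i] == 1 for i in indices) / len(indices)
print(f"population share of small cluster: {labels.count(1) / len(labels):.3f}")
print(f"sampled share of small cluster:    {small_share:.3f}")
```

In this toy setup the two clusters sit in roughly symmetric positions on the sphere, so the small cluster should receive a share of the draws far above its share of the population, in line with the balance property the abstract claims.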

arXiv:1702.01847 [pdf, other]

doi 10.1109/JSTSP.2018.2876604

Low Rank Matrix Recovery with Simultaneous Presence of Outliers and Sparse Corruption

Authors: Mostafa Rahmani, George Atia

Abstract: We study a data model in which the data matrix D can be expressed as D = L + S + C, where L is a low rank matrix, S an element-wise sparse matrix and C a matrix whose non-zero columns are outlying data points. To date, robust PCA algorithms have solely considered models with either S or C, but not both. As such, existing algorithms cannot account for simultaneous element-wise and column-wise corruptions. In this paper, a new robust PCA algorithm that is robust to simultaneous types of corruption is proposed. Our approach hinges on the sparse approximation of a sparsely corrupted column so that the sparse expansion of a column with respect to the other data points is used to distinguish a sparsely corrupted inlier column from an outlying data point. We also develop a randomized design which provides a scalable implementation of the proposed approach. The core idea of sparse approximation is analyzed analytically where we show that the underlying $\ell_1$-norm minimization can obtain the representation of an inlier in presence of sparse corruptions.

Submitted 6 February, 2017; originally announced February 2017.

arXiv:1611.05977 [pdf, other]

Robust and Scalable Column/Row Sampling from Corrupted Big Data

Authors: Mostafa Rahmani, George Atia

Abstract: Conventional sampling techniques fall short of drawing descriptive sketches of the data when the data is grossly corrupted as such corruptions break the low rank structure required for them to perform satisfactorily. In this paper, we present new sampling algorithms which can locate the informative columns in presence of severe data corruptions. In addition, we develop new scalable randomized designs of the proposed algorithms. The proposed approach is simultaneously robust to sparse corruption and outliers and substantially outperforms the state-of-the-art robust sampling algorithms as demonstrated by experiments conducted using both real and synthetic data.

Submitted 18 November, 2016; originally announced November 2016.

arXiv:1609.04789 [pdf, other]

doi 10.1109/TSP.2017.2749215

Coherence Pursuit: Fast, Simple, and Robust Principal Component Analysis

Authors: Mostafa Rahmani, George Atia

Abstract: This paper presents a remarkably simple, yet powerful, algorithm termed Coherence Pursuit (CoP) to robust Principal Component Analysis (PCA). As inliers lie in a low dimensional subspace and are mostly correlated, an inlier is likely to have strong mutual coherence with a large number of data points. By contrast, outliers either do not admit low dimensional structures or form small clusters. In either case, an outlier is unlikely to bear strong resemblance to a large number of data points. Given that, CoP sets an outlier apart from an inlier by comparing their coherence with the rest of the data points. The mutual coherences are computed by forming the Gram matrix of the normalized data points. Subsequently, the sought subspace is recovered from the span of the subset of the data points that exhibit strong coherence with the rest of the data. As CoP only involves one simple matrix multiplication, it is significantly faster than the state-of-the-art robust PCA algorithms. We derive analytical performance guarantees for CoP under different models for the distributions of inliers and outliers in both noise-free and noisy settings. CoP is the first robust PCA algorithm that is simultaneously non-iterative, provably robust to both unstructured and structured outliers, and can tolerate a large number of unstructured outliers.

Submitted 12 July, 2017; v1 submitted 15 September, 2016; originally announced September 2016.

Journal ref: IEEE Transactions on Signal Processing, vol. 65, no. 23, Dec. 1, 2017
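
The steps named in the Coherence Pursuit abstract (normalize the data, form the Gram matrix, rank points by coherence, recover the subspace from the most coherent points) can be illustrated with a minimal, standard-library Python sketch. It is not the authors' implementation. Aggregating each Gram-matrix row by its ℓ2 norm, using Gram-Schmidt to obtain a basis for the span of the selected points, assuming the rank is known, and the names `coherence_scores`, `span_basis`, `make_data` with their noise-free toy data are all assumptions made for illustration.

```python
import math
import random


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def normalize(v):
    n = math.sqrt(dot(v, v))
    return [x / n for x in v]


def coherence_scores(points):
    """l2 norm of each row of the Gram matrix of normalized points.

    The diagonal (self-coherence) is excluded. The choice of the l2 norm
    as the aggregate is an assumption of this sketch.
    """
    normalized = [normalize(p) for p in points]
    scores = []
    for i, u in enumerate(normalized):
        total = sum(dot(u, v) ** 2 for j, v in enumerate(normalized) if j != i)
        scores.append(math.sqrt(total))
    return scores


def residual(v, basis):
    """Component of v orthogonal to the span of an orthonormal basis."""
    w = list(v)
    for b in basis:
        c = dot(w, b)
        w = [x - c * y for x, y in zip(w, b)]
    return w


def span_basis(vectors, rank, tol=1e-8):
    """Gram-Schmidt: orthonormal basis with at most `rank` directions."""
    basis = []
    for v in vectors:
        w = residual(v, basis)
        n = math.sqrt(dot(w, w))
        if n > tol:
            basis.append([x / n for x in w])
        if len(basis) == rank:
            break
    return basis


def make_data(seed=0, dim=30, rank=2, n_in=60, n_out=15):
    """Toy data: inliers in a random rank-2 subspace first, then outliers."""
    rng = random.Random(seed)
    directions = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(rank)]
    inliers = []
    for _ in range(n_in):
        coeffs = [rng.gauss(0, 1) for _ in range(rank)]
        inliers.append(
            [sum(c * d[k] for c, d in zip(coeffs, directions)) for k in range(dim)]
        )
    outliers = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(n_out)]
    return inliers + outliers


points = make_data()
scores = coherence_scores(points)
# Keep the 10 most coherent points and span the subspace with them.
top = sorted(range(len(points)), key=lambda i: scores[i], reverse=True)[:10]
basis = span_basis([normalize(points[i]) for i in top], rank=2)
```

In this noise-free toy example the inliers should dominate the top of the ranking, so the basis built from them should leave the inliers with a near-zero residual while the outliers keep most of their norm. Noisy data would call for a more robust span estimate than plain Gram-Schmidt.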

arXiv:1605.04380 [pdf, other]

Sparsity-Based Error Detection in DC Power Flow State Estimation

Authors: M. Hadi Amini, Mostafa Rahmani, Kianoosh G. Boroojeni, George Atia, S. S. Iyengar, Orkun Karabasoglu

Abstract: This paper presents a new approach for identifying the measurement error in the DC power flow state estimation problem. The proposed algorithm exploits the singularity of the impedance matrix and the sparsity of the error vector by posing the DC power flow problem as a sparse vector recovery problem that leverages the structure of the power system and uses $l_1$-norm minimization for state estimation. This approach can provably compute the measurement errors exactly, and its performance is robust to the arbitrary magnitudes of the measurement errors. Hence, the proposed approach can detect the noisy elements if the measurements are contaminated with additive white Gaussian noise plus sparse noise with large magnitude. The effectiveness of the proposed sparsity-based decomposition-DC power flow approach is demonstrated on the IEEE 118-bus and 300-bus test systems.

Submitted 26 August, 2016; v1 submitted 14 May, 2016; originally announced May 2016.

arXiv:1512.00907 [pdf, other]

doi 10.1109/TSP.2017.2749206

Innovation Pursuit: A New Approach to Subspace Clustering

Authors: Mostafa Rahmani, George Atia

Abstract: In subspace clustering, a group of data points belonging to a union of subspaces are assigned membership to their respective subspaces. This paper presents a new approach dubbed Innovation Pursuit (iPursuit) to the problem of subspace clustering using a new geometrical idea whereby subspaces are identified based on their relative novelties. We present two frameworks in which the idea of innovation pursuit is used to distinguish the subspaces. Underlying the first framework is an iterative method that finds the subspaces consecutively by solving a series of simple linear optimization problems, each searching for a direction of innovation in the span of the data potentially orthogonal to all subspaces except for the one to be identified in one step of the algorithm. A detailed mathematical analysis is provided establishing sufficient conditions for iPursuit to correctly cluster the data. The proposed approach can provably yield exact clustering even when the subspaces have significant intersections. It is shown that the complexity of the iterative approach scales only linearly in the number of data points and subspaces, and quadratically in the dimension of the subspaces. The second framework integrates iPursuit with spectral clustering to yield a new variant of spectral-clustering-based algorithms. The numerical simulations with both real and synthetic data demonstrate that iPursuit can often outperform the state-of-the-art subspace clustering algorithms, more so for subspaces with significant intersections, and that it significantly improves the state-of-the-art result for subspace-segmentation-based face clustering.

Submitted 26 November, 2017; v1 submitted 2 December, 2015; originally announced December 2015.

Journal ref: IEEE Transactions on Signal Processing, vol. 65, no. 23, Dec. 1, 2017

arXiv:1505.05901 [pdf, other]

doi 10.1109/TSP.2016.2645515

Randomized Robust Subspace Recovery for High Dimensional Data Matrices

Authors: Mostafa Rahmani, George Atia

Abstract: This paper explores and analyzes two randomized designs for robust Principal Component Analysis (PCA) employing low-dimensional data sketching. In one design, a data sketch is constructed using random column sampling followed by low dimensional embedding, while in the other, sketching is based on random column and row sampling. Both designs are shown to bring about substantial savings in complexity and memory requirements for robust subspace learning over conventional approaches that use the full scale data. A characterization of the sample and computational complexity of both designs is derived in the context of two distinct outlier models, namely, sparse and independent outlier models. The proposed randomized approach can provably recover the correct subspace with computational and sample complexity that are almost independent of the size of the data. The results of the mathematical analysis are confirmed through numerical simulations using both synthetic and real data.

Submitted 8 April, 2016; v1 submitted 21 May, 2015; originally announced May 2015.

Journal ref: IEEE Transactions on Signal Processing, vol. 65, no. 6, March 15, 2017

arXiv:1502.00182 [pdf, other]

doi 10.1109/TSP.2017.2649482

High Dimensional Low Rank plus Sparse Matrix Decomposition

Authors: Mostafa Rahmani, George Atia

Abstract: This paper is concerned with the problem of low rank plus sparse matrix decomposition for big data. Conventional algorithms for matrix decomposition use the entire data to extract the low-rank and sparse components, and are based on optimization problems with complexity that scales with the dimension of the data, which limits their scalability. Furthermore, existing randomized approaches mostly rely on uniform random sampling, which is quite inefficient for many real world data matrices that exhibit additional structures (e.g. clustering). In this paper, a scalable subspace-pursuit approach that transforms the decomposition problem to a subspace learning problem is proposed. The decomposition is carried out using a small data sketch formed from sampled columns/rows. Even when the data is sampled uniformly at random, it is shown that the sufficient number of sampled columns/rows is roughly O(rμ), where μ is the coherency parameter and r the rank of the low rank component. In addition, adaptive sampling algorithms are proposed to address the problem of column/row sampling from structured data. We provide an analysis of the proposed method with adaptive sampling and show that adaptive sampling makes the required number of sampled columns/rows invariant to the distribution of the data. The proposed approach is amenable to online implementation and an online scheme is proposed.

Submitted 16 March, 2017; v1 submitted 31 January, 2015; originally announced February 2015.

Comments: IEEE Transactions on Signal Processing

arXiv:1411.0622 [pdf, other]

A Subspace Method for Array Covariance Matrix Estimation

Authors: Mostafa Rahmani, George Atia

Abstract: This paper introduces a subspace method for the estimation of an array covariance matrix. It is shown that when the received signals are uncorrelated, the true array covariance matrices lie in a specific subspace whose dimension is typically much smaller than the dimension of the full space. Based on this idea, a subspace based covariance matrix estimator is proposed. The estimator is obtained as a solution to a semi-definite convex optimization problem. While the optimization problem has no closed-form solution, a nearly optimal closed-form solution is proposed making it easy to implement. In comparison to the conventional approaches, the proposed method yields higher estimation accuracy because it eliminates the estimation error which does not lie in the subspace of the true covariance matrices. The numerical examples indicate that the proposed covariance matrix estimator can significantly improve the estimation quality of the covariance matrix.

Submitted 19 October, 2014; originally announced November 2014.

Comments: 5 pages, 4 figures

arXiv:1404.3152 [pdf, other]

doi 10.1109/LSP.2014.2352116

Change Detection with Compressive Measurements

Authors: George Atia

Abstract: Quickest change point detection is concerned with the detection of statistical change(s) in sequences while minimizing the detection delay subject to false alarm constraints. In this paper, the problem of change point detection is studied when the decision maker only has access to compressive measurements. First, an expression for the average detection delay of Shiryaev's procedure with compressive measurements is derived in the asymptotic regime where the probability of false alarm goes to zero. Second, the dependence of the delay on the compression ratio and the signal to noise ratio is explicitly quantified. The ratio of delays with and without compression is studied under various sensing matrix constructions, including Gaussian ensembles and random projections. For a target ratio of the delays after and before compression, a sufficient condition on the number of measurements required to meet this objective with prespecified probability is derived.

Submitted 11 April, 2014; originally announced April 2014.

arXiv:1304.0682 [pdf, other]

doi 10.1109/TIT.2016.2605122

Sparse Signal Processing with Linear and Nonlinear Observations: A Unified Shannon-Theoretic Approach

Authors: Cem Aksoylar, George Atia, Venkatesh Saligrama

Abstract: We derive fundamental sample complexity bounds for recovering sparse and structured signals for linear and nonlinear observation models including sparse regression, group testing, multivariate regression and problems with missing features. In general, sparse signal processing problems can be characterized in terms of the following Markovian property. We are given a set of $N$ variables $X_1,X_2,\ldots,X_N$, and there is an unknown subset of variables $S \subset \{1,\ldots,N\}$ that are relevant for predicting outcomes $Y$. More specifically, when $Y$ is conditioned on $\{X_n\}_{n\in S}$ it is conditionally independent of the other variables, $\{X_n\}_{n \not \in S}$. Our goal is to identify the set $S$ from samples of the variables $X$ and the associated outcomes $Y$. We characterize this problem as a version of the noisy channel coding problem. Using asymptotic information theoretic analyses, we establish mutual information formulas that provide sufficient and necessary conditions on the number of samples required to successfully recover the salient variables. These mutual information expressions unify conditions for both linear and nonlinear observations. We then compute sample complexity bounds for the aforementioned models, based on the mutual information expressions in order to demonstrate the applicability and flexibility of our results in general sparse signal processing models.

Submitted 25 August, 2016; v1 submitted 2 April, 2013; originally announced April 2013.

Comments: Final version submitted to Trans. IT

arXiv:1210.5454 [pdf, ps, other]

Stuck in Traffic (SiT) Attacks: A Framework for Identifying Stealthy Attacks that Cause Traffic Congestion

Authors: Mina Guirguis, George Atia

Abstract: Recent advances in wireless technologies have enabled many new applications in Intelligent Transportation Systems (ITS) such as collision avoidance, cooperative driving, congestion avoidance, and traffic optimization. Due to the vulnerable nature of wireless communication against interference and intentional jamming, ITS face new challenges to ensure the reliability and the safety of the overall system. In this paper, we expose a class of stealthy attacks -- Stuck in Traffic (SiT) attacks -- that aim to cause congestion by exploiting how drivers make decisions based on smart traffic signs. An attacker mounting a SiT attack solves a Markov Decision Process problem to find optimal/suboptimal attack policies in which he/she interferes with a well-chosen subset of signals that are based on the state of the system. We apply Approximate Policy Iteration (API) algorithms to derive potent attack policies. We evaluate their performance on a number of systems and compare them to other attack policies including random, myopic and DoS attack policies. The generated policies, albeit suboptimal, are shown to significantly outperform other attack policies as they maximize the expected cumulative reward from the standpoint of the attacker.

Submitted 19 October, 2012; originally announced October 2012.

arXiv:1205.0858 [pdf, ps, other]

Controlled Sensing for Multihypothesis Testing

Authors: Sirin Nitinawarat, George Atia, Venugopal V. Veeravalli

Abstract: The problem of multiple hypothesis testing with observation control is considered in both fixed sample size and sequential settings. In the fixed sample size setting, for binary hypothesis testing, the optimal exponent for the maximal error probability corresponds to the maximum Chernoff information over the choice of controls, and a pure stationary open-loop control policy is asymptotically optimal within the larger class of all causal control policies. For multihypothesis testing in the fixed sample size setting, lower and upper bounds on the optimal error exponent are derived. It is also shown through an example with three hypotheses that the optimal causal control policy can be strictly better than the optimal open-loop control policy. In the sequential setting, a test based on earlier work by Chernoff for binary hypothesis testing is shown to be first-order asymptotically optimal for multihypothesis testing in a strong sense, using the notion of decision making risk in place of the overall probability of error. Another test is also designed to meet hard risk constraints while retaining asymptotic optimality. The role of past information and randomization in designing optimal control policies is discussed.

Submitted 4 September, 2013; v1 submitted 4 May, 2012; originally announced May 2012.

Comments: To appear in the Transactions on Automatic Control

arXiv:1009.3167 [pdf, ps, other]

doi 10.1109/TSP.2011.2159496

Sensor Management for Tracking in Sensor Networks

Authors: Jason A. Fuemmeler, George K. Atia, Venugopal V. Veeravalli

Abstract: We study the problem of tracking an object moving through a network of wireless sensors. In order to conserve energy, the sensors may be put into a sleep mode with a timer that determines their sleep duration. It is assumed that an asleep sensor cannot be communicated with or woken up, and hence the sleep duration needs to be determined at the time the sensor goes to sleep based on all the information available to the sensor. Having sleeping sensors in the network could result in degraded tracking performance; therefore, there is a tradeoff between energy usage and tracking performance. We design sleeping policies that attempt to optimize this tradeoff and characterize their performance. As an extension to our previous work in this area [1], we consider generalized models for object movement, object sensing, and tracking cost. For discrete state spaces and continuous Gaussian observations, we derive a lower bound on the optimal energy-tracking tradeoff. It is shown that in the low tracking error regime, the generated policies approach the derived lower bound.

Submitted 15 September, 2010; originally announced September 2010.

Journal ref: IEEE Trans.Sign.Proc. 59 (2011) 4354-4366

arXiv:1009.2997 [pdf, ps, other]

doi 10.1109/TSP.2011.2160055

Sensor Scheduling for Energy-Efficient Target Tracking in Sensor Networks

Authors: George K. Atia, Venugopal V. Veeravalli, Jason A. Fuemmeler

Abstract: In this paper we study the problem of tracking an object moving randomly through a network of wireless sensors. Our objective is to devise strategies for scheduling the sensors to optimize the tradeoff between tracking performance and energy consumption. We cast the scheduling problem as a Partially Observable Markov Decision Process (POMDP), where the control actions correspond to the set of sensors to activate at each time step. Using a bottom-up approach, we consider different sensing, motion and cost models with increasing levels of difficulty. At the first level, the sensing regions of the different sensors do not overlap and the target is only observed within the sensing range of an active sensor. Then, we consider sensors with overlapping sensing range such that the tracking error, and hence the actions of the different sensors, are tightly coupled. Finally, we consider scenarios wherein the target locations and sensors' observations assume values on continuous spaces. Exact solutions are generally intractable even for the simplest models due to the dimensionality of the information and action spaces. Hence, we devise approximate solution techniques, and in some cases derive lower bounds on the optimal tradeoff curves. The generated scheduling policies, albeit suboptimal, often provide close-to-optimal energy-tracking tradeoffs.

Submitted 15 September, 2010; originally announced September 2010.

Journal ref: IEEE Trans.Sign.Proc. 59 (2011) 4923-4937

arXiv:0907.1061 [pdf, ps, other]

doi 10.1109/TIT.2011.2178156

Boolean Compressed Sensing and Noisy Group Testing

Authors: George Kamal Atia, Venkatesh Saligrama

Abstract: The fundamental task of group testing is to recover a small distinguished subset of items from a large population while efficiently reducing the total number of tests (measurements). The key contribution of this paper is in adopting a new information-theoretic perspective on group testing problems. We formulate the group testing problem as a channel coding/decoding problem and derive a single-letter characterization for the total number of tests used to identify the defective set. Although the focus of this paper is primarily on group testing, our main result is generally applicable to other compressive sensing models. The single letter characterization is shown to be order-wise tight for many interesting noisy group testing scenarios. Specifically, we consider an additive Bernoulli($q$) noise model where we show that, for $N$ items and $K$ defectives, the number of tests $T$ is $O(\frac{K\log N}{1-q})$ for arbitrarily small average error probability and $O(\frac{K^2\log N}{1-q})$ for a worst case error criterion. We also consider dilution effects whereby a defective item in a positive pool might get diluted with probability $u$ and potentially missed. In this case, it is shown that $T$ is $O(\frac{K\log N}{(1-u)^2})$ and $O(\frac{K^2\log N}{(1-u)^2})$ for the average and the worst case error criteria, respectively. Furthermore, our bounds allow us to verify existing known bounds for noiseless group testing including the deterministic noise-free case and approximate reconstruction with bounded distortion. Our proof of achievability is based on random coding and the analysis of a Maximum Likelihood Detector, and our information theoretic lower bound is based on Fano's inequality.

Submitted 10 December, 2013; v1 submitted 6 July, 2009; originally announced July 2009.

Comments: In this revision: reorganized the paper, added citations to related work, and fixed some bugs

Journal ref: IEEE Trans.Inf.Theory 58 (2012) 1880-1901

Showing 1–43 of 43 results for author: Atia, G