Search | arXiv e-print repository

Interpretable Data Fusion for Distributed Learning: A Representative Approach via Gradient Matching

Authors: Mengchen Fan, Baocheng Geng, Keren Li, Xueqian Wang, Pramod K. Varshney

Abstract: This paper introduces a representative-based approach for distributed learning that transforms multiple raw data points into a virtual representation. Unlike traditional distributed learning methods such as Federated Learning, which do not offer human interpretability, our method makes complex machine learning processes accessible and comprehensible. It achieves this by condensing extensive datasets into digestible formats, thus fostering intuitive human-machine interactions. Additionally, this approach maintains privacy and communication efficiency, and it matches the training performance of models using raw data. Simulation results show that our approach is competitive with or outperforms traditional Federated Learning in accuracy and convergence, especially in scenarios with complex models and a higher number of clients. This framework marks a step forward in integrating human intuition with machine intelligence, which potentially enhances human-machine learning interfaces and collaborative efforts.

Submitted 6 May, 2024; originally announced May 2024.

arXiv:2404.10356 [pdf, other]

Generating Counterfactual Trajectories with Latent Diffusion Models for Concept Discovery

Authors: Payal Varshney, Adriano Lucieri, Christoph Balada, Andreas Dengel, Sheraz Ahmed

Abstract: Trustworthiness is a major prerequisite for the safe application of opaque deep learning models in high-stakes domains like medicine. Understanding the decision-making process not only contributes to fostering trust but might also reveal previously unknown decision criteria of complex models that could advance the state of medical research. The discovery of decision-relevant concepts from black box models is a particularly challenging task. This study proposes Concept Discovery through Latent Diffusion-based Counterfactual Trajectories (CDCT), a novel three-step framework for concept discovery leveraging the superior image synthesis capabilities of diffusion models. In the first step, CDCT uses a Latent Diffusion Model (LDM) to generate a counterfactual trajectory dataset. This dataset is used to derive a disentangled representation of classification-relevant concepts using a Variational Autoencoder (VAE). Finally, a search algorithm is applied to identify relevant concepts in the disentangled latent space. The application of CDCT to a classifier trained on the largest public skin lesion dataset revealed not only the presence of several biases but also meaningful biomarkers. Moreover, the counterfactuals generated within CDCT show better FID scores than those produced by a previously established state-of-the-art method, while being 12 times more resource-efficient. Unsupervised concept discovery holds great potential for the application of trustworthy AI and the further development of human knowledge in various domains. CDCT represents a further step in this direction.

Submitted 16 April, 2024; originally announced April 2024.

Comments: Submitted to International Conference on Pattern Recognition (ICPR) 2024

arXiv:2404.09453 [pdf, other]

Towards Greener Nights: Exploring AI-Driven Solutions for Light Pollution Management

Authors: Paras Varshney, Niral Desai, Uzair Ahmed

Abstract: This research endeavors to address the pervasive issue of light pollution through an interdisciplinary approach, leveraging data science and machine learning techniques. By analyzing extensive datasets and research findings, we aim to develop predictive models capable of estimating the degree of sky glow observed in various locations and times. Our research seeks to inform evidence-based interventions and promote responsible outdoor lighting practices to mitigate the adverse impacts of light pollution on ecosystems, energy consumption, and human well-being.

Submitted 15 April, 2024; originally announced April 2024.

arXiv:2404.05993 [pdf, other]

AEGIS: Online Adaptive AI Content Safety Moderation with Ensemble of LLM Experts

Authors: Shaona Ghosh, Prasoon Varshney, Erick Galinkin, Christopher Parisien

Abstract: As Large Language Models (LLMs) and generative AI become more widespread, the content safety risks associated with their use also increase. We find a notable deficiency in high-quality content safety datasets and benchmarks that comprehensively cover a wide range of critical safety areas. To address this, we define a broad content safety risk taxonomy, comprising 13 critical risk and 9 sparse risk categories. Additionally, we curate AEGISSAFETYDATASET, a new dataset of approximately 26,000 human-LLM interaction instances, complete with human annotations adhering to the taxonomy. We plan to release this dataset to the community to further research and to help benchmark LLM models for safety. To demonstrate the effectiveness of the dataset, we instruction-tune multiple LLM-based safety models. We show that our models (named AEGISSAFETYEXPERTS) not only surpass or perform competitively with the state-of-the-art LLM-based safety models and general purpose LLMs, but also exhibit robustness across multiple jail-break attack categories. We also show how using AEGISSAFETYDATASET during the LLM alignment phase does not negatively impact the performance of the aligned models on MT Bench scores. Furthermore, we propose AEGIS, a novel application of a no-regret online adaptation framework with strong theoretical guarantees, to perform content moderation with an ensemble of LLM content safety experts in deployment.

Submitted 8 April, 2024; originally announced April 2024.

arXiv:2312.00088 [pdf, ps, other]

Anomaly Detection via Learning-Based Sequential Controlled Sensing

Authors: Geethu Joseph, Chen Zhong, M. Cenk Gursoy, Senem Velipasalar, Pramod K. Varshney

Abstract: In this paper, we address the problem of detecting anomalies among a given set of binary processes via learning-based controlled sensing. Each process is parameterized by a binary random variable indicating whether the process is anomalous. To identify the anomalies, the decision-making agent is allowed to observe a subset of the processes at each time instant. Also, probing each process has an associated cost. Our objective is to design a sequential selection policy that dynamically determines which processes to observe at each time with the goal to minimize the delay in making the decision and the total sensing cost. We cast this problem as a sequential hypothesis testing problem within the framework of Markov decision processes. This formulation utilizes both a Bayesian log-likelihood ratio-based reward and an entropy-based reward. The problem is then solved using two approaches: 1) a deep reinforcement learning-based approach where we design both deep Q-learning and policy gradient actor-critic algorithms; and 2) a deep active inference-based approach. Using numerical experiments, we demonstrate the efficacy of our algorithms and show that our algorithms adapt to any unknown statistical dependence pattern of the processes.

Submitted 30 November, 2023; originally announced December 2023.

arXiv:2309.07855 [pdf, other]

On Distributed and Asynchronous Sampling of Gaussian Processes for Sequential Binary Hypothesis Testing

Authors: Nandan Sriranga, Saikiran Bulusu, Baocheng Geng, Pramod K. Varshney

Abstract: In this work, we consider a binary sequential hypothesis testing problem with distributed and asynchronous measurements. The aim is to analyze the effect of sampling times of jointly $\textit{wide-sense stationary}$ (WSS) Gaussian observation processes at distributed sensors on the expected stopping time of the sequential test at the fusion center (FC). The distributed system is such that the sensors and the FC sample observations periodically, where the sampling times are not necessarily synchronous, i.e., the sampling times at different sensors and the FC may be different from each other. The sampling times, however, are restricted to be within a time window and a sample obtained within the window is assumed to be $\textit{uncorrelated}$ with samples outside the window. We also assume that correlations may exist only between the observations sampled at the FC and those at the sensors in a pairwise manner (sensor pairs not including the FC have independent observations). The effect of $\textit{asynchronous}$ sampling on the SPRT performance is analyzed by obtaining bounds for the expected stopping time. We illustrate the validity of the theoretical results with numerical results.

Submitted 10 October, 2023; v1 submitted 14 September, 2023; originally announced September 2023.

Comments: 7 pages, 3 figures
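
For readers unfamiliar with the SPRT named in the abstract above, the following is a minimal sketch of Wald's classical sequential probability ratio test, not the paper's method. It assumes i.i.d. Gaussian samples with known variance, whereas the paper studies correlated, asynchronously sampled processes. The function name `sprt_gaussian` and all parameters are hypothetical choices for illustration.

```python
import math


def sprt_gaussian(samples, mu0, mu1, sigma, alpha, beta):
    """Wald's SPRT for H0: mean = mu0 versus H1: mean = mu1.

    Assumes i.i.d. Gaussian samples with known standard deviation sigma
    (an illustrative simplification). alpha and beta are the target
    false-alarm and miss probabilities.
    Returns (decision, n_used); decision is 1, 0, or None if the
    samples run out before a threshold is crossed.
    """
    upper = math.log((1 - beta) / alpha)   # cross this: accept H1
    lower = math.log(beta / (1 - alpha))   # cross this: accept H0
    llr = 0.0
    n = 0
    for n, x in enumerate(samples, start=1):
        # Log-likelihood ratio increment of one Gaussian sample.
        llr += (mu1 - mu0) * (x - 0.5 * (mu0 + mu1)) / sigma**2
        if llr >= upper:
            return 1, n
        if llr <= lower:
            return 0, n
    return None, n


if __name__ == "__main__":
    decision, stopping_time = sprt_gaussian(
        [1.0] * 10, mu0=0.0, mu1=1.0, sigma=1.0, alpha=0.05, beta=0.05
    )
    print(decision, stopping_time)
```

The second return value is the stopping time of the test; its expectation is the quantity the abstract bounds under asynchronous sampling.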

arXiv:2306.15135 [pdf, other]

On Gibbs Sampling Architecture for Labeled Random Finite Sets Multi-Object Tracking

Authors: Anthony Trezza, Donald J. Bucci Jr., Pramod K. Varshney

Abstract: Gibbs sampling is one of the most popular Markov chain Monte Carlo algorithms because of its simplicity, scalability, and wide applicability within many fields of statistics, science, and engineering. In the labeled random finite sets literature, Gibbs sampling procedures have recently been applied to efficiently truncate the single-sensor and multi-sensor $δ$-generalized labeled multi-Bernoulli posterior density as well as the multi-sensor adaptive labeled multi-Bernoulli birth distribution. However, only a limited discussion has been provided regarding key Gibbs sampler architecture details including the Markov chain Monte Carlo sample generation technique and early termination criteria. This paper begins with a brief background on Markov chain Monte Carlo methods and a review of the Gibbs sampler implementations proposed for labeled random finite sets filters. Next, we propose a short chain, multi-simulation sample generation technique that is well suited for these applications and enables a parallel processing implementation. Additionally, we present two heuristic early termination criteria that achieve similar sampling performance with substantially fewer Markov chain observations. Finally, the benefits of the proposed Gibbs samplers are demonstrated via two Monte Carlo simulations.

Submitted 26 June, 2023; originally announced June 2023.

Comments: Accepted to the 2023 Proc. IEEE 26th Int. Conf. Inf. Fusion

arXiv:2304.14467 [pdf, other]

Distributed Quantized Detection of Sparse Signals Under Byzantine Attacks

Authors: Chen Quan, Yunghsiang S. Han, Baocheng Geng, Pramod K. Varshney

Abstract: This paper investigates distributed detection of sparse stochastic signals with quantized measurements under Byzantine attacks. Under this type of attack, sensors in the networks might send falsified data to degrade system performance. The Bernoulli-Gaussian (BG) distribution in terms of the sparsity degree of the stochastic signal is utilized for modeling the sparsity of signals. Several detectors with improved detection performance are proposed by incorporating the estimated attack parameters into the detection process. First, we propose the generalized likelihood ratio test with reference sensors (GLRTRS) and the locally most powerful test with reference sensors (LMPTRS) detectors with adaptive thresholds, given that the sparsity degree and the attack parameters are unknown. Our simulation results show that the LMPTRS and GLRTRS detectors outperform the LMPT and GLRT detectors proposed for an attack-free environment and are more robust against attacks. The proposed detectors can achieve detection performance close to that of the benchmark likelihood ratio test (LRT) detector, which has perfect knowledge of the attack parameters and sparsity degree. When the fraction of Byzantine nodes is assumed to be known, we can further improve the system's detection performance. We propose the enhanced LMPTRS (E-LMPTRS) and enhanced GLRTRS (E-GLRTRS) detectors by filtering out potential malicious sensors with the knowledge of the fraction of Byzantine nodes in the network. Simulation results show the superiority of the proposed enhanced detectors over the LMPTRS and GLRTRS detectors.

Submitted 27 April, 2023; originally announced April 2023.

arXiv:2304.00721 [pdf]

doi 10.1016/j.inffus.2024.102240.

COMIC: An Unsupervised Change Detection Method for Heterogeneous Remote Sensing Images Based on Copula Mixtures and Cycle-Consistent Adversarial Networks

Authors: Chengxi Li, Gang Li, Zhuoyue Wang, Xueqian Wang, Pramod K. Varshney

Abstract: In this paper, we consider the problem of change detection (CD) with two heterogeneous remote sensing (RS) images. For this problem, an unsupervised change detection method has been proposed recently based on the image translation technique of Cycle-Consistent Adversarial Networks (CycleGANs), where one image is translated from its original modality to the modality of the other image so that the difference map can be obtained by performing arithmetical subtraction. However, the difference map derived from subtraction is susceptible to image translation errors, in which case the changed area and the unchanged area are less distinguishable. To overcome the above shortcoming, we propose a new unsupervised copula mixture and CycleGAN-based CD method (COMIC), which combines the advantages of copula mixtures on statistical modeling and the advantages of CycleGANs on data mining. In COMIC, the pre-event image is first translated from its original modality to the post-event image modality. After that, by constructing a copula mixture, the joint distribution of the features from the heterogeneous images can be learnt according to quantitative analysis of the dependence structure based on the translated image and the original pre-event image, which are of the same modality and contain exactly the same objects. Then, we model the CD problem as a binary hypothesis testing problem and derive its test statistics based on the constructed copula mixture. Finally, the difference map can be obtained from the test statistics and the binary change map (BCM) is generated by K-means clustering. We perform experiments on real RS datasets, which demonstrate the superiority of COMIC over the state-of-the-art methods.

Submitted 1 February, 2024; v1 submitted 3 April, 2023; originally announced April 2023.

Journal ref: Published in Information Fusion, Volume 106, 2024, 102240

arXiv:2303.16555 [pdf, other]

On Communication-Efficient Multisensor Track Association via Measurement Transformation (Extended Version)

Authors: Haiqi Liu, Jiajie Sun, Xuqi Zhang, Fanqin Meng, Xiaojing Shen, Pramod K. Varshney

Abstract: Multisensor track-to-track fusion for target tracking involves two primary operations: track association and estimation fusion. For estimation fusion, lossless measurement transformation of sensor measurements has been proposed for single target tracking. In this paper, we investigate track association, which is a fundamental and important problem for multitarget tracking. First, since the optimal track association problem is a multi-dimensional assignment (MDA) problem, we demonstrate that MDA-based data association (with and without prior track information) using linear transformations of track measurements is lossless, and is equivalent to that using raw track measurements. Second, recent superior scalability and performance of belief propagation (BP) algorithms enable new real-time applications of multitarget tracking with resource-limited devices. Thus, we present a BP-based multisensor track association method with transformed measurements and show that it is equivalent to that with raw measurements. Third, considering communication constraints, it is more beneficial for local sensors to send compressed data. Two analytical lossless transformations for track association are provided, and it is shown that their communication requirements from each sensor to the fusion center are less than those of fusion with raw track measurements. Numerical examples for tracking an unknown number of targets verify that track association with transformed track measurements has the same performance as that with raw measurements and requires less communication bandwidth.

Submitted 31 March, 2023; v1 submitted 29 March, 2023; originally announced March 2023.

arXiv:2301.10815 [pdf, other]

Human-machine Hierarchical Networks for Decision Making under Byzantine Attacks

Authors: Chen Quan, Baocheng Geng, Yunghsiang S. Han, Pramod K. Varshney

Abstract: This paper proposes a belief-updating scheme in a human-machine collaborative decision-making network to combat Byzantine attacks. A hierarchical framework is used to realize the network where local decisions from physical sensors act as reference decisions to improve the quality of human sensor decisions. During the decision-making process, the belief that each physical sensor is malicious is updated. The case when humans have side information available is investigated, and its impact is analyzed. Simulation results substantiate that the proposed scheme can significantly improve the quality of human sensor decisions, even when most physical sensors are malicious. Moreover, the performance of the proposed method does not necessarily depend on the knowledge of the actual fraction of malicious physical sensors. Consequently, the proposed scheme can effectively defend against Byzantine attacks and improve the quality of human sensors' decisions so that the performance of the human-machine collaborative system is enhanced.

Submitted 25 January, 2023; originally announced January 2023.

arXiv:2301.07789 [pdf, other]

Loss Attitude Aware Energy Management for Signal Detection

Authors: Baocheng Geng, Chen Quan, Tianyun Zhang, Makan Fardad, Pramod K. Varshney

Abstract: This work considers a Bayesian signal processing problem where increasing the power of the probing signal may cause risks or undesired consequences. We employ a market based approach to solve energy management problems for signal detection while balancing multiple objectives. In particular, the optimal amount of resource consumption is determined so as to maximize a profit-loss based expected utility function. Next, we study the human behavior of resource consumption while taking individuals' behavioral disparity into account. Unlike rational decision makers who consume the amount of resource to maximize the expected utility function, human decision makers act to maximize their subjective utilities. We employ prospect theory to model humans' loss aversion towards a risky event. The amount of resource consumption that maximizes the humans' subjective utility is derived to characterize the actual behavior of humans. It is shown that loss attitudes may lead the human to behave quite differently from a rational decision maker.

Submitted 18 January, 2023; originally announced January 2023.

arXiv:2301.07767 [pdf, ps, other]

Sequential Processing of Observations in Human Decision-Making Systems

Authors: Nandan Sriranga, Baocheng Geng, Pramod K. Varshney

Abstract: In this work, we consider a binary hypothesis testing problem involving a group of human decision-makers. Due to the nature of human behavior, each human decision-maker observes the phenomenon of interest sequentially up to a random length of time. The humans use a belief model to accumulate the log-likelihood ratios until they cease observing the phenomenon. The belief model is used to characterize the perception of the human decision-maker towards observations at different instants of time, i.e., some decision-makers may assign greater importance to observations that were observed earlier, rather than later and vice-versa. The global decision-maker is a machine that fuses human decisions using the Chair-Varshney rule with different weights for the human decisions, where the weights are determined by the number of observations that were used by the humans to arrive at their respective decisions.

Submitted 24 January, 2023; v1 submitted 18 January, 2023; originally announced January 2023.
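
The Chair-Varshney rule mentioned in the abstract above can be illustrated with a short sketch of its classical form, in which each binary local decision is weighted by a log-ratio of the local detection and false-alarm probabilities. This is not the paper's variant: there the weights depend on the number of observations each human used, while here `pd` and `pf` are assumed to be known inputs. The function name and parameters are hypothetical.

```python
import math


def chair_varshney(decisions, pd, pf, prior_h1=0.5):
    """Classical Chair-Varshney fusion of binary local decisions.

    decisions[i] is the local decision u_i in {0, 1}; pd[i] and pf[i]
    are the (assumed known) detection and false-alarm probabilities
    of decision-maker i. Returns (global_decision, statistic).
    """
    # Prior log-odds term; zero when both hypotheses are equally likely.
    stat = math.log(prior_h1 / (1 - prior_h1))
    for u, d, f in zip(decisions, pd, pf):
        if u == 1:
            stat += math.log(d / f)              # evidence for H1
        else:
            stat += math.log((1 - d) / (1 - f))  # evidence for H0
    return (1 if stat > 0 else 0), stat


if __name__ == "__main__":
    # One reliable decision-maker says 1; two weaker ones say 0.
    print(chair_varshney([1, 0, 0], [0.9, 0.6, 0.6], [0.1, 0.4, 0.4]))
```

In this example the single reliable decision outweighs the two weaker ones, which is the behavior that distinguishes weighted fusion from a simple majority vote.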

arXiv:2301.07766 [pdf, other]

Human-Machine Collaboration for Smart Decision Making: Current Trends and Future Opportunities

Authors: Baocheng Geng, Pramod K. Varshney

Abstract: Recently, modeling of decision making and control systems that include heterogeneous smart sensing devices (machines) as well as human agents as participants is becoming an important research area due to the wide variety of applications including autonomous driving, smart manufacturing, internet of things, national security, and healthcare. To accomplish complex missions under uncertainty, it is imperative that we build novel human machine collaboration structures to integrate the cognitive strengths of humans with computational capabilities of machines in an intelligent manner. In this paper, we present an overview of the existing works on human decision making and human machine collaboration within the scope of signal processing and information fusion. We review several application areas and research domains relevant to human machine collaborative decision making. We also discuss current challenges and future directions in this problem domain.

Submitted 18 January, 2023; originally announced January 2023.

arXiv:2211.04036 [pdf, other]

Performance Analysis of LEO Satellite-Based IoT Networks in the Presence of Interference

Authors: Ayush Kumar Dwivedi, Sachin Chaudhari, Neeraj Varshney, Pramod K. Varshney

Abstract: This paper presents a star-of-star topology for internet-of-things (IoT) networks using mega low-Earth-orbit constellations. The proposed topology enables IoT users to broadcast their sensed data to multiple satellites simultaneously over a shared channel, which is then relayed to the ground station (GS) using amplify-and-forward relaying. The GS coherently combines the signals from multiple satellites using maximal ratio combining. To analyze the performance of the proposed topology in the presence of interference, a comprehensive outage probability (OP) analysis is performed, assuming imperfect channel state information at the GS. The paper employs stochastic geometry to model the random locations of satellites, making the analysis general and independent of any specific constellation. Furthermore, the paper examines successive interference cancellation (SIC) and capture model (CM)-based decoding schemes at the GS to mitigate interference. The average OP for the CM-based scheme and the OP of the best user for the SIC scheme are derived analytically. The paper also presents simplified expressions for the OP under a high signal-to-noise ratio (SNR) assumption, which are utilized to optimize the system parameters for achieving a target OP. The simulation results are consistent with the analytical expressions and provide insights into the impact of various system parameters, such as mask angle, altitude, number of satellites, and decoding order. The findings of this study demonstrate that the proposed topology can effectively leverage the benefits of multiple satellites to achieve the desired OP and enable burst transmissions without coordination among IoT users, making it an attractive choice for satellite-based IoT networks.

Submitted 2 September, 2023; v1 submitted 8 November, 2022; originally announced November 2022.

arXiv:2210.03505 [pdf, other]

Sample-Efficient Personalization: Modeling User Parameters as Low Rank Plus Sparse Components

Authors: Soumyabrata Pal, Prateek Varshney, Prateek Jain, Abhradeep Guha Thakurta, Gagan Madan, Gaurav Aggarwal, Pradeep Shenoy, Gaurav Srivastava

Abstract: Personalization of machine learning (ML) predictions for individual users/domains/enterprises is critical for practical recommendation systems. Standard personalization approaches involve learning a user/domain specific embedding that is fed into a fixed global model, which can be limiting. On the other hand, personalizing/fine-tuning the model itself for each user/domain -- a.k.a. meta-learning -- has high storage/infrastructure cost. Moreover, rigorous theoretical studies of scalable personalization approaches have been very limited. To address the above issues, we propose a novel meta-learning style approach that models network weights as a sum of low-rank and sparse components. This captures common information from multiple individuals/users together in the low-rank part while the sparse part captures user-specific idiosyncrasies. We then study the framework in the linear setting, where the problem reduces to that of estimating the sum of a rank-$r$ and a $k$-column sparse matrix using a small number of linear measurements. We propose a computationally efficient alternating minimization method with iterative hard thresholding -- AMHT-LRS -- to learn the low-rank and sparse part. Theoretically, for the realizable Gaussian data setting, we show that AMHT-LRS solves the problem efficiently with nearly optimal sample complexity. Finally, a significant challenge in personalization is ensuring privacy of each user's sensitive data. We alleviate this problem by proposing a differentially private variant of our method that is also equipped with strong generalization guarantees.

Submitted 5 September, 2023; v1 submitted 7 October, 2022; originally announced October 2022.

Comments: 104 pages, 7 figures, 2 Tables

arXiv:2207.08870 [pdf, other]

doi 10.1109/LSP.2023.3244748

Efficient Ordered-Transmission Based Distributed Detection under Data Falsification Attacks

Authors: Chen Quan, Nandan Sriranga, Haodong Yang, Yunghsiang S. Han, Baocheng Geng, Pramod K. Varshney

Abstract: In distributed detection systems, energy-efficient ordered transmission (EEOT) schemes are able to reduce the number of transmissions required to make a final decision. In this work, we investigate the effect of data falsification attacks on the performance of EEOT-based systems. We derive the probability of error for an EEOT-based system under attack and find an upper bound (UB) on the expected number of transmissions required to make the final decision. Moreover, we tighten this UB by solving an optimization problem via integer programming (IP). We also obtain the FC's optimal threshold which guarantees the optimal detection performance of the EEOT-based system. Numerical and simulation results indicate that it is possible to reduce transmissions while still ensuring the quality of the decision with an appropriately designed threshold.

Submitted 18 July, 2022; originally announced July 2022.

arXiv:2207.04686 [pdf, ps, other]

(Nearly) Optimal Private Linear Regression via Adaptive Clipping

Authors: Prateek Varshney, Abhradeep Thakurta, Prateek Jain

Abstract: We study the problem of differentially private linear regression where each data point is sampled from a fixed sub-Gaussian style distribution. We propose and analyze a one-pass mini-batch stochastic gradient descent method (DP-AMBSSGD) where points in each iteration are sampled without replacement. Noise is added for DP but the noise standard deviation is estimated online. Compared to existing $(ε, δ)$-DP techniques which have sub-optimal error bounds, DP-AMBSSGD is able to provide nearly optimal error bounds in terms of key parameters like dimensionality $d$, number of points $N$, and the standard deviation $σ$ of the noise in observations. For example, when the $d$-dimensional covariates are sampled i.i.d. from the normal distribution, then the excess error of DP-AMBSSGD due to privacy is $\frac{σ^2 d}{N}(1+\frac{d}{ε^2 N})$, i.e., the error is meaningful when the number of samples $N= Ω(d \log d)$, which is the standard operative regime for linear regression. In contrast, error bounds for existing efficient methods in this setting are $\mathcal{O}\big(\frac{d^3}{ε^2 N^2}\big)$, even for $σ=0$. That is, for constant $ε$, the existing techniques require $N=Ω(d\sqrt{d})$ to provide a non-trivial result.

Submitted 12 July, 2022; v1 submitted 11 July, 2022; originally announced July 2022.

Comments: 41 Pages, Accepted in the 35th Annual Conference on Learning Theory (COLT 2022)
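
The sample-size thresholds quoted in the abstract above follow from simple algebra on the stated bounds. A short worked derivation, treating hidden constants as 1 and reading the first term of the DP-AMBSSGD bound as the usual non-private rate (both are assumptions made only for illustration, not statements from the paper):

```latex
% Existing methods: the bound is non-trivial (at most a constant c) when
\frac{d^3}{\varepsilon^2 N^2} \le c
\iff N \ge \frac{d^{3/2}}{\varepsilon \sqrt{c}},
% so for constant \varepsilon one needs N = \Omega(d\sqrt{d}).

% DP-AMBSSGD: expand the stated excess error into two terms,
\frac{\sigma^2 d}{N}\Big(1 + \frac{d}{\varepsilon^2 N}\Big)
  = \frac{\sigma^2 d}{N} + \frac{\sigma^2 d^2}{\varepsilon^2 N^2},
% and the second term is no larger than the first once
\frac{d}{\varepsilon^2 N} \le 1 \iff N \ge \frac{d}{\varepsilon^2}.
```

Under these assumptions, for constant $ε$ the second threshold scales linearly in $d$, compared with $d\sqrt{d}$ for the existing bound.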

arXiv:2204.07212 [pdf, other]

Reputation and Audit Bit Based Distributed Detection in the Presence of Byzantine

Authors: Chen Quan, Yunghsiang S. Han, Baocheng Geng, Pramod K. Varshney

Abstract: In this paper, two reputation based algorithms called Reputation and audit based clustering (RAC) algorithm and Reputation and audit based clustering with auxiliary anchor node (RACA) algorithm are proposed to defend against Byzantine attacks in distributed detection networks when the fusion center (FC) has no prior knowledge of the attacking strategy of Byzantine nodes. By updating the reputation index of the sensors in cluster-based networks, the system can accurately identify Byzantine nodes. The simulation results show that both proposed algorithms have superior detection performance compared with other algorithms. The proposed RACA algorithm works well even when the number of Byzantine nodes exceeds half of the total number of sensors in the network. Furthermore, the robustness of our proposed algorithms is evaluated in a dynamically changing scenario, where the attacking parameters change over time. We show that our algorithms can still achieve superior detection performance.

Submitted 14 April, 2022; originally announced April 2022.

arXiv:2203.13324 [pdf, other]

Resilient Execution of Data-triggered Applications on Edge, Fog and Cloud Resources

Authors: Prateeksha Varshney, Shriram Ramesh, Shayal Chhabra, Aakash Khochare, Yogesh Simmhan

Abstract: Internet of Things (IoT) is leading to the pervasive availability of streaming data about the physical world, coupled with edge computing infrastructure deployed as part of smart cities and 5G rollout. These constrained, less reliable but cheap resources are complemented by fog resources that offer federated management and accelerated computing, and pay-as-you-go cloud resources. There is a lack of intuitive means to deploy application pipelines to consume such diverse streams, and to execute them reliably on edge and fog resources. We propose an innovative application model to declaratively specify queries to match streams of micro-batch data from stream sources and trigger the distributed execution of data pipelines. We also design a resilient scheduling strategy using advanced reservation on reliable fogs to guarantee dataflow completion within a deadline while minimizing the execution cost. Our detailed experiments on over 100 virtual IoT resources and for $\approx 10k$ task executions, with comparison against baseline scheduling strategies, illustrate the cost-effectiveness, resilience and scalability of our framework.

Submitted 24 March, 2022; originally announced March 2022.

arXiv:2203.09567 [pdf, other]

Distributed Estimation in Large Scale Wireless Sensor Networks via a Two Step Group-based Approach

Authors: Shan Zhang, Pranay Sharma, Baocheng Geng, Pramod K. Varshney

Abstract: We consider the problem of collaborative distributed estimation in a large scale sensor network with statistically dependent sensor observations. In a collaborative setup, the aim is to maximize the overall estimation performance by modeling the underlying statistical dependence and efficiently utilizing the deployed sensors. To achieve greater sensor transmission and estimation efficiency, we propose a two step group-based collaborative distributed estimation scheme, where in the first step, sensors form dependence driven groups such that sensors in the same group are highly dependent, while sensors from different groups are independent, and perform a copula-based maximum a posteriori probability (MAP) estimation via intragroup collaboration. In the second step, the estimates generated in the first step are shared via inter-group collaboration to reach an average consensus. A merge based K-medoid dependence driven grouping algorithm is proposed. Moreover, we further propose a group-based sensor selection scheme using mutual information prior to the estimation. The aim is to select sensors with maximum relevance and minimum redundancy regarding the parameter of interest under a pre-specified energy constraint. Also, the proposed group-based sensor selection scheme is shown to be equivalent to the global/non-group based selection scheme with high probability, but computationally more efficient. Numerical experiments are conducted to demonstrate the effectiveness of our approach.

Submitted 17 March, 2022; originally announced March 2022.
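
The second step of the scheme above relies on average consensus. A minimal sketch of the textbook linear consensus iteration, in which each node moves toward its neighbors' values, may help make this concrete; the ring topology, step size, and starting estimates are hypothetical and not taken from the paper, whose inter-group protocol may differ.

```python
def average_consensus(values, neighbors, step=0.25, iterations=100):
    """Run synchronous linear consensus updates and return the final states."""
    state = list(values)
    for _ in range(iterations):
        # Every node adds a scaled sum of differences to its neighbors,
        # all computed from the previous iteration's states.
        state = [
            x_i + step * sum(state[j] - x_i for j in neighbors[i])
            for i, x_i in enumerate(state)
        ]
    return state


# Hypothetical example: four group estimates on a ring 0-1-2-3-0.
ring = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
estimates = [1.0, 2.0, 3.0, 6.0]
final = average_consensus(estimates, ring)
```

With a symmetric neighbor graph and a step size below the reciprocal of the maximum degree, every entry of `final` approaches the mean of `estimates`.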

arXiv:2203.04850 [pdf, other]

Federated Minimax Optimization: Improved Convergence Analyses and Algorithms

Authors: Pranay Sharma, Rohan Panda, Gauri Joshi, Pramod K. Varshney

Abstract: In this paper, we consider nonconvex minimax optimization, which is gaining prominence in many modern machine learning applications such as GANs. Large-scale edge-based collection of training data in these applications calls for communication-efficient distributed optimization algorithms, such as those used in federated learning, to process the data. In this paper, we analyze Local stochastic gradient descent ascent (SGDA), the local-update version of the SGDA algorithm. SGDA is the core algorithm used in minimax optimization, but it is not well-understood in a distributed setting. We prove that Local SGDA has order-optimal sample complexity for several classes of nonconvex-concave and nonconvex-nonconcave minimax problems, and also enjoys linear speedup with respect to the number of clients. We provide a novel and tighter analysis, which improves the convergence and communication guarantees in the existing literature. For nonconvex-PL and nonconvex-one-point-concave functions, we improve the existing complexity results for centralized minimax problems. Furthermore, we propose a momentum-based local-update algorithm, which has the same convergence guarantees, but outperforms Local SGDA as demonstrated in our experiments.

Submitted 9 March, 2022; originally announced March 2022.

Comments: 52 pages, 4 figures
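
The local-update structure described in the abstract above (several descent-ascent steps per client, then averaging) can be sketched on a toy problem. This is an illustrative sketch only: the quadratic client objectives, step size, and round counts are assumptions, exact gradients replace stochastic ones to keep the run deterministic, and the paper's momentum variant is not shown.

```python
def local_sgda(a, b, rounds=50, local_steps=5, lr=0.1):
    """Local gradient descent ascent with periodic averaging.

    Client i holds f_i(x, y) = 0.5*(x - a[i])**2 + x*y - 0.5*(y - b[i])**2,
    a toy objective chosen for illustration; x minimizes, y maximizes.
    """
    n = len(a)
    x_global, y_global = 0.0, 0.0
    for _ in range(rounds):
        xs, ys = [], []
        for i in range(n):
            x, y = x_global, y_global
            for _ in range(local_steps):
                grad_x = (x - a[i]) + y
                grad_y = x - (y - b[i])
                # Simultaneous update: descent in x, ascent in y.
                x, y = x - lr * grad_x, y + lr * grad_y
            xs.append(x)
            ys.append(y)
        # Server averages the client iterates (one communication round).
        x_global, y_global = sum(xs) / n, sum(ys) / n
    return x_global, y_global


x_star, y_star = local_sgda(a=[1.0, 3.0], b=[0.0, 2.0])
```

For this toy problem the saddle point of the averaged objective is x = (mean(a) - mean(b)) / 2 and y = (mean(a) + mean(b)) / 2, which the iterates approach.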

arXiv:2201.08737 [pdf, other]

Ordered Transmission-based Detection in Distributed Networks in the Presence of Byzantines

Authors: Chen Quan, Saikiran Bulusu, Baocheng Geng, Pramod K. Varshney

Abstract: The ordered transmission (OT) scheme reduces the number of transmissions needed in the network to make the final decision, while it maintains the same probability of error as the system without the OT scheme. In this paper, we investigate the performance of the system using the OT scheme in the presence of Byzantine attacks for the binary hypothesis testing problem. We analyze the probability of error for the system under attack and evaluate the number of transmissions saved using the Monte Carlo method. We also derive the bounds for the number of transmissions saved in the system under attack. The optimal attacking strategy for the OT-based system is investigated. Simulation results show that the Byzantine attacks have significant impact on the number of transmissions saved even when the signal strength is sufficiently large.

Submitted 21 January, 2022; originally announced January 2022.
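
To illustrate why ordered transmission can save transmissions without changing the decision, here is a minimal attack-free sketch. It assumes the commonly used stopping rule in which sensors report log-likelihood ratios (LLRs) in decreasing order of magnitude and the fusion center stops once the unreceived values can no longer change the outcome; the function name, the threshold, and the example values are hypothetical, and the paper's exact rule and attack model may differ.

```python
def ordered_transmission(llrs, threshold=0.0):
    """Return (decision, transmissions_used) for a sum-of-LLR threshold test."""
    n = len(llrs)
    # Sensors with the most informative (largest magnitude) LLRs report first.
    ordered = sorted(llrs, key=abs, reverse=True)
    partial = 0.0
    for k, llr in enumerate(ordered, start=1):
        partial += llr
        # Each unreceived LLR has magnitude at most |llr|, so the remaining
        # n - k values can shift the sum by at most this amount.
        slack = (n - k) * abs(llr)
        if partial - slack > threshold:
            return 1, k
        if partial + slack < threshold:
            return 0, k
    # Only reached when the full sum equals the threshold exactly.
    return (1 if partial > threshold else 0), n


decision, used = ordered_transmission([2.5, -0.5, 0.25, 0.125])
```

In this example the decision matches the full-sum test while only two of the four sensors transmit.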

arXiv:2201.00879 [pdf, ps, other]

Temporal Detection of Anomalies via Actor-Critic Based Controlled Sensing

Authors: Geethu Joseph, M. Cenk Gursoy, Pramod K. Varshney

Abstract: We address the problem of monitoring a set of binary stochastic processes and generating an alert when the number of anomalies among them exceeds a threshold. For this, the decision-maker selects and probes a subset of the processes to obtain noisy estimates of their states (normal or anomalous). Based on the received observations, the decision-maker first determines whether to declare that the number of anomalies has exceeded the threshold or to continue taking observations. When the decision is to continue, it then decides whether to collect observations at the next time instant or defer it to a later time. If it chooses to collect observations, it further determines the subset of processes to be probed. To devise this three-step sequential decision-making process, we use a Bayesian formulation wherein we learn the posterior probability on the states of the processes. Using the posterior probability, we construct a Markov decision process and solve it using deep actor-critic reinforcement learning. Via numerical experiments, we demonstrate the superior performance of our algorithm compared to the traditional model-based algorithms.

Submitted 16 June, 2023; v1 submitted 3 January, 2022; originally announced January 2022.

Comments: 6 pages, 1 figure

arXiv:2112.04912 [pdf, ps, other]

Scalable and Decentralized Algorithms for Anomaly Detection via Learning-Based Controlled Sensing

Authors: Geethu Joseph, Chen Zhong, M. Cenk Gursoy, Senem Velipasalar, Pramod K. Varshney

Abstract: We address the problem of sequentially selecting and observing processes from a given set to find the anomalies among them. The decision-maker observes a subset of the processes at any given time instant and obtains a noisy binary indicator of whether or not the corresponding process is anomalous. In this setting, we develop an anomaly detection algorithm that chooses the processes to be observed at a given time instant, decides when to stop taking observations, and declares the decision on anomalous processes. The objective of the detection algorithm is to identify the anomalies with an accuracy exceeding the desired value while minimizing the delay in decision making. We devise a centralized algorithm where the processes are jointly selected by a common agent as well as a decentralized algorithm where the decision of whether to select a process is made independently for each process. Our algorithms rely on a Markov decision process defined using the marginal probability of each process being normal or anomalous, conditioned on the observations. We implement the detection algorithms using the deep actor-critic reinforcement learning framework. Unlike prior work on this topic that has exponential complexity in the number of processes, our algorithms have computational and memory requirements that are both polynomial in the number of processes. We demonstrate the efficacy of these algorithms using numerical experiments by comparing them with state-of-the-art methods.

Submitted 8 December, 2021; originally announced December 2021.

Comments: 13 pages, 4 figures. arXiv admin note: substantial text overlap with arXiv:2105.06289

arXiv:2109.13325 [pdf, other]

Enhanced Audit Bit Based Distributed Bayesian Detection in the Presence of Strategic Attacks

Authors: Chen Quan, Baocheng Geng, Yunghsiang S. Han, Pramod K. Varshney

Abstract: This paper employs an audit bit based mechanism to mitigate the effect of Byzantine attacks. In this framework, the optimal attacking strategy for intelligent attackers is investigated for the traditional audit bit based scheme (TAS) to evaluate the robustness of the system. We show that it is possible for an intelligent attacker to degrade the performance of TAS to that of the system without audit bits. To enhance the robustness of the system in the presence of intelligent attackers, we propose an enhanced audit bit based scheme (EAS). The optimal fusion rule for the proposed scheme is derived and the detection performance of the system is evaluated via the probability of error for the system. Simulation results show that the proposed EAS improves the robustness and the detection performance of the system. Moreover, based on EAS, another new scheme called the reduced audit bit based scheme (RAS) is proposed which further improves system performance. We derive the new optimal fusion rule and the simulation results show that RAS outperforms EAS and TAS in terms of both robustness and detection performance of the system. Then, we extend the proposed RAS to wide-area cluster based distributed wireless sensor networks (CWSNs). Simulation results show that the proposed RAS significantly reduces the communication overhead between the sensors and the FC, which prolongs the lifetime of the network.

Submitted 27 September, 2021; originally announced September 2021.

arXiv:2109.04355 [pdf, ps, other]

doi 10.1109/TSP.2022.3151553

Multi-sensor Joint Adaptive Birth Sampler for Labeled Random Finite Set Tracking

Authors: Anthony Trezza, Donald J. Bucci Jr., Pramod K. Varshney

Abstract: This paper provides a scalable, multi-sensor measurement adaptive track initiation technique for labeled random finite set filters. A naive construction of the multi-sensor measurement adaptive birth set distribution leads to an exponential number of newborn components in the number of sensors. A truncation criterion is established for a labeled multi-Bernoulli random finite set birth density. The proposed truncation criterion is shown to have a bounded L1 error in the generalized labeled multi-Bernoulli posterior density. This criterion is used to construct a Gibbs sampler that produces a truncated measurement-generated labeled multi-Bernoulli birth distribution with quadratic complexity in the number of sensors. A closed-form solution of the conditional sampling distribution assuming linear Gaussian likelihoods is provided, alongside an approximate solution using Monte Carlo importance sampling. Multiple simulation results are provided to verify the efficacy of the truncation criterion, as well as the reduction in complexity.

Submitted 29 April, 2022; v1 submitted 11 August, 2021; originally announced September 2021.

Journal ref: in IEEE Transactions on Signal Processing, vol. 70, pp. 1010-1025, 2022

arXiv:2106.10435 [pdf, other]

STEM: A Stochastic Two-Sided Momentum Algorithm Achieving Near-Optimal Sample and Communication Complexities for Federated Learning

Authors: Prashant Khanduri, Pranay Sharma, Haibo Yang, Mingyi Hong, Jia Liu, Ketan Rajawat, Pramod K. Varshney

Abstract: Federated Learning (FL) refers to the paradigm where multiple worker nodes (WNs) build a joint model by using local data. Despite extensive research, for a generic non-convex FL problem, it is not clear how to choose the WNs' and the server's update directions, the minibatch sizes, and the local update frequency, so that the WNs use the minimum number of samples and communication rounds to achieve the desired solution. This work addresses the above question and considers a class of stochastic algorithms where the WNs perform a few local updates before communication. We show that when both the WN's and the server's directions are chosen based on a stochastic momentum estimator, the algorithm requires $\tilde{\mathcal{O}}(ε^{-3/2})$ samples and $\tilde{\mathcal{O}}(ε^{-1})$ communication rounds to compute an $ε$-stationary solution. To the best of our knowledge, this is the first FL algorithm that achieves such near-optimal sample and communication complexities simultaneously. Further, we show that there is a trade-off curve between local update frequencies and local minibatch sizes, on which the above sample and communication complexities can be maintained. Finally, we show that for the classical FedAvg (a.k.a. Local SGD, which is a momentum-less special case of STEM), a similar trade-off curve exists, albeit with worse sample and communication complexities. Our insights on this trade-off provide guidelines for choosing the four important design elements for FL algorithms (the local update frequency, the WNs' and the server's update directions, and the minibatch sizes) to achieve the best performance.

Submitted 19 June, 2021; originally announced June 2021.

arXiv:2105.06289 [pdf, ps, other]

A Scalable Algorithm for Anomaly Detection via Learning-Based Controlled Sensing

Authors: Geethu Joseph, M. Cenk Gursoy, Pramod K. Varshney

Abstract: We address the problem of sequentially selecting and observing processes from a given set to find the anomalies among them. The decision-maker observes one process at a time and obtains a noisy binary indicator of whether or not the corresponding process is anomalous. In this setting, we develop an anomaly detection algorithm that chooses the process to be observed at a given time instant, decides when to stop taking observations, and makes a decision regarding the anomalous processes. The objective of the detection algorithm is to arrive at a decision with an accuracy exceeding a desired value while minimizing the delay in decision making. Our algorithm relies on a Markov decision process defined using the marginal probability of each process being normal or anomalous, conditioned on the observations. We implement the detection algorithm using the deep actor-critic reinforcement learning framework. Unlike prior work on this topic that has exponential complexity in the number of processes, our algorithm has computational and memory requirements that are both polynomial in the number of processes. We demonstrate the efficacy of our algorithm using numerical experiments by comparing it with the state-of-the-art methods.

Submitted 12 May, 2021; originally announced May 2021.

Comments: 6 pages, 8 figures

Journal ref: ICC 2021
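
The marginal probability that the abstract above builds on can be illustrated with a single-process Bayes update from a noisy binary reading. This is a minimal sketch under stated assumptions: the process is treated in isolation, the reading is flipped with a known probability `flip_prob`, and the function name and numbers are hypothetical rather than taken from the paper.

```python
def update_anomaly_posterior(prior, observation, flip_prob):
    """Bayes update of P(process is anomalous) after one noisy binary reading.

    observation is 1 if the sensor reports "anomalous" and 0 otherwise; the
    reading is assumed to be wrong with probability flip_prob.
    """
    like_anom = 1.0 - flip_prob if observation == 1 else flip_prob
    like_norm = flip_prob if observation == 1 else 1.0 - flip_prob
    evidence = prior * like_anom + (1.0 - prior) * like_norm
    return prior * like_anom / evidence


# Start from an uninformative prior and fold in four hypothetical readings.
posterior = 0.5
for reading in [1, 1, 0, 1]:
    posterior = update_anomaly_posterior(posterior, reading, flip_prob=0.2)
```

Each reading of 1 multiplies the odds of "anomalous" by (1 - flip_prob) / flip_prob and each reading of 0 divides them by the same factor, so the order of the readings does not matter.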

arXiv:2105.06288 [pdf, ps, other]

Anomaly Detection via Controlled Sensing and Deep Active Inference

Authors: Geethu Joseph, Chen Zhong, M. Cenk Gursoy, Senem Velipasalar, Pramod K. Varshney

Abstract: In this paper, we address the anomaly detection problem where the objective is to find the anomalous processes among a given set of processes. To this end, the decision-making agent probes a subset of processes at every time instant and obtains a potentially erroneous estimate of the binary variable which indicates whether or not the corresponding process is anomalous. The agent continues to probe the processes until it obtains a sufficient number of measurements to reliably identify the anomalous processes. In this context, we develop a sequential selection algorithm that decides which processes to probe at every instant to detect the anomalies with an accuracy exceeding a desired value while minimizing the delay in making the decision and the total number of measurements taken. Our algorithm is based on active inference which is a general framework to make sequential decisions in order to maximize the notion of free energy. We define the free energy using the objectives of the selection policy and implement the active inference framework using a deep neural network approximation. Using numerical experiments, we compare our algorithm with the state-of-the-art method based on deep actor-critic reinforcement learning and demonstrate the superior performance of our algorithm.

Submitted 12 May, 2021; originally announced May 2021.

Comments: 6 pages, 9 figures

Journal ref: Globecom 2020

arXiv:2012.13063 [pdf]

doi 10.1109/JIOT.2021.3078543

Decentralized Federated Learning via Mutual Knowledge Transfer

Authors: Chengxi Li, Gang Li, Pramod K. Varshney

Abstract: In this paper, we investigate the problem of decentralized federated learning (DFL) in Internet of things (IoT) systems, where a number of IoT clients train models collectively for a common task without sharing their private training data in the absence of a central server. Most of the existing DFL schemes are composed of two alternating steps, i.e., model updating and model averaging. However, averaging model parameters directly to fuse different models at the local clients suffers from client-drift especially when the training data are heterogeneous across different clients. This leads to slow convergence and degraded learning performance. As a possible solution, we propose the decentralized federated learning via mutual knowledge transfer (Def-KT) algorithm where local clients fuse models by transferring their learnt knowledge to each other. Our experiments on the MNIST, Fashion-MNIST, CIFAR-10, and CIFAR-100 datasets reveal that the proposed Def-KT algorithm significantly outperforms the baseline DFL methods with model averaging, i.e., Combo and FullAvg, especially when the training data are not independent and identically distributed (non-IID) across different clients.

Submitted 12 May, 2021; v1 submitted 23 December, 2020; originally announced December 2020.

Comments: Published in IEEE Internet of Things Journal

arXiv:2012.11518 [pdf, other]

Zeroth-Order Hybrid Gradient Descent: Towards A Principled Black-Box Optimization Framework

Authors: Pranay Sharma, Kaidi Xu, Sijia Liu, Pin-Yu Chen, Xue Lin, Pramod K. Varshney

Abstract: In this work, we focus on the study of stochastic zeroth-order (ZO) optimization which does not require first-order gradient information and uses only function evaluations. The problem of ZO optimization has emerged in many recent machine learning applications, where the gradient of the objective function is either unavailable or difficult to compute. In such cases, we can approximate the full gradients or stochastic gradients through function value based gradient estimates. Here, we propose a novel hybrid gradient estimator (HGE), which takes advantage of the query-efficiency of random gradient estimates as well as the variance-reduction of coordinate-wise gradient estimates. We show that with a graceful design in coordinate importance sampling, the proposed HGE-based ZO optimization method is efficient both in terms of iteration complexity as well as function query cost. We provide a thorough theoretical analysis of the convergence of our proposed method for non-convex, convex, and strongly-convex optimization. We show that the convergence rate that we derive generalizes the results for some prominent existing methods in the nonconvex case, and matches the optimal result in the convex case. We also corroborate the theory with a real-world black-box attack generation application to demonstrate the empirical advantage of our method over state-of-the-art ZO optimization approaches.

Submitted 21 December, 2020; originally announced December 2020.

Comments: 27 pages, 3 figures
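
The two ingredients that the abstract above contrasts, coordinate-wise and random-direction gradient estimates, can be sketched from function evaluations alone. This is a minimal sketch of the standard finite-difference forms, not the paper's hybrid estimator: the smoothing parameter `mu`, the test function `sphere`, and the helper names are hypothetical, and the coordinate importance sampling that defines HGE is not implemented.

```python
import math
import random


def coordinate_gradient_estimate(f, x, mu=1e-4):
    """Central finite difference along every coordinate (2*d function queries)."""
    grad = []
    for i in range(len(x)):
        plus = list(x)
        minus = list(x)
        plus[i] += mu
        minus[i] -= mu
        grad.append((f(plus) - f(minus)) / (2.0 * mu))
    return grad


def random_gradient_estimate(f, x, mu=1e-4, rng=random):
    """Forward difference along one random unit direction (2 function queries)."""
    d = len(x)
    u = [rng.gauss(0.0, 1.0) for _ in range(d)]
    norm = math.sqrt(sum(v * v for v in u))
    u = [v / norm for v in u]
    shifted = [xi + mu * ui for xi, ui in zip(x, u)]
    # Scaling by d compensates for probing a single direction out of d.
    scale = d * (f(shifted) - f(x)) / mu
    return [scale * ui for ui in u]


def sphere(x):
    """Toy objective with known gradient 2*x, used only for illustration."""
    return sum(v * v for v in x)


point = [1.0, -2.0, 3.0]
coord_grad = coordinate_gradient_estimate(sphere, point)
rand_grad = random_gradient_estimate(sphere, point, rng=random.Random(0))
```

The coordinate-wise estimate is accurate but costs a number of queries proportional to the dimension, while the random estimate uses two queries and is noisy; that is the trade-off a hybrid estimator aims to balance.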

arXiv:2011.14073 [pdf, other]

On Performance Comparison of Multi-Antenna HD-NOMA, SCMA and PD-NOMA Schemes

Authors: Animesh Yadav, Chen Quan, Pramod K. Varshney, H. Vincent Poor

Abstract: In this paper, we study the uplink channel throughput performance of a proposed novel multiple-antenna hybrid-domain non-orthogonal multiple access (MA-HD-NOMA) scheme. This scheme combines the conventional sparse code multiple access (SCMA) and power-domain NOMA (PD-NOMA) schemes in order to increase the number of users served as compared to conventional NOMA schemes and uses multiple antennas at the base station. To this end, a joint resource allocation problem for the MA-HD-NOMA scheme is formulated that maximizes the sum rate of the entire system. For a comprehensive comparison, the joint resource allocation problems for the multi-antenna SCMA (MA-SCMA) and multi-antenna PD-NOMA (MA-PD-NOMA) schemes with the same overloading factor are formulated as well. Each of the formulated problems is a mixed-integer non-convex program, and hence, we apply successive convex approximation (SCA)- and reweighted $\ell_1$ minimization-based approaches to obtain rapidly converging solutions. Numerical results reveal that the proposed MA-HD-NOMA scheme has superior performance compared to MA-SCMA and MA-PD-NOMA.

Submitted 28 November, 2020; originally announced November 2020.

Comments: Accepted to be Published in: IEEE Wireless Communications Letters

arXiv:2010.02700 [pdf, ps, other]

doi 10.1109/LSP.2021.3102858

Joint Collaboration and Compression Design for Distributed Sequential Estimation in a Wireless Sensor Network

Authors: Xiancheng Cheng, Prashant Khanduri, Boxiao Chen, Pramod K. Varshney

Abstract: In this work, we propose a joint collaboration-compression framework for sequential estimation of a random vector parameter in a resource constrained wireless sensor network (WSN). Specifically, we propose a framework where the local sensors first collaborate (via a collaboration matrix) with each other. Then a subset of sensors selected to communicate with the fusion center (FC) linearly compress their observations before transmission. We design near-optimal collaboration and linear compression strategies under power constraints via alternating minimization of the sequential minimum mean square error. We show that the objective function for collaboration design can be non-convex depending on the network topology. We reformulate and solve the collaboration design problem as a quadratically constrained quadratic program (QCQP). Moreover, the compression design problem is also formulated as a QCQP. We propose two versions of compression design: one centralized, where the compression strategies are derived at the FC, and the other decentralized, where the local sensors compute their individual compression matrices independently. It is noted that the design of the decentralized compression strategy is a non-convex problem. We obtain a near-optimal solution by using the bisection method. In contrast to the one-shot estimator, our proposed algorithm is capable of handling dynamic system parameters such as channel gains and energy constraints. Importantly, we show that the proposed methods can also be used for estimating time-varying random vector parameters. Finally, numerical results are provided to demonstrate the effectiveness of the proposed framework.

Submitted 6 October, 2020; originally announced October 2020.

arXiv:2007.10830 [pdf, other]

CS-NET at SemEval-2020 Task 4: Siamese BERT for ComVE

Authors: Soumya Ranjan Dash, Sandeep Routray, Prateek Varshney, Ashutosh Modi

Abstract: In this paper, we describe our system for Task 4 of SemEval 2020, which involves differentiating between natural language statements that conform to common sense and those that do not. The organizers propose three subtasks: first, selecting, between two sentences, the one which is against common sense; second, identifying the most crucial reason why a statement does not make sense; third, generating novel reasons for explaining the statement that goes against common sense. Out of the three subtasks, this paper reports the system description of subtask A and subtask B. This paper proposes a model based on transformer neural network architecture for addressing the subtasks. The novelty of the work lies in the architecture design, which handles the logical implication of contradicting statements and simultaneous information extraction from both sentences. We use a parallel instance of transformers, which is responsible for a boost in the performance. We achieved an accuracy of 94.8% in subtask A and 89% in subtask B on the test set.

Submitted 21 July, 2020; originally announced July 2020.

Comments: 6 pages, 2 figures, 2 tables Accepted at Proceedings of the 14th International Workshop on Semantic Evaluation (SemEval-2020)

arXiv:2007.09179 [pdf, other]

A Novel Spectrally-Efficient Uplink Hybrid-Domain NOMA System

Authors: Chen Quan, Animesh Yadav, Baocheng Geng, Pramod K. Varshney, H. Vincent Poor

Abstract: This paper proposes a novel hybrid-domain (HD) non-orthogonal multiple access (NOMA) approach to support a larger number of uplink users than the recently proposed code-domain NOMA approach, i.e., sparse code multiple access (SCMA). HD-NOMA combines the code-domain and power-domain NOMA schemes by clustering the users in small path loss (strong) and large path loss (weak) groups. The two groups are decoded using successive interference cancellation while within the group users are decoded using the message passing algorithm. To further improve the performance of the system, a spectral-efficiency maximization problem is formulated under a user quality-of-service constraint, which dynamically assigns power and subcarrier to the users. The problem is non-convex and has sparsity constraints. The alternating optimization procedure is used to solve it iteratively. We apply successive convex approximation and reweighted $\ell_1$ minimization approaches to deal with the non-convexity and sparsity constraints, respectively. The performance of the proposed HD-NOMA is evaluated and compared with the conventional SCMA scheme through numerical simulation. The results show the potential of HD-NOMA in increasing the number of uplink users.

Submitted 27 July, 2020; v1 submitted 17 July, 2020; originally announced July 2020.

Comments: Accepted to be Published in: IEEE Communications Letters

arXiv:2006.06224 [pdf, other]

A Primer on Zeroth-Order Optimization in Signal Processing and Machine Learning

Authors: Sijia Liu, Pin-Yu Chen, Bhavya Kailkhura, Gaoyuan Zhang, Alfred Hero, Pramod K. Varshney

Abstract: Zeroth-order (ZO) optimization is a subset of gradient-free optimization that emerges in many signal processing and machine learning applications. It is used for solving optimization problems similarly to gradient-based methods. However, it does not require the gradient, using only function evaluations. Specifically, ZO optimization iteratively performs three major steps: gradient estimation, descent direction computation, and solution update. In this paper, we provide a comprehensive review of ZO optimization, with an emphasis on showing the underlying intuition, optimization principles and recent advances in convergence analysis. Moreover, we demonstrate promising applications of ZO optimization, such as evaluating robustness and generating explanations from black-box deep learning models, and efficient online sensor management.

Submitted 21 June, 2020; v1 submitted 11 June, 2020; originally announced June 2020.

Comments: IEEE Signal Processing Magazine
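
Illustrative note on the entry above: the abstract names three ZO steps (gradient estimation, descent direction computation, solution update). The following is a minimal sketch of that loop, not the primer's own code. It assumes a coordinate-wise central-difference estimator and plain gradient descent; the names `zo_gradient`, `zo_descent`, and `sphere`, the smoothing parameter `mu`, and the step size are all hypothetical choices made for this example.

```python
from typing import Callable, List

Vector = List[float]


def zo_gradient(f: Callable[[Vector], float], x: Vector, mu: float = 1e-4) -> Vector:
    """Coordinate-wise central-difference estimate; uses 2 * len(x) function queries."""
    grad = []
    for i in range(len(x)):
        x_plus = list(x)
        x_minus = list(x)
        x_plus[i] += mu
        x_minus[i] -= mu
        grad.append((f(x_plus) - f(x_minus)) / (2.0 * mu))
    return grad


def zo_descent(f: Callable[[Vector], float], x0: Vector,
               step: float = 0.1, iters: int = 50, mu: float = 1e-4) -> Vector:
    """Minimize f using function values only (no analytic gradient)."""
    x = list(x0)
    for _ in range(iters):
        g = zo_gradient(f, x, mu)                 # 1) gradient estimation
        direction = [-gi for gi in g]             # 2) descent direction (assumed: steepest descent)
        x = [xi + step * di for xi, di in zip(x, direction)]  # 3) solution update
    return x


def sphere(x: Vector) -> float:
    """Toy black-box objective used only for this illustration."""
    return sum(v * v for v in x)


if __name__ == "__main__":
    print(zo_descent(sphere, [1.0, -2.0, 3.0]))
```

A random-direction estimator would replace the loop over coordinates with a few sampled directions, trading extra variance for fewer queries per iteration.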

arXiv:2006.05454 [pdf, other]

Noisy One-bit Compressed Sensing with Side-Information

Authors: Swatantra Kafle, Thakshila Wimalajeewa, Pramod K. Varshney

Abstract: We consider the problem of sparse signal reconstruction from noisy one-bit compressed measurements when the receiver has access to side-information (SI). We assume that compressed measurements are corrupted by additive white Gaussian noise before quantization and sign-flip error after quantization. A generalized approximate message passing-based method for signal reconstruction from noisy one-bit compressed measurements is proposed, which is then extended for the case where the receiver has access to a signal that aids signal reconstruction, i.e., side-information. Two different scenarios of side-information are considered: a) side-information consisting of support information only, and b) side-information consisting of support and amplitude information. SI is either a noisy version of the signal or a noisy estimate of the support of the signal. We develop reconstruction algorithms from one-bit measurements using noisy SI available at the receiver. Laplacian distribution and Bernoulli distribution are used to model the two types of noises which, when applied to the signal and the support, yield the SI for the above two cases, respectively. The Expectation-Maximization algorithm is used to estimate the noise parameters using noisy one-bit compressed measurements and the SI. We show that one-bit compressed measurement-based signal reconstruction is quite sensitive to noise, and the reconstruction performance can be significantly improved by exploiting available side-information at the receiver.

Submitted 9 June, 2020; originally announced June 2020.

arXiv:2006.01044 [pdf, ps, other]

Anomaly Detection Under Controlled Sensing Using Actor-Critic Reinforcement Learning

Authors: Geethu Joseph, M. Cenk Gursoy, Pramod K. Varshney

Abstract: We consider the problem of detecting anomalies among a given set of processes using their noisy binary sensor measurements. The noiseless sensor measurement corresponding to a normal process is 0, and the measurement is 1 if the process is anomalous. The decision-making algorithm is assumed to have no knowledge of the number of anomalous processes. The algorithm is allowed to choose a subset of the sensors at each time instant until the confidence level on the decision exceeds the desired value. Our objective is to design a sequential sensor selection policy that dynamically determines which processes to observe at each time and when to terminate the detection algorithm. The selection policy is designed such that the anomalous processes are detected with the desired confidence level while incurring minimum cost which comprises the delay in detection and the cost of sensing. We cast this problem as a sequential hypothesis testing problem within the framework of Markov decision processes, and solve it using the actor-critic deep reinforcement learning algorithm. This deep neural network-based algorithm offers a low complexity solution with good detection accuracy. We also study the effect of statistical dependence between the processes on the algorithm performance. Through numerical experiments, we show that our algorithm is able to adapt to any unknown statistical dependence pattern of the processes.

Submitted 26 May, 2020; originally announced June 2020.

Comments: 5 pages, 2 figures, accepted at 2020 IEEE 21st International Workshop on Signal Processing Advances in Wireless Communications (SPAWC)

arXiv:2005.00224 [pdf, ps, other]

Distributed Stochastic Non-Convex Optimization: Momentum-Based Variance Reduction

Authors: Prashant Khanduri, Pranay Sharma, Swatantra Kafle, Saikiran Bulusu, Ketan Rajawat, Pramod K. Varshney

Abstract: In this work, we propose a distributed algorithm for stochastic non-convex optimization. We consider a worker-server architecture where a set of $K$ worker nodes (WNs) in collaboration with a server node (SN) jointly aim to minimize a global, potentially non-convex objective function. The objective function is assumed to be the sum of local objective functions available at each WN, with each node having access to only the stochastic samples of its local objective function. In contrast to the existing approaches, we employ a momentum based "single loop" distributed algorithm which eliminates the need of computing large batch size gradients to achieve variance reduction. We propose two algorithms, one with "adaptive" and the other with "non-adaptive" learning rates. We show that the proposed algorithms achieve the optimal computational complexity while attaining linear speedup with the number of WNs. Specifically, the algorithms reach an $\epsilon$-stationary point $x_a$ with $\mathbb{E}\| \nabla f(x_a) \| \leq \tilde{O}(K^{-1/3}T^{-1/2} + K^{-1/3}T^{-1/3})$ in $T$ iterations, thereby requiring $\tilde{O}(K^{-1} \epsilon^{-3})$ gradient computations at each WN. Moreover, our approach does not assume identical data distributions across WNs, making the approach general enough for federated learning applications.

Submitted 1 May, 2020; originally announced May 2020.

arXiv:2004.07378 [pdf, ps, other]

doi 10.1109/TSP.2019.2946017

Decentralized Gaussian Filters for Cooperative Self-localization and Multi-target Tracking

Authors: Pranay Sharma, Augustin-Alexandru Saucan, Donald J. Bucci Jr., Pramod K. Varshney

Abstract: Scalable and decentralized algorithms for Cooperative Self-localization (CS) of agents, and Multi-Target Tracking (MTT) are important in many applications. In this work, we address the problem of Simultaneous Cooperative Self-localization and Multi-Target Tracking (SCS-MTT) under target data association uncertainty, i.e., the associations between measurements and target tracks are unknown. Existing CS and tracking algorithms either make the assumption of no data association uncertainty or employ a hard-decision rule for measurement-to-target associations. We propose a novel decentralized SCS-MTT method for an unknown and time-varying number of targets under association uncertainty. Marginal posterior densities for agents and targets are obtained by an efficient belief propagation (BP) based scheme while data association is handled by marginalizing over all target-to-measurement association probabilities. Decentralized single Gaussian and Gaussian mixture implementations are provided based on average consensus schemes, which require communication only with one-hop neighbors. An additional novelty is a decentralized Gibbs mechanism for efficient evaluation of the product of Gaussian mixtures. Numerical experiments show the improved CS and MTT performance compared to the conventional approach of separate localization and target tracking.

Submitted 15 April, 2020; originally announced April 2020.

Comments: 16 pages, 7 figures

Journal ref: IEEE Transactions on Signal Processing, vol. 67, no. 22, pp. 5896-5911, Nov. 15, 2019
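
Illustrative note on the entry above: the abstract relies on average consensus schemes that need communication only with one-hop neighbors. The sketch below shows the generic consensus iteration on scalar values, as an assumption-laden illustration rather than the paper's filter. The ring topology, the step size `eps`, and the helper names `ring`, `consensus_step`, and `run_consensus` are hypothetical; `eps` is assumed to be smaller than one over the maximum node degree so that the iteration converges on a connected undirected graph.

```python
from typing import Dict, List


def ring(n: int) -> Dict[int, List[int]]:
    """Hypothetical topology: node i talks only to its two ring neighbors."""
    return {i: [(i - 1) % n, (i + 1) % n] for i in range(n)}


def consensus_step(values: List[float], neighbors: Dict[int, List[int]],
                   eps: float) -> List[float]:
    """Each node moves toward its one-hop neighbors; the network sum is preserved."""
    return [
        v + eps * sum(values[j] - v for j in neighbors[i])
        for i, v in enumerate(values)
    ]


def run_consensus(values: List[float], neighbors: Dict[int, List[int]],
                  eps: float = 0.3, iters: int = 100) -> List[float]:
    """Repeat the local update; every node approaches the network-wide average."""
    x = list(values)
    for _ in range(iters):
        x = consensus_step(x, neighbors, eps)
    return x


if __name__ == "__main__":
    local_measurements = [1.0, 2.0, 3.0, 4.0, 10.0]
    print(run_consensus(local_measurements, ring(5)))
```

In a Gaussian filter, the same iteration would typically be applied to sufficient statistics (for example information vectors and matrices) instead of raw scalars.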

arXiv:2004.02321 [pdf, ps, other]

doi 10.1109/TSP.2021.3051743

Measurement Bounds for Compressed Sensing in Sensor Networks with Missing Data

Authors: Geethu Joseph, Pramod K. Varshney

Abstract: In this paper, we study the problem of sparse vector recovery at the fusion center of a sensor network from linear sensor measurements when there is missing data. In the presence of missing data, the random sampling approach employed in compressed sensing is known to provide excellent reconstruction accuracy. However, when there is missing data, the theoretical guarantees associated with sparse recovery have not been well studied. Therefore, in this paper, we derive an upper bound on the minimum number of measurements required to ensure faithful recovery of a sparse signal when the generation of missing data is modeled using a Bernoulli erasure channel. We analyze three different network topologies, namely, star, (relay aided-)tree, and serial-star topologies. Our analysis establishes how the minimum required number of measurements for recovery scales with the network parameters, the properties of the measurement matrix, and the recovery algorithm. Finally, through numerical simulations, we show the variation of the minimum required number of measurements with different system parameters and validate our theoretical results.

Submitted 5 April, 2020; originally announced April 2020.

Comments: 13 pages, double column, 6 figures

arXiv:2003.12817 [pdf, other]

Controllability of Network Opinion in Erdos-Renyi Graphs using Sparse Control Inputs

Authors: Geethu Joseph, Buddhika Nettasinghe, Vikram Krishnamurthy, Pramod Varshney

Abstract: This paper considers a social network modeled as an Erdos Renyi random graph. Each individual in the network updates her opinion using the weighted average of the opinions of her neighbors. We explore how an external manipulative agent can drive the opinions of these individuals to a desired state with a limited additive influence on their innate opinions. We show that the manipulative agent can steer the network opinion to any arbitrary value in finite time (i.e., the system is controllable) almost surely when there is no restriction on her influence. However, when the control input is sparsity constrained, the network opinion is controllable with some probability. We lower bound this probability using the concentration properties of random vectors based on the Levy concentration function and small ball probabilities. Further, through numerical simulations, we compare the probability of controllability in Erdos Renyi graphs with that of power-law graphs to illustrate the key differences between the two models in terms of controllability. Our theoretical and numerical results shed light on how controllability of the network opinion depends on the parameters such as the size and the connectivity of the network, and the sparsity constraints faced by the manipulative agent.

Submitted 21 November, 2020; v1 submitted 28 March, 2020; originally announced March 2020.
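
Illustrative note on the entry above: the abstract describes opinions updated by a weighted average over neighbors, plus an additive input from an external agent. The sketch below is one possible reading of that description, not the paper's exact model. It assumes equal weights over each node's neighbors and itself, and an update of the form x(t+1) = W x(t) + u(t); the function names and the self-loop are hypothetical choices for this example.

```python
import random
from typing import List

Matrix = List[List[float]]


def erdos_renyi_adjacency(n: int, p: float, rng: random.Random) -> List[List[int]]:
    """Undirected random graph: each pair of nodes is linked independently with probability p."""
    adj = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < p:
                adj[i][j] = adj[j][i] = 1
    return adj


def averaging_weights(adj: List[List[int]]) -> Matrix:
    """Row-stochastic weights: each node averages itself and its neighbors equally (assumed)."""
    weights = []
    for i, adj_row in enumerate(adj):
        row = [float(a) for a in adj_row]
        row[i] = 1.0  # self-loop keeps every row sum positive, even for isolated nodes
        total = sum(row)
        weights.append([w / total for w in row])
    return weights


def opinion_step(weights: Matrix, opinions: List[float], control: List[float]) -> List[float]:
    """One update x(t+1) = W x(t) + u(t); a sparse u has few nonzero entries."""
    n = len(opinions)
    return [
        sum(weights[i][j] * opinions[j] for j in range(n)) + control[i]
        for i in range(n)
    ]


if __name__ == "__main__":
    rng = random.Random(0)
    W = averaging_weights(erdos_renyi_adjacency(6, 0.5, rng))
    x = [0.1, 0.9, 0.4, 0.7, 0.2, 0.5]
    u = [0.5, 0.0, 0.0, 0.0, 0.0, 0.0]  # sparse input acting on a single node
    print(opinion_step(W, x, u))
```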

arXiv:2003.06979 [pdf, other]

Anomalous Example Detection in Deep Learning: A Survey

Authors: Saikiran Bulusu, Bhavya Kailkhura, Bo Li, Pramod K. Varshney, Dawn Song

Abstract: Deep Learning (DL) is vulnerable to out-of-distribution and adversarial examples resulting in incorrect outputs. To make DL more robust, several posthoc (or runtime) anomaly detection techniques to detect (and discard) these anomalous samples have been proposed in the recent past. This survey tries to provide a structured and comprehensive overview of the research on anomaly detection for DL based applications. We provide a taxonomy for existing techniques based on their underlying assumptions and adopted approaches. We discuss various techniques in each of the categories and provide the relative strengths and weaknesses of the approaches. Our goal in this survey is to provide an easier yet better understanding of the techniques belonging to different categories in which research has been done on this topic. Finally, we highlight the unsolved research challenges while applying anomaly detection techniques in DL systems and present some high-impact future research directions.

Submitted 19 February, 2021; v1 submitted 15 March, 2020; originally announced March 2020.

arXiv:2001.03166 [pdf, ps, other]

On Distributed Online Convex Optimization with Sublinear Dynamic Regret and Fit

Authors: Pranay Sharma, Prashant Khanduri, Lixin Shen, Donald J. Bucci Jr., Pramod K. Varshney

Abstract: In this work, we consider a distributed online convex optimization problem, with time-varying (potentially adversarial) constraints. A set of nodes jointly aim to minimize a global objective function, which is the sum of local convex functions. The objective and constraint functions are revealed locally to the nodes, at each time, after taking an action. Naturally, the constraints cannot be instantaneously satisfied. Therefore, we reformulate the problem to satisfy these constraints in the long term. To this end, we propose a distributed primal-dual mirror descent based approach, in which the primal and dual updates are carried out locally at all the nodes. This is followed by sharing and mixing of the primal variables by the local nodes via communication with the immediate neighbors. To quantify the performance of the proposed algorithm, we utilize the challenging, but more realistic metrics of dynamic regret and fit. Dynamic regret measures the cumulative loss incurred by the algorithm, compared to the best dynamic strategy. On the other hand, fit measures the long term cumulative constraint violations. Without assuming the restrictive Slater's conditions, we show that the proposed algorithm achieves sublinear regret and fit under mild, commonly used assumptions.

Submitted 5 May, 2021; v1 submitted 9 January, 2020; originally announced January 2020.

Comments: 22 pages

arXiv:1912.06036 [pdf, ps, other]

Parallel Restarted SPIDER -- Communication Efficient Distributed Nonconvex Optimization with Optimal Computation Complexity

Authors: Pranay Sharma, Swatantra Kafle, Prashant Khanduri, Saikiran Bulusu, Ketan Rajawat, Pramod K. Varshney

Abstract: In this paper, we propose a distributed algorithm for stochastic smooth, non-convex optimization. We assume a worker-server architecture where $N$ nodes, each having $n$ (potentially infinite) number of samples, collaborate with the help of a central server to perform the optimization task. The global objective is to minimize the average of local cost functions available at individual nodes. The proposed approach is a non-trivial extension of the popular parallel-restarted SGD algorithm, incorporating the optimal variance-reduction based SPIDER gradient estimator into it. We prove convergence of our algorithm to a first-order stationary solution. The proposed approach achieves the best known communication complexity $O(\epsilon^{-1})$ along with the optimal computation complexity. For finite-sum problems (finite $n$), we achieve the optimal computation (IFO) complexity $O(\sqrt{Nn}\epsilon^{-1})$. For online problems ($n$ unknown or infinite), we achieve the optimal IFO complexity $O(\epsilon^{-3/2})$. In both cases, we maintain the linear speedup achieved by existing methods. This is a massive improvement over the $O(\epsilon^{-2})$ IFO complexity of the existing approaches. Additionally, our algorithm is general enough to allow non-identical distributions of data across workers, as in the recently proposed federated learning paradigm.

Submitted 6 November, 2020; v1 submitted 12 December, 2019; originally announced December 2019.

arXiv:1912.04531 [pdf, ps, other]

Byzantine Resilient Non-Convex SVRG with Distributed Batch Gradient Computations

Authors: Prashant Khanduri, Saikiran Bulusu, Pranay Sharma, Pramod K. Varshney

Abstract: In this work, we consider the distributed stochastic optimization problem of minimizing a non-convex function $f(x) = \mathbb{E}_{\xi \sim \mathcal{D}} f(x; \xi)$ in an adversarial setting, where the individual functions $f(x; \xi)$ can also be potentially non-convex. We assume that at most $\alpha$-fraction of a total of $K$ nodes can be Byzantines. We propose a robust stochastic variance-reduced gradient (SVRG) like algorithm for the problem, where the batch gradients are computed at the worker nodes (WNs) and the stochastic gradients are computed at the server node (SN). For the non-convex optimization problem, we show that we need $\tilde{O}\left( \frac{1}{\epsilon^{5/3} K^{2/3}} + \frac{\alpha^{4/3}}{\epsilon^{5/3}} \right)$ gradient computations on average at each node (SN and WNs) to reach an $\epsilon$-stationary point. The proposed algorithm guarantees convergence via the design of a novel Byzantine filtering rule which is independent of the problem dimension. Importantly, we capture the effect of the fraction of Byzantine nodes $\alpha$ present in the network on the convergence performance of the algorithm.

Submitted 10 December, 2019; originally announced December 2019.

Comments: Optimization for Machine Learning, 2019

arXiv:1911.07873 [pdf, other]

Distributed Sequential Hypothesis Testing with Dependent Sensor Observations

Authors: Shan Zhang, Prashant Khanduri, Pramod K. Varshney

Abstract: In this paper, we consider the problem of distributed sequential detection using wireless sensor networks (WSNs) in the presence of imperfect communication channels between the sensors and the fusion center (FC). We assume that sensor observations are spatially dependent. We propose a copula-based distributed sequential detection scheme that characterizes the spatial dependence. Specifically, each local sensor collects observations regarding the phenomenon of interest and forwards the information obtained to the FC over noisy channels. The FC fuses the received messages using a copula-based sequential test. Moreover, we show the asymptotic optimality of the proposed copula-based sequential test. Numerical experiments are conducted to demonstrate the effectiveness of our approach.

Submitted 21 November, 2019; v1 submitted 18 November, 2019; originally announced November 2019.

arXiv:1911.07180 [pdf, ps, other]

Multiple-Source Ellipsoidal Localization Using Acoustic Energy Measurements

Authors: Fanqin Meng, Xiaojing Shen, Zhiguo Wang, Haiqi Liu, Junfeng Wang, Yunmin Zhu, Pramod K. Varshney

Abstract: In this paper, the multiple-source ellipsoidal localization problem based on acoustic energy measurements is investigated via set-membership estimation theory. When the probability density function of measurement noise is unknown-but-bounded, multiple-source localization is a difficult problem since not only the acoustic energy measurements are complicated nonlinear functions of multiple sources, but also the multiple sources bring about a high-dimensional state estimation problem. First, when the energy parameter and the position of the source are bounded in an interval and a ball respectively, the nonlinear remainder bound of the Taylor series expansion is obtained analytically on-line. Next, based on the separability of the nonlinear measurement function, an efficient estimation procedure is developed. It solves the multiple-source localization problem by using an alternating optimization iterative algorithm, in which the remainder bound needs to be known on-line. For this reason, we first derive the remainder bound analytically. When the energy decay factor is unknown but bounded, an efficient estimation procedure is developed based on interval mathematics. Finally, numerical examples demonstrate the effectiveness of the ellipsoidal localization algorithms for multiple-source localization. In particular, our results show that when the noise is non-Gaussian, the set-membership localization algorithm performs better than the EM localization algorithm.

Submitted 17 November, 2019; originally announced November 2019.

Comments: 20 pages, 13 figures, submitted to Automatica

arXiv:1909.01463 [pdf, other]

doi 10.1109/TSP.2020.3006754

Prospect Theory Based Crowdsourcing for Classification in the Presence of Spammers

Authors: Baocheng Geng, Qunwei Li, Pramod K. Varshney

Abstract: We consider the $M$-ary classification problem via crowdsourcing, where crowd workers respond to simple binary questions and the answers are aggregated via decision fusion. The workers have a reject option to skip answering a question when they do not have the expertise, or when the confidence of answering that question correctly is low. We further consider that there are spammers in the crowd who respond to the questions with random guesses. Under the payment mechanism that encourages the reject option, we study the behavior of honest workers and spammers, whose objectives are to maximize their monetary rewards. To accurately characterize human behavioral aspects, we employ prospect theory to model the rationality of the crowd workers, whose perception of costs and probabilities are distorted based on some value and weight functions, respectively. Moreover, we estimate the number of spammers and employ a weighted majority voting decision rule, where we assign an optimal weight for every worker to maximize the system performance. The probability of correct classification and asymptotic system performance are derived. We also provide simulation results to demonstrate the effectiveness of our approach.

Submitted 28 April, 2020; v1 submitted 3 September, 2019; originally announced September 2019.

Comments: 14 pages, 6 figures
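
Illustrative note on the entry above: the abstract mentions a weighted majority voting rule over binary answers with a reject (skip) option. The sketch below shows the general shape of such a rule for a single binary question. The weights are not the paper's: this example assumes the classical log-odds weight log(p / (1 - p)) for a worker whose answer is correct with probability p, strictly between 0 and 1. The names `log_odds_weight` and `weighted_majority` and the tie-breaking choice are hypothetical.

```python
import math
from typing import Sequence


def log_odds_weight(p: float) -> float:
    """Assumed weight for a worker who is correct with probability p (0 < p < 1)."""
    return math.log(p / (1.0 - p))


def weighted_majority(answers: Sequence[int], reliabilities: Sequence[float]) -> int:
    """Fuse binary answers coded as +1 / -1; a skipped question is coded as 0.

    Returns +1 or -1; a zero score is resolved to +1 (arbitrary tie-break).
    """
    score = sum(log_odds_weight(p) * a for a, p in zip(answers, reliabilities))
    return 1 if score >= 0 else -1


if __name__ == "__main__":
    # Two weak workers vote +1, one strong worker votes -1.
    print(weighted_majority([1, 1, -1], [0.6, 0.6, 0.9]))
    # The strong worker skips, so the remaining votes decide.
    print(weighted_majority([1, 1, 0], [0.6, 0.6, 0.9]))
```

Under this assumed weighting, a worker who guesses at random (p = 0.5) receives weight zero and does not influence the fused decision.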

Showing 1–50 of 145 results for author: Varshney, P