-
Semgrex and Ssurgeon, Searching and Manipulating Dependency Graphs
Authors:
John Bauer,
Chloe Kiddon,
Eric Yeh,
Alex Shan,
Christopher D. Manning
Abstract:
Searching dependency graphs and manipulating them can be a time consuming and challenging task to get right. We document Semgrex, a system for searching dependency graphs, and introduce Ssurgeon, a system for manipulating the output of Semgrex. The compact language used by these systems allows for easy command line or API processing of dependencies. Additionally, integration with publicly released toolkits in Java and Python allows for searching text relations and attributes over natural text.
Submitted 24 April, 2024;
originally announced April 2024.
-
Fair Concurrent Training of Multiple Models in Federated Learning
Authors:
Marie Siew,
Haoran Zhang,
Jong-Ik Park,
Yuezhou Liu,
Yichen Ruan,
Lili Su,
Stratis Ioannidis,
Edmund Yeh,
Carlee Joe-Wong
Abstract:
Federated learning (FL) enables collaborative learning across multiple clients. In most FL work, all clients train a single learning task. However, the recent proliferation of FL applications may increasingly require multiple FL tasks to be trained simultaneously, sharing clients' computing and communication resources, which we call Multiple-Model Federated Learning (MMFL). Current MMFL algorithms use naive average-based client-task allocation schemes that can lead to unfair performance when FL tasks have heterogeneous difficulty levels, e.g., tasks with larger models may need more rounds and data to train. Just as naively allocating resources to generic computing jobs with heterogeneous resource needs can lead to unfair outcomes, naive allocation of clients to FL tasks can lead to unfairness, with some tasks having excessively long training times, or lower converged accuracies. Furthermore, in the FL setting, since clients are typically not paid for their training effort, we face a further challenge that some clients may not even be willing to train some tasks, e.g., due to high computational costs, which may exacerbate unfairness in training outcomes across tasks. We address both challenges by first designing FedFairMMFL, a difficulty-aware algorithm that dynamically allocates clients to tasks in each training round. We provide guarantees on fairness and FedFairMMFL's convergence rate. We then propose a novel auction design that incentivizes clients to train multiple tasks, so as to fairly distribute clients' training efforts across the tasks. We show how our fairness-based learning and incentive mechanisms impact training convergence and finally evaluate our algorithm with multiple sets of learning tasks on real world datasets.
Submitted 21 April, 2024;
originally announced April 2024.
-
Empowering Federated Learning with Implicit Gossiping: Mitigating Connection Unreliability Amidst Unknown and Arbitrary Dynamics
Authors:
Ming Xiang,
Stratis Ioannidis,
Edmund Yeh,
Carlee Joe-Wong,
Lili Su
Abstract:
Federated learning is a popular distributed learning approach for training a machine learning model without disclosing raw data. It consists of a parameter server and a possibly large collection of clients (e.g., in cross-device federated learning) that may operate in congested and changing environments. In this paper, we study federated learning in the presence of stochastic and dynamic communication failures wherein the uplink between the parameter server and client $i$ is on with unknown probability $p_i^t$ in round $t$. Furthermore, we allow the dynamics of $p_i^t$ to be arbitrary.
We first demonstrate that when the $p_i^t$'s vary across clients, the most widely adopted federated learning algorithm, Federated Average (FedAvg), experiences significant bias. To address this observation, we propose Federated Postponed Broadcast (FedPBC), a simple variant of FedAvg. FedPBC differs from FedAvg in that the parameter server postpones broadcasting the global model till the end of each round. Despite uplink failures, we show that FedPBC converges to a stationary point of the original non-convex objective. On the technical front, postponing the global model broadcasts enables implicit gossiping among the clients with active links in round $t$. Despite the time-varying nature of $p_i^t$, we can bound the perturbation of the global model dynamics using techniques to control gossip-type information mixing errors. Extensive experiments have been conducted on real-world datasets over diversified unreliable uplink patterns to corroborate our analysis.
Submitted 15 April, 2024;
originally announced April 2024.
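The postponed-broadcast idea in the abstract above can be illustrated with a minimal sketch of a single round. This is one reading of the abstract, not the authors' implementation: it assumes each client takes a single local gradient step, that the server averages only the models it receives over active links, and that the postponed broadcast reaches only those same active clients. The function name `fedpbc_round` and its parameters are hypothetical.

```python
def fedpbc_round(local_models, gradients, active, lr=0.1):
    """One illustrative FedPBC-style round (hypothetical helper, not the paper's code).

    local_models: list of per-client parameter vectors (lists of floats).
    gradients:    list of per-client local gradients, same shapes as local_models.
    active:       list of booleans, True where the client's link is on this round.
    """
    # Every client takes one local step starting from its own current model.
    updated = [
        [w - lr * g for w, g in zip(model, grad)]
        for model, grad in zip(local_models, gradients)
    ]

    # The server only sees models sent over active links.
    received = [m for m, on in zip(updated, active) if on]
    if not received:
        return updated  # nothing arrived, so every client keeps its local model

    # Aggregate what was received (plain average in this sketch).
    dim = len(received[0])
    aggregate = [sum(m[k] for m in received) / len(received) for k in range(dim)]

    # Postponed broadcast (assumption): the aggregate is sent at the end of the
    # round to the active clients; inactive clients keep their local models.
    return [list(aggregate) if on else m for m, on in zip(updated, active)]
```

Under these assumptions the clients with active links end the round holding the same averaged model, which is the gossip-like mixing step the abstract refers to, while clients with failed links simply carry their local state into the next round.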
-
A Bayesian factor analysis model for high-dimensional microbiome count data
Authors:
Ismaïla Ba,
Maxime Turgeon,
Simona Veniamin,
Juan Joel,
Richard Miller,
Morag Graham,
Christine Bonner,
Charles N. Bernstein,
Douglas L. Arnold,
Amit Bar-Or,
Ruth Ann Marrie,
Julia O'Mahony,
E. Ann Yeh,
Brenda Banwell,
Emmanuelle Waubant,
Natalie Knox,
Gary Van Domselaar,
Ali I. Mirza,
Heather Armstrong,
Saman Muthukumarana,
Kevin McGregor
Abstract:
Dimension reduction techniques are among the most essential analytical tools in the analysis of high-dimensional data. Generalized principal component analysis (PCA) is an extension to standard PCA that has been widely used to identify low-dimensional features in high-dimensional discrete data, such as binary, multi-category and count data. For microbiome count data in particular, the multinomial PCA is a natural counterpart of the standard PCA. However, this technique fails to account for the excessive number of zero values, which is frequently observed in microbiome count data. To allow for sparsity, zero-inflated multivariate distributions can be used. We propose a zero-inflated probabilistic PCA model for latent factor analysis. The proposed model is a fully Bayesian factor analysis technique that is appropriate for microbiome count data analysis. In addition, we use the mean-field-type variational family to approximate the marginal likelihood and develop a classification variational approximation algorithm to fit the model. We demonstrate the efficiency of our procedure for predictions based on the latent factors and the model parameters through simulation experiments, showcasing its superiority over competing methods. This efficiency is further illustrated with two real microbiome count datasets. The method is implemented in R.
Submitted 23 April, 2024; v1 submitted 3 April, 2024;
originally announced April 2024.
-
Delay-Optimal Forwarding and Computation Offloading for Service Chain Tasks
Authors:
Jinkun Zhang,
Yuezhou Liu,
Edmund Yeh
Abstract:
Emerging edge computing paradigms enable heterogeneous devices to collaborate on complex computation applications. However, for congestible links and computing units, delay-optimal forwarding and offloading for service chain tasks (e.g., DNN with vertical split) in edge computing networks remains an open problem. In this paper, we formulate the service chain forwarding and offloading problem with arbitrary topology and heterogeneous transmission/computation capability, and aim to minimize the aggregated network cost. We consider congestion-aware nonlinear cost functions that cover various performance metrics and constraints, such as average queueing delay with limited processor capacity. We solve the non-convex optimization problem globally by analyzing the KKT condition and proposing a sufficient condition for optimality. We then propose a distributed algorithm that converges to the global optimum. The algorithm adapts to changes in input rates and network topology, and can be implemented as an online algorithm. Numerical evaluation shows that our method significantly outperforms baselines in multiple network instances, especially in congested scenarios.
Submitted 23 March, 2024;
originally announced March 2024.
-
LOAM: Low-latency Communication, Caching, and Computation Placement in Data-Intensive Computing Networks
Authors:
Jinkun Zhang,
Edmund Yeh
Abstract:
Deploying data- and computation-intensive applications such as large-scale AI into heterogeneous dispersed computing networks can significantly enhance application performance by mitigating bottlenecks caused by limited network resources, including bandwidth, storage, and computing power. However, current resource allocation methods in dispersed computing do not provide a comprehensive solution that considers arbitrary topology, elastic resource amount, reuse of computation results, and nonlinear congestion-dependent optimization objectives. In this paper, we propose LOAM, a low-latency joint communication, caching, and computation placement framework with a rigorous analytical foundation that incorporates the above aspects. We tackle the NP-hard aggregated cost minimization problem with two methods: an offline method with a 1/2 approximation and an online adaptive method with a bounded gap from the optimum. Through extensive simulation, the proposed framework outperforms multiple baselines in both synthetic and real-world network scenarios.
Submitted 23 March, 2024;
originally announced March 2024.
-
Distributed Experimental Design Networks
Authors:
Yuanyuan Li,
Lili Su,
Carlee Joe-Wong,
Edmund Yeh,
Stratis Ioannidis
Abstract:
As edge computing capabilities increase, model learning deployments in diverse edge environments have emerged. In experimental design networks, introduced recently, network routing and rate allocation are designed to aid the transfer of data from sensors to heterogeneous learners. We design efficient experimental design network algorithms that are (a) distributed and (b) use multicast transmissions. This setting poses significant challenges as classic decentralization approaches often operate on (strictly) concave objectives under differentiable constraints. In contrast, the problem we study here has a non-convex, continuous DR-submodular objective, while multicast transmissions naturally result in non-differentiable constraints. From a technical standpoint, we propose a distributed Frank-Wolfe and a distributed projected gradient ascent algorithm that, coupled with a relaxation of non-differentiable constraints, yield allocations within a $1-1/e$ factor from the optimal. Numerical evaluations show that our proposed algorithms outperform competitors with respect to model learning quality.
Submitted 10 January, 2024;
originally announced January 2024.
-
Cost-aware Joint Caching and Forwarding in Networks with Heterogeneous Cache Resources
Authors:
Faruk Volkan Mutlu,
Edmund Yeh
Abstract:
Caching is crucial for enabling high-throughput networks for data intensive applications. Traditional caching technology relies on DRAM, as it can transfer data at a high rate. However, DRAM capacity is subject to contention by most system components and thus is very limited, implying that DRAM-only caches cannot scale to meet growing demand. Fortunately, persistent memory and flash storage technologies are rapidly evolving and can be utilized alongside DRAM to increase cache capacities. To do so without compromising network performance requires caching techniques adapted to the characteristics of these technologies. In this paper, we model the cache as a collection of storage blocks with different rate parameters and utilization costs. We introduce an optimization technique based on the drift-plus-penalty method and apply it in a framework which enables joint caching and forwarding. We show that it achieves an optimal trade-off between throughput and cache utilization costs in a virtual control plane. We then develop a corresponding practical policy in the data plane. Finally, through simulations in several settings, we demonstrate the superior performance of our proposed approach with respect to total user delay and cache utilization costs.
Submitted 11 October, 2023;
originally announced October 2023.
-
Delay-Optimal Service Chain Forwarding and Offloading in Collaborative Edge Computing
Authors:
Jinkun Zhang,
Edmund Yeh
Abstract:
Collaborative edge computing (CEC) is an emerging paradigm for heterogeneous devices to collaborate on edge computation jobs. For congestible links and computing units, delay-optimal forwarding and offloading for service chain tasks (e.g., DNN with vertical split) in CEC remains an open problem. In this paper, we formulate the service chain forwarding and offloading problem in CEC with arbitrary topology and heterogeneous transmission/computation capability, and aim to minimize the aggregated network cost. We consider congestion-aware nonlinear cost functions that cover various performance metrics and constraints, such as average queueing delay with limited processor capacity. We solve the non-convex optimization problem globally by analyzing the KKT condition and proposing a sufficient condition for optimality. We propose a polynomial-time distributed algorithm that converges to the global optimum. The algorithm adapts to changes in input rates and network topology, and can be implemented as an online algorithm. Numerical evaluation shows that our method significantly outperforms baselines in multiple network instances, especially in congested scenarios.
Submitted 8 December, 2023; v1 submitted 9 October, 2023;
originally announced October 2023.
-
Towards Bias Correction of FedAvg over Nonuniform and Time-Varying Communications
Authors:
Ming Xiang,
Stratis Ioannidis,
Edmund Yeh,
Carlee Joe-Wong,
Lili Su
Abstract:
Federated learning (FL) is a decentralized learning framework wherein a parameter server (PS) and a collection of clients collaboratively train a model via minimizing a global objective. Communication bandwidth is a scarce resource; in each round, the PS aggregates the updates from a subset of clients only. In this paper, we focus on non-convex minimization that is vulnerable to non-uniform and time-varying communication failures between the PS and the clients. Specifically, in each round $t$, the link between the PS and client $i$ is active with probability $p_i^t$, which is $\textit{unknown}$ to both the PS and the clients. This arises when the channel conditions are heterogeneous across clients and are changing over time.
We show that when the $p_i^t$'s are not uniform, $\textit{Federated Average}$ (FedAvg) -- the most widely adopted FL algorithm -- fails to minimize the global objective. Observing this, we propose $\textit{Federated Postponed Broadcast}$ (FedPBC), which is a simple variant of FedAvg. It differs from FedAvg in that the PS postpones broadcasting the global model till the end of each round. We show that FedPBC converges to a stationary point of the original objective. The introduced staleness is mild and there is no noticeable slowdown. Both theoretical analysis and numerical results are provided. On the technical front, postponing the global model broadcasts enables implicit gossiping among the clients with active links at round $t$. Although the $p_i^t$'s are time-varying, we are able to bound the perturbation of the global model dynamics via techniques for controlling gossip-type information mixing errors.
Submitted 31 May, 2023;
originally announced June 2023.
-
CoProver: A Recommender System for Proof Construction
Authors:
Eric Yeh,
Briland Hitaj,
Sam Owre,
Maena Quemener,
Natarajan Shankar
Abstract:
Interactive Theorem Provers (ITPs) are an indispensable tool in the arsenal of formal method experts as a platform for construction and (formal) verification of proofs. The complexity of the proofs in conjunction with the level of expertise typically required for the process to succeed can often hinder the adoption of ITPs. A recent strain of work has investigated methods to incorporate machine learning models trained on ITP user activity traces as a viable path towards full automation. While a valuable line of investigation, many problems still require human supervision to be completed fully, thus applying learning methods to assist the user with useful recommendations can prove more fruitful.
Following the vein of user assistance, we introduce CoProver, a proof recommender system based on transformers, capable of learning from past actions during proof construction, all while exploring knowledge stored in the ITP concerning previous proofs. CoProver employs a neurally learnt sequence-based encoding of sequents, capturing long distance relationships between terms and hidden cues therein. We couple CoProver with the Prototype Verification System (PVS) and evaluate its performance on two key areas, namely: (1) Next Proof Action Recommendation, and (2) Relevant Lemma Retrieval given a library of theories. We evaluate CoProver on a series of well-established metrics originating from the recommender system and information retrieval communities, respectively. We show that CoProver successfully outperforms prior state of the art applied to recommendation in the domain. We conclude by discussing future directions viable for CoProver (and similar approaches) such as argument prediction, proof summarization, and more.
Submitted 1 March, 2023;
originally announced April 2023.
-
Automatic Measures for Evaluating Generative Design Methods for Architects
Authors:
Eric Yeh,
Briland Hitaj,
Vidyasagar Sadhu,
Anirban Roy,
Takuma Nakabayashi,
Yoshito Tsuji
Abstract:
The recent explosion of high-quality image-to-image methods has prompted interest in applying image-to-image methods towards artistic and design tasks. Of interest for architects is to use these methods to generate design proposals from conceptual sketches, usually hand-drawn sketches that are quickly developed and can embody a design intent. More specifically, instantiating a sketch into a visual that can be used to elicit client feedback is typically a time consuming task, and being able to speed up this iteration time is important. While the body of work in generative methods has been impressive, there has been a mismatch between the quality measures used to evaluate the outputs of these systems and the actual expectations of architects. In particular, most recent image-based works place an emphasis on realism of generated images. While important, this is one of several criteria architects look for. In this work, we describe the expectations architects have for design proposals from conceptual sketches, and identify corresponding automated metrics from the literature. We then evaluate several image-to-image generative methods that may address these criteria and examine their performance across these metrics. From these results, we identify certain challenges with hand-drawn conceptual sketches and describe possible future avenues of investigation to address them.
Submitted 20 March, 2023;
originally announced March 2023.
-
Congestion-aware routing and content placement in elastic cache networks
Authors:
Jinkun Zhang,
Edmund Yeh
Abstract:
Caching can be leveraged to significantly improve network performance and mitigate congestion. However, characterizing the optimal tradeoff between routing cost and cache deployment cost remains an open problem. In this paper, for a network with arbitrary topology and congestion-dependent nonlinear cost functions, we aim to jointly determine the cache deployment, content placement, and hop-by-hop routing strategies, so that the sum of routing cost and cache deployment cost is minimized. We tackle this NP-hard problem starting with a fixed-routing setting and then extending to a general dynamic-routing setting. For the fixed-routing setting, a Gradient-combining Frank-Wolfe algorithm with $(\frac{1}{2},1)$-approximation is presented. For the general dynamic-routing setting, we obtain a set of necessary KKT optimality conditions, and devise a distributed and adaptive online algorithm based on the conditions. We demonstrate via extensive simulation that our algorithms significantly outperform a number of baseline techniques.
Submitted 2 March, 2023;
originally announced March 2023.
-
Revisiting Variable Ordering for Real Quantifier Elimination using Machine Learning
Authors:
John Hester,
Briland Hitaj,
Grant Passmore,
Sam Owre,
Natarajan Shankar,
Eric Yeh
Abstract:
Cylindrical Algebraic Decomposition (CAD) is a key proof technique for formal verification of cyber-physical systems. CAD is computationally expensive, with worst-case doubly-exponential complexity. Selecting an optimal variable ordering is paramount to efficient use of CAD. Prior work has demonstrated that machine learning can be useful in determining efficient variable orderings. Much of this work has been driven by CAD problems extracted from applications of the MetiTarski theorem prover. In this paper, we revisit this prior work and consider issues of bias in existing training and test data. We observe that the classical MetiTarski benchmarks are heavily biased towards particular variable orderings. To address this, we apply symmetries to create a new dataset containing more than 41K MetiTarski challenges designed to remove bias. Furthermore, we evaluate issues of information leakage, and test the generalizability of our models on the new dataset.
Submitted 27 February, 2023;
originally announced February 2023.
-
Outcome-Guided Counterfactuals for Reinforcement Learning Agents from a Jointly Trained Generative Latent Space
Authors:
Eric Yeh,
Pedro Sequeira,
Jesse Hostetler,
Melinda Gervasio
Abstract:
We present a novel generative method for producing unseen and plausible counterfactual examples for reinforcement learning (RL) agents based upon outcome variables that characterize agent behavior. Our approach uses a variational autoencoder to train a latent space that jointly encodes information about the observations and outcome variables pertaining to an agent's behavior. Counterfactuals are generated using traversals in this latent space, via gradient-driven updates as well as latent interpolations against cases drawn from a pool of examples. These include updates to raise the likelihood of generated examples, which improves the plausibility of generated counterfactuals. From experiments in three RL environments, we show that these methods produce counterfactuals that are more plausible and proximal to their queries compared to purely outcome-driven or case-based baselines. Finally, we show that a latent jointly trained to reconstruct both the input observations and behavioral outcome variables produces higher-quality counterfactuals over latents trained solely to reconstruct the observation inputs.
Submitted 15 July, 2022;
originally announced July 2022.
-
Optimal Congestion-aware Routing and Offloading in Collaborative Edge Computing
Authors:
Jinkun Zhang,
Yuezhou Liu,
Edmund Yeh
Abstract:
Collaborative edge computing (CEC) is an emerging paradigm where heterogeneous edge devices collaborate to fulfill computation tasks, such as model training or video processing, by sharing communication and computation resources. Nevertheless, the optimal data/result routing and computation offloading strategy in CEC with arbitrary topology still remains an open problem. In this paper, we formulate the flow model of partial-offloading and multi-hop routing for arbitrarily divisible tasks, where each node individually decides its routing/offloading strategy. In contrast to most existing works, our model applies to tasks with non-negligible result size, and allows data sources to be distinct from the result destination. We propose a network-wide cost minimization problem with congestion-aware convex cost functions for communication and computation. Such convex cost covers various performance metrics and constraints, such as average queueing delay with limited processor capacity. Although the problem is non-convex, we provide necessary conditions and sufficient conditions for the global-optimal solution, and devise a fully distributed algorithm that converges to the optimum in polynomial time, allows asynchronous individual updating, and is adaptive to changes in task pattern. Numerical evaluation shows that our proposed method significantly outperforms other baseline algorithms in multiple network instances, especially in congested scenarios.
Submitted 26 May, 2022; v1 submitted 15 May, 2022;
originally announced May 2022.
-
Result and Congestion Aware Optimal Routing and Partial Offloading in Collaborative Edge Computing
Authors:
Jinkun Zhang,
Yuezhou Liu,
Edmund Yeh
Abstract:
Collaborative edge computing (CEC) is an emerging paradigm where heterogeneous edge devices (stakeholders) collaborate to fulfill computation tasks, such as model training or video processing, by sharing communication and computation resources. Nevertheless, the optimal data/result routing and computation offloading strategy in CEC with arbitrary topology still remains an open problem. In this paper, we formulate a partial-offloading and multi-hop routing model for arbitrarily divisible tasks. Each node individually decides the computation of the received data and the forwarding of data/result traffic. In contrast to most existing works, our model applies to tasks with non-negligible result size, and enables separable data sources and result destinations. We propose a network-wide cost minimization problem with congestion-aware cost to jointly optimize routing and computation offloading. This problem covers various performance metrics and constraints, such as average queueing delay with limited processor capacity. Although the problem is non-convex, we provide non-trivial necessary and sufficient conditions for the global-optimal solution, and devise a fully distributed algorithm that converges to the optimum in polynomial time, allows asynchronous individual updating, and is adaptive to changes in network topology or task pattern. Numerical evaluation shows that our proposed method significantly outperforms other baseline algorithms in multiple network instances, especially in congested scenarios.
Submitted 2 May, 2022;
originally announced May 2022.
-
Experimental Design Networks: A Paradigm for Serving Heterogeneous Learners under Networking Constraints
Authors:
Yuezhou Liu,
Yuanyuan Li,
Lili Su,
Edmund Yeh,
Stratis Ioannidis
Abstract:
Significant advances in edge computing capabilities enable learning to occur at geographically diverse locations. In general, the training data needed in those learning tasks are not only heterogeneous but also not fully generated locally. In this paper, we propose an experimental design network paradigm, wherein learner nodes train possibly different Bayesian linear regression models via consuming data streams generated by data source nodes over a network. We formulate this problem as a social welfare optimization problem in which the global objective is defined as the sum of experimental design objectives of individual learners, and the decision variables are the data transmission strategies subject to network constraints. We first show that, assuming Poisson data streams, the global objective is a continuous DR-submodular function. We then propose a Frank-Wolfe type algorithm that outputs a solution within a 1-1/e factor from the optimal. Our algorithm contains a novel gradient estimation component which is carefully designed based on Poisson tail bounds and sampling. Finally, we complement our theoretical findings through extensive experiments. Our numerical evaluation shows that the proposed algorithm outperforms several baseline algorithms both in maximizing the global objective and in the quality of the trained models.
Submitted 13 January, 2022;
originally announced January 2022.
-
Robust Regression via Model Based Methods
Authors:
Armin Moharrer,
Khashayar Kamran,
Edmund Yeh,
Stratis Ioannidis
Abstract:
The mean squared error loss is widely used in many applications, including auto-encoders, multi-target regression, and matrix factorization, to name a few. Despite computational advantages due to its differentiability, it is not robust to outliers. In contrast, l_p norms are known to be robust, but cannot be optimized via, e.g., stochastic gradient descent, as they are non-differentiable. We propose an algorithm inspired by so-called model-based optimization (MBO) [35, 36], which replaces a non-convex objective with a convex model function and alternates between optimizing the model function and updating the solution. We apply this to robust regression, proposing SADM, a stochastic variant of the Online Alternating Direction Method of Multipliers (OADM) [50] to solve the inner optimization in MBO. We show that SADM converges with the rate O(log T/T). Finally, we demonstrate experimentally (a) the robustness of l_p norms to outliers and (b) the efficiency of our proposed model-based algorithms in comparison with gradient methods on autoencoders and multi-target regression.
Submitted 29 June, 2021; v1 submitted 20 June, 2021;
originally announced June 2021.
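The robustness contrast drawn in this abstract can be illustrated on the simplest possible regression, fitting a single constant to data. This is a minimal sketch and not the SADM algorithm from the paper: it relies only on the textbook facts that the squared-error minimizer of a constant fit is the mean and the l_1 minimizer is the median. The function names and the sample data are hypothetical.

```python
import statistics


def l2_constant_fit(values):
    """Constant c minimizing sum((v - c)**2): the arithmetic mean."""
    return statistics.fmean(values)


def l1_constant_fit(values):
    """Constant c minimizing sum(abs(v - c)): the median."""
    return statistics.median(values)


# Hypothetical data: a clean sample, and the same sample with one outlier.
clean = [1.0, 2.0, 3.0, 4.0, 5.0]
corrupted = clean[:-1] + [100.0]

for name, data in (("clean", clean), ("corrupted", corrupted)):
    print(name, "l2 fit:", l2_constant_fit(data), "l1 fit:", l1_constant_fit(data))
```

A single outlier moves the squared-error fit far from the bulk of the data, while the l_1 fit stays put, which is the behavior the abstract refers to as robustness.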
-
Transmission Delay Minimization via Joint Power Control and Caching in Wireless HetNets
Authors:
Derya Malak,
Faruk V. Mutlu,
Jinkun Zhang,
Edmund M. Yeh
Abstract:
A fundamental challenge in wireless heterogeneous networks (HetNets) is to effectively utilize the limited transmission and storage resources in the presence of increasing deployment density and backhaul capacity constraints. To alleviate bottlenecks and reduce resource consumption, we design optimal caching and power control algorithms for multi-hop wireless HetNets. We formulate a joint optimization framework to minimize the average transmission delay as a function of the caching variables and the signal-to-interference-plus-noise ratios (SINR) which are determined by the transmission powers, while explicitly accounting for backhaul connection costs and the power constraints.
Using convex relaxation and rounding, we obtain a reduced-complexity formulation (RCF) of the joint optimization problem, which can provide a constant factor approximation to the globally optimal solution. We then solve RCF in two ways: 1) alternating optimization of the power and caching variables by leveraging biconvexity, and 2) joint optimization of power control and caching. We characterize the necessary (KKT) conditions for an optimal solution to RCF, and use strict quasi-convexity to show that the KKT points are Pareto optimal for RCF. We then devise a subgradient projection algorithm to jointly update the caching and power variables, and show that under appropriate conditions, the algorithm converges at a linear rate to the local minima of RCF, under general SINR conditions. We support our analytical findings with results from extensive numerical experiments.
Submitted 29 May, 2021;
originally announced May 2021.
-
Rate Allocation and Content Placement in Cache Networks
Authors:
Khashayar Kamran,
Armin Moharrer,
Stratis Ioannidis,
Edmund Yeh
Abstract:
We introduce the problem of optimal congestion control in cache networks, whereby \emph{both} rate allocations and content placements are optimized \emph{jointly}. We formulate this as a maximization problem with non-convex constraints, and propose solving this problem via (a) a Lagrangian barrier algorithm and (b) a convex relaxation. We prove different optimality guarantees for each of these two algorithms; our proofs exploit the fact that the non-convex constraints of our problem involve DR-submodular functions.
Submitted 12 February, 2021; v1 submitted 9 January, 2021;
originally announced January 2021.
-
Selfish Caching Games on Directed Graphs
Authors:
Qian Ma,
Edmund Yeh,
Jianwei Huang
Abstract:
Caching networks can reduce the routing costs of accessing contents by caching contents closer to users. However, cache nodes may belong to different entities and behave selfishly to maximize their own benefits, which often leads to performance degradation for the overall network. While there has been extensive literature on allocating contents to caches to maximize the social welfare, the analysis of selfish caching behaviors remains largely unexplored. In this paper, we model the selfish behaviors of cache nodes as selfish caching games on arbitrary directed graphs with heterogeneous content popularity. We study the existence of a pure strategy Nash equilibrium (PSNE) in selfish caching games, and analyze its efficiency in terms of social welfare. We show that a PSNE does not always exist in arbitrary-topology caching networks. However, if the network does not have a mixed request loop, i.e., a directed loop in which each edge is traversed by at least one content request, we show that a PSNE always exists and can be found in polynomial time. Furthermore, we can avoid mixed request loops by properly choosing request forwarding paths. We then show that the efficiency of Nash equilibria, captured by the price of anarchy (PoA), can be arbitrarily poor if we allow arbitrary content request patterns, and that adding extra cache nodes can make the PoA worse, i.e., a cache paradox occurs. However, when cache nodes have homogeneous request patterns, we show that the PoA is bounded even when arbitrary topologies are allowed. We further analyze the selfish caching games for cache nodes with limited computational capabilities, and show that an approximate PSNE exists with bounded PoA in certain cases of interest. Simulation results show that increasing the cache capacity in the network improves the efficiency of Nash equilibria, while adding extra cache nodes can degrade the efficiency of Nash equilibria.
Submitted 28 December, 2020;
originally announced December 2020.
-
Healthcare Utilization and Perceived Health Status from Falun Gong Practitioners in Taiwan: A Pilot SF-36 Survey
Authors:
Yu-Whuei Hu,
Li-Shan Huang,
Eric J. Yeh,
Mai He
Abstract:
Objective: Falun Gong (FLG) is a practice of mind and body focusing on moral character improvement along with meditative exercises. This 2002 pilot study explored perceived health status, medical resource utilization and related factors among Taiwanese FLG practitioners, compared to the general Taiwanese norm estimated by the 2001 National Health Interview Survey (NHIS). Methods: This cross-sectional, observational study was based on a voluntary, paper-based survey conducted from October 2002 to February 2003 using the same Taiwanese SF-36 instrument employed by the NHIS. Primary outcomes included eight SF-36 domain scores and the number of medical visits. One-sample t-tests, one-way ANOVA and multivariate linear regression analyses were performed. Results: The response rate was 75.6% (1,210/1,600). Compared to the norm, the study cohort had significantly higher scores in six of eight SF-36 domains across gender and age (p<0.05). Among those with chronic diseases, 70% to 89% reported that their conditions had either improved or been cured. 74.2%, 79.2%, 83.3%, and 85.6% quit alcohol drinking, smoking, chewing betel nuts, and gambling, respectively. 62.7% reported a reduced number of medical visits (mean=13.53 before; mean=5.87 after). Conclusions: In this subject cohort, practicing FLG led to higher perceived health scores and reduced health resource utilization compared to the norm.
Submitted 29 July, 2020;
originally announced July 2020.
-
Spatial Soft-Core Caching
Authors:
Derya Malak,
Muriel Médard,
Edmund Yeh
Abstract:
We propose a decentralized spatial soft-core cache placement (SSCC) policy for wireless networks. SSCC yields spatially balanced sampling via negative dependence across caches, and can be tuned to satisfy cache size constraints with high probability. Given a desired cache hit probability, we compare the 95% confidence intervals of the required cache sizes for independent placement, hard-core placement and SSCC policies. We demonstrate that in terms of the required cache storage size, SSCC can provide gains of up to more than 180% and 100% with respect to the independent and hard-core placement policies, respectively. SSCC can be used to enable proximity-based applications such as device-to-device communications and peer-to-peer networking, as it promotes item diversity and reciprocation among the nodes.
Submitted 30 January, 2019;
originally announced January 2019.
-
Kelly Cache Networks
Authors:
Milad Mahdian,
Armin Moharrer,
Stratis Ioannidis,
Edmund Yeh
Abstract:
We study networks of M/M/1 queues in which nodes act as caches that store objects. Exogenous requests for objects are routed towards nodes that store them; as a result, object traffic in the network is determined not only by demand but, crucially, by where objects are cached. We determine how to place objects in caches to attain a certain design objective, such as minimizing network congestion or retrieval delays. We show that for a broad class of objectives, including minimizing both the expected network delay and the sum of network queue lengths, this optimization problem can be cast as an NP-hard submodular maximization problem. We show that the so-called continuous greedy algorithm attains a ratio arbitrarily close to $1 - 1/e \approx 0.63$ using a deterministic estimation via a power series; this drastically reduces execution time over prior art, which resorts to sampling. Finally, we show that our results generalize, beyond M/M/1 queues, to networks of M/M/k and symmetric M/D/1 queues.
Submitted 28 January, 2019; v1 submitted 13 January, 2019;
originally announced January 2019.
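For readers unfamiliar with the queueing objective mentioned in this abstract, the following is a minimal sketch of the textbook M/M/1 result that the expected number of jobs in the system is rho / (1 - rho), with rho the ratio of arrival rate to service rate. It is not the paper's cost function or placement algorithm; the function names and the example rates are hypothetical.

```python
def mm1_expected_in_system(arrival_rate, service_rate):
    """Expected number of jobs in a stable M/M/1 queue: rho / (1 - rho)."""
    if arrival_rate < 0 or service_rate <= 0:
        raise ValueError("rates must satisfy arrival_rate >= 0 and service_rate > 0")
    rho = arrival_rate / service_rate
    if rho >= 1:
        raise ValueError("queue is unstable: arrival rate must be below service rate")
    return rho / (1 - rho)


def total_expected_in_system(links):
    """Sum the M/M/1 expectation over (arrival_rate, service_rate) pairs."""
    return sum(mm1_expected_in_system(lam, mu) for lam, mu in links)


# Hypothetical two-link network: (arrival rate, service rate) per link.
links = [(1.0, 2.0), (3.0, 4.0)]
print(total_expected_in_system(links))
```

Because the expression grows without bound as rho approaches 1, shifting traffic away from heavily loaded links (for example by caching objects closer to where they are requested) reduces the sum sharply.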
-
Mixed-Timescale Online PHY Caching for Dual-Mode MIMO Cooperative Networks
Authors:
An Liu,
Vincent Lau,
Wenchao Ding,
Edmund Yeh
Abstract:
Recently, physical layer (PHY) caching has been proposed to exploit the dynamic side information induced by caches at base stations (BSs) to support Coordinated Multi-Point (CoMP) and achieve huge degrees of freedom (DoF) gains. Due to the limited cache storage capacity, the performance of PHY caching depends heavily on the cache content placement algorithm. In existing algorithms, the cache content placement is adaptive to the long-term popularity distribution in an offline manner. We propose an online PHY caching framework which adapts the cache content placement to microscopic spatial and temporal popularity variations to fully exploit the benefits of PHY caching. Specifically, the joint optimization of online cache content placement and content delivery is formulated as a mixed-timescale drift minimization problem to increase the CoMP opportunity and reduce the cache content placement cost. We propose a low-complexity algorithm to obtain a throughput-optimal solution. Moreover, we provide a closed-form characterization of the maximum sum DoF in the stability region and study the impact of key system parameters on the stability region. Simulations show that the proposed online PHY caching framework achieves large gains over existing solutions.
Submitted 17 October, 2018;
originally announced October 2018.
-
ARQ with Cumulative Feedback to Compensate for Burst Errors
Authors:
Derya Malak,
Muriel Medard,
Edmund M. Yeh
Abstract:
We propose a cumulative feedback-based ARQ (CF ARQ) protocol for a sliding window of size 2 over packet erasure channels with unreliable feedback. We exploit a matrix signal-flow graph approach to analyze probability-generating functions of transmission and delay times. Contrasting its performance with that of the uncoded baseline scheme for ARQ, developed by Ausavapattanakun and Nosratinia, we demonstrate that CF ARQ can provide significantly lower average delay under bursty feedback, and gains of up to about 20% in terms of throughput. We also outline the benefits of CF ARQ under burst errors and asymmetric channel conditions. The protocol is more predictable across statistics, and hence more stable. This can help design robust systems when feedback is unreliable. This feature may be preferable for meeting the strict end-to-end latency and reliability requirements of future use cases of ultra-reliable low-latency communications in 5G, such as mission-critical communications and industrial control for critical control messaging.
Submitted 7 August, 2018;
originally announced August 2018.
-
Tiny Codes for Guaranteeable Delay
Authors:
Derya Malak,
Muriel Médard,
Edmund M. Yeh
Abstract:
Future 5G systems will need to support ultra-reliable low-latency communications scenarios. From a latency-reliability viewpoint, it is inefficient to rely on average utility-based system design. Therefore, we introduce the notion of guaranteeable delay which is the average delay plus three standard deviations of the mean. We investigate the trade-off between guaranteeable delay and throughput for point-to-point wireless erasure links with unreliable and delayed feedback, by bringing signal-flow techniques to the area of coding. We use tiny codes, i.e., sliding-window coding with just 2 packets, and design three variations of selective-repeat ARQ protocols, by building on the baseline scheme, i.e., uncoded ARQ, developed by Ausavapattanakun and Nosratinia: (i) Hybrid ARQ with soft combining at the receiver; (ii) cumulative feedback-based ARQ without rate adaptation; and (iii) Coded ARQ with rate adaptation based on the cumulative feedback. Contrasting the performance of these protocols with uncoded ARQ, we demonstrate that HARQ performs only slightly better, cumulative feedback-based ARQ does not provide significant throughput gains although it has better average delay, and Coded ARQ can provide gains of up to about 40% in terms of throughput. Coded ARQ also provides delay guarantees, and is robust to various challenges such as imperfect and delayed feedback, burst erasures, and round-trip time fluctuations. This feature may be preferable for meeting the strict end-to-end latency and reliability requirements of future use cases of ultra-reliable low-latency communications in 5G, such as mission-critical communications and industrial control for critical control messaging.
Submitted 1 February, 2019; v1 submitted 14 June, 2018;
originally announced June 2018.
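The guaranteeable-delay metric defined in this abstract (average delay plus three standard deviations) can be sketched from a list of measured delays. This is only an illustration under stated assumptions: the paper derives the metric analytically via signal-flow graphs, whereas this sketch estimates it empirically, interprets the deviation as the population standard deviation of the delay samples, and uses a hypothetical function name and made-up sample values.

```python
import statistics


def guaranteeable_delay(delays, num_std=3.0):
    """Mean delay plus num_std standard deviations (3 per the abstract).

    Assumption: the population standard deviation of the samples is used.
    """
    if not delays:
        raise ValueError("need at least one delay sample")
    mean = statistics.fmean(delays)
    std = statistics.pstdev(delays)
    return mean + num_std * std


# Hypothetical per-packet delays, in arbitrary time units.
samples = [10.0, 12.0, 11.0, 13.0, 14.0]
print(guaranteeable_delay(samples))
```

Two protocols with the same average delay can differ widely under this metric, since the one with the more variable delay is penalized by the standard deviation term.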
-
Updating Content in Cache-Aided Coded Multicast
Authors:
Milad Mahdian,
N. Prakash,
Muriel Médard,
Edmund Yeh
Abstract:
Motivated by applications to delivery of dynamically updated, but correlated data in settings such as content distribution networks and distributed file sharing systems, we study a single source multiple destination network coded multicast problem in a cache-aided network. We focus on models where the caches are primarily located near the destinations, and where the source has no cache. The source observes a sequence of correlated frames, and is expected to do frame-by-frame encoding with no access to prior frames. We present a novel scheme that shows how the caches can be advantageously used to decrease the overall cost of multicast, even though the source encodes without access to past data. Our cache design and update scheme works with any choice of network code designed for a corresponding cache-less network, is largely decentralized, and works for an arbitrary network. We study a convex relaxation of the optimization problem that results from the overall cost function. The results of the optimization problem determine the rate allocation and caching strategies. Numerous simulation results are presented to substantiate the theory developed.
Submitted 1 May, 2018;
originally announced May 2018.
-
Analysis of Coded Selective-Repeat ARQ via Matrix Signal-Flow Graphs
Authors:
Derya Malak,
Muriel Médard,
Edmund M. Yeh
Abstract:
We propose two schemes for selective-repeat ARQ protocols over packet erasure channels with unreliable feedback: (i) a hybrid ARQ protocol with soft combining at the receiver, and (ii) a coded ARQ protocol, by building on the uncoded baseline scheme for ARQ, developed by Ausavapattanakun and Nosratinia. Our method leverages discrete-time queuing and coding theory to analyze the performance of the proposed data transmission methods. We incorporate forward error-correction to reduce in-order delivery delay, and exploit a matrix signal-flow graph approach to analyze the throughput and delay of the protocols. We demonstrate and contrast the performance of the coded protocols with that of the uncoded scheme, illustrating the benefits of coded transmissions.
Submitted 31 January, 2018;
originally announced January 2018.
-
MinDelay: Low-latency Forwarding and Caching Algorithms for Information-Centric Networks
Authors:
Milad Mahdian,
Edmund Yeh
Abstract:
We present a new unified framework for minimizing congestion-dependent network cost in information-centric networks by jointly optimizing forwarding and caching strategies. As caching variables are integer-constrained, the resulting optimization problem is NP-hard. To make progress, we focus on a relaxed version of the optimization problem, where caching variables are allowed to be real-valued. We develop necessary optimality conditions for the relaxed problem, and leverage this result to design MinDelay, an adaptive and distributed joint forwarding and caching algorithm, based on the conditional gradient algorithm. The MinDelay algorithm elegantly yields feasible routing variables and integer caching variables at each iteration, and can be implemented in a distributed manner with low complexity and overhead. Over a wide range of network topologies, simulation results show that MinDelay typically has significantly better delay performance in the low to moderate request rate regions. Furthermore, the MinDelay and VIP algorithms complement each other in delivering superior delay performance across the entire range of request arrival rates.
Submitted 14 October, 2017;
originally announced October 2017.
-
Robustness of Interdependent Random Geometric Networks
Authors:
Jianan Zhang,
Edmund Yeh,
Eytan Modiano
Abstract:
We propose an interdependent random geometric graph (RGG) model for interdependent networks. Based on this model, we study the robustness of two interdependent spatially embedded networks where interdependence exists between geographically nearby nodes in the two networks. We study the emergence of the giant mutual component in two interdependent RGGs as node densities increase, and define the percolation threshold as a pair of node densities above which the giant mutual component first appears. In contrast to the case for a single RGG, where the percolation threshold is a unique scalar for a given connection distance, for two interdependent RGGs, multiple pairs of percolation thresholds may exist, given that a smaller node density in one RGG may increase the minimum node density in the other RGG in order for a giant mutual component to exist. We derive analytical upper bounds on the percolation thresholds of two interdependent RGGs by discretization, and obtain $99\%$ confidence intervals for the percolation thresholds by simulation. Based on these results, we derive conditions for the interdependent RGGs to be robust under random failures and geographical attacks.
Submitted 4 June, 2018; v1 submitted 9 September, 2017;
originally announced September 2017.
-
Jointly Optimal Routing and Caching for Arbitrary Network Topologies
Authors:
Stratis Ioannidis,
Edmund Yeh
Abstract:
We study a problem of fundamental importance to ICNs, namely, minimizing routing costs by jointly optimizing caching and routing decisions over an arbitrary network topology. We consider both source routing and hop-by-hop routing settings. The respective offline problems are NP-hard. Nevertheless, we show that there exist polynomial time approximation algorithms producing solutions within a constant approximation from the optimal. We also produce distributed, adaptive algorithms with the same approximation guarantees. We simulate our adaptive algorithms over a broad array of different topologies. Our algorithms reduce routing costs by several orders of magnitude compared to prior art, including algorithms optimizing caching under fixed routing.
Submitted 20 August, 2017;
originally announced August 2017.
-
Forward Collision Vehicular Radar with IEEE 802.11: Feasibility Demonstration through Measurements
Authors:
Enoch R. Yeh,
Robert C. Daniels,
Robert W. Heath, Jr
Abstract:
Increasing safety and automation in transportation systems has led to the proliferation of radar and IEEE 802.11 dedicated short range communication (DSRC) in vehicles. Current implementations of vehicular radar devices, however, are expensive, use a substantial amount of bandwidth, and are susceptible to multiple security risks. We consider the feasibility of using an IEEE 802.11 orthogonal frequency division multiplexing (OFDM) communications waveform to perform radar functions. In this paper, we present an approach that determines the mean-normalized channel energy from frequency domain channel estimates and models it as a direct sinusoidal function of target range, enabling closest target range estimation. In addition, we propose an alternative to vehicular forward collision detection by extending IEEE 802.11 dedicated short-range communications (DSRC) and WiFi technology to radar, providing a foundation for a joint communications and radar framework. Furthermore, we perform an experimental demonstration using existing IEEE 802.11 devices with minimal modification through algorithm processing on frequency-domain channel estimates. The results of this paper show that our solution delivers similar accuracy and reliability to mmWave radar devices with as little as 20 MHz of spectrum (doubling DSRC's 10 MHz allocation), indicating significant potential for industrial devices with joint vehicular communications and radar capabilities.
Submitted 5 June, 2017; v1 submitted 10 February, 2017;
originally announced February 2017.
-
Scaled VIP Algorithms for Joint Dynamic Forwarding and Caching in Named Data Networks
Authors:
Fan Lai,
Feng Qiu,
Wenjie Bian,
Ying Cui,
Edmund Yeh
Abstract:
Emerging Information-Centric Networking (ICN) architectures seek to optimally utilize both bandwidth and storage for efficient content distribution over the network. The Virtual Interest Packet (VIP) framework has been proposed to enable joint design of forwarding and caching within the Named Data Networking (NDN) architecture. The virtual plane of the VIP framework captures the measured demand for content objects, but does not reflect interest collapse and suppression in the NDN network. We aim to further improve the performance of the existing VIP algorithms by using a modified virtual plane where VIP counts are appropriately scaled to reflect interest suppression effects. We characterize the stability region of the modified virtual plane with VIP scaling, develop a new distributed forwarding and caching algorithm operating on the scaled VIPs, and demonstrate the throughput optimality of the scaled VIP algorithm in the virtual plane. Numerical experiments demonstrate significantly enhanced performance relative to the existing VIP algorithm, as well as a number of other baseline algorithms.
Submitted 15 August, 2016;
originally announced August 2016.
-
Cascading Node Failure with Continuous States in Random Geometric Networks
Authors:
Khashayar Kamran,
Edmund Yeh
Abstract:
The increasing complexity and interdependency of today's networks highlight the importance of studying network robustness to failure and attacks. Many large-scale networks are prone to cascading effects where a limited number of initial failures (due to attacks, natural hazards or resource depletion) propagate through a dependent mechanism, ultimately leading to a global failure scenario where a substantial fraction of the network loses its functionality. These cascading failure scenarios often take place in networks which are embedded in space and constrained by geometry. Building on previous results on cascading failure in random geometric networks, we introduce and analyze a continuous cascading failure model where a node has an initial continuously-valued state, and fails if the aggregate state of its neighbors falls below a threshold. Within this model, we derive analytical conditions for the occurrence and non-occurrence of cascading node failure, respectively.
Submitted 25 July, 2016;
originally announced July 2016.
-
Enhanced VIP Algorithms for Forwarding, Caching, and Congestion Control in Named Data Networks
Authors:
Ying Cui,
Fan Lai,
Edmund Yeh,
Ran Liu
Abstract:
Emerging Information-Centric Networking (ICN) architectures seek to optimally utilize both bandwidth and storage for efficient content distribution over the network. The Virtual Interest Packet (VIP) framework has been proposed to enable joint design of forwarding, caching, and congestion control strategies within the Named Data Networking (NDN) architecture. While the existing VIP algorithms exhibit good performance, they are primarily focused on maximizing network throughput and utility, and do not explicitly consider user delay. In this paper, we develop a new class of enhanced algorithms for joint dynamic forwarding, caching and congestion control within the VIP framework. These enhanced VIP algorithms adaptively stabilize the network and maximize network utility, while improving the delay performance by intelligently making use of VIP information beyond one hop. Generalizing Lyapunov drift techniques, we prove the throughput optimality and characterize the utility-delay tradeoff of the enhanced VIP algorithms. Numerical experiments demonstrate the superior performance of the resulting enhanced algorithms for handling Interest Packets and Data Packets within the actual plane, in terms of low network delay and high network utility.
Submitted 12 July, 2016;
originally announced July 2016.
-
Adaptive Caching Networks with Optimality Guarantees
Authors:
Stratis Ioannidis,
Edmund Yeh
Abstract:
We study the problem of optimal content placement over a network of caches, a problem naturally arising in several networking applications, including ICNs, CDNs, and P2P systems. Given a demand of content request rates and paths followed, we wish to determine the content placement that maximizes the expected caching gain, i.e., the reduction of routing costs due to intermediate caching. The offline version of this problem is NP-hard and, in general, the demand and topology may be a priori unknown. Hence, a distributed, adaptive, constant approximation content placement algorithm is desired. We show that path replication, a simple algorithm frequently encountered in the literature, can be arbitrarily suboptimal when combined with traditional eviction policies, like LRU, LFU, or FIFO. We propose a distributed, adaptive algorithm that performs stochastic gradient ascent on a concave relaxation of the expected caching gain, and constructs a probabilistic content placement within a 1-1/e factor of the optimal, in expectation. Motivated by our analysis, we also propose a novel greedy eviction policy to be used with path replication, and show through numerical evaluations that both algorithms significantly outperform path replication with traditional eviction policies over a broad array of network topologies.
Submitted 11 April, 2016;
originally announced April 2016.
-
Practical Accounting in Content-Centric Networking (extended version)
Authors:
Cesar Ghali,
Gene Tsudik,
Christopher A. Wood,
Edmund Yeh
Abstract:
Content-Centric Networking (CCN) is a new class of network architectures designed to address some key limitations of the current IP-based Internet. One of its main features is in-network content caching, which allows requests for content to be served by routers. Despite improved bandwidth utilization and lower latency for popular content retrieval, in-network content caching offers producers no means of collecting information about content that is requested and later served from network caches. Such information is often needed for accounting purposes. In this paper, we design some secure accounting schemes that vary in the degree of consumer, router, and producer involvement. Next, we identify and analyze performance and security tradeoffs, and show that specific per-consumer accounting is impossible in the presence of router caches and without application-specific support. We then recommend accounting strategies that entail a few simple requirements for CCN architectures. Finally, our experimental results show that forms of native and secure CCN accounting are both more viable and practical than application-specific approaches with little modification to the existing architecture and protocol.
Submitted 7 October, 2015;
originally announced October 2015.
-
A Perspective on Future Research Directions in Information Theory
Authors:
Jeffrey G. Andrews,
Alexandros Dimakis,
Lara Dolecek,
Michelle Effros,
Muriel Medard,
Olgica Milenkovic,
Andrea Montanari,
Sriram Vishwanath,
Edmund Yeh,
Randall Berry,
Ken Duffy,
Soheil Feizi,
Saul Kato,
Manolis Kellis,
Stuart Licht,
Jon Sorenson,
Lav Varshney,
Haris Vikalo
Abstract:
Information theory is rapidly approaching its 70th birthday. What are promising future directions for research in information theory? Where will information theory be having the most impact in 10-20 years? What new and emerging areas are ripe for the most impact, of the sort that information theory has had on the telecommunications industry over the last 60 years? How should the IEEE Information Theory Society promote high-risk new research directions and broaden the reach of information theory, while continuing to be true to its ideals and insisting on the intellectual rigor that makes its breakthroughs so powerful? These are some of the questions that an ad hoc committee (composed of the present authors) explored over the past two years. We have discussed and debated these questions, and solicited detailed inputs from experts in fields including genomics, biology, economics, and neuroscience. This report is the result of these discussions.
Submitted 21 July, 2015;
originally announced July 2015.
-
Throughput and Delay Scaling of Content-Centric Ad Hoc and Heterogeneous Wireless Networks
Authors:
Milad Mahdian,
Edmund Yeh
Abstract:
We study the throughput and delay characteristics of wireless caching networks, where users are mainly interested in retrieving content stored in the network, rather than in maintaining source-destination communication. Nodes are assumed to be uniformly distributed in the network area. Each node has a limited-capacity content store, which it uses to cache contents. We propose an achievable caching and transmission scheme whereby requesters retrieve content from the caching point which is closest in Euclidean distance. We establish the throughput and delay scaling of the achievable scheme, and show that the throughput and delay performance are order-optimal within a class of schemes. We then solve the caching optimization problem, and evaluate the network performance for a Zipf content popularity distribution, letting the number of content types and the network size both go to infinity. Finally, we extend our analysis to heterogeneous wireless networks where, in addition to wireless nodes, there are a number of base stations uniformly distributed at random in the network area. We show that in order to achieve a better performance in a heterogeneous network in the order sense, the number of base stations needs to be greater than the ratio of the number of nodes to the number of content types. Furthermore, we show that the heterogeneous network does not yield performance advantages in the order sense if the Zipf content popularity distribution exponent exceeds 3/2.
Submitted 22 April, 2017; v1 submitted 14 April, 2015;
originally announced April 2015.
-
Optimization-Based Linear Network Coding for General Connections of Continuous Flows
Authors:
Ying Cui,
Muriel Médard,
Edmund Yeh,
Douglas Leith,
Ken Duffy
Abstract:
For general connections, the problem of finding network codes and optimizing resources for those codes is intrinsically difficult and little is known about its complexity. Most of the existing solutions rely on very restricted classes of network codes in terms of the number of flows allowed to be coded together, and are not entirely distributed. In this paper, we consider a new method for constructing linear network codes for general connections of continuous flows to minimize the total cost of edge use based on mixing. We first formulate the minimum-cost network coding design problem. To solve the optimization problem, we propose two equivalent alternative formulations with discrete mixing and continuous mixing, respectively, and develop distributed algorithms to solve them. Our approach allows fairly general coding across flows and guarantees no greater cost than any solution without network coding.
Submitted 27 February, 2015; v1 submitted 23 February, 2015;
originally announced February 2015.
-
A Linear Network Code Construction for General Integer Connections Based on the Constraint Satisfaction Problem
Authors:
Ying Cui,
Muriel Médard,
Fan Lai,
Edmund Yeh,
Douglas Leith,
Ken Duffy,
Dhaivat Pandya
Abstract:
The problem of finding network codes for general connections is inherently difficult in capacity constrained networks. Resource minimization for general connections with network coding is further complicated. Existing methods for identifying solutions mainly rely on highly restricted classes of network codes, and are almost all centralized. In this paper, we introduce linear network mixing coefficients for code constructions of general connections that generalize random linear network coding (RLNC) for multicast connections. For such code constructions, we pose the problem of cost minimization for the subgraph involved in the coding solution and relate this minimization to a path-based Constraint Satisfaction Problem (CSP) and an edge-based CSP. While CSPs are NP-complete in general, we present a path-based probabilistic distributed algorithm and an edge-based probabilistic distributed algorithm with almost sure convergence in finite time by applying Communication Free Learning (CFL). Our approach allows fairly general coding across flows, guarantees no greater cost than routing, and shows a possible distributed implementation. Numerical results illustrate the performance improvement of our approach over existing methods.
Submitted 2 July, 2016; v1 submitted 23 February, 2015;
originally announced February 2015.
-
Enhancing the Delay Performance of Dynamic Backpressure Algorithms
Authors:
Ying Cui,
Edmund M. Yeh,
Ran Liu
Abstract:
For general multi-hop queueing networks, delay optimal network control has unfortunately been an outstanding problem. The dynamic backpressure (BP) algorithm elegantly achieves throughput optimality, but does not yield good delay performance in general. In this paper, we obtain an asymptotically delay optimal control policy, which resembles the BP algorithm in basing resource allocation and routing on a backpressure calculation, but differs from the BP algorithm in the form of the backpressure calculation employed. The difference suggests a possible reason for the unsatisfactory delay performance of the BP algorithm, i.e., the myopic nature of the BP control. Motivated by this new connection, we introduce a new class of enhanced backpressure-based algorithms which incorporate a general queue-dependent bias function into the backpressure term of the traditional BP algorithm to improve delay performance. These enhanced algorithms exploit queue state information beyond one hop. We prove the throughput optimality and characterize the utility-delay tradeoff of the enhanced algorithms. We further focus on two specific distributed algorithms within this class, which have demonstrably improved delay performance as well as acceptable implementation complexity.
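As a rough illustration of the backpressure term mentioned above, the following sketch picks, for one link, the commodity with the largest backlog differential, optionally shifted by a bias. This is a minimal sketch of the traditional per-link BP rule, not the paper's algorithm: the function name `backpressure_decision`, the `queues` layout, and the `bias(node, commodity)` callable are all assumptions made for illustration, and the abstract does not specify the form of the bias function.

```python
def backpressure_decision(queues, link, bias=None):
    """Choose the commodity to serve on a directed link (a, b).

    queues: dict mapping node -> {commodity: backlog} (hypothetical layout).
    bias:   optional callable (node, commodity) -> float added to the backlog
            before differencing; a placeholder for a queue-dependent bias.
    Returns (commodity, weight); commodity is None if no differential is positive.
    """
    a, b = link
    best_commodity, best_weight = None, 0.0
    for commodity, backlog_a in queues[a].items():
        backlog_b = queues[b].get(commodity, 0)
        # Biased backlog at each endpoint of the link.
        qa = backlog_a + (bias(a, commodity) if bias else 0.0)
        qb = backlog_b + (bias(b, commodity) if bias else 0.0)
        weight = qa - qb
        # Keep the commodity with the largest strictly positive differential.
        if weight > best_weight:
            best_commodity, best_weight = commodity, weight
    return best_commodity, best_weight


if __name__ == "__main__":
    queues = {"a": {"x": 5, "y": 2}, "b": {"x": 1, "y": 4}}
    print(backpressure_decision(queues, ("a", "b")))
    print(backpressure_decision(queues, ("b", "a")))
```

With no bias the rule looks only at the two endpoints of the link, which is the one-hop, myopic behavior the abstract refers to; a bias that encodes information from further downstream would change which commodity wins.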
Submitted 21 April, 2015; v1 submitted 11 February, 2015;
originally announced February 2015.
-
Forwarding, Caching and Congestion Control in Named Data Networks
Authors:
Edmund Yeh,
Tracey Ho,
Ying Cui,
Ran Liu,
Michael Burd,
Derek Leong
Abstract:
Emerging information-centric networking architectures seek to optimally utilize both bandwidth and storage for efficient content distribution. This highlights the need for joint design of traffic engineering and caching strategies, in order to optimize network performance in view of both current traffic loads and future traffic demands. We present a systematic framework for joint dynamic interest request forwarding and dynamic cache placement and eviction, within the context of the Named Data Networking (NDN) architecture. The framework employs a virtual control plane which operates on the user demand rate for data objects in the network, and an actual plane which handles Interest Packets and Data Packets. We develop distributed algorithms within the virtual plane to achieve network load balancing through dynamic forwarding and caching, thereby maximizing the user demand rate that the NDN network can satisfy. Next, we show that congestion control can be optimally combined with forwarding and caching within this framework to maximize user utilities subject to network stability. Numerical experiments within a number of network settings demonstrate the superior performance of the resulting algorithms for the actual plane in terms of high user utilities, low user delay, and high rate of cache hits.
Submitted 25 February, 2016; v1 submitted 21 October, 2013;
originally announced October 2013.
-
Approaching Gaussian Relay Network Capacity in the High SNR Regime: End-to-End Lattice Codes
Authors:
Yun Xu,
Edmund Yeh,
Muriel Medard
Abstract:
We present a natural and low-complexity technique for achieving the capacity of the Gaussian relay network in the high SNR regime. Specifically, we propose the use of end-to-end structured lattice codes with the amplify-and-forward strategy, where the source uses a nested lattice code to encode the messages and the destination decodes the messages by lattice decoding. All intermediate relays simply amplify and forward the received signals over the network to the destination. We show that the end-to-end lattice-coded amplify-and-forward scheme approaches the capacity of the layered Gaussian relay network in the high SNR regime. Next, we extend our scheme to non-layered Gaussian relay networks under the amplify-and-forward scheme, which can be viewed as a Gaussian intersymbol interference (ISI) channel. Compared with other schemes, our approach is significantly simpler and requires only the end-to-end design of the lattice precoding and decoding. It does not require any knowledge of the network topology or the individual channel gains.
Submitted 15 September, 2013; v1 submitted 20 July, 2013;
originally announced July 2013.
-
Deterministic Network Model Revisited: An Algebraic Network Coding Approach
Authors:
MinJi Kim,
Elona Erez,
Edmund M. Yeh,
Muriel Medard
Abstract:
The capacity of multiuser networks has been a long-standing problem in information theory. Recently, Avestimehr et al. have proposed a deterministic network model to approximate multiuser wireless networks. This model, known as the ADT network model, takes into account the broadcast nature of wireless medium and interference.
We show that the ADT network model can be described within the algebraic network coding framework introduced by Koetter and Medard. We prove that the ADT network problem can be captured by a single matrix, and show that the min-cut of an ADT network is the rank of this matrix, thus eliminating the need to optimize over an exponential number of cuts between two nodes to compute the min-cut of an ADT network. We extend the capacity characterization for ADT networks to a more general set of connections, including single unicast/multicast connections and non-multicast connections such as multiple multicast, disjoint multicast, and two-level multicast. We also provide sufficiency conditions for achievability in ADT networks for any general connection set. In addition, we show that random linear network coding, a randomized distributed algorithm for network code construction, achieves the capacity for the connections listed above. Furthermore, we extend the ADT networks to those with random erasures and cycles (thus, allowing bi-directional links).
In addition, we propose an efficient linear code construction for the deterministic wireless multicast relay network model. Avestimehr et al.'s proposed code construction is not guaranteed to be efficient and may potentially involve an infinite block length. Unlike several previous coding schemes, we do not attempt to find flows in the network. Instead, for a layered network, we maintain an invariant where it is required that at each stage of the code construction, certain sets of codewords are linearly independent.
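The statement that the min-cut equals the rank of a single matrix means the computation reduces to Gaussian elimination over a finite field. The sketch below computes a rank over GF(2), which is an assumption here: the abstract does not state the field or how the matrix is built, so `gf2_rank` and the example matrix are purely illustrative placeholders rather than the paper's system matrix.

```python
def gf2_rank(matrix):
    """Rank over GF(2) of a matrix given as a list of equal-length 0/1 rows."""
    rows = [list(r) for r in matrix]  # work on a copy
    n_cols = len(rows[0]) if rows else 0
    rank = 0
    for col in range(n_cols):
        # Find a row at or below position `rank` with a 1 in this column.
        pivot = next((i for i in range(rank, len(rows)) if rows[i][col]), None)
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        # Clear this column in every other row (addition in GF(2) is XOR).
        for i in range(len(rows)):
            if i != rank and rows[i][col]:
                rows[i] = [x ^ y for x, y in zip(rows[i], rows[rank])]
        rank += 1
    return rank


if __name__ == "__main__":
    # Hypothetical 3x3 matrix whose rows sum to zero modulo 2.
    example = [[1, 1, 0],
               [0, 1, 1],
               [1, 0, 1]]
    print(gf2_rank(example))
```

The example matrix has determinant 2 over the reals, so it is full rank there but rank-deficient modulo 2, which is why the elimination has to be carried out in the field the network model uses.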
Submitted 6 May, 2011; v1 submitted 4 March, 2011;
originally announced March 2011.
-
The Impact of Incomplete Information on Games in Parallel Relay Networks
Authors:
Hongda Xiao,
Edmund M. Yeh
Abstract:
We consider the impact of incomplete information on incentives for node cooperation in parallel relay networks with one source node, one destination node, and multiple relay nodes. All nodes are selfish and strategic, interested in maximizing their own profit instead of the social welfare. We consider the practical situation where the channel state on any given relay path is not observable to the source or to the other relays. We examine different bargaining relationships between the source and the relays, and propose a framework for analyzing the efficiency loss induced by incomplete information. We analyze the source of the efficiency loss, and quantify the amount of inefficiency which results.
Submitted 20 January, 2011;
originally announced January 2011.
-
Cascading Link Failure in the Power Grid: A Percolation-Based Analysis
Authors:
Hongda Xiao,
Edmund Yeh
Abstract:
Large-scale power blackouts caused by cascading failure are inflicting enormous socioeconomic costs. We study the problem of cascading link failures in power networks modelled by random geometric graphs from a percolation-based viewpoint. To reflect the fact that links fail according to the amount of power flow going through them, we introduce a model where links fail according to a probability which depends on the number of neighboring links. We devise a mapping which maps links in a random geometric graph to nodes in a corresponding dual covering graph. This mapping enables us to obtain the first-known analytical conditions on the existence and non-existence of a large component of operational links after degree-dependent link failures. Next, we present a simple but descriptive model for cascading link failure, and use the degree-dependent link failure results to obtain the first-known analytical conditions on the existence and non-existence of cascading link failures.
Submitted 8 December, 2010; v1 submitted 19 November, 2010;
originally announced November 2010.
-
Polar codes for the two-user multiple-access channel
Authors:
Eren Sasoglu,
Emre Telatar,
Edmund Yeh
Abstract:
Arikan's polar coding method is extended to two-user multiple-access channels. It is shown that if the two users of the channel use the Arikan construction, the resulting channels will polarize to one of five possible extremals, on each of which uncoded transmission is optimal. The sum rate achieved by this coding technique is the one that corresponds to uniform input distributions. The encoding and decoding complexities and the error performance of these codes are as in the single-user case: $O(n\log n)$ for encoding and decoding, and $o(\exp(-n^{1/2-\epsilon}))$ for block error probability, where $n$ is the block length.
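To make the $O(n\log n)$ encoding cost concrete, here is a minimal sketch of the single-user Arikan transform $x = u F^{\otimes m}$ over GF(2), with $F = \begin{pmatrix}1&0\\1&1\end{pmatrix}$ and $n = 2^m$, which each of the two users would apply to their own input. The function name `polar_transform` is hypothetical, the natural (non-bit-reversed) index ordering is an assumption, and the sketch covers only the linear transform, not the choice of frozen positions or the decoder.

```python
def polar_transform(u):
    """Apply the Arikan transform x = u * F^{(x)m} over GF(2).

    u: list of 0/1 values whose length n is a power of two.
    Uses log2(n) butterfly stages of n/2 XORs each, i.e. O(n log n) work.
    Natural (non-bit-reversed) ordering is assumed.
    """
    n = len(u)
    if n == 0 or n & (n - 1):
        raise ValueError("block length must be a power of two")
    x = list(u)
    step = 1
    while step < n:
        # Within each block of size 2*step, XOR the upper half into the lower half.
        for start in range(0, n, 2 * step):
            for i in range(start, start + step):
                x[i] ^= x[i + step]
        step *= 2
    return x


if __name__ == "__main__":
    print(polar_transform([0, 0, 0, 1]))
```

Because $F^2 = I$ over GF(2), applying the transform twice returns the original vector, which gives a quick sanity check for any implementation.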
Submitted 22 June, 2010;
originally announced June 2010.