-
GraphPipe: Improving Performance and Scalability of DNN Training with Graph Pipeline Parallelism
Authors:
Byungsoo Jeon,
Mengdi Wu,
Shiyi Cao,
Sunghyun Kim,
Sunghyun Park,
Neeraj Aggarwal,
Colin Unger,
Daiyaan Arfeen,
Peiyuan Liao,
Xupeng Miao,
Mohammad Alizadeh,
Gregory R. Ganger,
Tianqi Chen,
Zhihao Jia
Abstract:
Deep neural networks (DNNs) continue to grow rapidly in size, making them infeasible to train on a single device. Pipeline parallelism is commonly used in existing DNN systems to support large-scale DNN training by partitioning a DNN into multiple stages, which concurrently perform DNN training for different micro-batches in a pipeline fashion. However, existing pipeline-parallel approaches only consider sequential pipeline stages and thus ignore the topology of a DNN, resulting in missed model-parallel opportunities. This paper presents graph pipeline parallelism (GPP), a new pipeline-parallel scheme that partitions a DNN into pipeline stages whose dependencies are identified by a directed acyclic graph. GPP generalizes existing sequential pipeline parallelism and preserves the inherent topology of a DNN to enable concurrent execution of computationally-independent operators, resulting in reduced memory requirement and improved GPU performance. In addition, we develop GraphPipe, a distributed system that exploits GPP strategies to enable performant and scalable DNN training. GraphPipe partitions a DNN into a graph of stages, optimizes micro-batch schedules for these stages, and parallelizes DNN training using the discovered GPP strategies. Evaluation on a variety of DNNs shows that GraphPipe outperforms existing pipeline-parallel systems such as PipeDream and Piper by up to 1.6X. GraphPipe also reduces the search time by 9-21X compared to PipeDream and Piper.
Submitted 24 June, 2024;
originally announced June 2024.
-
On a Generalization of Heyting Algebras I
Authors:
Amirhossein Akbar Tabatabai,
Majid Alizadeh,
Masoud Memarzadeh
Abstract:
$\nabla$-algebra is a natural generalization of Heyting algebra, unifying many algebraic structures including bounded lattices, Heyting algebras, temporal Heyting algebras and the algebraic presentation of the dynamic topological systems. In a series of two papers, we will systematically study the algebro-topological properties of different varieties of $\nabla$-algebras. In the present paper, we start with investigating the structure of these varieties by characterizing their subdirectly irreducible and simple elements. Then, we prove the closure of these varieties under the Dedekind-MacNeille completion and provide the canonical construction and the Kripke representation for $\nabla$-algebras by which we establish the amalgamation property for some varieties of $\nabla$-algebras. In the sequel of the present paper, we will complete the study by covering the logics of these varieties and their corresponding Priestley-Esakia and spectral duality theories.
Submitted 18 May, 2024;
originally announced May 2024.
-
Robust Decentralized Learning with Local Updates and Gradient Tracking
Authors:
Sajjad Ghiasvand,
Amirhossein Reisizadeh,
Mahnoosh Alizadeh,
Ramtin Pedarsani
Abstract:
As distributed learning applications such as Federated Learning, the Internet of Things (IoT), and Edge Computing grow, it is critical to address the shortcomings of such technologies from a theoretical perspective. As an abstraction, we consider decentralized learning over a network of communicating clients or nodes and tackle two major challenges: data heterogeneity and adversarial robustness. We propose a decentralized minimax optimization method that employs two important modules: local updates and gradient tracking. Minimax optimization is the key tool to enable adversarial training for ensuring robustness. Having local updates is essential in Federated Learning (FL) applications to mitigate the communication bottleneck, and utilizing gradient tracking is essential to proving convergence in the case of data heterogeneity. We analyze the performance of the proposed algorithm, Dec-FedTrack, in the case of nonconvex-strongly concave minimax optimization, and prove that it converges to a stationary point. We also conduct numerical experiments to support our theoretical findings.
Submitted 1 May, 2024;
originally announced May 2024.
-
Optimistic Safety for Online Convex Optimization with Unknown Linear Constraints
Authors:
Spencer Hutchinson,
Tianyi Chen,
Mahnoosh Alizadeh
Abstract:
We study the problem of online convex optimization (OCO) under unknown linear constraints that are either static, or stochastically time-varying. For this problem, we introduce an algorithm that we term Optimistically Safe OCO (OSOCO) and show that it enjoys $\tilde{\mathcal{O}}(\sqrt{T})$ regret and no constraint violation. In the case of static linear constraints, this improves on the previous best known $\tilde{\mathcal{O}}(T^{2/3})$ regret with only slightly stronger assumptions. In the case of stochastic time-varying constraints, our work supplements existing results that show $\mathcal{O}(\sqrt{T})$ regret and $\mathcal{O}(\sqrt{T})$ cumulative violation under more general convex constraints albeit a less general feedback model. In addition to our theoretical guarantees, we also give numerical results comparing the performance of OSOCO to existing algorithms.
Submitted 27 May, 2024; v1 submitted 8 March, 2024;
originally announced March 2024.
-
Generalized convex functions and their applications in optimality conditions
Authors:
Mohammad Hossein Alizadeh,
Alireza Youhannaee Zanjani
Abstract:
We introduce and study the notion of (e,y)-conjugate for a proper and e-convex function in locally convex spaces, which is an extension of the concept of the conjugate. The mutual relationships between the concepts of (e,y)-conjugacy and e-subdifferential are presented. Moreover, some applications of these notions in optimization are established.
Submitted 29 February, 2024; v1 submitted 23 February, 2024;
originally announced February 2024.
-
Comparing Methods for Creating a National Random Sample of Twitter Users
Authors:
Meysam Alizadeh,
Darya Zare,
Zeynab Samei,
Mohammadamin Alizadeh,
Mael Kubli,
Mohammadhadi Aliahmadi,
Sarvenaz Ebrahimi,
Fabrizio Gilardi
Abstract:
Twitter data has been widely used by researchers across various social and computer science disciplines. A common aim when working with Twitter data is the construction of a random sample of users from a given country. However, while several methods have been proposed in the literature, their comparative performance is mostly unexplored. In this paper, we implement four common methods to collect a random sample of Twitter users in the US: 1% Stream, Bounding Box, Location Query, and Language Query. Then, we compare the methods according to their tweet- and user-level metrics as well as their accuracy in estimating US population with and without using inclusion probabilities of various demographics. Our results show that the 1% Stream method performs differently than others in tweet- and user-level metrics, and best for the construction of a population representative sample. We discuss the conditions under which the 1% Stream method may not be suitable and suggest the Bounding Box method as the second-best method to use.
Submitted 11 March, 2024; v1 submitted 7 February, 2024;
originally announced February 2024.
-
Practical Rateless Set Reconciliation
Authors:
Lei Yang,
Yossi Gilad,
Mohammad Alizadeh
Abstract:
Set reconciliation, where two parties hold fixed-length bit strings and run a protocol to learn the strings they are missing from each other, is a fundamental task in many distributed systems. We present Rateless Invertible Bloom Lookup Tables (Rateless IBLT), the first set reconciliation protocol, to the best of our knowledge, that achieves low computation cost and near-optimal communication cost across a wide range of scenarios: set differences of one to millions, bit strings of a few bytes to megabytes, and workloads injected by potential adversaries. Rateless IBLT is based on a novel encoder that incrementally encodes the set difference into an infinite stream of coded symbols, resembling rateless error-correcting codes. We compare Rateless IBLT with state-of-the-art set reconciliation schemes and demonstrate significant improvements. Rateless IBLT achieves 3--4x lower communication cost than non-rateless schemes with similar computation cost, and 2--2000x lower computation cost than schemes with similar communication cost. We show the real-world benefits of Rateless IBLT by applying it to synchronize the state of the Ethereum blockchain, and demonstrate 5.6x lower end-to-end completion time and 4.4x lower communication cost compared to the system used in production.
Submitted 19 June, 2024; v1 submitted 4 February, 2024;
originally announced February 2024.
-
CompactifAI: Extreme Compression of Large Language Models using Quantum-Inspired Tensor Networks
Authors:
Andrei Tomut,
Saeed S. Jahromi,
Abhijoy Sarkar,
Uygar Kurt,
Sukhbinder Singh,
Faysal Ishtiaq,
Cesar Muñoz,
Prabdeep Singh Bajaj,
Ali Elborady,
Gianni del Bimbo,
Mehrazin Alizadeh,
David Montero,
Pablo Martin-Ramiro,
Muhammad Ibrahim,
Oussama Tahiri Alaoui,
John Malcolm,
Samuel Mugel,
Roman Orus
Abstract:
Large Language Models (LLMs) such as ChatGPT and LlaMA are advancing rapidly in generative Artificial Intelligence (AI), but their immense size poses significant challenges, such as huge training and inference costs, substantial energy demands, and limitations for on-site deployment. Traditional compression methods such as pruning, distillation, and low-rank approximation focus on reducing the effective number of neurons in the network, while quantization focuses on reducing the numerical precision of individual weights to reduce the model size while keeping the number of neurons fixed. While these compression methods have been relatively successful in practice, there is no compelling reason to believe that truncating the number of neurons is an optimal strategy. In this context, this paper introduces CompactifAI, an innovative LLM compression approach using quantum-inspired Tensor Networks that focuses on the model's correlation space instead, allowing for a more controlled, refined and interpretable model compression. Our method is versatile and can be implemented with, or on top of, other compression techniques. As a benchmark, we demonstrate that a combination of CompactifAI with quantization reduces the memory size of LlaMA 7B by 93% and the number of parameters by 70%, while accelerating the training time by 50% and the inference time by 25%, with only a small accuracy drop of 2% - 3%, going well beyond what is achievable today by other compression techniques. Our methods also allow for refined layer sensitivity profiling, showing that deeper layers tend to be more suitable for tensor network compression, which is compatible with recent observations on the ineffectiveness of those layers for LLM performance. Our results imply that standard LLMs are, in fact, heavily overparametrized, and do not need to be large at all.
Submitted 13 May, 2024; v1 submitted 25 January, 2024;
originally announced January 2024.
-
Stochastic orderings between two finite mixture models with inverted-Kumaraswamy distributed components
Authors:
Raju Bhakta,
Pradip Kundu,
Suchandan Kayal,
Morad Alizadeh
Abstract:
In this paper, we consider two finite mixture models (FMMs), with inverted-Kumaraswamy distributed components' lifetimes. Several stochastic ordering results between the FMMs have been obtained. Mainly, we focus on three different cases in terms of the heterogeneity of parameters. The usual stochastic order between the FMMs has been established when heterogeneity is present in one parameter as well as in two parameters. In addition, we have also studied ageing faster order in terms of the reversed hazard rate between two FMMs when heterogeneity is in two parameters. For the case of heterogeneity in three parameters, we obtain the comparison results based on reversed hazard rate and likelihood ratio orders. The theoretical developments have been illustrated using several examples and counterexamples.
Submitted 7 March, 2024; v1 submitted 29 November, 2023;
originally announced November 2023.
-
Bringing Reconfigurability to the Network Stack
Authors:
Akshay Narayan,
Aurojit Panda,
Mohammad Alizadeh,
Hari Balakrishnan,
Arvind Krishnamurthy,
Scott Shenker
Abstract:
Reconfiguring the network stack allows applications to specialize the implementations of communication libraries depending on where they run, the requests they serve, and the performance they need to provide. Specializing applications in this way is challenging because developers need to choose the libraries they use when writing a program and cannot easily change them at runtime. This paper introduces Bertha, which allows these choices to be changed at runtime without limiting developer flexibility in the choice of network and communication functions. Bertha allows applications to safely use optimized communication primitives (including ones with deployment limitations) without limiting deployability. Our evaluation shows cases where this results in 16x higher throughput and 63% lower latency than current portable approaches while imposing minimal overheads when compared to hand-optimized versions that use deployment-specific communication primitives.
Submitted 13 November, 2023;
originally announced November 2023.
-
A Safe First-Order Method for Pricing-Based Resource Allocation in Safety-Critical Networks
Authors:
Berkay Turan,
Spencer Hutchinson,
Mahnoosh Alizadeh
Abstract:
We introduce a novel algorithm for solving network utility maximization (NUM) problems that arise in resource allocation schemes over networks with known safety-critical constraints, where the constraints form an arbitrary convex and compact feasible set. Inspired by applications where customers' demand can only be affected through posted prices and real-time two-way communication with customers is not available, we require an algorithm to generate ``safe prices''. This means that at no iteration should the realized demand in response to the posted prices violate the safety constraints of the network. Thus, in contrast to existing distributed first-order methods, our algorithm, called safe pricing for NUM (SPNUM), is guaranteed to produce feasible primal iterates at all iterations. At the heart of the algorithm lie two key steps that must go hand in hand to guarantee safety and convergence: 1) applying a projected gradient method on a shrunk feasible set to get the desired demand, and 2) estimating the price response function of the users and determining the price so that the induced demand is close to the desired demand. We ensure safety by adjusting the shrinkage to account for the error between the induced demand and the desired demand. In addition, by gradually reducing the amount of shrinkage and the step size of the gradient method, we prove that the primal iterates produced by the SPNUM achieve a sublinear static regret of ${\cal O}(\log{(T)})$ after $T$ time steps.
Submitted 17 May, 2024; v1 submitted 5 October, 2023;
originally announced October 2023.
-
Interpreting the Value of Flexibility in AC Security-Constrained Transmission Expansion Planning via a Cooperative Game Framework
Authors:
Andrey Churkin,
Wangwei Kong,
Mohammad Iman Alizadeh,
Florin Capitanescu,
Pierluigi Mancarella,
Eduardo A. Martínez Ceseña
Abstract:
Security-constrained transmission expansion planning (SCTEP) is an inherently complex problem that requires simultaneously solving multiple contingency states of the system (usually corresponding to N-1 security criterion). Existing studies focus on effectively finding optimal solutions; however, single optimal solutions are not sufficient to interpret the value of flexibility (e.g., from energy storage systems) and support system planners in well-informed decision making. In view of planning uncertainties, it is necessary to estimate the contributions of flexibility to various objectives and prioritise the most effective investments. In this regard, this work introduces a SCTEP tool that enables interpreting the value of flexibility in terms of contributions to avoided load curtailment and total expected system cost reduction. Inspired by cooperative game theory, the tool ranks the contributions of flexibility providers and compares them against traditional line reinforcements. This information can be used by system planners to prioritise investments with higher contributions and synergistic capabilities.
Submitted 14 March, 2024; v1 submitted 5 October, 2023;
originally announced October 2023.
-
Vidaptive: Efficient and Responsive Rate Control for Real-Time Video on Variable Networks
Authors:
Pantea Karimi,
Sadjad Fouladi,
Vibhaalakshmi Sivaraman,
Mohammad Alizadeh
Abstract:
Real-time video streaming relies on rate control mechanisms to adapt video bitrate to network capacity while maintaining high utilization and low delay. However, the current video rate controllers, such as Google Congestion Control (GCC), are very slow to respond to network changes, leading to link under-utilization and latency spikes. While recent delay-based congestion control algorithms promise high efficiency and rapid adaptation to variable conditions, low-latency video applications have been unable to adopt these schemes due to the intertwined relationship between video encoders and rate control in current systems.
This paper introduces Vidaptive, a new rate control mechanism designed for low-latency video applications. Vidaptive decouples packet transmission decisions from encoder output, injecting ``dummy'' padding traffic as needed to treat video streams akin to backlogged flows controlled by a delay-based congestion controller. Vidaptive then adapts the target bitrate of the encoder based on delay measurements to align the video bitrate with the congestion controller's sending rate. Our evaluations atop Google's implementation of WebRTC show that, across a set of cellular traces, Vidaptive achieves ~1.5x higher video bitrate and 1.4 dB higher SSIM, 1.3 dB higher PSNR, and 40% higher VMAF, and it reduces 95th-percentile frame latency by 2.2 s with a slight 17 ms increase in median frame latency.
Submitted 25 February, 2024; v1 submitted 28 September, 2023;
originally announced September 2023.
-
Directional Optimism for Safe Linear Bandits
Authors:
Spencer Hutchinson,
Berkay Turan,
Mahnoosh Alizadeh
Abstract:
The safe linear bandit problem is a version of the classical stochastic linear bandit problem where the learner's actions must satisfy an uncertain constraint at all rounds. Due to its applicability to many real-world settings, this problem has received considerable attention in recent years. By leveraging a novel approach that we call directional optimism, we find that it is possible to achieve improved regret guarantees for both well-separated problem instances and action sets that are finite star convex sets. Furthermore, we propose a novel algorithm for this setting that improves on existing algorithms in terms of empirical performance, while enjoying matching regret guarantees. Lastly, we introduce a generalization of the safe linear bandit setting where the constraints are convex and adapt our algorithms and analyses to this setting by leveraging a novel convex-analysis based approach.
Submitted 11 March, 2024; v1 submitted 28 August, 2023;
originally announced August 2023.
-
Reinforcement Strategies in General Lotto Games
Authors:
Keith Paarporn,
Rahul Chandan,
Mahnoosh Alizadeh,
Jason R. Marden
Abstract:
Strategic decisions are often made over multiple periods of time, wherein decisions made earlier impact a competitor's success in later stages. In this paper, we study these dynamics in General Lotto games, a class of models describing the competitive allocation of resources between two opposing players. We propose a two-stage formulation where one of the players has reserved resources that can be strategically pre-allocated across the battlefields in the first stage of the game as reinforcements. The players then simultaneously allocate their remaining real-time resources, which can be randomized, in a decisive final stage. Our main contributions provide complete characterizations of the optimal reinforcement strategies and resulting equilibrium payoffs in these multi-stage General Lotto games. Interestingly, we determine that real-time resources are at least twice as effective as reinforcement resources when considering equilibrium payoffs.
Submitted 28 August, 2023;
originally announced August 2023.
-
Safe Pricing Mechanisms for Distributed Resource Allocation with Bandit Feedback
Authors:
Spencer Hutchinson,
Berkay Turan,
Mahnoosh Alizadeh
Abstract:
In societal-scale infrastructures, such as electric grids or transportation networks, pricing mechanisms are often used as a way to shape users' demand in order to lower operating costs and improve reliability. Existing approaches to pricing design for safety-critical networks often require that users are queried beforehand to negotiate prices, which has proven to be challenging to implement in the real world. To offer a more practical alternative, we develop learning-based pricing mechanisms that require no input from the users. These pricing mechanisms aim to maximize the utility of the users' consumption by gradually estimating the users' price response over a span of $T$ time steps (e.g., days) while ensuring that the infrastructure network's safety constraints that limit the users' demand are satisfied at all time steps. We propose two different algorithms for two different scenarios: 1) the utility function is chosen by the central coordinator to achieve a social objective, and 2) the utility function is defined by the price response under the assumption that the users are self-interested agents. We prove that both algorithms enjoy $\tilde{O} (T^{2/3})$ regret with high probability. We then apply these algorithms to demand response pricing for the smart grid and numerically demonstrate their effectiveness.
Submitted 28 July, 2023;
originally announced July 2023.
-
Open-Source LLMs for Text Annotation: A Practical Guide for Model Setting and Fine-Tuning
Authors:
Meysam Alizadeh,
Maël Kubli,
Zeynab Samei,
Shirin Dehghani,
Mohammadmasiha Zahedivafa,
Juan Diego Bermeo,
Maria Korobeynikova,
Fabrizio Gilardi
Abstract:
This paper studies the performance of open-source Large Language Models (LLMs) in text classification tasks typical for political science research. By examining tasks like stance, topic, and relevance classification, we aim to guide scholars in making informed decisions about their use of LLMs for text analysis. Specifically, we conduct an assessment of both zero-shot and fine-tuned LLMs across a range of text annotation tasks using news articles and tweets datasets. Our analysis shows that fine-tuning improves the performance of open-source LLMs, allowing them to match or even surpass zero-shot GPT-3.5 and GPT-4, though still lagging behind fine-tuned GPT-3.5. We further establish that fine-tuning is preferable to few-shot training with a relatively modest quantity of annotated text. Our findings show that fine-tuned open-source LLMs can be effectively deployed in a broad spectrum of text annotation applications. We provide a Python notebook facilitating the application of LLMs in text annotation for other researchers.
Submitted 29 May, 2024; v1 submitted 5 July, 2023;
originally announced July 2023.
-
Power Control with QoS Guarantees: A Differentiable Projection-based Unsupervised Learning Framework
Authors:
Mehrazin Alizadeh,
Hina Tabassum
Abstract:
Deep neural networks (DNNs) are emerging as a potential solution to solve NP-hard wireless resource allocation problems. However, in the presence of intricate constraints, e.g., users' quality-of-service (QoS) constraints, guaranteeing constraint satisfaction becomes a fundamental challenge. In this paper, we propose a novel unsupervised learning framework to solve the classical power control prob…
▽ More
Deep neural networks (DNNs) are emerging as a potential solution to solve NP-hard wireless resource allocation problems. However, in the presence of intricate constraints, e.g., users' quality-of-service (QoS) constraints, guaranteeing constraint satisfaction becomes a fundamental challenge. In this paper, we propose a novel unsupervised learning framework to solve the classical power control problem in a multi-user interference channel, where the objective is to maximize the network sumrate under users' minimum data rate or QoS requirements and power budget constraints. Utilizing a differentiable projection function, two novel deep learning (DL) solutions are pursued. The first is called Deep Implicit Projection Network (DIPNet), and the second is called Deep Explicit Projection Network (DEPNet). DIPNet utilizes a differentiable convex optimization layer to implicitly define a projection function. On the other hand, DEPNet uses an explicitly-defined projection function, which has an iterative nature and relies on a differentiable correction process. DIPNet requires convex constraints; whereas, the DEPNet does not require convexity and has a reduced computational complexity. To enhance the sum-rate performance of the proposed models even further, Frank-Wolfe algorithm (FW) has been applied to the output of the proposed models. Extensive simulations depict that the proposed DNN solutions not only improve the achievable data rate but also achieve zero constraint violation probability, compared to the existing DNNs. The proposed solutions outperform the classic optimization methods in terms of computation time complexity.
Submitted 31 May, 2023;
originally announced June 2023.
-
Reparo: Loss-Resilient Generative Codec for Video Conferencing
Authors:
Tianhong Li,
Vibhaalakshmi Sivaraman,
Pantea Karimi,
Lijie Fan,
Mohammad Alizadeh,
Dina Katabi
Abstract:
Packet loss during video conferencing often leads to poor quality and video freezing. Attempting to retransmit lost packets is often impractical due to the need for real-time playback. Employing Forward Error Correction (FEC) for recovering the lost packets is challenging as it is difficult to determine the appropriate redundancy level. To address these issues, we introduce Reparo -- a loss-resilient video conferencing framework based on generative deep learning models. Our approach involves generating missing information when a frame or part of a frame is lost. This generation is conditioned on the data received thus far, taking into account the model's understanding of how people and objects appear and interact within the visual realm. Experimental results, using publicly available video conferencing datasets, demonstrate that Reparo outperforms state-of-the-art FEC-based video conferencing solutions in terms of both video quality (measured through PSNR, SSIM, and LPIPS) and the occurrence of video freezes.
Submitted 20 February, 2024; v1 submitted 23 May, 2023;
originally announced May 2023.
-
The Impact of the Geometric Properties of the Constraint Set in Safe Optimization with Bandit Feedback
Authors:
Spencer Hutchinson,
Berkay Turan,
Mahnoosh Alizadeh
Abstract:
We consider a safe optimization problem with bandit feedback in which an agent sequentially chooses actions and observes responses from the environment, with the goal of maximizing an arbitrary function of the response while respecting stage-wise constraints. We propose an algorithm for this problem, and study how the geometric properties of the constraint set impact the regret of the algorithm. In order to do so, we introduce the notion of the sharpness of a particular constraint set, which characterizes the difficulty of performing learning within the constraint set in an uncertain setting. This concept of sharpness allows us to identify the class of constraint sets for which the proposed algorithm is guaranteed to enjoy sublinear regret. Simulation results for this algorithm support the sublinear regret bound and provide empirical evidence that the sharpness of the constraint set impacts the performance of the algorithm.
Submitted 1 May, 2023;
originally announced May 2023.
-
ChatGPT Outperforms Crowd-Workers for Text-Annotation Tasks
Authors:
Fabrizio Gilardi,
Meysam Alizadeh,
Maël Kubli
Abstract:
Many NLP applications require manual data annotations for a variety of tasks, notably to train classifiers or evaluate the performance of unsupervised models. Depending on the size and degree of complexity, the tasks may be conducted by crowd-workers on platforms such as MTurk as well as trained annotators, such as research assistants. Using a sample of 2,382 tweets, we demonstrate that ChatGPT outperforms crowd-workers for several annotation tasks, including relevance, stance, topics, and frames detection. Specifically, the zero-shot accuracy of ChatGPT exceeds that of crowd-workers for four out of five tasks, while ChatGPT's intercoder agreement exceeds that of both crowd-workers and trained annotators for all tasks. Moreover, the per-annotation cost of ChatGPT is less than $0.003 -- about twenty times cheaper than MTurk. These results show the potential of large language models to drastically increase the efficiency of text classification.
Submitted 19 July, 2023; v1 submitted 27 March, 2023;
originally announced March 2023.
-
Counterfactual Identifiability of Bijective Causal Models
Authors:
Arash Nasr-Esfahany,
Mohammad Alizadeh,
Devavrat Shah
Abstract:
We study counterfactual identifiability in causal models with bijective generation mechanisms (BGM), a class that generalizes several widely-used causal models in the literature. We establish their counterfactual identifiability for three common causal structures with unobserved confounding, and propose a practical learning method that casts learning a BGM as structured generative modeling. Learned BGMs enable efficient counterfactual estimation and can be obtained using a variety of deep conditional generative models. We evaluate our techniques in a visual task and demonstrate their application in a real-world video streaming simulation task.
Submitted 6 June, 2023; v1 submitted 4 February, 2023;
originally announced February 2023.
-
Online Reinforcement Learning in Non-Stationary Context-Driven Environments
Authors:
Pouya Hamadanian,
Arash Nasr-Esfahany,
Malte Schwarzkopf,
Siddartha Sen,
Mohammad Alizadeh
Abstract:
We study online reinforcement learning (RL) in non-stationary environments, where a time-varying exogenous context process affects the environment dynamics. Online RL is challenging in such environments due to "catastrophic forgetting" (CF). The agent tends to forget prior knowledge as it trains on new experiences. Prior approaches to mitigate this issue assume task labels (which are often not available in practice) or use off-policy methods that suffer from instability and poor performance.
We present Locally Constrained Policy Optimization (LCPO), an online RL approach that combats CF by anchoring policy outputs on old experiences while optimizing the return on current experiences. To perform this anchoring, LCPO locally constrains policy optimization using samples from experiences that lie outside of the current context distribution. We evaluate LCPO in Mujoco, classic control and computer systems environments with a variety of synthetic and real context traces, and find that it outperforms state-of-the-art on-policy and off-policy RL methods in the non-stationary setting, while achieving results on-par with an "oracle" agent trained offline across all context traces.
Submitted 10 February, 2024; v1 submitted 4 February, 2023;
originally announced February 2023.
-
Predicting Parameters for Modeling Traffic Participants
Authors:
Ahmadreza Moradipari,
Sangjae Bae,
Mahnoosh Alizadeh,
Ehsan Moradi Pari,
David Isele
Abstract:
Accurately modeling the behavior of traffic participants is essential for safely and efficiently navigating an autonomous vehicle through heavy traffic. We propose a method, based on the intelligent driver model, that allows us to accurately model individual driver behaviors from only a small number of frames using easily observable features. On average, this method makes prediction errors that have less than 1 meter difference from an oracle with full-information when analyzed over a 10-second horizon of highway driving. We then validate the efficiency of our method through extensive analysis against a competitive data-driven method such as Reinforcement Learning that may be of independent interest.
Submitted 25 January, 2023;
originally announced January 2023.
-
Impacts of Distribution Network Reconfiguration on Aggregated DER Flexibility
Authors:
Andrey Churkin,
Miguel Sanchez-Lopez,
Mohammad Iman Alizadeh,
Florin Capitanescu,
Eduardo A. Martínez Ceseña,
Pierluigi Mancarella
Abstract:
The ongoing integration of controllable distributed energy resources (DER) makes distribution networks capable of aggregating flexible power and providing flexibility services at both transmission and distribution levels. The aggregated flexibility of an active distribution network (ADN) can be represented as its feasible operating area in the P-Q space. The limits of this area are pivotal for arranging flexibility markets and coordinating transmission and distribution system operators (TSOs and DSOs). However, motivated by the current technical limitations of distribution networks (e.g., protection schemes), existing literature on ADN flexibility and TSO-DSO coordination mostly focuses on radial networks, overlooking the potential benefits of network reconfiguration. This paper, using a realistic meshed distribution system from the UK and the exact ACOPF model for flexibility estimation, demonstrates that network reconfiguration can increase the limits of ADN aggregated flexibility and improve the economic efficiency of flexibility markets.
Submitted 12 January, 2023;
originally announced January 2023.
-
FactorJoin: A New Cardinality Estimation Framework for Join Queries
Authors:
Ziniu Wu,
Parimarjan Negi,
Mohammad Alizadeh,
Tim Kraska,
Samuel Madden
Abstract:
Cardinality estimation is one of the most fundamental and challenging problems in query optimization. Neither classical nor learning-based methods yield satisfactory performance when estimating the cardinality of the join queries. They either rely on simplified assumptions leading to ineffective cardinality estimates or build large models to understand the data distributions, leading to long planning times and a lack of generalizability across queries.
In this paper, we propose a new framework FactorJoin for estimating join queries. FactorJoin combines the idea behind the classical join-histogram method to efficiently handle joins with the learning-based methods to accurately capture attribute correlation. Specifically, FactorJoin scans every table in a DB and builds single-table conditional distributions during an offline preparation phase. When a join query comes, FactorJoin translates it into a factor graph model over the learned distributions to effectively and efficiently estimate its cardinality.
Unlike existing learning-based methods, FactorJoin does not need to de-normalize joins upfront or require executed query workloads to train the model. Since it only relies on single-table statistics, FactorJoin has small space overhead and is extremely easy to train and maintain. In our evaluation, FactorJoin can produce more effective estimates than the previous state-of-the-art learning-based methods, with 40x less estimation latency, 100x smaller model size, and 100x faster training speed at comparable or better accuracy. In addition, FactorJoin can estimate 10,000 sub-plan queries within one second to optimize the query plan, which is very close to the traditional cardinality estimators in commercial DBMS.
Submitted 11 December, 2022;
originally announced December 2022.
-
Gemino: Practical and Robust Neural Compression for Video Conferencing
Authors:
Vibhaalakshmi Sivaraman,
Pantea Karimi,
Vedantha Venkatapathy,
Mehrdad Khani,
Sadjad Fouladi,
Mohammad Alizadeh,
Frédo Durand,
Vivienne Sze
Abstract:
Video conferencing systems suffer from poor user experience when network conditions deteriorate because current video codecs simply cannot operate at extremely low bitrates. Recently, several neural alternatives have been proposed that reconstruct talking head videos at very low bitrates using sparse representations of each frame such as facial landmark information. However, these approaches produce poor reconstructions in scenarios with major movement or occlusions over the course of a call, and do not scale to higher resolutions. We design Gemino, a new neural compression system for video conferencing based on a novel high-frequency-conditional super-resolution pipeline. Gemino upsamples a very low-resolution version of each target frame while enhancing high-frequency details (e.g., skin texture, hair, etc.) based on information extracted from a single high-resolution reference image. We use a multi-scale architecture that runs different components of the model at different resolutions, allowing it to scale to resolutions comparable to 720p, and we personalize the model to learn specific details of each person, achieving much better fidelity at low bitrates. We implement Gemino atop aiortc, an open-source Python implementation of WebRTC, and show that it operates on 1024x1024 videos in real-time on a Titan X GPU, and achieves 2.2-5x lower bitrate than traditional video codecs for the same perceptual quality.
Submitted 19 October, 2023; v1 submitted 21 September, 2022;
originally announced September 2022.
-
Strategic investments in multi-stage General Lotto games
Authors:
Rahul Chandan,
Keith Paarporn,
Mahnoosh Alizadeh,
Jason R. Marden
Abstract:
In adversarial interactions, one is often required to make strategic decisions over multiple periods of time, wherein decisions made earlier impact a player's competitive standing as well as how choices are made in later stages. In this paper, we study such scenarios in the context of General Lotto games, which model the competitive allocation of resources over multiple battlefields between two players. We propose a two-stage formulation where one of the players has reserved resources that can be strategically pre-allocated across the battlefields in the first stage. The pre-allocation then becomes binding and is revealed to the other player. In the second stage, the players engage by simultaneously allocating their real-time resources against each other. The main contribution of this paper is a complete characterization of equilibrium payoffs in the two-stage game, revealing the interplay between performance and the amount of resources expended in each stage of the game. We find that real-time resources are at least twice as effective as pre-allocated resources. We then determine the player's optimal investment when there are linear costs associated with purchasing each type of resource before play begins, and there is a limited monetary budget.
Submitted 13 September, 2022;
originally announced September 2022.
-
Safe Dual Gradient Method for Network Utility Maximization Problems
Authors:
Berkay Turan,
Mahnoosh Alizadeh
Abstract:
In this paper, we introduce a novel first-order dual gradient algorithm for solving network utility maximization problems that arise in resource allocation schemes over networks with safety-critical constraints. Inspired by applications where customers' demand can only be affected through posted prices and real-time two-way communication with customers is not available, we require an algorithm to generate "safe prices". This means that at no iteration should the realized demand in response to the posted prices violate the safety constraints of the network. Thus, in contrast to existing first-order methods, our algorithm, called the safe dual gradient method (SDGM), is guaranteed to produce feasible primal iterates at all iterations. We ensure primal feasibility by 1) adding a diminishing safety margin to the constraints, and 2) using a sign-based dual update method with different step sizes for plus and minus directions. In addition, we prove that the primal iterates produced by the SDGM achieve a sublinear static regret of ${\cal O}(\sqrt{T})$.
Submitted 8 August, 2022;
originally announced August 2022.
-
A Deployable Online Optimization Framework for EV Smart Charging with Real-World Test Cases
Authors:
Nathaniel Tucker,
Mahnoosh Alizadeh
Abstract:
We present a customizable online optimization framework for real-time EV smart charging to be readily implemented at real large-scale charging facilities. Notably, due to real-world constraints, we designed our framework around three main requirements. First, the smart charging strategy is readily deployable and customizable for a wide array of facilities, infrastructure, objectives, and constraints. Second, the online optimization framework can be easily modified to operate with or without user input for energy request amounts and/or departure time estimates, which allows our framework to be implemented on standard chargers with 1-way communication or newer chargers with 2-way communication. Third, our online optimization framework outperforms other real-time strategies (including first-come-first-serve, least-laxity-first, earliest-deadline-first, etc.) in multiple real-world test cases with various objectives. We showcase our framework with two real-world test cases with charging session data sourced from SLAC and Google campuses in the Bay Area.
Submitted 22 July, 2022;
originally announced July 2022.
-
Soil Erosion in the United States. Present and Future (2020-2050)
Authors:
Shahab Aldin Shojaeezadeh,
Malik Al-Wardy,
Mohammad Reza Nikoo,
Mehrdad Ghorbani Mooselu,
Mohammad Reza Alizadeh,
Jan Franklin Adamowski,
Hamid Moradkhani,
Nasrin Alamdari,
Amir H. Gandomi
Abstract:
Soil erosion is a significant threat to the environment and long-term land management around the world. Accelerated soil erosion by human activities inflicts extreme changes in terrestrial and aquatic ecosystems, which is not fully surveyed/predicted for the present and probable future at field-scales (30-m). Here, we estimate/predict soil erosion rates by water erosion (sheet and rill erosion), using three alternative (2.6, 4.5, and 8.5) Shared Socioeconomic Pathway and Representative Concentration Pathway (SSP-RCP) scenarios across the contiguous United States. Field Scale Soil Erosion Model (FSSLM) estimations rely on a high resolution (30-m) G2 erosion model integrated by satellite- and imagery-based estimations of land use and land cover (LULC), gauge observations of long-term precipitation, and scenarios of the Coupled Model Intercomparison Project Phase 6 (CMIP6). The baseline model (2020) estimates soil erosion rates of 2.32 Mg ha$^{-1}$ yr$^{-1}$ with current agricultural conservation practices (CPs). Future scenarios with current CPs indicate an increase of between 8% and 21% under different combinations of SSP-RCP scenarios of climate and LULC changes. The soil erosion forecast for 2050 suggests that all the climate and LULC scenarios indicate either an increase in extreme events or a change in the spatial location of extremes largely from the southern to the eastern and northeastern regions of the United States.
Submitted 13 July, 2022;
originally announced July 2022.
-
Collaborative Multi-agent Stochastic Linear Bandits
Authors:
Ahmadreza Moradipari,
Mohammad Ghavamzadeh,
Mahnoosh Alizadeh
Abstract:
We study a collaborative multi-agent stochastic linear bandit setting, where $N$ agents that form a network communicate locally to minimize their overall regret. In this setting, each agent has its own linear bandit problem (its own reward parameter) and the goal is to select the best global action w.r.t. the average of their reward parameters. At each round, each agent proposes an action, and one action is randomly selected and played as the network action. All the agents observe the corresponding rewards of the played actions and use an accelerated consensus procedure to compute an estimate of the average of the rewards obtained by all the agents. We propose a distributed upper confidence bound (UCB) algorithm and prove a high probability bound on its $T$-round regret in which we include a linear growth of regret associated with each communication round. Our regret bound is of order $\mathcal{O}\Big(\sqrt{\frac{T}{N \log(1/|λ_2|)}}\cdot (\log T)^2\Big)$, where $λ_2$ is the second largest (in absolute value) eigenvalue of the communication matrix.
Submitted 12 May, 2022;
originally announced May 2022.
-
Multi-Environment Meta-Learning in Stochastic Linear Bandits
Authors:
Ahmadreza Moradipari,
Mohammad Ghavamzadeh,
Taha Rajabzadeh,
Christos Thrampoulidis,
Mahnoosh Alizadeh
Abstract:
In this work we investigate meta-learning (or learning-to-learn) approaches in multi-task linear stochastic bandit problems that can originate from multiple environments. Inspired by the work of [1] on meta-learning in a sequence of linear bandit problems whose parameters are sampled from a single distribution (i.e., a single environment), here we consider the feasibility of meta-learning when task parameters are drawn from a mixture distribution instead. For this problem, we propose a regularized version of the OFUL algorithm that, when trained on tasks with labeled environments, achieves low regret on a new task without requiring knowledge of the environment from which the new task originates. Specifically, our regret bound for the new algorithm captures the effect of environment misclassification and highlights the benefits over learning each task separately or meta-learning without recognition of the distinct mixture components.
Submitted 12 May, 2022;
originally announced May 2022.
-
Coded Transaction Broadcasting for High-throughput Blockchains
Authors:
Lei Yang,
Yossi Gilad,
Mohammad Alizadeh
Abstract:
High-throughput blockchains require efficient transaction broadcast mechanisms that can deliver transactions to most network nodes with low bandwidth overhead and latency. Existing schemes coordinate transmissions across peers to avoid sending redundant data, but they either incur a high latency or are not robust against adversarial network nodes. We present Strokkur, a new transaction broadcasting mechanism that provides both low bandwidth overhead and low latency. The core idea behind Strokkur is to avoid explicit coordination through randomized transaction coding. Rather than forward individual transactions, Strokkur nodes send out codewords -- XOR sums of multiple transactions selected at random. Since almost every codeword is useful for the receiver to decode new transactions, Strokkur nodes do not require coordination, for example, to determine which transactions the receiver is missing. Strokkur's coding strategy builds on LT codes, a popular class of rateless erasure codes, and extends them to support multiple uncoordinated senders with partially-overlapping continual streams of transaction data. Strokkur introduces mechanisms to cope with adversarial senders that may send corrupt codewords, and a simple rate control algorithm that enables each node to independently determine an appropriate sending rate of codewords for each peer. Our implementation of Strokkur in Golang supports 647k transactions per second using only one CPU core. Our evaluation across a 19-node Internet deployment and large-scale simulation shows that Strokkur consumes 2-7.6x less bandwidth than the existing scheme in Bitcoin, and has 9x lower latency than Shrec when only 4% of nodes are adversarial.
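The codeword idea in this abstract (an XOR sum of randomly selected transactions) can be illustrated with a minimal Python sketch. It is not Strokkur's actual protocol: the function names `xor_bytes`, `make_codeword`, and `peel` are hypothetical, transactions are assumed to have equal length, and the LT-code degree distribution, adversarial-sender handling, and rate control mentioned above are all omitted.

```python
import random


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def make_codeword(transactions, degree, rng):
    """Pick `degree` transactions at random and XOR them together.

    Returns the chosen indices and the XOR payload (assumed format).
    """
    indices = frozenset(rng.sample(range(len(transactions)), degree))
    payload = bytes(len(transactions[0]))  # all-zero starting value
    for i in indices:
        payload = xor_bytes(payload, transactions[i])
    return indices, payload


def peel(codeword, known):
    """Recover one missing transaction if exactly one index is unknown.

    `known` maps transaction index -> bytes already held by the receiver.
    Returns (index, transaction) or None if the codeword is not yet decodable.
    """
    indices, payload = codeword
    known_ids = set(known)
    missing = indices - known_ids
    if len(missing) != 1:
        return None
    # XOR out every transaction the receiver already has.
    for i in indices & known_ids:
        payload = xor_bytes(payload, known[i])
    (idx,) = missing
    return idx, payload


# Toy data: three fixed-length "transactions".
txs = [b"tx-A", b"tx-B", b"tx-C"]
codeword = make_codeword(txs, degree=3, rng=random.Random(0))
known = {0: txs[0], 2: txs[2]}
recovered = peel(codeword, known)
```

In this toy run the receiver holds transactions 0 and 2, so the single codeword over all three lets it recover transaction 1 without the sender knowing which one was missing.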
Submitted 3 May, 2022;
originally announced May 2022.
-
Scalable Tail Latency Estimation for Data Center Networks
Authors:
Kevin Zhao,
Prateesh Goyal,
Mohammad Alizadeh,
Thomas E. Anderson
Abstract:
In this paper, we consider how to provide fast estimates of flow-level tail latency performance for very large scale data center networks. Network tail latency is often a crucial metric for cloud application performance that can be affected by a wide variety of factors, including network load, inter-rack traffic skew, traffic burstiness, flow size distributions, oversubscription, and topology asymmetry. Network simulators such as ns-3 and OMNeT++ can provide accurate answers, but are very hard to parallelize, taking hours or days to answer what if questions for a single configuration at even moderate scale. Recent work with MimicNet has shown how to use machine learning to improve simulation performance, but at a cost of including a long training step per configuration, and with assumptions about workload and topology uniformity that typically do not hold in practice.
We address this gap by developing a set of techniques to provide fast performance estimates for large scale networks with general traffic matrices and topologies. A key step is to decompose the problem into a large number of parallel independent single-link simulations; we carefully combine these link-level simulations to produce accurate estimates of end-to-end flow level performance distributions for the entire network. Like MimicNet, we exploit symmetry where possible to gain additional speedups, but without relying on machine learning, so there is no training delay. On large-scale networks where ns-3 takes 11 to 27 hours to simulate five seconds of network behavior, our techniques run in one to two minutes with 99th percentile accuracy within 9% for flow completion times.
Submitted 30 September, 2022; v1 submitted 2 May, 2022;
originally announced May 2022.
-
Real-Time Electric Vehicle Smart Charging at Workplaces: A Real-World Case Study
Authors:
Nathaniel Tucker,
Gustavo Cezar,
Mahnoosh Alizadeh
Abstract:
We study a real-time smart charging algorithm for electric vehicles (EVs) at a workplace parking lot in order to minimize electricity cost from time-of-use electricity rates and demand charges while ensuring that the owners of the EVs receive adequate levels of charge. Notably, due to real-world constraints, our algorithm is agnostic to both the state-of-charge and the departure time of the EVs and uses scenario generation to account for each EV's unknown future departure time as well as certainty equivalent control to account for the unknown EV arrivals in the future. Real-world charging data from a Google campus in California allows us to build realistic models of charging demand for each day of the week. We then compare various results from our smart charging algorithm to the status quo for a two week period at a Google parking location.
Submitted 14 March, 2022;
originally announced March 2022.
-
Prospect Pruning: Finding Trainable Weights at Initialization using Meta-Gradients
Authors:
Milad Alizadeh,
Shyam A. Tailor,
Luisa M Zintgraf,
Joost van Amersfoort,
Sebastian Farquhar,
Nicholas Donald Lane,
Yarin Gal
Abstract:
Pruning neural networks at initialization would enable us to find sparse models that retain the accuracy of the original network while consuming fewer computational resources for training and inference. However, current methods are insufficient to enable this optimization and lead to a large degradation in model performance. In this paper, we identify a fundamental limitation in the formulation of current methods, namely that their saliency criteria look at a single step at the start of training without taking into account the trainability of the network. While pruning iteratively and gradually has been shown to improve pruning performance, explicit consideration of the training stage that will immediately follow pruning has so far been absent from the computation of the saliency criterion. To overcome the short-sightedness of existing methods, we propose Prospect Pruning (ProsPr), which uses meta-gradients through the first few steps of optimization to determine which weights to prune. ProsPr combines an estimate of the higher-order effects of pruning on the loss and the optimization trajectory to identify the trainable sub-network. Our method achieves state-of-the-art pruning performance on a variety of vision classification tasks, with less data and in a single shot compared to existing pruning-at-initialization methods.
Submitted 5 April, 2022; v1 submitted 16 February, 2022;
originally announced February 2022.
-
Optimal Congestion Control for Time-varying Wireless Links
Authors:
Prateesh Goyal,
Mohammad Alizadeh,
Thomas E. Anderson
Abstract:
Modern networks exhibit a high degree of variability in link rates. Cellular network bandwidth inherently varies with receiver motion and orientation, while class-based packet scheduling in datacenter and service provider networks induces high variability in available capacity for network tenants. Recent work has proposed numerous congestion control protocols to cope with this variability, offering different tradeoffs between link utilization and queuing delay. In this paper, we develop a formal model of congestion control over time-varying links, and we use this model to derive a bound on the performance of any congestion control protocol running over a time-varying link with a given distribution of rate variation. Using the insights from this analysis, we derive an optimal control law that offers a smooth tradeoff between link utilization and queuing delay. We compare the performance of this control law to several existing control algorithms on cellular link traces to show that there is significant room for optimization.
Submitted 9 February, 2022;
originally announced February 2022.
-
COIN++: Neural Compression Across Modalities
Authors:
Emilien Dupont,
Hrushikesh Loya,
Milad Alizadeh,
Adam Goliński,
Yee Whye Teh,
Arnaud Doucet
Abstract:
Neural compression algorithms are typically based on autoencoders that require specialized encoder and decoder architectures for different data modalities. In this paper, we propose COIN++, a neural compression framework that seamlessly handles a wide range of data modalities. Our approach is based on converting data to implicit neural representations, i.e. neural functions that map coordinates (such as pixel locations) to features (such as RGB values). Then, instead of storing the weights of the implicit neural representation directly, we store modulations applied to a meta-learned base network as a compressed code for the data. We further quantize and entropy code these modulations, leading to large compression gains while reducing encoding time by two orders of magnitude compared to baselines. We empirically demonstrate the feasibility of our method by compressing various data modalities, from images and audio to medical and climate data.
Submitted 8 December, 2022; v1 submitted 30 January, 2022;
originally announced January 2022.
-
Demystifying Reinforcement Learning in Time-Varying Systems
Authors:
Pouya Hamadanian,
Malte Schwarzkopf,
Siddartha Sen,
Mohammad Alizadeh
Abstract:
Recent research has turned to Reinforcement Learning (RL) to solve challenging decision problems, as an alternative to hand-tuned heuristics. RL can learn good policies without the need for modeling the environment's dynamics. Despite this promise, RL remains an impractical solution for many real-world systems problems. A particularly challenging case occurs when the environment changes over time, i.e. it exhibits non-stationarity. In this work, we characterize the challenges introduced by non-stationarity, shed light on the range of approaches to them and develop a robust framework for addressing them to train RL agents in live systems. Such agents must explore and learn new environments, without hurting the system's performance, and remember them over time. To this end, our framework (i) identifies different environments encountered by the live system, (ii) triggers exploration when necessary, (iii) takes precautions to retain knowledge from prior environments, and (iv) employs safeguards to protect the system's performance when the RL agent makes mistakes. We apply our framework to two systems problems, straggler mitigation and adaptive video streaming, and evaluate it against a variety of alternative approaches using real-world and synthetic data. We show that all components of the framework are necessary to cope with non-stationarity and provide guidance on alternative design choices for each component.
Submitted 26 January, 2023; v1 submitted 14 January, 2022;
originally announced January 2022.
-
CausalSim: A Causal Framework for Unbiased Trace-Driven Simulation
Authors:
Abdullah Alomar,
Pouya Hamadanian,
Arash Nasr-Esfahany,
Anish Agarwal,
Mohammad Alizadeh,
Devavrat Shah
Abstract:
We present CausalSim, a causal framework for unbiased trace-driven simulation. Current trace-driven simulators assume that the interventions being simulated (e.g., a new algorithm) would not affect the validity of the traces. However, real-world traces are often biased by the choices algorithms make during trace collection, and hence replaying traces under an intervention may lead to incorrect results. CausalSim addresses this challenge by learning a causal model of the system dynamics and latent factors capturing the underlying system conditions during trace collection. It learns these models using an initial randomized control trial (RCT) under a fixed set of algorithms, and then applies them to remove biases from trace data when simulating new algorithms.
Key to CausalSim is mapping unbiased trace-driven simulation to a tensor completion problem with extremely sparse observations. By exploiting a basic distributional invariance property present in RCT data, CausalSim enables a novel tensor completion method despite the sparsity of observations. Our extensive evaluation of CausalSim on both real and synthetic datasets, including more than ten months of real data from the Puffer video streaming system, shows it improves simulation accuracy, reducing errors by 53% and 61% on average compared to expert-designed and supervised learning baselines. Moreover, CausalSim provides markedly different insights about ABR algorithms compared to the biased baseline simulator, which we validate with a real deployment.
Submitted 5 May, 2023; v1 submitted 5 January, 2022;
originally announced January 2022.
-
Efficient Strong Scaling Through Burst Parallel Training
Authors:
Seo Jin Park,
Joshua Fried,
Sunghyun Kim,
Mohammad Alizadeh,
Adam Belay
Abstract:
As emerging deep neural network (DNN) models continue to grow in size, using large GPU clusters to train DNNs is becoming an essential requirement to achieving acceptable training times. In this paper, we consider the case where future increases in cluster size will cause the global batch size that can be used to train models to reach a fundamental limit: beyond a certain point, larger global batch sizes cause sample efficiency to degrade, increasing overall time to accuracy. As a result, to achieve further improvements in training performance, we must instead consider "strong scaling" strategies that hold the global batch size constant and allocate smaller batches to each GPU. Unfortunately, this makes it significantly more difficult to use cluster resources efficiently. We present DeepPool, a system that addresses this efficiency challenge through two key ideas. First, burst parallelism allocates large numbers of GPUs to foreground jobs in bursts to exploit the unevenness in parallelism across layers. Second, GPU multiplexing prioritizes throughput for foreground training jobs, while packing in background training jobs to reclaim underutilized GPU resources, thereby improving cluster-wide utilization. Together, these two ideas enable DeepPool to deliver a 1.2 - 2.3x improvement in total cluster throughput over standard data parallelism with a single task when the cluster scale is large.
Submitted 23 May, 2022; v1 submitted 19 December, 2021;
originally announced December 2021.
-
Longest Chain Consensus Under Bandwidth Constraint
Authors:
Joachim Neu,
Srivatsan Sridhar,
Lei Yang,
David Tse,
Mohammad Alizadeh
Abstract:
Spamming attacks are a serious concern for consensus protocols, as witnessed by recent outages of a major blockchain, Solana. They cause congestion and excessive message delays in a real network due to its bandwidth constraints. In contrast, longest chain (LC), an important family of consensus protocols, has previously only been proven secure assuming an idealized network model in which all messages are delivered within bounded delay. This model-reality mismatch is further aggravated for Proof-of-Stake (PoS) LC where the adversary can spam the network with equivocating blocks. Hence, we extend the network model to capture bandwidth constraints, under which nodes now need to choose carefully which blocks to spend their limited download budget on. To illustrate this point, we show that 'download along the longest header chain', a natural download rule for Proof-of-Work (PoW) LC, is insecure for PoS LC. We propose a simple rule 'download towards the freshest block', formalize two common heuristics 'not downloading equivocations' and 'blocklisting', and prove in a unified framework that PoS LC with any one of these download rules is secure in bandwidth-constrained networks. In experiments, we validate our claims and showcase the behavior of these download rules under attack. By composing multiple instances of a PoS LC protocol with a suitable download rule in parallel, we obtain a PoS consensus protocol that achieves a constant fraction of the network's throughput limit even under worst-case adversarial strategies.
Submitted 17 May, 2022; v1 submitted 24 November, 2021;
originally announced November 2021.
-
Strategically revealing intentions in General Lotto games
Authors:
Keith Paarporn,
Rahul Chandan,
Dan Kovenock,
Mahnoosh Alizadeh,
Jason R. Marden
Abstract:
Strategic decision-making in uncertain and adversarial environments is crucial for the security of modern systems and infrastructures. A salient feature of many optimal decision-making policies is a level of unpredictability, or randomness, which helps to keep an adversary uncertain about the system's behavior. This paper seeks to explore decision-making policies on the other end of the spectrum -- namely, whether there are benefits in revealing one's strategic intentions to an opponent before engaging in competition. We study these scenarios in a well-studied model of competitive resource allocation known as General Lotto games. In the classic formulation, two competing players simultaneously allocate their assets to a set of battlefields, and the resulting payoffs are derived in a zero-sum fashion. Here, we consider a multi-step extension where one of the players has the option to publicly pre-commit assets in a binding fashion to battlefields before play begins. In response, the opponent decides which of these battlefields to secure (or abandon) by matching the pre-commitment with its own assets. They then engage in a General Lotto game over the remaining set of battlefields. Interestingly, this paper highlights many scenarios where strategically revealing intentions can significantly improve one's payoff. This runs contrary to the conventional wisdom that randomness should be a central component of decision-making in adversarial environments.
Submitted 3 December, 2021; v1 submitted 22 October, 2021;
originally announced October 2021.
-
Updating Street Maps using Changes Detected in Satellite Imagery
Authors:
Favyen Bastani,
Songtao He,
Satvat Jagwani,
Mohammad Alizadeh,
Hari Balakrishnan,
Sanjay Chawla,
Sam Madden,
Mohammad Amin Sadeghi
Abstract:
Accurately maintaining digital street maps is labor-intensive. To address this challenge, much work has studied automatically processing geospatial data sources such as GPS trajectories and satellite images to reduce the cost of maintaining digital maps. An end-to-end map update system would first process geospatial data sources to extract insights, and second leverage those insights to update and improve the map. However, prior work largely focuses on the first step of this pipeline: these map extraction methods infer road networks from scratch given geospatial data sources (in effect creating entirely new maps), but do not address the second step of leveraging this extracted information to update the existing digital map data. In this paper, we first explain why current map extraction techniques yield low accuracy when extended to update existing maps. We then propose a novel method that leverages the progression of satellite imagery over time to substantially improve accuracy. Our approach first compares satellite images captured at different times to identify portions of the physical road network that have visibly changed, and then updates the existing map accordingly. We show that our change-based approach reduces map update error rates four-fold.
Submitted 12 October, 2021;
originally announced October 2021.
-
DispersedLedger: High-Throughput Byzantine Consensus on Variable Bandwidth Networks
Authors:
Lei Yang,
Seo Jin Park,
Mohammad Alizadeh,
Sreeram Kannan,
David Tse
Abstract:
The success of blockchains has sparked interest in large-scale deployments of Byzantine fault tolerant (BFT) consensus protocols over wide area networks. A central feature of such networks is variable communication bandwidth across nodes and across time. We present DispersedLedger, an asynchronous BFT protocol that provides near-optimal throughput in the presence of such variable network bandwidth. The core idea of DispersedLedger is to enable nodes to propose, order, and agree on blocks of transactions without having to download their full content. By enabling nodes to agree on an ordered log of blocks, with a guarantee that each block is available within the network and unmalleable, DispersedLedger decouples bandwidth-intensive block downloads at different nodes, allowing each to make progress at its own pace. We build a full system prototype and evaluate it on real-world and emulated networks. Our results on a geo-distributed wide-area deployment across the Internet show that DispersedLedger achieves 2x better throughput and 74% reduction in latency compared to HoneyBadger, the state-of-the-art asynchronous protocol.
Submitted 12 October, 2021; v1 submitted 8 October, 2021;
originally announced October 2021.
-
An Online Scheduling Algorithm for a Community Energy Storage System
Authors:
Nathaniel Tucker,
Mahnoosh Alizadeh
Abstract:
In this paper, we consider a community energy storage (CES) system that is shared by various electricity consumers who want to charge and discharge the CES throughout a given time span. We study the problem facing the manager of such a CES who must schedule the charging, discharging, and capacity reservations for numerous users. Moreover, we consider the case where requests to charge/discharge the CES arrive in an online fashion and the CES manager must immediately allocate charging power and energy capacity to fulfill the request or reject the request altogether. The objective of the CES manager is to maximize the total value gained by all of the users of the CES while accounting for the operational constraints of the CES. We discuss an algorithm titled CommunityEnergyScheduling that acts as a pricing mechanism based on online primal-dual optimization as a solution to the CES manager's problem. The online algorithm estimates the dual variables (prices) in real-time to allow for requests to be allocated or rejected immediately as they arrive. Furthermore, the proposed method promotes charging and discharging cancellations to reduce the CES's usage at popular times and is able to handle the inherent stochastic nature of the requests to charge/discharge stemming from randomness in users' net load patterns and weather uncertainties. Additionally, we show that the algorithm handles any adversarially chosen request sequence and always yields total welfare within a factor of 1/a of the offline optimal welfare.
Submitted 13 March, 2022; v1 submitted 5 October, 2021;
originally announced October 2021.
-
Robust Distributed Optimization With Randomly Corrupted Gradients
Authors:
Berkay Turan,
Cesar A. Uribe,
Hoi-To Wai,
Mahnoosh Alizadeh
Abstract:
In this paper, we propose a first-order distributed optimization algorithm that is provably robust to Byzantine failures (arbitrary and potentially adversarial behavior), where all the participating agents are prone to failure. We model each agent's state over time as a two-state Markov chain that indicates Byzantine or trustworthy behaviors at different time instants. We set no restrictions on the maximum number of Byzantine agents at any given time. We design our method based on three layers of defense: 1) temporal robust aggregation, 2) spatial robust aggregation, and 3) gradient normalization. We study two settings for stochastic optimization, namely Sample Average Approximation and Stochastic Approximation. We provide convergence guarantees of our method for strongly convex and smooth non-convex cost functions.
Submitted 17 June, 2022; v1 submitted 28 June, 2021;
originally announced June 2021.
-
A General Lotto game with asymmetric budget uncertainty
Authors:
Keith Paarporn,
Rahul Chandan,
Mahnoosh Alizadeh,
Jason R. Marden
Abstract:
The General Lotto game is a popular variant of the famous Colonel Blotto game, in which two opposing players allocate limited resources over many battlefields. In this paper, we consider incomplete and asymmetric information formulations regarding the resource budgets of the players. In particular, one player's resource budget is common knowledge while the other player's is private. We provide complete equilibrium characterizations in the scenario where the private resource budget is drawn from an arbitrary Bernoulli distribution. We then show that these characterizations can be used to analyze a multi-stage resource assignment problem where a commander must decide how to assign resources to sub-colonels that compete against opponents in separate General Lotto games. While optimal deterministic assignments have been characterized in the literature, we broaden the context by deriving optimal (Bernoulli) randomized assignments, which induce asymmetric information General Lotto games to be played. We demonstrate that randomizing can offer a four-fold improvement in the commander's performance over deterministic assignments.
Submitted 14 October, 2022; v1 submitted 22 June, 2021;
originally announced June 2021.
-
Feature and Parameter Selection in Stochastic Linear Bandits
Authors:
Ahmadreza Moradipari,
Berkay Turan,
Yasin Abbasi-Yadkori,
Mahnoosh Alizadeh,
Mohammad Ghavamzadeh
Abstract:
We study two model selection settings in stochastic linear bandits (LB). In the first setting, which we refer to as feature selection, the expected reward of the LB problem is in the linear span of at least one of $M$ feature maps (models). In the second setting, the reward parameter of the LB problem is arbitrarily selected from $M$ models represented as (possibly) overlapping balls in $\mathbb R^d$. However, the agent only has access to misspecified models, i.e., estimates of the centers and radii of the balls. We refer to this setting as parameter selection. For each setting, we develop and analyze a computationally efficient algorithm that is based on a reduction from bandits to full-information problems. This allows us to obtain regret bounds that are not worse (up to a $\sqrt{\log M}$ factor) than the case where the true model is known. This is the best-reported dependence on the number of models $M$ in these settings. Finally, we empirically show the effectiveness of our algorithms using synthetic and real-world experiments.
Submitted 17 June, 2022; v1 submitted 9 June, 2021;
originally announced June 2021.