-
Mitigating Information Asymmetry in Two-Stage Contracts with Non-Myopic Agents
Authors:
Munther A. Dahleh,
Thibaut Horel,
M. Umar B. Niazi
Abstract:
We consider a Stackelberg game in which a principal (she) establishes a two-stage contract with a non-myopic agent (he) whose type is unknown. The contract takes the form of an incentive function mapping the agent's first-stage action to his second-stage incentive. While the first-stage action reveals the agent's type under truthful play, a non-myopic agent could benefit from portraying a false type in the first stage to obtain a larger incentive in the second stage. The challenge is thus for the principal to design the incentive function so as to induce truthful play. We show that this is only possible with a constant, non-reactive incentive function when the type space is continuous, whereas it can be achieved with reactive functions for discrete types. Additionally, we show that introducing an adjustment mechanism that penalizes inconsistent behavior across both stages allows the principal to design more flexible incentive functions.
Submitted 18 June, 2024;
originally announced June 2024.
-
WeatherFormer: A Pretrained Encoder Model for Learning Robust Weather Representations from Small Datasets
Authors:
Adib Hasan,
Mardavij Roozbehani,
Munther Dahleh
Abstract:
This paper introduces WeatherFormer, a transformer encoder-based model designed to learn robust weather features from minimal observations. It addresses the challenge of modeling complex weather dynamics from small datasets, a bottleneck for many prediction tasks in agriculture, epidemiology, and climate science. WeatherFormer was pretrained on a large pretraining dataset comprised of 39 years of satellite measurements across the Americas. With a novel pretraining task and fine-tuning, WeatherFormer achieves state-of-the-art performance in county-level soybean yield prediction and influenza forecasting. Technical innovations include a unique spatiotemporal encoding that captures geographical, annual, and seasonal variations, adapting the transformer architecture to continuous weather data, and a pretraining strategy to learn representations that are robust to missing weather features. This paper for the first time demonstrates the effectiveness of pretraining large transformer encoder models for weather-dependent applications across multiple domains.
Submitted 22 May, 2024;
originally announced May 2024.
-
Sample Efficient Reinforcement Learning with Partial Dynamics Knowledge
Authors:
Meshal Alharbi,
Mardavij Roozbehani,
Munther Dahleh
Abstract:
The problem of sample complexity of online reinforcement learning is often studied in the literature without taking into account any partial knowledge about the system dynamics that could potentially accelerate the learning process. In this paper, we study the sample complexity of online Q-learning methods when some prior knowledge about the dynamics is available or can be learned efficiently. We focus on systems that evolve according to an additive disturbance model of the form $S_{h+1} = f(S_h, A_h) + W_h$, where $f$ represents the underlying system dynamics, and $W_h$ are unknown disturbances independent of states and actions. In the setting of finite episodic Markov decision processes with $S$ states, $A$ actions, and episode length $H$, we present an optimistic Q-learning algorithm that achieves $\tilde{\mathcal{O}}(\text{Poly}(H)\sqrt{T})$ regret under perfect knowledge of $f$, where $T$ is the total number of interactions with the system. This is in contrast to the typical $\tilde{\mathcal{O}}(\text{Poly}(H)\sqrt{SAT})$ regret for existing Q-learning methods. Further, if only a noisy estimate $\hat{f}$ of $f$ is available, our method can learn an approximately optimal policy in a number of samples that is independent of the cardinalities of state and action spaces. The sub-optimality gap depends on the approximation error $\hat{f}-f$, as well as the Lipschitz constant of the corresponding optimal value function. Our approach does not require modeling of the transition probabilities and enjoys the same memory complexity as model-free methods.
Submitted 3 June, 2024; v1 submitted 19 December, 2023;
originally announced December 2023.
-
Incentive Design for Eco-driving in Urban Transportation Networks
Authors:
M. Umar B. Niazi,
Jung-Hoon Cho,
Munther A. Dahleh,
Roy Dong,
Cathy Wu
Abstract:
Eco-driving emerges as a cost-effective and efficient strategy to mitigate greenhouse gas emissions in urban transportation networks. Acknowledging the persuasive influence of incentives in shaping driver behavior, this paper presents the 'eco-planner,' a digital platform devised to promote eco-driving practices in urban transportation. At the outset of their trips, users provide the platform with their trip details and travel time preferences, enabling the eco-planner to formulate personalized eco-driving recommendations and corresponding incentives, while adhering to its budgetary constraints. Upon trip completion, incentives are transferred to users who comply with the recommendations and effectively reduce their emissions. By comparing our proposed incentive mechanism with a baseline scheme that offers uniform incentives to all users, we demonstrate that our approach achieves superior emission reductions and increased user compliance with a smaller budget.
Submitted 16 May, 2024; v1 submitted 6 November, 2023;
originally announced November 2023.
-
Estimation of Models with Limited Data by Leveraging Shared Structure
Authors:
Maryann Rui,
Thibaut Horel,
Munther Dahleh
Abstract:
Modern data sets, such as those in healthcare and e-commerce, are often derived from many individuals or systems but have insufficient data from each source alone to separately estimate individual, often high-dimensional, model parameters. If there is shared structure among systems, however, it may be possible to leverage data from other systems to help estimate individual parameters, which could otherwise be non-identifiable. In this paper, we assume systems share a latent low-dimensional parameter space and propose a method for recovering $d$-dimensional parameters for $N$ different linear systems, even when there are only $T<d$ observations per system. To do so, we develop a three-step algorithm which estimates the low-dimensional subspace spanned by the systems' parameters and produces refined parameter estimates within the subspace. We provide finite sample subspace estimation error guarantees for our proposed method. Finally, we experimentally validate our method on simulations with i.i.d. regression data as well as correlated time series data.
Submitted 4 October, 2023;
originally announced October 2023.
-
SAMoSSA: Multivariate Singular Spectrum Analysis with Stochastic Autoregressive Noise
Authors:
Abdullah Alomar,
Munther Dahleh,
Sean Mann,
Devavrat Shah
Abstract:
The well-established practice of time series analysis involves estimating deterministic, non-stationary trend and seasonality components followed by learning the residual stochastic, stationary components. Recently, it has been shown that one can learn the deterministic non-stationary components accurately using multivariate Singular Spectrum Analysis (mSSA) in the absence of a correlated stationary component; meanwhile, in the absence of deterministic non-stationary components, the Autoregressive (AR) stationary component can also be learnt readily, e.g. via Ordinary Least Squares (OLS). However, a theoretical underpinning of multi-stage learning algorithms involving both deterministic and stationary components has been absent in the literature despite its pervasiveness. We resolve this open question by establishing desirable theoretical guarantees for a natural two-stage algorithm, where mSSA is first applied to estimate the non-stationary components despite the presence of a correlated stationary AR component, which is subsequently learned from the residual time series. We provide a finite-sample forecasting consistency bound for the proposed algorithm, SAMoSSA, which is data-driven and thus requires minimal parameter tuning. To establish theoretical guarantees, we overcome three hurdles: (i) we characterize the spectra of Page matrices of stable AR processes, thus extending the analysis of mSSA; (ii) we extend the analysis of AR process identification in the presence of arbitrary bounded perturbations; (iii) we characterize the out-of-sample or forecasting error, as opposed to solely considering model identification. Through representative empirical studies, we validate the superior performance of SAMoSSA compared to existing baselines. Notably, SAMoSSA's ability to account for AR noise structure yields improvements ranging from 5% to 37% across various benchmark datasets.
Submitted 26 November, 2023; v1 submitted 25 May, 2023;
originally announced May 2023.
-
Coordination via Selling Information
Authors:
Alessandro Bonatti,
Munther Dahleh,
Thibaut Horel,
Amir Nouripour
Abstract:
We consider games of incomplete information in which the players' payoffs depend both on a privately observed type and an unknown but common "state of nature". External to the game, a data provider knows the state of nature and sells information to the players, thus solving a joint information and mechanism design problem: deciding which information to sell while eliciting the players' types and collecting payments. We restrict ourselves to a general class of symmetric games with quadratic payoffs that includes games of both strategic substitutes (e.g. Cournot competition) and strategic complements (e.g. Bertrand competition, Keynesian beauty contest). By the Revelation Principle, the seller's problem reduces to designing a mechanism that truthfully elicits the players' types and sends action recommendations that constitute a Bayes Correlated Equilibrium of the game. We fully characterize the class of all such Gaussian mechanisms (where the joint distribution of actions and private signals is a multivariate normal distribution) as well as the welfare- and revenue-optimal mechanisms within this class. For games of strategic complements, the optimal mechanisms maximally correlate the players' actions, and conversely maximally anticorrelate them for games of strategic substitutes. In both cases, for sufficiently large uncertainty over the players' types, the recommendations are deterministic (and linear) conditional on the state and the type reports, but they are not fully revealing.
Submitted 23 February, 2023;
originally announced February 2023.
-
Data-driven control of COVID-19 in buildings: a reinforcement-learning approach
Authors:
Ashkan Haji Hosseinloo,
Saleh Nabi,
Anette Hosoi,
Munther A. Dahleh
Abstract:
In addition to its public health crisis, the COVID-19 pandemic has led to the shutdown and closure of workplaces with an estimated total cost of more than $16 trillion. Given the long hours an average person spends in buildings and indoor environments, this research article proposes data-driven control strategies to design optimal indoor airflow to minimize the exposure of occupants to viral pathogens in built environments. A general control framework is put forward for designing an optimal velocity field, and proximal policy optimization, a reinforcement learning algorithm, is employed to solve the control problem in a data-driven fashion. The same framework is used for optimal placement of disinfectants to neutralize the viral pathogens as an alternative to the airflow design when the latter is practically infeasible or hard to implement. We show, via simulation experiments, that the control agent learns the optimal policy in both scenarios within a reasonable time. The proposed data-driven control framework in this study will have significant societal and economic benefits by setting the foundation for an improved methodology in designing case-specific infection control guidelines that can be realized by affordable ventilation devices and disinfectants.
Submitted 27 December, 2022;
originally announced December 2022.
-
Incentive Compatibility in Two-Stage Repeated Stochastic Games
Authors:
Bharadwaj Satchidanandan,
Munther A. Dahleh
Abstract:
We address the problem of mechanism design for two-stage repeated stochastic games -- a novel setting that readily models many emerging problems in next-generation electricity markets. Repeated playing affords the players a large class of strategies that adapt a player's actions to all past observations and inferences obtained therefrom. In other settings such as iterative auctions or dynamic games where a large strategy space of this sort manifests, it typically has an important implication for mechanism design: It may be impossible to obtain truth-telling as a dominant strategy equilibrium. Consequently, in such scenarios, it is common to settle for mechanisms that render truth-telling only a Nash equilibrium, or variants thereof, even though Nash equilibria are known to be poor models of real-world behavior. This is owing to each player having to make overly specific assumptions about the behaviors of the other players to employ their Nash equilibrium strategy, which they may not make. In general, the lighter the burden of speculation in an equilibrium, the more plausible it is that it models real-world behavior. Guided by this maxim, we introduce a new notion of equilibrium called Dominant Strategy Non-Bankrupting Equilibrium (DNBE), which requires the players to make very few assumptions about the behavior of the other players to employ their equilibrium strategy. Consequently, a mechanism that renders truth-telling a DNBE as opposed to only a Nash equilibrium could be quite effective in molding real-world behavior along truthful lines. We present a mechanism for two-stage repeated stochastic games that renders truth-telling a Dominant Strategy Non-Bankrupting Equilibrium. The mechanism also guarantees individual rationality and maximizes social welfare. Finally, we describe an application of the mechanism to design demand response markets.
Submitted 18 October, 2022; v1 submitted 18 March, 2022;
originally announced March 2022.
-
Selling Information in Competitive Environments
Authors:
Alessandro Bonatti,
Munther Dahleh,
Thibaut Horel,
Amir Nouripour
Abstract:
Data buyers compete in a game of incomplete information about which a single data seller owns some payoff-relevant information. The seller faces a joint information- and mechanism-design problem: deciding which information to sell, while eliciting the buyers' types and imposing payments. We derive the welfare- and revenue-optimal mechanisms for a class of games with binary actions and states. Our results highlight the critical properties of selling information in competitive environments: (i) the negative externalities arising from buyer competition increase the profitability of recommending the correct action to one buyer exclusively; (ii) for the buyers to follow the seller's recommendations, the degree of exclusivity must be limited; (iii) the buyers' obedience constraints also limit the distortions in the allocation of information introduced by a monopolist seller; (iv) as competition becomes fiercer, these limitations become more severe, weakening the impact of market power on the allocation of information.
Submitted 5 December, 2022; v1 submitted 17 February, 2022;
originally announced February 2022.
-
Causal Matrix Completion
Authors:
Anish Agarwal,
Munther Dahleh,
Devavrat Shah,
Dennis Shen
Abstract:
Matrix completion is the study of recovering an underlying matrix from a sparse subset of noisy observations. Traditionally, it is assumed that the entries of the matrix are "missing completely at random" (MCAR), i.e., each entry is revealed at random, independent of everything else, with uniform probability. This is likely unrealistic due to the presence of "latent confounders", i.e., unobserved factors that determine both the entries of the underlying matrix and the missingness pattern in the observed matrix. For example, in the context of movie recommender systems -- a canonical application for matrix completion -- a user who vehemently dislikes horror films is unlikely to ever watch horror films. In general, these confounders yield "missing not at random" (MNAR) data, which can severely impact any inference procedure that does not correct for this bias. We develop a formal causal model for matrix completion through the language of potential outcomes, and provide novel identification arguments for a variety of causal estimands of interest. We design a procedure, which we call "synthetic nearest neighbors" (SNN), to estimate these causal estimands. We prove finite-sample consistency and asymptotic normality of our estimator. Our analysis also leads to new theoretical results for the matrix completion literature. In particular, we establish entry-wise, i.e., max-norm, finite-sample consistency and asymptotic normality results for matrix completion with MNAR data. As a special case, this also provides entry-wise bounds for matrix completion with MCAR data. Across simulated and real data, we demonstrate the efficacy of our proposed estimator.
Submitted 30 September, 2021;
originally announced September 2021.
-
Eliciting Social Knowledge for Creditworthiness Assessment
Authors:
Mark York,
Munther Dahleh,
David Parkes
Abstract:
Access to capital is a major constraint for economic growth in the developing world. Yet those attempting to lend in this space face high defaults due to their inability to distinguish creditworthy borrowers from the rest. In this paper, we propose two novel scoring mechanisms that incentivize community members to truthfully report their signal on the creditworthiness of others in their community. We first design a truncated asymmetric scoring rule for a setting where the lender has no liquidity constraints. We then derive a novel, strictly proper VCG scoring mechanism for the liquidity-constrained setting. Whereas Chen et al. [2011] give an impossibility result for an analogous setting in which sequential reports are made in the context of decision markets, we achieve a positive result through appeal to interim beliefs about the reports of others in a setting with simultaneous reports. Moreover, the use of VCG methods allows for the integration of linear belief aggregation methods.
Submitted 20 August, 2021;
originally announced August 2021.
-
Learning Good State and Action Representations via Tensor Decomposition
Authors:
Chengzhuo Ni,
Yaqi Duan,
Munther Dahleh,
Anru Zhang,
Mengdi Wang
Abstract:
The transition kernel of a continuous-state-action Markov decision process (MDP) admits a natural tensor structure. This paper proposes a tensor-inspired unsupervised learning method to identify meaningful low-dimensional state and action representations from empirical trajectories. The method exploits the MDP's tensor structure by kernelization, importance sampling and low-Tucker-rank approximation. This method can be further used to cluster states and actions respectively and find the best discrete MDP abstraction. We provide sharp statistical error bounds for tensor concentration and the preservation of diffusion distance after embedding. We further prove that the learned state/action abstractions provide accurate approximations to latent block structures if they exist, enabling function approximation in downstream tasks such as policy evaluation.
Submitted 19 February, 2023; v1 submitted 3 May, 2021;
originally announced May 2021.
-
Nonstochastic Bandits with Infinitely Many Experts
Authors:
X. Flora Meng,
Tuhin Sarkar,
Munther A. Dahleh
Abstract:
We study the problem of nonstochastic bandits with expert advice, extending the setting from finitely many experts to any countably infinite set: A learner aims to maximize the total reward by taking actions sequentially based on bandit feedback while benchmarking against a set of experts. We propose a variant of Exp4.P that, for finitely many experts, enables inference of correct expert rankings while preserving the order of the regret upper bound. We then incorporate the variant into a meta-algorithm that works on infinitely many experts. We prove a high-probability upper bound of $\tilde{\mathcal{O}} \big( i^*K + \sqrt{KT} \big)$ on the regret, up to polylog factors, where $i^*$ is the unknown position of the best expert, $K$ is the number of actions, and $T$ is the time horizon. We also provide an example of structured experts and discuss how to expedite learning in such a case. Our meta-learning algorithm achieves optimal regret up to polylog factors when $i^* = \tilde{\mathcal{O}} \big( \sqrt{T/K} \big)$. If a prior distribution is assumed to exist for $i^*$, the probability of optimality increases with $T$, the rate of which can be fast.
Submitted 25 March, 2021; v1 submitted 9 February, 2021;
originally announced February 2021.
-
Consensus with Preserved Privacy against Neighbor Collusion
Authors:
Silun Zhang,
Thomas Ohlson Timoudas,
Munther Dahleh
Abstract:
This paper proposes a privacy-preserving algorithm to solve the average consensus problem based on Shamir's secret sharing scheme, in which a network of agents reach an agreement on their states without exposing their individual states until an agreement is reached. Unlike other methods, the proposed algorithm renders the network resistant to the collusion of any given number of neighbors (even with all neighbors colluding). Another virtue of this work is that such a method can protect the network consensus procedure from eavesdropping.
Submitted 18 November, 2020;
originally announced November 2020.
-
A Cross-Domain Approach to Analyzing the Short-Run Impact of COVID-19 on the U.S. Electricity Sector
Authors:
Guangchun Ruan,
Dongqi Wu,
Xiangtian Zheng,
Haiwang Zhong,
Chongqing Kang,
Munther A. Dahleh,
S. Sivaranjani,
Le Xie
Abstract:
The novel coronavirus disease (COVID-19) has rapidly spread around the globe in 2020, with the U.S. becoming the epicenter of COVID-19 cases since late March. As the U.S. begins to gradually resume economic activity, it is imperative for policymakers and power system operators to take a scientific approach to understanding and predicting the impact on the electricity sector. Here, we release a first-of-its-kind cross-domain open-access data hub, integrating data from across all existing U.S. wholesale electricity markets with COVID-19 case, weather, cellular location, and satellite imaging data. Leveraging cross-domain insights from public health and mobility data, we uncover a significant reduction in electricity consumption that is strongly correlated with the rise in the number of COVID-19 cases, degree of social distancing, and level of commercial activity.
Submitted 27 August, 2020; v1 submitted 11 May, 2020;
originally announced May 2020.
-
Towards Data Auctions with Externalities
Authors:
Anish Agarwal,
Munther Dahleh,
Thibaut Horel,
Maryann Rui
Abstract:
The design of data markets has gained importance as firms increasingly use machine learning models fueled by externally acquired training data. A key consideration is the externalities firms face when data, though inherently freely replicable, is allocated to competing firms. In this setting, we demonstrate that a data seller's optimal revenue increases as firms can pay to prevent allocations to others. To do so, we first reduce the combinatorial problem of allocating and pricing multiple datasets to the auction of a single digital good by modeling utility for data through the increase in prediction accuracy it provides. We then derive welfare and revenue maximizing mechanisms, highlighting how the form of firms' private information - whether the externalities one exerts on others are known, or vice-versa - affects the resulting structures. In all cases, under appropriate assumptions, the optimal allocation rule is a single threshold per firm, where either all data is allocated or none is.
Submitted 20 September, 2023; v1 submitted 18 March, 2020;
originally announced March 2020.
-
Data-driven control of micro-climate in buildings: an event-triggered reinforcement learning approach
Authors:
Ashkan Haji Hosseinloo,
Alexander Ryzhov,
Aldo Bischi,
Henni Ouerdane,
Konstantin Turitsyn,
Munther A. Dahleh
Abstract:
Smart buildings have great potential for shaping an energy-efficient, sustainable, and more economic future for our planet as buildings account for approximately 40% of the global energy consumption. The future of smart buildings lies in using sensory data for adaptive decision making and control, which is currently hampered by the key challenge of learning a good control policy in a short period of time in an online and continuing fashion. To tackle this challenge, an event-triggered -- as opposed to classic time-triggered -- paradigm is proposed, in which learning and control decisions are made when events occur and enough information is collected. Events are characterized by certain design conditions and they occur when the conditions are met, for instance, when a certain state threshold is reached. By systematically adjusting the time of learning and control decisions, the proposed framework can potentially reduce the variance in learning, and consequently, improve the control process. We formulate the micro-climate control problem based on semi-Markov decision processes that allow for variable-time state transitions and decision making. Using extended policy gradient theorems and temporal difference methods in a reinforcement learning set-up, we propose two learning algorithms for event-triggered control of micro-climate in buildings. We show the efficacy of our proposed approach via designing a smart learning thermostat that simultaneously optimizes energy consumption and occupants' comfort in a test building.
Submitted 2 July, 2020; v1 submitted 28 January, 2020;
originally announced January 2020.
-
Generativity and Interactional Effects: an Overview
Authors:
Elie M. Adam,
Munther A. Dahleh
Abstract:
We propose a means to relate properties of an interconnected system to its separate component systems in the presence of cascade-like phenomena. Building on a theory of interconnection reminiscent of the behavioral approach to systems theory, we introduce the notion of generativity, and its byproduct, generative effects. Cascade effects, encompassing contagion phenomena and cascading failures, are seen as instances of generative effects. The latter are precisely the instances where properties of interest are not preserved or behave very badly when systems interact. The goal is to overcome that obstruction. We will show how to extract mathematical objects from the systems that encode their generativity: their potential to generate new phenomena upon interaction. Those objects may then be used to link the properties of the interconnected system to its separate systems. Such a link will be established through the use of exact sequences from commutative algebra.
Submitted 23 November, 2019;
originally announced November 2019.
-
On the Mathematical Structure of Cascade Effects and Emergent Phenomena
Authors:
Elie M. Adam,
Munther A. Dahleh
Abstract:
We argue that the mathematical structure, enabling certain cascading and emergent phenomena to intuitively emerge, coincides with Galois connections. We introduce the notion of generative effects to formally capture such phenomena. We establish that these effects arise, via a notion of a veil, from either concealing mechanisms in a system or forgetting characteristics from it. The goal of the work is to initiate a mathematical base that enables us to further study such phenomena. In particular, generative effects can be further linked to a certain loss of exactness. Homological algebra, and related algebraic methods, may then be used to characterize the effects.
Submitted 23 November, 2019;
originally announced November 2019.
-
Zorro: A Model Agnostic System to Price Consumer Data
Authors:
Anish Agarwal,
Munther Dahleh,
Devavrat Shah,
Dylan Sleeper,
Andrew Tsai,
Madeline Wong
Abstract:
Personal data is essential in showing users targeted ads - the economic backbone of the web. Still, there are major inefficiencies in how data is transacted online: (1) users neither decide what information is released nor get paid for this privacy loss; (2) algorithmic advertisers are stuck in inefficient long-term contracts where they purchase user data without knowing the value it provides. This paper proposes a system, Zorro, which aims to rectify these two problems.
As the main contribution, we provide a natural, 'absolute' definition of 'Value of Data' (VoD) - for any quantity of interest, it is the delta between an individual's value and population mean. The challenge remains how to operationalize this definition, independently of a buyer's model for VoD. We propose a model-agnostic solution, relying on matrix estimation, and use it to estimate click-through-rate (CTR), as an example.
Regarding (2), Zorro empowers advertisers to measure the value of user data on a query-by-query basis, based only on the increase in accuracy it provides in estimating CTR. In contrast, advertisers currently engage in inefficient long-term data contracts with third-party data sellers. We highlight two results on a large ad-click dataset: (i) our system has R^2=0.58, in line with best-in-class results for related problems (e.g. content recommendation). Crucially, our system is model-agnostic - we estimate CTR without accessing an advertiser's proprietary models, a required property of any such pricing system; (ii) our experiments show that selling user data has incremental value ranging from 30% to 69% depending on ad category. Roughly, this translates to a loss in value of at least USD 16 billion for advertisers if user data is not provided.
Regarding (1), in addition to allowing users to get paid for data sharing, we extend our mathematical framework to when users provide explicit intent.
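As a minimal illustration of the 'absolute' VoD definition quoted above (the delta between an individual's value and the population mean), the following Python sketch computes it for a toy vector of click-through rates. The function name `value_of_data` and the numbers are hypothetical, and the matrix-estimation step the abstract mentions for estimating CTR is not shown.

```python
from statistics import mean


def value_of_data(individual_values):
    """Return each individual's deviation from the population mean.

    Illustrative only: the quantity of interest (here CTR) is assumed
    to be already known or estimated for every individual.
    """
    population_mean = mean(individual_values)
    return [value - population_mean for value in individual_values]


# Hypothetical per-user click-through rates (not data from the paper).
ctr = [0.02, 0.05, 0.11]
vod = value_of_data(ctr)
print(vod)  # deviations from the mean CTR; they sum to (approximately) zero
```

By construction the deviations sum to zero across the population, so a positive VoD marks a user whose quantity of interest lies above the average.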
Submitted 14 June, 2019; v1 submitted 6 June, 2019;
originally announced June 2019.
-
A Marketplace for Data: An Algorithmic Solution
Authors:
Anish Agarwal,
Munther Dahleh,
Tuhin Sarkar
Abstract:
In this work, we aim to design a data marketplace: a robust real-time matching mechanism to efficiently buy and sell training data for Machine Learning tasks. While the monetization of data and pre-trained models is an essential focus of industry today, there does not exist a market mechanism to price training data and match buyers to sellers while still addressing the associated (computational and other) complexity. The challenge in creating such a market stems from the very nature of data as an asset: (i) it is freely replicable; (ii) its value is inherently combinatorial due to correlation with signal in other data; (iii) prediction tasks and the value of accuracy vary widely; (iv) the usefulness of training data is difficult to verify a priori without first applying it to a prediction task. As our main contributions we: (i) propose a mathematical model for a two-sided data market and formally define the key associated challenges; (ii) construct algorithms for such a market to function and analyze how they meet the challenges defined. We highlight two technical contributions: (i) a new notion of 'fairness' required for cooperative games with freely replicable goods; (ii) a truthful, zero-regret mechanism to auction a class of combinatorial goods, based on Myerson's payment function and the Multiplicative Weights algorithm. These might be of independent interest.
Submitted 12 May, 2019; v1 submitted 21 May, 2018;
originally announced May 2018.
-
Coalitional game with opinion exchange
Authors:
Bomin Jiang,
Mardavij Roozbehani,
Munther A. Dahleh
Abstract:
In coalitional games, traditional coalitional game theory does not apply if different participants hold different opinions about the payoff function that corresponds to each subset of the coalition. In this paper, we propose a framework in which players can exchange opinions about their views of payoff functions and then decide the distribution of the value of the grand coalition. When all players are truth-telling, the problem of opinion consensus is decoupled from the coalitional game, but interesting dynamics will arise when players are strategic in the consensus phase. Assuming that all players are rational, the model implies that, if influential players are risk-averse, an efficient fusion of the distributed data is achieved at pure strategy Nash equilibrium, meaning that the average opinion will not drift. Also, without the assumption that all players are rational, each player can use an algorithmic R-learning process, which gives the same result as the pure strategy Nash equilibrium with rational players.
Submitted 10 September, 2017; v1 submitted 5 September, 2017;
originally announced September 2017.
-
How Peer Effects Influence Energy Consumption
Authors:
Datong P. Zhou,
Mardavij Roozbehani,
Munther A. Dahleh,
Claire J. Tomlin
Abstract:
This paper analyzes the impact of peer effects on electricity consumption of a network of rational, utility-maximizing users. Users derive utility from consuming electricity as well as consuming less energy than their neighbors. However, a disutility is incurred for consuming more than their neighbors. To maximize the profit of the load-serving entity that provides electricity to such users, we develop a two-stage game-theoretic model, where the entity sets the prices in the first stage. In the second stage, consumers decide on their demand in response to the observed price set in the first stage so as to maximize their utility. To this end, we derive theoretical statements under which such peer effects reduce aggregate user consumption. Further, we obtain expressions for the resulting electricity consumption and profit of the load serving entity for the case of perfect price discrimination and a single price under complete information, and approximations under incomplete information. Simulations suggest that exposing only a selected subset of all users to peer effects maximizes the entity's profit.
Submitted 17 March, 2017; v1 submitted 2 March, 2017;
originally announced March 2017.
-
Eliciting Private User Information for Residential Demand Response
Authors:
Datong P. Zhou,
Maximilian Balandat,
Munther A. Dahleh,
Claire J. Tomlin
Abstract:
Residential Demand Response has emerged as a viable tool to alleviate supply and demand imbalances of electricity, particularly during times when the electric grid is strained due to a shortage of supply. Demand Response providers bid reduction capacity into the wholesale electricity market by asking their customers under contract to temporarily reduce their consumption in exchange for a monetary incentive. To contribute to the analysis of consumer behavior in response to such incentives, this paper formulates Demand Response as a Mechanism Design problem, where a Demand Response provider elicits private information from its rational, profit-maximizing customers, who derive positive expected utility by participating in reduction events. By designing an incentive-compatible and individually rational mechanism to collect users' price elasticities of demand, the Demand Response provider can target the users most susceptible to incentives. We measure reductions by comparing the materialized consumption to the projected consumption, which we model as the "10-in-10" baseline, the regulatory standard set by the California Independent System Operator. Due to the suboptimal performance of this baseline, we show, using consumption data of residential customers in California, that Demand Response providers receive payments for "virtual reductions", which exist due to the inaccuracies of the baseline rather than actual reductions. Improving the accuracy of the baseline diminishes the contribution of these virtual reductions.
Submitted 3 September, 2017; v1 submitted 2 March, 2017;
originally announced March 2017.
-
Towards an Algebra for Cascade Effects
Authors:
Elie M. Adam,
Munther A. Dahleh,
Asuman Ozdaglar
Abstract:
We introduce a new class of (dynamical) systems that inherently capture cascading effects (viewed as consequential effects) and are naturally amenable to combinations. We develop an axiomatic general theory around those systems, and guide the endeavor towards an understanding of cascading failure. The theory evolves as an interplay of lattices and fixed points, and its results may be instantiated to commonly studied models of cascade effects.
We characterize the systems through their fixed points, and equip them with two operators. We uncover properties of the operators, and express global systems through combinations of local systems. We enhance the theory with a notion of failure, and understand the class of shocks inducing a system to failure. We develop a notion of mu-rank to capture the energy of a system, and understand the minimal amount of effort required to fail a system, termed resilience. We deduce a dual notion of fragility and show that the combination of systems sets a limit on the amount of fragility inherited.
Submitted 5 July, 2017; v1 submitted 21 June, 2015;
originally announced June 2015.
-
Minimal Realization Problems for Hidden Markov Models
Authors:
Qingqing Huang,
Rong Ge,
Sham Kakade,
Munther Dahleh
Abstract:
Consider a stationary discrete random process with alphabet size d, which is assumed to be the output process of an unknown stationary Hidden Markov Model (HMM). Given the joint probabilities of finite length strings of the process, we are interested in finding a finite state generative model to describe the entire process. In particular, we focus on two classes of models: HMMs and quasi-HMMs, the latter being a strictly larger class of models containing HMMs. In the main theorem, we show that if the random process is generated by an HMM of order less than or equal to k, whose transition and observation probability matrices are in general position (that is, almost everywhere on the parameter space), both the minimal quasi-HMM realization and the minimal HMM realization can be efficiently computed based on the joint probabilities of all the length N strings, for $N > 4 \lceil \log_d(k) \rceil + 1$. In this paper, we also aim to compare and connect two lines of literature: the realization theory of HMMs, and the recent development in learning latent variable models with tensor decomposition techniques.
Submitted 14 December, 2015; v1 submitted 13 November, 2014;
originally announced November 2014.
-
On Threshold Models over Finite Networks
Authors:
Elie M. Adam,
Munther A. Dahleh,
Asuman Ozdaglar
Abstract:
We study a model for cascade effects over finite networks based on a deterministic binary linear threshold model. Our starting point is a networked coordination game where each agent's payoff is the sum of the payoffs coming from pairwise interactions with each of the neighbors. We first establish that the best response dynamics in this networked game is equivalent to the linear threshold dynamics with heterogeneous thresholds over the agents. While the previous literature has studied such linear threshold models under the assumption that each agent may change actions at most once, a study of best response dynamics in such networked games necessitates an analysis that allows for multiple switches in actions. In this paper, we develop such an analysis and construct a combinatorial framework to understand the behavior of the model. To this end, we establish that the agents' behavior cycles among different actions in the limit and provide three sets of results.
We first characterize the limiting behavioral properties of the dynamics. We determine the length of the limit cycles and reveal bounds on the time steps required to reach such cycles for different network structures. We then study the complexity of decision/counting problems that arise within the context. Specifically, we consider the tractability of counting the number of limit cycles and fixed-points, and deciding the reachability of action profiles. We finally propose a measure of network resilience that captures the nature of the involved dynamics. We prove bounds and investigate the resilience of different network structures under this measure.
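The kind of dynamics described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the paper's exact model: updates are assumed synchronous, an agent is assumed to play action 1 exactly when the number of its neighbors playing 1 meets its own threshold, and the names `step` and `find_limit_cycle` are hypothetical. Because the state space is finite and the update is deterministic, iterating until a state repeats yields both the time to reach the limit cycle and its length.

```python
def step(state, neighbors, thresholds):
    """One synchronous update of a binary linear threshold model (assumed rule)."""
    return tuple(
        1 if sum(state[j] for j in neighbors[i]) >= thresholds[i] else 0
        for i in range(len(state))
    )


def find_limit_cycle(state, neighbors, thresholds):
    """Iterate until a state repeats; return (transient length, cycle length)."""
    first_seen = {}
    t = 0
    while state not in first_seen:
        first_seen[state] = t
        state = step(state, neighbors, thresholds)
        t += 1
    return first_seen[state], t - first_seen[state]


# Toy network: two agents joined by one edge, each with threshold 1.
neighbors = [[1], [0]]
thresholds = [1, 1]
transient, period = find_limit_cycle((1, 0), neighbors, thresholds)
print(transient, period)  # 0 2: the agents swap actions forever
```

Agents here may switch actions more than once, which is the feature the abstract contrasts with models in which each agent changes action at most once.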
Submitted 2 January, 2013; v1 submitted 3 November, 2012;
originally announced November 2012.
-
Canonical Estimation in a Rare-Events Regime
Authors:
Mesrob I. Ohannessian,
Vincent Y. F. Tan,
Munther A. Dahleh
Abstract:
We propose a general methodology for performing statistical inference within a 'rare-events regime' that was recently suggested by Wagner, Viswanath and Kulkarni. Our approach allows one to easily establish consistent estimators for a very large class of canonical estimation problems, in a large alphabet setting. These include the problems studied in the original paper, such as entropy and probability estimation, in addition to many other interesting ones. We particularly illustrate this approach by consistently estimating the size of the alphabet and the range of the probabilities. We start by proposing an abstract methodology based on constructing a probability measure with the desired asymptotic properties. We then demonstrate two concrete constructions by casting the Good-Turing estimator as a pseudo-empirical measure, and by using the theory of mixture model estimation.
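For readers unfamiliar with the Good-Turing estimator mentioned above, the Python sketch below shows its classical use for estimating the missing mass (the total probability of symbols not yet observed) as the fraction of the sample made up of symbols seen exactly once. This is the textbook estimator only; the pseudo-empirical measure construction of the paper is not reproduced, and the function name and sample are hypothetical.

```python
from collections import Counter


def good_turing_missing_mass(sample):
    """Classical Good-Turing estimate of the unseen probability mass.

    Returns (number of symbols seen exactly once) / (sample size).
    """
    if not sample:
        raise ValueError("sample must be non-empty")
    counts = Counter(sample)
    singletons = sum(1 for c in counts.values() if c == 1)
    return singletons / len(sample)


# Toy sample: 'c' and 'd' each appear once among 11 symbols.
sample = list("abracadabra")
missing_mass = good_turing_missing_mass(sample)
print(missing_mass)  # 2/11, about 0.18
```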
Submitted 5 October, 2011; v1 submitted 21 September, 2011;
originally announced September 2011.
-
Stability Analysis of Transportation Networks with Multiscale Driver Decisions
Authors:
Giacomo Como,
Ketan Savla,
Daron Acemoglu,
Munther A. Dahleh,
Emilio Frazzoli
Abstract:
Stability of Wardrop equilibria is analyzed for dynamical transportation networks in which the drivers' route choices are influenced by information at multiple temporal and spatial scales. The considered model involves a continuum of indistinguishable drivers commuting between a common origin/destination pair in an acyclic transportation network. The drivers' route choices are affected by their relatively infrequent perturbed best responses to global information about the current network congestion levels, as well as by their instantaneous local observation of the immediate surroundings as they transit through the network. A novel model is proposed for the drivers' route choice behavior, exhibiting local consistency with their preference toward globally less congested paths as well as myopic decisions in favor of locally less congested paths. The simultaneous evolution of the traffic congestion on the network and of the aggregate path preference is modeled by a system of coupled ordinary differential equations. The main result shows that, if the frequency of updates of path preferences is sufficiently small compared to the frequency of the traffic flow dynamics, then the state of the transportation network ultimately approaches a neighborhood of the Wardrop equilibrium. The presented results may be read as further evidence in support of Wardrop's postulate of equilibrium, showing its robustness with respect to non-persistent perturbations. The proposed analysis combines techniques from singular perturbation theory, evolutionary game theory, and cooperative dynamical systems.
Submitted 11 January, 2011;
originally announced January 2011.
-
Scheduling Kalman Filters in Continuous Time
Authors:
Jerome Le Ny,
Eric Feron,
Munther A. Dahleh
Abstract:
A set of N independent Gaussian linear time invariant systems is observed by M sensors whose task is to provide the best possible steady-state causal minimum mean square estimate of the state of the systems, in addition to minimizing a steady-state measurement cost. The sensors can switch between systems instantaneously, and there are additional resource constraints, for example on the number of sensors which can observe a given system simultaneously. We first derive a tractable relaxation of the problem, which provides a bound on the achievable performance. This bound can be computed by solving a convex program involving linear matrix inequalities. Exploiting the additional structure of the sites evolving independently, we can decompose this program into coupled smaller dimensional problems. In the scalar case with identical sensors, we give an analytical expression of an index policy proposed in a more general context by Whittle. In the general case, we develop open-loop periodic switching policies whose performance matches the bound arbitrarily closely.
Submitted 28 October, 2008;
originally announced October 2008.