-
Fast Revenue Maximization
Authors:
Achraf Bahamou,
Omar Besbes,
Omar Mouchtaki
Abstract:
We study a data-driven pricing problem in which a seller offers a price for a single item based on demand observed at a small finite number of historical prices. Our goal is to derive precise evaluation procedures of the value of the historical information gathered by the seller, along with prescriptions for more efficient price experimentation. Our main methodological result is an exact characterization of the maximin ratio (defined as the worst-case revenue garnered by a seller who only relies on past data divided by the optimal revenue achievable with full knowledge of the distribution of values). This result allows one to measure the value of any historical data consisting of prices and corresponding conversion rates. We leverage this central reduction to provide new insights about price experimentation. Motivated by practical constraints that impede the seller from changing prices abruptly, we first illustrate our framework by evaluating the value of local information and show that the mere sign of the gradient may sometimes provide significant information to the seller. We then showcase how our framework can be used to run efficient price experiments. On the one hand, we develop a method to select the next price experiment that the seller should use to maximize the information obtained. On the other hand, we demonstrate that our result allows one to considerably reduce the price experimentation needed to reach preset revenue guarantees through dynamic pricing algorithms.
Submitted 9 July, 2024;
originally announced July 2024.
-
Battery Operations in Electricity Markets: Strategic Behavior and Distortions
Authors:
Jerry Anunrojwong,
Santiago R. Balseiro,
Omar Besbes,
Bolun Xu
Abstract:
Electric power systems are undergoing a major transformation as they integrate intermittent renewable energy sources, and batteries to smooth out variations in renewable energy production. As privately-owned batteries grow from their role as marginal "price-takers" to significant players in the market, a natural question arises: How do batteries operate in electricity markets, and how does the strategic behavior of decentralized batteries distort decisions compared to centralized batteries?
We propose an analytically tractable model that captures salient features of the highly complex electricity market. We derive in closed form the resulting battery behavior and generation cost in three operating regimes: (i) no battery, (ii) centralized battery, and (iii) decentralized profit-maximizing battery. We establish that a decentralized battery distorts its discharge decisions in three ways. First, there is quantity withholding, i.e., discharging less than centrally optimal. Second, there is a shift in participation from day-ahead to real-time, i.e., postponing some of its discharge from day-ahead to real-time. Third, there is a reduction in real-time responsiveness, i.e., discharging less in response to smoothing real-time demand than centrally optimal. We quantify each of the three forms of distortion in terms of market fundamentals. To illustrate our results, we calibrate our model to Los Angeles and Houston and show that the loss from incentive misalignment could be consequential.
Submitted 26 June, 2024;
originally announced June 2024.
-
The Fault in Our Recommendations: On the Perils of Optimizing the Measurable
Authors:
Omar Besbes,
Yash Kanoria,
Akshit Kumar
Abstract:
Recommendation systems are widespread, and through customized recommendations, promise to match users with options they will like. To that end, data on engagement is collected and used. Most recommendation systems are ranking-based, where they rank and recommend items based on their predicted engagement. However, the engagement signals are often only a crude proxy for utility, as data on the latter is rarely collected or available. This paper explores the following question: By optimizing for measurable proxies, are recommendation systems at risk of significantly under-delivering on utility? If so, how can one improve utility, which is seldom measured? To study these questions, we introduce a model of repeated user consumption in which, at each interaction, users select between an outside option and the best option from a recommendation set. Our model accounts for user heterogeneity, with the majority preferring "popular" content, and a minority favoring "niche" content. The system initially lacks knowledge of individual user preferences but can learn them through observations of users' choices over time. Our theoretical and numerical analyses demonstrate that optimizing for engagement can lead to significant utility losses. Instead, we propose a utility-aware policy that initially recommends a mix of popular and niche content. As the platform becomes more forward-looking, our utility-aware policy achieves the best of both worlds: near-optimal utility and near-optimal engagement simultaneously. Our study elucidates an important feature of recommendation systems: given the ability to suggest multiple items, one can perform significant exploration without incurring significant reductions in engagement. By recommending high-risk, high-reward items alongside popular items, systems can enhance discovery of high-utility items without significantly affecting engagement.
Submitted 6 May, 2024;
originally announced May 2024.
-
The Best of Many Robustness Criteria in Decision Making: Formulation and Application to Robust Pricing
Authors:
Jerry Anunrojwong,
Santiago R. Balseiro,
Omar Besbes
Abstract:
In robust decision-making under non-Bayesian uncertainty, different robust optimization criteria, such as maximin performance, minimax regret, and maximin ratio, have been proposed. In many problems, all three criteria are well-motivated and well-grounded from a decision-theoretic perspective, yet different criteria give different prescriptions. This paper initiates a systematic study of overfitting to robustness criteria. How good is a prescription derived from one criterion when evaluated against another criterion? Does there exist a prescription that performs well against all criteria of interest? We formalize and study these questions through the prototypical problem of robust pricing under various information structures, including support, moments, and percentiles of the distribution of values. We provide a unified analysis of three focal robust criteria across various information structures and evaluate the relative performance of mechanisms optimized for each criterion against the others. We find that mechanisms optimized for one criterion often perform poorly against other criteria, highlighting the risk of overfitting to a particular robustness criterion. Remarkably, we show it is possible to design mechanisms that achieve good performance across all three criteria simultaneously, suggesting that decision-makers need not compromise among criteria.
Submitted 18 March, 2024;
originally announced March 2024.
-
Robust Auction Design with Support Information
Authors:
Jerry Anunrojwong,
Santiago R. Balseiro,
Omar Besbes
Abstract:
A seller wants to sell an item to n buyers. Buyer valuations are drawn i.i.d. from a distribution unknown to the seller; the seller only knows that the support is included in [a, b]. To be robust, the seller chooses a DSIC mechanism that optimizes the worst-case performance relative to the first-best benchmark. Our analysis unifies the regret and the ratio objectives.
For these objectives, we derive an optimal mechanism and the corresponding performance in quasi-closed form, as a function of the support information and the number of buyers n. Our analysis reveals three regimes of support information and a new class of robust mechanisms. i.) With "low" support information, the optimal mechanism is a second-price auction (SPA) with random reserve, a focal class in earlier literature. ii.) With "high" support information, SPAs are strictly suboptimal, and an optimal mechanism belongs to a class of mechanisms we introduce, which we call pooling auctions (POOL); whenever the highest value is above a threshold, the mechanism still allocates to the highest bidder, but otherwise the mechanism allocates to a uniformly random buyer, i.e., pools low types. iii.) With "moderate" support information, a randomization between SPA and POOL is optimal.
We also characterize optimal mechanisms within nested central subclasses of mechanisms: standard mechanisms that only allocate to the highest bidder, SPA with random reserve, and SPA with no reserve. We show strict separations in terms of performance across classes, implying that deviating from standard mechanisms is necessary for robustness.
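The allocation rule of a pooling auction described above can be sketched in a few lines. This is only an illustration of the allocation side: the function name `pool_allocation`, the strict comparison with the threshold, and the uniform tie-breaking among highest bidders are assumptions made here, and the payment rule and the optimal threshold derived in the paper are not reproduced.

```python
def pool_allocation(bids, threshold):
    """Allocation probabilities of a pooling-style auction (illustrative sketch).

    If the highest bid is above `threshold`, the item goes to the highest
    bidder (ties split uniformly, an assumption of this sketch). Otherwise
    all buyers are pooled and each receives the item with probability 1/n.
    """
    n = len(bids)
    if n == 0:
        raise ValueError("at least one bid is required")
    top = max(bids)
    if top > threshold:
        # Standard case: allocate to the highest bidder(s).
        winners = [i for i, b in enumerate(bids) if b == top]
        share = 1.0 / len(winners)
        return [share if i in winners else 0.0 for i in range(n)]
    # Pooling case: low types are treated identically.
    return [1.0 / n] * n


if __name__ == "__main__":
    print(pool_allocation([0.2, 0.9, 0.5], threshold=0.6))
    print(pool_allocation([0.2, 0.4], threshold=0.6))
```

The first call falls in the regime where the highest bidder wins; the second falls below the threshold, so both buyers are pooled.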
Submitted 26 August, 2023; v1 submitted 15 May, 2023;
originally announced May 2023.
-
From Contextual Data to Newsvendor Decisions: On the Actual Performance of Data-Driven Algorithms
Authors:
Omar Besbes,
Will Ma,
Omar Mouchtaki
Abstract:
In this work, we explore a framework for contextual decision-making to study how the relevance and quantity of past data affect the performance of a data-driven policy. We analyze a contextual Newsvendor problem in which a decision-maker needs to trade off an underage cost against an overage cost in the face of uncertain demand. We consider a setting in which past demands observed under "close by" contexts come from close by distributions and analyze the performance of data-driven algorithms through a notion of context-dependent worst-case expected regret. We analyze the broad class of Weighted Empirical Risk Minimization (WERM) policies which weigh past data according to their similarity in the contextual space. This class includes classical policies such as ERM, k-Nearest Neighbors and kernel-based policies. Our main methodological contribution is to characterize exactly the worst-case regret of any WERM policy on any given configuration of contexts. To the best of our knowledge, this provides the first understanding of tight performance guarantees in any contextual decision-making problem, with past literature focusing on upper bounds via concentration inequalities. We instead take an optimization approach, and isolate a structure in the Newsvendor loss function that allows us to reduce the infinite-dimensional optimization problem over worst-case distributions to a simple line search.
This in turn allows us to unveil fundamental insights that were obfuscated by previous general-purpose bounds. We characterize actual guaranteed performance as a function of the contexts, as well as granular insights on the learning curve of algorithms.
Submitted 27 July, 2023; v1 submitted 16 February, 2023;
originally announced February 2023.
-
Beyond IID: data-driven decision-making in heterogeneous environments
Authors:
Omar Besbes,
Will Ma,
Omar Mouchtaki
Abstract:
How should one leverage historical data when past observations are not perfectly indicative of the future, e.g., due to the presence of unobserved confounders which one cannot "correct" for? Motivated by this question, we study a data-driven decision-making framework in which historical samples are generated from unknown and different distributions assumed to lie in a heterogeneity ball with known radius and centered around the (also) unknown future (out-of-sample) distribution on which the performance of a decision will be evaluated. This work aims at analyzing the performance of central data-driven policies but also near-optimal ones in these heterogeneous environments and understanding key drivers of performance. We establish a first result which allows us to upper bound the asymptotic worst-case regret of a broad class of policies. Leveraging this result, for any integral probability metric, we provide a general analysis of the performance achieved by Sample Average Approximation (SAA) as a function of the radius of the heterogeneity ball. This analysis is centered around the approximation parameter, a notion of complexity we introduce to capture how the interplay between the heterogeneity and the problem structure impacts the performance of SAA. In turn, we illustrate through several widely-studied problems -- e.g., newsvendor, pricing -- how this methodology can be applied and find that the performance of SAA varies considerably depending on the combinations of problem classes and heterogeneity. The failure of SAA for certain instances motivates the design of alternative policies to achieve rate-optimality. We derive problem-dependent policies achieving strong guarantees for the illustrative problems described above and provide initial results towards a principled approach for the design and analysis of general rate-optimal algorithms.
Submitted 19 June, 2024; v1 submitted 20 June, 2022;
originally announced June 2022.
-
Dynamic Resource Allocation: Algorithmic Design Principles and Spectrum of Achievable Performances
Authors:
Omar Besbes,
Yash Kanoria,
Akshit Kumar
Abstract:
Dynamic resource allocation problems are ubiquitous, arising in inventory management, order fulfillment, online advertising, and other applications. We initially focus on one of the simplest models of online resource allocation: the multisecretary problem. In the multisecretary problem, a decision maker sequentially hires up to $B$ out of $T$ candidates, and candidate ability values are drawn i.i.d. from a distribution $F$ on $[0,1]$. First, we investigate fundamental limits on performance as a function of the value distribution under consideration. We quantify performance in terms of regret, defined as the additive loss relative to the best performance achievable in hindsight. We present a novel fundamental regret lower bound scaling of $\Omega(T^{1/2 - 1/(2(1 + \beta))})$ for distributions with gaps in their support, with $\beta$ quantifying the mass accumulation of types (values) around these gaps. This lower bound contrasts with the constant and logarithmic regret guarantees shown to be achievable in prior work, under specific assumptions on the value distribution. Second, we introduce a novel algorithmic principle, Conservativeness with respect to Gaps (CwG), which yields near-optimal performance with regret scaling of $\tilde{O}(T^{1/2 - 1/(2(1 + \beta))})$ for any distribution in a class parameterized by the mass accumulation parameter $\beta$. We then turn to operationalizing the CwG principle across dynamic resource allocation problems. We study a general and practical algorithm, Repeatedly Act using Multiple Simulations (RAMS), which simulates possible futures to estimate a hindsight-based approximation of the value-to-go function. We establish that this algorithm inherits theoretical performance guarantees of algorithms tailored to the distribution of resource requests, including our CwG-based algorithm, and find that it outperforms them in numerical experiments.
Submitted 5 October, 2023; v1 submitted 18 May, 2022;
originally announced May 2022.
-
On the Robustness of Second-Price Auctions in Prior-Independent Mechanism Design
Authors:
Jerry Anunrojwong,
Santiago R. Balseiro,
Omar Besbes
Abstract:
Classical Bayesian mechanism design relies on the common prior assumption, but such a prior is often not available in practice. We study the design of prior-independent mechanisms that relax this assumption: the seller is selling an indivisible item to $n$ buyers such that the buyers' valuations are drawn from a joint distribution that is unknown to both the buyers and the seller; buyers do not need to form beliefs about competitors, and the seller assumes the distribution is adversarially chosen from a specified class. We measure performance through the worst-case regret, or the difference between the expected revenue achievable with perfect knowledge of buyers' valuations and the actual mechanism revenue.
We study a broad set of classes of valuation distributions that capture a wide spectrum of possible dependencies: independent and identically distributed (i.i.d.) distributions, mixtures of i.i.d. distributions, affiliated and exchangeable distributions, exchangeable distributions, and all joint distributions. We derive in quasi-closed form the minimax values and the associated optimal mechanism. In particular, we show that the first three classes admit the same minimax regret value, which is decreasing with the number of competitors, while the last two have the same minimax regret, equal to that of the single-buyer case. Furthermore, we show that the minimax optimal mechanisms have a simple form across all settings: a second-price auction with random reserve prices, which shows its robustness in prior-independent mechanism design. En route to our results, we also develop a principled methodology to determine the form of the optimal mechanism and worst-case distribution via first-order conditions that should be of independent interest in other minimax problems.
Submitted 18 January, 2024; v1 submitted 21 April, 2022;
originally announced April 2022.
-
How Big Should Your Data Really Be? Data-Driven Newsvendor: Learning One Sample at a Time
Authors:
Omar Besbes,
Omar Mouchtaki
Abstract:
We study the classical newsvendor problem in which the decision-maker must trade off underage and overage costs. In contrast to the typical setting, we assume that the decision-maker does not know the underlying distribution driving uncertainty but only has access to historical data. In turn, the key questions are how to map existing data to a decision and what type of performance to expect as a function of the data size. We analyze the classical setting with access to past samples drawn from the distribution (e.g., past demand), focusing not only on asymptotic performance but also on what we call the transient regime of learning, i.e., performance for arbitrary data sizes. We evaluate the performance of any algorithm through its worst-case relative expected regret, compared to an oracle with knowledge of the distribution. We provide the first finite sample exact analysis of the classical Sample Average Approximation (SAA) algorithm for this class of problems across all data sizes. This allows us to uncover novel fundamental insights on the value of data: it reveals that tens of samples are sufficient to perform very efficiently but also that more data can lead to worse out-of-sample performance for SAA. We then focus on the general class of mappings from data to decisions without any restriction on the set of policies and derive an optimal algorithm (in the minimax sense) as well as characterize its associated performance. This leads to significant improvements for limited data sizes, and allows us to exactly quantify the value of historical information.
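For readers unfamiliar with SAA in the newsvendor setting, the following minimal sketch shows the standard textbook form: order the empirical quantile of past demand at the critical ratio b / (b + h), where b is the per-unit underage cost and h the per-unit overage cost. The function names and the choice of the smallest qualifying sample as the quantile are assumptions of this sketch; it does not implement the minimax-optimal algorithm derived in the paper.

```python
import math


def saa_newsvendor(samples, underage_cost, overage_cost):
    """SAA order quantity: smallest sample whose empirical CDF reaches b / (b + h)."""
    if not samples:
        raise ValueError("at least one demand sample is required")
    ratio = underage_cost / (underage_cost + overage_cost)
    ordered = sorted(samples)
    # Index of the empirical quantile (1-based), clamped to at least 1.
    k = max(1, math.ceil(ratio * len(ordered)))
    return ordered[k - 1]


def empirical_cost(order_qty, samples, underage_cost, overage_cost):
    """Average newsvendor cost of `order_qty` over the observed demands."""
    total = 0.0
    for demand in samples:
        total += underage_cost * max(demand - order_qty, 0)
        total += overage_cost * max(order_qty - demand, 0)
    return total / len(samples)


if __name__ == "__main__":
    past_demand = [10, 20, 30, 40]
    q = saa_newsvendor(past_demand, underage_cost=3, overage_cost=1)
    print(q, empirical_cost(q, past_demand, 3, 1))
```

With four samples and a critical ratio of 0.75, the sketch orders the third-smallest observed demand, which minimizes the empirical cost over the sample.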
Submitted 25 July, 2022; v1 submitted 6 July, 2021;
originally announced July 2021.
-
Contextual Inverse Optimization: Offline and Online Learning
Authors:
Omar Besbes,
Yuri Fonseca,
Ilan Lobel
Abstract:
We study the problems of offline and online contextual optimization with feedback information, where instead of observing the loss, we observe, after the fact, the optimal action an oracle with full knowledge of the objective function would have taken. We aim to minimize regret, which is defined as the difference between our losses and the ones incurred by an all-knowing oracle. In the offline setting, the decision-maker has information available from past periods and needs to make one decision, while in the online setting, the decision-maker optimizes decisions dynamically over time based on a new set of feasible actions and contextual functions in each period. For the offline setting, we characterize the optimal minimax policy, establishing the performance that can be achieved as a function of the underlying geometry of the information induced by the data. In the online setting, we leverage this geometric characterization to optimize the cumulative regret. We develop an algorithm that yields the first regret bound for this problem that is logarithmic in the time horizon. Finally, we show via simulation that our proposed algorithms outperform previous methods from the literature.
Submitted 1 July, 2023; v1 submitted 26 June, 2021;
originally announced June 2021.
-
Optimal Pricing with a Single Point
Authors:
Amine Allouah,
Achraf Bahamou,
Omar Besbes
Abstract:
We study the following fundamental data-driven pricing problem. How can/should a decision-maker price its product based on data at a single historical price? How valuable is such data? We consider a decision-maker who optimizes over (potentially randomized) pricing policies to maximize the worst-case ratio of the revenue she can garner compared to an oracle with full knowledge of the distribution of values, when the latter is only assumed to belong to a broad non-parametric set. In particular, our framework applies to the widely used regular and monotone non-decreasing hazard rate (mhr) classes of distributions. For settings where the seller knows the exact probability of sale associated with one historical price or only a confidence interval for it, we fully characterize optimal performance and near-optimal pricing algorithms that adjust to the information at hand. The framework we develop is general and allows us to characterize optimal performance for deterministic or more general randomized mechanisms, and leads to fundamental novel insights on the value of data for pricing. As examples, against mhr distributions, we show that it is possible to guarantee $85\%$ of oracle performance if one knows that half of the customers have bought at the historical price, and if only $1\%$ of the customers bought, it is still possible to guarantee $51\%$ of oracle performance.
Submitted 28 March, 2022; v1 submitted 9 March, 2021;
originally announced March 2021.
-
Mechanism Design under Approximate Incentive Compatibility
Authors:
Santiago Balseiro,
Omar Besbes,
Francisco Castro
Abstract:
A fundamental assumption in classical mechanism design is that buyers are perfect optimizers. However, in practice, buyers may be limited by their computational capabilities or a lack of information, and may not be able to perfectly optimize. This has motivated the introduction of approximate incentive compatibility (IC) as an appealing solution concept for practical mechanism design. While most of the literature focuses on the analysis of particular approximate IC mechanisms, this paper is the first to study the design of optimal mechanisms in the space of approximate IC mechanisms and to explore how much revenue can be garnered by moving from exact to approximate incentive constraints. We study the problem of a seller facing one buyer with private values and analyze optimal selling mechanisms under $\varepsilon$-incentive compatibility. We establish that the gains that can be garnered depend on the local curvature of the seller's revenue function around the optimal posted price when the buyer is a perfect optimizer. If the revenue function behaves locally like an $α$-power for $α\in (1,\infty)$, then no mechanism can garner gains higher than order $\varepsilon^{α/(2α-1)}$. This improves upon state-of-the-art results which imply maximum gains of $\varepsilon^{1/2}$ by providing the first parametric bounds that capture the impact of the revenue function's curvature on revenue gains. Furthermore, we establish that an optimal mechanism needs to randomize as soon as $\varepsilon>0$ and construct a randomized mechanism that is guaranteed to achieve order $\varepsilon^{α/(2α-1)}$ additional revenues, leading to a tight characterization of the revenue implications of approximate IC constraints. Our work brings forward the need to optimize not only over allocations and payments but also over best responses, and we develop a new framework to address this challenge.
Submitted 24 March, 2022; v1 submitted 4 March, 2021;
originally announced March 2021.
-
Static Pricing: Universal Guarantees for Reusable Resources
Authors:
Omar Besbes,
Adam N. Elmachtoub,
Yunjie Sun
Abstract:
We consider a fundamental pricing model in which a fixed number of units of a reusable resource are used to serve customers. Customers arrive to the system according to a stochastic process and upon arrival decide whether or not to purchase the service, depending on their willingness-to-pay and the current price. The service time during which the resource is used by the customer is stochastic and the firm may incur a service cost. This model represents various markets for reusable resources such as cloud computing, shared vehicles, rotable parts, and hotel rooms. In the present paper, we analyze this pricing problem when the firm attempts to maximize a weighted combination of three central metrics: profit, market share, and service level. Under Poisson arrivals, exponential service times, and standard assumptions on the willingness-to-pay distribution, we establish a series of results that characterize the performance of static pricing in such environments.
In particular, while an optimal policy is fully dynamic in such a context, we prove that a static pricing policy simultaneously guarantees 78.9% of the profit, market share, and service level from the optimal policy. Notably, this result holds for any service rate and number of units the firm operates. Our proof technique relies on a judicious construction of a static price that is derived directly from the optimal dynamic pricing policy. In the special case where there are two units and the induced demand is linear, we also prove that the static policy guarantees 95.5% of the profit from the optimal policy. Our numerical findings on a large testbed of instances suggest that the latter result is quite indicative of the profit obtained by the static pricing policy across all parameters.
Submitted 10 February, 2020; v1 submitted 2 May, 2019;
originally announced May 2019.
-
Microscopic origin of ferromagnetism in trihalides CrCl$_3$ and CrI$_3$
Authors:
Omar Besbes,
Sergey Nikolaev,
Igor Solovyev
Abstract:
The microscopic origin of the ferromagnetic (FM) exchange coupling in CrCl$_3$ and CrI$_3$, together with their common aspects and differences, is investigated on the basis of density functional theory combined with a realistic modeling approach for the analysis of interatomic exchange interactions. We perform a comparative study based on the pseudopotential and linear muffin-tin orbital methods by treating the effects of electron exchange and correlation in GGA and LSDA, respectively. The results of ordinary band structure calculations are used to construct minimal tight-binding type models describing the behavior of the magnetic Cr $3d$ and ligand $p$ bands in the basis of localized Wannier functions, and to evaluate the effective exchange coupling ($J_{\rm eff}$) between two Cr sublattices employing four different techniques: (i) Second-order Green's function perturbation theory for infinitesimal spin rotations of the LSDA (GGA) potential at the Cr sites; (ii) Enforcement of the magnetic force theorem in order to treat both Cr and ligand spins on a localized footing; (iii) Constrained total-energy calculations with an external field, treated in the framework of self-consistent linear response theory. We argue that the ligand states play a crucial role in the ferromagnetism of Cr trihalides, though their contribution to $J_{\rm eff}$ strongly depends on additional assumptions, which are traced back to fundamentals of adiabatic spin dynamics. In particular, by neglecting ligand spins in the Green's function method, $J_{\rm eff}$ can easily become antiferromagnetic, while by treating them as localized, one can severely overestimate the FM coupling. The best approach considered is based on the constraint method, where the ligand states are allowed to relax in response to each instantaneous reorientation of the Cr spins, controlled by the external field.
Submitted 28 January, 2019;
originally announced January 2019.
-
Optimal Exploration-Exploitation in a Multi-Armed-Bandit Problem with Non-stationary Rewards
Authors:
Omar Besbes,
Yonatan Gur,
Assaf Zeevi
Abstract:
In a multi-armed bandit (MAB) problem a gambler needs to choose at each round of play one of K arms, each characterized by an unknown reward distribution. Reward realizations are only observed when an arm is selected, and the gambler's objective is to maximize his cumulative expected earnings over some given horizon of play T. To do this, the gambler needs to acquire information about arms (exploration) while simultaneously optimizing immediate rewards (exploitation); the price paid due to this trade-off is often referred to as the regret, and the main question is how small this price can be as a function of the horizon length T. This problem has been studied extensively when the reward distributions do not change over time, an assumption that supports a sharp characterization of the regret, yet is often violated in practical settings. In this paper, we focus on a MAB formulation which allows for a broad range of temporal uncertainties in the rewards, while still maintaining mathematical tractability. We fully characterize the (regret) complexity of this class of MAB problems by establishing a direct link between the extent of allowable reward "variation" and the minimal achievable regret. Our analysis draws some connections between two rather disparate strands of literature: the adversarial and the stochastic MAB frameworks.
Submitted 6 June, 2019; v1 submitted 13 May, 2014;
originally announced May 2014.
-
Non-stationary Stochastic Optimization
Authors:
O. Besbes,
Y. Gur,
A. Zeevi
Abstract:
We consider a non-stationary variant of a sequential stochastic optimization problem, in which the underlying cost functions may change along the horizon. We propose a measure, termed variation budget, that controls the extent of said change, and study how restrictions on this budget impact achievable performance. We identify sharp conditions under which it is possible to achieve long-run-average optimality and more refined performance measures such as rate optimality that fully characterize the complexity of such problems. In doing so, we also establish a strong connection between two rather disparate strands of literature: adversarial online convex optimization and the more traditional stochastic approximation paradigm (couched in a non-stationary setting). This connection is the key to deriving well-performing policies in the latter, by leveraging the structure of optimal policies in the former. Finally, tight bounds on the minimax regret allow us to quantify the "price of non-stationarity," which mathematically captures the added complexity embedded in a temporally changing environment versus a stationary one.
Submitted 22 December, 2014; v1 submitted 20 July, 2013;
originally announced July 2013.