Showing 1–15 of 15 results for author: Bayati, M

Searching in archive stat.
  1. arXiv:2406.18777

    cs.LG stat.ML

    Aligning Model Properties via Conformal Risk Control

    Authors: William Overman, Jacqueline Jil Vallon, Mohsen Bayati

    Abstract: AI model alignment is crucial due to inadvertent biases in training data and the underspecified pipeline in modern machine learning, where numerous models with excellent test set metrics can be produced, yet they may not meet end-user requirements. Recent advances demonstrate that post-training model alignment via human feedback can address some of these challenges. However, these methods are ofte…

    Submitted 26 June, 2024; originally announced June 2024.

  2. arXiv:2403.10771

    cs.LG stat.ML

    A Probabilistic Approach for Alignment with Human Comparisons

    Authors: Junyu Cao, Mohsen Bayati

    Abstract: A growing trend involves integrating human knowledge into learning frameworks, leveraging subtle human feedback to refine AI models. Despite these advances, no comprehensive theoretical framework describing the specific conditions under which human comparisons improve the traditional supervised fine-tuning process has been developed. To bridge this gap, this paper studies the effective use of huma…

    Submitted 15 March, 2024; originally announced March 2024.

  3. arXiv:2311.08340

    stat.ME stat.ML

    Causal Message Passing: A Method for Experiments with Unknown and General Network Interference

    Authors: Sadegh Shirani, Mohsen Bayati

    Abstract: Randomized experiments are a powerful methodology for data-driven evaluation of decisions or interventions. Yet, their validity may be undermined by network interference. This occurs when the treatment of one unit impacts not only its outcome but also that of connected units, biasing traditional treatment effect estimations. Our study introduces a new framework to accommodate complex and unknown n…

    Submitted 26 June, 2024; v1 submitted 14 November, 2023; originally announced November 2023.

  4. arXiv:2306.14872

    cs.LG stat.ML

    Geometry-Aware Approaches for Balancing Performance and Theoretical Guarantees in Linear Bandits

    Authors: Yuwei Luo, Mohsen Bayati

    Abstract: This paper is motivated by recent research in the $d$-dimensional stochastic linear bandit literature, which has revealed an unsettling discrepancy: algorithms like Thompson sampling and Greedy demonstrate promising empirical performance, yet this contrasts with their pessimistic theoretical regret bounds. The challenge arises from the fact that while these algorithms may perform poorly in certain…

    Submitted 30 December, 2023; v1 submitted 26 June, 2023; originally announced June 2023.

  5. arXiv:2210.00340

    cs.LG stat.ML

    Speed Up the Cold-Start Learning in Two-Sided Bandits with Many Arms

    Authors: Mohsen Bayati, Junyu Cao, Wanning Chen

    Abstract: Multi-armed bandit (MAB) algorithms are efficient approaches to reduce the opportunity cost of online experimentation and are used by companies to find the best product from periodically refreshed product catalogs. However, these algorithms face the so-called cold-start at the onset of the experiment due to a lack of knowledge of customer preferences for new products, requiring an initial data col…

    Submitted 3 November, 2022; v1 submitted 1 October, 2022; originally announced October 2022.

  6. arXiv:2102.07987

    stat.ML cs.LG

    The Elliptical Potential Lemma for General Distributions with an Application to Linear Thompson Sampling

    Authors: Nima Hamidi, Mohsen Bayati

    Abstract: In this note, we introduce a general version of the well-known elliptical potential lemma that is a widely used technique in the analysis of algorithms in sequential learning and decision-making problems. We consider a stochastic linear bandit setting where a decision-maker sequentially chooses among a set of given actions, observes their noisy rewards, and aims to maximize her cumulative expected…

    Submitted 19 January, 2022; v1 submitted 16 February, 2021; originally announced February 2021.

    Comments: Accepted to Operations Research

  7. arXiv:2006.06790

    cs.LG stat.ML

    On Frequentist Regret of Linear Thompson Sampling

    Authors: Nima Hamidi, Mohsen Bayati

    Abstract: This paper studies the stochastic linear bandit problem, where a decision-maker chooses actions from possibly time-dependent sets of vectors in $\mathbb{R}^d$ and receives noisy rewards. The objective is to minimize regret, the difference between the cumulative expected reward of the decision-maker and that of an oracle with access to the expected reward of each action, over a sequence of $T$ deci…

    Submitted 20 April, 2023; v1 submitted 11 June, 2020; originally announced June 2020.

  8. arXiv:2002.11589

    cs.LG cs.IR stat.ML

    Recommendation on a Budget: Column Space Recovery from Partially Observed Entries with Random or Active Sampling

    Authors: Carolyn Kim, Mohsen Bayati

    Abstract: We analyze alternating minimization for column space recovery of a partially observed, approximately low rank matrix with a growing number of columns and a fixed budget of observations per column. In this work, we prove that if the budget is greater than the rank of the matrix, column space recovery succeeds -- as the number of columns grows, the estimate from alternating minimization converges to…

    Submitted 15 May, 2021; v1 submitted 26 February, 2020; originally announced February 2020.

    Comments: A shorter version is accepted to AISTATS

  9. arXiv:2002.10121

    cs.LG stat.ML

    The Unreasonable Effectiveness of Greedy Algorithms in Multi-Armed Bandit with Many Arms

    Authors: Mohsen Bayati, Nima Hamidi, Ramesh Johari, Khashayar Khosravi

    Abstract: We investigate a Bayesian $k$-armed bandit problem in the \emph{many-armed} regime, where $k \geq \sqrt{T}$ and $T$ represents the time horizon. Initially, and aligned with recent literature on many-armed bandit problems, we observe that subsampling plays a key role in designing optimal algorithms; the conventional UCB algorithm is sub-optimal, whereas a subsampled UCB (SS-UCB), which selects…

    Submitted 20 March, 2024; v1 submitted 24 February, 2020; originally announced February 2020.

  10. arXiv:2002.05152

    cs.LG stat.ML

    A General Theory of the Stochastic Linear Bandit and Its Applications

    Authors: Nima Hamidi, Mohsen Bayati

    Abstract: Recent growing adoption of experimentation in practice has led to a surge of attention to multiarmed bandits as a technique to reduce the opportunity cost of online experiments. In this setting, a decision-maker sequentially chooses among a set of given actions, observes their noisy rewards, and aims to maximize her cumulative expected reward (or minimize regret) over a horizon of length $T$. In t…

    Submitted 31 March, 2022; v1 submitted 12 February, 2020; originally announced February 2020.

  11. arXiv:1911.03764

    econ.EM stat.ME stat.ML

    Optimal Experimental Design for Staggered Rollouts

    Authors: Ruoxuan Xiong, Susan Athey, Mohsen Bayati, Guido Imbens

    Abstract: In this paper, we study the design and analysis of experiments conducted on a set of units over multiple time periods where the starting time of the treatment may vary by unit. The design problem involves selecting an initial treatment time for each unit in order to most precisely estimate both the instantaneous and cumulative effects of the treatment. We first consider non-adaptive experiments, w…

    Submitted 25 September, 2023; v1 submitted 9 November, 2019; originally announced November 2019.

    Comments: Forthcoming in Management Science

  12. arXiv:1904.08576

    cs.LG cs.IT stat.ML

    On Low-rank Trace Regression under General Sampling Distribution

    Authors: Nima Hamidi, Mohsen Bayati

    Abstract: In this paper, we study the trace regression when a matrix of parameters B* is estimated via the convex relaxation of a rank-regularized regression or via regularized non-convex optimization. It is known that these estimators satisfy near-optimal error bounds under assumptions on the rank, coherence, and spikiness of B*. We start by introducing a general notion of spikiness for B* that provides a…

    Submitted 29 August, 2023; v1 submitted 17 April, 2019; originally announced April 2019.

    Comments: 49 pages, 6 figures

    Journal ref: Journal of Machine Learning Research (JMLR), 2022

  13. arXiv:1704.09011

    stat.ML cs.LG

    Mostly Exploration-Free Algorithms for Contextual Bandits

    Authors: Hamsa Bastani, Mohsen Bayati, Khashayar Khosravi

    Abstract: The contextual bandit literature has traditionally focused on algorithms that address the exploration-exploitation tradeoff. In particular, greedy algorithms that exploit current estimates without any exploration may be sub-optimal in general. However, exploration-free greedy algorithms are desirable in practical settings where exploration may be costly or unethical (e.g., clinical trials). Surpri…

    Submitted 18 April, 2020; v1 submitted 28 April, 2017; originally announced April 2017.

    Comments: 62 pages, 7 figures

  14. arXiv:1611.06686

    stat.ML stat.CO

    Scalable Approximations for Generalized Linear Problems

    Authors: Murat A. Erdogdu, Mohsen Bayati, Lee H. Dicker

    Abstract: In stochastic optimization, the population risk is generally approximated by the empirical risk. However, in the large-scale setting, minimization of the empirical risk may be computationally restrictive. In this paper, we design an efficient algorithm to approximate the population risk minimizer in generalized linear problems such as binary classification with surrogate losses and generalized lin…

    Submitted 21 November, 2016; originally announced November 2016.

  15. arXiv:1604.07463

    stat.ML

    Dynamic Pricing with Demand Covariates

    Authors: Sheng Qiang, Mohsen Bayati

    Abstract: We consider a firm that sells products over $T$ periods without knowing the demand function. The firm sequentially sets prices to earn revenue and to learn the underlying demand function simultaneously. A natural heuristic for this problem, commonly used in practice, is greedy iterative least squares (GILS). At each time period, GILS estimates the demand as a linear function of the price by applyi…

    Submitted 25 April, 2016; originally announced April 2016.

    Comments: 28 pages, 6 figures