Showing 1–8 of 8 results for author: Pedramfar, M

  1. arXiv:2405.00065  [pdf, other]

    math.OC cs.CC cs.LG stat.ML

    From Linear to Linearizable Optimization: A Novel Framework with Applications to Stationary and Non-stationary DR-submodular Optimization

    Authors: Mohammad Pedramfar, Vaneet Aggarwal

    Abstract: This paper introduces the notion of upper linearizable/quadratizable functions, a class that extends concavity and DR-submodularity in various settings, including monotone and non-monotone cases over different convex sets. A general meta-algorithm is devised to convert algorithms for linear/quadratic maximization into ones that optimize upper quadratizable functions, offering a unified approach to…

    Submitted 13 May, 2024; v1 submitted 27 April, 2024; originally announced May 2024.

    Comments: arXiv admin note: text overlap with arXiv:2402.08621

  2. arXiv:2403.10063  [pdf, other]

    cs.LG cs.AI cs.CC math.OC

    Unified Projection-Free Algorithms for Adversarial DR-Submodular Optimization

    Authors: Mohammad Pedramfar, Yididiya Y. Nadew, Christopher J. Quinn, Vaneet Aggarwal

    Abstract: This paper introduces unified projection-free Frank-Wolfe type algorithms for adversarial continuous DR-submodular optimization, spanning scenarios such as full information and (semi-)bandit feedback, monotone and non-monotone functions, different constraints, and types of stochastic queries. For every problem considered in the non-monotone setting, the proposed algorithms are either the first wit…

    Submitted 26 April, 2024; v1 submitted 15 March, 2024; originally announced March 2024.

    Comments: This paper is published in ICLR 2024. This version includes a correction for regret bounds in the full-information zeroth order feedback setting (see the footnote on page 1 for details)

  3. arXiv:2402.08621  [pdf, other]

    cs.LG math.OC stat.ML

    A Generalized Approach to Online Convex Optimization

    Authors: Mohammad Pedramfar, Vaneet Aggarwal

    Abstract: In this paper, we analyze the problem of online convex optimization in different settings. We show that any algorithm for online linear optimization with fully adaptive adversaries is an algorithm for online convex optimization. We also show that any such algorithm that requires full-information feedback may be transformed to an algorithm with semi-bandit feedback with comparable regret bound. We…

    Submitted 13 May, 2024; v1 submitted 13 February, 2024; originally announced February 2024.

  4. arXiv:2310.20007  [pdf, ps, other]

    stat.ML cs.AI cs.LG

    Improved Bayesian Regret Bounds for Thompson Sampling in Reinforcement Learning

    Authors: Ahmadreza Moradipari, Mohammad Pedramfar, Modjtaba Shokrian Zini, Vaneet Aggarwal

    Abstract: In this paper, we prove the first Bayesian regret bounds for Thompson Sampling in reinforcement learning in a multitude of settings. We simplify the learning problem using a discrete set of surrogate environments, and present a refined analysis of the information ratio using posterior consistency. This leads to an upper bound of order $\widetilde{O}(H\sqrt{d_{l_1}T})$ in the time inhomogeneous rei…

    Submitted 6 February, 2024; v1 submitted 30 October, 2023; originally announced October 2023.

    Comments: 37th Conference on Neural Information Processing Systems (NeurIPS 2023)

  5. arXiv:2306.17054  [pdf, other]

    cs.NI

    Two-tiered Online Optimization of Region-wide Datacenter Resource Allocation via Deep Reinforcement Learning

    Authors: Chang-Lin Chen, Hanhan Zhou, Jiayu Chen, Mohammad Pedramfar, Vaneet Aggarwal, Tian Lan, Zheqing Zhu, Chi Zhou, Tim Gasser, Pol Mauri Ruiz, Vijay Menon, Neeraj Kumar, Hongbo Dong

    Abstract: This paper addresses the important need for advanced techniques in continuously allocating workloads on shared infrastructures in data centers, a problem arising due to the growing popularity and scale of cloud computing. It particularly emphasizes the scarcity of research ensuring guaranteed capacity in capacity reservations during large-scale failures. To tackle these issues, the paper presents…

    Submitted 29 June, 2023; originally announced June 2023.

  6. arXiv:2305.16671  [pdf, ps, other]

    cs.LG cs.AI cs.CC

    A Unified Approach for Maximizing Continuous DR-submodular Functions

    Authors: Mohammad Pedramfar, Christopher John Quinn, Vaneet Aggarwal

    Abstract: This paper presents a unified approach for maximizing continuous DR-submodular functions that encompasses a range of settings and oracle access types. Our approach includes a Frank-Wolfe type offline algorithm for both monotone and non-monotone functions, with different restrictions on the general convex set. We consider settings where the oracle provides access to either the gradient of the funct…

    Submitted 12 January, 2024; v1 submitted 26 May, 2023; originally announced May 2023.

    Comments: 37th Conference on Neural Information Processing Systems (NeurIPS 2023)

  7. arXiv:2303.13604  [pdf, other]

    cs.LG cs.AI cs.DS

    Stochastic Submodular Bandits with Delayed Composite Anonymous Bandit Feedback

    Authors: Mohammad Pedramfar, Vaneet Aggarwal

    Abstract: This paper investigates the problem of combinatorial multiarmed bandits with stochastic submodular (in expectation) rewards and full-bandit delayed feedback, where the delayed feedback is assumed to be composite and anonymous. In other words, the delayed feedback is composed of components of rewards from past actions, with unknown division among the sub-components. Three models of delayed feedback…

    Submitted 23 March, 2023; originally announced March 2023.

  8. arXiv:2001.10474  [pdf, other]

    cs.LG cs.AI stat.ML

    Coagent Networks Revisited

    Authors: Modjtaba Shokrian Zini, Mohammad Pedramfar, Matthew Riemer, Ahmadreza Moradipari, Miao Liu

    Abstract: Coagent networks formalize the concept of arbitrary networks of stochastic agents that collaborate to take actions in a reinforcement learning environment. Prominent examples of coagent networks in action include approaches to hierarchical reinforcement learning (HRL), such as those using options, which attempt to address the exploration exploitation trade-off by introducing abstract actions at di…

    Submitted 29 August, 2023; v1 submitted 28 January, 2020; originally announced January 2020.

    Comments: Reformatted paper significantly and clarified results on the asynchronous case