Showing 1–11 of 11 results for author: Féraud, R

  1. arXiv:2307.10704  [pdf]

    cs.LG

    Decentralized Smart Charging of Large-Scale EVs using Adaptive Multi-Agent Multi-Armed Bandits

    Authors: Sharyal Zafar, Raphaël Feraud, Anne Blavette, Guy Camilleri, Hamid Ben

    Abstract: The drastic growth of electric vehicles and photovoltaics can introduce new challenges, such as electrical current congestion and voltage limit violations due to peak load demands. These issues can be mitigated by controlling the operation of electric vehicles, i.e., smart charging. Centralized smart charging solutions have already been proposed in the literature. But such solutions may lack scalab…

    Submitted 20 July, 2023; originally announced July 2023.

    Comments: CIRED 2023 International Conference & Exhibition on Electricity Distribution, Jun 2023, Rome, Italy

  2. arXiv:2109.14733  [pdf, other]

    cs.LG cs.AI

    Batched Bandits with Crowd Externalities

    Authors: Romain Laroche, Othmane Safsafi, Raphael Feraud, Nicolas Broutin

    Abstract: In Batched Multi-Armed Bandits (BMAB), the policy is not allowed to be updated at each time step. Usually, the setting asserts a maximum number of allowed policy updates and the algorithm schedules them so as to minimize the expected regret. In this paper, we describe a novel setting for BMAB, with the following twist: the timing of the policy update is not controlled by the BMAB algorithm, but…

    Submitted 29 September, 2021; originally announced September 2021.

    Comments: 31 pages

  3. arXiv:2010.09473  [pdf, other]

    cs.LG cs.AI

    Double-Linear Thompson Sampling for Context-Attentive Bandits

    Authors: Djallel Bouneffouf, Raphaël Féraud, Sohini Upadhyay, Yasaman Khazaeni, Irina Rish

    Abstract: In this paper, we analyze and extend an online learning framework known as Context-Attentive Bandit, motivated by various practical applications, from medical diagnosis to dialog systems, where due to observation costs only a small subset of a potentially large number of context variables can be observed at each iteration; however, the agent has the freedom to choose which variables to observe. We de…

    Submitted 15 October, 2020; originally announced October 2020.

    Comments: arXiv admin note: text overlap with arXiv:1906.09384

  4. arXiv:1811.07763  [pdf, other]

    cs.LG stat.ML

    Decentralized Exploration in Multi-Armed Bandits -- Extended version

    Authors: Raphaël Féraud, Réda Alami, Romain Laroche

    Abstract: We consider the decentralized exploration problem: a set of players collaborate to identify the best arm by asynchronously interacting with the same stochastic environment. The objective is to ensure privacy in the best arm identification problem between asynchronous, collaborative, and thrifty players. In the context of a digital service, we advocate that this decentralized approach allows a good…

    Submitted 16 January, 2023; v1 submitted 19 November, 2018; originally announced November 2018.

  5. arXiv:1705.03821  [pdf, other]

    cs.AI cs.LG stat.ML

    Context Attentive Bandits: Contextual Bandit with Restricted Context

    Authors: Djallel Bouneffouf, Irina Rish, Guillermo A. Cecchi, Raphael Feraud

    Abstract: We consider a novel formulation of the multi-armed bandit model, which we call the contextual bandit with restricted context, where only a limited number of features can be accessed by the learner at every iteration. This novel formulation is motivated by different online problems arising in clinical trials, recommender systems and attention modeling. Herein, we adapt the standard multi-armed band…

    Submitted 7 June, 2017; v1 submitted 10 May, 2017; originally announced May 2017.

    Comments: IJCAI 2017

  6. arXiv:1701.08810  [pdf, other]

    stat.ML cs.AI cs.LG math.OC

    Reinforcement Learning Algorithm Selection

    Authors: Romain Laroche, Raphael Feraud

    Abstract: This paper formalises the problem of online algorithm selection in the context of Reinforcement Learning. The setup is as follows: given an episodic task and a finite number of off-policy RL algorithms, a meta-algorithm has to decide which RL algorithm is in control during the next episode so as to maximize the expected return. The article presents a novel meta-algorithm, called Epochal Stochastic…

    Submitted 14 November, 2017; v1 submitted 30 January, 2017; originally announced January 2017.

  7. arXiv:1609.02139  [pdf, other]

    cs.AI

    Random Shuffling and Resets for the Non-stationary Stochastic Bandit Problem

    Authors: Robin Allesiardo, Raphaël Féraud, Odalric-Ambrym Maillard

    Abstract: We consider a non-stationary formulation of the stochastic multi-armed bandit where the rewards are no longer assumed to be identically distributed. For the best-arm identification task, we introduce a version of Successive Elimination based on random shuffling of the $K$ arms. We prove that under a novel and mild assumption on the mean gap $Δ$, this simple but powerful modification achieves the s…

    Submitted 7 September, 2016; originally announced September 2016.

  8. arXiv:1602.03779  [pdf, other]

    cs.AI cs.DC cs.LG

    Network of Bandits insure Privacy of end-users

    Authors: Raphaël Féraud

    Abstract: In order to distribute the best arm identification task as close as possible to the user's devices, on the edge of the Radio Access Network, we propose a new problem setting, where distributed players collaborate to find the best arm. This architecture guarantees privacy to end-users since no events are stored. The only thing that can be observed by an adversary through the core network is aggrega…

    Submitted 29 March, 2017; v1 submitted 11 February, 2016; originally announced February 2016.

  9. arXiv:1508.07091  [pdf, other]

    cs.LG

    Multi-armed Bandit Problem with Known Trend

    Authors: Djallel Bouneffouf, Raphaël Feraud

    Abstract: We consider a variant of the multi-armed bandit model, which we call multi-armed bandit problem with known trend, where the gambler knows the shape of the reward function of each arm but not its distribution. This new problem is motivated by different online problems like active learning, music and interface recommendation applications, where when an arm is sampled by the model the received reward…

    Submitted 10 May, 2017; v1 submitted 28 August, 2015; originally announced August 2015.

    Comments: Neurocomputing 2016. arXiv admin note: text overlap with arXiv:0805.3415 by other authors

    ACM Class: I.2

  10. arXiv:1504.06952  [pdf, other]

    cs.LG

    Random Forest for the Contextual Bandit Problem - extended version

    Authors: Raphaël Féraud, Robin Allesiardo, Tanguy Urvoy, Fabrice Clérot

    Abstract: To address the contextual bandit problem, we propose an online random forest algorithm. The analysis of the proposed algorithm is based on the sample complexity needed to find the optimal decision stump. Then, the decision stumps are assembled in a random collection of decision trees, Bandit Forest. We show that the proposed algorithm is optimal up to logarithmic factors. The dependence of the sam…

    Submitted 15 September, 2016; v1 submitted 27 April, 2015; originally announced April 2015.

  11. arXiv:1409.8191  [pdf, other]

    cs.NE cs.LG

    A Neural Networks Committee for the Contextual Bandit Problem

    Authors: Robin Allesiardo, Raphael Feraud, Djallel Bouneffouf

    Abstract: This paper presents a new contextual bandit algorithm, NeuralBandit, which does not need any hypothesis on the stationarity of contexts and rewards. Several neural networks are trained to model the value of rewards given the context. Two variants, based on a multi-expert approach, are proposed to choose online the parameters of multi-layer perceptrons. The proposed algorithms are successfully tested o…

    Submitted 29 September, 2014; originally announced September 2014.

    Comments: 21st International Conference on Neural Information Processing

    ACM Class: I.2