Showing 1–3 of 3 results for author: Nayyar, N

  1. arXiv:2402.01018  [pdf, other]

    cs.CL cs.AI

    HR-MultiWOZ: A Task Oriented Dialogue (TOD) Dataset for HR LLM Agent

    Authors: Weijie Xu, Zicheng Huang, Wenxiang Hu, Xi Fang, Rajesh Kumar Cherukuri, Naumaan Nayyar, Lorenzo Malandri, Srinivasan H. Sengamedu

    Abstract: Recent advancements in Large Language Models (LLMs) have been reshaping Natural Language Processing (NLP) task in several domains. Their use in the field of Human Resources (HR) has still room for expansions and could be beneficial for several time consuming tasks. Examples such as time-off submissions, medical claims filing, and access requests are noteworthy, but they are by no means the sole in…

    Submitted 1 February, 2024; originally announced February 2024.

    Comments: 13 pages, 9 figures

    MSC Class: 68T50 ACM Class: I.2.7

    Journal ref: EACL 2024

  2. arXiv:1505.00553  [pdf, other]

    stat.ML cs.LG

    On Regret-Optimal Learning in Decentralized Multi-player Multi-armed Bandits

    Authors: Naumaan Nayyar, Dileep Kalathil, Rahul Jain

    Abstract: We consider the problem of learning in single-player and multiplayer multiarmed bandit models. Bandit problems are classes of online learning problems that capture exploration versus exploitation tradeoffs. In a multiarmed bandit model, players can pick among many arms, and each play of an arm generates an i.i.d. reward from an unknown distribution. The objective is to design a policy that maximiz…

    Submitted 1 December, 2016; v1 submitted 4 May, 2015; originally announced May 2015.

  3. arXiv:1206.3582  [pdf, other]

    math.OC cs.LG eess.SY

    Decentralized Learning for Multi-player Multi-armed Bandits

    Authors: Dileep Kalathil, Naumaan Nayyar, Rahul Jain

    Abstract: We consider the problem of distributed online learning with multiple players in multi-armed bandits (MAB) models. Each player can pick among multiple arms. When a player picks an arm, it gets a reward. We consider both i.i.d. reward model and Markovian reward model. In the i.i.d. model each arm is modelled as an i.i.d. process with an unknown distribution with an unknown mean. In the Markovian mod…

    Submitted 14 June, 2012; originally announced June 2012.

    Comments: 33 pages, 3 figures. Submitted to IEEE Transactions on Information Theory