Showing 1–5 of 5 results for author: Khadilkar, H

  1. arXiv:2402.15478  [pdf, other]

    cs.LG stat.ML

    Transformers are Expressive, But Are They Expressive Enough for Regression?

    Authors: Swaroop Nath, Harshad Khadilkar, Pushpak Bhattacharyya

    Abstract: Transformers have become pivotal in Natural Language Processing, demonstrating remarkable success in applications like Machine Translation and Summarization. Given their widespread adoption, several works have attempted to analyze the expressivity of Transformers. Expressivity of a neural network is the class of functions it can approximate. A neural network is fully expressive if it can act as a…

    Submitted 7 June, 2024; v1 submitted 23 February, 2024; originally announced February 2024.

    Comments: 18 pages, 10 figures, 3 tables

  2. arXiv:2006.04037  [pdf, other]

    cs.LG cs.AI cs.MA stat.ML

    Reinforcement Learning for Multi-Product Multi-Node Inventory Management in Supply Chains

    Authors: Nazneen N Sultana, Hardik Meisheri, Vinita Baniwal, Somjit Nath, Balaraman Ravindran, Harshad Khadilkar

    Abstract: This paper describes the application of reinforcement learning (RL) to multi-product inventory management in supply chains. The problem description and solution are both adapted from a real-world business solution. The novelty of this problem with respect to supply chain literature is (i) we consider concurrent inventory management of a large number (50 to 1000) of products with shared capacity, (…

    Submitted 7 June, 2020; originally announced June 2020.

  3. arXiv:2004.09846  [pdf, other]

    cs.LG cs.AI stat.ML

    SIBRE: Self Improvement Based REwards for Adaptive Feedback in Reinforcement Learning

    Authors: Somjit Nath, Richa Verma, Abhik Ray, Harshad Khadilkar

    Abstract: We propose a generic reward shaping approach for improving the rate of convergence in reinforcement learning (RL), called Self Improvement Based REwards, or SIBRE. The approach is designed for use in conjunction with any existing RL algorithm, and consists of rewarding improvement over the agent's own past performance. We prove that SIBRE converges in expectation under the same conditions as the o…

    Submitted 21 December, 2020; v1 submitted 21 April, 2020; originally announced April 2020.

    Comments: 7 pages, 10 figures

  4. arXiv:2003.14093  [pdf, other]

    physics.soc-ph cs.AI cs.LG q-bio.PE stat.ML

    Optimising Lockdown Policies for Epidemic Control using Reinforcement Learning

    Authors: Harshad Khadilkar, Tanuja Ganu, Deva P Seetharam

    Abstract: In the context of the ongoing Covid-19 pandemic, several reports and studies have attempted to model and predict the spread of the disease. There is also intense debate about policies for limiting the damage, both to health and to the economy. On the one hand, the health and safety of the population is the principal consideration for most countries. On the other hand, we cannot ignore the potentia…

    Submitted 1 May, 2020; v1 submitted 31 March, 2020; originally announced March 2020.

  5. arXiv:1911.04947  [pdf, other]

    cs.LG stat.ML

    Accelerating Training in Pommerman with Imitation and Reinforcement Learning

    Authors: Hardik Meisheri, Omkar Shelke, Richa Verma, Harshad Khadilkar

    Abstract: The Pommerman simulation was recently developed to mimic the classic Japanese game Bomberman, and focuses on competitive gameplay in a multi-agent setting. We focus on the 2$\times$2 team version of Pommerman, developed for a competition at NeurIPS 2018. Our methodology involves training an agent initially through imitation learning on a noisy expert policy, followed by a proximal-policy optimizat…

    Submitted 13 November, 2019; v1 submitted 12 November, 2019; originally announced November 2019.

    Comments: Presented at Deep Reinforcement Learning workshop, NeurIPS-2019