
Showing 1–7 of 7 results for author: George, T K

  1. arXiv:2306.00295  [pdf, other]

    cs.AI cs.LG

    EMOTE: An Explainable architecture for Modelling the Other Through Empathy

    Authors: Manisha Senadeera, Thommen Karimpanal George, Sunil Gupta, Stephan Jacobs, Santu Rana

    Abstract: We can usually assume others have goals analogous to our own. This assumption can also, at times, be applied to multi-agent games - e.g. Agent 1's attraction to green pellets is analogous to Agent 2's attraction to red pellets. This "analogy" assumption is tied closely to the cognitive process known as empathy. Inspired by empathy, we design a simple and explainable architecture to model another a…

    Submitted 31 May, 2023; originally announced June 2023.

  2. arXiv:2204.09315  [pdf, ps, other]

    cs.LG

    Learning to Constrain Policy Optimization with Virtual Trust Region

    Authors: Hung Le, Thommen Karimpanal George, Majid Abdolshah, Dung Nguyen, Kien Do, Sunil Gupta, Svetha Venkatesh

    Abstract: We introduce a constrained optimization method for policy gradient reinforcement learning, which uses a virtual trust region to regulate each policy update. In addition to using the proximity of one single old policy as the normal trust region, we propose forming a second trust region through another virtual policy representing a wide range of past policies. We then enforce the new policy to stay…

    Submitted 15 September, 2022; v1 submitted 20 April, 2022; originally announced April 2022.

    Comments: Preprint, 22 pages

  3. arXiv:2112.01853  [pdf, other]

    cs.LG cs.MA

    Episodic Policy Gradient Training

    Authors: Hung Le, Majid Abdolshah, Thommen K. George, Kien Do, Dung Nguyen, Svetha Venkatesh

    Abstract: We introduce a novel training procedure for policy gradient methods wherein episodic memory is used to optimize the hyperparameters of reinforcement learning algorithms on-the-fly. Unlike other hyperparameter searches, we formulate hyperparameter scheduling as a standard Markov Decision Process and use episodic memory to store the outcome of used hyperparameters and their training contexts. At any…

    Submitted 3 December, 2021; originally announced December 2021.

    Comments: 19 pages

  4. arXiv:2111.02104  [pdf, ps, other]

    cs.LG cs.AI

    Model-Based Episodic Memory Induces Dynamic Hybrid Controls

    Authors: Hung Le, Thommen Karimpanal George, Majid Abdolshah, Truyen Tran, Svetha Venkatesh

    Abstract: Episodic control enables sample efficiency in reinforcement learning by recalling past experiences from an episodic memory. We propose a new model-based episodic memory of trajectories addressing current limitations of episodic control. Our memory estimates trajectory values, guiding the agent towards good policies. Built upon the memory, we construct a complementary learning model via a dynamic h…

    Submitted 6 November, 2021; v1 submitted 3 November, 2021; originally announced November 2021.

    Comments: 26 pages

  5. arXiv:2108.08960  [pdf, other]

    cs.LG

    Plug and Play, Model-Based Reinforcement Learning

    Authors: Majid Abdolshah, Hung Le, Thommen Karimpanal George, Sunil Gupta, Santu Rana, Svetha Venkatesh

    Abstract: Sample-efficient generalisation of reinforcement learning approaches has always been a challenge, especially for complex scenes with many components. In this work, we introduce Plug and Play Markov Decision Processes, an object-based representation that allows zero-shot integration of new objects from known object classes. This is achieved by representing the global transition dynamics as a unio…

    Submitted 19 August, 2021; originally announced August 2021.

  6. arXiv:2107.08426  [pdf, other]

    cs.LG

    A New Representation of Successor Features for Transfer across Dissimilar Environments

    Authors: Majid Abdolshah, Hung Le, Thommen Karimpanal George, Sunil Gupta, Santu Rana, Svetha Venkatesh

    Abstract: Transfer in reinforcement learning is usually achieved through generalisation across tasks. Whilst many studies have investigated transferring knowledge when the reward function changes, they have assumed that the dynamics of the environments remain consistent. Many real-world RL problems require transfer among environments with different dynamics. To address this problem, we propose an approach b…

    Submitted 18 July, 2021; originally announced July 2021.

  7. arXiv:1705.08012  [pdf]

    cs.HC

    Sensing discomfort of standing passengers in public rail transportation systems using a smart phone

    Authors: Thommen Karimpanal George, Harit Maganlal Gadhia, Ruben S/O Sukumar, John-John Cabibihan

    Abstract: This paper aims to investigate the effect of acceleration on the discomfort of standing passengers. The acceleration levels from different public rail transport lines such as the mass rapid transits (MRTs) and light rail transits (LRTs) of Singapore, as well as the associated qualitative data indicating the discomfort of standing passengers were collected and analyzed. Based on a logistic regressi…

    Submitted 22 May, 2017; originally announced May 2017.

    Comments: Document prepared for IEEE International Conference on Control and Automation (ICCA), 2013, 5 pages, 8 figures

    Journal ref: 10th IEEE International Conference on Control & Automation (IEEE ICCA 2013), HangZhou China, June 12-14, 2013, pp. 1509-1513