
arXiv:2011.08743

Curiosity Based Reinforcement Learning on Robot Manufacturing Cell

Authors: Mohammed Sharafath Abdul Hameed, Md Muzahid Khan, Andreas Schwung

Abstract: This paper introduces a novel combination of scheduling control on a flexible robot manufacturing cell with curiosity based reinforcement learning. Reinforcement learning has proved to be highly successful in solving tasks like robotics and scheduling. But this requires hand tuning of rewards in problem domains like robotics and scheduling even where the solution is not obvious. To this end, we apply curiosity based reinforcement learning, using intrinsic motivation as a form of reward, on a flexible robot manufacturing cell to alleviate this problem. Further, the learning agents are embedded into the transportation robots to enable a generalized learning solution that can be applied to a variety of environments. In the first approach, the curiosity based reinforcement learning is applied to a simple structured robot manufacturing cell. In the second approach, the same algorithm is applied to a graph structured robot manufacturing cell. Results from the experiments show that the agents are able to solve both environments, with the ability to transfer the curiosity module directly from one environment to another. We conclude that curiosity based learning on scheduling tasks provides a viable alternative to the reward shaped reinforcement learning traditionally used.

Submitted 17 November, 2020; originally announced November 2020.

Comments: 6 pages

arXiv:2009.03836

Graph neural networks-based Scheduler for Production planning problems using Reinforcement Learning

Authors: Mohammed Sharafath Abdul Hameed, Andreas Schwung

Abstract: Reinforcement learning (RL) is increasingly adopted in job shop scheduling problems (JSSP). But RL for JSSP is usually done using a vectorized representation of machine features as the state space. It has three major problems: (1) the relationship between the machine units and the job sequence is not fully captured, (2) exponential increase in the size of the state space with increasing machines/jobs, and (3) the generalization of the agent to unseen scenarios. We present a novel framework - GraSP-RL, GRAph neural network-based Scheduler for Production planning problems using Reinforcement Learning. It represents JSSP as a graph and trains the RL agent using features extracted using a graph neural network (GNN). While the graph is itself in the non-euclidean space, the features extracted using the GNNs provide a rich encoding of the current production state in the euclidean space, which is then used by the RL agent to select the next job. Further, we cast the scheduling problem as a decentralized optimization problem in which the learning agent is assigned to all the production units and the agent learns asynchronously from the data collected on all the production units. GraSP-RL is then applied to a complex injection molding production environment with 30 jobs and 4 machines. The task is to minimize the makespan of the production plan. The schedule planned by GraSP-RL is then compared and analyzed with a priority dispatch rule algorithm like first-in-first-out (FIFO) and metaheuristics like tabu search (TS) and genetic algorithm (GA). The proposed GraSP-RL outperforms FIFO, TS, and GA for the trained task of planning 30 jobs in JSSP. We further test the generalization capability of the trained agent on two different problem classes: Open shop system (OSS) and Reactive JSSP (RJSSP), where our method produces results better than FIFO and comparable to TS and GA.

Submitted 16 May, 2023; v1 submitted 8 September, 2020; originally announced September 2020.

Comments: 31 pages, pre-print

arXiv:2005.12108

Gradient Monitored Reinforcement Learning

Authors: Mohammed Sharafath Abdul Hameed, Gavneet Singh Chadha, Andreas Schwung, Steven X. Ding

Abstract: This paper presents a novel neural network training approach for faster convergence and better generalization abilities in deep reinforcement learning. Particularly, we focus on the enhancement of training and evaluation performance in reinforcement learning algorithms by systematically reducing the gradient's variance and thereby providing a more targeted learning process. The proposed method, which we term Gradient Monitoring (GM), is an approach to steer the learning in the weight parameters of a neural network based on the dynamic development and feedback from the training process itself. We propose different variants of the GM methodology which have been proven to increase the underlying performance of the model. One of the proposed variants, Momentum with Gradient Monitoring (M-WGM), allows for a continuous adjustment of the quantum of back-propagated gradients in the network based on certain learning parameters. We further enhance the method with the Adaptive Momentum with Gradient Monitoring (AM-WGM) method, which allows for automatic adjustment between focused learning of certain weights versus a more dispersed learning depending on the feedback from the rewards collected. As a by-product, it also allows for automatic derivation of the required deep network sizes during training as the algorithm automatically freezes trained weights. The approach is applied to two discrete (Multi-Robot Co-ordination problem and Atari games) and one continuous control task (MuJoCo) using Advantage Actor-Critic (A2C) and Proximal Policy Optimization (PPO) respectively. The results obtained particularly underline the applicability and performance improvements of the methods in terms of generalization capability.

Submitted 25 May, 2020; originally announced May 2020.

Comments: 14 pages, 15 images

arXiv:1402.3781

A Framework for Developing Real-Time OLAP algorithm using Multi-core processing and GPU: Heterogeneous Computing

Authors: H I Alzeini, Sh A Hameed, M H Habaebi

Abstract: The overwhelmingly increasing amount of stored data has spurred researchers to seek different methods of taking optimal advantage of it, most of which have faced a response time problem as a result of the enormous size of the data. Most solutions have suggested materialization as the favoured approach. However, such a solution cannot attain Real-Time answers. In this paper we propose a framework illustrating the barriers and suggested solutions on the way to achieving Real-Time OLAP answers, which are widely used in decision support systems and data warehouses.

Submitted 16 February, 2014; originally announced February 2014.

arXiv:0908.0216

Novel Framework for Hidden Data in the Image Page within Executable File Using Computation between Advanced Encryption Standard and Distortion Techniques

Authors: A. W. Naji, Shihab A. Hameed, B. B. Zaidan, Wajdi F. Al-Khateeb, Othman O. Khalifa, A. A. Zaidan, Teddy S. Gunawan

Abstract: The rapid development of multimedia and the internet allows for wide distribution of digital media data. It has become much easier to edit, modify and duplicate digital information. In addition, a digital document is also easy to copy and distribute, and therefore it may face many threats. It has become necessary to find appropriate protection due to the significance, accuracy and sensitivity of the information. Furthermore, there is no formal method to be followed to discover hidden data. In this paper, a new information hiding framework is presented. The aim of the proposed framework is to implement a computation between the Advanced Encryption Standard (AES) and a distortion technique (DT) which embeds information in the image page within an executable file (EXE file), to find a secure solution for the cover file without changing the size of the cover file. The framework includes two main functions: the first is the hiding of the information in the image page of the EXE file, through the execution of four processes (specify the cover file, specify the information file, encryption of the information, and hiding the information), and the second is the extraction of the hidden information through three processes (specify the stego file, extract the information, and decryption of the information).

Submitted 3 August, 2009; originally announced August 2009.

Comments: 6 Pages IEEE Format, International Journal of Computer Science and Information Security, IJCSIS 2009, ISSN 1947 5500, Impact Factor 0.423

Journal ref: International Journal of Computer Science and Information Security, IJCSIS July 2009, Vol. 3, No. 1, USA

Showing 1–5 of 5 results for author: Hameed, S A