Showing 1–1 of 1 results for author: Tronge, J

  1. MARS: Malleable Actor-Critic Reinforcement Learning Scheduler

    Authors: Betis Baheri, Jacob Tronge, Bo Fang, Ang Li, Vipin Chaudhary, Qiang Guan

    Abstract: In this paper, we introduce MARS, a new scheduling system for HPC-cloud infrastructures based on a cost-aware, flexible reinforcement learning approach, which serves as an intermediate layer for next-generation HPC-cloud resource managers. MARS ensembles the pre-trained models from heuristic workloads and decides on the most cost-effective strategy for optimization. A whole workflow application wou…

    Submitted 23 December, 2022; v1 submitted 4 May, 2020; originally announced May 2020.

    Comments: 10 pages, HPC, Cloud System, Scheduling, Workflow Management, Reinforcement Learning, Deep Learning

    Journal ref: 2022 IEEE International Performance Computing and Communications Conference (IPCCC) 217-226