Showing 1–5 of 5 results for author: Shirzad, S

  1. Towards a Scalable and Distributed Infrastructure for Deep Learning Applications

    Authors: Bita Hasheminezhad, Shahrzad Shirzad, Nanmiao Wu, Patrick Diehl, Hannes Schulz, Hartmut Kaiser

    Abstract: Although recent scaling up approaches to training deep neural networks have proven to be effective, the computational intensity of large and complex models, as well as the availability of large-scale datasets, require deep learning frameworks to utilize scaling out techniques. Parallelization approaches and distribution requirements are not considered in the preliminary designs of most available d…

    Submitted 19 April, 2021; v1 submitted 6 October, 2020; originally announced October 2020.

  2. arXiv:2002.07970 [pdf, other]

    cs.DC cs.PL

    Supporting OpenMP 5.0 Tasks in hpxMP -- A study of an OpenMP implementation within Task Based Runtime Systems

    Authors: Tianyi Zhang, Shahrzad Shirzad, Bibek Wagle, Adrian S. Lemoine, Patrick Diehl, Hartmut Kaiser

    Abstract: OpenMP has been the de facto standard for single node parallelism for more than a decade. Recently, asynchronous many-task runtime (AMT) systems have increased in popularity as a new programming paradigm for high performance computing applications. One of the major challenges of this new paradigm is the incompatibility of the OpenMP thread model and other AMTs. Highly optimized OpenMP-based librar…

    Submitted 18 February, 2020; originally announced February 2020.

  3. Scheduling optimization of parallel linear algebra algorithms using Supervised Learning

    Authors: G. Laberge, S. Shirzad, P. Diehl, H. Kaiser, S. Prudhomme, A. Lemoine

    Abstract: Linear algebra algorithms are used widely in a variety of domains, e.g., machine learning, numerical physics, and video game graphics. For all these applications, loop-level parallelism is required to achieve high performance. However, finding the optimal way to schedule the workload between threads is a non-trivial problem because it depends on the structure of the algorithm being parallelized and…

    Submitted 25 September, 2019; v1 submitted 9 September, 2019; originally announced September 2019.

    Comments: Accepted at HPCML19

  4. An Introduction to hpxMP: A Modern OpenMP Implementation Leveraging HPX, An Asynchronous Many-Task System

    Authors: Tianyi Zhang, Shahrzad Shirzad, Patrick Diehl, R. Tohid, Weile Wei, Hartmut Kaiser

    Abstract: Asynchronous Many-task (AMT) runtime systems have gained increasing acceptance in the HPC community due to the performance improvements offered by fine-grained tasking runtime systems. At the same time, C++ standardization efforts are focused on creating higher-level interfaces able to replace OpenMP or OpenACC in modern C++ codes. These higher level functions have been adopted in standards confor…

    Submitted 5 July, 2019; v1 submitted 7 March, 2019; originally announced March 2019.

  5. Asynchronous Execution of Python Code on Task Based Runtime Systems

    Authors: R. Tohid, Bibek Wagle, Shahrzad Shirzad, Patrick Diehl, Adrian Serio, Alireza Kheirkhahan, Parsa Amini, Katy Williams, Kate Isaacs, Kevin Huck, Steven Brandt, Hartmut Kaiser

    Abstract: Despite advancements in the areas of parallel and distributed computing, the complexity of programming on High Performance Computing (HPC) resources has deterred many domain experts, especially in the areas of machine learning and artificial intelligence (AI), from utilizing performance benefits of such systems. Researchers and scientists favor high-productivity languages to avoid the inconvenienc…

    Submitted 22 October, 2018; v1 submitted 17 October, 2018; originally announced October 2018.