
Showing 1–5 of 5 results for author: Zambre, R

Searching in archive cs.
  1. arXiv:2206.14285

    cs.DC

    Lessons Learned on MPI+Threads Communication

    Authors: Rohit Zambre, Aparna Chandramowlishwaran

    Abstract: Hybrid MPI+threads programming is gaining prominence, but, in practice, applications perform slower with it compared to the MPI everywhere model. The most critical challenge to the parallel efficiency of MPI+threads applications is slow MPI_THREAD_MULTIPLE performance. MPI libraries have recently made significant strides on this front, but to exploit their capabilities, users must expose the commu…

    Submitted 28 June, 2022; originally announced June 2022.

    Comments: In Proceedings of the International Conference for High-Performance Computing, Networking, Storage and Analysis (SC), Dallas, TX, USA, November 2022

    ACM Class: C.2.4

  2. How I Learned to Stop Worrying About User-Visible Endpoints and Love MPI

    Authors: Rohit Zambre, Aparna Chandramowlishwaran, Pavan Balaji

    Abstract: MPI+threads is gaining prominence as an alternative to the traditional MPI everywhere model in order to better handle the disproportionate increase in the number of cores compared with other on-node resources. However, the communication performance of MPI+threads can be 100x slower than that of MPI everywhere. Both MPI users and developers are to blame for this slowdown. Typically, MPI users do no…

    Submitted 1 May, 2020; originally announced May 2020.

    Comments: In Proceedings of the 34th ACM International Conference on Supercomputing (ICS), Barcelona, Spain, June 2020

    ACM Class: C.2.4

  3. arXiv:2002.03850

    cs.DC cs.LG stat.ML

    Parallel Performance-Energy Predictive Modeling of Browsers: Case Study of Servo

    Authors: Rohit Zambre, Lars Bergstrom, Laleh Aghababaie Beni, Aparna Chandramowlishwaran

    Abstract: Mozilla Research is developing Servo, a parallel web browser engine, to exploit the benefits of parallelism and concurrency in the web rendering pipeline. Parallelization results in improved performance for pinterest.com but not for google.com. This is because the workload of a browser is dependent on the web page it is rendering. In many cases, the overhead of creating, deleting, and coordinating…

    Submitted 6 February, 2020; originally announced February 2020.

    Comments: In Proceedings of the 23rd IEEE International Conference on High Performance Computing, Data, and Analytics (HiPC), Hyderabad, India, December 2016

  4. Breaking Band: A Breakdown of High-performance Communication

    Authors: Rohit Zambre, Megan Grodowitz, Aparna Chandramowlishwaran, Pavel Shamis

    Abstract: The critical path of internode communication on large-scale systems is composed of multiple components. When a supercomputing application initiates the transfer of a message using a high-level communication routine such as an MPI_Send, the payload of the message traverses multiple software stacks, the I/O subsystem on both the host and target nodes, and network components such as the switch. In th…

    Submitted 6 February, 2020; originally announced February 2020.

    Comments: In Proceedings of the 48th ACM International Conference on Parallel Processing (ICPP), Kyoto, Japan, August 2019

    ACM Class: C.2.4

    Journal ref: In Proceedings of the 48th International Conference on Parallel Processing, pp. 1-10. 2019

  5. Scalable Communication Endpoints for MPI+Threads Applications

    Authors: Rohit Zambre, Aparna Chandramowlishwaran, Pavan Balaji

    Abstract: Hybrid MPI+threads programming is gaining prominence as an alternative to the traditional "MPI everywhere" model to better handle the disproportionate increase in the number of cores compared with other on-node resources. Current implementations of these two models represent the two extreme cases of communication resource sharing in modern MPI implementations. In the MPI-everywhere model, each MP…

    Submitted 6 February, 2020; originally announced February 2020.

    Comments: In Proceedings of the 24th IEEE International Conference on Parallel and Distributed Systems (ICPADS), Sentosa, Singapore, December 2018. Best Poster Award

    Journal ref: In 2018 IEEE 24th International Conference on Parallel and Distributed Systems (ICPADS), pp. 803-812. IEEE, 2018