Thesis etd-11102014-154250
Thesis type
Master's thesis
Author
BELLI, ROBERTO
URN
etd-11102014-154250
Thesis title
Optimizing MPI one-sided synchronization mechanisms on Cray's Cascade HPC systems
Department
INFORMATICA
Course of study
INFORMATICA E NETWORKING
Supervisors
advisor Prof. Danelutto, Marco
tutor Prof. Hoefler, Torsten
co-examiner Prof. Coppola, Massimo
Keywords
- HPC
- MPI
- Networking
- one sided
- RMA
- Synchronization
Graduation session start date
05/12/2014
Availability
Full
Summary
In this work we proposed Notified Access, a new communication model that targets RDMA networks.
Our focus was on optimizing producer-consumer computations by avoiding unnecessary synchronization between processes in point-to-point communication. We proposed a communication model in which a notification can be coupled with a single Remote Memory Access (RMA) operation. In our model, the target of an RMA operation is notified directly once the notified operation completes. Because it needs no other synchronization primitives, this approach minimizes synchronization latency while retaining the full hardware offload typical of high-performance networks. To demonstrate that its overheads are lower than those of other point-to-point synchronization mechanisms, we implemented it in an open-source MPI-3 library.
We evaluated the performance of our implementation with a ping-pong benchmark, a computation/communication overlap benchmark, and three real-world applications: a pipeline stencil, a tree-based reduce, and a task-based Cholesky factorization.
Our analysis shows that Notified Access is a valuable primitive for any RMA system. Furthermore, we show that the required hardware features are already available in multiple state-of-the-art high-performance networks.
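The producer-consumer pattern described in this summary can be illustrated with a small toy model. The sketch below is not the thesis's MPI interface or its implementation: it simulates a single "window" of memory shared by two threads, and the names `NotifiedWindow`, `put_notify`, and `wait_notification` are hypothetical, chosen only for illustration. It shows the key idea that each write carries its own notification, so the consumer waits for exactly the data it needs and no separate synchronization call is involved.

```python
import queue
import threading


class NotifiedWindow:
    """Toy stand-in for an RMA window with a notification queue at the target.

    Hypothetical illustration only; this is not an MPI API.
    """

    def __init__(self, size):
        self.memory = [0] * size
        self._notifications = queue.Queue()

    def put_notify(self, offset, data, tag):
        # Write the data first, then post the notification, so a consumer
        # that sees the notification is guaranteed to see the data.
        self.memory[offset:offset + len(data)] = data
        self._notifications.put((tag, offset, len(data)))

    def wait_notification(self, timeout=5.0):
        # Block until the next notification arrives at the target.
        return self._notifications.get(timeout=timeout)


def run_producer_consumer(chunks):
    """Send each chunk with one notified write and collect what the consumer reads."""
    window = NotifiedWindow(size=sum(len(c) for c in chunks))
    received = []

    def producer():
        offset = 0
        for tag, chunk in enumerate(chunks):
            window.put_notify(offset, chunk, tag)
            offset += len(chunk)

    def consumer():
        for _ in chunks:
            tag, offset, length = window.wait_notification()
            received.append((tag, window.memory[offset:offset + length]))

    threads = [threading.Thread(target=consumer), threading.Thread(target=producer)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return received
```

Calling `run_producer_consumer([[1, 2], [3], [4, 5, 6]])` lets the consumer read each chunk as soon as its notification arrives. In a real RDMA setting the write and the notification would be performed by the network hardware rather than by a thread in shared memory.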
File
File name | Size |
---|---|
modern_thesis.pdf | 2.89 MB |