Thesis etd-11102014-154250
Thesis type
Master's thesis
Author
BELLI, ROBERTO
URN
etd-11102014-154250
Title
Optimizing MPI one-sided synchronization mechanisms on Cray's Cascade HPC systems
Department
COMPUTER SCIENCE
Degree programme
COMPUTER SCIENCE AND NETWORKING
Supervisors
supervisor Prof. Danelutto, Marco
tutor Prof. Hoefler, Torsten
co-examiner Prof. Coppola, Massimo
Keywords
- HPC
- MPI
- Networking
- one sided
- RMA
- Synchronization
Graduation session start date
05/12/2014
Availability
Full
Abstract
In this work we proposed Notified Access, a new communication model that targets RDMA networks.
Our focus was on optimizing producer-consumer computations by avoiding unnecessary synchronization between processes in point-to-point communication. We proposed a communication model in which a notification can be coupled with a single Remote Memory Access (RMA) operation. In our model, the target of an RMA operation is notified directly once the notified operation completes. By avoiding other synchronization primitives, this approach minimizes synchronization latency while using the full hardware offload typical of high-performance networks. To demonstrate that it has lower overhead than other point-to-point synchronization mechanisms, we implemented it in an open-source MPI-3 library.
We evaluated the performance of our implementation with a ping-pong benchmark, a computation/communication overlap benchmark, and three real-world applications: a pipeline stencil, a tree-based reduce, and a task-based Cholesky factorization.
Our analysis shows that Notified Access is a valuable primitive for any RMA system; furthermore, we show that the required hardware features are already available in multiple state-of-the-art high-performance networks.
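The producer-consumer pattern described in the abstract can be illustrated with a small thread-based simulation. This is only a toy sketch, not the thesis implementation and not an MPI API: the class `NotifiedWindow` and the methods `put_notify` and `wait_notification` are hypothetical names chosen for illustration, and Python threads stand in for the origin and target processes. It shows the one idea stated above, namely that the data transfer and the notification travel together, so the consumer needs no separate synchronization call.

```python
import queue
import threading


class NotifiedWindow:
    """Toy stand-in for an RMA window owned by the target (hypothetical)."""

    def __init__(self, size):
        self.memory = [0] * size
        self._notifications = queue.Queue()  # FIFO, thread-safe

    def put_notify(self, offset, data, tag):
        # Hypothetical operation: write the data, then post the
        # notification that is coupled with this single access.
        self.memory[offset:offset + len(data)] = data
        self._notifications.put((tag, offset, len(data)))

    def wait_notification(self):
        # The target blocks until one notified access has completed.
        return self._notifications.get()


def run_pipeline(chunks):
    """Send each chunk with one notified put; return what the consumer saw."""
    window = NotifiedWindow(size=sum(len(c) for c in chunks))
    consumed = []

    def producer():
        offset = 0
        for tag, chunk in enumerate(chunks):
            window.put_notify(offset, chunk, tag)
            offset += len(chunk)

    def consumer():
        for _ in chunks:
            tag, offset, length = window.wait_notification()
            # The data is already in place when the notification arrives.
            consumed.append((tag, window.memory[offset:offset + length]))

    threads = [threading.Thread(target=consumer),
               threading.Thread(target=producer)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return consumed
```

Calling `run_pipeline([[1, 2], [3], [4, 5, 6]])` hands three chunks from the producer to the consumer, one notification per chunk, with no barrier or lock visible to either side.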
File
File name | Size |
---|---|
modern_thesis.pdf | 2.89 Mb |