Digital archive of theses discussed at the University of Pisa


Thesis etd-05232016-095351

Thesis type
PhD thesis
Thesis title
Studies on nVidia GPUs in parallel computing for Lattice QCD and Computational Fluid Dynamics applications
Academic discipline
Course of study
tutor Prof. Bernardeschi, Cinzia
tutor Dr. Arezzini, Silvia
  • Computational Fluid Dynamics
  • GPU
  • Lattice Quantum Chromodynamics
  • Parallel Computing
Graduation session start date
Release date
Over the last 20 years, the computing revolution has brought many social benefits. The energy and environmental footprint of computing has grown, and as a consequence energy efficiency is becoming increasingly important. The evolution toward always-on connectivity adds further demand for efficient computing performance. The result is a strong market pull for technologies that improve processor performance while reducing energy use.
The improvements in energy performance have largely come as a side effect of Moore's law: the number of transistors on a chip doubles about every two years, thanks to ever-smaller circuitry. Placing more transistors on a single chip, with less physical distance between them, improves both performance and energy efficiency. In the last few years, however, the energy-related benefits of Moore's law have been slowing down, threatening future advances in computing. The cause is that the miniaturization of transistors is reaching a physical limit.
The industry's answer to this problem, for now, is new processor architectures and more power-efficient technologies.
For decades, the Central Processing Unit (CPU) of a computer has been the component designated to run general programming tasks, excelling at executing instructions serially and using a variety of complex techniques and algorithms to improve speed.
Graphics Processing Units (GPUs) are specialized accelerators originally designed for painting millions of pixels simultaneously across a screen, which they do by performing parallel calculations on a simpler architecture. In recent years, developments in the video game market compelled GPU manufacturers to increase the floating-point performance of their products, which now far exceeds that of standard CPUs. The architecture evolved toward programmable many-core chips designed to process massive amounts of data in parallel. These developments suggested the possibility of using GPUs in the field of High-Performance Computing (HPC) as low-cost substitutes for more traditional CPU-based architectures. Nowadays this possibility is being fully exploited, and GPUs represent an ongoing breakthrough for many computationally demanding scientific fields, providing substantial computing resources at relatively low cost, also in terms of power consumption (watts/flops). Due to their many-core architectures, with fast access to the on-board memory, GPUs are ideally suited for numerical tasks that allow for data parallelism, i.e., for Single Instruction Multiple Data (SIMD) parallelization.
This thesis presents parallel computing in Lattice Quantum Chromodynamics (Lattice QCD or LQCD) and in Computational Fluid Dynamics (CFD) on multi-GPU systems, highlighting the software approach in each case and trying to understand how to build and exploit next-generation clusters for scientific computing.
Chapter 1 lays out the fundamentals of parallel computing and describes the hardware architecture of the main multiple-processor systems.
Chapter 2 presents the use of GPUs for general-purpose parallel computing, comparing them to traditional CPUs. It also gives a brief history of GPU devices, highlighting the evolution from those exclusively dedicated to graphics applications to the DirectX 8 generation, the first that can be fully dedicated to general-purpose applications.
Chapter 3 describes the Compute Unified Device Architecture (CUDA), a parallel computing platform and application programming interface (API) model created by nVidia that allows software developers to use nVidia GPUs (the so-called CUDA-enabled ones) for general-purpose processing.
Chapter 4 describes the GPU approach to Lattice QCD. Two different ways of exploiting GPU computing are presented, through CUDA and through OpenACC, in which an existing code has been adapted to take full advantage of accelerator devices such as GPUs. A performance comparison is shown, and the first studies toward implementing a complete Lattice QCD simulation on a multi-GPU system are presented. A different application field is analyzed in Chapter 5, where a multi-GPU system is tested and optimized for CFD purposes using proprietary software such as ANSYS Fluent.