logo SBA

ETD

Digital archive of theses discussed at the University of Pisa

 

Thesis etd-09062021-154038


Thesis type
Tesi di laurea magistrale
Author
MANCINI, RICCARDO
URN
etd-09062021-154038
Thesis title
GPU-accelerated Join Order Optimization
Department
INGEGNERIA DELL'INFORMAZIONE
Course of study
COMPUTER ENGINEERING
Supervisors
relatore Ing. Bechini, Alessio
relatore Chandra, Bikash
relatore Venkatesh, Srinivas
Keywords
  • database
  • dynamic programming
  • gpu
  • graph
  • join order optimization
  • postgresql
  • query optimization
  • sql
Graduation session start date
24/09/2021
Availability
Withheld
Release date
24/09/2091
Summary
Nowadays, database queries can involve a large amount of tables, with the biggest queries involving more than 1000 tables, and even medium-sized queries involving around 50 tables. This increase in query size has pushed the query optimization component of the database engine to its limits, highlighting the need to investigate new approaches. In particular, join order optimization is known to be a NP-hard problem, and heuristic approaches are not always able to find good plans.
For these reasons, this thesis explores the possibility of accelerating this part of the query optimizer using a GPU. A new massively parallel algorithm, MPDP, has been developed, which is superior to other state-of-the-art algorithms in parallel performance, due to using join graph topology to efficiently reduce the search space. Furthermore, it has been implemented in the open-source DBMS PostgreSQL, and evaluated on real-world and synthetic datasets. Finally, it has been applied to a state-of-the-art heuristic, in order to be able to handle even 1000-table queries.
The experiments show that this new algorithm is able to efficiently optimize 25-table queries, compared to the 12 tables of the built-in Postgres algorithm, achieving a 80x speedup over state-of-the-art parallel algorithms on 24 CPU cores.
Its application to the heuristic enables the exploration of a much bigger search space, yielding a 10% to 50% improvement in terms of cost of the query plan over other state-of-the-art heuristics.
File