logo SBA

ETD

Archivio digitale delle tesi discusse presso l’Università di Pisa

Tesi etd-11072011-145033


Tipo di tesi
Tesi di dottorato di ricerca
Autore
CAPANNINI, GABRIELE
URN
etd-11072011-145033
Titolo
The Impact of Novel Computing Architectures on Large-Scale Distributed Web Information Retrieval Systems
Settore scientifico disciplinare
INF/01
Corso di studi
INFORMATICA
Relatori
tutor Baraglia, Ranieri
relatore Silvestri, Fabrizio
relatore Pedreschi, Dino
Parole chiave
  • GPGPU
  • GPU Sorting
  • GPU-based prefix sum
  • K-Model
  • Scalable Algorithms on GPU
Data inizio appello
22/12/2011
Consultabilità
Completa
Riassunto
Web search engines are the most popular mean of interaction with the Web. Realizing a search engine which scales even to such issues presents many challenges. Fast crawling technology is needed to gather the Web documents. Indexing has to process hundreds of gigabytes of data efficiently. Queries have to be handled quickly, at a rate of thousands per second. As a solution, within a datacenter, services are built up from clusters of common homogeneous PCs.

However, Information Retrieval (IR) has to face issues raised by the growing amount of Web data, as well as the number of new users. In response to these issues, cost-effective specialized hardware is available nowadays. In our opinion, this hardware is ideal for migrating distributed IR systems to computer clusters comprising heterogeneous processors in order to respond their need of computing power. Toward this end, we introduce K-model, a computational model to properly evaluate algorithms designed for such hardware.

We study the impact of K-model rules on algorithm design. To evaluate the benefits of using K-model in evaluating algorithms, we compare the complexity of a solution built using our properly designed techniques, and the existing ones. Although in theory competitors are more efficient than us, empirically, K-model is able to prove because our solutions have been shown to be faster than the state-of-the-art implementations.
File