Making a tedious search a breeze: parallel query execution in multi-core systems

When you request information from a huge database, you might want to grab a cup of coffee and sit back, because your request may take a while. How can we optimize searches for information, specifically in a multi-core processing unit? The answer may lie in parallel queries, which run simultaneously on different processors.

Publication date
15 Feb 2017

By using a simultaneous, parallel search, the information request can be completed quickly and efficiently. However, making query execution faster through parallelization on a multi-core CPU system is challenging. Several bottlenecks can occur, such as scheduling problems and data distribution problems.
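The idea of splitting one query across several workers can be sketched as follows. This is a minimal illustration only, not the technique from the thesis: the function names, the example data and the choice of four workers are all assumptions made for this sketch. Python threads are used for brevity and do not give true multi-core speedup for this kind of work; the point is the structure of partitioning the data, scanning each part independently and combining the partial results.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(values, n_chunks):
    """Split a list into at most n_chunks contiguous slices of near-equal size."""
    size = max(1, -(-len(values) // n_chunks))  # ceiling division
    return [values[i:i + size] for i in range(0, len(values), size)]


def count_matches(chunk, threshold):
    """Scan one partition and count the values above the threshold."""
    return sum(1 for v in chunk if v > threshold)


def parallel_count(values, threshold, n_workers=4):
    """Hypothetical 'query': count values above threshold, one task per partition."""
    chunks = split_into_chunks(values, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partial_counts = list(pool.map(lambda c: count_matches(c, threshold), chunks))
    # Combine the partial results into the final answer.
    return sum(partial_counts)


prices = list(range(1000))  # made-up column of values
total = parallel_count(prices, 499)
```

How the data is divided and how the tasks are assigned to cores are exactly the places where the data distribution and scheduling bottlenecks mentioned above would show up: uneven partitions leave some workers idle while others are still scanning.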

In his thesis, PhD student Mrunal Gawade of CWI explores the query parallelization problem in the context of multi-core CPUs, in so-called ‘column-store database architectures’. In column-based database systems, data tables are stored as columns rather than rows, which allows the database to access the data it needs to answer a query more efficiently. These systems are the preferred option when a database consists of millions of tables, which is often the case for businesses or for scientific purposes. Yet, even though column-store database architectures allow for faster query execution than their row-store equivalents, there is still some way to go. Gawade provides another step forward.
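The difference between the two layouts can be illustrated with a small sketch. The table, its column names and the helper functions are made up for this example and do not reflect how any particular database system stores its data.

```python
# Row layout: each record keeps all of its fields together.
rows = [
    {"id": 1, "city": "Amsterdam", "amount": 10.0},
    {"id": 2, "city": "Utrecht", "amount": 20.0},
    {"id": 3, "city": "Delft", "amount": 30.0},
]

# Column layout: each field is stored as its own contiguous list.
columns = {
    "id": [1, 2, 3],
    "city": ["Amsterdam", "Utrecht", "Delft"],
    "amount": [10.0, 20.0, 30.0],
}


def to_columns(records):
    """Convert a list of row records into a dict of column lists."""
    return {name: [r[name] for r in records] for name in records[0]}


def total_amount_row_store(records):
    """Has to visit every full record, although only 'amount' is needed."""
    return sum(r["amount"] for r in records)


def total_amount_column_store(cols):
    """Reads only the one column the query asks for."""
    return sum(cols["amount"])
```

Both functions return the same total, but the column version never touches the `id` or `city` data. For a query that needs only a few fields of a wide table, that is the source of the efficiency gain described above.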

Gawade proposes new visualization tools, which can be used to understand the bottleneck problems that occur during query parallelization. He also introduces a new feedback-based query parallelization technique called ‘adaptive parallelization’. Compared to existing techniques, adaptive parallelization proves to be more efficient due to better multi-core utilization. Finally, Gawade proposes a new distributed database architecture that can help make database systems more suitable for non-uniform memory access (NUMA), the preferred memory access method for multi-core and many-core CPUs.

Gawade’s work sheds light on the behavior of the open-source database system MonetDB, in the context of different multi-core and many-core CPU architectures. The extensive experimentation and new techniques proposed in Gawade’s thesis can act as a reference for future explorations of column-store architectures.

Defense: 15/02/2017 - 11:00 - 13:00 hrs
Location: Aula, Singel 411 Amsterdam
Promotor: Prof.dr. Martin Kersten (CWI Database Architecture/UvA)
Thesis: 'Multi-core parallelism in a column-store'