Leader of the group Database Architectures: Stefan Manegold.

We are a top database systems research group, active in the broad area of data management systems and infrastructure for supporting data science. Our group has a strong international reputation in academia and industry for pioneering column store technology, fast compression methods, vectorized query execution, on-line query-driven indexing (cracking), adaptive caching and integration of statistical languages and analysis in database systems.

 We develop, distribute and maintain the MonetDB open-source system, and we have spawned multiple spin-off companies, including Data Distilleries, VectorWise and MonetDB Solutions. We also operate a self-built cluster, SciLens, that – unlike many other computer clusters – is bandwidth-optimized and thus better suited as a data-science infrastructure. We pride ourselves on revealing the real problems in our discipline and coming up with revolutionary solutions that are frequently ahead of their time.



PhD Students in the areas of Big Data Management and Analysis Architectures and Data Science Engineering Technologies

Three fully funded PhD positions are available to work under the direction of Prof. Stefan Manegold on big data management technology, with a particular focus on hardware-conscious data structures and algorithms in distributed and cloud environments, as well as the integration of data mining and machine learning into large-scale data management systems.



CWI Database Architecture researchers receive best paper runner-up award at SSDBM


Researchers of CWI's Database Architecture group have received a best paper runner-up award at the 29th International Conference on Scientific and Statistical Database Management (SSDBM 2017) in Chicago. The award recognizes the paper 'Multi Hypothesis CSV-parsing' by Till Döhmen, Hannes Mühleisen and Peter Boncz.


Making a tedious search a breeze: parallel query execution in multi-core systems


When you request information from a huge database, you might want to grab a cup of coffee and sit back, because your request may take a while. How can we optimize searches for information, specifically in a multi-core processing unit? The answer may lie in parallel queries, which run simultaneously on different processors.



PhD defence Mrunal Gawade (DA)

February 15 Wednesday

Start: 2017-02-15 10:00:00+01:00 End: 2017-02-15 12:00:00+01:00

Aula, Spui 111 Amsterdam

Everyone is invited to attend Mrunal Gawade's public defence of his thesis 'Multi-core parallelism in a column-store'.

Promotor: Prof.dr. Martin Kersten (CWI)

PhD Defence Thibault Sellam (DA)

November 3 Thursday

Start: 2016-11-03 09:00:00+01:00 End: 2016-11-03 10:00:00+01:00

Agnietenkapel, Oudezijds Voorburgwal 231, Amsterdam

Everyone is welcome to attend Thibault Sellam's public defence of his thesis 'Automatic Assistants for Database Exploration'.

Promotor: Prof. M.L. Kersten (UvA & CWI)


Dutch Belgian Database Day 2015 (DBDBD 2015)

December 16 Wednesday

Start: 2015-12-16 08:00:00+01:00 End: 2015-12-16 16:00:00+01:00

CWI, Ground Floor, Room Z009 (Euler)

              Dutch Belgian Database Day 2015 (DBDBD 2015) [0]

                       on Wednesday, December 16, 2015

     at Centrum Wiskunde & Informatica (CWI), Amsterdam, The Netherlands


Registration by Wednesday, December 9, 2015, is required for all participants.

Please register via



The Dutch Belgian Database Day (DBDBD) is a yearly one-day workshop
organized by a Belgian or Dutch university, whose general topic is database
research.  DBDBD invites submissions (1 page abstract) on a broad range of
database and database-related topics, including but not limited to data
storage and management, theoretical database issues, database performance,
data mining, information retrieval, data semantics, querying, and ontologies.

At DBDBD, junior researchers from the Netherlands and Belgium can present
their recent results, and meet senior researchers in the field of databases.
It is an excellent opportunity to meet up with your Belgian/Dutch
colleagues, and to get informed about the (recent) database-related research
performed in Belgian/Dutch universities.  The workshop is also open to
non-Belgian/Dutch participants (presentations are in English).  The workshop
consists of oral presentations.  There are no printed proceedings.
Abstracts of talks will be published on the workshop's website.

This year, DBDBD is organized by the Database Architectures group [1] of
Centrum Wiskunde & Informatica (CWI) [2], the national research institute
for mathematics and computer science in the Netherlands, under the auspices
of SIKS [3], the Dutch research school for information and knowledge
systems.  DBDBD 2015 will be held at CWI in Amsterdam (Netherlands) on
Wednesday, December 16, 2015.

Topics of Interest

We welcome submissions on all topics related to database research, including
but not limited to:

* Data storage and management, streaming data, peer-to-peer and cloud;
* Theoretical database issues, data models, semantics and languages;
* Database performance and scalability;
* Data integration, data quality, data cleaning, ontologies;
* Data mining, clustering, summarization, entity & event extraction;
* Data security, privacy and personalization;
* Information retrieval, data semantics, querying and ranking in databases;
* Big Data & Data Science.


DBDBD has a tradition of favoring presentations by junior researchers.
Proposals for presentations should be made before or on November 27, 2015.
Each submission should contain:

 * the title of the talk;
 * the name of the prospective speaker;
 * his/her affiliation;
 * a one-page abstract;
 * reference(s) to papers covered by the proposed presentation (if pertinent).

The format is a one-page pdf-document, to be sent to:
dbdbd2015 (at)


 9:00 - 10:00   Arrival and Coffee
10:00 - 10:10   Opening
10:10 - 11:50   First Session   (4x 25 min)
                 First-Order Under-Approximations of Consistent Query Answers
                  Fabian Pijcke        
                  (Université de Mons)  
                 Different ways of expressing boolean queries
                  Dimitri Surinx       
                  (Hasselt University)  
                 Enumeration of Most General Why-Not Explanations
                  Evgeny Sherkhonov         
                  (University of Amsterdam)     
                 JudgeD: a Probabilistic Datalog with Dependencies
                  Brend Wanders          
                  (University of Twente)
11:50 - 13:00   Lunch   
13:00 - 14:40   Second Session  (4x 25 min)
                 MonetDBLite – Bringing Column Stores to the Masses
                  Hannes Mühleisen      
                  (Centrum Wiskunde & Informatica)      
                 Efficient Joins on Heterogeneous Processors    
                  Henning Funke, Sebastian Breß, Stefan Noll, Jens Teubner
                  (TU Dortmund University)      
                 Data Blocks: Hybrid OLTP and OLAP on Compressed Storage using both Vectorization and Compilation
                  Harald Lang, Tobias Mühlbauer, Florian Funke, Peter Boncz, Thomas Neumann, Alfons Kemper
                  (Technical University Munich, Snowflake Computing, Centrum Wiskunde & Informatica)
                 Cleaning data with forbidden itemsets  
                  Joeri Rammelaere, Floris Geerts, Bart Goethals
                  (Universiteit Antwerpen)
14:40 - 15:10   Coffee Break    
15:10 - 16:50   Third Session   (4x 25 min)
                 Aggregation of spatio-temporal and event log databases for stochastic characterization of process activities
                  Rodrigo Gonçalves, Rui Jorge Almeida, João M. C. Sousa        
                  (Eindhoven University of Technology & Universidade de Lisboa)
                 Beauty and Brains: Detecting Anomalous Pattern Co-Occurrences
                  Roel Bertens, Jilles Vreeken, Arno Siebes     
                  (Utrecht University, Max Planck Institute for Informatics and Saarland University)    
                 Behavioral Dynamics from the SERP’s Perspective: What are Failed SERPs and How to Fix Them?
                  Julia Kiseleva, Jaap Kamps, Vadim Nikulin, Nikita Makarov     
                  (Eindhoven University of Technology, University of Amsterdam, Yandex)
                 Extraction of family relationships from historical documents
                  Julia Efremova & Toon Calders         
                  (Eindhoven University of Technology & Université Libre de Bruxelles)
16:50 - 17:00   Closing        
17:00 - 19:00   Borrel (Drinks)

Important Dates

 * Submission deadline (1-page abstract):  Friday, November 27, 2015
 * Notification and program online:        Wednesday, December 2, 2015
 * Registration deadline:                  Wednesday, December 9, 2015
 * Dutch-Belgian Database Day:             Wednesday, December 16, 2015


DBDBD 2015 is organized under the auspices of SIKS, the Dutch research school
for information and knowledge systems. Participation is free for all
SIKS members (PhD students, research fellows, senior research fellows and
associated members). For non-SIKS members, the registration fee is 50 euro
per person. This includes lunch and coffee breaks. Registration in advance
is necessary for all participants.

Please register on-line at by Wednesday, December 9, 2015.

Non-SIKS members, please pay the registration fee of € 50,- via also by Wednesday, December 9, 2015.


DBDBD 2015 will be held at Centrum Wiskunde & Informatica (CWI), in
Amsterdam, The Netherlands. Address and directions are available on-line at

Local Organizers

The Dutch-Belgian Database Day 2015 is organized by the Database
Architectures group of Centrum Wiskunde & Informatica (CWI) in Amsterdam
(Netherlands). The local organizers are

 * Stefan Manegold
 * Martine Anholt Gunzeln



Amsterdam Data Science "Coffee & Data #4: Digital Energy"

PhD Defence Sándor Héman (DA)

October 28 Wednesday

Start: 2015-10-28 14:45:00+01:00 End: 2015-10-28 16:00:00+01:00

VU Auditorium

You are invited to the public defence by Sándor Héman of his PhD thesis entitled 'Updating Compressed Column-Stores'.


Location is at VU Auditorium, De Boelelaan 1105, 1081 HV Amsterdam


Promotor: Prof.dr. P.A. Boncz (CWI, VU)

PhD Defence Holger Pirk (DA)

May 1 Friday

Start: 2015-05-01 10:00:00+02:00 End: 2015-05-01 12:00:00+02:00

Agnietenkapel, Oudezijds Voorburgwal 229-231, Amsterdam

Everybody is welcome to attend Holger Pirk's defence of his thesis "Waste Not, Want Not! - Managing Relational Data in Asymmetric Memories".


Promotor: Prof. dr. M.L. Kersten (CWI)

Pre-Labour Day Database Afternoon

April 30 Thursday

Start: 2015-04-30 12:00:00+02:00 End: 2015-04-30 15:00:00+02:00


For the occasion of the PhD defence of Holger Pirk (MIT, Cambridge, MA, USA) on Friday May 1, 2015, at 12:00 noon in the Agnietenkapel,

we are pleased to organize a "Pre-Labour Day Database Afternoon"

on Thursday April 30, 2015, from 14:00 to 17:00, in room L0.17 @ CWI

with three esteemed speakers:


Johannes Gehrke
(Distinguished Engineer at Microsoft and the Tisch University Professor in the Department of Computer Science at Cornell University)

Title: Deferring the Effect of Transactions

ACID Transactions have for decades provided the gold standard in strong consistency. I will describe applications and models for deferring the effect of transactions on the state of the system while maintaining one-copy serializability. In our first model of a quantum database, we can commit transactions while deferring assignments of values in these transactions to optimize the allocation of resources. In our second model, we permit parts of a replicated or distributed database system to be inconsistent during execution, as long as this inconsistency is bounded and does not affect transaction correctness. Our fully automated approach uses program analysis to extract semantic information about permissible levels of inconsistency; it generates treaties between sites that allow sites to operate independently until treaties are violated.

Goetz Graefe
(Fellow at HP Labs)

Title: Instant restart after a system failure

Database system failures and the subsequent recovery disrupt many transactions and entire applications, usually for an extended duration. For those failures, new on-demand “instant” recovery techniques reduce application downtime from minutes or hours to seconds. These new recovery techniques work for databases, file systems, key-value stores, and all other data stores that employ write-ahead logging.

In traditional recovery from a system failure, e.g., a crash of the database server process, applications may resume and start new transactions after recovery has performed the “redo” actions of all log records written since the last checkpoint and then all “undo” (compensation) actions for failed transactions, i.e., those left incomplete at the time of the crash. Both “redo” and “undo” phases may require many random database reads and thus a relatively long time. The design and some implementations of ARIES support new transactions concurrent to the “undo” phase after lock acquisition during the “redo” phase. For even earlier application availability, the new, “instant” recovery technique permits new database transactions before any  “redo” and “undo” work, imposes little load concurrent to new transactions, and prioritizes recovery of database contents according to the needs of new transactions. Concurrently to “redo” and “undo” recovery guided by new transactions, traditional restart recovery scanning the pre-crash recovery log forward and backward ensures that all recovery actions complete in about the same time as traditional recovery without concurrent new transactions.

Johann-Christoph Freytag
(Professor for Databases and Information Systems at the Humboldt-Universität zu Berlin)

Title: Adapting Tree Structures for Processing with SIMD Instructions

In this talk, we show how to accelerate the processing of tree-based index structures by using SIMD instructions. We adapt the B+-Tree and prefix B-Tree (trie) by changing the search algorithm on inner nodes from binary search to k-ary search.

We develop adaptations of tree structures that satisfy the specific constraints of SIMD instructions. We present algorithms for transforming the original tree layout into a SIMD-friendly layout. Our adapted B+-Tree speeds up search processes by a factor of up to eight for small data types compared to the original B+-Tree using binary search. Furthermore, our adapted prefix B-Tree enables a high search performance even for larger data types.

This work was done together with Steffen Zeuch and Frank Huber.
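
As a rough illustration of the k-ary search mentioned in the abstract above, the following Python sketch narrows a sorted key array by comparing the target against up to k-1 separator keys per step, instead of the single midpoint used by binary search. It is a scalar emulation written for this page, not the speakers' implementation: the function name `k_ary_search`, the default `k=4`, and the way separators are chosen are all assumptions, and the SIMD-friendly node layout described in the talk is not modelled here.

```python
from bisect import bisect_left


def k_ary_search(keys, target, k=4):
    """Return the index of the first element >= target in sorted `keys`.

    Hypothetical scalar sketch: each step compares `target` with up to
    k-1 separator keys. On SIMD hardware these comparisons could be done
    in one instruction; here they run in an ordinary loop.
    """
    if k < 2:
        raise ValueError("k must be at least 2")
    lo, hi = 0, len(keys)  # the answer always lies in [lo, hi]
    while lo < hi:
        step = max(1, (hi - lo) // k)
        # Up to k-1 separator positions inside the current range.
        separators = list(range(lo + step - 1, hi, step))[: k - 1]
        new_lo, new_hi = lo, hi
        for pos in separators:
            if keys[pos] < target:
                new_lo = pos + 1   # target is to the right of this separator
            else:
                new_hi = pos       # target is at or left of this separator
                break
        lo, hi = new_lo, new_hi
    return lo


# Sanity check against the standard library's binary search.
sample = [1, 3, 5, 7, 9, 11, 13, 15, 17]
for probe in range(0, 20):
    assert k_ary_search(sample, probe) == bisect_left(sample, probe)
```

With k = 2 the loop degenerates to an ordinary binary search. Larger k values reduce the number of steps at the cost of more comparisons per step, which is the trade-off that SIMD comparison instructions are meant to make cheap.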

Members of Database Architectures



Current Projects

  • Actian CWI Research Grant
  • Capturing the Laws of Data Nature
  • Cross-Industry Predictive Maintenance Optimization Platform
  • Data Mining on High Volume Simulation Output
  • Databricks: Databricks CWI Research Agreement
  • LAD: Layered Astronomical Databases
  • Process mining for multi-objective online control
  • SciLens-II: The SciLens-II Infrastructure, Big Data at work


  • Actian Corporation
  • BMW Munich
  • Databricks
  • LIACS Institute
  • MonetDB B.V.
  • Tata Steel