Description
Leader of the group Database Architectures: Stefan Manegold.
We, the Database Architectures (DA) research group of CWI, work in the broad area of data management systems and the infrastructure that supports data science. Our group has a strong international reputation in academia and industry for pioneering column-store technology, fast compression methods, vectorized query execution, online query-driven indexing (cracking), adaptive caching, and the integration of statistical languages and analysis into database management systems.
We develop, distribute and maintain the MonetDB open-source system, and we have spawned multiple spin-off companies, including Data Distilleries, VectorWise and MonetDB Solutions. Our team also operates a self-built cluster, SciLens, that – unlike many other computer clusters – is bandwidth-optimized and thus better suited as a data-science infrastructure. We pride ourselves on revealing the real problems in our discipline and coming up with revolutionary solutions that are frequently ahead of their time.
Vacancies
PhD Students in the areas of Big Data Management and Analysis Architectures and Data Science Engineering Technologies
Three fully funded PhD positions are available to work under the direction of Prof. Stefan Manegold on big data management technology. The particular focus is on hardware-conscious data structures and algorithms in distributed and cloud environments, and on integrating data mining and machine learning into large-scale data management systems.
News

CWI & Databricks: Big Data in Amsterdam
Amsterdam wants to play a leading international role in the development of data science research. In Big Data Amsterdam, Financieele Dagblad journalist Job Woudt interviews Amsterdam Data Science researchers on the functioning of the ecosystem where companies and knowledge institutions in Amsterdam collaborate in the area of Big Data.

Making a tedious search a breeze: parallel query execution in multi-core systems
When you request information from a huge database, you might want to grab a cup of coffee and sit back, because your request may take a while. How can we speed up such searches, specifically on a multi-core processor? The answer may lie in parallel queries, which run simultaneously on different processor cores.

Martin Kersten appointed ACM Fellow
CWI fellow Martin Kersten has been appointed as one of the 2016 fellows of the Association for Computing Machinery (ACM).
Current events
ACM SIGMOD/PODS 2019
Start: 2019-06-30 00:00:00+02:00 End: 2019-07-05 23:59:59+02:00
The Conference
The annual ACM SIGMOD/PODS Conference is a leading international forum for database researchers, practitioners, developers, and users to explore cutting-edge ideas and results, and to exchange techniques, tools, and experiences. The conference includes a fascinating technical program with research and industrial talks, tutorials, demos, and focused workshops. It also hosts a poster session to learn about innovative technology, an industrial exhibition to meet companies and publishers, and a careers-in-industry panel with representatives from leading companies.
SIGMOD
Stefan Manegold, Peter Boncz - General Chairs
Anastasia Ailamaki - Program Chair
PODS
Dan Suciu - General Chair
Christoph Koch - Program Chair
For more information: http://sigmod2019.org/
Members
Associated Members
Publications
- Bereta, K, Koubarakis, M, Manegold, S, Stamoulis, G, & Demir, B. (2018). From big data to big information and big knowledge: The case of Earth observation data. In CIKM - Proceedings of the International Conference on Information and Knowledge Management (pp. 2293–2294). doi:10.1145/3269206.3274270
- Tomé, D.G, Kepe, T.R, Alves, M.A.Z, & de Almeida, E.C. (2018). Near-data filters: Taking another brick from the memory wall. In Proceedings of the International Workshop on Accelerating Analytics and Data management Systems Using Modern Processor and Storage Architectures.
- Raasveldt, M. (2018). Integrating analytics with relational databases. In Proceedings of the VLDB 2018 PhD Workshop co-located with the 44th International Conference on Very Large Databases (VLDB 2018).
- Tomé, D.G, Gubner, T.K, Raasveldt, M, Rozenberg, E, & Boncz, P.A. (2018). Optimizing group-by and aggregation using GPU-CPU co-processing. In Proceedings of the International Workshop on Accelerating Analytics and Data management Systems Using Modern Processor and Storage Architectures (pp. 1–10).
- Raasveldt, M, Holanda, P.T.T, Gubner, T.K, & Mühleisen, H.F. (2018). Fair benchmarking considered difficult: Common pitfalls in database performance testing. In Workshop on Testing Database Systems. doi:10.1145/3209950.3209955
- Traub, M.C, van Ossenbruggen, J.R, Samar, T, & Hardman, L. (2018). Impact of Crowdsourcing OCR Improvements on Retrievability Bias. In ACM International Conference Proceeding Series. doi:10.1145/3197026.3197046
- Tomé, D.G, Santos, P.C, Carro, L, de Almeida, E.C, & Alves, M.A.Z. (2018). HIPE: HMC Instruction Predication Extension Applied on Database Processing. In Proceedings of the 2018 Design, Automation and Test in Europe Conference and Exhibition, DATE 2018 (pp. 261–264). doi:10.23919/DATE.2018.8342015
- Kipf, A, Lang, H, Pandey, V.N, Persa, R, Boncz, P.A, Neumann, T, & Kemper, A. (2018). Approximate geospatial joins with precision guarantees. In Proceedings of the International Conference on Data Engineering.
- Zhang, Y, Koopmanschap, R.A, & Kersten, M.L. (2018). Love at first sight: MonetDB/TensorFlow. In IEEE 34th International Conference on Data Engineering, ICDE 2018. doi:10.1109/ICDE.2018.00208
- Scheers, L.H.A, Bloemen, S, Mühleisen, H.F, Schellart, P, Van Elteren, A, Kersten, M.L, & Groot, P.J. (2018). Fast in-database cross-matching of high-cadence, high-density source lists with an up-to-date sky model. Astronomy and Computing, 23, 27–39. doi:10.1016/j.ascom.2018.02.006
Software
MonetDB: high-performance query processing against very large databases
MonetDB is a relational database management system (DBMS) providing high performance on complex queries against large databases.
Current projects with external funding
- Actian CWI Research Grant
- RelationalAI-CWI Research Agreement
- Cross-Industry Predictive Maintenance Optimization Platform (CIMPLO)
- Data Mining on High Volume Simulation Output (DAMIOSO)
- Process mining for multi-objective online control (PROMIMOOC)
- The SciLens-II Infrastructure, Big Data at work (Scilens-II)
- Structure-aware Querying & Information Retrieval on Evolving Large Graphs (SQIREL-GRAPHS)
Related partners
- Actian Corporation
- BMW Munich
- Databricks
- LIACS Institute
- MonetDB B.V.
- Neo Technology AB
- OBI4wan B.V.
- RelationalAI
- Spinque
- Tata Steel
- WizeNoze B.V.