Dutch Seminar on Data Systems Design

When
5 Apr 2024, 4 p.m. to 5 p.m. CEST (GMT+0200)
Where
Zoom

I am happy to announce the next edition of our DSDSD series, featuring
a talk by Pedro Holanda. The session will take place on Friday, 5 April
at 4 p.m. (Amsterdam time).

The seminar will be held via Zoom. The link will be sent separately in a
follow-up email. Amsterdam-based researchers are also welcome to join us
in person at CWI. Feel free to contact me via email if you are interested!

Please see below for details on the talk and the speaker.

-----------------------------------------------------------------

Speaker: Pedro Holanda

Title: Efficient CSV Parsing: On the Complexity of Simple Things

Abstract: In this talk, we will revisit different CSV parsing
implementations in DuckDB and compare them with the current
implementation. The bulk of the talk discusses the design and
implementation decisions in DuckDB's current CSV parser. In particular,
we will examine the parallel algorithm, the CSV buffer manager, and the
transitions of the CSV state machine. Disclaimer: This talk is not for
the faint of heart; some very exotically built CSV files will be depicted.

Bio: Pedro is an early contributor to DuckDB and currently works as a
software engineer at DuckDB Labs, focusing on core and integration
aspects of DBMS technology. He completed his PhD at the Database
Architectures group at CWI, researching Indexes for Interactive Data
Analysis.

-----------------------------------------------------------------

We look forward to seeing you all at the next session!

--
dsdsd-list mailing list
dsdsd-list@cwi.nl
https://lists.cwi.nl/mailman/listinfo/dsdsd-list