Unveiling the ANP (Algemeen Nederlands Persbureau) Collection with open-source Large Language Models.

This project investigated the potential of open-source Large Language Models (LLMs) to enhance the exploration of the ANP collection held by the Koninklijke Bibliotheek (KB). We explored the feasibility of enabling conversation-driven access to the collection using existing, publicly available software. In collaboration with a network of experts, we assessed technical viability and then connected the collection to an open-source LLM via Retrieval-Augmented Generation (RAG).
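For readers unfamiliar with RAG, the general pattern can be sketched as follows: retrieve the passages most relevant to a question, then hand them to the model as context. This is a minimal illustration only, not the project's implementation. The sample texts, the function names, and the simple word-overlap retriever are all assumptions made for the example; the abstract does not say which retriever or model was used, and the final step of sending the prompt to an LLM is left out.

```python
import math
import re
from collections import Counter

# Hypothetical stand-ins for news bulletins; not real ANP collection data.
DOCUMENTS = [
    "The harbour strike in Rotterdam ended after talks between unions and employers.",
    "Heavy snowfall disrupted train traffic across the northern provinces.",
    "The cabinet presented its budget plans to parliament on Tuesday.",
]


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two word-count vectors (Counters)."""
    dot = sum(count * b[term] for term, count in a.items() if term in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question, documents, k=2):
    """Return up to k documents that share words with the question, best first."""
    query = Counter(tokenize(question))
    scored = [(cosine(query, Counter(tokenize(doc))), doc) for doc in documents]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for score, doc in scored[:k] if score > 0]


def build_prompt(question, passages):
    """Combine the retrieved passages and the question into one prompt string."""
    context = "\n".join(f"- {p}" for p in passages)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )


if __name__ == "__main__":
    question = "What happened with the strike in Rotterdam?"
    passages = retrieve(question, DOCUMENTS, k=1)
    # In a full pipeline this prompt would be sent to an open-source LLM.
    print(build_prompt(question, passages))
```

A real system would typically replace the word-overlap scoring with embedding-based search over an index of the collection, but the division into a retrieval step and a generation step stays the same.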

Our findings demonstrate the potential of this approach to facilitate user interaction with the ANP collection, fostering deeper understanding and novel insights. However, the project also highlights challenges stemming from the immaturity of the software, as well as ethical considerations surrounding the use of open-source LLMs in real-world applications.

This work paves the way for future efforts to improve access to the ANP collection. While further development is necessary before full production deployment, the project presents a promising avenue for leveraging open-source AI in cultural heritage exploration.

Speaker: Willem Jan Faber (KB)