The datathon will bring together an international group of researchers interested in the computational analysis of historical media collections across languages and modalities. Participants will work with tools, data, models and expert support provided by the Impresso team and the BnF.
The event will use English as a shared working language.
Impresso@BnF Datathon, 28–30 October 2026 - Historical Media Across Languages and Modalities
About the event
The Impresso@BnF Datathon is designed as a hands-on research event for participants interested in working with large-scale digitised historical media collections.
Participants will work in small teams on self-defined research questions, drawing on BnF collections and Impresso resources, including a multilingual media corpus, derived datasets, NLP and vision models, and the Impresso Datalab.
The Impresso team will be present throughout the event to provide guidance, technical support and methodological input. The format combines training, experimentation, collaborative work and discussion around concrete research questions.
Participation is limited to 30 people in order to support a productive and discussion-oriented working environment.
Programme overview
The event will take place over three days at the Bibliothèque nationale de France in Paris.
- On Day 1 (afternoon), participants will be introduced to the Impresso Web App and the Impresso Datalab through presentations and live demonstrations. The afternoon will also include group formation around shared research interests.
- Days 2 and 3 will focus on hands-on research work. The programme will include guided sessions on embeddings and data-driven analysis, team project time, methodological and technical support from the Impresso team, and a closing session with group presentations and collective discussion.
Sessions are designed to be flexible. Participants can engage with specific components according to their experience level, technical background and research focus. Dedicated exchange stations will also provide opportunities for informal conversations with the organisers throughout the event.
During registration, participants will be asked to briefly describe their research interests. These responses will help shape the agenda and the selection of case studies, so applicants are encouraged to be specific.
What Impresso offers
At the heart of the datathon is the Impresso Datalab, a research environment for the computational analysis of digitised historical media. It combines an exploration interface with a computational workspace and supports data-driven work with multilingual and multimodal collections.
Participants will have access to:
- the Impresso Web App, an interface for exploring and querying a semantically enriched multilingual corpus of digitised newspapers and radio broadcasts;
- the Impresso Datalab, a computational environment for data-driven analysis, accompanied by derived datasets, NLP models (available on Hugging Face), and multimodal embeddings enabling cross-lingual text search, image-text linking, and semantic analysis across heterogeneous archival collections;
- dedicated training materials and case study Jupyter notebooks to support participants across a range of technical backgrounds.
These resources can support cross-lingual text search, image-text linking, semantic exploration and the analysis of heterogeneous archival collections. Whether participants work with sources in French, German, English or another European language, and whether their research focuses on text, images or both, the Datalab provides tools to connect, annotate and query historical collections at scale.
Who can participate?
The datathon is suited for:
- historians and media scholars working with digitised press archives;
- digital humanities researchers interested in computational methods for cultural heritage;
- information scientists and librarians exploring AI-supported access to archival collections;
- advanced students and early-career researchers working with, or interested in, historical data analysis.
Prior programming or data analysis experience is helpful but not required. Ready-to-use case study notebooks will be provided, and the Impresso team will offer hands-on support throughout the event.
Registration
The Impresso@BnF Datathon will take place at the Bibliothèque nationale de France in Paris from 28 to 30 October 2026.
Places are limited to 30 participants. For questions, please contact us.
We look forward to welcoming you to Paris for what we hope will be a productive and enjoyable few days!
This work is licensed under a Creative Commons Attribution-NoDerivatives 4.0 International License.
A Creative Commons Attribution-NoDerivatives 4.0 (CC BY-ND 4.0) license applies to all contents published in impresso. While articles published on impresso can be copied by anyone for noncommercial purposes if proper credit is given, all materials are published under an open-access license with authors retaining full and permanent ownership of their work.