Edit_Dunhuang

The Edit_Dunhuang project, centered on the Pelliot chinois collection of the Bibliothèque nationale de France (BnF), aims to improve the automatic transcription of historical Chinese documents by developing tools that convert Optical Character Recognition (OCR) output into structured, richly annotated texts. The resulting large textual corpora are intended to support both qualitative and quantitative research. Because the Pelliot chinois collection is highly diverse, it is well suited to testing and refining our tools so that they remain effective and robust across a wide range of documents. Our goal is to provide an innovative solution for digitizing and analyzing ancient Chinese texts, opening new perspectives for researchers in Chinese studies.

Illustration of the Edit_Dunhuang project