Measuring and Recording Manuscript Word Division Using HTR

Project type
Exploratory project
Scientific coordinator
Mark Faulkner
Elisabetta Magnanti
Project leader
Selected in
2025

The Middle Ages saw the transition from scriptio continua, where letters were written continuously across the page, to their eventual canonical division by spaces into individual words. This transition is especially interesting for vernacular languages, like Old English and Old French, where scribal experiments with word spacing can be said to bring into being the word as a unit of discourse in those languages. However, palaeographers’ perception and judgement of word spacing is affected by their awareness of the meaning of the text: they are not neutral reporters of where spaces occur and how large they are. Our project accordingly seeks to harness computer vision to record spaces between words in manuscripts with new accuracy. It will develop a set of protocols for measuring and recording word spacing in medieval manuscripts within the Biblissima portal, with case studies of manuscripts in Latin, Old English and Old French (the last as a vernacular comparandum for Old English) now held in the BnF and the Bibliothèque municipale (BM) in Rouen and ranging in date from the ninth to the thirteenth century. The project will also facilitate the digitisation of three Rouen manuscripts in full colour.

Placeholder illustration: Iris (France, Paris, Bibliothèque Sainte-Geneviève, Ms. 1026, f. 080)