Base de Français Médiéval – Old French Corpus

Description: The Base de Français Médiéval database (or BFM), founded in 1989, currently comprises seventy-five complete Old and Middle French texts. Thanks to its volume (approximately 3,300,000 words) and the diversity of the texts included, this database is unique in France for this period of the history of French. It has been used by a research community of around one hundred scholars, teachers, and students worldwide.
The texts included in the BFM cover a considerable geographic area and an extensive chronological range, from the 9th century (including the first known French text, the Serments de Strasbourg) to the end of the 15th century. Both verse and prose texts are represented, as well as different genres and domains (e.g., fiction, history, hagiography, law, and the sciences).
Since May 2012, the BFM has been accessible via a new web portal powered by the TXM corpus search and analysis platform. Depending on their copyright status, texts can be searched with or without a limit on context size and viewed in a web browser. Non-copyrighted texts can be downloaded on demand as TEI P5 XML files.
All BFM texts are tokenized and morphologically tagged with the help of TreeTagger (using BFM’s own parameter file). As of September 2012, the morphological annotation of eleven texts has been verified and corrected by experts.
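
As an illustration of working with a downloaded file, the following is a minimal sketch of how word tokens and their morphological tags might be counted in a TEI P5 document. It assumes that tokens are encoded as TEI `<w>` elements and that the tag is stored in a `type` attribute; the attribute name, the tag values, and the sample fragment are all hypothetical and should be checked against the BFM Text Encoding Manual.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Standard TEI P5 namespace.
TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}

# Hypothetical sample fragment for illustration only; it is not copied
# from an actual BFM file, and the tag values are invented.
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <text><body><p>
    <w type="PRE">Pro</w> <w type="NOMcom">deo</w> <w type="NOMcom">amur</w>
  </p></body></text>
</TEI>"""


def count_tags(xml_text, pos_attr="type"):
    """Return (number of <w> tokens, Counter of tag values).

    pos_attr is an assumption: the attribute that carries the
    morphological tag may be named differently in real BFM files.
    """
    root = ET.fromstring(xml_text)
    words = root.findall(".//tei:w", TEI_NS)
    tags = Counter(w.get(pos_attr, "UNKNOWN") for w in words)
    return len(words), tags


n_words, tag_counts = count_tags(SAMPLE)
print(n_words, dict(tag_counts))
```

For a file on disk, the same function can be applied to the file's contents read as a string; only the attribute name would need adjusting if the corpus uses a different convention.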

Implementation description: TEI P5, modules included: tei, textstructure, core, header, msdescription, corpus, analysis, namesdates, linking, gaiji, transcr.
See the BFM Text Encoding Manual for details. See also the BFM TEI ODD specification.

Related Resources: BFM Project Web Site:
Instructions for proofreaders and encoders (in French):
BFM Manuscript Transcription Encoding Manual (in French):
BFM Text Description Manual (in French):

Céline Guillot
ENS de Lyon / ICAR
15 parvis René-Descartes
69342 Lyon Cedex 7
Tel: 33 (0)4 37 37 63 15
Fax: 33 (0)4 37 37 62 65

