Digital Editions as the Myth of Sisyphus

Burghart, Marjorie


The Myth of Sisyphus is well known, even to schoolboys fascinated by Greek mythology: for having defied the gods and put Death in chains, Sisyphus was condemned to push a huge boulder up a mountain slope; whenever he reached the top, the boulder would roll back down, and Sisyphus would have to roll it up again and again, for all eternity.

In this paper, I will argue that this myth is an apt allegory for the work of digital editors using the TEI. When they decide to create a digital edition, they defy the limitations of traditional print editions, and give their work better chances of accessibility and longevity.

But like Sisyphus, they have to pay a price for that. When a scholar publishes a print edition, he goes through the usual process of a critical edition and prepares his work for publication; once it is published and printed, he no longer needs to care for it. There is no need to search for further funds, unless a second edition is planned; libraries all over the world will keep copies safe in their collections and make them reasonably available to potential readers; and the layout and typography of the work benefit from a centuries-old tradition and are unlikely to be questioned during the lifetime of the scholar.

In a word, the critical editor can move on to his next work, without giving much thought to his published edition.

The scholar who engages in a digital edition, on the other hand, is a modern Sisyphus: publishing a digital edition is an exacting, never-ending task! For instance:


  • digital editions are often at the mercy of IT services, which may or may not be willing to provide the necessary framework and ongoing development; this concerns the encoding of the edition (as the Text Encoding Initiative evolves) as well as the applications needed to make an edition available to its readers

  • the standards and trends of web design evolve very quickly, more quickly even than the underlying technologies, and demand a graphic overhaul every 3 or 4 years

  • to a lesser extent, the very possibility of updating the edition is a form of pressure in itself.

I will review these issues, showing how scholars publishing digital editions find themselves in an absurd situation where nothing guarantees their work will always be available, unless, during their whole career, they take care of maintaining their work – not to mention the uncertain future of their work once they retire.

I will discuss potential ways to address these issues and to relieve the critical editor of his Sisyphean task, among which I will suggest a better defined status of published digital editions, and the creation of public institutions offering the equivalent of a legal deposit to digital editions.



Converting legacy editions to TEI

Cramme, Stefan


Over the past few years, the Library for Research on Educational History (Bibliothek für Bildungsgeschichtliche Forschung, BBF) in Berlin, Germany, has offered scholars in the history of education a service for putting online editions of sources that have been transcribed and prepared but not yet published (in contrast to the more numerous born-digital editions or the retrodigitization of older works). These editions, covering sources from the 18th to the 20th century, were mostly begun with a conventional printed publication in mind, which has not proved feasible. The texts are converted from legacy formats (usually an older version of Microsoft Word) to XML, applying a limited set of semantic TEI markup. The online editions generated from the XML version are in some cases accompanied by a printed volume with selections from the full corpus. As a research library, the BBF regards close cooperation with the research community as one of its main tasks, providing TEI expertise so that the scholars can concentrate on revising the text and preparing supplementary material such as indexes and annotations.

The micropaper will show the specific problems and pitfalls encountered in such a conversion process, but also focus on how unpublished legacy transcriptions and editions of source material can be adapted to the TEI guidelines and put online with limited resources. Examples will be taken from the correspondences of the educator Friedrich Fröbel and the educational philosopher Eduard Spranger.



Application of TEI to a biographical dictionary (

Reinert, Matthias


The starting point of our DFG-funded project (2007-2009) was two series of digitized volumes: the "Allgemeine Deutsche Biographie" in 55 volumes (ADB, publ. 1875-1912) and 23 volumes of the "Neue Deutsche Biographie" (NDB, publ. since 1953), comprising some 47,000 articles and 88,000 persons recorded in a separate non-XML database.

The raw text had typographically encoded features that we used to restructure the text automatically. Each article (in NDB) consists of functional parts such as genealogy, life, works, etc. The most challenging part was the realignment of articles with persons: we had to identify both the persons with biographies and those merely mentioned in the text.

We made heavy use of <persName>, <birth> and <death> tags to identify strings as persons. The TEI-Lite standard was loosened in order to allow those tags within the text. When proofreading the automatically encoded articles, these tags could be told apart more easily than <name> tags with several types. Nonetheless, the texts can be validated against the TEI-Lite schema after a simple transformation.
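The abstract does not spell out the "simple transformation". As a minimal sketch, assuming it amounts to renaming the extra tags to typed <name> elements before validation, it could look like the following. The mapping, the function name and the sample sentence (with an invented person and place) are hypothetical, and real TEI files would also carry the TEI namespace, which is left out here for brevity.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping: project-specific tag -> value of the type attribute on <name>.
TAG_TO_TYPE = {"persName": "person", "birth": "birth", "death": "death"}


def to_typed_names(xml_text: str) -> str:
    """Rename <persName>, <birth> and <death> to <name type="...">."""
    root = ET.fromstring(xml_text)
    for elem in root.iter():
        if elem.tag in TAG_TO_TYPE:
            # Record the original tag as a type, then switch to the generic tag.
            elem.set("type", TAG_TO_TYPE[elem.tag])
            elem.tag = "name"
    return ET.tostring(root, encoding="unicode")


# Invented sample data, not taken from ADB or NDB.
sample = (
    "<p><persName>Max Mustermann</persName>, "
    "<birth>1800 in Musterstadt</birth>, "
    "<death>1870 in Musterstadt</death>.</p>"
)
converted = to_typed_names(sample)
print(converted)
```

Keeping the short, distinct tags in the working files and generating the typed form only for validation matches the proofreading argument made above, but the actual transformation used in the project may differ.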

In addition, abbreviations and short titles were identified.

Almost all persons mentioned in both series have been manually identified in bibliographic authority files (Personennamendatei, PND). Places of birth and death are being aligned with geodatabases (namely OpenStreetMap).

Both identifications result in concordance files, partly so that they can be maintained separately, partly to reduce the amount of code within the TEI-encoded texts and so ease readability.
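The idea of a separately maintained concordance can be sketched as follows. This is only an illustration under assumptions: the CSV layout, the column names, the local keys and the identifiers are all invented placeholders, not the project's actual file format or real PND numbers.

```python
import csv
import io

# Invented concordance: local person key -> authority-file identifier.
CONCORDANCE_CSV = """key,pnd
p0001,PND-EXAMPLE-1
p0002,PND-EXAMPLE-2
"""


def load_concordance(text):
    """Read the concordance into a dictionary keyed by the local person key."""
    reader = csv.DictReader(io.StringIO(text))
    return {row["key"]: row["pnd"] for row in reader}


concordance = load_concordance(CONCORDANCE_CSV)


def resolve(key):
    """Return the authority identifier for a local key, or None if unknown."""
    return concordance.get(key)


print(resolve("p0001"))
```

With such a lookup, the encoded text only needs to carry the short local key, and corrections to the authority alignment never touch the TEI files themselves.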

The next steps consist in

a) "parsing" the genealogy, to make relations between persons mentioned more explicit,

b) breaking up the series/volume-structure, to have articles ordered and editable by person.



Mapping metadata of TEI-encoded biographies to CIDOC-CRM

Reinert, Matthias; Riechert, Thomas


While publishing online two digitised biographical dictionaries containing biographies of about 40,000 historical persons in 47,000 articles, a major challenge is to make the data available. Besides making the material freely available online and porting metadata into academic search engines and OA registries, we chose to create Linked Open Data out of our biographical repository. Funded by PUBLINK (part of the AKSW helped us to provide biographical metadata in RDF. Since almost all persons are aligned with the German Name Authority File (PND, already part of LOD) and a majority of places of birth and death are identified in geodatabases (OpenStreetMap), we first used a set of common ontologies (FOAF, DCMES) to express statements like "was born in", "died in", "knows".
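Statements of this kind can be illustrated with a few hand-built N-Triples lines. This is a minimal sketch, not the project's actual output: the `http://example.org/` identifiers and the `bornIn` property are hypothetical stand-ins, while `foaf:Person` and `foaf:knows` are terms of the FOAF vocabulary.

```python
FOAF = "http://xmlns.com/foaf/0.1/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
EX = "http://example.org/"  # hypothetical namespace, for illustration only


def triple(subject, predicate, obj):
    """Format one N-Triples statement whose three terms are all IRIs."""
    return f"<{subject}> <{predicate}> <{obj}> ."


person_a = EX + "person/a"
person_b = EX + "person/b"
place_x = EX + "place/x"

statements = [
    triple(person_a, RDF_TYPE, FOAF + "Person"),
    triple(person_a, FOAF + "knows", person_b),
    # "was born in": the property name is an assumption, not a FOAF term.
    triple(person_a, EX + "bornIn", place_x),
]
print("\n".join(statements))
```

In practice the subject and object IRIs would come from the PND and geodatabase alignments rather than from a placeholder namespace.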

In a second step we defined a set of mapping rules to CIDOC-CRM (actually using the OWL-DL variant Erlangen CRM). The motivation was

  • to maintain easy interoperability with Europeana and the emerging Deutsche Digitale Bibliothek (German Digital Library)

  • to be able to use semantic wikis (,, SMW+ recently in evaluation) to assist the editing and correction of our content as well as "content enrichment".
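To give an idea of what such a mapping rule does, the sketch below expands a flat "was born in" statement into the event-centred pattern that CIDOC-CRM favours, using the classes E21 Person, E67 Birth and E53 Place and the properties P98 brought into life and P7 took place at. It is an assumption about the general shape of the rules, not the project's actual rule set; the `ecrm:` prefix, the `ex:` identifiers and the way the event identifier is minted are hypothetical.

```python
def birth_event(person, place):
    """Expand 'person was born in place' into an event-centred pattern.

    Birth --P98_brought_into_life--> Person
    Birth --P7_took_place_at-------> Place
    """
    event = person + "/birth"  # hypothetical way of minting an event identifier
    return [
        (event, "rdf:type", "ecrm:E67_Birth"),
        (event, "ecrm:P98_brought_into_life", person),
        (event, "ecrm:P7_took_place_at", place),
        (person, "rdf:type", "ecrm:E21_Person"),
        (place, "rdf:type", "ecrm:E53_Place"),
    ]


triples = birth_event("ex:person/a", "ex:place/x")
for subject, predicate, obj in triples:
    print(subject, predicate, obj, ".")
```

A death would follow the same pattern with E69 Death and P100 was death of. The intermediate event node is what makes dates, places and sources attachable to the birth itself rather than to the person.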



Virtual Scriptorium St. Matthias

Vanscheidt, Philipp; Scholzen, Sabine


The project Virtual Scriptorium St. Matthias intends to reunite electronically the codices, now scattered worldwide, from the library of the Benedictine abbey of St. Eucharius or St. Matthias in Trier. The project has been carried out at the Stadtbibliothek and Stadtarchiv Trier as well as at the Center for Digital Humanities at the University of Trier since summer 2010. About 450 codices from the period between the eighth and the fifteenth century will be digitized in three years.

These codices cover a wide range of topics from various traditions. Beyond theological and religious writings, there is a large number of Latin classics such as Cicero, Priscian, Sallust or Martianus Capella. A prestigious example of the inculturation of the ancient and pagan spirit is an illustrated edition of Aesop's fables. No other abbey possessed as many manuscripts of Hildegard of Bingen as St. Matthias. There are also three important specimens of the Decretum Gratiani; one of them includes 60% of all glosses ever written on this work. But the richly illustrated Trierer Apokalypse from Carolingian times may be the most famous of all these codices.

The project Virtual Scriptorium St. Matthias will present an electronic catalogue that sums up the knowledge from older descriptions and combines it with a presentation of the digitized codices. In this context TEI is used as a standard for the XML description of manuscripts. The number of objects requires these descriptions to be synchronized with a dynamic database in order to correlate them with other digitized catalogues, editions and databases such as the PND.
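How a TEI manuscript description might feed such a database can be sketched as follows. The fragment uses the TEI elements msDesc, msIdentifier, settlement, repository and idno; the shelfmark is an invented placeholder, and the function name and the flat record layout are assumptions rather than the project's actual synchronization mechanism.

```python
import xml.etree.ElementTree as ET

TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}

# Minimal sample description; the shelfmark is an invented placeholder.
SAMPLE = """<msDesc xmlns="http://www.tei-c.org/ns/1.0">
  <msIdentifier>
    <settlement>Trier</settlement>
    <repository>Stadtbibliothek</repository>
    <idno>Hs. EXAMPLE</idno>
  </msIdentifier>
</msDesc>"""


def identifier_record(xml_text):
    """Return the children of <msIdentifier> as a flat record for a database row."""
    root = ET.fromstring(xml_text)
    ident = root.find("tei:msIdentifier", TEI_NS)
    return {
        # Strip the "{namespace}" prefix that ElementTree puts on tag names.
        child.tag.split("}")[1]: (child.text or "").strip()
        for child in ident
    }


record = identifier_record(SAMPLE)
print(record)
```

A record of this kind could serve as the key under which a description is matched against other catalogues, while the full description stays in the TEI file.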

The results will be integrated into Manuscripta Mediaevalia and TextGrid. In this way the project will not only provide images and metadata but will also be included in a virtual working space in which further research and exploration will be possible, e.g. with TEI concurrent transcriptions of selected works. The project homepage will be released on the first of August 2011 on a trial basis. The project is to be presented in a short talk and a poster. The poster will cover the project thoroughly, while the short talk is intended to sketch the advantages and some practical limits of TEI in such an enterprise.



Beyond TEI: Returning the text to the reader

Wittern, Christian


Much research and practical effort has gone into the development and maintenance of a digital format that could form a stable foundation for texts in the digital age; the results of this work, in the form of the /Guidelines for Electronic Text Encoding and Interchange/, have been widely adopted in the community.

While this can indeed serve as a foundation for a digital edition of a text, the publication of texts encoded in such a way is still much less well understood and researched. The most common practice for digital publication today is to publish in some web-accessible form, with CD-ROM publication quickly becoming obsolete. In some rare cases, the XML source form of the edition is also made available.


For a researcher, this situation is in some respects much worse than it was when critical editions were published only in print, since in most cases the texts can only be *browsed* online (and every site has its own idiosyncratic way of displaying and navigating a text) and not physically owned. This not only invalidates many of the potential advantages of digital texts, namely, making the digital edition available for machine-mediated analysis, but even denies the reader the most basic form of scholarly activity, that is "active reading" or annotation of the text. What is needed here is the digital equivalent of a "college edition" of a text (and yes, we need to and can do much better than simply converting the text into ePub for consumption by electronic reading devices, but nevertheless this option also deserves attention as such devices become more sophisticated and widely adopted).

To remedy this situation, a new publication form for digital texts is proposed. At the core, this is a plain text format that contains only very few traces of markup, but serves to make the textual content available to the reader. The text is published through a distributed version control system, which allows the researcher to create branches, annotate, edit or translate the text without losing the connection to the established digital edition and thus to all the other researchers who are working on this text. If there are differing editions of a text, these editions can be represented as 'branches' in this system, but the assumption is still that there is one privileged 'master' branch that corresponds to a reading text in a critical edition.

In some respects, such a text is similar to publishing a college or paperback edition of the text established in a critical edition: the reader knows that the text is based on a rigorous editing process and thus forms a safe foundation for further research, but at the same time is not burdened with all the details that might get in between her and the text, and has at every stage of her work the possibility to refer back to the critical edition if that becomes necessary. In some other respects, it more closely resembles the interactive communities or "social networks" that have sprung up on the internet recently and already carry a significant amount of scholarly communication. There is however a critical difference between such services and the model proposed here: in the model described here, and implemented as a proof of concept as part of the Mandoku project [1], the researcher who publishes annotations in the form of additional 'branches' of the master branch of a text retains control and ownership of all these additions, which constitute an essential part of her scholarly work, without compromising the ability to quickly share the results with interested


Earlier versions of these experiments used the TEI XML format as the base for the texts, but it turned out to be a bad fit for the line-oriented model of texts used in version control systems. Currently, an enhanced version of the Emacs org-mode [2] file format is used. This has the additional advantage of providing a flexible user interface as well as options for direct export to other popular text formats, such as HTML, PDF, OpenOffice XML and DocBook XML. A back converter to TEI XML, which will offer the option to roll the different "branches" of texts maintained in the version control system back into one single file, is planned.
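The abstract does not say why TEI XML fitted the line-oriented model poorly. One plausible factor, offered here only as an assumption, is that XML line breaks are arbitrary, so a small change can register as a change to a long line or a whole paragraph. The sketch below compares how many characters a line-based diff reports for the same one-word edit in a line-per-unit plain text and in an XML serialisation kept on a single line; both sample texts and the function name are invented for illustration.

```python
import difflib


def changed_chars(before, after):
    """Count characters on lines that a line-based diff marks as removed or added."""
    diff = difflib.ndiff(before.splitlines(), after.splitlines())
    # ndiff prefixes removed lines with "- " and added lines with "+ ".
    return sum(len(line) - 2 for line in diff if line[:2] in ("- ", "+ "))


# Line-oriented text: one short unit per line, so the edit touches one line.
plain_before = "first line of text\nsecond line of text\nthird line of text"
plain_after = "first line of text\nsecond line of the text\nthird line of text"

# The same content serialised as XML on a single line.
xml_before = "<p><l>first line of text</l><l>second line of text</l><l>third line of text</l></p>"
xml_after = "<p><l>first line of text</l><l>second line of the text</l><l>third line of text</l></p>"

print(changed_chars(plain_before, plain_after))
print(changed_chars(xml_before, xml_after))
```

The larger the reported change, the harder it is for a version control system to merge concurrent annotations from different branches, which is where a line-per-unit format has an advantage.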

[1] cf. [], see also "Mandoku – An Incubator for Premodern Chinese Texts – or How to Get the Text We Want: An Inquiry into the Ideal Workflow", in: Digital Humanities 2010. Conference abstracts. London, 2010, p. 271-273.

[2] cf. []



