Licensed under
No source: this is an original work
The Text Encoding Initiative
But scholarly editions involve the creation of new writing as well as work on existing texts: editions usually include introductions and commentary in some form, and may extend to such things as analytical essays, catalogues of sources or witnesses, and bibliographies. The TEI
If you've chosen the TEI for your project, the choices don't end there; there are further questions about exactly how you use it. In normal use of the TEI it remains important to decide which of its components to use: choosing TEI doesn't mean choosing to include every possible element in your documents. The existence of the <date> element implies neither an obligation nor a recommendation to tag every date in a text; some TEI elements are required in certain contexts, but a great many of them are described as optional, and it is intended that their use be left to the scholar's judgment. Extra markup is costly, and it is essential that a project decide just which features need to be marked in order to serve its scholarly ends. It is tempting to add markup that has no specific intended use but might interest someone in the future, but only an especially well-funded project can afford this. And apart from the expense, it is worth considering just how useful the encoded information will be to other scholars who may see the phenomenon in question differently and would want to develop their own encoding. Personal names, for example, may seem at first a straightforward category that requires little extra time to tag; but scholars who have worked on encoding personal names have found them to be hard to define and delimit (see
Given a decision on features to be encoded, you will also want to choose among the many ways that the TEI DTD allows you to encode them: textual errors and corrections may be encoded using <sic>, <corr>, or <app>, for example. The work of encoding is simpler if such things don't need to be reconsidered every time the feature comes up. Some scholarly communities have developed their own guidelines for using the TEI Guidelines, in which they specify a preferred way for handling things they often see or that are distinctive to their materials; if there is such a group in your area of work it's a good idea to consider following their lead. (See for instance
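One way to keep such decisions from being reopened each time a feature comes up is to write the project's chosen element set down and check files against it. The following is only a minimal sketch under stated assumptions: the set `ALLOWED_ELEMENTS` and the function `elements_outside_policy` are hypothetical names invented for illustration, the sample document is made up, and element names are matched without any XML namespace.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical project policy (an assumption for illustration): errors and
# corrections are recorded with <sic> and <corr>, and <app> is not used.
ALLOWED_ELEMENTS = {"text", "body", "p", "sic", "corr"}


def elements_outside_policy(xml_text, allowed=ALLOWED_ELEMENTS):
    """Return a dict of element names not in the policy, with their counts."""
    root = ET.fromstring(xml_text)
    # Count every element in the document, including the root itself.
    counts = Counter(elem.tag for elem in root.iter())
    return {tag: n for tag, n in counts.items() if tag not in allowed}


# A made-up fragment that mixes two ways of recording textual problems.
sample = (
    "<text><body><p>A <sic>teh</sic> cat and "
    "<app><rdg>a</rdg></app> dog.</p></body></text>"
)
violations = elements_outside_policy(sample)
print(violations)
```

A check of this kind does not decide which encoding is right; it only reports where a file departs from what the project has already agreed on.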
The choice of elements needs to be based on their definitions in the TEI
<l> element is for a line of verse (which might be displayed on several typographical lines) and not for a single typographical line on a page; <add> and <supplied> sound very similar but are for different things (the first for additions present in the documents you're working with, the second for additions by editors and encoders). Getting such things wrong amounts to misdescribing the text. And if there isn't already a tag that can be used for what you need to describe, don't force an existing one into the role. Most projects will run into textual features that matter in their work but that aren't covered by the existing
All of these considerations have had to do with the form to be taken by the ultimate products of a project, the final encoded files. But there are reasons for not using TEI at particular points during the lifetime of a project, even if a TEI product is still the result.
The appropriate scholarly tools for the early exploratory stages of a project may be pen and paper, or chalk and a large blackboard, or a word processor; some will find that the precision and formality required for TEI-encoded texts are not helpful at a stage when you may be entertaining many conflicting ideas about what sort of information will be in your edition and how it will be structured. Some may also find it most productive to start by thinking about ways in which the edition will be presented to its readers, and not in terms of the information structures needed to achieve that. Experience has shown that electronic texts that are closely tied to one mode of presentation tend to be short-lived; but thinking about an actual presentation to readers is still an effective way of working out what an edition is going to do, and in a later stage the design may be adapted for TEI encoding. The TEI
Scholars new to XML may also find that devising a tagging system from scratch for their texts is instructive: you will understand a standard DTD in a different way after going through the intellectual labor of trying to create one. These explorations of an edition's form, or of its encoding system, can't go on for very long, since any text that is generated will typically be hard to convert to TEI form: they'll need to be seen as trials that can be thrown away.
During the main phase of work on a project, there may be reasons to consider creating texts in a form somewhat different from the final form they're destined to take. Some projects have found that the very generic names of some TEI elements are a minor problem: though the existing elements are appropriate for the purpose, it may still be easier to tell staff members to enter a <stanza> element rather than an <lg type="stanza"> element. This is especially likely if the generic range of the texts in question is circumscribed, so that one encounters a restricted set of features. Devising a customized DTD for document creation (possibly just using the standard TEI customization mechanism) and converting the documents to a more standard markup at a later stage is a reasonable approach, if the conversion can be readily automated. (At one time a version of this approach was often used in which no TEI markup was used directly, but instead very specific ad hoc markup tailored to particular texts was invented: a system in which %, *, #, and other nonalphabetic characters had special meaning and were later expanded into proper markup. But circumstances have changed enough that this isn't often a good choice; such ad hoc systems always run into problems expressing any but the simplest structures, and plenty of XML editors are now available that face no such limitation.)
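To suggest what a readily automated conversion of this kind might look like, here is a minimal sketch that rewrites a working-markup <stanza> element as <lg type="stanza">. It rests on several assumptions: the mapping table `WORKING_TO_TEI` and the function `convert_working_markup` are hypothetical names, the sample document is invented, and XML namespaces are ignored. A project with a real customization would more likely derive such a mapping from its own schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping from a project's working tags to standard TEI forms:
# working element name -> (TEI element name, attributes to add).
WORKING_TO_TEI = {
    "stanza": ("lg", {"type": "stanza"}),
}


def convert_working_markup(xml_text):
    """Rename working elements to their TEI equivalents and return the XML."""
    root = ET.fromstring(xml_text)
    for elem in root.iter():
        mapping = WORKING_TO_TEI.get(elem.tag)
        if mapping is not None:
            tei_name, attrs = mapping
            elem.tag = tei_name
            # Add the attributes that carry the meaning of the working tag.
            for name, value in attrs.items():
                elem.set(name, value)
    return ET.tostring(root, encoding="unicode")


# An invented fragment as a staff member might have keyed it.
sample = "<div><stanza><l>First line</l><l>Second line</l></stanza></div>"
converted = convert_working_markup(sample)
print(converted)
```

The conversion is mechanical here because each working tag corresponds to exactly one standard form; a working tag that could map to several TEI encodings depending on context would not be so easily automated.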
A project that is well along in its lifetime and was never TEI-based faces a difficult choice. Certainly there may be advantages to switching, both intellectual and political, if the TEI approach is appropriate; but switching is always time-consuming and costly, and will typically require or cause some changes in thinking about the project's editorial approach. Projects that get completed are more valuable to the scholarly community than unfinished projects with more perfect methodologies.
This chapter has assumed so far the appropriateness of the TEI approach for your editorial project; but it is possible that the approach does not fit, and in that case it should not be used. Two requirements of this approach can be especially problematic: first, you need to understand your texts; and second, you need to believe in the integrity and utility of selective transcriptions.
You need to understand your texts in order to translate them from stone or paper versions to digital versions. It is no doubt evident that transcribing anything written in a script you don't understand is hard to do well; but unfamiliar conventions of layout raise problems as well. (See, for example, Cloud on headings in
In order to use the TEI approach you also need to believe in transcription. It is impossible for a transcription to reproduce the original object; it is always a selection of features from that object: the words but not their size on the page or the depth of the chisel marks, major changes in type style but not variations in the ink's darkness from page to page or over time. Any such features that do seem essential for a particular transcription can be encoded; what's impossible is notating every observable feature. And it may be that the creation of a digital description of such features has little value for analysis: what you really want may just be the opportunity to see an image of the original (assuming that the different selection of features involved in imaging is more acceptable). There are two common cases in which a transcription might be regarded more as an index of words in page images than as a reasonable working representation of the text: works intended as mixtures of words and images, and very complex draft manuscripts in which the sequence of text or inscription is difficult to make out.
These two considerations about the appropriateness of the TEI approach apply to most systems of electronic transcription that an edition might consider: as scholarly editors we need to make specific claims about what the text is and communicate them clearly to others, and we are engaged in analyzing texts and creating new representations of them, not in creating indistinguishable replicas. But it is still a real question whether that is the right thing to do for any given project; it's essential to recognize that an editorial project must take a particular view of the texts in question and choose particular scholarly goals, and those decisions determine whether an edition based on transcription can be made.