An Introduction to TEI simplePrint
Lou Burnard
Martin Mueller
Sebastian Rahtz
James Cummings
Magdalena Turska
January 2017

Preface

This document is the formal specification for TEI simplePrint, an entry-level customization of the Text Encoding Initiative (TEI) Guidelines. It is intended to be generally useful to a wide range of encoders who need a standardized digital representation of many kinds of document.

Like every other TEI customization, TEI simplePrint was designed for use with a particular type of material. If the material you are planning to encode matches the following criteria, then TEI simplePrint is for you. If it does not, it may not be.

If your needs go beyond those summarized here, simplePrint may still be a good point of departure, and may be very useful as a basis for the creation of your own TEI customization. We do not, however, discuss the creation of a TEI customization in this document: the TEI website provides a number of links to tutorial material and tools which may assist in this process.

The present document is intended to be generally comprehensible and accessible, but does assume some knowledge of XML (the encoding language used by the TEI), and of the way it is used by the TEI. Further information on both these topics is available from many places, not least the TEI's own web site at http://www.tei-c.org.

The TEI simplePrint schema was first elaborated as a part of the TEI Simple project funded by the Andrew W. Mellon Foundation (2012-2014). The project sought to define a new ‘highly-constrained and prescriptive subset’ of the Text Encoding Initiative (TEI) Guidelines suited to the representation of early modern print materials, a formally-defined set of processing rules which permit modern web applications to easily present and analyze the encoded texts, mapping to other ontologies, and processes to describe the encoding status and richness of a TEI digital text. Its choice of elements reflected the practices followed in the encoding of large-scale literary archives, notably those produced by the Text Creation Partnership. Practice of other comparable archives such as the German Text Archive was also taken into account.

The most distinctive feature of TEI simplePrint is its use of the TEI Processing Model, which provides explicit and recommended options for the display or processing of every textual element. Programmers developing systems to handle texts encoded with TEI simplePrint do not have to look beyond this when building stylesheets or other components. This greatly reduces the complexity of developing applications that will work reliably and consistently for many users and across large corpora of documents.

The TEI simplePrint schema and the TEI Processing Model were first defined by a working group led by Martin Mueller (Northwestern University) and Sebastian Rahtz (Oxford University). Major contributions to the project were made by Magdalena Turska (Oxford University), James Cummings (Oxford University), and Brian Pytlik Zillig. The changes to the TEI scheme needed to support the TEI Processing Model were reviewed and approved by the TEI Technical Council for inclusion in release 3.0.0 of TEI P5 in February 2016. The present document was extensively revised and extended by Lou Burnard in July 2016 for submission to the TEI Technical Council.

1 A Short Example

We begin with a short example. How should we go about transferring into a computer a passage of prose, such as the start of the last chapter of Charlotte Brontë's novel Jane Eyre? We might start by simply copying what we see on the printed page, typing it in such a way that what appears on the screen looks as similar as possible, for example, by retaining the original line breaks, by introducing blanks to represent the layout of the original headings, page breaks, and paragraphs, and so forth. Of course, the possibilities are limited by the nature of the computer program we use to capture the text: it may not be possible for example to reflect accurately the typographic characteristics of our source with all such software. Some characters in the printed text (such as the accented letter a in faàl or the long dash) may not be available on the keyboard; some typographic distinctions (such as that between small capitals and full capitals) may not be readily accessible. Our first attempt tries to mimic the appearance of the former, and simply ignores the latter.

                                CHAPTER 38

READER, I married him. A quiet wedding we had: he and I, the par-
son and clerk, were alone present. When we got back from church, I
went into the kitchen of the manor-house, where Mary was cooking
the dinner, and John cleaning the knives, and I said --
  'Mary, I have been married to Mr Rochester this morning.' The
housekeeper and her husband were of that decent, phlegmatic
order of people, to whom one may at any time safely communicate a
remarkable piece of news without incurring the danger of having
one's ears pierced by some shrill ejaculation and subsequently stunned
by a torrent of wordy wonderment. Mary did look up, and she did
stare at me; the ladle with which she was basting a pair of chickens
roasting at the fire, did for some three minutes hang suspended in air,
and for the same space of time John's knives also had rest from the
polishing process; but Mary, bending again over the roast, said only --
   'Have you, miss? Well, for sure!'
   A short time after she pursued, 'I seed you go out with the master,
but I didn't know you were gone to church to be wed'; and she
basted away. John, when I turned to him, was grinning from ear to
ear.
   'I telled Mary how it would be,' he said: 'I knew what Mr Ed-
ward' (John was an old servant, and had known his master when he
was the cadet of the house, therefore he often gave him his Christian
name) -- 'I knew what Mr Edward would do; and I was certain he
would not wait long either: and he's done right, for aught I know. I
wish you joy, miss!' and he politely pulled his forelock.
   'Thank you, John. Mr Rochester told me to give you and Mary
this.'
   I put into his hand a five-pound note.  Without waiting to hear
more, I left the kitchen. In passing the door of that sanctum some time
after, I caught the words --
   'She'll happen do better for him nor ony o' t' grand ladies.' And
again, 'If she ben't one o' th' handsomest, she's noan faa\l, and varry
good-natured; and i' his een she's fair beautiful, onybody may see
that.'
   I wrote to Moor House and to Cambridge immediately, to say what
I had done: fully explaining also why I had thus acted. Diana and

                            474

                 JANE EYRE                      475

Mary approved the step unreservedly. Diana announced that she
would just give me time to get over the honeymoon, and then she
would come and see me.
   'She had better not wait till then, Jane,' said Mr Rochester, when I
read her letter to him; 'if she does, she will be too late, for our honey-
moon will shine our life long: its beams will only fade over your
grave or mine.'
   How St John received the news I don't know: he never answered
the letter in which I communicated it: yet six months after he wrote
to me, without, however, mentioning Mr Rochester's name or allud-
ing to my marriage. His letter was then calm, and though very serious,
kind. He has maintained a regular, though not very frequent correspond-
ence ever since: he hopes I am happy, and trusts I am not of those who
live without God in the world, and only mind earthly things.

      

This transcription suffers from a number of shortcomings:

We now present the same passage, as it might be encoded in TEI simplePrint. As we shall see, there are many ways in which this encoding could be extended, but as a minimum, the TEI approach allows us to represent the following distinctions in a standardized way:
  • Paragraph and chapter divisions are now marked explicitly by means of tags rather than implicitly by white space.
  • Apostrophes are retained, but the quotation marks indicating direct speech have been removed, and direct speech is now marked explicitly by means of a tag.
  • The accented letter and the long dash are accurately represented, using the appropriate Unicode character.
  • Page divisions have been marked with an empty pb tag; the page heading and running text have been suppressed.
  • The lineation of the original has also been suppressed and words broken by typographic accident at the end of a line have been re-assembled without comment.
  • For convenience of proof reading, a new line has been introduced at the start of each paragraph, but the indentation is removed.
<pb n="474"/>
<div type="chapter" n="38">
 <p>Reader, I married him. A quiet wedding we had: he and I, the parson and clerk, were
   alone present. When we got back from church, I went into the kitchen of the
   manor-house, where Mary was cooking the dinner, and John cleaning the knives, and I
   said —</p>
 <p>
  <q>Mary, I have been married to Mr Rochester this morning.</q> The housekeeper and
   her husband were of that decent, phlegmatic order of people, to whom one may at any
   time safely communicate a remarkable piece of news without incurring the danger of
   having one's ears pierced by some shrill ejaculation and subsequently stunned by a
   torrent of wordy wonderment. Mary did look up, and she did stare at me; the ladle
   with which she was basting a pair of chickens roasting at the fire, did for some
   three minutes hang suspended in air, and for the same space of time John's knives
   also had rest from the polishing process; but Mary, bending again over the roast,
   said only —</p>
 <p>
  <q>Have you, miss? Well, for sure!</q>
 </p>
 <p>A short time after she pursued, <q>I seed you go out with the master, but I didn't
     know you were gone to church to be wed</q>; and she basted away. John, when I
   turned to him, was grinning from ear to ear. <q>I telled Mary how it would be,</q>
   he said: <q>I knew what Mr Edward</q> (John was an old servant, and had known his
   master when he was the cadet of the house, therefore he often gave him his Christian
   name) — <q>I knew what Mr Edward would do; and I was certain he would not wait long
     either: and he's done right, for aught I know. I wish you joy, miss!</q> and he
   politely pulled his forelock.</p>
 <p>
  <q>Thank you, John. Mr Rochester told me to give you and Mary this.</q>
 </p>
 <p>I put into his hand a five-pound note. Without waiting to hear more, I left the
   kitchen. In passing the door of that sanctum some time after, I caught the words
   —</p>
 <p>
  <q>She'll happen do better for him nor ony o' t' grand ladies.</q> And again, <q>If
     she ben't one o' th' handsomest, she's noan faàl, and varry good-natured; and i'
     his een she's fair beautiful, onybody may see that.</q>
 </p>
 <p>I wrote to Moor House and to Cambridge immediately, to say what I had done: fully
   explaining also why I had thus acted. Diana and <pb n="475"/> Mary approved the step
   unreservedly. Diana announced that she would just give me time to get over the
   honeymoon, and then she would come and see me.</p>
 <p>
  <q>She had better not wait till then, Jane,</q> said Mr Rochester, when I read her
   letter to him; <q>if she does, she will be too late, for our honeymoon will shine
     our life long: its beams will only fade over your grave or mine.</q>
 </p>
 <p>How St John received the news I don't know: he never answered the letter in which I
   communicated it: yet six months after he wrote to me, without, however, mentioning
   Mr Rochester's name or alluding to my marriage. His letter was then calm, and though
   very serious, kind. He has maintained a regular, though not very frequent
   correspondence ever since: he hopes I am happy, and trusts I am not of those who
   live without God in the world, and only mind earthly things.</p>
</div>

This encoding is expressed in TEI XML, a very widely used and standardized method of representing information about a document within the document itself. The transcribed words are complemented by special flags within angle brackets, called tags, which both characterise and mark the beginning and end of a string of characters. For example, each paragraph is marked by a tag <p> at its start, and a corresponding </p> at its end. We don't elaborate further on the syntax of TEI XML here. 1

Aside from its syntax, it is important to note that this particular encoding represents a set of choices or priorities. We have chosen to prioritize and simplify the representation of the words of the text over the representation of the typographic layout associated with them in this source document. This makes it easier for a computer to answer questions about the words in the document than about its typesetting, reflecting our research priorities. This priority also leads us to suppress end-of-line hyphenation. Conceivably Brontë (or her printer) intended the word ‘honeymoon’ to appear as ‘honey-moon’ on its second appearance, though this seems unlikely: our decision to focus on Brontë's text, rather than on the printing of it in this particular edition, makes it impossible to be certain. Similarly, our decision makes it impossible to use this transcription as a means of statistically analysing hyphenation practice. An encoding makes explicit all and only those textual features of importance to the encoder.

It is not difficult to think of ways in which the encoding of even this short passage might readily be extended to address other research priorities. For example:

In the remainder of this document, we present a number of TEI-recommended ways of supporting these and other encoding requirements. These ways generally involve the application of specific TEI XML elements, selected from the full range of possibilities documented in the TEI Guidelines. Like every other TEI project, TEI Simple proposes a view of the TEI Guidelines. This document defines and documents that view.

2 The Structure of a TEI simplePrint Document

A TEI-conformant text contains (a) a TEI header (marked up as a teiHeader element) and (b) one or more representations of a text. These representations may be of three kinds: a transcribed text, marked up as a text element; a collection of digital images representing the text, marked up using a facsimile element; or a literal transcription of one or more documents instantiating the text, marked up using the <sourceDoc> element.

These elements are combined together to form a single TEI element, which must be declared within the TEI namespace, and therefore usually takes the form <TEI xmlns="http://www.tei-c.org/ns/1.0"> 2.

Some aspects of the TEI header are described in more detail in section 15 The Electronic Title Page. In what follows, we will focus chiefly on the use of the text element, though we describe one way of using the facsimile element in combination with it or alone in section 14 Encoding a Digital Facsimile. We do not consider the <sourceDoc> element further, since it is mainly used in very specialised applications for which TEI simplePrint would not be appropriate.

A text may be unitary (a single work) or composite (a collection of single works, such as an anthology). In either case, the text may have optional front or back matter such as title pages, prefaces, appendixes etc. We use the term body for whatever comes between these in the source document. We discuss various kinds of composite text in section 12 Composite and Floating Texts below.

A unitary text will be encoded using an overall structure like this:
<TEI xmlns="http://www.tei-c.org/ns/1.0">
 <teiHeader>
<!-- [ TEI Header information ] -->
 </teiHeader>
 <text>
  <front>
<!-- [ front matter ... ] -->
  </front>
  <body>
<!-- [ body of text ... ] -->
  </body>
  <back>
<!-- [ back matter ... ] -->
  </back>
 </text>
</TEI>

In each of the following sections we include a short list of the TEI elements under discussion, along with a brief description, and in most cases an example of how they are used. Throughout the text, element names are linked to their detailed reference documentation, as given in the TEI Guidelines. Note that most of the examples provided by the reference documentation, and all of the links, are not specific to TEI simplePrint.

For example, here are the elements discussed so far:

3 Encoding the Body

As indicated above, a unitary text is encoded by means of a text element, which may contain the following elements:

Elements specific to front and back matter are described below in section 13 Front and Back Matter. In this section we discuss the elements making up the body of a text. A text must always have a body.

3.1 Text Division Elements and Global Attributes

The body of a prose text may be just a series of paragraphs or similar blocks of text, or these may be grouped together into chapters, sections, subsections, etc. The div element is used to represent any such grouping of blocks.

  • div (text division) contains a subdivision of the front, body, or back of a text.
    type [att.typed] characterizes the element in some sense, using any convenient classification scheme or typology.

The type attribute on the div element may be used to supply a conventional name for a category of text division, in order to distinguish one kind of division from another. Typical values might be book, chapter, section, part, poem, song, etc. TEI simplePrint does not constrain the range of values that may be used here.

A div element may itself contain further, nested, divs, thus mimicking the traditional structure of a book, which can be decomposed hierarchically into units such as parts, containing chapters, containing sections, and so on. TEI texts in general conform to this simple hierarchic model.

Here as elsewhere the xml:id attribute may be used to supply a unique identifier for the division, which may be used for cross references or other links to it, such as a commentary, as further discussed in section 3.7 Cross References and Links. It is good practice to provide an xml:id attribute for every major structural unit in a text, and to derive its values in some systematic way, for example by appending a section number to a short code for the title of the work in question, as in the examples below.

The n attribute may be used to supply (additionally or alternatively) a short mnemonic name or number for a division, or any other element. If a conventional form of reference or abbreviation for the parts of a work already exists (such as the book/chapter/verse pattern of Biblical citations), the n attribute is the place to record it; unlike the identifier supplied by the xml:id attribute, it does not need to be unique.

The xml:lang attribute may be used to specify the language of the division. Languages are identified by an internationally defined code, as further discussed in section 3.5.3 Foreign Words or Expressions below.

The rendition attribute may be used to supply information about the rendition (appearance) of a division, or any other element, as further discussed in section 3.5 Marking Highlighted Phrases below. Note that this attribute is used to describe the appearance of the source text, rather than the appearance of any intended output when the encoded text is displayed. The two may of course be similar, or identical, but the TEI does not assume or require this.

These four attributes, xml:id, n, xml:lang, and rendition are so widely useful that they are allowed on any element in any TEI schema: they are called global attributes. Other attributes defined in the TEI simplePrint schema are discussed in section 3.7.3 Special Kinds of Linking.
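
To illustrate, here is a minimal sketch of a single division carrying all four global attributes alongside type. The identifier, number, and heading are invented for illustration; fr is the standard code for French, and the value #italic assumes that a rendition definition with the identifier italic has been supplied elsewhere in the document:
<div xml:id="EX02" n="2" type="chapter" xml:lang="fr" rendition="#italic">
 <head>Chapitre deux</head>
<!-- ... -->
</div>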

As noted above, the value of every xml:id attribute must be unique within a document. One simple way of ensuring this is to make it reflect the hierarchic structure of the document. For example, Smith's Wealth of Nations as first published consists of five books, each of which is divided into chapters, while some chapters are further subdivided into parts. We might define xml:id values for this structure as follows:
<body>
 <div xml:id="WN1" n="I" type="book">
  <div xml:id="WN101" n="I.1" type="chapter">
<!-- ... -->
  </div>
  <div xml:id="WN102" n="I.2" type="chapter">
<!-- ... -->
  </div>
<!-- ... -->
  <div xml:id="WN110" n="I.10" type="chapter">
   <div xml:id="WN1101" n="I.10.1" type="part">
<!-- ... -->
   </div>
   <div xml:id="WN1102" n="I.10.2" type="part">
<!-- ... -->
   </div>
  </div>
<!-- ... -->
 </div>
 <div xml:id="WN2" n="II" type="book">
<!-- ... -->
 </div>
</body>
A different numbering scheme may be used for xml:id and n attributes: this is often useful where a canonical reference scheme is used which does not tally with the structure of the work. For example, in a novel divided into books each containing chapters, where the chapters are numbered sequentially through the whole work, rather than within each book, one might use a scheme such as the following:
<body>
 <div xml:id="TS01" n="1" type="volume">
  <div xml:id="TS011" n="1" type="chapter">
<!-- ... -->
  </div>
  <div xml:id="TS012" n="2" type="chapter">
<!-- ... -->
  </div>
 </div>
 <div xml:id="TS02" n="2" type="volume">
  <div xml:id="TS021" n="3" type="chapter">
<!-- ... -->
  </div>
  <div xml:id="TS022" n="4" type="chapter">
<!-- ... -->
  </div>
 </div>
</body>
Here the work has two volumes, each containing two chapters. The chapters are numbered conventionally 1 to 4, but the xml:id values specified allow them to be regarded additionally as if they were numbered 1.1, 1.2, 2.1, 2.2.

3.2 Headings and Closings

Every div may have a title or heading at its start, and (less commonly) a trailer such as ‘End of Chapter 1’ at its end. The following elements may be used to transcribe them:

  • head (heading) contains any type of heading, for example the title of a section, or the heading of a list, glossary, manuscript description, etc.
  • trailer contains a closing title or footer appearing at the end of a division of a text.

Some other elements which may be found at the beginning or ending of text divisions are discussed below in section 13.1.2 Prefatory Matter.

Whether or not headings and trailers are included in a transcription is a matter for the individual transcriber to decide. Where a heading is completely regular (for example ‘Chapter 1’) or may be automatically constructed from attribute values (e.g. <div type="chapter" n="1">), it may be omitted; where it contains otherwise unrecoverable text it should always be included. For example, the start of Hardy's Under the Greenwood Tree might be encoded as follows:
<div xml:id="UGT1" n="Winter" type="part">
 <div xml:id="UGT101" n="1" type="chapter">
  <head>Mellstock-Lane</head>
  <p>To dwellers in a wood almost every species of tree ... </p>
 </div>
</div>
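
Where a division also closes with a trailer, it is transcribed as the last component of the div. The following is only a sketch: the trailer text is invented for illustration and is not taken from Hardy's novel.
<div n="1" type="chapter">
 <head>Mellstock-Lane</head>
 <p>To dwellers in a wood almost every species of tree ... </p>
 <trailer>End of Chapter 1</trailer>
</div>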

3.3 Textual Components

In prose texts such as the Brontë example above, the divisions are generally composed of paragraphs, represented as p elements, though in some circumstances it may be preferred to use the ‘anonymous block’ element ab. In poetic or dramatic texts different elements are used, representing stanzas and verse lines in the first case, and individual speeches or stage directions in the second:

  • p (paragraph) marks paragraphs in prose.
  • ab (anonymous block) contains any component-level unit of text, acting as a container for phrase or inter level elements analogous to, but without the same constraints as, a paragraph.
  • l (verse line) contains a single, possibly incomplete, line of verse.
  • lg (line group) contains one or more verse lines functioning as a formal unit, e.g. a stanza, refrain, verse paragraph, etc.
  • sp (speech) contains an individual speech in a performance text, or a passage presented as such in a prose or verse text.
  • speaker contains a specialized form of heading or label, giving the name of one or more speakers in a dramatic text or fragment.
  • stage (stage direction) contains any kind of stage direction within a dramatic text or fragment.

We discuss each of these kinds of component separately below.

3.3.1 Verse

Here, for example, is the start of a poetic text in which verse lines and stanzas are tagged:
<lg n="I">
 <l>I Sing the progresse of a deathlesse soule,</l>
 <l>Whom Fate, with God made, but doth not controule,</l>
<!-- ... -->
 <l>A worke t'out weare Seths pillars, bricke and stone,</l>
 <l>And (holy writs excepted) made to yeeld to none,</l>
</lg>

Note that the l element marks verse lines, not typographic lines: as elsewhere the original lineation of the source text is not therefore preserved by this encoding. The lb element described in section 3.4 Page and Line Numbers might additionally be used to mark typographic lines if so desired.

In a poetic text it may also be considered useful to identify the rhymes, for which the following element may be used:
  • rhyme marks the rhyming part of a metrical line.
    label provides a label (usually a single letter) to identify which part of a rhyme scheme this rhyming string instantiates.
The following example shows how this element might be used both to identify rhyming words or word parts and to assign each rhyme to a part of a rhyming pattern by means of its label attribute. The rhyming pattern here is specified by the rhyme attribute supplied on the lg representing the stanza within which the pattern operates:
<lg rhyme="AABCCBBA">
 <l>The sunlight on the <rhyme label="A">garden</rhyme>
 </l>
 <l>
  <rhyme label="A">Harden</rhyme>s and grows <rhyme label="B">cold</rhyme>,</l>
 <l>We cannot cage the <rhyme label="C">minute</rhyme>
 </l>
 <l>Wi<rhyme label="C">thin it</rhyme>s nets of <rhyme label="B">gold</rhyme>
 </l>
 <l>When all is <rhyme label="B">told</rhyme>
 </l>
 <l>We cannot beg for <rhyme label="A">pardon</rhyme>.</l>
</lg>
The rhyme attribute may be used independently of the rhyme element, or in combination with it, as above.
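As a sketch of the independent usage, the rhyme attribute alone can record the pattern of a ballad stanza without tagging the individual rhyming words. The pattern ABCB given here is our own reading of the stanza (only the second and fourth lines rhyme), not something stated in a source edition:
<lg type="stanza" rhyme="ABCB">
 <l>But why drives on that ship so fast</l>
 <l>Withouten wave or wind?</l>
 <l>The air is cut away before.</l>
 <l>And closes from behind.</l>
</lg>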

3.3.2 Drama

A dramatic text contains speeches, which may be in prose or verse, and will also contain stage directions. The sp element is used to represent each identified speech. It contains an optional speaker indication, marked with the speaker element, which can be followed by one or more l or p elements, depending on whether the speech is considered to be in prose or in verse. Stage directions, whether within or between speeches, are marked using the stage element.

For example:
<sp>
 <speaker>Vladimir</speaker>
 <p>Pull on your trousers.</p>
</sp>
<sp>
 <speaker>Estragon</speaker>
 <p>You want me to pull off my trousers?</p>
</sp>
<sp>
 <speaker>Vladimir</speaker>
 <p>Pull <hi>on</hi> your trousers.</p>
</sp>
<sp>
 <speaker>Estragon</speaker>
 <p>
  <stage>(realizing his trousers are down)</stage>. True</p>
</sp>
<stage>He pulls up his trousers</stage>
<sp>
 <speaker>Vladimir</speaker>
 <p>Well? Shall we go?</p>
</sp>
<sp>
 <speaker>Estragon</speaker>
 <p>Yes, let's go.</p>
</sp>
<stage>They do not move.</stage>
In a verse drama, it is quite common to find that verse lines are split between speakers. The easiest way of encoding this is to use the part attribute to indicate that the lines so fragmented are incomplete:
<div type="Act" n="I">
 <head>ACT I</head>
 <div type="Scene" n="1">
  <head>SCENE I</head>
  <stage rendition="#italic">Enter Barnardo and Francisco, two Sentinels, at
     several doors</stage>
  <sp>
   <speaker>Barn</speaker>
   <l part="Y">Who's there?</l>
  </sp>
  <sp>
   <speaker>Fran</speaker>
   <l>Nay, answer me. Stand and unfold yourself.</l>
  </sp>
  <sp>
   <speaker>Barn</speaker>
   <l part="I">Long live the King!</l>
  </sp>
  <sp>
   <speaker>Fran</speaker>
   <l part="M">Barnardo?</l>
  </sp>
  <sp>
   <speaker>Barn</speaker>
   <l part="F">He.</l>
  </sp>
  <sp>
   <speaker>Fran</speaker>
   <l>You come most carefully upon your hour.</l>
  </sp>
<!-- ... -->
 </div>
</div>
The value of the part attribute may indicate just that the element bearing it is fragmented in some (unspecified) respect rather than being a complete verse line (part="Y"); alternatively it may indicate whether this is an initial (I), medial (M), or final (F) fragment.
The same mechanism may be applied to stanzas which are divided between two speakers:
<div>
 <sp>
  <speaker>First voice</speaker>
  <lg type="stanza" part="I">
   <l>But why drives on that ship so fast</l>
   <l>Withouten wave or wind?</l>
  </lg>
 </sp>
 <sp>
  <speaker>Second Voice</speaker>
  <lg type="stanza" part="F">
   <l>The air is cut away before.</l>
   <l>And closes from behind.</l>
  </lg>
 </sp>
<!-- ... -->
</div>
The sp element can also be used for dialogue presented in a prose work as if it were drama, as in the next example, which also demonstrates the use of the who attribute to bear a code identifying the speaker of the piece of dialogue concerned:
<div>
 <sp who="#OPI">
  <speaker>The reverend Doctor Opimian</speaker>
  <p>I do not think I have named a single unpresentable fish.</p>
 </sp>
 <sp who="#GRM">
  <speaker>Mr Gryll</speaker>
  <p>Bream, Doctor: there is not much to be said for bream.</p>
 </sp>
 <sp who="#OPI">
  <speaker>The Reverend Doctor Opimian</speaker>
  <p>On the contrary, sir, I think there is much to be said for him. In the first
     place....</p>
  <p>Fish, Miss Gryll -- I could discourse to you on fish by the hour: but for the
     present I will forbear.</p>
 </sp>
</div>
Here the who attribute values (#OPI etc.) are links, pointing to items in a list of the characters in the novel. In the case of a play, this list of characters might appear in the original source as a cast list or dramatis personae, which might be marked up using the castList element described in section 13.2.2 Specialized Front and Back Matter below. Such a list would not, of course, be an appropriate place to provide descriptive information about each character, much of which does not appear in the original source. Instead a particDesc (participant description) element should be provided in the TEI header, as further discussed in section 15.3 The Profile Description below.
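
As a rough sketch, the targets of the #OPI and #GRM pointers might be declared in the header along the following lines. The listPerson, person, and persName elements are not introduced in this section: they are assumed here from the full TEI Guidelines, so check that your schema permits them before relying on this structure.
<particDesc>
 <listPerson>
  <person xml:id="OPI">
   <persName>The Reverend Doctor Opimian</persName>
  </person>
  <person xml:id="GRM">
   <persName>Mr Gryll</persName>
  </person>
 </listPerson>
</particDesc>
Each who value is simply the xml:id of the corresponding person, prefixed with a hash sign.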

3.3.3 Other Kinds of Text Block

As mentioned above, the ab element may also be used in preference to the p element. It should be used for blocks of text which are not clearly paragraphs, verse lines, or dramatic speeches. Typical examples include the canonical verses of the Bible, and the textual blocks of other ancient documents which predate the invention of the paragraph, such as Greek inscriptions or Egyptian hieroglyphs. The element is also useful as a means of encoding more specialized kinds of textual block, such as the question and answer structure of a catechism, or the highly formalized substructure of a legal document (if div is not considered appropriate for these). In more modern documents, it can be used to encode semi-organized or fragmentary materials such as an artist's notebook or work in progress; or to faithfully capture the substructure of a file produced by an OCR system.
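
For instance, the opening verses of Genesis in the King James version might be sketched as follows, with the n attribute recording the conventional chapter and verse numbers. The choice of type value and the decision to number verses within the chapter are assumptions made for illustration only:
<div type="chapter" n="1">
 <ab n="1">In the beginning God created the heaven and the earth.</ab>
 <ab n="2">And the earth was without form, and void; and darkness was upon the face of
   the deep. And the Spirit of God moved upon the face of the waters.</ab>
<!-- ... -->
</div>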

3.4 Page and Line Numbers

Page and line breaks etc. may be marked with the following elements:

  • pb (page beginning) marks the beginning of a new page in a paginated document.
  • lb (line beginning) marks the beginning of a new (typographic) line in some edition or version of a text.
  • cb (column beginning) marks the beginning of a new column of a text on a multi-column page.
  • milestone (milestone) marks a boundary point separating any kind of section of a text, typically but not necessarily indicating a point at which some part of a standard reference system changes, where the change is not represented by a structural element.
  • fw (forme work) contains a running head (e.g. a header, footer), catchword, or similar material appearing on the current page.

The pb, lb, and cb elements are special cases of a general class of elements known as milestones because they mark reference points within a text. The generic milestone element can mark any kind of reference point: for example, a column break, the start of a new kind of section not otherwise tagged, a change of author or style, or in general any significant change in the text not enclosed by an XML element. Unlike other elements, milestone elements do not enclose a piece of text and make an assertion about it; instead they indicate a point in the text where something changes, as indicated by a change in the values of the milestone's attributes unit, which indicates the ‘something’ concerned, and n which indicates the new value.

The pb, lb, and cb elements are shortcuts or syntactic sugar for <milestone unit="page"/>, <milestone unit="line"/>, and <milestone unit="column"/> respectively.
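
The equivalence described above can be sketched as follows. The page number is carried over from the examples in this section; the unit value "section" and its n value in the last line are illustrative assumptions only, not values prescribed by the schema.

```xml
<!-- These two encodings mark the same point: the start of page 13 -->
<pb n="13"/>
<milestone unit="page" n="13"/>

<!-- A generic milestone for a boundary that has no dedicated element;
     unit="section" and n="2" are hypothetical values -->
<milestone unit="section" n="2"/>
```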

When working from a paginated original, it is often useful to record its pagination, whether to simplify later proof-reading, or to align the transcribed text with a set of page images, as further discussed below.

Because pb and other milestone elements are empty, they may be placed freely within or between other elements. However, it is recommended practice always to put them at the beginning of whatever unit it is that their presence implies, and not to nest them within elements contained by that unit. In the following example, for instance, a page break occurs between two lines of a poem:
<l>Mary had a little lamb</l>
<pb n="13"/>
<l>Its fleece was white as snow</l>
The pb element should be placed ahead of all the text encoded on the 13th page. Contrast this with the following less accurate encoding:
<l>Mary had a little lamb</l>
<l>
 <pb n="13"/>Its fleece was white as snow
</l>
This is less accurate because it implies that the second verse line actually begins before the page break.

Similar considerations apply to line breaks (lb), though these are less frequently considered useful when encoding modern printed textual sources. When transcribing manuscripts or early printed books, however, it is often helpful to retain them in an encoding, if only to facilitate alignment of transcription and original. Like pb, the lb element should appear before the text of the line whose start it signals.
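
As a minimal sketch of this placement rule, each lb below precedes the text of the line it introduces. The paragraph content is placeholder text invented for illustration, and the use of the n attribute to number the lines is an assumption about local practice rather than a requirement.

```xml
<p>
 <lb n="1"/>text of the first typographic line of the source
 <lb n="2"/>text of the second typographic line of the source
</p>
```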

If features such as pagination or lineation are marked for more than one edition, the edition in question may be specified by the ed attribute. For example, in the following passage we indicate where the page breaks occur in two different editions (ED1 and ED2):
<p>I wrote to Moor House and to Cambridge immediately, to say what I had done:
fully explaining also why I had thus acted. Diana and <pb ed="ED1" n="475"/> Mary
approved the step unreservedly. Diana announced that she would <pb ed="ED2" n="485"/>just give me time to get over the honeymoon, and then she would come and see
me.</p>
When transcribing from a paginated source, the encoder must decide whether to suppress such features as running titles, page signatures, catch words etc., to replace them by a simplified representation using the pb element, perhaps using the n attribute to preserve some of the information, or to preserve them entirely using the fw element. The last strategy is appropriate in encodings which aim to retain as much information as possible about the original typography; it will however make the processing of the source for other purposes more complex, as in the following example:
<l>He also fix'd the wandering
QUEEN OF NIGHT,</l>
<fw type="sig">Ii 2</fw>
<fw type="catch">Whether</fw>
<pb n="244"/>
<l>Whether she wanes into a scanty orb</l>...
<!-- Thomson, Seasons, 1730-->
The pb element is also used to align parts of a transcription with a digital image of the page concerned. This may be done in a very simple but inflexible way by using the facs attribute to point to each page image concerned:
<p>I wrote to Moor House and to Cambridge
immediately, to say what I had done: fully explaining also why I had thus acted.
Diana and <pb ed="ED1" n="475" facs="ed1p475.png"/> Mary approved the step
unreservedly... </p>
The facs attribute can supply (as here) a filename, or any other form of URI, if for example the page image is stored remotely. One drawback of this simplistic approach is that there must be exactly one image file per page of text. It is not therefore suitable in the case where the available page images represent double page spreads, or where there are multiple images of the same page (for example at different resolutions).

A more powerful approach, discussed in section 14 Encoding a Digital Facsimile below, is to use the facsimile element to define the organisation of the set of images representing the text, and then use the facs attribute to point to individual components of that representation.

3.5 Marking Highlighted Phrases

3.5.1 Changes of Typeface, etc.

Highlighted words or phrases are those made visibly different from the rest of the text, typically by a change of type font, handwriting style, ink colour etc., which is intended to draw the reader's attention to some associated change.

The global rendition attribute can be attached to any element, and used wherever necessary to specify details of the highlighting used for it in the source. For example, a heading rendered in bold might be tagged <head rendition="simple:bold">, and one in italic <head rendition="simple:italic">.

The values used for the rendition attribute point to definitions provided for the formatting concerned. These definitions are typically provided by a rendition element in the document's header, as further discussed in section 15.2.3 Tagging Declaration.

It is not always possible or desirable to interpret the reasons for such changes of rendering in a text. In such cases, the element hi may be used to mark a sequence of highlighted text without making any claim as to its status.

  • hi (highlighted) marks a word or phrase as graphically distinct from the surrounding text, for reasons concerning which no claim is made.
In the following example, the use of a distinct typeface for the subheading and for the included name are recorded but not interpreted:
<p>
 <hi rendition="simple:blackletter">And
   this Indenture further witnesseth</hi> that the said <hi rendition="simple:italic">Walter Shandy</hi>, merchant, in consideration of the
said intended marriage ...
</p>

Alternatively, where the cause for the highlighting can be identified with confidence, a number of other, more specific, elements are available.

  • foreign (foreign) identifies a word or phrase as belonging to some language other than that of the surrounding text.
  • label (label) contains any label or heading used to identify part of a text, typically but not exclusively in a list or glossary.
  • title (title) contains a title for any kind of work.

Some features (notably quotations, titles, and foreign words) may be found in a text either marked by highlighting, or with quotation marks. In either case, the element q (as discussed in the following section) should be used. Again, the global rendition attribute can be used to record details of the highlighting used in the source if this is thought useful.

As an example of the elements defined here, consider the following sentence: On the one hand the Nibelungenlied is associated with the new rise of romance of twelfth-century France, the romans d'antiquité, the romances of Chrétien de Troyes, and the German adaptations of these works by Heinrich van Veldeke, Hartmann von Aue, and Wolfram von Eschenbach. Interpreting the role of the highlighting, the sentence might be encoded as follows:
<p>On the one hand the
<title>Nibelungenlied</title> is associated with the new rise of romance of
twelfth-century France, the <foreign>romans d'antiquité</foreign>, the romances of
Chrétien de Troyes, ...</p>
Describing only the appearance of the original, it might be encoded like this:
<p>On the one hand the <hi rendition="simple:italic">Nibelungenlied</hi> is associated with the new rise of
romance of twelfth-century France, the <hi rendition="simple:italic">romans
   d'antiquité</hi>, the romances of Chrétien de Troyes, ...</p>

3.5.2 Quotations and Related Features

Like changes of typeface, quotation marks are conventionally used to denote several different features within a text, of which the most frequent is quotation, though many other features are possible. The full TEI Guidelines provide additional elements such as <mentioned> or <said> to distinguish some of these features, but these more specialised elements are not included in TEI simplePrint. Instead, TEI simplePrint uses the quote element for quotation only, and the q element for all other material found within quotation marks in the text.

  • q (quoted) contains material which is distinguished from the surrounding text using quotation marks or a similar method, for any one of a variety of reasons including, but not limited to: direct speech or thought, technical terms or jargon, authorial distance, quotations from elsewhere, and passages that are mentioned but not used.
  • quote (quotation) contains a phrase or passage attributed by the narrator or author to some agency external to the text.
Here is a simple example of a quotation:
<p>Few dictionary makers are likely to
forget Dr. Johnson's description of the lexicographer as <quote>a harmless
   drudge.</quote>
</p>

As elsewhere, the way that a citation or quotation was printed (for example, in-line or set off as a display or block quotation), may be represented using the rendition attribute. This may also be used to indicate the kind of quotation marks used.

Direct speech interrupted by a narrator can be represented simply by ending the q element and beginning it again after the interruption, as in the following example:
<p>
 <q>Who-e debel
   you?</q> — he at last said — <q>you no speak-e, damme, I kill-e.</q> And so
saying, the lighted tomahawk began flourishing about me in the dark.
</p>
If it is important to convey the idea that the two q elements together make up a single speech, the linking attributes next and prev may be used, as described in section 3.7.3 Special Kinds of Linking.
Direct speech may be accompanied by a reference to the source or speaker, using the who attribute, whether or not this is explicit in the text, as in the following example:
<q who="#Wilson">Spaulding, he came down into the office just this day eight weeks with this very
paper in his hand, and he says:—<q who="#Spaulding">I wish to the Lord, Mr.
   Wilson, that I was a red-headed man.</q>
</q>
This example also demonstrates how quotations may be embedded within other quotations: one speaker (Wilson) quotes another speaker (Spaulding).

The creator of the electronic text must decide whether quotation marks are replaced by the tags or whether the tags are added and the quotation marks kept. If the quotation marks are removed from the text, the rendition attribute may be used to record the way in which they were rendered in the copy text.

3.5.3 Foreign Words or Expressions

Words, phrases, or longer stretches of text that are not in the main language of the text may be tagged as such in one of two ways. The global xml:lang attribute may be attached to any element to show that it uses some other language than that of the surrounding text. Where there is no applicable element, the element foreign may be used, again using the xml:lang attribute. For example:
<p>John has real <foreign xml:lang="fr">savoir-faire</foreign>.</p>
<p>Have you read <title xml:lang="de">Die Dreigroschenoper</title>?</p>

As these examples show, the foreign element should not be used to tag foreign words if some other more specific element, such as title or div, applies.

The value of the xml:lang attribute on an element applies hierarchically to everything contained by that element, unless overridden:

<div xml:lang="la">
 <p>Pars haec Latine composita est.</p>
 <p xml:lang="en">Except that this sentence is in English.</p>
 <p>Vita brevis, ars longa.</p>
</div>

Here we specify that the whole div element uses the language with the coded identifier la, i.e. Latin. Since it is contained by that div, there is no need to supply this information again for the first p element. The second p element, however, overrides this value, and indicates that its content is in English (the language with identifier en). The third p element is again in Latin.

The codes used to identify languages, supplied on the xml:lang attribute, are defined by an international standard3, as further explained in the relevant section of the TEI Guidelines. Some simple example codes for a few languages are given here:

zh    Chinese           grc   Ancient Greek
en    English           el    Greek
enm   Middle English    ja    Japanese
fr    French            la    Latin
de    German            sa    Sanskrit

3.6 Notes

A note is any additional comment found in a text, marked in some way as being out of the main textual stream. A note is always attached to some part of the text, implicitly or explicitly: we call this its target, or its point of attachment. The element note should be used to mark any kind of note whether it appears as a separate block of text in the main text area, at the foot of the page, at the end of the chapter or volume, in the margin, or in some other place.

  • note (note) contains a note or annotation.

Notes may be in a different hand or typeface, may be authorial or editorial, and may have been added later. The attributes type and resp can be used to distinguish between different kinds of notes or identify their authors.
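
A minimal sketch of how these attributes might be combined is given here. The type values ("editorial", "authorial") and the identifier ED are hypothetical choices made for illustration; a project would define and document its own.

```xml
<!-- resp points to an element, documented elsewhere, with the identifier ED -->
<note type="editorial" resp="#ED">a comment supplied by a later editor</note>
<note type="authorial">a comment by the author of the text</note>
```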

In a printed or written text, the point of attachment for a note is typically represented by a siglum such as an alphanumerical or other character, often in superscripted form. When encoding such a text, it is conventional to replace this siglum by a note element containing the annotation itself, as in the following example:
<p>...some text <note xml:id="n6">a note about some text</note> .... </p>
An alternative approach is to encode the point of attachment wherever it appears in the text, using for example the ref element discussed in the next section, and to place the note element anywhere convenient. The two can then be associated by using the target attribute on the ref element to point to the note element, as in the following example, in which the superscripted ‘7’ indicating the point of attachment has been retained as part of the encoding:
<p>...some text <ref target="#n7"
  rendition="simple:superscript">
7</ref> .... <note xml:id="n7">a note about some text</note>
</p>

It may however be problematic to determine the precise position of the point of attachment, particularly in the case of marginal notes. A marginal note may also be hard to distinguish from a label or subheading which introduces the text with which it is associated. Where the purpose of the note is clearly to label the associated text, rather than to comment on it, the element label may be preferable. Where it is clearly a subheading attached to a distinct subdivision, it may be preferable to start a new element div and encode the subheading as a head. Note however that a head cannot be inserted anywhere except at the beginning of a div. And where (as in some Early Modern English plays) marginal annotation is systematically used to identify speakers, it may be better to represent these using the speaker element introduced above. In cases of doubt, the encoder should decide on a clear policy and preferably document it for the use of others.

3.7 Cross References and Links

Any kind of cross reference or link found at one point in a text which points to another part of the same or another document may be encoded using the ref element discussed in this section. Implicit links (such as the association between two parallel texts, or that between a text and its interpretation) may be encoded using the linking attributes discussed in section 3.7.3 Special Kinds of Linking.

3.7.1 Simple Cross References

  • ref (reference) defines a reference to another location, possibly modified by additional text or comment.

Usually, the presence of a cross-reference or link will be indicated by some text or symbol in the source being encoded, which will then become the content of the ref element. Occasionally, however, and frequently in the case of a born digital document, the exact form and appearance of the cross reference text will be determined dynamically by the software processing the document. In such cases, the ref element will have no content, and serve simply to mark a point from which a link is to be made, along with the target of the link.

The following two forms, for example, are logically equivalent:
See especially <ref target="#SEC12">section
12 on page 34</ref>.
See especially <ref target="#SEC12"/>.
In both cases, there is a cross reference from the position in the source document immediately following the word especially to whatever element in the encoded document has the identifier SEC12. In the first case, the encoder has supplied the original form of the cross reference ‘section 12 on page 34’; in the second, the task of generating an appropriate form of cross reference has been left to the formatting software. Perhaps the pagination and section numbers of the document in question are not yet determined; perhaps the cross reference should be replaced by a big red button. In either case, however, the value of the target attribute must be the identifier of some other element within the current document. Since the passage or phrase being pointed at must bear an identifier, it must be an element of some kind. In the following example, the cross reference is to a div element:
... see especially <ref target="#SEC12"/>.
...
<div xml:id="SEC12">
 <head>Concerning Identifiers</head>
<!-- ... -->
</div>
Because the xml:id attribute is global, any element in a TEI document may be pointed to in this way. In the following example, a paragraph has been given an identifier so that it may be pointed at:
... this is discussed in <ref target="#pspec">the paragraph on links</ref> ...
<p xml:id="pspec">Links may be
made to any kind of element ...</p>

Sometimes the target of a cross reference does not correspond with any particular feature of a text, and so may not be tagged as an element of some kind. If the desired target is simply a point in the current document, the easiest way to mark it is by introducing an anchor element at the appropriate spot. If the target is some sequence of words not otherwise tagged, the seg element may be used to mark them. These two elements are described as follows:

  • anchor (anchor point) attaches an identifier to a point within a text, whether or not it corresponds with a textual element.
  • seg (arbitrary segment) represents any segmentation of text below the ‘chunk’ level.
In the following example, ref elements have been used to represent points in this text which are to be linked in some way to other parts of it; in the first case to a point, and in the second, to a sequence of words:
Returning to <ref target="#ABCD">the point
where I dozed off</ref>, I noticed that <ref target="#EFGH">three words</ref> had
been circled in red by a previous reader
This encoding requires that elements with the specified identifiers (ABCD and EFGH in this example) are to be found somewhere else in the current document. Assuming that no element already exists to carry these identifiers, the anchor and seg elements may be used:
.... <anchor type="bookmark" xml:id="ABCD"/> .... ....<seg type="target" xml:id="EFGH"> ... </seg> ...

The type attribute should be used (as above) to distinguish amongst different purposes for which these general purpose elements might be used in a text. Some other uses are discussed in section 3.7.3 Special Kinds of Linking below.

3.7.2 Pointing to other documents

So far, we have shown how the ref element may be used for cross-references or links whose targets occur within the same document as their source. The element may also be used to refer to elements in any other XML document or resource, such as a document on the web, or a database component. This is possible because the value of the target attribute may be any valid Uniform Resource Identifier (URI)4.

A URI may reference a web page or just a part of one, for example http://www.tei-c.org/index.xml#SEC2. The hash sign indicates that what follows it is the identifier of an element to be located within the XML document identified by what precedes it: this example will therefore locate an element which has an xml:id attribute value of SEC2 within the document retrieved from http://www.tei-c.org/index.xml. In the examples we have discussed so far, the part to the left of the hash sign has been omitted: this is understood to mean that the referenced element is to be located within the current document.
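
A reference of this kind might be encoded as in the following sketch, which uses the URI given above; the surrounding sentence and the link text are invented for illustration.

```xml
<p>See the <ref target="http://www.tei-c.org/index.xml#SEC2">relevant
section</ref> of the TEI web site.</p>
```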

It is also possible to define an abbreviated form of the URI, using a predefined prefix separated from the rest of the code by a colon, as for example cesr:SEC2. This is known as a private URI, since the prefix is not standardized (except that the prefix xml: is reserved for use by XML itself). A prefixDef element should be supplied within the TEI header specifying how the prefix (here cesr) should be translated to give a full URL for the link. This is particularly useful if a document contains many references to an external document such as an authority file.
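
As a hedged sketch, a declaration for the cesr prefix mentioned above might look like the following. The target address http://www.example.org/cesr.xml is purely hypothetical, as is the regular expression chosen for matchPattern; the $1 in replacementPattern stands for whatever the parenthesised group in matchPattern matched.

```xml
<encodingDesc>
 <listPrefixDef>
  <prefixDef ident="cesr"
   matchPattern="([A-Za-z0-9]+)"
   replacementPattern="http://www.example.org/cesr.xml#$1">
   <p>Pointers of the form cesr:SEC2 refer to elements in the
   (hypothetical) document cesr.xml.</p>
  </prefixDef>
 </listPrefixDef>
</encodingDesc>
```

With such a declaration in place, target="cesr:SEC2" would be understood as pointing to http://www.example.org/cesr.xml#SEC2.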

Parts of an XML document can be specified by means of other more sophisticated mechanisms using a language called XPointer, also defined by the W3C. This is useful when, for example, the elements to be linked to do not bear identifiers. Further information about this and other forms of link addressing is provided in chapter 16 of the TEI Guidelines but is beyond the scope of the present document.

3.7.3 Special Kinds of Linking

The following special purpose linking attributes are defined for every element in the TEI simplePrint schema:

  • ana links an element with its interpretation.
  • corresp links an element with one or more other corresponding elements.
  • next links an element to the next element in an aggregate.
  • prev links an element to the previous element in an aggregate.
The ana (analysis) attribute is intended for use where a set of abstract analyses or interpretations have been defined somewhere within a document, as further discussed in section 10 Analysis. For example, a linguistic analysis of the sentence ‘John loves Nancy’ might be encoded as follows:
<seg type="sentence" ana="#SVO">
 <seg type="lex" ana="#NP1">John</seg>
 <seg type="lex" ana="#VVI">loves</seg>
 <seg type="lex" ana="#NP1">Nancy</seg>
</seg>
This encoding implies the existence elsewhere in the document of elements with identifiers SVO, NP1, and VVI where the significance of these particular codes is explained. Note the use of the seg element to mark particular components of the analysis, distinguished by the type attribute.
The corresp (corresponding) attribute provides a simple way of representing some form of correspondence between two elements in a text. For example, in a multilingual text, it may be used to link translation equivalents, as in the following example:
<seg xml:lang="fr" xml:id="FR1"
 corresp="#EN1">
Jean aime Nancy</seg>
<seg xml:lang="en" xml:id="EN1"
 corresp="#FR1">
John loves Nancy</seg>
The same mechanism may be used for a variety of purposes. In the following example, it has been used to represent the correspondences between ‘the show’ and ‘Shirley’, and between ‘NBC’ and ‘the network’:
<p>
 <title xml:id="shirley">Shirley</title>,
which made its Friday night debut only a month ago, was not listed on <name xml:id="nbc">NBC</name>'s new schedule, although <seg xml:id="network" corresp="#nbc">the network</seg> says <seg xml:id="show" corresp="#shirley">the
   show</seg> still is being considered.
</p>
The next and prev attributes provide a simple way of linking together the components of a discontinuous element, as in the following example:
<q xml:id="Q1a" next="#Q1b">Who-e
debel you?</q> — he at last said — <q xml:id="Q1b" prev="#Q1a">you no speak-e,
damme, I kill-e.</q> And so saying, the lighted tomahawk began flourishing about
me in the dark.

4 Editorial Interventions

The process of encoding an electronic text has much in common with the process of editing a manuscript or other text for printed publication. In either case a conscientious editor may wish to record both the original state of the source and any editorial correction or other change made in it. The elements discussed in this and the next section provide some facilities for meeting these needs.

4.1 Correction and Normalization

The following elements may be used to mark corrections, that is editorial changes introduced where the editor believes the original to be erroneous:

  • corr (correction) contains the correct form of a passage apparently erroneous in the copy text.
  • sic (Latin for thus or so) contains text reproduced although apparently incorrect or inaccurate.

The following elements may be used to mark normalization, that is editorial changes introduced for the sake of consistency or modernization of a text:

  • orig (original form) contains a reading which is marked as following the original, rather than being normalized or corrected.
  • reg (regularization) contains a reading which has been regularized or normalized in some sense.

Consider, for example, the following famous passage as it appears in the first quarto printing of Shakespeare's Henry V:

Figure 1. Detail from Henry V, first quarto (1600)

in particular the phrase we might transcribe directly as

... for his Nose was as sharpe as a Pen, and a Table of greene fields
A modern editor might wish to make a number of interventions here, specifically to modernize (or normalize) the Elizabethan spellings of a' and sharpe for he and sharp respectively. They might also want to emend table to babbl'd, following an editorial tradition that goes back to the 18th century Shakespearian scholar Lewis Theobald. The following encoding would then be appropriate:
... for his Nose was as <reg>sharp</reg> as a
Pen and
<reg>he</reg>
<corr resp="#Theobald">babbl'd</corr> of green fields
A more conservative or source-oriented editor, however, might want to retain the original, but at the same time signal that some of the readings it contains are in some sense anomalous:
... for his Nose was as
<orig>sharpe</orig> as a Pen, and
<orig>a</orig>
<sic>Table</sic> of green fields
Finally, a modern digital editor may decide to combine both possibilities in a single composite text, using the choice element.
  • choice (choice) groups a number of alternative encodings for the same point in a text.
This allows an editor to indicate where alternative encodings are possible:
... for his Nose was as <choice>
 <orig>sharpe</orig>
 <reg>sharp</reg>
</choice> as a Pen, and
<choice>
 <orig>a</orig>
 <reg>he</reg>
</choice>
<choice>
 <corr resp="#Theobald">babbl'd</corr>
 <sic>Table</sic>
</choice> of green fields

4.2 Omissions, Deletions, and Additions

In addition to correcting or normalizing words and phrases, editors and transcribers may also supply missing material, omit material, or transcribe material deleted or crossed out in the source. In addition, some material may be particularly hard to transcribe because it is hard to make out on the page. The following elements may be used to record such phenomena:

  • add (addition) contains letters, words, or phrases inserted in the source text by an author, scribe, or a previous annotator or corrector.
  • gap (gap) indicates a point where material has been omitted in a transcription, whether for editorial reasons described in the TEI header, as part of sampling practice, or because the material is illegible, invisible, or inaudible.
  • del (deletion) contains a letter, word, or passage deleted, marked as deleted, or otherwise indicated as superfluous or spurious in the copy text by an author, scribe, or a previous annotator or corrector.
  • unclear (unclear) contains a word, phrase, or passage which cannot be transcribed with certainty because it is illegible or inaudible in the source.
  • supplied (supplied) signifies text supplied by the transcriber or editor for any reason; for example because the original cannot be read due to physical damage, or because of an obvious omission by the author or scribe.
  • subst (substitution) groups one or more deletions (or surplus text) with one or more additions when the combination is to be regarded as a single intervention in the text.
These elements may be used to record changes made by an editor, by the transcriber, or (in manuscript material) by the author or scribe. For example, if the source for an electronic text read ‘The following elements are provided for for simple editorial interventions.’ then it might be felt desirable to correct the obvious error, but at the same time to record the deletion of the superfluous second for, thus:
The
following elements are provided for <del resp="#LB">for</del> simple editorial
interventions.
The attribute value #LB on the resp attribute is used to point to a fuller definition (typically in a respStmt element) of the person or other agency responsible for correcting the duplication of for.
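
A minimal sketch of such a definition follows. The wording of the resp element and the placeholder inside name are assumptions made for illustration; the source does not state who LB is.

```xml
<respStmt xml:id="LB">
 <resp>editorial corrections</resp>
 <name>(name of the person or agency identified as LB)</name>
</respStmt>
```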
If the source read ‘The following elements provided for simple editorial interventions.’ (i.e. if the word are had been inadvertently dropped) then the scholar identified as LB might choose to encode the corrected text as follows:
The following
elements <add resp="#LB">are</add> provided for simple editorial
interventions.

These elements may also be used to record the actual writing process, for example to record passages which have been deleted, added, corrected etc., whether by the author of a literary text or by a scribe copying out a manuscript. An analysis of such documentary modifications may be essential before a reading text can be presented, and is clearly of importance in the editorial process.

The following example is taken from the surviving authorial manuscript of a poem by the English writer Wilfred Owen, a part of which is shown here:

Figure 2. Detail from Dulce et decorum est autograph manuscript in the English Faculty Library, Oxford University.

Owen first wrote ‘Helping the worst amongst us’, but then deleted it, adding ‘Dragging the worst amongt us’ over the top. In the same way, he revised the phrase ‘half–blind’ by deleting the ‘half–’ and adding ‘all’ above it. In the last line, he started a word beginning ‘fif’ before deleting it and writing the word ‘five–nines’. We can encode all of this as follows:

<l>And towards our distant rest began to trudge,</l>
<l>
 <subst>
  <del>Helping the worst amongst us</del>
  <add>Dragging the worst amongt us</add>
 </subst>, who’d no boots
</l>
<l>But limped on, blood–shod. All went lame; <subst>
  <del status="shortEnd">half–</del>
  <add>all</add>
 </subst> blind;</l>
<l>Drunk with fatigue ; deaf even to the hoots</l>
<l>Of tired, outstripped <del>fif</del> five–nines that dropped behind.</l>

The add and del elements are used to enclose passages added or deleted respectively. Additional attributes are available such as resp to indicate responsibility for the modification, or place to indicate where in the text (for example, above or below the line) the modification has been made. Where the encoder wishes to assert that the addition and deletion make up a single editorial act of substitution, these elements can be combined within a subst element as shown above.
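
For instance, since Owen wrote ‘all’ above the deleted ‘half–’, the place attribute might be used as in this sketch. The value "above" is assumed here from the values suggested in the TEI Guidelines; other attributes are omitted for brevity.

```xml
<subst>
 <del>half–</del>
 <add place="above">all</add>
</subst> blind;
```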

A very careful examination of Owen’s second modification shows that he really did write ‘amongt’ rather than ‘amongst’, presumably in error. An equally careful editor wishing to restore the missing ‘s’ might use the supplied element to indicate that they have done so:

<add>Dragging the worst among<supplied resp="#ED">s</supplied>t us</add>

Here the resp attribute has been used to indicate that the ‘s’ was not supplied by Owen but by someone else, specifically the person documented elsewhere by an element with the identifier ED.

The unclear element is useful where material in the source is so hard to read that the transcriber is uncertain as to whether they have done so correctly. The gap element by contrast should be used where the material is so illegible that the transcriber does not wish even to attempt it. The two may however be used together as in the following example:
One hundred &amp; twenty good regulars joined <unclear>to me <gap extent="2 words"
  reason="indecipherable"/>
and </unclear> instantly, would aid me signally in an
enterprise against Wilmington.
The del element marks material which is deleted in a source, but has been transcribed as part of the electronic text all the same, while gap marks the location of source material which is omitted from the electronic text, whether it is legible or not. A language corpus, for example, might omit long quotations in foreign languages. An extent attribute is available on the gap element to indicate how much material has been omitted. The desc element can be used inside the gap element to provide a brief characterisation of the omitted material, as in the following examples:
<p> ... An example of a list appearing in a fief ledger of <name type="place">Koldinghus</name>
 <date>1611/12</date> is given below. It shows cash income from a sale of
honey.</p>
<gap extent="50 lines">
 <desc>quotation from ledger (in Danish)</desc>
</gap>
<p>A description of the overall structure of the account is once again ...
</p>
(The name and date elements used in this example are discussed further below.)
Language corpora (particularly those constructed before the widespread use of scanners) often systematically omit figures and mathematics:
<p>At the bottom of your screen below the
mode line is the <hi>minibuffer</hi>. This is the area where Emacs echoes the
commands you enter and where you specify filenames for Emacs to find, values for
search and replace, and so on. <gap reason="graphic">
  <desc>diagram of Emacs screen</desc>
 </gap>
</p>

4.3 Abbreviations and their Expansion

Like names, dates, and numbers, abbreviations may be transcribed as they stand or expanded; they may be left unmarked, or encoded using the following elements:

  • abbr (abbreviation) contains an abbreviation of any sort.
  • expan (expansion) contains the expansion of an abbreviation.
The abbr element is useful as a means of distinguishing semi-lexical items such as acronyms or jargon:
Every
manufacturer of <abbr>3GL</abbr> or <abbr>4GL</abbr> languages is currently nailing on
<abbr>OOP</abbr> extensions

The type attribute may be used to distinguish types of abbreviation by their function.
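
For instance, an encoder might separate acronyms from abbreviated titles of address as sketched below. The values acronym and title are illustrative choices only; TEI simplePrint does not prescribe a list of values here, and the second abbreviation is an invented example:
<abbr type="acronym">OOP</abbr>
<abbr type="title">Mr</abbr>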

The expan element is used to mark an expansion supplied by an encoder. This element is particularly useful in the transcription of manuscript materials. For example, the character p with a bar through its descender as a conventional representation for the word per is commonly encountered in Medieval European manuscripts. An encoder may choose to expand this as follows:
<expan>per</expan>
To record both an abbreviation and its expansion, the choice element mentioned above may be used to group the abbreviated form with its proposed expansion:
<choice>
 <abbr>wt</abbr>
 <expan>with</expan>
</choice>

The elements expan and abbr should contain a full word, or the abbreviated form of a full word respectively. For a fuller discussion of abbreviations and the intricacies of representing them consult the section on Abbreviation and Expansion in the TEI Guidelines.

5 Names, Codes, and Numbers

The TEI scheme defines elements for a large number of ‘data-like’ features which may appear almost anywhere within almost any kind of text. These features may be of particular interest in a range of disciplines; they all relate to objects external to the text itself, such as the names of persons and places, strings of code, formulae, or numbers and dates. These items may also pose particular problems for natural language processing (NLP) applications. The elements described here, by making such features explicit, reduce the complexity of processing texts containing them.

5.1 Names and Referring Strings

A referring string is any phrase which refers to some person, place, object, etc. A name is a referring string which contains proper nouns and honorifics only. Two elements are provided to mark such strings:

  • rs (referencing string) contains a general purpose name or referring string.
  • name (name, proper noun) contains a proper noun or noun phrase.
The type attribute is used to distinguish amongst (for example) names of persons, places and organizations, where this is possible:
<q>My dear <name type="person">Mr.
   Bennet</name>, </q>said his lady to him one day,
<q>have you heard that <name type="place">Netherfield Park</name> is let at last?</q>
It being one of the principles of the
<name type="org">Circumlocution Office</name> never, on any account whatsoever, to
give a straightforward answer, <name type="person">Mr Barnacle</name> said,

<q>Possibly.</q>
As the following example shows, the rs element may be used for a reference to a person, place, etc., which does not contain a proper noun or noun phrase:
<q>My dear <name type="person">Mr.
   Bennet</name>,</q> said <rs type="person">his lady</rs> to him one day...

Simply tagging something as a name is rarely enough to enable automatic processing of personal names into the canonical forms usually required for reference purposes. The name as it appears in the text may be inconsistently spelled, partial, or vague. Moreover, name prefixes such as van or de la may or may not be included as part of the reference form of a name, depending on the language and country of origin of the bearer.

The ref attribute provides a way of linking a name with a description of the object being named, and may thus act as a normalized identifier for it. It is also very useful as a means of gathering together all references to the same individual or location scattered throughout a document:
<q>My dear <name type="person" ref="#BENM1">Mr. Bennet</name>, </q> said <rs type="person" ref="#BENM2">his lady</rs> to him
one day,
<q>have you heard that <name type="place" ref="#NETP1">Netherfield
   Park</name> is let at last?</q>

The values used for the ref attribute here (#BENM1 etc.) are pointers; in this case indicating an element with the identifier BENM1 etc. somewhere in the current document, though any form of URI could be used. The element indicated will typically (for a person) be a person element, listed within a particDesc element, or (for a place) a place element, listed within a settingDesc element in the TEI header, as further discussed in 15.3 The Profile Description below.
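
A minimal sketch of what the targets of such pointers might look like in the TEI header follows. It assumes that the person elements are grouped by a listPerson element inside particDesc; the descriptive content of each person element is invented here purely for illustration:
<particDesc>
 <listPerson>
  <person xml:id="BENM1">
   <p>Mr. Bennet</p>
  </person>
  <person xml:id="BENM2">
   <p>Mrs. Bennet, wife of Mr. Bennet</p>
  </person>
 </listPerson>
</particDesc>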

This use should be distinguished from the case of the reg (regularization) element, which provides a means of marking the standard form of a referencing string as demonstrated below:
<name type="person" ref="#WADLM1">
 <choice>
  <orig>Walter de la Mare</orig>
  <reg>de la Mare, Walter</reg>
 </choice>
</name> was born at <name ref="https://en.wikipedia.org/wiki/Charlton,_London"
 type="place">
Charlton</name>,
in <name type="place">Kent</name>, in 1873.

5.2 Formulae, Codes, and Special Characters

The following elements may be useful when marking up sequences of text that represent mathematical expressions, chemical formulae, and the like:

  • formula (formula) contains a mathematical or other formula.
  • g (character or glyph) represents a glyph, or a non-standard character.
In many cases, a simple Unicode character suffices to represent the superscript or subscript digits and other symbols which may appear inside a mathematical formula:
<formula>E=mc²</formula>
In other more complex cases, the encoder may choose to use a different XML scheme (such as MathML) to encode the content of a formula, or a non-XML notation. These possibilities are not discussed further here.
The g element is useful in the case that no Unicode character exists to represent the character or glyph required. Its ref attribute can be used to point to a definition of the symbol intended, while its content (if any) represents a Unicode approximation to it:
...Thereto
<g ref="#air">[air]</g> and ...
The TEI header provides a number of additional elements for the definition of such non-Unicode characters, as further discussed in section 15.2.5 The character declaration below.

The following elements are useful for stretches of code or similar formal language appearing within a text:

  • code contains literal code from some formal language such as a programming language.
  • email (electronic mail address) contains an email address identifying a location to which email messages can be delivered.
For example:
This can be expressed in XML as follows: <code>&lt;date notBefore="2016-06-23"/></code>
Contact the author at <email>lou.burnard@gmail.com</email>

Note in this example that characters which have a syntactic function in XML (such as the ampersand or the angle bracket) must be represented within a TEI simplePrint document by means of an entity reference such as &lt; or &amp;.

The element ref discussed in section 3.7 Cross References and Links should be used to represent a coded reference such as a link given as a URL within a text, either as content or as an attribute value:
<p>Further discussion of <ref target="http://www.tei-c.org/">the Text Encoding
   Initiative website</ref> may be found at <ref>http://www.tei-c.org/</ref>
</p>

5.3 Dates and Times

The following elements are provided for the detailed encoding of times and dates:

  • date (date) contains a date in any format.
  • time (time) contains a phrase defining a time of day in any format.
These elements have a number of attributes which can be used to provide normalized versions of their values in various ways.
  • att.datable provides attributes for normalization of elements that contain dates, times, or datable events.
    period supplies pointers to one or more definitions of named periods of time (typically category, date, or event elements) within which the datable item is understood to have occurred.
    when [att.datable.w3c] supplies the value of the date or time in a standard form, e.g. yyyy-mm-dd.
    notBefore [att.datable.w3c] specifies the earliest possible date for the event in standard form, e.g. yyyy-mm-dd.
    notAfter [att.datable.w3c] specifies the latest possible date for the event in standard form, e.g. yyyy-mm-dd.
The when attribute specifies a normalized form for the date or time, using one of the standard formats defined by ISO 8601. Partial dates or times (e.g. ‘1990’, ‘September 1990’, ‘twelvish’) can be expressed by omitting a part of the value supplied, as in the following examples:
<date when="1980-02-21">21 Feb
1980</date>
<date when="1990">1990</date>
<date when="1990-09">September
1990</date>
<date when="--09">September</date>
<date when="2001-09-11T12:48:00">Sept
11th, 12 minutes before 9 am</date>
These attributes are typically used to make a date or time more easily processable, as in the following examples:
Given on the
<date when="1977-06-12">Twelfth Day of June in the Year of Our Lord One Thousand
Nine Hundred and Seventy-seven of the Republic the Two Hundredth and first and of
the University the Eighty-Sixth.</date>
<l>specially when it's nine below zero</l>
<l>and <time when="15:00:00">three o'clock in the afternoon</time>
</l>
They are also useful in cases where the date concerned is uncertain or only vaguely specified:
<p>... <date period="secondEmpire">during the second empire</date>
</p>
<date notAfter="1946-12-09"
 notBefore="1946-11-01">
in the weeks shortly before my
birth</date>

5.4 Numbers and Measurements

Like dates, both numbers and quantities can be written with either letters or digits and may therefore need to be normalized for ease of processing. Their presentation is also highly language-dependent (e.g. English 5th becomes Greek 5.; English 123,456.78 equals French 123.456,78).

The following elements are provided for the detailed encoding of numbers and quantities:

  • num (number) contains a number, written in any form.
    type indicates the type of numeric value.
    value supplies the value of the number in standard form.
  • measure (measure) contains a word or phrase referring to some quantity of an object or commodity, usually comprising a number, a unit, and a commodity name.
    quantity [att.measurement] (quantity) specifies the number of the specified units that comprise the measurement.
    unit [att.measurement] (unit) indicates the units used for the measurement, usually using the standard symbol for the desired units. Suggested values include: m (metre); kg (kilogram); s (second); Hz (hertz); Pa (pascal); Ω (ohm); L (litre); t (tonne); ha (hectare); Å (ångström); mL (millilitre); cm (centimetre); dB (decibel); kbit (kilobit); Kibit (kibibit); kB (kilobyte); KiB (kibibyte); MB (megabyte); MiB (mebibyte).
    commodity [att.measurement] (commodity) indicates the substance that is being measured.
For example:
<num value="33">xxxiii</num>
<num type="cardinal" value="21">twenty-one</num>
<num type="percentage" value="10">ten percent</num>
<num type="percentage" value="10">10%</num>
<num type="ordinal" value="5">5th</num>
<measure quantity="40" unit="hogshead"
 commodity="rum">
2 score hh rum</measure>
<measure quantity="1" unit="dozen"
 commodity="blooms">
1 doz. roses</measure>
<measure quantity="1" unit="count"
 commodity="blooms">
a yellow tulip</measure>

6 Lists

The element list is used to mark any kind of list. A list is a sequence of text items, which may be numbered, bulleted, or arranged as a glossary list. Each item may be preceded by an item label (in a glossary list, this label is the term being defined).

Individual list items are tagged with item. The first item may optionally be preceded by a head, which gives a heading for the list. The numbering of items within the list may be omitted, indicated using the n attribute on each item, or (rarely) tagged as content using the label element. The following are all thus equivalent:
<list>
 <head>A short list</head>
 <item>First item in list.</item>
 <item>Second item in list.</item>
 <item>Third item in list.</item>
</list>
<list>
 <head>A short list</head>
 <item n="1">First item in list.</item>
 <item n="2">Second item in list.</item>
 <item n="3">Third item in list.</item>
</list>
<list>
 <head>A short list</head>
 <label>1</label>
 <item>First item in list.</item>
 <label>2</label>
 <item>Second item in list.</item>
 <label>3</label>
 <item>Third item in list.</item>
</list>
The styles should not be mixed in the same list.
A simple two-column table may be treated as a glossary list, tagged <list type="gloss">. Here, each item comprises a term and a gloss, marked with label and item respectively.
<list type="gloss">
 <head>Vocabulary</head>
 <label xml:lang="enm">nu</label>
 <item>now</item>
 <label xml:lang="enm">lhude</label>
 <item>loudly</item>
 <label xml:lang="enm">bloweth</label>
 <item>blooms</item>
 <label xml:lang="enm">med</label>
 <item>meadow</item>
 <label xml:lang="enm">wude</label>
 <item>wood</item>
 <label xml:lang="enm">awe</label>
 <item>ewe</item>
<!-- <label xml:lang="enm">lhouth</label> <item>lows</item> <label xml:lang="enm">sterteth</label> <item>bounds, frisks</item> <label xml:lang="enm">verteth</label> <item xml:lang="la">pedit</item> <label xml:lang="enm">murie</label> <item>merrily</item> <label xml:lang="enm">swik</label> <item>cease</item> <label xml:lang="enm">naver</label> <item>never</item>-->
</list>

Where the internal structure of a list item is more complex, it may be preferable to regard the list as a table, for which special-purpose tagging is defined in section 8 Tables.

Lists of whatever kind can, of course, nest within list items to any depth required. Here, for example, a glossary list contains two items, each of which is itself a simple list:
<list type="gloss">
 <label>EVIL</label>
 <item>
  <list type="simple">
   <item>I am cast upon a horrible desolate island, void of all hope of
       recovery.</item>
   <item>I am singled out and separated as it were from all the world to be
       miserable.</item>
   <item>I am divided from mankind — a solitaire; one banished from human
       society.</item>
  </list>
 </item>
 <label>GOOD</label>
 <item>
  <list type="simple">
   <item>But I am alive; and not drowned, as all my ship's company were.</item>
   <item>But I am singled out, too, from all the ship's crew, to be spared from
       death...</item>
   <item>But I am not starved, and perishing on a barren place, affording no
       sustenances....</item>
  </list>
 </item>
</list>

Lists of bibliographic items should be tagged using the listBibl element, described in the next section.

7 Bibliographic Citations

It is often useful to distinguish bibliographic citations where they occur within texts being transcribed for research, if only so that they will be properly formatted when the text is printed out. The element bibl is provided for this purpose. Where the components of a bibliographic reference are to be distinguished, the following elements may be used as appropriate. It is generally useful to distinguish at least those parts (such as the titles of articles, books, and journals) which will need special formatting. The other elements are provided for cases where particular interest attaches to such details:

Consider, for example, the following editorial note:
He was a member of Parliament for Warwickshire in 1445, and died March 14, 1470 (according to Kittredge, Harvard Studies 5. 88ff).
This might be encoded as follows:
He was a member of Parliament for Warwickshire
in 1445, and died March 14, 1470 (according to <bibl>
 <author>Kittredge</author>,
<title>Harvard Studies</title> 5. 88ff
</bibl>).
The bibliographic elements listed above are particularly useful in a born-digital document which contains a bibliography encoded using the listBibl element. Entries in the bibliography should be given an identifier, which can then be used as the target of cross references from elsewhere in the document:
<p>Perec citing, amongst others <ref target="#MK_73">Sturm und Drang, 1973</ref>,
concludes ... </p>
A bibl element may contain simply text, with possibly a few of its components distinguished by tagging, and much use of conventionalized punctuation, as in this example:
<bibl xml:id="MK_73">Sturm, U. &amp; Drang, F. : <title>Musikalische
   Katastrophe</title>. (Berlin, W. de Gruyter, 1973)</bibl>
Alternatively, each of the components of the bibliographic reference may be clearly distinguished by tagging; in this case there is no requirement for conventionalized punctuation, since the processor will be able to generate this appropriately:
<bibl xml:id="MK73">
 <author>Sturm, U.</author>
 <author>Drang, F.</author>
 <title xml:lang="de" level="m">Musikalische Katastrophe</title>
 <pubPlace>Berlin</pubPlace>
 <publisher>W. de Gruyter</publisher>
 <date>1973</date>
</bibl>

The element biblFull is also provided for convenience in cases where bibliographic citations following a more sophisticated model have been used; it is permitted only in the TEI header.

The listBibl element is used to group lists of bibliographic citations. It may contain a series of bibl or biblFull elements.
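
A minimal sketch of such a grouping, reusing the first Sturm and Drang entry shown above, might look as follows. The heading text is an assumption made for illustration:
<listBibl>
 <head>Works cited</head>
 <bibl xml:id="MK_73">Sturm, U. &amp; Drang, F. : <title>Musikalische
   Katastrophe</title>. (Berlin, W. de Gruyter, 1973)</bibl>
</listBibl>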

8 Tables

The following elements are provided for the description of tabular matter, commonly found in many kinds of narrative text. Note that TEI simplePrint provides no sophisticated ways of describing the detailed layout of a table beyond its organization into rows and columns.

The role attribute may be used on either cell or row to indicate the function of a cell, or of a row of cells. Its values should be taken from the following list:

data
data cell
label
label cell
sum
row or column sum data
total
table total data

For example, Defoe uses mortality tables like the following in the Journal of the Plague Year to show the rise and ebb of the epidemic:
<p>It was indeed coming on amain, for the
burials that same week were in the next adjoining parishes thus:— <table rows="3" cols="4">
  <row role="data">
   <cell role="label">St. Leonard's, Shoreditch</cell>
   <cell>64</cell>
   <cell>84</cell>
   <cell>119</cell>
  </row>
  <row role="data">
   <cell role="label">St. Botolph's, Bishopsgate</cell>
   <cell>65</cell>
   <cell>105</cell>
   <cell>116</cell>
  </row>
  <row role="data">
   <cell role="label">St. Giles's, Cripplegate</cell>
   <cell>213</cell>
   <cell>421</cell>
   <cell>554</cell>
  </row>
 </table>
</p>
<p>This shutting up of houses was at first counted a very cruel and
unchristian method, and the poor people so confined made bitter lamentations. ...
</p>

9 Figures and Graphics

Not all the components of a document are necessarily textual. The most straightforward text will often contain diagrams or illustrations, to say nothing of documents in which image and text are inextricably intertwined, or electronic resources in which the two are complementary.

The encoder may simply record the presence of a graphic within the text, possibly with a brief description of its content, and may also provide a link to a digitized version of the graphic, using the following elements:

Any textual information accompanying the graphic, such as a heading and/or caption, may be included within the figure element itself, in a head and one or more p elements, as may any text appearing within the graphic itself. It is strongly recommended that a prose description of the image be supplied, as the content of a figDesc element, for the use of applications which are not able to render the graphic, and to render the document accessible to vision-impaired readers. (Such text is not normally considered part of the document proper.)

The simplest use for these elements is to mark the position of a graphic and provide a link to it, as in this example:
<pb n="412"/>
<figure>
 <graphic url="images/p412fig.png"
  width="40%"/>

</figure>
<pb n="413"/>
This indicates that the graphic contained by the file p412fig.png appears between pages 412 and 413.
The graphic element can appear anywhere that textual content is permitted, within but not between paragraphs or headings. In the following example, the encoder has decided to treat a specific printer's ornament as a heading:
<head>
 <graphic url="http://www.iath.virginia.edu/gants/Ornaments/Heads/hp-ral02.gif"/>
</head>
More usually, a graphic will have at the least an identifying title, which may be encoded using the head element, or a number of figures may be grouped together in a particular structure, as in the following example:
Figure 3. Mr Fezziwig's Ball: illustration by George Cruikshank from Dickens' A Christmas Carol (1843)
The figure element provides a means of wrapping one or more such elements together as a kind of graphic ‘block’. It may also include a brief description of the image:
<figure>
 <graphic url="images/fezzipic.png"/>
 <head>Mr Fezziwig's Ball</head>
 <figDesc>A Cruikshank engraving showing Mr Fezziwig leading a group of
   revellers.</figDesc>
</figure>
These cases should be carefully distinguished from the case where an encoded text is complemented by a collection of digital images, maintained as a distinct resource. The facs attribute may be used to associate any element in an encoded text with a digital facsimile of it. In the simplest case, the facs attribute on the pb element may be used to supply a location for an image file corresponding with that point in the text:
<div>
 <pb facs="page1.png" n="1"/>
<!-- text contained on page 1 is encoded here -->
 <pb facs="page2.png" n="2"/>
<!-- text contained on page 2 is encoded here -->
</div>
This method is only appropriate in the simple case where each digital image file page1.png etc. corresponds with a single transcribed and encoded page. If multiple images are provided for each page, or more detailed alignment of image and transcription is required, for example because the image files actually represent double page spreads, more sophisticated mechanisms are needed, as further discussed in 14 Encoding a Digital Facsimile below.

10 Analysis

10.1 Orthographic Sentences

Interpretation typically ranges across the whole of a text, with no particular respect to other structural units. A useful preliminary to intensive interpretation is therefore to segment the text into discrete and identifiable units, each of which can then bear a label for use as a sort of ‘canonical reference’. To facilitate such uses, these units may not cross each other, nor nest within each other. They may conveniently be represented using the following element:

  • s (s-unit) contains a sentence-like division of a text.
As the name suggests, the s element is most commonly used (in linguistic applications at least) for marking orthographic sentences, that is, units defined by orthographic features such as punctuation. For example, the passage from Jane Eyre discussed earlier might be divided into s-units as follows:
<div type="chapter" n="38">
 <pb n="474"/>
 <p>
  <s n="001">Reader, I married him.</s>
  <s n="002">A quiet wedding we had:</s>
  <s n="003">he and I, the parson and clerk, were alone present.</s>
  <s n="004">When
     we got back from church, I went into the kitchen of the manor-house, where Mary
     was cooking the dinner, and John cleaning the knives, and I said —</s>
 </p>
 <p>
  <q>
   <s n="005">Mary, I have been married to Mr Rochester this morning.</s>
  </q> ...
 </p>
</div>
Note that s elements cannot nest: the beginning of one s element implies that the previous one has finished. When s-units are tagged as shown above, it is advisable to tag the entire text end-to-end, so that every word in the text being analyzed will be contained by exactly one s element, whose identifier can then be used to specify a unique reference for it. If the identifiers used are unique within the document, then the xml:id attribute might be used in preference to the n attribute used in the above example.
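
As a sketch of this alternative, the first two s-units above might carry document-wide identifiers instead of n values. The identifier scheme (a JE38 prefix for chapter 38 of Jane Eyre) is an assumption made for illustration:
<s xml:id="JE38_001">Reader, I married him.</s>
<s xml:id="JE38_002">A quiet wedding we had:</s>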

10.2 Words and Punctuation

Tokenization, that is, the identification of lexical or non-lexical tokens within a text, is a very common requirement for all kinds of textual analysis, and not an entirely trivial one. The decision as to whether, for example, ‘can't’ in English or ‘du’ in French should be treated as one word or two is not simple. Consequently it is often useful to make explicit the preferred tokenization in a marked up text. The following elements are available for this purpose:

  • w (word) represents a grammatical (not necessarily orthographic) word.
  • c (character) represents a character.
  • pc (punctuation character) contains a character or string of characters regarded as constituting a single punctuation mark.
For example, the output from a part of speech tagger might be recorded in TEI simplePrint as follows:
<s n="1">
 <w ana="#NP0">Marley</w>
 <w ana="#VBD">was</w>
 <w ana="#AJ0">dead</w>
 <pc>:</pc>
 <w ana="#TO0">to</w>
 <w ana="#VBB">begin</w>
 <w ana="#PRP">with</w>
 <pc ana="#SENT">.</pc>
</s>

In this example, each token in the input has been decorated with an automatically generated part of speech code, using the ana attribute discussed in section 3.7.3 Special Kinds of Linking above. The system has also distinguished between tokens to be treated as words (tagged w) and tokens considered to be punctuation (tagged pc). It may also sometimes be useful to distinguish tokens which consist of a single letter or character: the c element is provided for this purpose.

The w element also provides for each word to be associated with a root form or lemma, either explicitly using the lemma attribute, or by reference, using the lemmaRef attribute, as in this example:
...<w ana="#VBD" lemma="be"
 lemmaRef="http://www.myLexicon.com/be">
was</w> ...

10.3 General-Purpose Interpretation Elements

The w element is a specialisation of the seg element, which has already been introduced for use in identifying otherwise unmarked targets of cross references and hypertext links (see section 3.7 Cross References and Links). The seg element can be used to distinguish any portion of text to which the encoder wishes to assign a user-specified type or a unique identifier; it may thus be used to tag textual features for which there is no other provision in the published TEI Guidelines.

For example, the TEI Guidelines provide no ‘apostrophe’ element to mark parts of a literary text in which the narrator addresses the reader (or hearer) directly. One approach might be to regard these as instances of the q element, distinguished from others by an appropriate value for the who attribute. A possibly simpler, and certainly more general, solution would however be to use the seg element as follows:
<div type="chapter" n="38">
 <p>
  <seg type="apostrophe">Reader, I married him.</seg> A quiet wedding we had:
   ...</p>
</div>
The type attribute on the seg element can take any value, and so can be used to distinguish phrase-level phenomena of any kind; it is good practice to record the values used and their significance in the TEI header or in the documentation of the encoding system.

11 Common Attributes

Some attributes are available on many elements, though not on all. These attributes are defined using a TEI attribute class, a concept which is discussed further in the TEI Guidelines. We list here some attribute classes which have been adapted or customized for use in TEI simplePrint.

The elements add, figure, fw, label, note and stage all take the attribute place to indicate whereabouts on the page they appear. In TEI simplePrint the possible values for this attribute are limited as indicated below:

above
above the line
below
below the line
top
at the top of the page
top-right
at the top right of the page
top-left
at the top left of the page
top-centre
at the top centre of the page
bottom-right
at the bottom right of the page
bottom-left
at the bottom left of the page
bottom-centre
at the bottom centre of the page
bottom
at the foot of the page
tablebottom
underneath a table
margin-right
in the right-hand margin
margin
in the outer margin
margin-inner
in the inner margin
margin-left
in the left-hand margin
opposite
on the opposite, i.e. facing, page
overleaf
on the other side of the leaf
overstrike
superimposed on top of the current context
end
at the end of the volume
divend
at the end of the current division
parend
at the end of the current paragraph
inline
within the body of the text
inspace
in a predefined space, for example left by an earlier scribe
block
formatted as an indented paragraph

The elements add, <am>, corr, date, del, <ex>, expan, gap, name, reg, <space>, subst, supplied, time and unclear all use the attribute unit to indicate the units in which the size of the feature concerned is expressed. In TEI simplePrint the possible values for this attribute are limited as indicated below:

chars
characters
lines
lines
pages
pages
words
words
cm
centimetres
mm
millimetres
in
inches
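
For example, the size of an illegible passage might be expressed using one of these units. This sketch assumes that the quantity attribute is available alongside unit, as it is in the full TEI Guidelines:
<gap reason="indecipherable" unit="words" quantity="2"/>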

Very many TEI elements take the attribute type (see the specification for att.typed for a full list). In most cases, no constraint is placed on the possible values for this attribute. In the case of the element name, however, the possible values for this attribute are limited as indicated below:

person
person
forename
forename
surname
surname
personGenName
generational name component
personRoleName
role or position in society
personAddName
additional name component (e.g. nickname)
nameLink
connecting link within a name (e.g. van der)
org
organization
country
country
placeGeog
geographical name
place
place

12 Composite and Floating Texts

A composite text, like a simple text, has optional front and back matter. In between, however, instead of a single body, it contains one or more discrete texts, each with its own optional front and back matter. The following elements are provided to handle composite texts of various kinds.

A typical example might be an anthology containing several distinct works, or any other kind of collection, encoded using an overall structure like this:
<TEI xmlns="http://www.tei-c.org/ns/1.0">
 <teiHeader>
<!--[ header information for the composite ]-->
 </teiHeader>
 <text>
  <front>
<!--[ front matter for the composite ]-->
  </front>
  <group>
   <text>
    <front>
<!--[ front matter of first text ]-->
    </front>
    <body>
<!--[ body of first text ]-->
    </body>
    <back>
<!--[ back matter of first text ]-->
    </back>
   </text>
   <text>
    <front>
<!--[ front matter of second text]-->
    </front>
    <body>
<!--[ body of second text ]-->
    </body>
    <back>
<!--[ back matter of second text ]-->
    </back>
   </text>
<!--[ more texts or groups of texts here ]-->
  </group>
  <back>
<!--[ back matter for the composite ]-->
  </back>
 </text>
</TEI>
A different kind of composite text occurs when one text is embedded within another, as for example in the Arabian Nights or similar collections of stories, or in other cases where one narrative is interrupted by another. The element floatingText may be preferred to encode such materials as the following:
<p>The Gentleman having finish'd his Story, Galecia waited on him to the Stairs-head;
and at her return, casting her Eyes on the Table, she saw lying there an old dirty
rumpled Book, and found in it the following story:</p>
<floatingText>
 <body>
  <p>IN the time of the Holy War when Christians from all parts went into the Holy
     Land to oppose the Turks; Amongst these there was a certain English Knight...</p>
<!-- rest of story here -->
  <p>The King graciously pardoned the Knight; Richard was kindly receiv'd into his
     Convent, and all things went on in good order: But from hence came the Proverb, We
     must not strike <hi>Robert</hi> for <hi>Richard.</hi>
  </p>
 </body>
</floatingText>
<pb n="43"/>
<p>By this time Galecia's Maid brought up her Supper; after which she cast her Eyes
again on the foresaid little Book, where she found the following Story ....</p>
Note that there is only a single TEI header for composite texts of either kind, since the assumption is that the composite is at some level describable as a single work. However, it is also possible to define a composite of complete TEI texts, each with its own TEI header. Such a collection is known as a TEI corpus, and must itself have a TEI header:
<teiCorpus xmlns="http://www.tei-c.org/ns/1.0">
 <teiHeader>
<!--[header information for the corpus]-->
 </teiHeader>
 <TEI>
  <teiHeader>
<!--[header information for first text]-->
  </teiHeader>
  <text>
<!--[first text in corpus]-->
  </text>
 </TEI>
 <TEI>
  <teiHeader>
<!--[header information for second text]-->
  </teiHeader>
  <text>
<!--[second text in corpus]-->
  </text>
 </TEI>
</teiCorpus>
It is also possible to create a composite of corpora -- that is, one teiCorpus element may contain many nested teiCorpus elements rather than many nested TEI elements, to any depth considered necessary.
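
Such a composite of corpora might be sketched as follows, using the same placeholder comments as the preceding example. The number of levels and of texts shown is purely illustrative:
<teiCorpus xmlns="http://www.tei-c.org/ns/1.0">
 <teiHeader>
<!--[header information for the whole collection]-->
 </teiHeader>
 <teiCorpus>
  <teiHeader>
<!--[header information for first sub-corpus]-->
  </teiHeader>
  <TEI>
   <teiHeader>
<!--[header information for a text in first sub-corpus]-->
   </teiHeader>
   <text>
<!--[a text in first sub-corpus]-->
   </text>
  </TEI>
<!--[ more texts here ]-->
 </teiCorpus>
 <teiCorpus>
  <teiHeader>
<!--[header information for second sub-corpus]-->
  </teiHeader>
<!--[ texts or further nested corpora here ]-->
 </teiCorpus>
</teiCorpus>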

13 Front and Back Matter

13.1 Front Matter

For many purposes, particularly in older texts, the preliminary material such as title pages, prefatory epistles, etc., may provide very useful additional linguistic or social information. The TEI Guidelines provide a set of recommendations for distinguishing the textual elements most commonly encountered in front matter, which are summarized here.

13.1.1 Title Page

The start of a title page should be marked with the element titlePage. All text contained on the page should be transcribed and tagged with the appropriate element from the following list:

  • titlePage (title page) contains the title page of a text, appearing within the front or back matter.
  • docTitle (document title) contains the title of a document, including all its constituents, as given on a title page.
  • titlePart (title part) contains a subsection or division of the title of a work, as indicated on a title page.
  • byline (byline) contains the primary statement of responsibility given for a work on its title page or at the head or end of the work.
  • docAuthor (document author) contains the name of the author of the document, as given on the title page (often but not always contained in a byline).
  • docDate (document date) contains the date of a document, as given on a title page or in a dateline.
  • docEdition (document edition) contains an edition statement as presented on a title page of a document.
  • docImprint (document imprint) contains the imprint statement (place and date of publication, publisher name), as given (usually) at the foot of a title page.
  • epigraph (epigraph) contains a quotation, anonymous or attributed, appearing at the start or end of a section or on a title page.

Typeface distinctions should be marked with the rendition attribute when necessary, as described above, though a very detailed description of the letter spacing and sizing used in ornamental titles is not easily achieved. Changes of language should be marked by appropriate use of the xml:lang attribute or the foreign element, as necessary. Names of people, places, or organizations may be tagged using the name element wherever they appear, if no more specific element is available.

Two example title pages follow:
<titlePage>
 <docTitle>
  <titlePart type="main">PARADISE REGAIN'D. A POEM In IV
  <hi>BOOKS</hi>.</titlePart>
  <titlePart>To which is added <title>SAMSON AGONISTES</title>.</titlePart>
 </docTitle>
 <byline>The Author <docAuthor>JOHN MILTON</docAuthor>
 </byline>
 <docImprint>
  <name>LONDON</name>, Printed by <name>J.M.</name> for <name>John
     Starkey</name> at the <name>Mitre</name> in <name>Fleetstreet</name>, near
 <name>Temple-Bar.</name>
 </docImprint>
 <docDate>MDCLXXI</docDate>
</titlePage>
<titlePage>
 <docTitle>
  <titlePart type="main">Lives of the Queens of England, from the Norman
     Conquest;</titlePart>
  <titlePart type="sub">with anecdotes of their courts.</titlePart>
 </docTitle>
 <titlePart>Now first published from Official Records and other authentic documents
   private as well as public.</titlePart>
 <docEdition>New edition, with corrections and additions</docEdition>
 <byline>By <docAuthor>Agnes Strickland</docAuthor>
 </byline>
 <epigraph>
  <q>The treasures of antiquity laid up in old historic rolls, I opened.</q>
  <bibl>BEAUMONT</bibl>
 </epigraph>
 <docImprint>Philadelphia: Blanchard and Lea</docImprint>
 <docDate>1860.</docDate>
</titlePage>
As elsewhere, the ref attribute may be used to link a name with a canonical definition of the entity being named. For example:
<byline>By <docAuthor>
  <name ref="http://en.wikipedia.org/wiki/Agnes_Strickland">Agnes Strickland</name>
 </docAuthor>
</byline>

13.1.2 Prefatory Matter

Major blocks of text within the front matter should be marked using div elements; the following suggested values for the type attribute may be used to distinguish various common types of prefatory matter:

preface
A foreword or preface addressed to the reader in which the author or publisher explains the content, purpose, or origin of the text.
dedication
A formal offering or dedication of a text to one or more persons or institutions by the author.
abstract
A summary of the content of a text as continuous prose.
ack
A formal declaration of acknowledgment by the author in which persons and institutions are thanked for their part in the creation of a text.
contents
A table of contents, specifying the structure of a work and listing its constituents. The list element should be used to mark its structure.
frontispiece
A pictorial frontispiece, possibly including some text.

Where other kinds of prefatory matter are encountered, the encoder is at liberty to invent other values for the type attribute.
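
As a sketch of how these values might be applied, the following front matter contains a preface and a table of contents. The headings and entries are invented for illustration:
<front>
 <div type="preface">
  <head>To the Reader</head>
  <p>
<!--[ text of the preface ]-->
  </p>
 </div>
 <div type="contents">
  <head>The Contents</head>
  <list>
   <item>Book I</item>
   <item>Book II</item>
<!--[ more entries here ]-->
  </list>
 </div>
</front>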

13.1.3 Liminal Elements

All text divisions, whether in front matter or elsewhere, may begin and end with one or more components which we term liminal elements, because they begin or end the division. A typical example is a heading or title of some kind which should be tagged using the head element; but there are many other possibilities:

  • salute (salutation) contains a salutation or greeting prefixed to a foreword, dedicatory epistle, or other division of a text, or the salutation in the closing of a letter, preface, etc.
  • signed (signature) contains the closing salutation, etc., appended to a foreword, dedicatory epistle, or other division of a text.
  • byline (byline) contains the primary statement of responsibility given for a work on its title page or at the head or end of the work.
  • dateline (dateline) contains a brief description of the place, date, time, etc. of production of a letter, newspaper story, or other work, prefixed or suffixed to it as a kind of heading or trailer.
  • argument (argument) contains a formal list or prose description of the topics addressed by a subdivision of a text.
  • cit (cited quotation) contains a quotation from some other document, together with a bibliographic reference to its source. In a dictionary it may contain an example text with at least one occurrence of the word form, used in the sense being described, or a translation of the headword, or an example.
  • imprimatur (imprimatur) contains a formal statement authorizing the publication of a work, sometimes required to appear on a title page or its verso.
  • opener (opener) groups together dateline, byline, salutation, and similar phrases appearing as a preliminary group at the start of a division, especially of a letter.
  • closer (closer) groups together salutations, datelines, and similar phrases appearing as a final group at the end of a division, especially of a letter.
  • postscript contains a postscript, e.g. to a letter.
As an example, the beginning and end of the dedication to Milton's Comus might be marked up as follows:
<div type="dedication">
 <head>To the Right Honourable <name>JOHN Lord Viscount BRACLY</name>, Son and Heir
   apparent to the Earl of Bridgewater, &amp;c.</head>
 <salute>MY LORD,</salute>
 <p>THis <hi>Poem</hi>, which receiv'd its first occasion of Birth from your Self,
   and others of your Noble Family .... and as in this representation your
   attendant <name>Thyrsis</name>, so now in all reall expression</p>
 <closer>
  <salute>Your faithfull, and most humble servant</salute>
  <signed>
   <name>H. LAWES.</name>
  </signed>
 </closer>
</div>
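The opener, dateline, and postscript elements are not shown in that example. A minimal sketch of their use, in an invented letter (the type value letter and all of the content are assumptions made for illustration), might look like this:
<div type="letter">
 <opener>
  <dateline>
   <name type="place">London</name>, <date>3 May 1712</date>
  </dateline>
  <salute>Dear Sir,</salute>
 </opener>
 <p>
<!--[ body of the letter ]-->
 </p>
 <closer>
  <salute>Your obliged servant,</salute>
  <signed>A. B.</signed>
 </closer>
 <postscript>
  <p>Pray remember me to your brother.</p>
 </postscript>
</div>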

13.2 Back Matter

13.2.1 Structural Divisions of Back Matter

Because of variations in publishing practice, back matter can contain virtually any of the elements listed above for front matter, and the same elements should be used where this is so. Additionally, back matter may contain the following types of matter within the back element. Like the structural divisions of the body, these should be marked as div elements, and distinguished by the following suggested values of the type attribute:

appendix
An ancillary self-contained section of a work, often providing additional but in some sense extra-canonical text.
glossary
A list of terms associated with definition texts (‘glosses’): this should be encoded as a <list type="gloss"> element.
notes
A section in which textual or other kinds of notes are gathered together.
bibliogr
A list of bibliographic citations: this should be encoded as a listBibl.
index
Any form of pre-existing index to the work.
colophon
A statement appearing at the end of a book describing the conditions of its physical production.
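
The following sketch shows a glossary and a bibliography encoded in this way. It assumes the usual TEI practice of pairing label and item elements within a gloss list; the glossary entries are invented, and the citation is borrowed from the source description example given later in this document:
<back>
 <div type="glossary">
  <head>Glossary</head>
  <list type="gloss">
   <label>ore-throwne</label>
   <item>overthrown</item>
   <label>owne</label>
   <item>own</item>
  </list>
 </div>
 <div type="bibliogr">
  <head>Works cited</head>
  <listBibl>
   <bibl>The first folio of Shakespeare, prepared by Charlton Hinman (The Norton
     Facsimile, 1968)</bibl>
  </listBibl>
 </div>
</back>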

13.2.2 Specialized Front and Back Matter

TEI simplePrint also provides elements for some additional components of front or back matter which are characteristic of particular kinds of text, in particular old play texts. These often include lists of dramatis personae and notes about the setting of a play, for which the following elements are provided:

  • castList (cast list) contains a single cast list or dramatis personae.
  • castItem (cast list item) contains a single entry within a cast list, describing either a single role or a list of non-speaking roles.
  • castGroup (cast list grouping) groups one or more individual castItem elements within a cast list.
  • role (role) contains the name of a dramatic role, as given in a cast list.
  • roleDesc (role description) describes a character's role in a drama.
  • actor contains the name of an actor appearing within a cast list.
  • set (setting) contains a description of the setting, time, locale, appearance, etc., of the action of a play, typically found in the front matter of a printed performance text (not a stage direction).

Note that these elements are intended for use in marking up cast lists and setting notes as they appear in a source document. They are not intended for use when marking up definitive lists of the different roles identified in a play, except in so far as that may have been their original purpose.

The following example shows one way of encoding the last part of Shakespeare's Tempest, as printed in the first folio:

<back>
 <div type="epilogue">
  <head>Epilogue, spoken by Prospero.</head>
  <sp>
   <l>Now my Charmes are all ore-throwne,</l>
   <l>And what strength I have's mine owne</l>
   <l>As you from crimes would pardon'd be,</l>
   <l>Let your Indulgence set me free.</l>
  </sp>
  <stage>Exit</stage>
 </div>
 <set>
  <p>The Scene, an un-inhabited Island.</p>
 </set>
 <castList>
  <head>Names of the Actors.</head>
  <castItem>Alonso, K. of Naples</castItem>
  <castItem>Sebastian, his Brother.</castItem>
  <castItem>Prospero, the right Duke of Millaine.</castItem>
<!-- etc -->
 </castList>
 <trailer>FINIS</trailer>
</back>
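If more detail is wanted, the same cast list could be tagged more finely using role and roleDesc. This is a sketch of one possible approach, not a recommendation that every cast list be encoded at this level:
<castList>
 <head>Names of the Actors.</head>
 <castItem>
  <role>Alonso</role>, <roleDesc>K. of Naples</roleDesc>
 </castItem>
 <castItem>
  <role>Sebastian</role>, <roleDesc>his Brother.</roleDesc>
 </castItem>
 <castItem>
  <role>Prospero</role>, <roleDesc>the right Duke of Millaine.</roleDesc>
 </castItem>
<!-- etc -->
</castList>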

14 Encoding a Digital Facsimile

The following elements may be used to encode a text represented by a collection of digital images, either alone or in conjunction with a textual transcription.

As mentioned in section 9 Figures and Graphics above, a TEI simplePrint document may reference a set of page images, alone, or in combination with a transcription. For ease of management, it is strongly recommended that the graphic elements representing those page images be grouped together within a facsimile element, as in the following example:
<facsimile>
 <graphic url="page1.png" xml:id="pg1"/>
 <graphic url="page2.png" xml:id="pg2"/>
</facsimile>
If a transcription is supplied in addition, the xml:id values can be used to align the page breaks within it with the relevant image, rather than using the URL given on the graphic element.
<text>
<!-- ...-->
 <pb facs="#pg1"/>
<!-- text contained on page 1 -->
 <pb facs="#pg2"/>
<!-- text contained on page 2 -->
<!-- ...-->
</text>

The surface element is useful in two situations: when it is desired to group different images of the same page, for example of different resolutions; and when it is desired to align parts of a page image with parts of a transcription. The zone element is used to define (and hence provide an identifier for) the location of a part of an image with reference to the surface on which it appears.

In this example, a thumbnail and a high resolution image are associated with the same surface:
<facsimile>
 <surface>
  <graphic xml:id="page1T" url="thumbs/page1.png"/>
  <graphic xml:id="page1" url="page1.png"/>
 </surface>
</facsimile>
In this example, the head element in the transcription is aligned with the top half of a square image:
<facsimile>
 <surface ulx="1" uly="1" lrx="4" lry="4">
  <graphic url="page1.png" xml:id="page1"/>
  <zone xml:id="topHalfP1" ulx="1" uly="1" lrx="2" lry="4"/>
 </surface>
</facsimile>
<text>
 <body>
<!-- ... -->
  <pb facs="#page1"/>
  <head facs="#topHalfP1">Text of Heading</head>
<!-- ...-->
 </body>
</text>

A more detailed explanation of the use of these attributes and other associated elements is given in the full TEI Guidelines.

15 The Electronic Title Page

Every TEI text has a header which provides information analogous to that provided by the title page of a printed text. The header is introduced by the element teiHeader and has four major parts:

A corpus or collection of texts with many shared characteristics may have one header for the corpus and individual headers for each component of the corpus. In this case the type attribute indicates the type of header. <teiHeader type="corpus"> introduces the header for corpus-level information.

Some of the header elements contain running prose consisting of one or more p elements. Others are grouped:

15.1 The File Description

The fileDesc element is mandatory. It contains a full bibliographic description of the file with the following elements:

  • titleStmt (title statement) groups information about the title of a work and those responsible for its content.
  • editionStmt (edition statement) groups information relating to one edition of a text.
  • extent (extent) describes the approximate size of a text stored on some carrier medium or of some other object, digital or non-digital, specified in any convenient units.
  • publicationStmt (publication statement) groups information concerning the publication or distribution of an electronic or other text.
  • seriesStmt (series statement) groups information about the series, if any, to which a publication belongs.
  • notesStmt (notes statement) collects together any notes providing information about a text additional to that recorded in other parts of the bibliographic description.
  • sourceDesc (source description) describes the source(s) from which an electronic text was derived or generated, typically a bibliographic description in the case of a digitized text, or a phrase such as ‘born digital’ for a text which has no previous existence.
A minimal TEI header has the following structure:
<teiHeader>
 <fileDesc>
  <titleStmt>
<!-- [ bibliographic description of the digital resource ] -->
  </titleStmt>
  <publicationStmt>
<!-- [ information about how the resource is distributed ] -->
  </publicationStmt>
  <sourceDesc>
<!-- [ information about the sources from which the digital resource is derived ] -->
  </sourceDesc>
 </fileDesc>
</teiHeader>

15.1.1 The Title Statement

The following elements can be used in the titleStmt to provide information about the title of a work and those responsible for its content:

  • title (title) contains a title for any kind of work.
  • author (author) in a bibliographic reference, contains the name(s) of an author, personal or corporate, of a work; for example in the same form as that provided by a recognized bibliographic name authority.
  • respStmt (statement of responsibility) supplies a statement of responsibility for the intellectual content of a text, edition, recording, or series, where the specialized elements for authors, editors, etc. do not suffice or do not apply. May also be used to encode information about individuals or organizations which have played a role in the production or distribution of a bibliographic work.

The title of a digital resource derived from a non-digital original may be similar to that of its source but should be distinct from it, for example: [title of source]: TEI XML edition or A machine readable version of: [title of source]

The generic respStmt element may be used to indicate any kind of responsibility, ranging from a funder or sponsor to an illustrator or editor. It contains the following subcomponents:
  • resp (responsibility) contains a phrase describing the nature of a person's intellectual responsibility, or an organization's role in the production or distribution of a work.
  • name (name, proper noun) contains a proper noun or noun phrase.
Example:
<titleStmt>
 <title>Two stories by Edgar Allan Poe</title>
 <author>Poe, Edgar Allan (1809-1849)</author>
 <respStmt>
  <resp>TEI encoding</resp>
  <name>James D. Benson</name>
 </respStmt>
 <respStmt>
  <resp>Funding </resp>
  <name>Getty Foundation</name>
 </respStmt>
</titleStmt>

15.1.2 The Edition Statement

The editionStmt groups information relating to one edition of the digital resource (where edition is used as elsewhere in bibliography), and may include the following elements:

  • edition (edition) describes the particularities of one edition of a text.
  • respStmt (statement of responsibility) supplies a statement of responsibility for the intellectual content of a text, edition, recording, or series, where the specialized elements for authors, editors, etc. do not suffice or do not apply. May also be used to encode information about individuals or organizations which have played a role in the production or distribution of a bibliographic work.
Example:
<editionStmt>
 <edition n="U2">Third draft, substantially revised <date>1987</date>
 </edition>
</editionStmt>

Determining exactly what constitutes a new edition of an electronic text is left to the encoder.

15.1.3 The Extent Statement

The extent statement describes the approximate size of the digital resource.

Example:
<extent>15 Mb
</extent>

15.1.4 The Publication Statement

The publicationStmt is mandatory. It may contain a simple prose description or groups of the elements described below:

  • publisher (publisher) provides the name of the organization responsible for the publication or distribution of a bibliographic item.
  • distributor (distributor) supplies the name of a person or other agency responsible for the distribution of a text.

At least one of these elements must be present, unless the entire publication statement is in prose. The following elements may occur within them:

  • pubPlace (publication place) contains the name of the place where a bibliographic item was published.
  • address (address) contains a postal address, for example of a publisher, an organization, or an individual.
  • addrLine (address line) contains one line of a postal address.
  • idno (identifier) supplies any form of identifier used to identify some object, such as a bibliographic item, a person, a title, an organization, etc. in a standardized way.
  • availability (availability) supplies information about the availability of a text, for example any restrictions on its use or distribution, its copyright status, any licence applying to it, etc.
  • licence contains information about a licence or other legal agreement applicable to the text.
  • date (date) contains a date in any format.
Example:
<publicationStmt>
 <publisher>University of Victoria Humanities Computing and Media
   Centre</publisher>
 <pubPlace>Victoria, BC</pubPlace>
 <date>2011</date>
 <availability status="restricted">
  <licence target="http://creativecommons.org/licenses/by-sa/3.0/"> Distributed
     under a Creative Commons Attribution-ShareAlike 3.0 Unported License
  </licence>
 </availability>
</publicationStmt>
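The distributor, address, addrLine, and idno elements do not appear in that example. A sketch of their use follows; the organization, address, and identifier are invented, and the type value on idno is an assumption made for illustration:
<publicationStmt>
 <distributor>Example Text Archive</distributor>
 <address>
  <addrLine>1 Example Street</addrLine>
  <addrLine>Exampletown</addrLine>
 </address>
 <idno type="local">ETA-0042</idno>
 <availability status="free">
  <p>Freely available for non-commercial use.</p>
 </availability>
</publicationStmt>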

15.1.5 Series and Notes Statements

The seriesStmt element groups information about the series, if any, to which a publication belongs. It may contain title, idno, or respStmt elements.

The notesStmt, if used, contains one or more note elements which contain a note or annotation. Some information found in the notes area in conventional bibliography has been assigned specific elements in the TEI scheme.
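
As a minimal sketch, a series statement and a notes statement might look like this. The series title, editor name, volume number, and note text are all invented, and the type value on idno is an assumption:
<seriesStmt>
 <title>Example Series of Early Modern Texts</title>
 <respStmt>
  <resp>Series editor</resp>
  <name>A. N. Editor</name>
 </respStmt>
 <idno type="vol">12</idno>
</seriesStmt>
<notesStmt>
 <note>Transcribed from page images; running heads have been omitted.</note>
</notesStmt>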

15.1.6 The Source Description

The sourceDesc is a mandatory element which records details of the source or sources from which the computer file is derived. It may contain simple prose or a bibliographic citation, using one or more of the following elements:

  • bibl (bibliographic citation) contains a loosely-structured bibliographic citation of which the sub-components may or may not be explicitly tagged.
  • listBibl (citation list) contains a list of bibliographic citations of any kind.
Examples:
<sourceDesc>
 <bibl>The first folio of Shakespeare, prepared by Charlton Hinman (The Norton
   Facsimile, 1968)</bibl>
</sourceDesc>
<sourceDesc>
 <bibl>
  <author>CNN Network News</author>
  <title>News headlines</title>
  <date>12 Jun 1989</date>
 </bibl>
</sourceDesc>
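Where a resource draws on more than one source, the citations may be grouped in a listBibl. The following sketch reuses details from the title pages shown in section 13.1.1 Title Page; pairing these two works as sources of a single resource is purely illustrative:
<sourceDesc>
 <listBibl>
  <bibl>
   <author>John Milton</author>
   <title>Paradise Regain'd</title>
   <pubPlace>London</pubPlace>
   <date>1671</date>
  </bibl>
  <bibl>
   <author>Agnes Strickland</author>
   <title>Lives of the Queens of England</title>
   <pubPlace>Philadelphia</pubPlace>
   <date>1860</date>
  </bibl>
 </listBibl>
</sourceDesc>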

15.2 The Encoding Description

The encodingDesc element specifies the methods and editorial principles which governed the transcription of the text. Its use is highly recommended. It may be prose description or may contain more specialized elements chosen from the following list:

  • projectDesc (project description) describes in detail the aim or purpose for which an electronic file was encoded, together with any other relevant information concerning the process by which it was assembled or collected.
  • samplingDecl (sampling declaration) contains a prose description of the rationale and methods used in selecting texts, or parts of a text, for inclusion in the resource.
  • editorialDecl (editorial practice declaration) provides details of editorial principles and practices applied during the encoding of a text.
  • tagsDecl (tagging declaration) provides detailed information about the tagging applied to a document.
  • refsDecl (references declaration) specifies how canonical references are constructed for this text.
  • listPrefixDef (list of prefix definitions) contains a list of definitions of prefixing schemes used in teidata.pointer values, showing how abbreviated URIs using each scheme may be expanded into full URIs.
  • prefixDef (prefix definition) defines a prefixing scheme used in teidata.pointer values, showing how abbreviated URIs using the scheme may be expanded into full URIs.
  • classDecl (classification declarations) contains one or more taxonomies defining any classificatory codes used elsewhere in the text.
  • charDecl (character declarations) provides information about nonstandard characters and glyphs.

15.2.1 Project Description and Sampling Declaration

Examples of projectDesc and samplingDecl:
<encodingDesc>
 <projectDesc>
  <p>Texts collected for use in the Claremont Shakespeare Clinic, June 1990. </p>
 </projectDesc>
</encodingDesc>
<encodingDesc>
 <samplingDecl>
  <p>Samples of 2000 words taken from the beginning of the text</p>
 </samplingDecl>
</encodingDesc>

15.2.2 Editorial Declarations

The editorialDecl contains a prose description of the practices used when encoding the text. Typically this description should cover such topics as the following, each of which may conveniently be given as a separate paragraph:

correction
how and under what circumstances corrections have been made in the text.
normalization
the extent to which the original source has been regularized or normalized.
quotation
what has been done with quotation marks in the original -- have they been retained or replaced by entity references, are opening and closing quotes distinguished, etc.
hyphenation
what has been done with hyphens (especially end-of-line hyphens) in the original -- have they been retained, replaced by entity references, etc.
segmentation
how the text has been segmented, for example into sentences, tone-units, graphemic strata, etc.
interpretation
what analytic or interpretive information has been added to the text.
Example:
<editorialDecl>
 <p>The part of speech analysis applied throughout section 4 was added by hand and
   has not been checked.</p>
 <p>Errors in transcription controlled by using the WordPerfect spelling
   checker.</p>
 <p>All words converted to Modern American spelling using Webster's 9th Collegiate
   dictionary.</p>
</editorialDecl>

The full TEI Guidelines provide specialized elements for each of the topics above; these are not however included in TEI simplePrint.

15.2.3 Tagging Declaration

When it does not consist simply of a prose description, the tagsDecl element may contain a number of more specialized elements providing additional information about how the document concerned has been marked up. The following elements may be used:

  • rendition (rendition) supplies information about the rendition or appearance of one or more elements in the source text.
  • namespace (namespace) supplies the formal name of the namespace to which the elements documented by its children belong.
  • tagUsage (element usage) documents the usage of a specific element within a specified document.
Here is a simple example showing how these elements may be used. It records the number of times the elements hi and title from the TEI namespace occur in the document. It also documents how the original printed appearance of the source document has been represented using TEI tagging:
<tagsDecl partial="true">
 <rendition xml:id="rend-bo">font-weight:bold</rendition>
 <rendition xml:id="rend-it" selector="hi, title">font-style:italic</rendition>
 <namespace name="http://www.tei-c.org/ns/1.0">
  <tagUsage gi="hi" occurs="467"/>
  <tagUsage gi="title" occurs="45"/>
 </namespace>
</tagsDecl>

The rendition elements here contain fragments expressed in the W3C standard Cascading Stylesheets language (CSS). Their function here is to associate the particular styles concerned with an identifier (for example rend-bo) which can then be pointed to from elsewhere within the document by means of the rendition attribute mentioned in section 3.5.1 Changes of Typeface, etc. above. To indicate, for example, that a particular name in the document was rendered in a bold font it might be tagged <name rendition="#rend-bo">. The selector attribute used in the preceding example is used to indicate once for all a default rendition value to be associated with several elements: in this example, unless otherwise indicated, it is assumed that the content of each hi and each title element was originally rendered using an italic font.

For TEI simplePrint, a large set of such rendition definitions has been predefined. The encoder is therefore not required to supply detailed declarations, but can refer to the predefined definitions using the following values:

simple:allcaps
all capitals
simple:blackletter
black letter or gothic typeface
simple:bold
bold typeface
simple:bottombraced
marked with a brace under the bottom of the text
simple:boxed
border around the text
simple:centre
centred text
simple:cursive
cursive typeface
simple:display
block display
simple:doublestrikethrough
strikethrough with double line
simple:doubleunderline
underlined with double line
simple:dropcap
initial letter larger or decorated
simple:float
floated out of main flow
simple:hyphen
with a hyphen here (e.g. in line break)
simple:inline
inline rendering
simple:justify
justified text
simple:italic
italic typeface
simple:larger
larger type
simple:left
aligned to the left or left-justified
simple:leftbraced
marked with a brace on the left side of the text
simple:letterspace
larger-than-normal spacing between letters, usually for emphasis
simple:literal
fixed-width typeface, spacing preserved
simple:normalstyle
upright shape and default weight of typeface
simple:normalweight
normal typeface weight
simple:right
aligned to the right or right-justified
simple:rightbraced
marked with a brace to the right of the text
simple:rotateleft
rotated to the left
simple:rotateright
rotated to the right
simple:smallcaps
small caps
simple:smaller
smaller type
simple:strikethrough
strikethrough
simple:subscript
subscript
simple:superscript
superscript
simple:topbraced
marked with a brace above the text
simple:typewriter
fixed-width typeface, like typewriter
simple:underline
underlined with single line
simple:wavyunderline
underlined with wavy line

The simple: prefix used here must be mapped to a location at which the full rendition declaration can be found, by default the XML source of the present document.
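
As a sketch of how these predefined values might be used, the following fragments reuse text from earlier examples in this document. The choice of renditions is an assumption made for illustration, and the second fragment assumes that several values may be supplied together, separated by spaces:
<p>We must not strike <hi rendition="simple:italic">Robert</hi> for
<hi rendition="simple:italic">Richard.</hi>
</p>
<head rendition="simple:centre simple:allcaps">Paradise Regain'd</head>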

Full details of the way these elements may be used are provided in the relevant section of the TEI Guidelines (http://www.tei-c.org/release/doc/tei-p5-doc/en/html/HD.html#HD57).

15.2.4 Reference, Prefix, and Classification Declarations

The refsDecl element is used to document the way in which any standard referencing scheme built into the encoding works. In its simplest form, it consists of prose description.

Example:
<refsDecl>
 <p>The @n attribute on each div element contains the canonical reference for
   each division in the form XX.yyy where XX is the book number in roman numeral
   and yyy is the section number in arabic.</p>
 <p>Milestone tags refer to the edition of 1830 as E30 and that of 1850 as E50.</p>
</refsDecl>
The listPrefixDef element contains one or more prefixDef elements, each defining a prefix which has been used to abbreviate references to other documents, for example as the value of a target or other pointing attribute. The definition provides information about how the prefix can be translated automatically into a full URL, as in the following example:
<listPrefixDef>
 <prefixDef ident="psn"
  matchPattern="([A-Z]+)"
  replacementPattern="http://www.example.com/personography.xml#$1"/>

</listPrefixDef>

In this case, a pointer value in the form psn:MDH would be translated to http://www.example.com/personography.xml#MDH.
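
As a hedged illustration of how such a pointer might appear in the body of a text (the sentence and the personal name are invented for this sketch, and the use of the ref attribute on name is an assumption about the encoder's practice), consider:
<p>The deed was witnessed by
 <!-- psn:MDH expands to http://www.example.com/personography.xml#MDH -->
 <name type="person" ref="psn:MDH">Mary D. Holt</name>.</p>
A processor that knows the prefixDef above can resolve the abbreviated value without the full URL being repeated at every reference.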

The classDecl element groups together definitions or sources for any descriptive classification schemes or taxonomies used by other parts of the header. These schemes may be defined in a number of different ways, using one or more of the following elements:

  • taxonomy (taxonomy) defines a typology either implicitly, by means of a bibliographic citation, or explicitly by a structured taxonomy.
  • bibl (bibliographic citation) contains a loosely-structured bibliographic citation of which the sub-components may or may not be explicitly tagged.
  • category (category) contains an individual descriptive category, possibly nested within a superordinate category, within a user-defined taxonomy.
  • catDesc (category description) describes some category within a taxonomy or text typology, either in the form of a brief prose description or in terms of the situational parameters used by the TEI formal <textDesc>.
In the simplest case, the taxonomy may be defined by a bibliographic reference, as in the following example:
<classDecl>
 <taxonomy xml:id="LC-SH">
  <bibl>Library of Congress Subject Headings </bibl>
 </taxonomy>
</classDecl>
Alternatively, or in addition, the encoder may define a special purpose classification scheme, as in the following example:
<taxonomy xml:id="B">
 <bibl>Brown Corpus</bibl>
 <category xml:id="B.A">
  <catDesc>Press Reportage</catDesc>
  <category xml:id="B.A1">
   <catDesc>Daily</catDesc>
  </category>
  <category xml:id="B.A2">
   <catDesc>Sunday</catDesc>
  </category>
  <category xml:id="B.A3">
   <catDesc>National</catDesc>
  </category>
  <category xml:id="B.A4">
   <catDesc>Provincial</catDesc>
  </category>
  <category xml:id="B.A5">
   <catDesc>Political</catDesc>
  </category>
  <category xml:id="B.A6">
   <catDesc>Sports</catDesc>
  </category>
 </category>
 <category xml:id="B.D">
  <catDesc>Religion</catDesc>
  <category xml:id="B.D1">
   <catDesc>Books</catDesc>
  </category>
  <category xml:id="B.D2">
   <catDesc>Periodicals and tracts</catDesc>
  </category>
 </category>
</taxonomy>

Linkage between a particular text and a category within such a taxonomy is made by means of the catRef element within the textClass element, as described in section 15.3 The Profile Description below.

15.2.5 The character declaration

As mentioned in section 5.2 Formulae, Codes, and Special Characters above, the element g is used to indicate the presence of a nonstandard character or glyph in a transcription, and to reference a definition or description of it in the Header. These definitions are provided by means of the following specialised elements given within the charDecl component of the encodingDesc:
  • char (character) provides descriptive information about a character.
  • glyph (character glyph) provides descriptive information about a character glyph.
  • desc (description) contains a short description of the purpose, function, or use of its parent element, or when the parent is a documentation element, describes or defines the object being documented.
  • mapping (character mapping) contains one or more characters which are related to the parent character or glyph in some respect, as specified by the type attribute.
For example, the alchemical symbol for air might be encoded where it appears in a text using a g element, whose ref attribute might have a value #air to link to the following simple definition for the symbol concerned:
<char xml:id="air">
 <unicodeProp name="Name"
  value="ALCHEMICAL SYMBOL FOR AIR"/>

 <mapping type="standard">🜁</mapping>
</char>
Further details of these and related elements are provided in section http://www.tei-c.org/release/doc/tei-p5-doc/en/html/WD.html#D25-20 of the TEI Guidelines.
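
A minimal sketch of the corresponding usage in a transcription, assuming the char definition above has been supplied in the header (the surrounding sentence is invented for illustration):
<p>Take equal parts of water and of
 <!-- ref points to the char element whose xml:id is "air" -->
 <g ref="#air"/>.</p>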

15.3 The Profile Description

The profileDesc element gathers together information about various descriptive aspects of a text. It has the following optional components:

  • creation (creation) contains information about the creation of a text.
  • abstract contains a summary or formal abstract prefixed to an existing source document by the encoder.
  • particDesc (participation description) describes the identifiable speakers, voices, or other participants in any kind of text or other persons named or otherwise referred to in a text, edition, or metadata.
  • settingDesc (setting description) describes the setting or settings within which a language interaction takes place, or other places otherwise referred to in a text, edition, or metadata.
  • langUsage (language usage) describes the languages, sublanguages, registers, dialects, etc. represented within a text.
  • textClass (text classification) groups information which describes the nature or topic of a text in terms of a standard classification scheme, thesaurus, etc.

The creation element documents when and where a work was created, even if it was not published or recorded there:

Example:
<creation>
 <date when="1992-08">August 1992</date>
 <name type="place">Taos, New Mexico</name>
</creation>
The abstract element may be used to provide a brief summary or abstract of the document concerned. It is most often used for born-digital texts:
<profileDesc>
 <abstract>
  <p>This paper is a draft studying various aspects of using the TEI as a reference
     serialization framework for LMF. Comments are welcome to bring this to a useful
     document for the community.</p>
 </abstract>
</profileDesc>
The particDesc element is used to list descriptive information about the real or fictional participants in a text, for example the characters in a novel or a play. It contains at least one listPerson element, which contains individual person elements.
  • listPerson (list of persons) contains a list of descriptions, each of which provides information about an identifiable person or a group of people, for example the participants in a language interaction, or the people referred to in a historical source.
  • person (person) provides information about an identifiable individual, for example a participant in a language interaction, or a person referred to in a historical source.
For example:
<profileDesc>
 <particDesc>
  <listPerson>
   <person xml:id="OPI">
    <p>
     <name>Dr Opimian</name>: named for the famous Roman fine wine.</p>
   </person>
   <person xml:id="GRM">
    <p>
     <name>Mr Gryll</name>: named for the mythical Gryllus, one of Ulysses'
         sailors transformed by Circe into a pig, who argues that he was happier in
         that state than as a man.</p>
   </person>
  </listPerson>
 </particDesc>
</profileDesc>
In the same way, the settingDesc element can be used to list descriptive information about the real or fictional places mentioned in a text. It contains at least one listPlace element, which contains individual place elements.
  • listPlace (list of places) contains a list of places, optionally followed by a list of relationships (other than containment) defined amongst them.
  • place (place) contains data about a geographic location
For example:
<profileDesc>
 <settingDesc>
  <listPlace>
   <head>Houses mentioned in <title>Pride and Prejudice</title>
   </head>
   <place xml:id="NETF1">
    <p>
     <name>Netherfield Park</name>: home of the Bingleys</p>
   </place>
   <place xml:id="PEMB1">
    <p>
     <name>Pemberley</name>: home of Mr Darcy</p>
   </place>
  </listPlace>
 </settingDesc>
</profileDesc>

The full TEI Guidelines provide a rich range of additional elements to define more structured information about persons and places; these are not, however, available in TEI simplePrint.

The langUsage element is useful where a text contains many different languages. It may contain language elements to document each particular language used:
  • language (language) characterizes a single language or sublanguage used within a text.
For example, a text written predominantly in French as spoken in Quebec, but also containing smaller amounts of British and Canadian English, might be documented as follows:
<langUsage>
 <language ident="fr-CA" usage="60">Québécois</language>
 <language ident="en-CA" usage="20">Canadian Business English</language>
 <language ident="en-GB" usage="20">British English</language>
</langUsage>

The textClass element classifies a text. This may be done with reference to a classification system locally defined by means of the classDecl element, or by reference to some externally defined established scheme such as the Universal Decimal Classification. Texts may also be classified using lists of keywords, which may themselves be drawn from locally or externally defined control lists. The following elements are used to supply such classifications:

  • classCode (classification code) contains the classification code used for this text in some standard classification system.
  • catRef (category reference) specifies one or more defined categories within some taxonomy or text typology.
  • keywords (keywords) contains a list of keywords or phrases identifying the topic or nature of a text.
The simplest way of classifying a text is by means of the classCode element. For example, a text with classification 410 in the Universal Decimal Classification might be documented as follows:
<classCode scheme="http://www.udc.org">410</classCode>
When a classification scheme has been locally defined using the taxonomy element discussed in the preceding subsection, the catRef element should be used to reference it. To continue the earlier example, a work classified in the Brown Corpus as Press reportage - Sunday and also as Religion might be documented as follows:
<catRef target="#B.A2 #B.D"/>
The keywords element contains one or more keywords or phrases identifying the topic or nature of a text, each tagged as a term. As usual, the attribute scheme identifies the source from which these terms are taken. For example, if the LC Subject Headings have been declared in a taxonomy element as above, keywords drawn from them might be supplied as follows:
<textClass>
 <keywords scheme="#LC-SH">
  <term>English literature</term>
  <term>History and criticism</term>
  <term>Data processing.</term>
 </keywords>
</textClass>

Multiple classifications may be supplied using any of the mechanisms described in this section.
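
A sketch combining the three mechanisms for a single text, reusing the identifiers declared in the earlier examples (the particular combination is illustrative only and does not describe any real text):
<textClass>
 <!-- code from an externally defined scheme -->
 <classCode scheme="http://www.udc.org">410</classCode>
 <!-- categories from the locally defined taxonomy: Sunday press reportage, and Religion -->
 <catRef target="#B.A2 #B.D"/>
 <!-- keywords from the scheme declared with xml:id="LC-SH" -->
 <keywords scheme="#LC-SH">
  <term>English literature</term>
 </keywords>
</textClass>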

15.4 Other forms of metadata

The TEI header was one of the first attempts to provide a full range of metadata elements, but it is by no means the only standard now used for this purpose. To facilitate the management of large digital collections and to simplify interoperability of TEI and non-TEI resources, the following element may be found useful:

  • xenoData (non-TEI metadata) provides a container element into which metadata in non-TEI formats may be placed.

A typical use for this element might be to store a set of descriptors conforming to the Dublin Core standard in the TEI header rather than to generate them automatically from the corresponding TEI elements. For examples and discussion, see the TEI Guidelines at http://www.tei-c.org/release/doc/tei-p5-doc/en/html/HD.html#HD9
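
As a minimal sketch, assuming Dublin Core descriptors are expressed in RDF/XML (the title, creator, and language values are placeholders), such metadata might be embedded in the teiHeader as follows:
<xenoData>
 <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
  xmlns:dc="http://purl.org/dc/elements/1.1/">
  <!-- an empty rdf:about refers to the document containing the description -->
  <rdf:Description rdf:about="">
   <dc:title>Title of the encoded work</dc:title>
   <dc:creator>Name of the author</dc:creator>
   <dc:language>en</dc:language>
  </rdf:Description>
 </rdf:RDF>
</xenoData>
Because the embedded elements belong to non-TEI namespaces, their namespace declarations need to be supplied, as on the rdf:RDF element here.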

15.5 The Revision Description

The revisionDesc element provides a change log in which each significant change made to a text may be recorded. It is always the last element in a teiHeader and contains the following elements:

  • change (change) documents a change or set of changes made during the production of a source document, or during the revision of an electronic file.
  • listChange groups a number of change descriptions associated with either the creation of a source text or the revision of an encoded text.

Each change element contains a brief description of a significant change. The attributes when and who may be used to identify when the change was carried out and the person responsible for it.

It is good practice (but not required) to group changes together within a listChange element.

Example:
<revisionDesc>
 <listChange>
  <change when="1991-11-11" who="#LB">deleted chapter 10</change>
  <change when="1991-11-02" who="#MSM">completed first draft</change>
 </listChange>
</revisionDesc>

In a production environment it will usually be found preferable to use some kind of automated system to track and record changes. Many such version control systems, as they are known, can also be configured to update the TEI header of a file automatically.

16 The Simple Processing Model

Unlike most other TEI customizations, TEI simplePrint includes documentation of the intended processing associated with the majority of elements. As noted above, the TEI provides components such as the rendition attribute to indicate the appearance of particular parts of a document in the non-digital source from which it is derived. With TEI simplePrint, it is also possible to indicate how in general an element should be processed, in particular its intended appearance when processed for display on a screen or on paper. This ability derives from a number of capabilities recently added to the TEI architecture for the specification of processing, which were developed as part of the project that defined the TEI simplePrint schema.

The key feature of this ‘Processing Model’ is a notation that allows the encoder to associate each element with one or more categories, which we call its behaviours. In addition, the Processing Model indicates how the element should be rendered, possibly differently in differing circumstances, using the W3C Cascading Style Sheets (CSS) mentioned above. It is consequently much easier to develop processors for documents conforming to TEI simplePrint, since the complexity of the task is much reduced.
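
As a minimal sketch of this notation (an illustrative fragment, not the declaration actually used in the TEI simplePrint schema), an element specification might associate head with two behaviours, choosing between them by context and attaching CSS to one of them:
<elementSpec ident="head" mode="change">
 <!-- a heading inside a figure is treated as an italic block -->
 <model predicate="parent::figure" behaviour="block">
  <outputRendition>font-style: italic;</outputRendition>
 </model>
 <!-- in any other context, fall back to the heading behaviour -->
 <model behaviour="heading"/>
</elementSpec>
The predicate attribute holds an XPath expression. In this sketch the models are read in order and the first one whose predicate is satisfied is applied, so the unconditional model serves as the default.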

Twenty-five different behaviours are currently defined by the TEI Processing Model. Their names indicate informally the categorization concerned, and should be readily comprehensible for most programmers. The following table indicates the TEI simplePrint elements associated with each:

Behaviour | Used by | Effect
alternate | choice date | support display of alternative visualizations, for example by displaying the preferred content, by displaying both in parallel, or by toggling between the two.
anchor | anchor | create an identifiable anchor point in the output.
block | address addrLine argument back body byline closer dateline div docTitle epigraph figure floatingText formula front fw group head imprimatur l lg listBibl note opener postscript q quote role roleDesc salute signed sp speaker spGrp stage titlePage titlePart trailer | create a block structure
body | text | create the body of a document
break | cb lb pb | create a line, column, or page break according to the value of type
cell | cell | create a table cell
cit | cit | show the content, with an indication of the source
document | TEI | start a new output document
glyph | g | show a character by looking up reference to a chardesc at the given URI
graphic | graphic | if URL is present, use it to display graphic, else display a placeholder image
heading | head | creates a heading
index | body | generate list according to type
inline | abbr actor add am author bibl biblScope c choice code corr date del desc docAuthor docDate docEdition docImprint editor email ex expan figDesc figure foreign formula fw g gap hi label measure milestone name note num orig pc q quote ref reg relatedItem rhyme rs s salute seg sic signed subst supplied time title unclear w | creates inline element out of content; if there's something in <outputRendition>, use that formatting; otherwise just show text of selected content
link | ref | create hyperlink
list | castGroup castList list listBibl | create a list
listItem | bibl castItem item | create a list item
metadata | teiHeader | create metadata section
note | note | create a note, often out of line, depending on the value of place; could be margin, footnote, endnote, inline
omit | author editor publisher pubPlace profileDesc revisionDesc encodingDesc | do nothing, do not process children
paragraph | ab p | create a paragraph out of content
row | row | create a table row
section | div | create a new section of the output document
table | table | create a table
text | title | create literal text
title | fileDesc | create document title

Full documentation of the Processing Model is provided in section http://www.tei-c.org/release/doc/tei-p5-doc/en/html/TD.html#TDPM of the TEI Guidelines, and we do not describe it further here.

17 The TEI simplePrint schema

Like other TEI customizations, TEI simplePrint is defined by reference to the TEI Guidelines. The following reference documentation provides formal specifications for each element, model class, attribute class, macro and datatype it uses. These concepts are further explained in the TEI Guidelines.

Specifications are provided here for each component which has been modified for inclusion in TEI simplePrint. Almost every textual element has been modified, if only to include a processing model component. Note that the cross references included in these specifications are to the section of the full TEI Guidelines where the subject is treated, and not to sections of the present document.

Schema tei_simplePrint: Elements

<ab>

<ab> (anonymous block) contains any component-level unit of text, acting as a container for phrase-level or inter-level elements analogous to, but without the same constraints as, a paragraph. [17.3. Blocks, Segments, and Anchors]
Module linking
Attributes
Member of
Contained by
May contain
Note

The ab element may be used at the encoder's discretion to mark any component-level elements in a text for which no other more specific appropriate markup is defined. Unlike paragraphs, ab may nest and may use the type and subtype attributes.

Example
<div type="book" n="Genesis">
 <div type="chapter" n="1">
  <ab>In the beginning God created the heaven and the earth.</ab>
  <ab>And the earth was without form, and void; and
     darkness was upon the face of the deep. And the
     spirit of God moved upon the face of the waters.</ab>
  <ab>And God said, Let there be light: and there was light.</ab>
<!-- ...-->
 </div>
</div>
Schematron

<sch:rule context="tei:ab">
<sch:report test="(ancestor::tei:l or ancestor::tei:lg) and not( ancestor::tei:floatingText |parent::tei:figure |parent::tei:note )"> Abstract model violation: Lines may not contain higher-level divisions such as p or ab, unless ab is a child of figure or note, or is a descendant of floatingText.
</sch:report>
</sch:rule>
Content model
<content>
 <macroRef key="macro.abContent"/>
</content>
Schema Declaration
element ab
{
   att.global.attributes,
   att.typed.attributes,
   att.fragmentable.attributes,
   att.written.attributes,
   att.cmc.attributes,
   macro.abContent
}
Processing Model
<model behaviour="paragraph"/>

<abbr>

<abbr> (abbreviation) contains an abbreviation of any sort. [3.6.5. Abbreviations and Their Expansions]
Module core
Attributes
type (type) allows the encoder to classify the abbreviation according to some convenient typology.
Derived from att.typed
Status Optional
Datatype teidata.enumerated
Member of
Contained by
May contain
analysis: c pc s w
figures: figure formula
gaiji: g
header: idno
linking: anchor seg
tagdocs: code
textstructure: floatingText
transcr: fw subst supplied
verse: rhyme
character data
Example
<choice>
 <expan>North Atlantic Treaty Organization</expan>
 <abbr cert="low">NorATO</abbr>
 <abbr cert="high">NATO</abbr>
 <abbr cert="high" xml:lang="fr">OTAN</abbr>
</choice>
Example
<choice>
 <abbr>SPQR</abbr>
 <expan>senatus populusque romanorum</expan>
</choice>
Content model
<content>
 <macroRef key="macro.phraseSeq"/>
</content>
Schema Declaration
element abbr
{
   att.global.attributes,
   att.typed.attribute.subtype,
   att.cmc.attributes,
   attribute type { teidata.enumerated }?,
   macro.phraseSeq
}
Processing Model
<model behaviour="inline"/>

<abstract>

<abstract> contains a summary or formal abstract prefixed to an existing source document by the encoder. [2.4.4. Abstracts]
Module header
Attributes
Member of
Contained by
header: profileDesc
May contain
figures: table
linking: ab
namesdates: listPerson listPlace
Note

This element is intended only for cases where no abstract is available in the original source. Any abstract already present in the source document should be encoded as a div within the front, as it should for a born-digital document.

Example
<profileDesc>
 <abstract resp="#LB">
  <p>Good database design involves the acquisition and deployment of
     skills which have a wider relevance to the educational process. From
     a set of more or less instinctive rules of thumb a formal discipline
     or "methodology" of database design has evolved. Applying that
     methodology can be of great benefit to a very wide range of academic
     subjects: it requires fundamental skills of abstraction and
     generalisation and it provides a simple mechanism whereby complex
     ideas and information structures can be represented and manipulated,
     even without the use of a computer. </p>
 </abstract>
</profileDesc>
Content model
<content>
 <alternate minOccurs="1"
  maxOccurs="unbounded">

  <classRef key="model.pLike"/>
  <classRef key="model.listLike"/>
  <elementRef key="listBibl"/>
 </alternate>
</content>
Schema Declaration
element abstract
{
   att.global.attributes,
   ( model.pLike | model.listLike | listBibl )+
}

<actor>

<actor> contains the name of an actor appearing within a cast list. [7.1.4. Cast Lists]
Module drama
Attributes
sex specifies the sex of the actor.
Status Optional
Datatype 1–∞ occurrences of teidata.sex separated by whitespace
Note

Values for this attribute may be locally defined by a project, or may refer to an external standard.

gender specifies the gender of the actor.
Status Optional
Datatype 1–∞ occurrences of teidata.gender separated by whitespace
Note

Values for this attribute may be locally defined by a project, or they may refer to an external standard.

Member of
Contained by
drama: castItem
May contain
analysis: c pc s w
figures: figure formula
gaiji: g
header: idno
linking: anchor seg
tagdocs: code
textstructure: floatingText
transcr: fw subst supplied
verse: rhyme
character data
Note

This element should be used only to mark the name of the actor as given in the source. Chapter 14. Names, Dates, People, and Places discusses ways of marking the components of names, and also of associating names with biographical information about a person.

Example
<castItem>
 <role>Mathias</role>
 <roleDesc>the Burgomaster</roleDesc>
 <actor ref="https://en.wikipedia.org/wiki/Henry_Irving">Mr. Henry Irving</actor>
</castItem>
Content model
<content>
 <macroRef key="macro.phraseSeq"/>
</content>
Schema Declaration
element actor
{
   att.global.attributes,
   att.canonical.attributes,
   attribute sex { list { teidata.sex+ } }?,
   attribute gender { list { teidata.gender+ } }?,
   macro.phraseSeq
}
Processing Model
<model behaviour="inline"/>

<add>

<add> (addition) contains letters, words, or phrases inserted in the source text by an author, scribe, or a previous annotator or corrector. [3.5.3. Additions, Deletions, and Omissions]
Module core
Attributes
Member of
Contained by
May contain
Note

In a diplomatic edition attempting to represent an original source, the add element should not be used for additions to the current TEI electronic edition made by editors or encoders. In these cases, either the corr or the supplied element is recommended.

In a TEI edition of a historical text with previous editorial emendations in which such additions or reconstructions are considered part of the source text, the use of add may be appropriate, dependent on the editorial philosophy of the project.

Example
The story I am
going to relate is true as to its main facts, and as to the
consequences <add place="above">of these facts</add> from which
this tale takes its title.
Content model
<content>
 <macroRef key="macro.paraContent"/>
</content>
Schema Declaration
element add
{
   att.global.attributes,
   att.transcriptional.attributes,
   att.placement.attributes,
   att.typed.attributes,
   att.dimensions.attributes,
   att.cmc.attributes,
   macro.paraContent
}
Processing Model
<model behaviour="inline">
<outputRendition>color: green; text-decoration: underline;</outputRendition>
</model>

<address>

<address> (address) contains a postal address, for example of a publisher, an organization, or an individual. [3.6.2. Addresses 2.2.4. Publication, Distribution, Licensing, etc. 3.12.2.4. Imprint, Size of a Document, and Reprint Information]
Module core
Attributes
Member of
Contained by
May contain
figures: figure
header: idno
linking: anchor
transcr: fw
Note

This element should be used for postal addresses only. Within it, the generic element addrLine may be used as an alternative to any of the more specialized elements available from the model.addrPart class, such as <street>, <postCode> etc.

Example

Using just the elements defined by the core module, an address could be represented as follows:

<address>
 <street>via Marsala 24</street>
 <postCode>40126</postCode>
 <name>Bologna</name>
 <name>Italy</name>
</address>
Example

When a schema includes the names and dates module, more specific elements such as country or settlement are preferable to the generic name:

<address>
 <street>via Marsala 24</street>
 <postCode>40126</postCode>
 <settlement>Bologna</settlement>
 <country>Italy</country>
</address>
Example
<address>
 <addrLine>Computing Center, MC 135</addrLine>
 <addrLine>P.O. Box 6998</addrLine>
 <addrLine>Chicago, IL 60680</addrLine>
 <addrLine>USA</addrLine>
</address>
Example
<address>
 <country key="FR"/>
 <settlement type="city">Lyon</settlement>
 <postCode>69002</postCode>
 <district type="arrondissement">IIème</district>
 <district type="quartier">Perrache</district>
 <street>
  <num>30</num>, Cours de Verdun</street>
</address>
Content model
<content>
 <sequence minOccurs="1" maxOccurs="1">
  <classRef key="model.global"
   minOccurs="0" maxOccurs="unbounded"/>
  <sequence minOccurs="1"
   maxOccurs="unbounded">
   <classRef key="model.addrPart"/>
   <classRef key="model.global"
    minOccurs="0" maxOccurs="unbounded"/>
  </sequence>
 </sequence>
</content>
Schema Declaration
element address
{
   att.global.attributes,
   att.cmc.attributes,
   ( model.global*, ( ( model.addrPart, model.global* )+ ) )
}
Processing Model
<model behaviour="block">
<outputRendition>margin-top: 2em; margin-left: 2em; margin-right: 2em;
margin-bottom: 2em;</outputRendition>
</model>

<addrLine>

<addrLine> (address line) contains one line of a postal address. [3.6.2. Addresses 2.2.4. Publication, Distribution, Licensing, etc. 3.12.2.4. Imprint, Size of a Document, and Reprint Information]
Module core
Attributes
Member of
Contained by
core: address
May contain
analysis: c pc s w
figures: figure formula
gaiji: g
header: idno
linking: anchor seg
tagdocs: code
textstructure: floatingText
transcr: fw subst supplied
verse: rhyme
character data
Note

Addresses may be encoded either as a sequence of lines, or using any sequence of component elements from the model.addrPart class. Other non-postal forms of address, such as telephone numbers or email, should not be included within an address element directly but may be wrapped within an addrLine if they form part of the printed address in some source text.

Example
<address>
 <addrLine>Computing Center, MC 135</addrLine>
 <addrLine>P.O. Box 6998</addrLine>
 <addrLine>Chicago, IL</addrLine>
 <addrLine>60680 USA</addrLine>
</address>
Example
<addrLine>
 <ref target="tel:+1-201-555-0123">(201) 555 0123</ref>
</addrLine>
Content model
<content>
 <macroRef key="macro.phraseSeq"/>
</content>
Schema Declaration
element addrLine { att.global.attributes, macro.phraseSeq }
Processing Model
<model behaviour="block">
<outputRendition>white-space: nowrap;</outputRendition>
</model>

<anchor>

<anchor> (anchor point) attaches an identifier to a point within a text, whether or not it corresponds with a textual element. [8.4.2. Synchronization and Overlap 17.5. Correspondence and Alignment]
Module linking
Attributes
Member of
Contained by
May contain Empty element
Note

On this element, the global xml:id attribute must be supplied to specify an identifier for the point at which this element occurs within a document. The value used may be chosen freely provided that it is unique within the document and is a syntactically valid name. There is no requirement for values containing numbers to be in sequence.

Example
<s>The anchor is he<anchor xml:id="A234"/>re somewhere.</s>
<s>Help me find it.<ptr target="#A234"/>
</s>
Content model
<content>
 <empty/>
</content>
Schema Declaration
element anchor
{
   att.global.attributes,
   att.typed.attributes,
   att.cmc.attributes,
   empty
}
Processing Model
<model behaviour="anchor">
<param name="id" value="@xml:id"/>
</model>

<argument>

<argument> (argument) contains a formal list or prose description of the topics addressed by a subdivision of a text. [4.2. Elements Common to All Divisions 4.6. Title Pages]
Module textstructure
Attributes
Member of
Contained by
core: lg list
drama: castList
figures: figure table
May contain
drama: castList
figures: figure table
header: biblFull
linking: ab anchor
namesdates: listPerson listPlace
textstructure: floatingText
transcr: fw
Example
<argument>
 <l>With ſighs and tears her love he doth deſire,</l>
 <l>Since Cupid hath his ſenſes ſet on fire;</l>
 <l>His torment and his pain to her he ſhews,</l>
 <l>With all his proteſtations and his vows:</l>
 <l>At laſt ſhe yields to grant him ſome relief,</l>
 <l>And make him joyful after all his grief.</l>
</argument>
Content model
<content>
 <sequence minOccurs="1" maxOccurs="1">
  <alternate minOccurs="0"
   maxOccurs="unbounded">
   <classRef key="model.global"/>
   <classRef key="model.headLike"/>
  </alternate>
  <sequence minOccurs="1"
   maxOccurs="unbounded">
   <classRef key="model.common"/>
   <classRef key="model.global"
    minOccurs="0" maxOccurs="unbounded"/>
  </sequence>
 </sequence>
</content>
Schema Declaration
element argument
{
   att.global.attributes,
   att.cmc.attributes,
   ( ( model.global | model.headLike )*, ( ( model.common, model.global* )+ ) )
}
Processing Model
<model behaviour="block">
<outputRendition>margin-bottom: 0.5em;</outputRendition>
</model>

<author>

<author> (author) in a bibliographic reference, contains the name(s) of an author, personal or corporate, of a work; for example in the same form as that provided by a recognized bibliographic name authority. [3.12.2.2. Titles, Authors, and Editors 2.2.1. The Title Statement]
Module core
Attributes
calendar indicates one or more systems or calendars to which the date represented by the content of this element belongs.
Deprecated will be removed on 2024-11-11
Status Optional
Datatype 1–∞ occurrences of teidata.pointer separated by whitespace
Schematron

<sch:rule context="tei:*[@calendar]">
<sch:assert test="string-length( normalize-space(.) ) gt 0"> @calendar indicates one or more
systems or calendars to which the date represented by the content of this element belongs,
but this <sch:name/> element has no textual content.</sch:assert>
</sch:rule>
Member of
Contained by
May contain
analysis: c pc s w
figures: figure formula
gaiji: g
header: idno
linking: anchor seg
tagdocs: code
textstructure: floatingText
transcr: fw subst supplied
verse: rhyme
character data
Note

Particularly where cataloguing is likely to be based on the content of the header, it is advisable to use a generally recognized name authority file to supply the content for this element. The attributes key or ref may also be used to reference canonical information about the author(s) intended from any appropriate authority, such as a library catalogue or online resource.

In the case of a broadcast, use this element for the name of the company or network responsible for making the broadcast.

Where an author is unknown or unspecified, this element may contain text such as Unknown or Anonymous. When the appropriate TEI modules are in use, it may also contain detailed tagging of the names used for people, organizations or places, in particular where multiple names are given.

Example
<author>British Broadcasting Corporation</author>
<author>La Fayette, Marie Madeleine Pioche de la Vergne, comtesse de (1634–1693)</author>
<author>Anonymous</author>
<author>Bill and Melinda Gates Foundation</author>
<author>
 <persName>Beaumont, Francis</persName> and
<persName>John Fletcher</persName>
</author>
<author>
 <orgName key="BBC">British Broadcasting
   Corporation</orgName>: Radio 3 Network
</author>
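The note above mentions that key or ref may be used to point at canonical information about an author. The examples show key only, so here is a minimal sketch of the ref form. The URI is a placeholder invented for illustration and does not identify a real authority record.

```xml
<!-- Hypothetical authority URI: substitute a real record from the name authority you use -->
<author ref="http://example.org/authority/congreve-william">Congreve, William, 1670-1729.</author>
```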
Content model
<content>
 <macroRef key="macro.phraseSeq"/>
</content>
Schema Declaration
element author
{
   att.global.attributes,
   att.naming.attributes,
   att.datable.attributes,
   attribute calendar { list { teidata.pointer+ } }?,
   macro.phraseSeq
}
Processing Model
<model predicate="ancestor::teiHeader"
 behaviour="omit"/>

<model behaviour="inline"/>

<availability>

<availability> (availability) supplies information about the availability of a text, for example any restrictions on its use or distribution, its copyright status, any licence applying to it, etc. [2.2.4. Publication, Distribution, Licensing, etc.]
Module header
Attributes
status (status) supplies a code identifying the current availability of the text.
Status Optional
Datatype teidata.enumerated
Legal values are:
free
(free) the text is freely available.
unknown
(unknown) the status of the text is unknown.
restricted
(restricted) the text is not freely available.
Member of
Contained by
core: bibl
May contain
core: p
header: licence
linking: ab
Note

A consistent format should be adopted.

Example
<availability status="restricted">
 <p>Available for academic research purposes only.</p>
</availability>
<availability status="free">
 <p>In the public domain</p>
</availability>
<availability status="restricted">
 <p>Available under licence from the publishers.</p>
</availability>
Example
<availability>
 <licence target="http://opensource.org/licenses/MIT">
  <p>The MIT License
     applies to this document.</p>
  <p>Copyright (C) 2011 by The University of Victoria</p>
  <p>Permission is hereby granted, free of charge, to any person obtaining a copy
     of this software and associated documentation files (the "Software"), to deal
     in the Software without restriction, including without limitation the rights
     to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
     copies of the Software, and to permit persons to whom the Software is
     furnished to do so, subject to the following conditions:</p>
  <p>The above copyright notice and this permission notice shall be included in
     all copies or substantial portions of the Software.</p>
  <p>THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
     IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
     FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
     AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
     LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
     OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
     THE SOFTWARE.</p>
 </licence>
</availability>
Content model
<content>
 <alternate minOccurs="1"
  maxOccurs="unbounded">

  <classRef key="model.availabilityPart"/>
  <classRef key="model.pLike"/>
 </alternate>
</content>
Schema Declaration
element availability
{
   att.global.attributes,
   attribute status { "free" | "unknown" | "restricted" }?,
   ( model.availabilityPart | model.pLike )+
}

<back>

<back> (back matter) contains any appendixes, etc. following the main part of a text. [4.7. Back Matter 4. Default Text Structure]
Module textstructure
Attributes
Contained by
textstructure: floatingText text
transcr: facsimile
May contain
Note

Because cultural conventions differ as to which elements are grouped as back matter and which as front matter, the content models for the back and front elements are identical.

Example
<back>
 <div type="appendix">
  <head>The Golden Dream or, the Ingenuous Confession</head>
  <p>TO shew the Depravity of human Nature, and how apt the Mind is to be misled by Trinkets
     and false Appearances, Mrs. Two-Shoes does acknowledge, that after she became rich, she
     had like to have been, too fond of Money
<!-- .... -->
  </p>
 </div>
<!-- ... -->
 <div type="epistle">
  <head>A letter from the Printer, which he desires may be inserted</head>
  <salute>Sir.</salute>
  <p>I have done with your Copy, so you may return it to the Vatican, if you please;
  
<!-- ... -->
  </p>
 </div>
 <div type="advert">
  <head>The Books usually read by the Scholars of Mrs Two-Shoes are these and are sold at Mr
     Newbery's at the Bible and Sun in St Paul's Church-yard.</head>
  <list>
   <item n="1">The Christmas Box, Price 1d.</item>
   <item n="2">The History of Giles Gingerbread, 1d.</item>
<!-- ... -->
   <item n="42">A Curious Collection of Travels, selected from the Writers of all Nations,
       10 Vol, Pr. bound 1l.</item>
  </list>
 </div>
 <div type="advert">
  <head>By the KING's Royal Patent, Are sold by J. NEWBERY, at the Bible and Sun in St.
     Paul's Church-Yard.</head>
  <list>
   <item n="1">Dr. James's Powders for Fevers, the Small-Pox, Measles, Colds, &amp;c. 2s.
       6d</item>
   <item n="2">Dr. Hooper's Female Pills, 1s.</item>
<!-- ... -->
  </list>
 </div>
</back>
Content model
<content>
 <sequence minOccurs="1" maxOccurs="1">
  <alternate minOccurs="0"
   maxOccurs="unbounded">

   <classRef key="model.frontPart"/>
   <classRef key="model.pLike.front"/>
   <classRef key="model.pLike"/>
   <classRef key="model.listLike"/>
   <classRef key="model.global"/>
  </alternate>
  <alternate minOccurs="0" maxOccurs="1">
   <sequence minOccurs="1" maxOccurs="1">
    <classRef key="model.div1Like"/>
    <alternate minOccurs="0"
     maxOccurs="unbounded">

     <classRef key="model.frontPart"/>
     <classRef key="model.div1Like"/>
     <classRef key="model.global"/>
    </alternate>
   </sequence>
   <sequence minOccurs="1" maxOccurs="1">
    <classRef key="model.divLike"/>
    <alternate minOccurs="0"
     maxOccurs="unbounded">

     <classRef key="model.frontPart"/>
     <classRef key="model.divLike"/>
     <classRef key="model.global"/>
    </alternate>
   </sequence>
  </alternate>
  <sequence minOccurs="0" maxOccurs="1">
   <classRef key="model.divBottomPart"/>
   <alternate minOccurs="0"
    maxOccurs="unbounded">

    <classRef key="model.divBottomPart"/>
    <classRef key="model.global"/>
   </alternate>
  </sequence>
 </sequence>
</content>
Schema Declaration
element back
{
   att.global.attributes,
   (
      (
         model.frontPart
       | model.pLike.front
       | model.pLike
       | model.listLike
       | model.global
      )*,
      (
         (
            model.div1Like,
            ( model.frontPart | model.div1Like | model.global )*
         )
       | ( model.divLike, ( model.frontPart | model.divLike | model.global )* )
      )?,
      ( ( model.divBottomPart, ( model.divBottomPart | model.global )* )? )
   )
}
Processing Model
<model behaviour="block"/>

<bibl>

<bibl> (bibliographic citation) contains a loosely-structured bibliographic citation of which the sub-components may or may not be explicitly tagged. [3.12.1. Methods of Encoding Bibliographic References and Lists of References 2.2.7. The Source Description 16.3.2. Declarable Elements]
Module core
Attributes
Member of
Contained by
May contain
Note

Contains phrase-level elements, together with any combination of elements from the model.biblPart class

Example
<epigraph>
 <bibl>Deut. Chap. 5.</bibl>
 <q>11 Thou ſhalt not take the name of the Lord thy God in vaine, for the Lord
   will not hold him guiltleſſe which ſhall take his name in vaine.</q>
</epigraph>
Schematron

<sch:rule context="tei:bibl">
<sch:assert test="child::* or child::text()[normalize-space()]"
 role="ERROR">
Element "<sch:name/>" may not be empty.
</sch:assert>
</sch:rule>
Content model
<content>
 <alternate minOccurs="0"
  maxOccurs="unbounded">

  <textNode/>
  <classRef key="model.gLike"/>
  <classRef key="model.highlighted"/>
  <classRef key="model.pPart.data"/>
  <classRef key="model.pPart.edit"/>
  <classRef key="model.segLike"/>
  <classRef key="model.ptrLike"/>
  <classRef key="model.biblPart"/>
  <classRef key="model.global"/>
 </alternate>
</content>
Schema Declaration
element bibl
{
   att.global.attributes,
   att.typed.attributes,
   att.sortable.attributes,
   att.docStatus.attributes,
   att.cmc.attributes,
   (
      text
    | model.gLike
    | model.highlighted
    | model.pPart.data
    | model.pPart.edit
    | model.segLike
    | model.ptrLike
    | model.biblPart
    | model.global
   )*
}
Processing Model
<model predicate="parent::listBibl"
 behaviour="listItem"/>

<model behaviour="inline"/>

<biblFull>

<biblFull> (fully-structured bibliographic citation) contains a fully-structured bibliographic citation, in which all components of the TEI file description are present. [3.12.1. Methods of Encoding Bibliographic References and Lists of References 2.2. The File Description 2.2.7. The Source Description 16.3.2. Declarable Elements]
Module header
Attributes
Member of
Contained by
May contain
Example
<sourceDesc>
 <biblFull>
  <titleStmt>
   <title>Buxom Joan of Lymas's love to a jolly sailer: or, The maiden's
       choice: being love for love again. To an excellent new play-house
       tune.</title>
   <author>Congreve, William, 1670-1729.</author>
  </titleStmt>
  <extent>1 sheet ([1] p.) : music. </extent>
  <publicationStmt>
   <publisher>printed for P[hilip]. Brooksby, at the Golden-ball, in
       Pye-corner.,</publisher>
   <pubPlace>London: :</pubPlace>
   <date>[between 1693-1695]</date>
  </publicationStmt>
  <notesStmt>
   <note>Attributed to William Congreve by Wing.</note>
   <note>Date of publication and publisher's name from Wing.</note>
   <note>Verse: "A soldier and a sailer ..."</note>
   <note>Printed in two columns.</note>
   <note>Reproduction of original in the British Library.</note>
  </notesStmt>
 </biblFull>
</sourceDesc>
Content model
<content>
 <alternate minOccurs="1" maxOccurs="1">
  <sequence minOccurs="1" maxOccurs="1">
   <sequence minOccurs="1" maxOccurs="1">
    <elementRef key="titleStmt"/>
    <elementRef key="editionStmt"
     minOccurs="0"/>

    <elementRef key="extent" minOccurs="0"/>
    <elementRef key="publicationStmt"/>
    <elementRef key="seriesStmt"
     minOccurs="0" maxOccurs="unbounded"/>

    <elementRef key="notesStmt"
     minOccurs="0"/>

   </sequence>
   <elementRef key="sourceDesc"
    minOccurs="0" maxOccurs="unbounded"/>

  </sequence>
  <sequence minOccurs="1" maxOccurs="1">
   <elementRef key="fileDesc"/>
   <elementRef key="profileDesc"/>
  </sequence>
 </alternate>
</content>
Schema Declaration
element biblFull
{
   att.global.attributes,
   att.sortable.attributes,
   att.docStatus.attributes,
   att.cmc.attributes,
   (
      (
         (
            titleStmt,
            editionStmt?,
            extent?,
            publicationStmt,
            seriesStmt*,
            notesStmt?
         ),
         sourceDesc*
      )
    | ( fileDesc, profileDesc )
   )
}

<biblScope>

<biblScope> (scope of bibliographic reference) defines the scope of a bibliographic reference, for example as a list of page numbers, or a named subdivision of a larger work. [3.12.2.5. Scopes and Ranges in Bibliographic Citations]
Module core
Attributes
Member of
Contained by
core: bibl
header: seriesStmt
May contain
analysis: c pc s w
figures: figure formula
gaiji: g
header: idno
linking: anchor seg
tagdocs: code
textstructure: floatingText
transcr: fw subst supplied
verse: rhyme
character data
Note

When a single page is being cited, use the from and to attributes with an identical value. When no clear endpoint is provided, the from attribute may be used without to; for example a citation such as ‘p. 3ff’ might be encoded <biblScope from="3">p. 3ff</biblScope>.

It is now considered good practice to supply this element as a sibling (rather than a child) of <imprint>, since it supplies information which does not constitute part of the imprint.

Example
<biblScope>pp 12–34</biblScope>
<biblScope unit="page" from="12" to="34"/>
<biblScope unit="volume">II</biblScope>
<biblScope unit="page">12</biblScope>
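The note above asks for identical from and to values when a single page is cited, a case the examples do not show. A minimal sketch, assuming the cited page is page 12:

```xml
<!-- Single page: from and to carry the same value, as the note recommends -->
<biblScope unit="page" from="12" to="12">p. 12</biblScope>
```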
Content model
<content>
 <macroRef key="macro.phraseSeq"/>
</content>
Schema Declaration
element biblScope
{
   att.global.attributes,
   att.citing.attributes,
   macro.phraseSeq
}
Processing Model
<model behaviour="inline"/>

<body>

<body> (text body) contains the whole body of a single unitary text, excluding any front or back matter. [4. Default Text Structure]
Module textstructure
Attributes
Contained by
textstructure: floatingText text
May contain
Example
<body>
 <l>Nu scylun hergan hefaenricaes uard</l>
 <l>metudæs maecti end his modgidanc</l>
 <l>uerc uuldurfadur sue he uundra gihuaes</l>
 <l>eci dryctin or astelidæ</l>
 <l>he aerist scop aelda barnum</l>
 <l>heben til hrofe haleg scepen.</l>
 <l>tha middungeard moncynnæs uard</l>
 <l>eci dryctin æfter tiadæ</l>
 <l>firum foldu frea allmectig</l>
 <trailer>primo cantauit Cædmon istud carmen.</trailer>
</body>
Content model
<content>
 <sequence minOccurs="1" maxOccurs="1">
  <classRef key="model.global"
   minOccurs="0" maxOccurs="unbounded"/>

  <sequence minOccurs="0" maxOccurs="1">
   <classRef key="model.divTop"/>
   <alternate minOccurs="0"
    maxOccurs="unbounded">

    <classRef key="model.global"/>
    <classRef key="model.divTop"/>
   </alternate>
  </sequence>
  <sequence minOccurs="0" maxOccurs="1">
   <classRef key="model.divGenLike"/>
   <alternate minOccurs="0"
    maxOccurs="unbounded">

    <classRef key="model.global"/>
    <classRef key="model.divGenLike"/>
   </alternate>
  </sequence>
  <alternate minOccurs="1" maxOccurs="1">
   <sequence minOccurs="1"
    maxOccurs="unbounded">

    <classRef key="model.divLike"/>
    <alternate minOccurs="0"
     maxOccurs="unbounded">

     <classRef key="model.global"/>
     <classRef key="model.divGenLike"/>
    </alternate>
   </sequence>
   <sequence minOccurs="1"
    maxOccurs="unbounded">

    <classRef key="model.div1Like"/>
    <alternate minOccurs="0"
     maxOccurs="unbounded">

     <classRef key="model.global"/>
     <classRef key="model.divGenLike"/>
    </alternate>
   </sequence>
   <sequence minOccurs="1" maxOccurs="1">
    <sequence minOccurs="1"
     maxOccurs="unbounded">

     <alternate minOccurs="1" maxOccurs="1">
      <elementRef key="schemaSpec"/>
      <classRef key="model.common"/>
     </alternate>
     <classRef key="model.global"
      minOccurs="0" maxOccurs="unbounded"/>

    </sequence>
    <alternate minOccurs="0" maxOccurs="1">
     <sequence minOccurs="1"
      maxOccurs="unbounded">

      <classRef key="model.divLike"/>
      <alternate minOccurs="0"
       maxOccurs="unbounded">

       <classRef key="model.global"/>
       <classRef key="model.divGenLike"/>
      </alternate>
     </sequence>
     <sequence minOccurs="1"
      maxOccurs="unbounded">

      <classRef key="model.div1Like"/>
      <alternate minOccurs="0"
       maxOccurs="unbounded">

       <classRef key="model.global"/>
       <classRef key="model.divGenLike"/>
      </alternate>
     </sequence>
    </alternate>
   </sequence>
  </alternate>
  <sequence minOccurs="0"
   maxOccurs="unbounded">

   <classRef key="model.divBottom"/>
   <classRef key="model.global"
    minOccurs="0" maxOccurs="unbounded"/>

  </sequence>
 </sequence>
</content>
Schema Declaration
element body
{
   att.global.attributes,
   (
      model.global*,
      ( ( model.divTop, ( model.global | model.divTop )* )? ),
      ( ( model.divGenLike, ( model.global | model.divGenLike )* )? ),
      (
         ( ( model.divLike, ( model.global | model.divGenLike )* )+ )
       | ( ( model.div1Like, ( model.global | model.divGenLike )* )+ )
       | (
            ( ( ( schemaSpec | model.common ), model.global* )+ ),
            (
               ( ( model.divLike, ( model.global | model.divGenLike )* )+ )
             | ( ( model.div1Like, ( model.global | model.divGenLike )* )+ )
            )?
         )
      ),
      ( ( model.divBottom, model.global* )* )
   )
}
Processing Model
<modelSequence>
<model behaviour="index">
 <param name="type" value="'toc'"/>
</model>
<model behaviour="block"/>
</modelSequence>

<byline>

<byline> (byline) contains the primary statement of responsibility given for a work on its title page or at the head or end of the work. [4.2.2. Openers and Closers 4.5. Front Matter]
Module textstructure
Attributes
Member of
Contained by
core: lg list
drama: castList
figures: figure table
May contain
analysis: c pc s w
figures: figure formula
gaiji: g
header: idno
linking: anchor seg
tagdocs: code
textstructure: docAuthor
transcr: fw subst supplied
verse: rhyme
character data
Note

The byline on a title page may include either the name or a description for the document's author. Where the name is included, it may optionally be tagged using the docAuthor element.

Example
<byline>Written by a CITIZEN who continued all the
while in London. Never made publick before.</byline>
Example
<byline>Written from her own MEMORANDUMS</byline>
Example
<byline>By George Jones, Political Editor, in Washington</byline>
Example
<byline>BY
<docAuthor>THOMAS PHILIPOTT,</docAuthor>
Master of Arts,
(Somtimes)
Of Clare-Hall in Cambridge.</byline>
Content model
<content>
 <alternate minOccurs="0"
  maxOccurs="unbounded">

  <textNode/>
  <classRef key="model.gLike"/>
  <classRef key="model.phrase"/>
  <elementRef key="docAuthor"/>
  <classRef key="model.global"/>
 </alternate>
</content>
Schema Declaration
element byline
{
   att.global.attributes,
   att.cmc.attributes,
   ( text | model.gLike | model.phrase | docAuthor | model.global )*
}
Processing Model
<model behaviour="block"/>

<c>

<c> (character) represents a character. [18.1. Linguistic Segment Categories]
Module analysis
Attributes
Member of
Contained by
May contain
gaiji: g
character data
Note

Contains a single character, a g element, or a sequence of graphemes to be treated as a single character. The type attribute is used to indicate the function of this segmentation, taking values such as letter, punctuation, or digit etc.

Example
<phr>
 <c>M</c>
 <c>O</c>
 <c>A</c>
 <c>I</c>
 <w>doth</w>
 <w>sway</w>
 <w>my</w>
 <w>life</w>
</phr>
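The note above says that the type attribute records the function of each segment, but the example leaves it out. A minimal sketch using the three values the note names (letter, punctuation, digit); the characters themselves are invented for illustration:

```xml
<phr>
 <!-- type classifies each segmented character -->
 <c type="letter">M</c>
 <c type="punctuation">.</c>
 <c type="digit">4</c>
</phr>
```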
Content model
<content>
 <macroRef key="macro.xtext"/>
</content>
Schema Declaration
element c
{
   att.global.attributes,
   att.segLike.attributes,
   att.typed.attributes,
   att.notated.attributes,
   att.cmc.attributes,
   macro.xtext
}
Processing Model
<model behaviour="inline"/>

<castGroup>

<castGroup> (cast list grouping) groups one or more individual castItem elements within a cast list. [7.1.4. Cast Lists]
Module drama
Attributes
Contained by
May contain
figures: figure
linking: anchor
textstructure: trailer
transcr: fw
Note

The rend attribute may be used, as here, to indicate whether the grouping is indicated by a brace, whitespace, font change, etc.

Note that in this example the role description ‘friends of Mathias’ is understood to apply to both roles equally.

Example
<castGroup rend="braced">
 <castItem>
  <role>Walter</role>
  <actor>Mr Frank Hall</actor>
 </castItem>
 <castItem>
  <role>Hans</role>
  <actor>Mr F.W. Irish</actor>
 </castItem>
 <roleDesc>friends of Mathias</roleDesc>
</castGroup>
Content model
<content>
 <sequence minOccurs="1" maxOccurs="1">
  <alternate minOccurs="0"
   maxOccurs="unbounded">

   <classRef key="model.global"/>
   <classRef key="model.headLike"/>
  </alternate>
  <sequence minOccurs="1"
   maxOccurs="unbounded">

   <alternate minOccurs="1" maxOccurs="1">
    <elementRef key="castItem"/>
    <elementRef key="castGroup"/>
    <elementRef key="roleDesc"/>
   </alternate>
   <classRef key="model.global"
    minOccurs="0" maxOccurs="unbounded"/>

  </sequence>
  <sequence minOccurs="0" maxOccurs="1">
   <elementRef key="trailer"/>
   <classRef key="model.global"
    minOccurs="0" maxOccurs="unbounded"/>

  </sequence>
 </sequence>
</content>
Schema Declaration
element castGroup
{
   att.global.attributes,
   (
      ( model.global | model.headLike )*,
      ( ( ( castItem | castGroup | roleDesc ), model.global* )+ ),
      ( ( trailer, model.global* )? )
   )
}
Processing Model
<model predicate="child::*" behaviour="list">
<desc>Insert list. </desc>
</model>

<castItem>

<castItem> (cast list item) contains a single entry within a cast list, describing either a single role or a list of non-speaking roles. [7.1.4. Cast Lists]
Module drama
Attributes
type characterizes the cast item.
Derived from att.typed
Status Optional
Datatype teidata.enumerated
Default role
Contained by
May contain
Example
<castItem>
 <role>Player</role>
 <actor>Mr Milward</actor>
</castItem>
Example
<castItem type="list">Constables, Drawer, Turnkey, etc.</castItem>
Content model
<content>
 <alternate minOccurs="0"
  maxOccurs="unbounded">

  <textNode/>
  <classRef key="model.gLike"/>
  <classRef key="model.castItemPart"/>
  <classRef key="model.phrase"/>
  <classRef key="model.global"/>
 </alternate>
</content>
Schema Declaration
element castItem
{
   att.global.attributes,
   att.typed.attribute.subtype,
   attribute type { teidata.enumerated }?,
   ( text | model.gLike | model.castItemPart | model.phrase | model.global )*
}
Processing Model
<model behaviour="listItem">
<desc>Insert item, rendered as described in parent list rendition. </desc>
</model>

<castList>

<castList> (cast list) contains a single cast list or dramatis personae. [7.1.4. Cast Lists 7.1. Front and Back Matter]
Module drama
Attributes
Member of
Contained by
May contain
Example
<castList>
 <castGroup>
  <head rend="braced">Mendicants</head>
  <castItem>
   <role>Aafaa</role>
   <actor>Femi Johnson</actor>
  </castItem>
  <castItem>
   <role>Blindman</role>
   <actor>Femi Osofisan</actor>
  </castItem>
  <castItem>
   <role>Goyi</role>
   <actor>Wale Ogunyemi</actor>
  </castItem>
  <castItem>
   <role>Cripple</role>
   <actor>Tunji Oyelana</actor>
  </castItem>
 </castGroup>
 <castItem>
  <role>Si Bero</role>
  <roleDesc>Sister to Dr Bero</roleDesc>
  <actor>Deolo Adedoyin</actor>
 </castItem>
 <castGroup>
  <head rend="braced">Two old women</head>
  <castItem>
   <role>Iya Agba</role>
   <actor>Nguba Agolia</actor>
  </castItem>
  <castItem>
   <role>Iya Mate</role>
   <actor>Bopo George</actor>
  </castItem>
 </castGroup>
 <castItem>
  <role>Dr Bero</role>
  <roleDesc>Specialist</roleDesc>
  <actor>Nat Okoro</actor>
 </castItem>
 <castItem>
  <role>Priest</role>
  <actor>Gbenga Sonuga</actor>
 </castItem>
 <castItem>
  <role>The old man</role>
  <roleDesc>Bero's father</roleDesc>
  <actor>Dapo Adelugba</actor>
 </castItem>
</castList>
<stage type="mix">The action takes place in and around the home surgery of
Dr Bero, lately returned from the wars.</stage>
Content model
<content>
 <sequence minOccurs="1" maxOccurs="1">
  <alternate minOccurs="0"
   maxOccurs="unbounded">

   <classRef key="model.divTop"/>
   <classRef key="model.global"/>
  </alternate>
  <sequence minOccurs="0"
   maxOccurs="unbounded">

   <classRef key="model.common"/>
   <classRef key="model.global"
    minOccurs="0" maxOccurs="unbounded"/>

  </sequence>
  <sequence minOccurs="1"
   maxOccurs="unbounded">

   <alternate minOccurs="1" maxOccurs="1">
    <elementRef key="castItem"/>
    <elementRef key="castGroup"/>
   </alternate>
   <classRef key="model.global"
    minOccurs="0" maxOccurs="unbounded"/>

  </sequence>
  <sequence minOccurs="0"
   maxOccurs="unbounded">

   <classRef key="model.common"/>
   <classRef key="model.global"
    minOccurs="0" maxOccurs="unbounded"/>

  </sequence>
 </sequence>
</content>
Schema Declaration
element castList
{
   att.global.attributes,
   (
      ( model.divTop | model.global )*,
      ( ( model.common, model.global* )* ),
      ( ( ( castItem | castGroup ), model.global* )+ ),
      ( ( model.common, model.global* )* )
   )
}
Processing Model
<model predicate="child::*" behaviour="list"
 useSourceRendition="true">

<outputRendition>list-style: ordered;</outputRendition>
</model>

<catDesc>

<catDesc> (category description) describes some category within a taxonomy or text typology, either in the form of a brief prose description or in terms of the situational parameters used by the TEI formal <textDesc>. [2.3.7. The Classification Declaration]
Module header
Attributes
Contained by
header: category
May contain
header: idno
tagdocs: code
transcr: subst
character data
Example
<catDesc>Prose reportage</catDesc>
Example
<catDesc>
 <textDesc n="novel">
  <channel mode="w">print; part issues</channel>
  <constitution type="single"/>
  <derivation type="original"/>
  <domain type="art"/>
  <factuality type="fiction"/>
  <interaction type="none"/>
  <preparedness type="prepared"/>
  <purpose type="entertain" degree="high"/>
  <purpose type="inform" degree="medium"/>
 </textDesc>
</catDesc>
Content model
<content>
 <alternate minOccurs="0"
  maxOccurs="unbounded">

  <textNode/>
  <classRef key="model.limitedPhrase"/>
  <classRef key="model.catDescPart"/>
 </alternate>
</content>
Schema Declaration
element catDesc
{
   att.global.attributes,
   att.canonical.attributes,
   ( text | model.limitedPhrase | model.catDescPart )*
}

<category>

<category> (category) contains an individual descriptive category, possibly nested within a superordinate category, within a user-defined taxonomy. [2.3.7. The Classification Declaration]
Module header
Attributes
Contained by
May contain
core: desc
Example
<category xml:id="b1">
 <catDesc>Prose reportage</catDesc>
</category>
Example
<category xml:id="b2">
 <catDesc>Prose </catDesc>
 <category xml:id="b11">
  <catDesc>journalism</catDesc>
 </category>
 <category xml:id="b12">
  <catDesc>fiction</catDesc>
 </category>
</category>
Example
<category xml:id="LIT">
 <catDesc xml:lang="pl">literatura piękna</catDesc>
 <catDesc xml:lang="en">fiction</catDesc>
 <category xml:id="LPROSE">
  <catDesc xml:lang="pl">proza</catDesc>
  <catDesc xml:lang="en">prose</catDesc>
 </category>
 <category xml:id="LPOETRY">
  <catDesc xml:lang="pl">poezja</catDesc>
  <catDesc xml:lang="en">poetry</catDesc>
 </category>
 <category xml:id="LDRAMA">
  <catDesc xml:lang="pl">dramat</catDesc>
  <catDesc xml:lang="en">drama</catDesc>
 </category>
</category>
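The content model for this element also allows members of model.descLike (such as desc) in place of catDesc, a form none of the examples above show. A minimal sketch under that reading; the identifiers and descriptions are invented for illustration:

```xml
<!-- Hypothetical identifiers; desc stands in for catDesc in this variant -->
<category xml:id="b3">
 <desc>Verse</desc>
 <category xml:id="b31">
  <desc>lyric</desc>
 </category>
</category>
```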
Content model
<content>
 <sequence>
  <alternate>
   <elementRef key="catDesc" minOccurs="1"
    maxOccurs="unbounded"/>

   <alternate minOccurs="0"
    maxOccurs="unbounded">

    <classRef key="model.descLike"/>
    <elementRef key="equiv"/>
    <elementRef key="gloss"/>
   </alternate>
  </alternate>
  <elementRef key="category" minOccurs="0"
   maxOccurs="unbounded"/>

 </sequence>
</content>
Schema Declaration
element category
{
   att.global.attributes,
   ( ( catDesc+ | ( model.descLike | equiv | gloss )* ), category* )
}

<catRef>

<catRef> (category reference) specifies one or more defined categories within some taxonomy or text typology. [2.4.3. The Text Classification]
Module header
Attributes
scheme identifies the classification scheme within which the set of categories concerned is defined, for example by a taxonomy element, or by some other resource.
Status Optional
Datatype teidata.pointer
Contained by
header: textClass
May contain Empty element
Note

The scheme attribute needs to be supplied only if more than one taxonomy has been declared.

Example
<catRef scheme="#myTopics"
 target="#news #prov #sales2"/>

<!-- elsewhere -->
<taxonomy xml:id="myTopics">
 <category xml:id="news">
  <catDesc>Newspapers</catDesc>
 </category>
 <category xml:id="prov">
  <catDesc>Provincial</catDesc>
 </category>
 <category xml:id="sales2">
  <catDesc>Low to average annual sales</catDesc>
 </category>
</taxonomy>
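As the note above states, scheme is needed only when more than one taxonomy has been declared. A minimal sketch of the shorter form, assuming the taxonomy in the preceding example is the only one in the header:

```xml
<!-- Only one taxonomy is declared, so scheme is left out -->
<catRef target="#news #prov #sales2"/>
```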
Content model
<content>
 <empty/>
</content>
Schema Declaration
element catRef
{
   att.global.attributes,
   att.pointing.attributes,
   attribute scheme { teidata.pointer }?,
   empty
}

<cb>

<cb> (column beginning) marks the beginning of a new column of a text on a multi-column page. [3.11.3. Milestone Elements]
Module core
Attributes
Member of
Contained by
May contain Empty element
Note

On this element, the global n attribute indicates the number or other value associated with the column which follows the point of insertion of this cb element. Encoders should adopt a clear and consistent policy as to whether the numbers associated with column breaks relate to the physical sequence number of the column in the whole text, or whether columns are numbered within the page. The cb element is placed at the head of the column to which it refers.

Example

Markup of an early English dictionary printed in two columns:

<pb/>
<cb n="1"/>
<entryFree>
 <form>Well</form>, <sense>a Pit to hold Spring-Water</sense>:
<sense>In the Art of <hi rend="italic">War</hi>, a Depth the Miner
   sinks into the Ground, to find out and disappoint the Enemies Mines,
   or to prepare one</sense>.
</entryFree>
<entryFree>To <form>Welter</form>, <sense>to wallow</sense>, or
<sense>lie groveling</sense>.</entryFree>
<!-- remainder of column -->
<cb n="2"/>
<entryFree>
 <form>Wey</form>, <sense>the greatest Measure for dry Things,
   containing five Chaldron</sense>.
</entryFree>
<entryFree>
 <form>Whale</form>, <sense>the greatest of
   Sea-Fishes</sense>.
</entryFree>
Content model
<content>
 <empty/>
</content>
Schema Declaration
element cb
{
   att.global.attributes,
   att.typed.attributes,
   att.edition.attributes,
   att.spanning.attributes,
   att.breaking.attributes,
   att.cmc.attributes,
   empty
}
Processing Model
<model behaviour="break">
<param name="type" value="'column'"/>
<param name="label" value="@n"/>
</model>

<cell>

<cell> (cell) contains one cell of a table. [15.1.1. TEI Tables]
Module figures
Attributes
role (role) indicates the kind of information held in this cell or in each cell of this row.
Derived from att.tableDecoration
Status Optional
Datatype teidata.enumerated
Legal values are:
data
data cell [Default]
label
label cell
sum
row or column sum data
total
table total data
Contained by
figures: row
May contain
Example
<row>
 <cell role="label">General conduct</cell>
 <cell role="data">Not satisfactory, on account of his great unpunctuality
   and inattention to duties</cell>
</row>
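The example above uses only the label and data values of role. A minimal sketch of the sum value, marking a cell that holds a column sum; the table content is invented for illustration:

```xml
<table>
 <row>
  <cell role="label">Paper</cell>
  <cell role="data">12</cell>
 </row>
 <row>
  <cell role="label">Ink</cell>
  <cell role="data">3</cell>
 </row>
 <row>
  <cell role="label">Sum</cell>
  <!-- sum marks the cell holding the sum of the column above it -->
  <cell role="sum">15</cell>
 </row>
</table>
```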
Content model
<content>
 <macroRef key="macro.specialPara"/>
</content>
Schema Declaration
element cell
{
   att.global.attributes,
   att.tableDecoration.attribute.rows,
   att.tableDecoration.attribute.cols,
   attribute role { "data" | "label" | "sum" | "total" }?,
   macro.specialPara
}
Processing Model
<model behaviour="cell">
<desc>Insert table cell. </desc>
</model>

<change>

<change> (change) documents a change or set of changes made during the production of a source document, or during the revision of an electronic file. [2.6. The Revision Description 2.4.1. Creation 12.7. Identifying Changes and Revisions]
Module header
Attributes
calendar indicates one or more systems or calendars to which the date represented by the content of this element belongs.
Deprecated will be removed on 2024-11-11
Status Optional
Datatype 1–∞ occurrences of teidata.pointer separated by whitespace
Schematron

<sch:rule context="tei:*[@calendar]">
<sch:assert test="string-length( normalize-space(.) ) gt 0"> @calendar indicates one or more
systems or calendars to which the date represented by the content of this element belongs,
but this <sch:name/> element has no textual content.</sch:assert>
</sch:rule>
target (target) points to one or more elements that belong to this change.
Status Optional
Datatype 1–∞ occurrences of teidata.pointer separated by whitespace
Contained by
May contain
Note

The who attribute may be used to point to any other element, but will typically specify a respStmt or person element elsewhere in the header, identifying the person responsible for the change and their role in making it.

It is recommended that changes be recorded with the most recent first. The status attribute may be used to indicate the status of a document following the change documented.

Example
<titleStmt>
 <title> ... </title>
 <editor xml:id="LDB">Lou Burnard</editor>
 <respStmt xml:id="BZ">
  <resp>copy editing</resp>
  <name>Brett Zamir</name>
 </respStmt>
</titleStmt>
<!-- ... -->
<revisionDesc status="published">
 <change who="#BZ" when="2008-02-02"
  status="public">
Finished chapter 23</change>
 <change who="#BZ" when="2008-01-02"
  status="draft">
Finished chapter 2</change>
 <change n="P2.2" when="1991-12-21"
  who="#LDB">
Added examples to section 3</change>
 <change when="1991-11-11" who="#MSM">Deleted chapter 10</change>
</revisionDesc>
Example
<profileDesc>
 <creation>
  <listChange>
   <change xml:id="DRAFT1">First draft in pencil</change>
   <change xml:id="DRAFT2"
    notBefore="1880-12-09">
First revision, mostly
       using green ink</change>
   <change xml:id="DRAFT3"
    notBefore="1881-02-13">
Final corrections as
       supplied to printer.</change>
  </listChange>
 </creation>
</profileDesc>
Content model
<content>
 <macroRef key="macro.specialPara"/>
</content>
Schema Declaration
element change
{
   att.ascribed.attributes,
   att.datable.attributes,
   att.docStatus.attributes,
   att.global.attributes,
   att.typed.attributes,
   attribute calendar { list { teidata.pointer+ } }?,
   attribute target { list { teidata.pointer+ } }?,
   macro.specialPara
}

<char>

<char> (character) provides descriptive information about a character. [5.2. Markup Constructs for Representation of Characters and Glyphs]
Module gaiji
Attributes
Contained by
gaiji: charDecl
May contain
Example
<char xml:id="circledU4EBA">
 <localProp name="Name"
  value="CIRCLED IDEOGRAPH 4EBA"/>

 <localProp name="daikanwa" value="36"/>
 <unicodeProp name="Decomposition_Mapping"
  value="circle"/>

 <mapping type="standard"></mapping>
</char>
Content model
<content>
 <alternate minOccurs="0"
  maxOccurs="unbounded">

  <elementRef key="unicodeProp"/>
  <elementRef key="unihanProp"/>
  <elementRef key="localProp"/>
  <elementRef key="mapping"/>
  <elementRef key="figure"/>
  <classRef key="model.graphicLike"/>
  <classRef key="model.noteLike"/>
  <classRef key="model.descLike"/>
 </alternate>
</content>
Schema Declaration
element char
{
   att.global.attributes,
   (
      unicodeProp
    | unihanProp
    | localProp
    | mapping
    | figure
    | model.graphicLike
    | model.noteLike
    | model.descLike
   )*
}

<charDecl>

<charDecl> (character declarations) provides information about nonstandard characters and glyphs. [5.2. Markup Constructs for Representation of Characters and Glyphs]
Module gaiji
Attributes
Member of
Contained by
header: encodingDesc
May contain
core: desc
gaiji: char glyph
Example
<charDecl>
 <char xml:id="aENL">
  <unicodeProp name="Name"
   value="LATIN LETTER ENLARGED SMALL A"/>

  <mapping type="standard">a</mapping>
 </char>
</charDecl>
Content model
<content>
 <sequence>
  <elementRef key="desc" minOccurs="0"/>
  <alternate minOccurs="1"
   maxOccurs="unbounded">

   <elementRef key="char"/>
   <elementRef key="glyph"/>
  </alternate>
 </sequence>
</content>
Schema Declaration
element charDecl { att.global.attributes, ( desc?, ( char | glyph )+ ) }

<choice>

<choice> (choice) groups a number of alternative encodings for the same point in a text. [3.5. Simple Editorial Changes]
Module core
Attributes
Member of
Contained by
May contain
Note

Because the children of a choice element all represent alternative ways of encoding the same sequence, it is natural to think of them as mutually exclusive. However, there may be cases where a full representation of a text requires the alternative encodings to be considered as parallel.

Note also that choice elements may self-nest.

Where the purpose of an encoding is to record multiple witnesses of a single work, rather than to identify multiple possible encoding decisions at a given point, the <app> element and associated elements discussed in section 13.1. The Apparatus Entry, Readings, and Witnesses should be preferred.
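
The nesting of choice elements mentioned above can be sketched as follows. This is a hypothetical fragment with invented sample words, not taken from any source text: an abbreviation is paired with its expansion, and the expansion in turn offers an original and a regularized spelling. The inner choice is nested indirectly, through expan; the content model also permits a choice as a direct child of another choice.

```xml
<choice>
 <abbr>Matie</abbr>
 <expan>
  <!-- inner choice: original spelling versus regularized spelling -->
  <choice>
   <orig>Maiestie</orig>
   <reg>Majesty</reg>
  </choice>
 </expan>
</choice>
```

Each choice in this sketch still contains one of the pairs abbr with expan, or orig with reg.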

Example

An American encoding of Gulliver's Travels which retains the British spelling but also provides a version regularized to American spelling might be encoded as follows.

<p>Lastly, That, upon his solemn oath to observe all the above
articles, the said man-mountain shall have a daily allowance of
meat and drink sufficient for the support of <choice>
  <sic>1724</sic>
  <corr>1728</corr>
 </choice> of our subjects,
with free access to our royal person, and other marks of our
<choice>
  <orig>favour</orig>
  <reg>favor</reg>
 </choice>.</p>
Schematron

<sch:rule context="tei:choice">
<sch:assert test="( tei:corr and tei:sic ) or ( tei:expan and tei:abbr ) or ( tei:reg and tei:orig )"
 role="ERROR">
Element "<sch:name/>" must have corresponding corr/sic, expan/abbr, reg/orig
</sch:assert>
</sch:rule>
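
As a hedged illustration of this rule, using invented text: the first choice below satisfies the assertion because it pairs sic with corr, while the second would be reported as an error, since sic and reg do not form one of the three permitted pairs.

```xml
<p>
 <!-- passes: sic is paired with corr -->
 <choice>
  <sic>recieve</sic>
  <corr>receive</corr>
 </choice>
 <!-- fails the assertion: sic and reg are not a matching pair -->
 <choice>
  <sic>recieve</sic>
  <reg>receive</reg>
 </choice>
</p>
```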
Content model
<content>
 <alternate minOccurs="2"
  maxOccurs="unbounded">

  <classRef key="model.choicePart"/>
  <elementRef key="choice"/>
 </alternate>
</content>
Schema Declaration
element choice
{
   att.global.attributes,
   att.cmc.attributes,
   ( model.choicePart | choice ),
   ( model.choicePart | choice ),
   ( model.choicePart | choice )*
}
Processing Model
<model output="plain"
 predicate="sic and corr" behaviour="inline">

<param name="content" value="corr[1]"/>
</model>
<model output="plain"
 predicate="abbr and expan" behaviour="inline">

<param name="content" value="expan[1]"/>
</model>
<model output="plain"
 predicate="orig and reg" behaviour="inline">

<param name="content" value="reg[1]"/>
</model>
<model predicate="sic and corr"
 behaviour="alternate">

<param name="default" value="corr[1]"/>
<param name="alternate" value="sic[1]"/>
</model>
<model predicate="abbr and expan"
 behaviour="alternate">

<param name="default" value="expan[1]"/>
<param name="alternate" value="abbr[1]"/>
</model>
<model predicate="orig and reg"
 behaviour="alternate">

<param name="default" value="reg[1]"/>
<param name="alternate" value="orig[1]"/>
</model>

<cit>

<cit> (cited quotation) contains a quotation from some other document, together with a bibliographic reference to its source. In a dictionary it may contain an example text with at least one occurrence of the word form, used in the sense being described, or a translation of the headword, or an example. [3.3.3. Quotation 4.3.1. Grouped Texts 10.3.5.1. Examples]
Module core
Attributes
Member of
Contained by
May contain
analysis: pc
figures: figure formula
header: biblFull
linking: anchor
textstructure: floatingText
transcr: fw
Example
<cit>
 <quote>and the breath of the whale is frequently attended with such an insupportable smell,
   as to bring on disorder of the brain.</quote>
 <bibl>Ulloa's South America</bibl>
</cit>
Example
<entry>
 <form>
  <orth>horrifier</orth>
 </form>
 <cit type="translation" xml:lang="en">
  <quote>to horrify</quote>
 </cit>
 <cit type="example">
  <quote>elle était horrifiée par la dépense</quote>
  <cit type="translation" xml:lang="en">
   <quote>she was horrified at the expense.</quote>
  </cit>
 </cit>
</entry>
Example
<cit type="example">
 <quote xml:lang="mix">Ka'an yu tsa'a Pedro.</quote>
 <media url="soundfiles-gen:S_speak_1s_on_behalf_of_Pedro_01_02_03_TS.wav"
  mimeType="audio/wav"/>

 <cit type="translation">
  <quote xml:lang="en">I'm speaking on behalf of Pedro.</quote>
 </cit>
 <cit type="translation">
  <quote xml:lang="es">Estoy hablando de parte de Pedro.</quote>
 </cit>
</cit>
Content model
<content>
 <alternate minOccurs="1"
  maxOccurs="unbounded">

  <classRef key="model.biblLike"/>
  <classRef key="model.egLike"/>
  <classRef key="model.entryPart"/>
  <classRef key="model.global"/>
  <classRef key="model.graphicLike"/>
  <classRef key="model.ptrLike"/>
  <classRef key="model.attributable"/>
  <elementRef key="pc"/>
  <elementRef key="q"/>
 </alternate>
</content>
Schema Declaration
element cit
{
   att.global.attributes,
   att.typed.attributes,
   att.cmc.attributes,
   (
      model.biblLike
    | model.egLike
    | model.entryPart
    | model.global
    | model.graphicLike
    | model.ptrLike
    | model.attributable
    | pc
    | q
   )+
}
Processing Model
<model predicate="child::quote and child::bibl"
 behaviour="cit">

<desc>Insert citation </desc>
</model>

<classCode>

<classCode> (classification code) contains the classification code used for this text in some standard classification system. [2.4.3. The Text Classification]
Module header
Attributes
scheme identifies the classification system in use, as defined by, e.g. a taxonomy element, or some other resource.
Status Required
Datatype teidata.pointer
Contained by
header: textClass
May contain
figures: figure
header: idno
linking: anchor
tagdocs: code
transcr: fw subst
character data
Example
<classCode scheme="http://www.udc.org">410</classCode>
Content model
<content>
 <macroRef key="macro.phraseSeq.limited"/>
</content>
Schema Declaration
element classCode
{
   att.global.attributes,
   attribute scheme { teidata.pointer },
   macro.phraseSeq.limited
}

<classDecl>

<classDecl> (classification declarations) contains one or more taxonomies defining any classificatory codes used elsewhere in the text. [2.3.7. The Classification Declaration 2.3. The Encoding Description]
Module header
Attributes
Member of
Contained by
header: encodingDesc
May contain
header: taxonomy
Example
<classDecl>
 <taxonomy xml:id="LCSH">
  <bibl>Library of Congress Subject Headings</bibl>
 </taxonomy>
</classDecl>
<!-- ... -->
<textClass>
 <keywords scheme="#LCSH">
  <term>Political science</term>
  <term>United States -- Politics and government —
     Revolution, 1775-1783</term>
 </keywords>
</textClass>
Content model
<content>
 <elementRef key="taxonomy" minOccurs="1"
  maxOccurs="unbounded"/>

</content>
Schema Declaration
element classDecl { att.global.attributes, taxonomy+ }

<closer>

<closer> (closer) groups together salutations, datelines, and similar phrases appearing as a final group at the end of a division, especially of a letter. [4.2.2. Openers and Closers 4.2. Elements Common to All Divisions]
Module textstructure
Attributes
Member of
Contained by
core: lg list
figures: figure table
May contain
analysis: c pc s w
figures: figure formula
gaiji: g
header: idno
linking: anchor seg
tagdocs: code
textstructure: dateline salute signed
transcr: fw subst supplied
verse: rhyme
character data
Example
<div type="letter">
 <p> perhaps you will favour me with a sight of it when convenient.</p>
 <closer>
  <salute>I remain, &amp;c. &amp;c.</salute>
  <signed>H. Colburn</signed>
 </closer>
</div>
Example
<div type="chapter">
 <p>
<!-- ... --> and his heart was going like mad and yes I said yes I will Yes.</p>
 <closer>
  <dateline>
   <name type="place">Trieste-Zürich-Paris,</name>
   <date>1914–1921</date>
  </dateline>
 </closer>
</div>
Content model
<content>
 <alternate minOccurs="0"
  maxOccurs="unbounded">

  <textNode/>
  <classRef key="model.gLike"/>
  <elementRef key="signed"/>
  <elementRef key="dateline"/>
  <elementRef key="salute"/>
  <classRef key="model.phrase"/>
  <classRef key="model.global"/>
 </alternate>
</content>
Schema Declaration
element closer
{
   att.global.attributes,
   att.written.attributes,
   att.cmc.attributes,
   (
      text
    | model.gLike
    | signed
    | dateline
    | salute
    | model.phrase
    | model.global
   )*
}
Processing Model
<model behaviour="block">
<outputRendition>margin-top: 1em; margin-left: 1em; margin-left:
1em;</outputRendition>
</model>

<code>

<code> contains literal code from some formal language such as a programming language. [23.1.1. Phrase Level Terms]
Module tagdocs
Attributes
lang (formal language) a name identifying the formal language in which the code is expressed
Status Optional
Datatype teidata.word
Member of
Contained by
May contain
Character data only
Example
<code lang="JAVA"> Size fCheckbox1Size = new Size();
fCheckbox1Size.Height = 500;
fCheckbox1Size.Width = 500;
xCheckbox1.setSize(fCheckbox1Size);
</code>
Content model
<content>
 <textNode/>
</content>
Schema Declaration
element code { att.global.attributes, attribute lang { teidata.word }?, text }
Processing Model
<model behaviour="inline">
<outputRendition>font-family:monospace</outputRendition>
</model>

<corr>

<corr> (correction) contains the correct form of a passage apparently erroneous in the copy text. [3.5.1. Apparent Errors]
Module core
Attributes
Member of
Contained by
May contain
Example

If all that is desired is to call attention to the fact that the copy text has been corrected, corr may be used alone:

I don't know,
Juan. It's so far in the past now — how <corr>can we</corr> prove
or disprove anyone's theories?
Example

It is also possible, using the choice and sic elements, to provide an uncorrected reading:

I don't know, Juan. It's so far in the past now —
how <choice>
 <sic>we can</sic>
 <corr>can we</corr>
</choice> prove or
disprove anyone's theories?
Content model
<content>
 <macroRef key="macro.paraContent"/>
</content>
Schema Declaration
element corr
{
   att.global.attributes,
   att.editLike.attributes,
   att.typed.attributes,
   att.cmc.attributes,
   macro.paraContent
}
Processing Model
<model predicate="parent::choice and count(parent::*/*) gt 1"
 behaviour="inline">

<desc>simple inline, if in parent choice. </desc>
</model>
<model behaviour="inline">
<outputRendition scope="before">content: '[';</outputRendition>
<outputRendition scope="after">content: ']';</outputRendition>
</model>

<creation>

<creation> (creation) contains information about the creation of a text. [2.4.1. Creation 2.4. The Profile Description]
Module header
Attributes
calendar indicates one or more systems or calendars to which the date represented by the content of this element belongs.
Deprecated will be removed on 2024-11-11
Status Optional
Datatype 1–∞ occurrences of teidata.pointer separated by whitespace
Schematron

<sch:rule context="tei:*[@calendar]">
<sch:assert test="string-length( normalize-space(.) ) gt 0"> @calendar indicates one or more
systems or calendars to which the date represented by the content of this element belongs,
but this <sch:name/> element has no textual content.</sch:assert>
</sch:rule>
Member of
Contained by
header: profileDesc
May contain
Note

The creation element may be used to record details of a text's creation, e.g. the date and place it was composed, if these are of interest.

It may also contain a more structured account of the various stages or revisions associated with the evolution of a text; this should be encoded using the listChange element. It should not be confused with the publicationStmt element, which records date and place of publication.

Example
<creation>
 <date>Before 1987</date>
</creation>
Example
<creation>
 <date when="1988-07-10">10 July 1988</date>
</creation>
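
The more structured account described in the note above might look like the following. This is only a sketch: the place, the dates, the stage descriptions, and the xml:id values are invented for illustration. It combines plain text, a date, and a listChange, all of which the content model of creation permits.

```xml
<creation>Composed in Vienna,
 <date when="1880-12">December 1880</date>
 <listChange>
  <!-- one change element per stage in the evolution of the text -->
  <change xml:id="STAGE1" notBefore="1880-12-09">First revision</change>
  <change xml:id="STAGE2" notBefore="1881-02-13">Final corrections</change>
 </listChange>
</creation>
```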
Content model
<content>
 <alternate minOccurs="0"
  maxOccurs="unbounded">

  <textNode/>
  <classRef key="model.limitedPhrase"/>
  <elementRef key="listChange"/>
 </alternate>
</content>
Schema Declaration
element creation
{
   att.global.attributes,
   att.datable.attributes,
   attribute calendar { list { teidata.pointer+ } }?,
   ( text | model.limitedPhrase | listChange )*
}

<date>

<date> (date) contains a date in any format. [3.6.4. Dates and Times 2.2.4. Publication, Distribution, Licensing, etc. 2.6. The Revision Description 3.12.2.4. Imprint, Size of a Document, and Reprint Information 16.2.3. The Setting Description 14.4. Dates]
Module core
Attributes
Member of
Contained by
May contain
analysis: c pc s w
figures: figure formula
gaiji: g
header: idno
linking: anchor seg
tagdocs: code
transcr: fw subst supplied
verse: rhyme
character data
Example
<date when="1980-02">early February 1980</date>
Example
Given on the <date when="1977-06-12">Twelfth Day
of June in the Year of Our Lord One Thousand Nine Hundred and Seventy-seven of the Republic
the Two Hundredth and first and of the University the Eighty-Sixth.</date>
Example
<date when="1990-09">September 1990</date>
Content model
<content>
 <alternate minOccurs="0"
  maxOccurs="unbounded">

  <textNode/>
  <classRef key="model.gLike"/>
  <classRef key="model.phrase"/>
  <classRef key="model.global"/>
 </alternate>
</content>
Schema Declaration
element date
{
   att.global.attributes,
   att.canonical.attributes,
   att.datable.attributes,
   att.calendarSystem.attributes,
   att.editLike.attributes,
   att.dimensions.attributes,
   att.typed.attributes,
   att.cmc.attributes,
   ( text | model.gLike | model.phrase | model.global )*
}
Processing Model
<model output="print" predicate="text()"
 behaviour="inline"/>

<model output="print"
 predicate="@when and not(text())" behaviour="inline">

<param name="content" value="@when"/>
</model>
<model output="web" predicate="@when"
 behaviour="alternate">

<param name="default" value="."/>
<param name="alternate" value="@when"/>
</model>
<model predicate="text()" behaviour="inline"/>
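
Read together, the four models above distinguish dates with and without textual content. The fragment below is a hedged sketch with invented dates; the comments assume that models are tried in the order given and that the first model whose output and predicate both match is the one applied.

```xml
<p>
 <!-- has text and @when: for print output the first model applies and the
      content is shown inline; for web output the third model applies,
      offering the content by default and the value of @when as alternate -->
 <date when="1980-02">early February 1980</date>
 <!-- no text, only @when: for print output the second model applies and
      supplies the value of @when as the content -->
 <date when="1980-02"/>
</p>
```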

<dateline>

<dateline> (dateline) contains a brief description of the place, date, time, etc. of production of a letter, newspaper story, or other work, prefixed or suffixed to it as a kind of heading or trailer. [4.2.2. Openers and Closers]
Module textstructure
Attributes
Member of
Contained by
core: lg list
drama: castList
figures: figure table
May contain
analysis: c pc s w
figures: figure formula
gaiji: g
header: idno
linking: anchor seg
tagdocs: code
textstructure: docDate
transcr: fw subst supplied
verse: rhyme
character data
Example
<dateline>Walden, this 29. of August 1592</dateline>
Example
<div type="chapter">
 <p>
<!-- ... --> and his heart was going like mad and yes I said yes I will Yes.</p>
 <closer>
  <dateline>
   <name type="place">Trieste-Zürich-Paris,</name>
   <date>1914–1921</date>
  </dateline>
 </closer>
</div>
Content model
<content>
 <alternate minOccurs="0"
  maxOccurs="unbounded">

  <textNode/>
  <classRef key="model.gLike"/>
  <classRef key="model.phrase"/>
  <classRef key="model.global"/>
  <elementRef key="docDate"/>
 </alternate>
</content>
Schema Declaration
element dateline
{
   att.global.attributes,
   att.cmc.attributes,
   ( text | model.gLike | model.phrase | model.global | docDate )*
}
Processing Model
<model behaviour="block"/>

<del>

<del> (deletion) contains a letter, word, or passage deleted, marked as deleted, or otherwise indicated as superfluous or spurious in the copy text by an author, scribe, or a previous annotator or corrector. [3.5.3. Additions, Deletions, and Omissions]
Module core
Attributes
Member of
Contained by
May contain
Note

This element should be used for deletion of shorter sequences of text, typically single words or phrases. The <delSpan> element should be used for longer sequences of text, for those containing structural subdivisions, and for those containing overlapping additions and deletions.

The text deleted must be at least partially legible in order for the encoder to be able to transcribe it (unless it is restored in a supplied tag). Illegible or lost text within a deletion may be marked using the gap tag to signal that text is present but has not been transcribed, or is no longer visible. Attributes on the gap element may be used to indicate how much text is omitted, the reason for omitting it, etc. If text is not fully legible, the unclear element (available when using the additional tagset for transcription of primary sources) should be used to signal the areas of text which cannot be read with confidence in a similar way.

Degrees of uncertainty over what can still be read, or whether a deletion was intended may be indicated by use of the <certainty> element (see 22. Certainty, Precision, and Responsibility).

There is a clear distinction in the TEI between del and <surplus> on the one hand and gap or unclear on the other. del indicates a deletion present in the source being transcribed, which states the author's or a later scribe's intent to cancel or remove text. <surplus> indicates material present in the source being transcribed which should have been so deleted, but which is not in fact. gap or unclear, by contrast, signal an editor's or encoder's decision to omit something or their inability to read the source text. See sections 12.3.1.7. Text Omitted from or Supplied in the Transcription and 12.3.3.2. Use of the gap, del, damage, unclear, and supplied Elements in Combination for the relationship between these and other related elements used in detailed transcription.

Example
<l>
 <del rend="overtyped">Mein</del> Frisch <del rend="overstrike" type="primary">schwebt</del>
weht der Wind
</l>
Example
<del rend="overstrike">
 <gap reason="illegible" quantity="5"
  unit="character"/>

</del>
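
The distinction drawn in the note above between del, unclear, and gap can be sketched in a single line of verse. The text is invented, and the fragment assumes that unclear is available in the schema in use: the deleted passage is partly legible, partly uncertain, and partly unreadable.

```xml
<l>
 <del rend="overstrike">the <unclear>quiet</unclear>
  <!-- one deleted word that the encoder could not read at all -->
  <gap reason="illegible" quantity="1" unit="word"/> sea</del>
 the restless sea
</l>
```

Here del records the writer's cancellation, while unclear and gap record the encoder's difficulty in reading what was cancelled.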
Content model
<content>
 <macroRef key="macro.paraContent"/>
</content>
Schema Declaration
element del
{
   att.global.attributes,
   att.transcriptional.attributes,
   att.typed.attributes,
   att.dimensions.attributes,
   att.cmc.attributes,
   macro.paraContent
}
Processing Model
<model behaviour="inline">
<outputRendition> text-decoration: line-through;</outputRendition>
</model>

<desc>

<desc> (description) contains a short description of the purpose, function, or use of its parent element, or when the parent is a documentation element, describes or defines the object being documented. [23.4.1. Description of Components]
Module core
Attributes
type characterizes the element in some sense, using any convenient classification scheme or typology.
Derived from att.typed
Status Optional
Datatype teidata.enumerated
<dataSpec module="tei"
 ident="teidata.point" validUntil="2050-02-25">

 <desc type="deprecationInfo"
  versionDate="2018-09-14" xml:lang="en">
Several standards bodies, including NIST in the USA,
   strongly recommend against ending the representation of a number
   with a decimal point. So instead of <q>3.</q> use either <q>3</q>
   or <q>3.0</q>.</desc>
<!-- ... -->
</dataSpec>
Member of
Contained by
May contain
Note

When used in a specification element such as <elementSpec>, TEI convention requires that this be expressed as a finite clause, beginning with an active verb.

Example

Example of a desc element inside a documentation element.

<dataSpec module="tei"
 ident="teidata.point">

 <desc versionDate="2010-10-17"
  xml:lang="en">
defines the data type used to express a point in cartesian space.</desc>
 <content>
  <dataRef name="token"
   restriction="(-?[0-9]+(\.[0-9]+)?,-?[0-9]+(\.[0-9]+)?)"/>

 </content>
<!-- ... -->
</dataSpec>
Example

Example of a desc element in a non-documentation element.

<place xml:id="KERG2">
 <placeName>Kerguelen Islands</placeName>
<!-- ... -->
 <terrain>
  <desc>antarctic tundra</desc>
 </terrain>
<!-- ... -->
</place>
Schematron

A desc with a type of deprecationInfo should only occur when its parent element is being deprecated. Furthermore, it should always occur in an element that is being deprecated when desc is a valid child of that element.

<sch:rule context="tei:desc[ @type eq 'deprecationInfo']">
<sch:assert test="../@validUntil">Information about a
deprecation should only be present in a specification element
that is being deprecated: that is, only an element that has a
@validUntil attribute should have a child &lt;desc
type="deprecationInfo"&gt;.</sch:assert>
</sch:rule>
Content model
<content>
 <macroRef key="macro.limitedContent"/>
</content>
Schema Declaration
element desc
{
   att.global.attributes,
   att.typed.attribute.subtype,
   att.cmc.attributes,
   attribute type { teidata.enumerated }?,
   macro.limitedContent
}
Processing Model
<model behaviour="inline"/>

<distributor>

<distributor> (distributor) supplies the name of a person or other agency responsible for the distribution of a text. [2.2.4. Publication, Distribution, Licensing, etc.]
Module header
Attributes
Member of
Contained by
core: bibl
May contain
analysis: c pc s w
figures: figure formula
gaiji: g
header: idno
linking: anchor seg
tagdocs: code
textstructure: floatingText
transcr: fw subst supplied
verse: rhyme
character data
Example
<distributor>Oxford Text Archive</distributor>
<distributor>Redwood and Burn Ltd</distributor>
Content model
<content>
 <macroRef key="macro.phraseSeq"/>
</content>
Schema Declaration
element distributor
{
   att.global.attributes,
   att.canonical.attributes,
   macro.phraseSeq
}

<div>

<div> (text division) contains a subdivision of the front, body, or back of a text. [4.1. Divisions of the Body]
Module textstructure
Attributes
Member of
Contained by
textstructure: back body div front
May contain
Example
<body>
 <div type="part">
  <head>Fallacies of Authority</head>
  <p>The subject of which is Authority in various shapes, and the object, to repress all
     exercise of the reasoning faculty.</p>
  <div n="1" type="chapter">
   <head>The Nature of Authority</head>
   <p>With reference to any proposed measures having for their object the greatest
       happiness of the greatest number [...]</p>
   <div n="1.1" type="section">
    <head>Analysis of Authority</head>
    <p>What on any given occasion is the legitimate weight or influence to be attached to
         authority [...] </p>
   </div>
   <div n="1.2" type="section">
    <head>Appeal to Authority, in What Cases Fallacious.</head>
    <p>Reference to authority is open to the charge of fallacy when [...] </p>
   </div>
  </div>
 </div>
</body>
Schematron

<sch:rule context="tei:div">
<sch:report test="(ancestor::tei:l or ancestor::tei:lg) and not(ancestor::tei:floatingText)"> Abstract model violation: Lines may not contain higher-level structural elements such as div, unless div is a descendant of floatingText.
</sch:report>
</sch:rule>
Schematron

<sch:rule context="tei:div">
<sch:report test="(ancestor::tei:p or ancestor::tei:ab) and not(ancestor::tei:floatingText)"> Abstract model violation: p and ab may not contain higher-level structural elements such as div, unless div is a descendant of floatingText.
</sch:report>
</sch:rule>
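
A hedged illustration of the floatingText exception in the two rules above, using invented text: the inner div has a p among its ancestors, but it also has a floatingText ancestor, so neither report should be triggered.

```xml
<p>The letter she enclosed read as follows:
 <floatingText type="letter">
  <body>
   <!-- permitted: this div is a descendant of floatingText -->
   <div type="letter">
    <p>Dear Sir, I have received your parcel.</p>
   </div>
  </body>
 </floatingText>
</p>
```

Without the intervening floatingText, the same div placed inside the outer p would be reported as an abstract model violation.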
Content model
<content>
 <sequence minOccurs="1" maxOccurs="1">
  <alternate minOccurs="0"
   maxOccurs="unbounded">

   <classRef key="model.divTop"/>
   <classRef key="model.global"/>
  </alternate>
  <sequence minOccurs="0" maxOccurs="1">
   <alternate minOccurs="1" maxOccurs="1">
    <sequence minOccurs="1"
     maxOccurs="unbounded">

     <alternate minOccurs="1" maxOccurs="1">
      <classRef key="model.divLike"/>
      <classRef key="model.divGenLike"/>
     </alternate>
     <classRef key="model.global"
      minOccurs="0" maxOccurs="unbounded"/>

    </sequence>
    <sequence minOccurs="1" maxOccurs="1">
     <sequence minOccurs="1"
      maxOccurs="unbounded">

      <alternate minOccurs="1"
       maxOccurs="1">

       <elementRef key="schemaSpec"/>
       <classRef key="model.common"/>
      </alternate>
      <classRef key="model.global"
       minOccurs="0" maxOccurs="unbounded"/>

     </sequence>
     <sequence minOccurs="0"
      maxOccurs="unbounded">

      <alternate minOccurs="1"
       maxOccurs="1">

       <classRef key="model.divLike"/>
       <classRef key="model.divGenLike"/>
      </alternate>
      <classRef key="model.global"
       minOccurs="0" maxOccurs="unbounded"/>

     </sequence>
    </sequence>
   </alternate>
   <sequence minOccurs="0"
    maxOccurs="unbounded">

    <classRef key="model.divBottom"/>
    <classRef key="model.global"
     minOccurs="0" maxOccurs="unbounded"/>

   </sequence>
  </sequence>
 </sequence>
</content>
Schema Declaration
element div
{
   att.global.attributes,
   att.divLike.attributes,
   att.typed.attributes,
   att.written.attributes,
   (
      ( model.divTop | model.global )*,
      (
         (
            (
               ( ( ( model.divLike | model.divGenLike ), model.global* )+ )
             | (
                  ( ( ( schemaSpec | model.common ), model.global* )+ ),
                  ( ( ( model.divLike | model.divGenLike ), model.global* )* )
               )
            ),
            ( ( model.divBottom, model.global* )* )
         )?
      )
   )
}
Processing Model
<model predicate="@type='title_page'"
 behaviour="block">

<outputRendition>border: 1px solid black; padding: 5px;</outputRendition>
</model>
<model behaviour="section"
 predicate="parent::body or parent::front or parent::back"/>

<model behaviour="block"/>

<docAuthor>

<docAuthor> (document author) contains the name of the author of the document, as given on the title page (often but not always contained in a byline). [4.6. Title Pages]
Module textstructure
Attributes
Member of
Contained by
core: lg list
drama: castList
figures: figure table
May contain
analysis: c pc s w
figures: figure formula
gaiji: g
header: idno
linking: anchor seg
tagdocs: code
textstructure: floatingText
transcr: fw subst supplied
verse: rhyme
character data
Note

The document author's name often occurs within a byline, but the docAuthor element may be used whether the byline element is used or not. It should be used only for the author(s) of the entire document, not for author(s) of any subset or part of it. (Attributions of authorship of a subset or part of the document, for example of a chapter in a textbook or an article in a newspaper, may be encoded with byline without docAuthor.)

Example
<titlePage>
 <docTitle>
  <titlePart>Travels into Several Remote Nations of the World, in Four
     Parts.</titlePart>
 </docTitle>
 <byline> By <docAuthor>Lemuel Gulliver</docAuthor>, First a Surgeon,
   and then a Captain of several Ships</byline>
</titlePage>
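
By contrast, the note above says that an attribution for only part of a document may use byline without docAuthor. A minimal sketch, with an invented chapter title, contributor, and text:

```xml
<div type="chapter">
 <head>On the Care of Orchards</head>
 <!-- attribution for this chapter only, so docAuthor is not used -->
 <byline>By A. Contributor</byline>
 <p>The first concern of the grower is the soil.</p>
</div>
```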
Content model
<content>
 <macroRef key="macro.phraseSeq"/>
</content>
Schema Declaration
element docAuthor
{
   att.global.attributes,
   att.canonical.attributes,
   att.cmc.attributes,
   macro.phraseSeq
}
Processing Model
<model behaviour="inline"/>

<docDate>

<docDate> (document date) contains the date of a document, as given on a title page or in a dateline. [4.6. Title Pages]
Module textstructure
Attributes
Member of
Contained by
May contain
analysis: c pc s w
figures: figure formula
gaiji: g
header: idno
linking: anchor seg
tagdocs: code
textstructure: floatingText
transcr: fw subst supplied
verse: rhyme
character data
Note

Cf. the general date element in the core tag set. This specialized element is provided for convenience in marking and processing the date of the documents, since it is likely to require specialized handling for many applications. It should be used only for the date of the entire document, not for any subset or part of it.

Example
<docImprint>Oxford, Clarendon Press, <docDate>1987</docDate>
</docImprint>
Content model
<content>
 <macroRef key="macro.phraseSeq"/>
</content>
Schema Declaration
element docDate
{
   att.global.attributes,
   att.cmc.attributes,
   att.datable.attributes,
   att.calendarSystem.attributes,
   macro.phraseSeq
}
Processing Model
<model behaviour="inline"/>

<docEdition>

<docEdition> (document edition) contains an edition statement as presented on a title page of a document. [4.6. Title Pages]
Module textstructure
Attributes
Member of
Contained by
textstructure: back front titlePage
May contain
Note

Cf. the edition element of bibliographic citation. As usual, the shorter name has been given to the more frequent element.

Example
<docEdition>The Third edition Corrected</docEdition>
Content model
<content>
 <macroRef key="macro.paraContent"/>
</content>
Schema Declaration
element docEdition { att.global.attributes, macro.paraContent }
Processing Model
<model behaviour="inline"/>

<docImprint>

<docImprint> (document imprint) contains the imprint statement (place and date of publication, publisher name), as given (usually) at the foot of a title page. [4.6. Title Pages]
Module textstructure
Attributes
Member of
Contained by
textstructure: back front titlePage
May contain
analysis: c pc s w
figures: figure formula
gaiji: g
header: idno
linking: anchor seg
tagdocs: code
textstructure: docDate
transcr: fw subst supplied
verse: rhyme
character data
Note

Cf. the <imprint> element of bibliographic citations. As with title, author, and editions, the shorter name is reserved for the element likely to be used more often.

Example
<docImprint>Oxford, Clarendon Press, 1987</docImprint>
Imprints may be somewhat more complex:
<docImprint>
 <pubPlace>London</pubPlace>
Printed for <name>E. Nutt</name>,
at
<pubPlace>Royal Exchange</pubPlace>;
<name>J. Roberts</name> in
<pubPlace>wick-Lane</pubPlace>;
<name>A. Dodd</name> without
<pubPlace>Temple-Bar</pubPlace>;
and <name>J. Graves</name> in
<pubPlace>St. James's-street.</pubPlace>
 <date>1722.</date>
</docImprint>
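The first example above can also be encoded with its parts marked up individually, using only elements named in the content model for this element (pubPlace, publisher and docDate). This is a sketch of one plausible encoding rather than a recommended form; the when attribute is an added assumption, permitted because docDate carries att.datable.attributes.

```xml
<!-- Same imprint as the first example, with place, publisher and date tagged -->
<docImprint>
 <pubPlace>Oxford</pubPlace>,
 <publisher>Clarendon Press</publisher>,
 <docDate when="1987">1987</docDate>
</docImprint>
```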
Content model
<content>
 <alternate minOccurs="0"
  maxOccurs="unbounded">

  <textNode/>
  <classRef key="model.gLike"/>
  <classRef key="model.phrase"/>
  <elementRef key="pubPlace"/>
  <elementRef key="docDate"/>
  <elementRef key="publisher"/>
  <classRef key="model.global"/>
 </alternate>
</content>
Schema Declaration
element docImprint
{
   att.global.attributes,
   (
      text
    | model.gLike
    | model.phrase
    | pubPlace
    | docDate
    | publisher
    | model.global
   )*
}
Processing Model
<model behaviour="inline"/>

<docTitle>

<docTitle> (document title) contains the title of a document, including all its constituents, as given on a title page. [4.6. Title Pages]
Module textstructure
Attributes
Member of
Contained by
textstructure: back front titlePage
May contain
figures: figure
linking: anchor
textstructure: titlePart
transcr: fw
Example
<docTitle>
 <titlePart type="main">The DUNCIAD, VARIOURVM.</titlePart>
 <titlePart type="sub">WITH THE PROLEGOMENA of SCRIBLERUS.</titlePart>
</docTitle>
Content model
<content>
 <sequence minOccurs="1" maxOccurs="1">
  <classRef key="model.global"
   minOccurs="0" maxOccurs="unbounded"/>

  <sequence minOccurs="1"
   maxOccurs="unbounded">

   <elementRef key="titlePart"/>
   <classRef key="model.global"
    minOccurs="0" maxOccurs="unbounded"/>

  </sequence>
 </sequence>
</content>
Schema Declaration
element docTitle
{
   att.global.attributes,
   att.canonical.attributes,
   ( model.global*, ( ( titlePart, model.global* )+ ) )
}
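The content model above requires at least one titlePart and allows members of model.global before the first titlePart and after each one. As a sketch of what that permits, the variant below interleaves anchor elements (listed under "May contain") with the title parts of the example; the xml:id values are hypothetical.

```xml
<docTitle>
 <!-- model.global* before the first titlePart -->
 <anchor xml:id="title-start"/>
 <titlePart type="main">The DUNCIAD, VARIOURVM.</titlePart>
 <!-- model.global* after a titlePart -->
 <anchor xml:id="title-sub"/>
 <titlePart type="sub">WITH THE PROLEGOMENA of SCRIBLERUS.</titlePart>
</docTitle>
```

A docTitle containing only anchor elements and no titlePart would not satisfy the pattern, because the inner sequence must occur at least once.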
Processing Model
<model behaviour="block"
 useSourceRendition="true">

<outputRendition>font-size: larger;</outputRendition>
</model>

<edition>

<edition> (edition) describes the particularities of one edition of a text. [2.2.2. The Edition Statement]
Module header
Attributes
Member of
Contained by
core: bibl
header: editionStmt
May contain
analysis: c pc s w
figures: figure formula
gaiji: g
header: idno
linking: anchor seg
tagdocs: code
textstructure: floatingText
transcr: fw subst supplied
verse: rhyme
character data
Example
<edition>First edition <date>Oct 1990</date>
</edition>
<edition n="S2">Students' edition</edition>
Content model
<content>
 <macroRef key="macro.phraseSeq"/>
</content>
Schema Declaration
element edition { att.global.attributes, macro.phraseSeq }

<editionStmt>

<editionStmt> (edition statement) groups information relating to one edition of a text. [2.2.2. The Edition Statement 2.2. The File Description]
Module header
Attributes
Contained by
May contain
header: edition
linking: ab
Example
<editionStmt>
 <edition n="S2">Students' edition</edition>
 <respStmt>
  <resp>Adapted by </resp>
  <name>Elizabeth Kirk</name>
 </respStmt>
</editionStmt>
Example
<editionStmt>
 <p>First edition, <date>Michaelmas Term, 1991.</date>
 </p>
</editionStmt>
Content model
<content>
 <alternate>
  <classRef key="model.pLike" minOccurs="1"
   maxOccurs="unbounded"/>

  <sequence>
   <elementRef key="edition"/>
   <classRef key="model.respLike"
    minOccurs="0" maxOccurs="unbounded"/>

  </sequence>
 </alternate>
</content>
Schema Declaration
element editionStmt
{
   att.global.attributes,
   ( model.pLike+ | ( edition, model.respLike* ) )
}

<editor>

<editor> contains a secondary statement of responsibility for a bibliographic item, for example the name of an individual, institution or organization, (or of several such) acting as editor, compiler, translator, etc. [3.12.2.2. Titles, Authors, and Editors]
Module core
Attributes
calendar indicates one or more systems or calendars to which the date represented by the content of this element belongs.
Deprecated will be removed on 2024-11-11
Status Optional
Datatype 1–∞ occurrences of teidata.pointer separated by whitespace
Schematron

<sch:rule context="tei:*[@calendar]">
<sch:assert test="string-length( normalize-space(.) ) gt 0"> @calendar indicates one or more
systems or calendars to which the date represented by the content of this element belongs,
but this <sch:name/> element has no textual content.</sch:assert>
</sch:rule>
Member of
Contained by
May contain
analysis: c pc s w
figures: figure formula
gaiji: g
header: idno
linking: anchor seg
tagdocs: code
textstructure: floatingText
transcr: fw subst supplied
verse: rhyme
character data
Note

A consistent format should be adopted.

Particularly where cataloguing is likely to be based on the content of the header, it is advisable to use generally recognized authority lists for the exact form of personal names.

Example
<editor role="Technical_Editor">Ron Van den Branden</editor>
<editor role="Editor-in-Chief">John Walsh</editor>
<editor role="Managing_Editor">Anne Baillot</editor>
Schematron

<sch:rule context="tei:*[@calendar]">
<sch:assert test="string-length( normalize-space(.) ) gt 0"> @calendar indicates one or more
systems or calendars to which the date represented by the content of this element belongs,
but this <sch:name/> element has no textual content.</sch:assert>
</sch:rule>
Content model
<content>
 <macroRef key="macro.phraseSeq"/>
</content>
Schema Declaration
element editor
{
   att.global.attributes,
   att.naming.attributes,
   att.datable.attributes,
   attribute calendar { list { teidata.pointer+ } }?,
   macro.phraseSeq
}
Processing Model
<model predicate="ancestor::teiHeader"
 behaviour="omit"/>

<model behaviour="inline"/>
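The two models above distinguish an editor statement in the header from one in the body of the text. The fragments below sketch each case, assuming that models are tried in order and that the first one whose predicate holds is applied; the names and titles are invented for illustration.

```xml
<!-- Fragment 1: a titleStmt inside teiHeader/fileDesc. The editor has a
     teiHeader ancestor, so the first model matches and it is omitted. -->
<titleStmt>
 <title>A hypothetical edition</title>
 <editor>A. N. Editor</editor>
</titleStmt>

<!-- Fragment 2: a citation in the body of the text. There is no teiHeader
     ancestor, so the second model applies and the editor is rendered inline. -->
<bibl>
 <title>A hypothetical edition</title>, ed. <editor>A. N. Editor</editor>
</bibl>
```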

<editorialDecl>

<editorialDecl> (editorial practice declaration) provides details of editorial principles and practices applied during the encoding of a text. [2.3.3. The Editorial Practices Declaration 2.3. The Encoding Description 16.3.2. Declarable Elements]
Module header
Attributes
Member of
Contained by
header: encodingDesc
May contain
core: p
linking: ab
Example
<encodingDesc>
 <editorialDecl>
  <p>EEBO-TCP is a partnership between the Universities of Michigan and Oxford
     and the publisher ProQuest to create accurately transcribed and encoded
     texts based on the image sets published by ProQuest via their Early English
     Books Online (EEBO) database (http://eebo.chadwyck.com). The general aim of
     EEBO-TCP is to encode one copy (usually the first edition) of every
     monographic English-language title published between 1473 and 1700 available
     in EEBO.</p>
  <p>EEBO-TCP aimed to produce large quantities of textual data within the usual
     project restraints of time and funding, and therefore chose to create
     diplomatic transcriptions (as opposed to critical editions) with
     light-touch, mainly structural encoding based on the Text Encoding
     Initiative (http://www.tei-c.org).</p>
  <p>The EEBO-TCP project was divided into two phases. The 25,363 texts created
     during Phase 1 of the project have been released into the public domain as
     of 1 January 2015. Anyone can now take and use these texts for their own
     purposes, but we respectfully request that due credit and attribution is
     given to their original source.</p>
  <p>Users should be aware of the process of creating the TCP texts, and
     therefore of any assumptions that can be made about the data.</p>
  <p>Text selection was based on the New Cambridge Bibliography of English
     Literature (NCBEL). If an author (or for an anonymous work, the title)
     appears in NCBEL, then their works are eligible for inclusion. Selection was
     intended to range over a wide variety of subject areas, to reflect the true
     nature of the print record of the period. In general, first editions of
     works in English were prioritized, although there are a number of works in
     other languages, notably Latin and Welsh, included and sometimes a second or
     later edition of a work was chosen if there was a compelling reason to do
     so.</p>
  <p>Image sets were sent to external keying companies for transcription and
     basic encoding. Quality assurance was then carried out by editorial teams in
     Oxford and Michigan. 5% (or 5 pages, whichever is the greater) of each text
     was proofread for accuracy and those which did not meet QA standards were
     returned to the keyers to be redone. After proofreading, the encoding was
     enhanced and/or corrected and characters marked as illegible were corrected
     where possible up to a limit of 100 instances per text. Any remaining
     illegibles were encoded as &lt;gap&gt;s. Understanding these processes
     should make clear that, while the overall quality of TCP data is very good,
     some errors will remain and some readable characters will be marked as
     illegible. Users should bear in mind that in all likelihood such instances
     will never have been looked at by a TCP editor.</p>
  <p>The texts were encoded and linked to page images in accordance with level 4
     of the TEI in Libraries guidelines.</p>
  <p>Copies of the texts have been issued variously as SGML (TCP schema; ASCII
     text with mnemonic sdata character entities); displayable XML (TCP schema;
     characters represented either as UTF-8 Unicode or text strings within
     braces); or lossless XML (TEI P5, characters represented either as UTF-8
     Unicode or TEI g elements).</p>
  <p>Keying and markup guidelines are available at the <ref target="http://www.textcreationpartnership.org/docs/.">Text Creation
       Partnership web site</ref>.</p>
 </editorialDecl>
</encodingDesc>
Content model
<content>
 <alternate minOccurs="1"
  maxOccurs="unbounded">

  <classRef key="model.pLike"/>
  <classRef key="model.editorialDeclPart"/>
 </alternate>
</content>
Schema Declaration
element editorialDecl
{
   att.global.attributes,
   ( model.pLike | model.editorialDeclPart )+
}

<email>

<email> (electronic mail address) contains an email address identifying a location to which email messages can be delivered. [3.6.2. Addresses]
Module core
Attributes
Member of
Contained by
May contain
analysis: c pc s w
figures: figure formula
gaiji: g
header: idno
linking: anchor seg
tagdocs: code
textstructure: floatingText
transcr: fw subst supplied
verse: rhyme
character data
Note

The format of a modern Internet email address is defined in RFC 2822.

Example
<email>membership@tei-c.org</email>
Content model
<content>
 <macroRef key="macro.phraseSeq"/>
</content>
Schema Declaration
element email { att.global.attributes, att.cmc.attributes, macro.phraseSeq }
Processing Model
<model behaviour="inline">
<outputRendition>font-family:monospace</outputRendition>
</model>

<encodingDesc>

<encodingDesc> (encoding description) documents the relationship between an electronic text and the source or sources from which it was derived. [2.3. The Encoding Description 2.1.1. The TEI Header and Its Components]
Module header
Attributes
Member of
Contained by
header: teiHeader
May contain
Example
<encodingDesc>
 <p>Basic encoding, capturing lexical information only. All
   hyphenation, punctuation, and variant spellings normalized. No
   formatting or layout information preserved.</p>
</encodingDesc>
Content model
<content>
 <alternate minOccurs="1"
  maxOccurs="unbounded">

  <classRef key="model.encodingDescPart"/>
  <classRef key="model.pLike"/>
 </alternate>
</content>
Schema Declaration
element encodingDesc
{
   att.global.attributes,
   ( model.encodingDescPart | model.pLike )+
}
Processing Model
<model behaviour="omit"/>

<epigraph>

<epigraph> (epigraph) contains a quotation, anonymous or attributed, appearing at the start or end of a section or on a title page. [4.2.3. Arguments, Epigraphs, and Postscripts 4.2. Elements Common to All Divisions 4.6. Title Pages]
Module textstructure
Attributes
Member of
Contained by
core: lg list
drama: castList
figures: figure table
May contain
drama: castList
figures: figure table
header: biblFull
linking: ab anchor
namesdates: listPerson listPlace
textstructure: floatingText
transcr: fw
Example
<epigraph>
 <bibl>Deut. Chap. 5.</bibl>
 <q>11 Thou ſhalt not take the name of the Lord thy God in vaine, for the Lord
   will not hold him guiltleſſe which ſhall take his name in vaine.</q>
</epigraph>
Content model
<content>
 <alternate minOccurs="0"
  maxOccurs="unbounded">

  <classRef key="model.common"/>
  <classRef key="model.global"/>
 </alternate>
</content>
Schema Declaration
element epigraph
{
   att.global.attributes,
   att.cmc.attributes,
   ( model.common | model.global )*
}
Processing Model
<model behaviour="block"/>

<expan>

<expan> (expansion) contains the expansion of an abbreviation. [3.6.5. Abbreviations and Their Expansions]
Module core
Attributes
Member of
Contained by
May contain
analysis: c pc s w
figures: figure formula
gaiji: g
header: idno
linking: anchor seg
tagdocs: code
textstructure: floatingText
transcr: fw subst supplied
verse: rhyme
character data
Note

The content of this element should be the expanded abbreviation, usually (but not always) a complete word or phrase. The <ex> element provided by the transcr module may be used to mark up sequences of letters supplied within such an expansion.

If abbreviations are expanded silently, this practice should be documented in the editorialDecl, either with a <normalization> element or a p.

Example
The address is Southmoor
<choice>
 <expan>Road</expan>
 <abbr>Rd</abbr>
</choice>
Example
<choice xml:lang="la">
 <abbr>Imp</abbr>
 <expan>Imp<ex>erator</ex>
 </expan>
</choice>
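Where abbreviations are expanded silently, the note above asks for the practice to be documented in the editorialDecl. A minimal sketch of such a declaration follows, using a p because that is what the simplePrint editorialDecl is listed as containing; the wording of the statement is an assumption, not a prescribed formula.

```xml
<encodingDesc>
 <editorialDecl>
  <!-- Hypothetical wording: records that no abbr/expan pairs are encoded -->
  <p>Abbreviations have been expanded silently. The abbreviated forms found
     in the source are not recorded in the transcription.</p>
 </editorialDecl>
</encodingDesc>
```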
Content model
<content>
 <macroRef key="macro.phraseSeq"/>
</content>
Schema Declaration
element expan
{
   att.global.attributes,
   att.editLike.attributes,
   att.cmc.attributes,
   macro.phraseSeq
}
Processing Model
<model behaviour="inline"/>

<extent>

<extent> (extent) describes the approximate size of a text stored on some carrier medium or of some other object, digital or non-digital, specified in any convenient units. [2.2.3. Type and Extent of File 2.2. The File Description 3.12.2.4. Imprint, Size of a Document, and Reprint Information 11.7.1. Object Description]
Module header
Attributes
Member of
Contained by
core: bibl
May contain
analysis: c pc s w
figures: figure formula
gaiji: g
header: idno
linking: anchor seg
tagdocs: code
textstructure: floatingText
transcr: fw subst supplied
verse: rhyme
character data
Example
<extent>3200 sentences</extent>
<extent>between 10 and 20 Mb</extent>
<extent>ten 3.5 inch high density diskettes</extent>
Example

The measure element may be used to supply normalized or machine tractable versions of the size or sizes concerned.

<extent>
 <measure unit="MiB" quantity="4.2">About four megabytes</measure>
 <measure unit="pages" quantity="245">245 pages of source
   material</measure>
</extent>
Content model
<content>
 <macroRef key="macro.phraseSeq"/>
</content>
Schema Declaration
element extent { att.global.attributes, macro.phraseSeq }

<facsimile>

<facsimile> contains a representation of some written source in the form of a set of images rather than as transcribed or encoded text. [12.1. Digital Facsimiles]
Module transcr
Attributes
Member of
Contained by
core: teiCorpus
textstructure: TEI
May contain
core: graphic
figures: formula
textstructure: back front
transcr: surface
Example
<facsimile>
 <graphic url="page1.png"/>
 <surface>
  <graphic url="page2-highRes.png"/>
  <graphic url="page2-lowRes.png"/>
 </surface>
 <graphic url="page3.png"/>
 <graphic url="page4.png"/>
</facsimile>
Example
<facsimile>
 <surface ulx="0" uly="0" lrx="200" lry="300">
  <graphic url="Bovelles-49r.png"/>
 </surface>
</facsimile>
Schematron

<sch:rule context="tei:facsimile//tei:line | tei:facsimile//tei:zone">
<sch:report test="child::text()[ normalize-space(.) ne '']"> A facsimile element represents a text with images, thus
transcribed text should not be present within it.
</sch:report>
</sch:rule>
Content model
<content>
 <sequence>
  <elementRef key="front" minOccurs="0"/>
  <alternate minOccurs="1"
   maxOccurs="unbounded">

   <classRef key="model.graphicLike"/>
   <elementRef key="surface"/>
   <elementRef key="surfaceGrp"/>
  </alternate>
  <elementRef key="back" minOccurs="0"/>
 </sequence>
</content>
Schema Declaration
element facsimile
{
   att.global.attributes,
   ( front?, ( model.graphicLike | surface | surfaceGrp )+, back? )
}

<figDesc>

<figDesc> (description of figure) contains a brief prose description of the appearance or content of a graphic figure, for use when documenting an image without displaying it. [15.4. Specific Elements for Graphic Images]
Module figures
Attributes
Contained by
figures: figure
May contain
Note

This element is intended for use as an alternative to the content of its parent figure element; for example, to display when the image is required but the equipment in use cannot display graphic images. It may also be used for indexing or documentary purposes.

Example
<figure>
 <graphic url="emblem1.png"/>
 <head>Emblemi d'Amore</head>
 <figDesc>A pair of naked winged cupids, each holding a
   flaming torch, in a rural setting.</figDesc>
</figure>
Content model
<content>
 <macroRef key="macro.limitedContent"/>
</content>
Schema Declaration
element figDesc { att.global.attributes, macro.limitedContent }
Processing Model
<model behaviour="inline">
<outputRendition scope="before">content: '[..';</outputRendition>
<outputRendition scope="after">content: '..]';</outputRendition>
<outputRendition>color: grey;font-style:italic;</outputRendition>
</model>

<figure>

<figure> (figure) groups elements representing or containing graphic information such as an illustration, formula, or figure. [15.4. Specific Elements for Graphic Images]
Module figures
Attributes
Member of
Contained by
May contain
Example
<figure>
 <head>The View from the Bridge</head>
 <figDesc>A Whistleresque view showing four or five sailing boats in the foreground, and a
   series of buoys strung out between them.</figDesc>
 <graphic url="http://www.example.org/fig1.png"
  scale="0.5"/>

</figure>
Content model
<content>
 <alternate minOccurs="0"
  maxOccurs="unbounded">

  <classRef key="model.headLike"/>
  <classRef key="model.common"/>
  <elementRef key="figDesc"/>
  <classRef key="model.graphicLike"/>
  <classRef key="model.global"/>
  <classRef key="model.divBottom"/>
 </alternate>
</content>
Schema Declaration
element figure
{
   att.global.attributes,
   att.placement.attributes,
   att.typed.attributes,
   att.written.attributes,
   att.cmc.attributes,
   (
      model.headLike
    | model.common
    | figDesc
    | model.graphicLike
    | model.global
    | model.divBottom
   )*
}
Processing Model
<model predicate="head or @rendition='simple:display'"
 behaviour="block"/>

<model behaviour="inline">
<outputRendition> display: block; border-top: solid 1pt blue; border-bottom: solid
1pt blue; </outputRendition>
</model>
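The predicate on the first model above selects figures that either have a head child or carry the rendition value simple:display; any other figure falls through to the second, inline model. The three fragments below sketch each case, assuming models are tried in order. The file names and the sentence of running text are invented for illustration.

```xml
<!-- Case 1: has a head child, so the first model applies (block) -->
<figure>
 <head>The View from the Bridge</head>
 <graphic url="http://www.example.org/fig1.png"/>
</figure>

<!-- Case 2: no head, but @rendition is exactly 'simple:display',
     so the first model still applies (block) -->
<figure rendition="simple:display">
 <graphic url="fig2.png"/>
</figure>

<!-- Case 3: neither condition holds, so the second model applies (inline,
     with the display and border rules of its outputRendition) -->
<p>A printer's ornament <figure>
  <graphic url="ornament.png"/>
 </figure> closes the chapter.</p>
```

Note that the predicate compares the whole attribute value, so a rendition attribute listing several values would not match the second test as written.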

<fileDesc>

<fileDesc> (file description) contains a full bibliographic description of an electronic file. [2.2. The File Description 2.1.1. The TEI Header and Its Components]
Module header
Attributes
Contained by
May contain