This document is the formal specification for TEI simplePrint, an entry-level customization of the Text Encoding Initiative (TEI) Guidelines, intended to be generally useful to a large variety of encoders attempting to cope with the standardized representation of a variety of documents in digital form.
Like every other TEI customization, TEI simplePrint was designed for use with a particular type of material. If the material you are planning to encode matches the following criteria, then TEI simplePrint is for you. If it does not, it may not be.
If your needs go beyond those summarized here, simplePrint may still be a good point of departure, and may be very useful as a basis for the creation of your own TEI customization. We do not, however, discuss the creation of a TEI customization in this document: the TEI website provides a number of links to tutorial material and tools which may assist in this process.
The present document is intended to be generally comprehensible and accessible, but does assume some knowledge of XML (the encoding language used by the TEI), and of the way it is used by the TEI. Further information on both these topics is available from many places, not least the TEI's own web site at http://www.tei-c.org.
The TEI simplePrint schema was first elaborated as a part of the TEI Simple project funded by the Andrew W. Mellon Foundation (2012-2014). The project sought to define a new ‘highly-constrained and prescriptive subset’ of the Text Encoding Initiative (TEI) Guidelines suited to the representation of early modern print materials, a formally-defined set of processing rules which permit modern web applications to easily present and analyze the encoded texts, mapping to other ontologies, and processes to describe the encoding status and richness of a TEI digital text. Its choice of elements reflected the practices followed in the encoding of large-scale literary archives, notably those produced by the Text Creation Partnership. Practice of other comparable archives such as the German Text Archive was also taken into account.
The most distinctive feature of TEI simplePrint is its use of the TEI Processing Model, which provides explicit and recommended options for the display or processing of every textual element. Programmers developing systems to handle texts encoded with TEI simplePrint do not have to look beyond this when building stylesheets or other components. This greatly reduces the complexity of developing applications that will work reliably and consistently for many users and across large corpora of documents.
The TEI simplePrint schema and the TEI Processing Model were first defined by a working group led by Martin Mueller (Northwestern University) and Sebastian Rahtz (Oxford University). Major contributions to the project were made by Magdalena Turska (Oxford University), James Cummings (Oxford University), and Brian Pytlik Zillig. The changes to the TEI scheme needed to support the TEI Processing Model were reviewed and approved by the TEI Technical Council for inclusion in release 3.0.0 of TEI P5 in February 2016. The present document was extensively revised and extended by Lou Burnard in July 2016 for submission to the TEI Technical Council.
We begin with a short example. How should we go about transferring into a computer a passage of prose, such as the start of the last chapter of Charlotte Brontë's novel Jane Eyre? We might start by simply copying what we see on the printed page, typing it in such a way that what appears on the screen looks as similar as possible, for example, by retaining the original line breaks, by introducing blanks to represent the layout of the original headings, page breaks, and paragraphs, and so forth. Of course, the possibilities are limited by the nature of the computer program we use to capture the text: it may not be possible for example to reflect accurately the typographic characteristics of our source with all such software. Some characters in the printed text (such as the accented letter a in faàl or the long dash) may not be available on the keyboard; some typographic distinctions (such as that between small capitals and full capitals) may not be readily accessible. Our first attempt tries to mimic the appearance of the former, and simply ignores the latter.
CHAPTER 38 READER, I married him. A quiet wedding we had: he and I, the par- son and clerk, were alone present. When we got back from church, I went into the kitchen of the manor-house, where Mary was cooking the dinner, and John cleaning the knives, and I said -- 'Mary, I have been married to Mr Rochester this morning.' The housekeeper and her husband were of that decent, phlegmatic order of people, to whom one may at any time safely communicate a remarkable piece of news without incurring the danger of having one's ears pierced by some shrill ejaculation and subsequently stunned by a torrent of wordy wonderment. Mary did look up, and she did stare at me; the ladle with which she was basting a pair of chickens roasting at the fire, did for some three minutes hang suspended in air, and for the same space of time John's knives also had rest from the polishing process; but Mary, bending again over the roast, said only -- 'Have you, miss? Well, for sure!' A short time after she pursued, 'I seed you go out with the master, but I didn't know you were gone to church to be wed'; and she basted away. John, when I turned to him, was grinning from ear to ear. 'I telled Mary how it would be,' he said: 'I knew what Mr Ed- ward' (John was an old servant, and had known his master when he was the cadet of the house, therefore he often gave him his Christian name) -- 'I knew what Mr Edward would do; and I was certain he would not wait long either: and he's done right, for aught I know. I wish you joy, miss!' and he politely pulled his forelock. 'Thank you, John. Mr Rochester told me to give you and Mary this.' I put into his hand a five-pound note. Without waiting to hear more, I left the kitchen. In passing the door of that sanctum some time after, I caught the words -- 'She'll happen do better for him nor ony o' t' grand ladies.' And again, 'If she ben't one o' th' handsomest, she's noan faa\l, and varry good-natured; and i' his een she's fair beautiful, onybody may see that.' 
I wrote to Moor House and to Cambridge immediately, to say what I had done: fully explaining also why I had thus acted. Diana and 474 JANE EYRE 475 Mary approved the step unreservedly. Diana announced that she would just give me time to get over the honeymoon, and then she would come and see me. 'She had better not wait till then, Jane,' said Mr Rochester, when I read her letter to him; 'if she does, she will be too late, for our honey- moon will shine our life long: its beams will only fade over your grave or mine.' How St John received the news I don't know: he never answered the letter in which I communicated it: yet six months after he wrote to me, without, however, mentioning Mr Rochester's name or allud- ing to my marriage. His letter was then calm, and though very serious, kind. He has maintained a regular, though not very frequent correspond- ence ever since: he hopes I am happy, and trusts I am not of those who live without God in the world, and only mind earthly things.
This transcription suffers from a number of shortcomings:
This encoding is expressed in TEI XML, a very widely used and standardized method of representing information about a document within the document itself. The transcribed words are complemented by special flags within angle brackets, called tags, which both characterise and mark the beginning and end of a string of characters. For example, each paragraph is marked by a tag <p> at its start, and a corresponding </p> at its end. We don't elaborate further on the syntax of TEI XML here. 1
Aside from its syntax, it is important to note that this particular encoding represents a set of choices or priorities. We have chosen to prioritize and simplify the representation of the words of the text over the representation of the typographic layout associated with them in this source document. This makes it easier for a computer to answer questions about the words in the document than about its typesetting, reflecting our research priorities. This priority also leads us to suppress end-of-line hyphenation. Conceivably Brontë (or her printer) intended the word ‘honeymoon’ to appear as ‘honey-moon’ on its second appearance, though this seems unlikely: our decision to focus on Brontë's text, rather than on the printing of it in this particular edition, makes it impossible to be certain. Similarly, our decision makes it impossible to use this transcription as a means of statistically analysing hyphenation practice. An encoding makes explicit all and only those textual features of importance to the encoder.
It is not difficult to think of ways in which the encoding of even this short passage might readily be extended to address other research priorities. For example:
In the remainder of this document, we present a number of TEI-recommended ways of supporting these and other encoding requirements. These ways generally involve the application of specific TEI XML elements, selected from the full range of possibilities documented in the TEI Guidelines. Like every other TEI project, TEI simplePrint proposes a view of the TEI Guidelines. This document defines and documents that view.
A TEI-conformant text contains (a) a TEI header (marked up as a teiHeader element) and (b) one or more representations of a text. These representations may be of three kinds: a transcribed text, marked up as a text element; a collection of digital images representing the text, marked up using a facsimile element; or a literal transcription of one or more documents instantiating the text, marked up using the <sourceDoc> element.
These elements are combined together to form a single TEI element, which must be declared within the TEI namespace, and therefore usually takes the form <TEI xmlns="http://www.tei-c.org/ns/1.0"> 2.
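As a minimal sketch of how these pieces fit together, the following skeleton shows a TEI element in the TEI namespace containing a header and a transcribed text. The title and the wording of the header paragraphs are placeholders invented for illustration; only the element structure is the point here.

```xml
<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader>
    <fileDesc>
      <titleStmt>
        <!-- Placeholder title, not taken from any real file -->
        <title>Jane Eyre: a demonstration transcription</title>
      </titleStmt>
      <publicationStmt>
        <p>Unpublished demonstration file.</p>
      </publicationStmt>
      <sourceDesc>
        <p>Transcribed from a printed edition.</p>
      </sourceDesc>
    </fileDesc>
  </teiHeader>
  <text>
    <body>
      <p>READER, I married him.</p>
    </body>
  </text>
</TEI>
```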
Some aspects of the TEI header are described in more detail in section 15 The Electronic Title Page. In what follows, we will focus chiefly on the use of the text element, though we describe one way of using the facsimile element in combination with it or alone in section 14 Encoding a Digital Facsimile. We do not consider the <sourceDoc> element further, since it is mainly used in very specialised applications for which TEI simplePrint would not be appropriate.
A text may be unitary (a single work) or composite (a collection of single works, such as an anthology). In either case, the text may have optional front or back matter such as title pages, prefaces, appendixes etc. We use the term body for whatever comes between these in the source document. We discuss various kinds of composite text in section 12 Composite and Floating Texts below.
In each of the following sections we include a short list of the TEI elements under discussion, along with a brief description, and in most cases an example of how they are used. Throughout the text, element names are linked to their detailed reference documentation, as given in the TEI Guidelines. Note that most of the examples provided by the reference documentation, and all of the links, are not specific to TEI simplePrint.
For example, here are the elements discussed so far:
As indicated above, a unitary text is encoded by means of a text element, which may contain the following elements:
Elements specific to front and back matter are described below in section 13 Front and Back Matter. In this section we discuss the elements making up the body of a text. A text must always have a body.
The body of a prose text may be just a series of paragraphs or similar blocks of text, or these may be grouped together into chapters, sections, subsections, etc. The div element is used to represent any such grouping of blocks.
type [att.typed] | characterizes the element in some sense, using any convenient classification scheme or typology. |
The type attribute on the div element may be used to supply a conventional name for the category of a text division, so that divisions of different kinds can be distinguished. Typical values might be book, chapter, section, part, poem, song, etc. TEI simplePrint does not constrain the range of values that may be used here.
A div element may itself contain further, nested, divs, thus mimicking the traditional structure of a book, which can be decomposed hierarchically into units such as parts, containing chapters, containing sections, and so on. TEI texts in general conform to this simple hierarchic model.
Here as elsewhere the xml:id attribute may be used to supply a unique identifier for the division, which may be used for cross references or other links to it, such as a commentary, as further discussed in section 3.7 Cross References and Links. It is good practice to provide an xml:id attribute for every major structural unit in a text, and to derive its values in some systematic way, for example by appending a section number to a short code for the title of the work in question, as in the examples below.
The n attribute may be used to supply (additionally or alternatively) a short mnemonic name or number for a division, or any other element. If a conventional form of reference or abbreviation for the parts of a work already exists (such as the book/chapter/verse pattern of Biblical citations), the n attribute is the place to record it; unlike the identifier supplied by the xml:id attribute, it does not need to be unique.
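The nesting and identification conventions described above might be sketched as follows. The division into a "part" containing a chapter, the short code JE, and the identifier values are all assumptions made for illustration; any systematic scheme would serve equally well.

```xml
<body>
  <!-- Hypothetical outer division; the source may be organised differently -->
  <div type="part" xml:id="JE-P3" n="3">
    <!-- xml:id must be unique in the document; n need not be -->
    <div type="chapter" xml:id="JE38" n="38">
      <head>CHAPTER 38</head>
      <p>READER, I married him.</p>
    </div>
  </div>
</body>
```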
The xml:lang attribute may be used to specify the language of the division. Languages are identified by an internationally defined code, as further discussed in section 3.5.3 Foreign Words or Expressions below.
The rendition attribute may be used to supply information about the rendition (appearance) of a division, or any other element, as further discussed in section 3.5 Marking Highlighted Phrases below. Note that this attribute is used to describe the appearance of the source text, rather than the appearance of any intended output when the encoded text is displayed. The two may of course be similar, or identical, but the TEI does not assume or require this.
These four attributes, xml:id, n, xml:lang, and rendition are so widely useful that they are allowed on any element in any TEI schema: they are called global attributes. Other attributes defined in the TEI simplePrint schema are discussed in section 3.7.3 Special Kinds of Linking.
Every div may have a title or heading at its start, and (less commonly) a trailer such as ‘End of Chapter 1’ at its end. The following elements may be used to transcribe them:
Some other elements which may be found at the beginning or ending of text divisions are discussed below in section 13.1.2 Prefatory Matter.
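A heading and a trailer might be transcribed as in the following sketch, which assumes that the head and trailer elements are the ones intended for this purpose and reuses the trailer wording quoted above. The paragraph content is a placeholder.

```xml
<div type="chapter" n="1">
  <head>Chapter 1</head>
  <p>Placeholder text of the chapter.</p>
  <trailer>End of Chapter 1</trailer>
</div>
```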
In prose texts such as the Brontë example above, the divisions are generally composed of paragraphs, represented as p elements, though in some circumstances it may be preferred to use the ‘anonymous block’ element ab. In poetic or dramatic texts different elements are used, representing stanzas and verse lines in the first case, and individual speeches or stage directions in the second:
We discuss each of these kinds of component separately below.
Note that the l element marks verse lines, not typographic lines: as elsewhere, the original lineation of the source text is therefore not preserved by this encoding. The lb element described in section 3.4 Page and Line Numbers may additionally be used to mark typographic lines if desired.
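By way of illustration, a stanza might be encoded as below, assuming that the lg element is used to group verse lines and that the type value "stanza" is acceptable to the schema. The verse is the opening of Blake's The Tyger, chosen only as a familiar example.

```xml
<lg type="stanza">
  <l>Tyger Tyger, burning bright,</l>
  <l>In the forests of the night;</l>
  <l>What immortal hand or eye,</l>
  <l>Could frame thy fearful symmetry?</l>
</lg>
```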
label | provides a label (usually a single letter) to identify which part of a rhyme scheme this rhyming string instantiates. |
A dramatic text contains speeches, which may be in prose or verse, and will also contain stage directions. The sp element is used to represent each identified speech. It contains an optional speaker indication, marked with the speaker element, which can be followed by one or more l or p elements, depending on whether the speech is considered to be in prose or in verse. Stage directions, whether within or between speeches, are marked using the stage element.
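The elements just described could be combined as in this sketch, based on the opening exchange of Hamlet. The wording of the stage direction varies between editions and is given here only as an example.

```xml
<stage>Enter Bernardo and Francisco, two sentinels.</stage>
<sp>
  <speaker>Bernardo</speaker>
  <l>Who's there?</l>
</sp>
<sp>
  <speaker>Francisco</speaker>
  <l>Nay, answer me: stand, and unfold yourself.</l>
</sp>
```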
The part attribute may simply indicate that the element is fragmentary (part="Y"); alternatively it may indicate whether this is an initial (I), medial (M) or final (F) fragment.
As mentioned above, the ab element may also be used in preference to the p element. It should be used for blocks of text which are not clearly paragraphs, verse lines, or dramatic speeches. Typical examples include the canonical verses of the Bible, and the textual blocks of other ancient documents which predate the invention of the paragraph, such as Greek inscriptions or Egyptian hieroglyphs. The element is also useful as a means of encoding more specialized kinds of textual block, such as the question and answer structure of a catechism, or the highly formalized substructure of a legal document (if div is not considered appropriate for these). In more modern documents, it can be used to encode semi-organized or fragmentary materials such as an artist's notebook or work in progress; or to faithfully capture the substructure of a file produced by an OCR system.
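For instance, canonical Bible verses might be captured with ab along the following lines. The use of n for verse and chapter numbers and the type value are assumptions for illustration rather than requirements of the schema.

```xml
<div type="chapter" n="1">
  <ab n="1">In the beginning God created the heaven and the earth.</ab>
  <ab n="2">And the earth was without form, and void; and darkness was
    upon the face of the deep. And the Spirit of God moved upon the face
    of the waters.</ab>
</div>
```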
Page and line breaks etc. may be marked with the following elements:
The pb, lb, and cb elements are special cases of a general class of elements known as milestones because they mark reference points within a text. The generic milestone element can mark any kind of reference point: for example, a column break, the start of a new kind of section not otherwise tagged, a change of author or style, or in general any significant change in the text not enclosed by an XML element. Unlike other elements, milestone elements do not enclose a piece of text and make an assertion about it; instead they indicate a point in the text where something changes, as indicated by a change in the values of the milestone's attributes unit, which indicates the ‘something’ concerned, and n which indicates the new value.
The pb, lb, and cb elements are shortcuts or syntactic sugar for <milestone unit="page"/>, <milestone unit="line"/>, and <milestone unit="column"/> respectively.
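The equivalence can be sketched as follows. The page number is taken from the sample passage; the final milestone, marking a change of section, is a hypothetical use of the generic element.

```xml
<!-- A page break recorded with the specialised element -->
<pb n="475"/>
<!-- The same information expressed with the generic element -->
<milestone unit="page" n="475"/>
<!-- A reference point for which no specialised element exists -->
<milestone unit="section" n="2"/>
```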
When working from a paginated original, it is often useful to record its pagination, whether to simplify later proof-reading, or to align the transcribed text with a set of page images, as further discussed below.
Similar considerations apply to line breaks (lb), though these are less frequently considered useful when encoding modern printed textual sources. When transcribing manuscripts or early printed books, however, it is often helpful to retain them in an encoding, if only to facilitate alignment of transcription and original. Like pb, the lb element should appear before the text of the line whose start it signals.
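Applied to the sample passage, page and line breaks might be recorded as in this sketch. Keeping the end-of-line hyphen in the transcription, as in the first paragraph, is only one possible policy and is shown here as an assumption, not a recommendation.

```xml
<!-- Line breaks: each lb precedes the line whose start it marks -->
<p><lb/>READER, I married him. A quiet wedding we had: he and I, the par-
<lb/>son and clerk, were alone present.</p>

<!-- Page break: pb marks the start of page 475 inside a paragraph -->
<p>I wrote to Moor House and to Cambridge immediately, to say what I had
done: fully explaining also why I had thus acted. Diana and
<pb n="475"/>Mary approved the step unreservedly.</p>
```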
A more powerful approach, discussed in section 14 Encoding a Digital Facsimile below, is to use the facsimile element to define the organisation of the set of images representing the text, and then use the facs attribute to point to individual components of that representation.
Highlighted words or phrases are those made visibly different from the rest of the text, typically by a change of type font, handwriting style, ink colour etc., which is intended to draw the reader's attention to some associated change.
The global rendition attribute can be attached to any element, and used wherever necessary to specify details of the highlighting used for it in the source. For example, a heading rendered in bold might be tagged <head rendition="simple:bold">, and one in italic <head rendition="simple:italic">.
The values used for the rendition attribute point to definitions provided for the formatting concerned. These definitions are typically provided by a rendition element in the document's header, as further discussed in section 15.2.3 Tagging Declaration.
It is not always possible or desirable to interpret the reasons for such changes of rendering in a text. In such cases, the element hi may be used to mark a sequence of highlighted text without making any claim as to its status.
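A minimal sketch of hi in use follows. It assumes, purely for illustration, that the word "did" is printed in italic in the source; the rendition value is the one quoted earlier in this section.

```xml
<p>Mary <hi rendition="simple:italic">did</hi> look up, and she
<hi rendition="simple:italic">did</hi> stare at me.</p>
```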
Alternatively, where the cause for the highlighting can be identified with confidence, a number of other, more specific, elements are available.
Some features (notably quotations, titles, and foreign words) may be found in a text either marked by highlighting, or with quotation marks. In either case, the element q (as discussed in the following section) should be used. Again, the global rendition attribute can be used to record details of the highlighting used in the source if this is thought useful.
Like changes of typeface, quotation marks are conventionally used to denote several different features within a text, of which the most frequent is quotation, though many other features are possible. The full TEI Guidelines provide additional elements such as <mentioned> or <said> to distinguish some of these features, but these more specialised elements are not included in TEI simplePrint. Instead, we use the quote element for quotation only, and the q element for all other material found within quotation marks in the text.
As elsewhere, the way that a citation or quotation was printed (for example, in-line or set off as a display or block quotation), may be represented using the rendition attribute. This may also be used to indicate the kind of quotation marks used.
The creator of the electronic text must decide whether quotation marks are replaced by the tags or whether the tags are added and the quotation marks kept. If the quotation marks are removed from the text, the rendition attribute may be used to record the way in which they were rendered in the copy text.
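The two policies can be compared in the following sketch, which tags a piece of direct speech from the sample passage with q. Which option is preferable depends on the project; neither is presented here as the recommended one.

```xml
<!-- Option 1: quotation marks removed; the tags alone mark the speech -->
<p>I said <q>Mary, I have been married to Mr Rochester this morning.</q></p>

<!-- Option 2: quotation marks kept as part of the transcribed text -->
<p>I said <q>'Mary, I have been married to Mr Rochester this morning.'</q></p>
```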
As these examples show, the foreign element should not be used to tag foreign words if some other more specific element such as title, or div applies.
The value of the xml:lang attribute on an element applies hierarchically to everything contained by that element, unless overridden:
Here we specify that the whole div element uses the language with the coded identifier la i.e., Latin. Since it is contained by that div there is no need to supply this information again for the first s element. The second s element however overrides this value, and indicates that its content is in English (the language with identifier en). The third s element is again in Latin.
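The arrangement just described might look like the following sketch. The Latin and English sentences are stand-ins chosen for illustration; only the placement of the xml:lang attributes matters.

```xml
<div xml:lang="la">
  <p>
    <!-- Inherits "la" from the enclosing div -->
    <s>Veni, vidi, vici.</s>
    <!-- Overrides the inherited value -->
    <s xml:lang="en">I came, I saw, I conquered.</s>
    <!-- Inherits "la" again -->
    <s>Alea iacta est.</s>
  </p>
</div>
```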
The codes used to identify languages, supplied on the xml:lang attribute, are defined by an international standard3, as further explained in the relevant section of the TEI Guidelines. Some simple example codes for a few languages are given here:
zh | Chinese | grc | Ancient Greek |
en | English | el | Greek |
enm | Middle English | ja | Japanese |
fr | French | la | Latin |
de | German | sa | Sanskrit |
A note is any additional comment found in a text, marked in some way as being out of the main textual stream. A note is always attached to some part of the text, implicitly or explicitly: we call this its target, or its point of attachment. The element note should be used to mark any kind of note whether it appears as a separate block of text in the main text area, at the foot of the page, at the end of the chapter or volume, in the margin, or in some other place.
Notes may be in a different hand or typeface, may be authorial or editorial, and may have been added later. The attributes type and resp can be used to distinguish between different kinds of notes or identify their authors.
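As a sketch, an editorial note attached to a sentence of the sample passage might be encoded as below. The type value, the identifier ED, and the wording of the note are all hypothetical.

```xml
<p>She'll happen do better for him nor ony o' t' grand ladies.<note
  type="gloss" resp="#ED">A hypothetical editorial gloss on the dialect
  forms would go here.</note></p>
```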
It may however be problematic to determine the precise position of the point of attachment, particularly in the case of marginal notes. A marginal note may also be hard to distinguish from a label or subheading which introduces the text with which it is associated. Where the purpose of the note is clearly to label the associated text, rather than to comment on it, the element label may be preferable. Where it is clearly a subheading attached to a distinct subdivision, it may be preferable to start a new element div and encode the subheading as a head. Note however that a head cannot be inserted anywhere except at the beginning of a div. And where (as in some Early Modern English plays) marginal annotation is systematically used to identify speakers, it may be better to represent these using the speaker element introduced above. In cases of doubt, the encoder should decide on a clear policy and preferably document it for the use of others.
Any kind of cross reference or link found at one point in a text which points to another part of the same or another document may be encoded using the ref element discussed in this section. Implicit links (such as the association between two parallel texts, or that between a text and its interpretation) may be encoded using the linking attributes discussed in section 3.7.3 Special Kinds of Linking.
Usually, the presence of a cross-reference or link will be indicated by some text or symbol in the source being encoded, which will then become the content of the ref element. Occasionally, however, and frequently in the case of a born digital document, the exact form and appearance of the cross reference text will be determined dynamically by the software processing the document. In such cases, the ref element will have no content, and serve simply to mark a point from which a link is to be made, along with the target of the link.
Sometimes the target of a cross reference does not correspond with any particular feature of a text, and so may not be tagged as an element of some kind. If the desired target is simply a point in the current document, the easiest way to mark it is by introducing an anchor element at the appropriate spot. If the target is some sequence of words not otherwise tagged, the seg element may be used to mark them. These two elements are described as follows:
The type attribute should be used (as above) to distinguish amongst different purposes for which these general purpose elements might be used in a text. Some other uses are discussed in section 3.7.3 Special Kinds of Linking below.
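The following sketch brings together ref, anchor, and seg. All identifiers and type values are invented for illustration, and the sentences containing the links are not part of the source text.

```xml
<!-- A link whose visible text comes from the source -->
<p>See <ref target="#JE38">the last chapter</ref> for the wedding.</p>

<!-- An empty ref: the link text is left to the processing software -->
<p>The wedding is described in <ref target="#JE38"/>.</p>

<!-- A point and a phrase made available as link targets -->
<p>I put into his hand <anchor xml:id="JE38-a1" type="bookmark"/>a
  <seg xml:id="JE38-s1" type="phrase">five-pound note</seg>.</p>
<p>Compare <ref target="#JE38-s1">the gift to John</ref>.</p>
```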
So far, we have shown how the ref element may be used for cross-references or links whose targets occur within the same document as their source. The element may also be used to refer to elements in any other XML document or resource, such as a document on the web, or a database component. This is possible because the value of the target attribute may be any valid Uniform Resource Identifier (URI)4.
A URI may reference a web page or just a part of one, for example http://www.tei-c.org/index.xml#SEC2. The hash sign indicates that what follows it is the identifier of an element to be located within the XML document identified by what precedes it: this example will therefore locate an element which has an xml:id attribute value of SEC2 within the document retrieved from http://www.tei-c.org/index.xml. In the examples we have discussed so far, the part to the left of the hash sign has been omitted: this is understood to mean that the referenced element is to be located within the current document.
It is also possible to define an abbreviated form of the URI, using a predefined prefix separated from the rest of the code by a colon, as for example cesr:SEC2. This is known as a private URI, since the prefix is not standardized (except that the prefix xml: is reserved for use by XML itself). A prefixDef element should be supplied within the TEI header specifying how the prefix (here cesr) should be translated to give a full URL for the link. This is particularly useful if a document contains many references to an external document such as an authority file.
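The three forms of pointer discussed above, and a prefix definition for the private form, could be sketched as follows. The replacement URL is a made-up example address, and the regular expression is only one plausible pattern.

```xml
<!-- In the text: three ways of pointing at an element with xml:id SEC2 -->
<ref target="#SEC2">within the current document</ref>
<ref target="http://www.tei-c.org/index.xml#SEC2">by full URI</ref>
<ref target="cesr:SEC2">by private URI</ref>

<!-- In the TEI header, within encodingDesc -->
<listPrefixDef>
  <prefixDef ident="cesr"
             matchPattern="([A-Za-z0-9]+)"
             replacementPattern="http://www.example.org/cesr.xml#$1">
    <p>Private URIs with the prefix cesr point to elements in a
      hypothetical external file.</p>
  </prefixDef>
</listPrefixDef>
```

With this definition, a processor would be expected to expand cesr:SEC2 to http://www.example.org/cesr.xml#SEC2.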
Parts of an XML document can be specified by means of other more sophisticated mechanisms using a language called XPointer, also defined by the W3C. This is useful when, for example, the elements to be linked to do not bear identifiers. Further information about this and other forms of link addressing is provided in chapter 16 of the TEI Guidelines but is beyond the scope of the present document.
The following special purpose linking attributes are defined for every element in the TEI simplePrint schema:
The process of encoding an electronic text has much in common with the process of editing a manuscript or other text for printed publication. In either case a conscientious editor may wish to record both the original state of the source and any editorial correction or other change made in it. The elements discussed in this and the next section provide some facilities for meeting these needs.
The following elements may be used to mark corrections, that is editorial changes introduced where the editor believes the original to be erroneous:
The following elements may be used to mark normalization, that is editorial changes introduced for the sake of consistency or modernization of a text:
Consider, for example, the following famous passage as it appears in the first quarto printing of Shakespeare's Henry V:
in particular the phrase we might transcribe directly as
... for his Nose was as sharpe as a Pen, and a Table of greene fields
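An editor wishing to record both the original readings and their own interventions might encode the phrase as in this sketch, which assumes that the choice element is used to pair sic with corr and orig with reg. The emendation of "Table" to "babbled" follows a long-standing editorial tradition and is included only to illustrate the markup.

```xml
... for his Nose was as
<choice><orig>sharpe</orig><reg>sharp</reg></choice> as a Pen, and a
<choice><sic>Table</sic><corr>babbled</corr></choice> of
<choice><orig>greene</orig><reg>green</reg></choice> fields
```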
In addition to correcting or normalizing words and phrases, editors and transcribers may also supply missing material, omit material, or transcribe material deleted or crossed out in the source. In addition, some material may be particularly hard to transcribe because it is hard to make out on the page. The following elements may be used to record such phenomena:
These elements may also be used to record the actual writing process, for example to record passages which have been deleted, added, corrected etc., whether by the author of a literary text or by a scribe copying out a manuscript. An analysis of such documentary modifications may be essential before a reading text can be presented, and is clearly of importance in the editorial process.
The example is taken from the surviving authorial manuscript of a poem by the English writer Wilfred Owen, a part of which is shown here:
Owen first wrote ‘Helping the worst amongst us’, but then deleted it, adding ‘Dragging the worst amongt us’ over the top. In the same way, he revised the phrase ‘half–blind’ by deleting the ‘half–’ and adding ‘all’ above it. In the last line, he started a word beginning ‘fif’ before deleting it and writing the word ‘five–nines’. We can encode all of this as follows:
The add and del elements are used to enclose passages added or deleted respectively. Additional attributes are available such as resp to indicate responsibility for the modification, or place to indicate where in the text (for example, above or below the line) the modification has been made. Where the encoder wishes to assert that the addition and deletion make up a single editorial act of substitution, these elements can be combined within a subst element as shown above.
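The three revisions described for the Owen manuscript might be encoded along the following lines. Only the phrases mentioned in the description are shown, as separate fragments; the rest of each line is omitted, and the value of place is an assumption based on the description.

```xml
<!-- First revision: one phrase replaced by another -->
<subst><del>Helping the worst amongst us</del><add>Dragging the worst amongt us</add></subst>

<!-- Second revision: "half–" deleted, "all" added above the line -->
<subst><del>half–</del><add place="above">all</add></subst> blind

<!-- Third revision: a false start deleted before the intended word -->
<del>fif</del> five–nines
```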
A very careful examination of Owen’s second modification shows that he really did write ‘amongt’ rather than ‘amongst’, presumably in error. An equally careful editor wishing to restore the missing ‘s’ might use the supplied element to indicate that they have done so:
Here the resp attribute has been used to indicate that the ‘s’ was not supplied by Owen but by someone else, specifically the person documented elsewhere by an element with the identifier ED.
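A sketch of the editorial restoration described above, applied to the added phrase only:

```xml
<add>Dragging the worst among<supplied resp="#ED">s</supplied>t us</add>
```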
Like names, dates, and numbers, abbreviations may be transcribed as they stand or expanded; they may be left unmarked, or encoded using the following elements:
The type attribute may be used to distinguish types of abbreviation by their function.
The elements expan and abbr should contain a full word, or the abbreviated form of a full word respectively. For a fuller discussion of abbreviations and the intricacies of representing them consult the section on Abbreviation and Expansion in the TEI Guidelines.
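For example, an abbreviation from the sample passage might be left as it stands, or paired with its expansion, as in the sketch below. It assumes that choice may be used to group abbr and expan, and the type value is hypothetical.

```xml
<!-- Abbreviation marked but not expanded -->
<p><abbr type="title">Mr</abbr> Rochester told me to give you and Mary this.</p>

<!-- Abbreviation and expansion both recorded -->
<p><choice><abbr>Mr</abbr><expan>Mister</expan></choice> Rochester told me
to give you and Mary this.</p>
```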
The TEI scheme defines elements for a large number of ‘data-like’ features which may appear almost anywhere within almost any kind of text. These features may be of particular interest in a range of disciplines; they all relate to objects external to the text itself, such as the names of persons and places, strings of code, formulae, or numbers and dates. These items may also pose particular problems for natural language processing (NLP) applications. The elements described here, by making such features explicit, reduce the complexity of processing texts containing them.
A referring string is any phrase which refers to some person, place, object, etc. A name is a referring string which contains proper nouns and honorifics only. Two elements are provided to mark such strings:
Simply tagging something as a name is rarely enough to enable automatic processing of personal names into the canonical forms usually required for reference purposes. The name as it appears in the text may be inconsistently spelled, partial, or vague. Moreover, name prefixes such as van or de la, may or may not be included as part of the reference form of a name, depending on the language and country of origin of the bearer.
The values used for the ref attribute here (#BENM1 etc.) are pointers; in this case indicating an element with the identifier BENM1 etc. somewhere in the current document, though any form of URI could be used. The element indicated will typically (for a person) be a person element, listed within a particDesc element, or (for a place) a place element, listed within a settingDesc element in the TEI header, as further discussed in 15.3 The Profile Description below.
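The pointing mechanism described above might be sketched as follows. The bracketed text is a placeholder, and the use of a single p element inside person is an assumption about the simplest permitted content:

```xml
<!-- in the transcription: a name pointing to its canonical record -->
<name ref="#BENM1">[name as it appears in the text]</name>

<!-- in the TEI header, within profileDesc -->
<particDesc>
  <listPerson>
    <person xml:id="BENM1">
      <p>[canonical form of the name and any further details]</p>
    </person>
  </listPerson>
</particDesc>
```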
The following elements may be useful when marking up sequences of text that represent mathematical expressions, chemical formulae, and the like:
The following elements are useful for stretches of code or similar formal language appearing within a text:
Note in this example that characters which have a syntactic function in XML (such as the ampersand or the angle bracket) must be represented within a TEI simplePrint document by means of an entity reference such as &lt; or &amp;.
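For instance, a fragment of program code containing both kinds of character could be transcribed as in the following sketch (the code fragment itself is invented for illustration):

```xml
<!-- '<' , '&&' and '>' in the source are written as entity references -->
<p>The test is written as <code>if (a &lt; b &amp;&amp; b &gt; 0)</code> in the source.</p>
```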
The following elements are provided for the detailed encoding of times and dates:
period | supplies pointers to one or more definitions of named periods of time (typically category, date, or event elements) within which the datable item is understood to have occurred. |
when [att.datable.w3c] | supplies the value of the date or time in a standard form, e.g. yyyy-mm-dd. |
notBefore [att.datable.w3c] | specifies the earliest possible date for the event in standard form, e.g. yyyy-mm-dd. |
notAfter [att.datable.w3c] | specifies the latest possible date for the event in standard form, e.g. yyyy-mm-dd. |
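The attributes listed above might be used as in the following sketch; the dates and wording are invented for illustration:

```xml
<!-- an exact date, normalized with the when attribute -->
<date when="1807-06-09">June 9th</date>

<!-- an uncertain date, bounded by notBefore and notAfter -->
<date notBefore="1840" notAfter="1845">some time between 1840 and 1845</date>

<!-- a time of day in standard form -->
<time when="08:48:00">twelve minutes before nine</time>
```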
Like dates, both numbers and quantities can be written with either letters or digits and may therefore need to be normalized for ease of processing. Their presentation is also highly language-dependent (e.g. English 5th becomes Greek 5.; English 123,456.78 equals French 123.456,78).
The following elements are provided for the detailed encoding of numbers and quantities:
type | indicates the type of numeric value. |
value | supplies the value of the number in standard form. |
quantity [att.measurement] | (quantity) specifies the number of the specified units that comprise the measurement |
unit [att.measurement] | (unit) indicates the units used for the measurement, usually using the standard symbol for the desired units. Suggested values include: 1] m (metre); 2] kg (kilogram); 3] s (second); 4] Hz (hertz); 5] Pa (pascal); 6] Ω (ohm); 7] L (litre); 8] t (tonne); 9] ha (hectare); 10] Å (ångström); 11] mL (millilitre); 12] cm (centimetre); 13] dB (decibel); 14] kbit (kilobit); 15] Kibit (kibibit); 16] kB (kilobyte); 17] KiB (kibibyte); 18] MB (megabyte); 19] MiB (mebibyte) |
commodity [att.measurement] | (commodity) indicates the substance that is being measured |
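A brief sketch of how these attributes might be combined follows. The content is invented, and ordinal is assumed here to be an acceptable value for type:

```xml
<!-- a number written in letters, normalized with the value attribute -->
<num type="ordinal" value="21">twenty-first</num>

<!-- a measurement: 40 units of 'ha' (hectare) of the commodity 'land' -->
<measure quantity="40" unit="ha" commodity="land">forty hectares of land</measure>
```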
The element list is used to mark any kind of list. A list is a sequence of text items, which may be numbered, bulleted, or arranged as a glossary list. Each item may be preceded by an item label (in a glossary list, this label is the term being defined):
Where the internal structure of a list item is more complex, it may be preferable to regard the list as a table, for which special-purpose tagging is defined in section 8 Tables.
Lists of bibliographic items should be tagged using the listBibl element, described in the next section.
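A glossary list of the kind mentioned above might be sketched like this, using two definitions taken from this document. The type value gloss is an assumption; the values permitted in TEI simplePrint should be checked against the specification for list:

```xml
<list type="gloss">
  <!-- each label holds the term being defined; the following item holds its definition -->
  <label>referring string</label>
  <item>any phrase which refers to some person, place, object, etc.</item>
  <label>name</label>
  <item>a referring string which contains proper nouns and honorifics only</item>
</list>
```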
It is often useful to distinguish bibliographic citations where they occur within texts being transcribed for research, if only so that they will be properly formatted when the text is printed out. The element bibl is provided for this purpose. Where the components of a bibliographic reference are to be distinguished, the following elements may be used as appropriate. It is generally useful to distinguish at least those parts (such as the titles of articles, books, and journals) which will need special formatting. The other elements are provided for cases where particular interest attaches to such details:
The element biblFull is also provided for convenience in cases where bibliographic citations following a more sophisticated model have been used; it is permitted only in the TEI header.
The listBibl element is used to group lists of bibliographic citations. It may contain a series of bibl or biblFull elements.
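A short bibliography might be encoded along the following lines. The bracketed values are placeholders, not real references, and the choice of component elements is only one possibility:

```xml
<listBibl>
  <head>Works cited</head>
  <bibl>
    <author>[author]</author>,
    <title level="m">[title of book]</title>.
    <pubPlace>[place]</pubPlace>: <publisher>[publisher]</publisher>,
    <date>[year]</date>.
  </bibl>
  <bibl>
    <author>[author]</author>,
    <title level="a">[title of article]</title>, in
    <title level="j">[title of journal]</title>,
    <biblScope>[volume and pages]</biblScope>.
  </bibl>
</listBibl>
```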
The following elements are provided for the description of tabular matter, commonly found in many kinds of narrative text. Note that TEI simplePrint provides no sophisticated ways of describing the detailed layout of a table beyond its organization into rows and columns.
The role attribute may be used on either cell or row to indicate the function of a cell, or of a row of cells. Its values should be taken from the following list:
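The following sketch shows a small table whose first row holds column labels. It assumes that label and data are among the permitted values of role, and borrows its content from the behaviour table later in this document:

```xml
<table>
  <head>Selected behaviours</head>
  <!-- a row of column headings -->
  <row role="label">
    <cell>Behaviour</cell>
    <cell>Used by</cell>
  </row>
  <!-- rows of data values -->
  <row role="data">
    <cell>link</cell>
    <cell>ref</cell>
  </row>
  <row role="data">
    <cell>paragraph</cell>
    <cell>ab p</cell>
  </row>
</table>
```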
Not all the components of a document are necessarily textual. The most straightforward text will often contain diagrams or illustrations, to say nothing of documents in which image and text are inextricably intertwined, or electronic resources in which the two are complementary.
The encoder may simply record the presence of a graphic within the text, possibly with a brief description of its content, and may also provide a link to a digitized version of the graphic, using the following elements:
Any textual information accompanying the graphic, such as a heading and/or caption, may be included within the figure element itself, in a head and one or more p elements, as may any text appearing within the graphic itself. It is strongly recommended that a prose description of the image be supplied, as the content of a figDesc element, for the use of applications which are not able to render the graphic, and to render the document accessible to vision-impaired readers. (Such text is not normally considered part of the document proper.)
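Put together, the recommendations above might produce an encoding like the following sketch. The file path and all of the text are hypothetical:

```xml
<figure>
  <!-- link to a digitized version of the graphic (hypothetical path) -->
  <graphic url="images/frontispiece.png"/>
  <!-- heading or caption accompanying the graphic in the source -->
  <head>[caption as printed]</head>
  <!-- prose description for applications or readers unable to view the image -->
  <figDesc>[brief description of what the image shows]</figDesc>
</figure>
```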
Interpretation typically ranges across the whole of a text, with no particular respect to other structural units. A useful preliminary to intensive interpretation is therefore to segment the text into discrete and identifiable units, each of which can then bear a label for use as a sort of ‘canonical reference’. To facilitate such uses, these units may not cross each other, nor nest within each other. They may conveniently be represented using the following element:
Tokenization, that is, the identification of lexical or non-lexical tokens within a text, is a very common requirement for all kinds of textual analysis, and not an entirely trivial one. The decision as to whether, for example, ‘can't’ in English or ‘du’ in French should be treated as one word or two is not simple. Consequently it is often useful to make explicit the preferred tokenization in a marked up text. The following elements are available for this purpose:
In this example, each token in the input has been decorated with an automatically generated part of speech code, using the ana attribute discussed in section 3.7.3 Special Kinds of Linking above. The system has also distinguished between tokens to be treated as words (tagged w) and tokens considered to be punctuation (tagged pc). It may also sometimes be useful to distinguish tokens which consist of a single letter or character: the c element is provided for this purpose.
The w element is a specialisation of the seg element which has already been introduced for use in identifying otherwise unmarked targets of cross references and hypertext links (see section 3.7 Cross References and Links); it can be used to distinguish any portion of text to which the encoder wishes to assign a user-specified type or a unique identifier; it may thus be used to tag textual features for which there is no other provision in the published TEI Guidelines.
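A tokenized sentence might be sketched as follows. The sentence is invented, and the values of ana are hypothetical pointers to part-of-speech definitions assumed to be declared elsewhere:

```xml
<s>
  <!-- each word token carries a pointer to its analysis -->
  <w ana="#DT">The</w>
  <w ana="#NN">cat</w>
  <w ana="#VBD">sat</w>
  <!-- punctuation is tagged separately -->
  <pc>.</pc>
</s>
```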
Some attributes are available on many elements, though not on all. These attributes are defined using a TEI attribute class, a concept which is discussed further in the TEI Guidelines. We list here some attribute classes which have been adapted or customized for use in TEI simplePrint.
The elements add, figure, fw, label, note and stage all take the attribute place to indicate whereabouts on the page they appear. In TEI simplePrint the possible values for this attribute are limited as indicated below:
place | specifies where this item is placed. |
The elements add, <am>, corr, date, del, <ex>, expan, gap, name, reg, <space>, subst, supplied, time and unclear all use the attribute unit to indicate the units in which the size of the feature concerned is expressed. In TEI simplePrint the possible values for this attribute are limited as indicated below:
unit | names the unit used for the measurement |
Very many TEI elements take the attribute type (see the specification for att.typed for a full list). In most cases, no constraint is placed on the possible values for this attribute. In the case of the element name however, the possible values for this attribute are limited as indicated below:
type | characterizes the element in some sense, using any convenient classification scheme or typology. |
A composite text, like a simple text, has an optional front and back matter. In between however, instead of a single body, it contains one or more discrete texts, each with its own optional front and back matter. The following elements are provided to handle composite texts of various kinds.
For many purposes, particularly in older texts, the preliminary material such as title pages, prefatory epistles, etc., may provide very useful additional linguistic or social information. The TEI Guidelines provide a set of recommendations for distinguishing the textual elements most commonly encountered in front matter, which are summarized here.
The start of a title page should be marked with the element titlePage. All text contained on the page should be transcribed and tagged with the appropriate element from the following list:
Typeface distinctions should be marked with the rendition attribute when necessary, as described above, though a very detailed description of the letter spacing and sizing used in ornamental titles is not easily achieved. Changes of language should be marked by appropriate use of the xml:lang attribute or the foreign element, as necessary. Names of people, places, or organizations may be tagged using the name element wherever they appear if no other more specific element is available.
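A simple title page might be transcribed along these lines. The bracketed text is a placeholder, and the selection of elements is only one plausible arrangement:

```xml
<titlePage>
  <docTitle>
    <titlePart type="main">[main title as printed]</titlePart>
    <titlePart type="sub">[subtitle as printed]</titlePart>
  </docTitle>
  <byline>By <docAuthor>[author's name as printed]</docAuthor></byline>
  <docImprint>
    [place of publication and printer's name],
    <docDate>[date as printed]</docDate>
  </docImprint>
</titlePage>
```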
Major blocks of text within the front matter should be marked using div elements; the following suggested values for the type attribute may be used to distinguish various common types of prefatory matter:
Where other kinds of prefatory matter are encountered, the encoder is at liberty to invent other values for the type attribute.
All text divisions, whether in front matter or elsewhere, may begin and end with one or more components which we term liminal elements, because they begin or end the division. A typical example is a heading or title of some kind which should be tagged using the head element; but there are many other possibilities:
Because of variations in publishing practice, back matter can contain virtually any of the elements listed above for front matter, and the same elements should be used where this is so. Additionally, back matter may contain the following types of matter within the back element. Like the structural divisions of the body, these should be marked as div elements, and distinguished by the following suggested values of the type attribute:
TEI simplePrint also provides elements for some additional components of front or back matter which are characteristic of particular kinds of text, in particular old play texts. These often include lists of dramatis personae and notes about the setting of a play, for which the following elements are provided:
Note that these elements are intended for use in marking up cast lists and setting notes as they appear in a source document. They are not intended for use when marking up definitive lists of the different roles identified in a play, except in so far as that may have been their original purpose.
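As an illustration, a cast list as printed in a source might be encoded like this. The cast item is the one used in the reference documentation for actor; the heading is an assumption:

```xml
<castList>
  <head>Dramatis Personae</head>
  <castItem>
    <!-- the role, its description, and the actor as named in the source -->
    <role>Mathias</role>
    <roleDesc>the Burgomaster</roleDesc>
    <actor>Mr. Henry Irving</actor>
  </castItem>
</castList>
```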
The following example shows one way of encoding the last part of Shakespeare's Tempest, as printed in the first folio:
The following elements may be used to encode a text represented by a collection of digital images, either alone or in conjunction with a textual transcription.
The surface element is useful in two situations: when it is desired to group different images of the same page, for example of different resolutions; and when it is desired to align parts of a page image with parts of a transcription. The zone element is used to define (and hence provide an identifier for) the location of a part of an image with reference to the surface on which it appears.
A more detailed explanation of the use of these attributes and other associated elements is given in the full TEI Guidelines.
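The two uses of surface described above might be sketched as follows. File names, identifiers, and coordinate values are all hypothetical, and the coordinate attributes shown (ulx, uly, lrx, lry) are those defined in the full TEI Guidelines:

```xml
<facsimile>
  <surface xml:id="page1" ulx="0" uly="0" lrx="200" lry="300">
    <!-- two images of the same page at different resolutions -->
    <graphic url="page1-thumb.png"/>
    <graphic url="page1-full.png"/>
    <!-- a zone identifying the part of the page that holds the heading -->
    <zone xml:id="page1-head" ulx="20" uly="20" lrx="180" lry="60"/>
  </surface>
</facsimile>

<!-- in the transcription, the facs attribute aligns text with the image -->
<pb facs="#page1"/>
<head facs="#page1-head">[heading as printed]</head>
```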
Every TEI text has a header which provides information analogous to that provided by the title page of a printed text. The header is introduced by the element teiHeader and has four major parts:
A corpus or collection of texts with many shared characteristics may have one header for the corpus and individual headers for each component of the corpus. In this case the type attribute indicates the type of header. <teiHeader type="corpus"> introduces the header for corpus-level information.
Some of the header elements contain running prose which consists of one or more p elements. Others are grouped:
The fileDesc element is mandatory. It contains a full bibliographic description of the file with the following elements:
The following elements can be used in the titleStmt to provide information about the title of a work and those responsible for its content:
The title of a digital resource derived from a non-digital original may be similar to that of its source but should be distinct from it, for example: [title of source]: TEI XML edition; or A machine readable version of: [title of source]
The editionStmt groups information relating to one edition of the digital resource (where edition is used as elsewhere in bibliography), and may include the following elements:
Determining exactly what constitutes a new edition of an electronic text is left to the encoder.
The extent statement describes the approximate size of the digital resource.
The publicationStmt is mandatory. It may contain a simple prose description or groups of the elements described below:
At least one of these elements must be present, unless the entire publication statement is in prose. The following elements may occur within them:
The seriesStmt element groups information about the series, if any, to which a publication belongs. It may contain title, idno, or respStmt elements.
The notesStmt, if used, contains one or more note elements, each of which contains a single note or annotation. Some information found in the notes area in conventional bibliography has been assigned specific elements in the TEI scheme.
The sourceDesc is a mandatory element which records details of the source or sources from which the computer file is derived. It may contain simple prose or a bibliographic citation, using one or more of the following elements:
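Taking only the mandatory components described above, a minimal header might look like the following sketch. The bracketed text is a placeholder and the publication statement is given in prose:

```xml
<teiHeader>
  <fileDesc>
    <titleStmt>
      <!-- title of the digital resource, distinct from that of its source -->
      <title>[title of source]: TEI XML edition</title>
    </titleStmt>
    <publicationStmt>
      <p>[how and by whom the file is published or distributed]</p>
    </publicationStmt>
    <sourceDesc>
      <p>[description of the source from which the file is derived]</p>
    </sourceDesc>
  </fileDesc>
</teiHeader>
```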
The encodingDesc element specifies the methods and editorial principles which governed the transcription of the text. Its use is highly recommended. It may be prose description or may contain more specialized elements chosen from the following list:
The editorialDecl contains a prose description of the practices used when encoding the text. Typically this description should cover such topics as the following, each of which may conveniently be given as a separate paragraph:
The full TEI Guidelines provide specialized elements for each of the topics above; these are not however included in TEI simplePrint.
When it does not consist simply of a prose description, the tagsDecl element may contain a number of more specialized elements providing additional information about how the document concerned has been marked up. The following elements may be used:
The rendition elements here contain fragments expressed in the W3C standard Cascading Stylesheets language (CSS). Their function here is to associate the particular styles concerned with an identifier (for example rend-bo) which can then be pointed to from elsewhere within the document by means of the rendition attribute mentioned in section 3.5.1 Changes of Typeface, etc. above. To indicate, for example, that a particular name in the document was rendered in a bold font it might be tagged <name rendition="#rend-bo">. The selector attribute used in the preceding example is used to indicate once for all a default rendition value to be associated with several elements: in this example, unless otherwise indicated, it is assumed that the content of each hi and each title element was originally rendered using an italic font.
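A declaration of the kind just described might be sketched as follows. The CSS fragments are assumptions chosen to match the description (bold for rend-bo, italic as the default for hi and title):

```xml
<tagsDecl>
  <!-- a named style, referenced from the text as rendition="#rend-bo" -->
  <rendition xml:id="rend-bo" scheme="css">font-weight: bold;</rendition>
  <!-- a default style applied to every hi and title element -->
  <rendition scheme="css" selector="hi, title">font-style: italic;</rendition>
</tagsDecl>

<!-- in the text -->
<name rendition="#rend-bo">[name printed in bold]</name>
```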
For TEI simplePrint, a large set of such rendition definitions has been predefined. The encoder is not therefore required to supply any detailed declarations, but can refer to the predefined definitions using the following values:
The simple: prefix used here must be mapped to a location at which the full rendition declaration can be found, by default the XML source of the present document.
Full details of the way these elements may be used are provided in the relevant section of the TEI Guidelines (http://www.tei-c.org/release/doc/tei-p5-doc/en/html/HD.html#HD57).
The refsDecl element is used to document the way in which any standard referencing scheme built into the encoding works. In its simplest form, it consists of prose description.
In this case, a pointer value in the form psn:MDH would be translated to http://www.example.com/personography.xml#MDH.
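A prefix definition producing the translation just mentioned might be sketched as follows, using the prefixDef element of the full TEI Guidelines; the regular expression is an assumption about the form of the identifiers:

```xml
<listPrefixDef>
  <!-- psn:MDH matches ([A-Z]+) and is rewritten with $1 = MDH -->
  <prefixDef ident="psn"
             matchPattern="([A-Z]+)"
             replacementPattern="http://www.example.com/personography.xml#$1">
    <p>Pointers using the psn prefix refer to person elements in the
      project personography.</p>
  </prefixDef>
</listPrefixDef>
```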
The classDecl element groups together definitions or sources for any descriptive classification schemes or taxonomies used by other parts of the header. These schemes may be defined in a number of different ways, using one or more of the following elements:
Linkage between a particular text and a category within such a taxonomy is made by means of the catRef element within the textClass element, as described in the next section.
The profileDesc element gathers together information about various descriptive aspects of a text. It has the following optional components:
The creation element documents where a work was created, even though it may not have been published or recorded there:
The full TEI Guidelines provide a rich range of additional elements to define more structured information about persons and places; these are not however available in TEI simplePrint.
The textClass element classifies a text. This may be done with reference to a classification system locally defined by means of the classDecl element, or by reference to some externally defined established scheme such as the Universal Decimal Classification. Texts may also be classified using lists of keywords, which may themselves be drawn from locally or externally defined control lists. The following elements are used to supply such classifications:
Multiple classifications may be supplied using any of the mechanisms described in this section.
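The link between a locally defined taxonomy and a text's classification might be sketched as follows. The identifiers, category descriptions, and keywords are invented for illustration:

```xml
<teiHeader>
  <!-- fileDesc omitted from this sketch -->
  <encodingDesc>
    <classDecl>
      <taxonomy xml:id="genre">
        <category xml:id="genre-drama">
          <catDesc>Drama</catDesc>
        </category>
        <category xml:id="genre-verse">
          <catDesc>Verse</catDesc>
        </category>
      </taxonomy>
    </classDecl>
  </encodingDesc>
  <profileDesc>
    <textClass>
      <!-- points to a category defined in the taxonomy above -->
      <catRef target="#genre-drama"/>
      <!-- free keywords may be supplied as well -->
      <keywords>
        <term>tragedy</term>
        <term>revenge</term>
      </keywords>
    </textClass>
  </profileDesc>
</teiHeader>
```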
The TEI header was one of the first attempts to provide a full range of metadata elements, but it is by no means the only standard now used for this purpose. To facilitate the management of large digital collections and to simplify interoperability of TEI and non-TEI resources, the following element may be found useful:
A typical use for this element might be to store a set of descriptors conforming to the Dublin Core standard in the TEI header rather than to generate them automatically from the corresponding TEI elements. For examples and discussion, see the TEI Guidelines at http://www.tei-c.org/release/doc/tei-p5-doc/en/html/HD.html#HD9
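Assuming the element in question is xenoData, as in the full TEI Guidelines, a set of Dublin Core descriptors might be embedded as in this sketch. The bracketed values are placeholders:

```xml
<xenoData xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
          xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:RDF>
    <rdf:Description rdf:about="">
      <!-- descriptors stored directly rather than derived from TEI elements -->
      <dc:title>[title of source]: TEI XML edition</dc:title>
      <dc:creator>[author of source]</dc:creator>
      <dc:language>[language code]</dc:language>
    </rdf:Description>
  </rdf:RDF>
</xenoData>
```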
The revisionDesc element provides a change log in which each significant change made to a text may be recorded. It is always the last element in a teiHeader and contains the following elements:
Each change element contains a brief description of a significant change. The attributes when and who may be used to identify when the change was carried out and the person responsible for it.
It is good practice (but not required) to group changes together within a listChange element.
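A change log following this practice might look like the sketch below. The dates, descriptions, and the identifier ED are hypothetical:

```xml
<revisionDesc>
  <listChange>
    <!-- most recent change first, by common convention -->
    <change when="2016-03-01" who="#ED">Corrected transcription errors in the front matter.</change>
    <change when="2016-01-15" who="#ED">First complete draft of the transcription.</change>
  </listChange>
</revisionDesc>
```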
In a production environment it will usually be found preferable to use some kind of automated system to track and record changes. Many such version control systems, as they are known, can also be configured to update the TEI header of a file automatically.
Unlike most other TEI customizations, TEI simplePrint includes documentation of the intended processing associated with the majority of elements. As noted above, the TEI provides components such as the rendition attribute to indicate the appearance of particular parts of a document in the non-digital source from which it is derived. With TEI simplePrint, it is also possible to indicate how in general an element should be processed, in particular its intended appearance when processed for display on a screen or on paper. This ability derives from a number of capabilities recently added to the TEI architecture for the specification of processing, which were developed as part of the project that defined the TEI simplePrint schema.
The key feature of this ‘Processing Model’ is a notation that allows the encoder to associate each element with one or more categories, which we call its behaviours. In addition, the Processing Model indicates how the element should be rendered, possibly differently in differing circumstances, using the W3C Cascading Style Sheets (CSS) mentioned above. It is consequently much easier to develop processors for documents conforming to TEI simplePrint, since the complexity of the task is much reduced.
Twenty-five different behaviours are currently defined by the TEI Processing Model. Their names indicate informally the categorization concerned, and should be readily comprehensible for most programmers. The following table indicates the TEI simplePrint elements associated with each:
Behaviour | Used by | Effect |
alternate | choice date | support display of alternative visualizations, for example by displaying the preferred content, by displaying both in parallel, or by toggling between the two. |
anchor | anchor | create an identifiable anchor point in the output. |
block | address addrLine argument back body byline closer dateline div docTitle epigraph figure floatingText formula front fw group head imprimatur l lg listBibl note opener postscript q quote role roleDesc salute signed sp speaker spGrp stage titlePage titlePart trailer | create a block structure |
body | text | create the body of a document |
break | cb lb pb | create a line, column, or page break according to the value of type |
cell | cell | create a table cell |
cit | cit | show the content, with an indication of the source |
document | TEI | start a new output document |
glyph | g | show a character by looking up reference to a chardesc at the given URI |
graphic | graphic | if URL is present, use it to display graphic, else display a placeholder image |
heading | head | create a heading |
index | body | generate list according to type |
inline | abbr actor add am author bibl biblScope c choice code corr date del desc docAuthor docDate docEdition docImprint editor email ex expan figDesc figure foreign formula fw g gap hi label measure milestone name note num orig pc q quote ref reg relatedItem rhyme rs s salute seg sic signed subst supplied time title unclear w | create an inline element out of the content: if there is something in <outputRendition>, use that formatting; otherwise just show the text of the selected content |
link | ref | create hyperlink |
list | castGroup castList list listBibl | create a list |
listItem | bibl castItem item | create a list item |
metadata | teiHeader | create metadata section |
note | note | create a note, often out of line, depending on the value of place; could be margin, footnote, endnote, inline |
omit | author editor publisher pubPlace profileDesc revisionDesc encodingDesc | do nothing, do not process children |
paragraph | ab p | create a paragraph out of content |
row | row | create a table row |
section | div | create a new section of the output document |
table | table | create a table |
text | title | create literal text |
title | fileDesc | create document title |
Full documentation of the Processing Model is provided in section http://www.tei-c.org/release/doc/tei-p5-doc/en/html/TD.html#TDPM of the TEI Guidelines, and we do not describe it further here.
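For orientation only, a processing model declaration inside an element specification might take roughly the following form. The CSS shown is an assumption and is not necessarily what TEI simplePrint itself declares for hi:

```xml
<elementSpec ident="hi" mode="change">
  <!-- treat hi as inline content and suggest an italic rendering -->
  <model behaviour="inline">
    <outputRendition>font-style: italic;</outputRendition>
  </model>
</elementSpec>
```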
Like other TEI customizations, TEI simplePrint is defined by reference to the TEI Guidelines. The following reference documentation provides formal specifications for each element, model class, attribute class, macro and datatype it uses. These concepts are further explained in the TEI Guidelines.
Specifications are provided here for each component which has been modified for inclusion in TEI simplePrint. Almost every textual element has been modified, if only to include a processing model component. Note that the cross references included in these specifications are to the section of the full TEI Guidelines where the subject is treated, and not to sections of the present document.
<ab> (anonymous block) contains any component-level unit of text, acting as a container for phrase or inter level elements analogous to, but without the same constraints as, a paragraph. [17.3. Blocks, Segments, and Anchors] | |
Module | linking |
Attributes |
|
Member of | |
Contained by | |
May contain | core: abbr add address bibl cb choice cit corr date del desc email expan foreign gap graphic hi l label lb lg list listBibl measure milestone name note num orig pb q quote ref reg rs sic stage term time title unclear drama: castList gaiji: g namesdates: listPerson listPlace tagdocs: code textstructure: floatingText verse: rhyme character data |
Note | The ab element may be used at the encoder's discretion to mark any component-level elements in a text for which no other more specific appropriate markup is defined. Unlike paragraphs, ab may nest and may use the type and subtype attributes. |
Example | <div type="book" n="Genesis"> <div type="chapter" n="1"> <ab>In the beginning God created the heaven and the earth.</ab> <ab>And the earth was without form, and void; and darkness was upon the face of the deep. And the spirit of God moved upon the face of the waters.</ab> <ab>And God said, Let there be light: and there was light.</ab> <!-- ...--> </div> </div> |
Schematron | <sch:rule context="tei:ab"> <sch:report test="(ancestor::tei:l or ancestor::tei:lg) and not( ancestor::tei:floatingText |parent::tei:figure |parent::tei:note )"> Abstract model violation: Lines may not contain higher-level divisions such as p or ab, unless ab is a child of figure or note, or is a descendant of floatingText. </sch:report> </sch:rule> |
Content model | <content> |
Schema Declaration | element ab { att.global.attributes, att.typed.attributes, att.fragmentable.attributes, att.written.attributes, att.cmc.attributes, macro.abContent } |
Processing Model | <model behaviour="paragraph"/> |
<abbr> (abbreviation) contains an abbreviation of any sort. [3.6.5. Abbreviations and Their Expansions] | |
Module | core |
Attributes |
|
Member of | |
Contained by | core: abbr add addrLine author bibl biblScope choice corr date del desc editor email expan foreign head hi item l label measure name note num orig p pubPlace publisher q quote ref reg resp rs sic speaker stage term time title unclear header: catDesc change classCode creation distributor edition extent language licence rendition tagUsage textstructure: byline closer dateline docAuthor docDate docEdition docImprint imprimatur opener salute signed titlePart trailer verse: rhyme |
May contain | |
Example | <choice> <expan>North Atlantic Treaty Organization</expan> <abbr cert="low">NorATO</abbr> <abbr cert="high">NATO</abbr> <abbr cert="high" xml:lang="fr">OTAN</abbr> </choice> |
Example | <choice> <abbr>SPQR</abbr> <expan>senatus populusque romanorum</expan> </choice> |
Content model | <content> |
Schema Declaration | element abbr { att.global.attributes, att.typed.attribute.subtype, att.cmc.attributes, attribute type { teidata.enumerated }?, macro.phraseSeq } |
Processing Model | <model behaviour="inline"/> |
<abstract> contains a summary or formal abstract prefixed to an existing source document by the encoder. [2.4.4. Abstracts] | |
Module | header |
Attributes |
|
Member of | |
Contained by | header: profileDesc |
May contain | |
Note | This element is intended only for cases where no abstract is available in the original source. Any abstract already present in the source document should be encoded as a div within the front, as it should for a born-digital document. |
Example | <profileDesc> <abstract resp="#LB"> <p>Good database design involves the acquisition and deployment of skills which have a wider relevance to the educational process. From a set of more or less instinctive rules of thumb a formal discipline or "methodology" of database design has evolved. Applying that methodology can be of great benefit to a very wide range of academic subjects: it requires fundamental skills of abstraction and generalisation and it provides a simple mechanism whereby complex ideas and information structures can be represented and manipulated, even without the use of a computer. </p> </abstract> </profileDesc> |
Content model | <content> |
Schema Declaration | element abstract { att.global.attributes, ( model.pLike | model.listLike | listBibl )+ } |
<actor> contains the name of an actor appearing within a cast list. [7.1.4. Cast Lists] | |
Module | drama |
Attributes |
|
Member of | |
Contained by | drama: castItem |
May contain | |
Note | This element should be used only to mark the name of the actor as given in the source. Chapter 14. Names, Dates, People, and Places discusses ways of marking the components of names, and also of associating names with biographical information about a person. |
Example | <castItem> <role>Mathias</role> <roleDesc>the Burgomaster</roleDesc> <actor ref="https://en.wikipedia.org/wiki/Henry_Irving">Mr. Henry Irving</actor> </castItem> |
Content model | <content> |
Schema Declaration | element actor { att.global.attributes, att.canonical.attributes, attribute sex { list { teidata.sex+ } }?, attribute gender { list { teidata.gender+ } }?, macro.phraseSeq } |
Processing Model | <model behaviour="inline"/> |
<add> (addition) contains letters, words, or phrases inserted in the source text by an author, scribe, or a previous annotator or corrector. [3.5.3. Additions, Deletions, and Omissions] | |
Module | core |
Attributes |
|
Member of | |
Contained by | core: abbr add addrLine author bibl biblScope corr date del editor email expan foreign head hi item l label lg measure name note num orig p pubPlace publisher q quote ref reg rs sic speaker stage term time title unclear figures: cell header: change distributor edition extent licence textstructure: byline closer dateline docAuthor docDate docEdition docImprint imprimatur opener salute signed titlePart trailer verse: rhyme |
May contain | core: abbr add address bibl cb choice cit corr date del desc email expan foreign gap graphic hi l label lb lg list listBibl measure milestone name note num orig pb q quote ref reg rs sic stage term time title unclear drama: castList gaiji: g namesdates: listPerson listPlace tagdocs: code textstructure: floatingText verse: rhyme character data |
Note | In a diplomatic edition attempting to represent an original source, the add element should not be used for additions to the current TEI electronic edition made by editors or encoders. In these cases, either the corr or the supplied element is recommended. In a TEI edition of a historical text with previous editorial emendations in which such additions or reconstructions are considered part of the source text, the use of add may be appropriate, depending on the editorial philosophy of the project. |
Example | The story I am going to relate is true as to its main facts, and as to the consequences <add place="above">of these facts</add> from which this tale takes its title. |
Content model | <content> |
Schema Declaration | element add { att.global.attributes, att.transcriptional.attributes, att.placement.attributes, att.typed.attributes, att.dimensions.attributes, att.cmc.attributes, macro.paraContent } |
Processing Model | <model behaviour="inline"> |
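The distinction drawn in the note above can be sketched as follows. This is an illustrative fragment only: the sentences reuse wording from the example above, and the scenario of an editor restoring a missing word is an assumption made for the sake of contrast.

```xml
<!-- An insertion written above the line in the source itself: use add -->
<p>The story I am going to relate is true <add place="above">as to its main facts</add>.</p>

<!-- A word absent from the source and provided by the modern editor
     (hypothetical case): use supplied, not add -->
<p>The story I am going to relate is <supplied>true</supplied> as to its main facts.</p>
```

Which of the two applies depends on who made the addition: a hand at work in the source document, or the editor of the electronic edition.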
<address> (address) contains a postal address, for example of a publisher, an organization, or an individual. [3.6.2. Addresses 2.2.4. Publication, Distribution, Licensing, etc. 3.12.2.4. Imprint, Size of a Document, and Reprint Information] | |
Module | core |
Attributes |
|
Member of | |
Contained by | analysis: s core: abbr add addrLine author bibl biblScope corr date del desc editor email expan foreign head hi item l label measure name note num orig p pubPlace publisher q quote ref reg resp rs sic speaker stage term time title unclear header: catDesc change classCode creation distributor edition extent language licence publicationStmt rendition tagUsage textstructure: byline closer dateline docAuthor docDate docEdition docImprint imprimatur opener salute signed titlePart trailer verse: rhyme |
May contain | |
Note | This element should be used for postal addresses only. Within it, the generic element addrLine may be used as an alternative to any of the more specialized elements available from the model.addrPart class, such as <street>, <postCode> etc. |
Example | Using just the elements defined by the core module, an address could be represented as follows: <address> <street>via Marsala 24</street> <postCode>40126</postCode> <name>Bologna</name> <name>Italy</name> </address> |
Example | When a schema includes the names and dates module, more specific elements such as country or settlement are preferable to the generic name: <address> <street>via Marsala 24</street> <postCode>40126</postCode> <settlement>Bologna</settlement> <country>Italy</country> </address> |
Example | <address> <addrLine>Computing Center, MC 135</addrLine> <addrLine>P.O. Box 6998</addrLine> <addrLine>Chicago, IL 60680</addrLine> <addrLine>USA</addrLine> </address> |
Example | <address> <country key="FR"/> <settlement type="city">Lyon</settlement> <postCode>69002</postCode> <district type="arrondissement">IIème</district> <district type="quartier">Perrache</district> <street> <num>30</num>, Cours de Verdun</street> </address> |
Content model | <content> |
Schema Declaration | element address { att.global.attributes, att.cmc.attributes, ( model.global*, ( ( model.addrPart, model.global* )+ ) ) } |
Processing Model | <model behaviour="block"> |
<addrLine> (address line) contains one line of a postal address. [3.6.2. Addresses 2.2.4. Publication, Distribution, Licensing, etc. 3.12.2.4. Imprint, Size of a Document, and Reprint Information] | |
Module | core |
Attributes |
|
Member of | |
Contained by | core: address |
May contain | |
Note | Addresses may be encoded either as a sequence of lines, or using any sequence of component elements from the model.addrPart class. Other non-postal forms of address, such as telephone numbers or email, should not be included within an address element directly but may be wrapped within an addrLine if they form part of the printed address in some source text. |
Example | <address> <addrLine>Computing Center, MC 135</addrLine> <addrLine>P.O. Box 6998</addrLine> <addrLine>Chicago, IL</addrLine> <addrLine>60680 USA</addrLine> </address> |
Example | <addrLine> <ref target="tel:+1-201-555-0123">(201) 555 0123</ref> </addrLine> |
Content model | <content> |
Schema Declaration | element addrLine { att.global.attributes, macro.phraseSeq } |
Processing Model | <model behaviour="block"> |
<anchor> (anchor point) attaches an identifier to a point within a text, whether or not it corresponds with a textual element. [8.4.2. Synchronization and Overlap 17.5. Correspondence and Alignment] | |
Module | linking |
Attributes |
|
Member of | |
Contained by | core: abbr add addrLine address author bibl biblScope cit corr date del editor email expan foreign head hi item l label lg list listBibl measure name note num orig p pubPlace publisher q quote ref reg resp rs sic sp speaker stage term time title unclear namesdates: person textstructure: argument back body byline closer dateline div docAuthor docDate docEdition docImprint docTitle epigraph floatingText front group imprimatur opener postscript salute signed text titlePage titlePart trailer verse: rhyme |
May contain | Empty element |
Note | On this element, the global xml:id attribute must be supplied to specify an identifier for the point at which this element occurs within a document. The value used may be chosen freely provided that it is unique within the document and is a syntactically valid name. There is no requirement for values containing numbers to be in sequence. |
Example | <s>The anchor is he<anchor xml:id="A234"/>re somewhere.</s> <s>Help me find it.<ptr target="#A234"/> </s> |
Content model | <content> |
Schema Declaration | element anchor { att.global.attributes, att.typed.attributes, att.cmc.attributes, empty } |
Processing Model | <model behaviour="anchor"> |
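As a minimal sketch of the identifier rules in the note above, the following fragment uses two anchor points whose xml:id values are chosen freely, are unique, and are not in any numerical sequence. The identifiers and the sentence itself are hypothetical.

```xml
<p>The first reading begins here<anchor xml:id="rdg-start"/> and the
  second begins here<anchor xml:id="A017"/>, although the commentary discusses
  the <ref target="#A017">second</ref> before the
  <ref target="#rdg-start">first</ref>.</p>
```

Both values are syntactically valid names (each starts with a letter), and each is referenced elsewhere by prefixing it with #.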
<argument> (argument) contains a formal list or prose description of the topics addressed by a subdivision of a text. [4.2. Elements Common to All Divisions 4.6. Title Pages] | |
Module | textstructure |
Attributes |
|
Member of | |
Contained by | |
May contain | |
Example | <argument> <l>With ſighs and tears her love he doth deſire,</l> <l>Since Cupid hath his ſenſes ſet on fire;</l> <l>His torment and his pain to her he ſhews,</l> <l>With all his proteſtations and his vows:</l> <l>At laſt ſhe yields to grant him ſome relief,</l> <l>And make him joyful after all his grief.</l> </argument> |
Content model | <content> |
Schema Declaration | element argument { att.global.attributes, att.cmc.attributes, ( ( model.global | model.headLike )*, ( ( model.common, model.global* )+ ) ) } |
Processing Model | <model behaviour="block"> |
<author> (author) in a bibliographic reference, contains the name(s) of an author, personal or corporate, of a work; for example in the same form as that provided by a recognized bibliographic name authority. [3.12.2.2. Titles, Authors, and Editors 2.2.1. The Title Statement] | |||||||||||
Module | core | ||||||||||
Attributes |
| ||||||||||
Member of | |||||||||||
Contained by | core: bibl header: editionStmt titleStmt | ||||||||||
May contain | |||||||||||
Note | Particularly where cataloguing is likely to be based on the content of the header, it is advisable to use a generally recognized name authority file to supply the content for this element. The attributes key or ref may also be used to reference canonical information about the author(s) intended from any appropriate authority, such as a library catalogue or online resource. In the case of a broadcast, use this element for the name of the company or network responsible for making the broadcast. Where an author is unknown or unspecified, this element may contain text such as Unknown or Anonymous. When the appropriate TEI modules are in use, it may also contain detailed tagging of the names used for people, organizations or places, in particular where multiple names are given. | ||||||||||
Example | <author>British Broadcasting Corporation</author> <author>La Fayette, Marie Madeleine Pioche de la Vergne, comtesse de (1634–1693)</author> <author>Anonymous</author> <author>Bill and Melinda Gates Foundation</author> <author> <persName>Beaumont, Francis</persName> and <persName>John Fletcher</persName> </author> <author> <orgName key="BBC">British Broadcasting Corporation</orgName>: Radio 3 Network </author> | ||||||||||
Schematron | <sch:rule context="tei:*[@calendar]"> <sch:assert test="string-length( normalize-space(.) ) gt 0"> @calendar indicates one or more systems or calendars to which the date represented by the content of this element belongs, but this <sch:name/> element has no textual content.</sch:assert> </sch:rule> | ||||||||||
Content model | <content> | ||||||||||
Schema Declaration | element author { att.global.attributes, att.naming.attributes, att.datable.attributes, attribute calendar { list { teidata.pointer+ } }?, macro.phraseSeq } | ||||||||||
Processing Model | <model predicate="ancestor::teiHeader" |
<availability> (availability) supplies information about the availability of a text, for example any restrictions on its use or distribution, its copyright status, any licence applying to it, etc. [2.2.4. Publication, Distribution, Licensing, etc.] | |||||||||
Module | header | ||||||||
Attributes |
| ||||||||
Member of | |||||||||
Contained by | core: bibl header: publicationStmt | ||||||||
May contain | |||||||||
Note | A consistent format should be adopted. | ||||||||
Example | <availability status="restricted"> <p>Available for academic research purposes only.</p> </availability> <availability status="free"> <p>In the public domain</p> </availability> <availability status="restricted"> <p>Available under licence from the publishers.</p> </availability> | ||||||||
Example | <availability> <licence target="http://opensource.org/licenses/MIT"> <p>The MIT License applies to this document.</p> <p>Copyright (C) 2011 by The University of Victoria</p> <p>Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:</p> <p>The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.</p> <p>THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.</p> </licence> </availability> | ||||||||
Content model | <content> | ||||||||
Schema Declaration | element availability { att.global.attributes, attribute status { "free" | "unknown" | "restricted" }?, ( model.availabilityPart | model.pLike )+ } |
<back> (back matter) contains any appendixes, etc. following the main part of a text. [4.7. Back Matter 4. Default Text Structure] | |
Module | textstructure |
Attributes |
|
Contained by | textstructure: floatingText text transcr: facsimile |
May contain | namesdates: listPerson listPlace textstructure: argument byline closer dateline div docAuthor docDate docEdition docImprint docTitle epigraph postscript signed titlePage titlePart trailer transcr: fw |
Note | Because cultural conventions differ as to which elements are grouped as back matter and which as front matter, the content models for the back and front elements are identical. |
Example | <back> <div type="appendix"> <head>The Golden Dream or, the Ingenuous Confession</head> <p>TO shew the Depravity of human Nature, and how apt the Mind is to be misled by Trinkets and false Appearances, Mrs. Two-Shoes does acknowledge, that after she became rich, she had like to have been, too fond of Money <!-- .... --> </p> </div> <!-- ... --> <div type="epistle"> <head>A letter from the Printer, which he desires may be inserted</head> <salute>Sir.</salute> <p>I have done with your Copy, so you may return it to the Vatican, if you please; <!-- ... --> </p> </div> <div type="advert"> <head>The Books usually read by the Scholars of Mrs Two-Shoes are these and are sold at Mr Newbery's at the Bible and Sun in St Paul's Church-yard.</head> <list> <item n="1">The Christmas Box, Price 1d.</item> <item n="2">The History of Giles Gingerbread, 1d.</item> <!-- ... --> <item n="42">A Curious Collection of Travels, selected from the Writers of all Nations, 10 Vol, Pr. bound 1l.</item> </list> </div> <div type="advert"> <head>By the KING's Royal Patent, Are sold by J. NEWBERY, at the Bible and Sun in St. Paul's Church-Yard.</head> <list> <item n="1">Dr. James's Powders for Fevers, the Small-Pox, Measles, Colds, &amp;c. 2s. 6d</item> <item n="2">Dr. Hooper's Female Pills, 1s.</item> <!-- ... --> </list> </div> </back> |
Content model | <content> |
Schema Declaration | element back { att.global.attributes, ( ( model.frontPart | model.pLike.front | model.pLike | model.listLike | model.global )*, ( ( model.div1Like, ( model.frontPart | model.div1Like | model.global )* ) | ( model.divLike, ( model.frontPart | model.divLike | model.global )* ) )?, ( ( model.divBottomPart, ( model.divBottomPart | model.global )* )? ) ) } |
Processing Model | <model behaviour="block"/> |
<bibl> (bibliographic citation) contains a loosely-structured bibliographic citation of which the sub-components may or may not be explicitly tagged. [3.12.1. Methods of Encoding Bibliographic References and Lists of References 2.2.7. The Source Description 16.3.2. Declarable Elements] | |
Module | core |
Attributes |
|
Member of | |
Contained by | |
May contain | core: abbr add address author bibl biblScope cb choice corr date del editor email expan foreign gap hi lb measure milestone name note num orig pb pubPlace publisher q quote ref reg relatedItem respStmt rs sic term time title unclear figures: figure gaiji: g header: availability distributor edition extent idno tagdocs: code character data |
Note | Contains phrase-level elements, together with any combination of elements from the model.biblPart class |
Example | <epigraph> <bibl>Deut. Chap. 5.</bibl> <q>11 Thou ſhalt not take the name of the Lord thy God in vaine, for the Lord will not hold him guiltleſſe which ſhall take his name in vaine.</q> </epigraph> |
Schematron | <sch:rule context="tei:bibl"> <sch:assert test="child::* or child::text()[normalize-space()]" role="ERROR">Element "<sch:name/>" may not be empty. </sch:assert> </sch:rule> |
Content model | <content> |
Schema Declaration | element bibl { att.global.attributes, att.typed.attributes, att.sortable.attributes, att.docStatus.attributes, att.cmc.attributes, ( text | model.gLike | model.highlighted | model.pPart.data | model.pPart.edit | model.segLike | model.ptrLike | model.biblPart | model.global )* } |
Processing Model | <model predicate="parent::listBibl" |
<biblFull> (fully-structured bibliographic citation) contains a fully-structured bibliographic citation, in which all components of the TEI file description are present. [3.12.1. Methods of Encoding Bibliographic References and Lists of References 2.2. The File Description 2.2.7. The Source Description 16.3.2. Declarable Elements] | |
Module | header |
Attributes |
|
Member of | |
Contained by | |
May contain | |
Example | <sourceDesc> <biblFull> <titleStmt> <title>Buxom Joan of Lymas's love to a jolly sailer: or, The maiden's choice: being love for love again. To an excellent new play-house tune.</title> <author>Congreve, William, 1670-1729.</author> </titleStmt> <extent>1 sheet ([1] p.) : music. </extent> <publicationStmt> <publisher>printed for P[hilip]. Brooksby, at the Golden-ball, in Pye-corner.,</publisher> <pubPlace>London: :</pubPlace> <date>[between 1693-1695]</date> </publicationStmt> <notesStmt> <note>Attributed to William Congreve by Wing.</note> <note>Date of publication and publisher's name from Wing.</note> <note>Verse: "A soldier and a sailer ..."</note> <note>Printed in two columns.</note> <note>Reproduction of original in the British Library.</note> </notesStmt> </biblFull> </sourceDesc> |
Content model | <content> |
Schema Declaration | element biblFull { att.global.attributes, att.sortable.attributes, att.docStatus.attributes, att.cmc.attributes, ( ( ( titleStmt, editionStmt?, extent?, publicationStmt, seriesStmt*, notesStmt? ), sourceDesc* ) | ( fileDesc, profileDesc ) ) } |
<biblScope> (scope of bibliographic reference) defines the scope of a bibliographic reference, for example as a list of page numbers, or a named subdivision of a larger work. [3.12.2.5. Scopes and Ranges in Bibliographic Citations] | |
Module | core |
Attributes |
|
Member of | |
Contained by | core: bibl header: seriesStmt |
May contain | |
Note | When a single page is being cited, use the from and to attributes with an identical value. When no clear endpoint is provided, the from attribute may be used without to; for example a citation such as ‘p. 3ff’ might be encoded <biblScope from="3">p. 3ff</biblScope>. It is now considered good practice to supply this element as a sibling (rather than a child) of <imprint>, since it supplies information which does not constitute part of the imprint. |
Example | <biblScope>pp 12–34</biblScope> <biblScope unit="page" from="12" to="34"/> <biblScope unit="volume">II</biblScope> <biblScope unit="page">12</biblScope> |
Content model | <content> |
Schema Declaration | element biblScope { att.global.attributes, att.citing.attributes, macro.phraseSeq } |
Processing Model | <model behaviour="inline"/> |
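The two cases described in the note above (a single page, and a range with no clear endpoint) might be encoded as in the following sketch. The titles are hypothetical placeholders; the unit, from and to attributes are those already used in the example above.

```xml
<!-- A single page: from and to carry the same value -->
<bibl>
  <title>Hypothetical Miscellany</title>
  <biblScope unit="page" from="12" to="12">p. 12</biblScope>
</bibl>

<!-- No clear endpoint: from is given without to -->
<bibl>
  <title>Hypothetical Miscellany</title>
  <biblScope unit="page" from="3">p. 3ff</biblScope>
</bibl>
```

Keeping the printed form of the citation as element content while recording the normalized values in attributes lets software sort or compare ranges without parsing abbreviations such as "ff".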
<body> (text body) contains the whole body of a single unitary text, excluding any front or back matter. [4. Default Text Structure] | |
Module | textstructure |
Attributes |
|
Contained by | textstructure: floatingText text |
May contain | |
Example | <body> <l>Nu scylun hergan hefaenricaes uard</l> <l>metudæs maecti end his modgidanc</l> <l>uerc uuldurfadur sue he uundra gihuaes</l> <l>eci dryctin or astelidæ</l> <l>he aerist scop aelda barnum</l> <l>heben til hrofe haleg scepen.</l> <l>tha middungeard moncynnæs uard</l> <l>eci dryctin æfter tiadæ</l> <l>firum foldu frea allmectig</l> <trailer>primo cantauit Cædmon istud carmen.</trailer> </body> |
Content model | <content> |
Schema Declaration | element body { att.global.attributes, ( model.global*, ( ( model.divTop, ( model.global | model.divTop )* )? ), ( ( model.divGenLike, ( model.global | model.divGenLike )* )? ), ( ( ( model.divLike, ( model.global | model.divGenLike )* )+ ) | ( ( model.div1Like, ( model.global | model.divGenLike )* )+ ) | ( ( ( ( schemaSpec | model.common ), model.global* )+ ), ( ( ( model.divLike, ( model.global | model.divGenLike )* )+ ) | ( ( model.div1Like, ( model.global | model.divGenLike )* )+ ) )? ) ), ( ( model.divBottom, model.global* )* ) ) } |
Processing Model | <modelSequence> |
<byline> (byline) contains the primary statement of responsibility given for a work on its title page or at the head or end of the work. [4.2.2. Openers and Closers 4.5. Front Matter] | |
Module | textstructure |
Attributes |
|
Member of | |
Contained by | |
May contain | |
Note | The byline on a title page may include either the name or a description for the document's author. Where the name is included, it may optionally be tagged using the docAuthor element. |
Example | <byline>Written by a CITIZEN who continued all the while in London. Never made publick before.</byline> |
Example | <byline>Written from her own MEMORANDUMS</byline> |
Example | <byline>By George Jones, Political Editor, in Washington</byline> |
Example | <byline>BY <docAuthor>THOMAS PHILIPOTT,</docAuthor> Master of Arts, (Somtimes) Of Clare-Hall in Cambridge.</byline> |
Content model | <content> |
Schema Declaration | element byline { att.global.attributes, att.cmc.attributes, ( text | model.gLike | model.phrase | docAuthor | model.global )* } |
Processing Model | <model behaviour="block"/> |
<c> (character) represents a character. [18.1. Linguistic Segment Categories] | |
Module | analysis |
Attributes |
|
Member of | |
Contained by | core: abbr add addrLine author bibl biblScope corr date del editor email expan foreign head hi item l label measure name note num orig p pubPlace publisher q quote ref reg rs sic speaker stage term time title unclear figures: cell header: change distributor edition extent licence textstructure: byline closer dateline docAuthor docDate docEdition docImprint imprimatur opener salute signed titlePart trailer verse: rhyme |
May contain | gaiji: g character data |
Note | Contains a single character, a g element, or a sequence of graphemes to be treated as a single character. The type attribute is used to indicate the function of this segmentation, taking values such as letter, punctuation, or digit etc. |
Example | <phr> <c>M</c> <c>O</c> <c>A</c> <c>I</c> <w>doth</w> <w>sway</w> <w>my</w> <w>life</w> </phr> |
Content model | <content> |
Schema Declaration | element c { att.global.attributes, att.segLike.attributes, att.typed.attributes, att.notated.attributes, att.cmc.attributes, macro.xtext } |
Processing Model | <model behaviour="inline"/> |
<castGroup> (cast list grouping) groups one or more individual castItem elements within a cast list. [7.1.4. Cast Lists] | |
Module | drama |
Attributes |
|
Contained by | |
May contain | |
Note | The rend attribute may be used, as here, to indicate whether the grouping is indicated by a brace, whitespace, font change, etc. Note that in this example the role description ‘friends of Mathias’ is understood to apply to both roles equally. |
Example | <castGroup rend="braced"> <castItem> <role>Walter</role> <actor>Mr Frank Hall</actor> </castItem> <castItem> <role>Hans</role> <actor>Mr F.W. Irish</actor> </castItem> <roleDesc>friends of Mathias</roleDesc> </castGroup> |
Content model | <content> |
Schema Declaration | element castGroup { att.global.attributes, ( ( model.global | model.headLike )*, ( ( ( castItem | castGroup | roleDesc ), model.global* )+ ), ( ( trailer, model.global* )? ) ) } |
Processing Model | <model predicate="child::*" behaviour="list"> |
<castItem> (cast list item) contains a single entry within a cast list, describing either a single role or a list of non-speaking roles. [7.1.4. Cast Lists] | |||||||||||
Module | drama | ||||||||||
Attributes |
| ||||||||||
Contained by | |||||||||||
May contain | |||||||||||
Example | <castItem> <role>Player</role> <actor>Mr Milward</actor> </castItem> | ||||||||||
Example | <castItem type="list">Constables, Drawer, Turnkey, etc.</castItem> | ||||||||||
Content model | <content> | ||||||||||
Schema Declaration | element castItem { att.global.attributes, att.typed.attribute.subtype, attribute type { teidata.enumerated }?, ( text | model.gLike | model.castItemPart | model.phrase | model.global )* } | ||||||||||
Processing Model | <model behaviour="listItem"> |
<castList> (cast list) contains a single cast list or dramatis personae. [7.1.4. Cast Lists 7.1. Front and Back Matter] | |
Module | drama |
Attributes |
|
Member of | |
Contained by | textstructure: argument back body div docEdition epigraph front imprimatur postscript salute signed titlePart trailer transcr: supplied verse: rhyme |
May contain | |
Example | <castList> <castGroup> <head rend="braced">Mendicants</head> <castItem> <role>Aafaa</role> <actor>Femi Johnson</actor> </castItem> <castItem> <role>Blindman</role> <actor>Femi Osofisan</actor> </castItem> <castItem> <role>Goyi</role> <actor>Wale Ogunyemi</actor> </castItem> <castItem> <role>Cripple</role> <actor>Tunji Oyelana</actor> </castItem> </castGroup> <castItem> <role>Si Bero</role> <roleDesc>Sister to Dr Bero</roleDesc> <actor>Deolo Adedoyin</actor> </castItem> <castGroup> <head rend="braced">Two old women</head> <castItem> <role>Iya Agba</role> <actor>Nguba Agolia</actor> </castItem> <castItem> <role>Iya Mate</role> <actor>Bopo George</actor> </castItem> </castGroup> <castItem> <role>Dr Bero</role> <roleDesc>Specialist</roleDesc> <actor>Nat Okoro</actor> </castItem> <castItem> <role>Priest</role> <actor>Gbenga Sonuga</actor> </castItem> <castItem> <role>The old man</role> <roleDesc>Bero's father</roleDesc> <actor>Dapo Adelugba</actor> </castItem> </castList> <stage type="mix">The action takes place in and around the home surgery of Dr Bero, lately returned from the wars.</stage> |
Content model | <content> |
Schema Declaration | element castList { att.global.attributes, ( ( model.divTop | model.global )*, ( ( model.common, model.global* )* ), ( ( ( castItem | castGroup ), model.global* )+ ), ( ( model.common, model.global* )* ) ) } |
Processing Model | <model predicate="child::*" behaviour="list" |
<catDesc> (category description) describes some category within a taxonomy or text typology, either in the form of a brief prose description or in terms of the situational parameters used by the TEI formal <textDesc>. [2.3.7. The Classification Declaration] | |
Module | header |
Attributes |
|
Contained by | header: category |
May contain | |
Example | <catDesc>Prose reportage</catDesc> |
Example | <catDesc> <textDesc n="novel"> <channel mode="w">print; part issues</channel> <constitution type="single"/> <derivation type="original"/> <domain type="art"/> <factuality type="fiction"/> <interaction type="none"/> <preparedness type="prepared"/> <purpose type="entertain" degree="high"/> <purpose type="inform" degree="medium"/> </textDesc> </catDesc> |
Content model | <content> |
Schema Declaration | element catDesc { att.global.attributes, att.canonical.attributes, ( text | model.limitedPhrase | model.catDescPart )* } |
<category> (category) contains an individual descriptive category, possibly nested within a superordinate category, within a user-defined taxonomy. [2.3.7. The Classification Declaration] | |
Module | header |
Attributes |
|
Contained by | |
May contain | |
Example | <category xml:id="b1"> <catDesc>Prose reportage</catDesc> </category> |
Example | <category xml:id="b2"> <catDesc>Prose </catDesc> <category xml:id="b11"> <catDesc>journalism</catDesc> </category> <category xml:id="b12"> <catDesc>fiction</catDesc> </category> </category> |
Example | <category xml:id="LIT"> <catDesc xml:lang="pl">literatura piękna</catDesc> <catDesc xml:lang="en">fiction</catDesc> <category xml:id="LPROSE"> <catDesc xml:lang="pl">proza</catDesc> <catDesc xml:lang="en">prose</catDesc> </category> <category xml:id="LPOETRY"> <catDesc xml:lang="pl">poezja</catDesc> <catDesc xml:lang="en">poetry</catDesc> </category> <category xml:id="LDRAMA"> <catDesc xml:lang="pl">dramat</catDesc> <catDesc xml:lang="en">drama</catDesc> </category> </category> |
Content model | <content> |
Schema Declaration | element category { att.global.attributes, ( ( catDesc+ | ( model.descLike | equiv | gloss )* ), category* ) } |
<catRef> (category reference) specifies one or more defined categories within some taxonomy or text typology. [2.4.3. The Text Classification] | |||||||
Module | header | ||||||
Attributes |
| ||||||
Contained by | header: textClass | ||||||
May contain | Empty element | ||||||
Note | The scheme attribute needs to be supplied only if more than one taxonomy has been declared. | ||||||
Example | <catRef scheme="#myTopics" target="#news #prov #sales2"/> <!-- elsewhere --> <taxonomy xml:id="myTopics"> <category xml:id="news"> <catDesc>Newspapers</catDesc> </category> <category xml:id="prov"> <catDesc>Provincial</catDesc> </category> <category xml:id="sales2"> <catDesc>Low to average annual sales</catDesc> </category> </taxonomy> | ||||||
Content model | <content> | ||||||
Schema Declaration | element catRef { att.global.attributes, att.pointing.attributes, attribute scheme { teidata.pointer }?, empty } |
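A minimal sketch of the case mentioned in the note above, in which only one taxonomy has been declared and scheme can therefore be left out. The identifiers repeat those of the example above; the surrounding header structure is abbreviated and assumed.

```xml
<!-- In the encodingDesc: the only taxonomy declared in this (hypothetical) header -->
<classDecl>
  <taxonomy xml:id="myTopics">
    <category xml:id="news">
      <catDesc>Newspapers</catDesc>
    </category>
    <category xml:id="prov">
      <catDesc>Provincial</catDesc>
    </category>
  </taxonomy>
</classDecl>

<!-- In the profileDesc: scheme is omitted because only one taxonomy exists -->
<textClass>
  <catRef target="#news #prov"/>
</textClass>
```

If a second taxonomy were added later, supplying scheme on each catRef would make clear which taxonomy its targets belong to.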
<cb> (column beginning) marks the beginning of a new column of a text on a multi-column page. [3.11.3. Milestone Elements] | |
Module | core |
Attributes |
|
Member of | |
Contained by | core: abbr add addrLine address author bibl biblScope cit corr date del editor email expan foreign head hi item l label lg list listBibl measure name note num orig p pubPlace publisher q quote ref reg resp rs sic sp speaker stage term time title unclear namesdates: person textstructure: argument back body byline closer dateline div docAuthor docDate docEdition docImprint docTitle epigraph floatingText front group imprimatur opener postscript salute signed text titlePage titlePart trailer verse: rhyme |
May contain | Empty element |
Note | On this element, the global n attribute indicates the number or other value associated with the column which follows the point of insertion of this cb element. Encoders should adopt a clear and consistent policy as to whether the numbers associated with column breaks relate to the physical sequence number of the column in the whole text, or whether columns are numbered within the page. The cb element is placed at the head of the column to which it refers. |
Example | Markup of an early English dictionary printed in two columns: <pb/> <cb n="1"/> <entryFree> <form>Well</form>, <sense>a Pit to hold Spring-Water</sense>: <sense>In the Art of <hi rend="italic">War</hi>, a Depth the Miner sinks into the Ground, to find out and disappoint the Enemies Mines, or to prepare one</sense>. </entryFree> <entryFree>To <form>Welter</form>, <sense>to wallow</sense>, or <sense>lie groveling</sense>.</entryFree> <!-- remainder of column --> <cb n="2"/> <entryFree> <form>Wey</form>, <sense>the greatest Measure for dry Things, containing five Chaldron</sense>. </entryFree> <entryFree> <form>Whale</form>, <sense>the greatest of Sea-Fishes</sense>. </entryFree> |
Content model | <content> |
Schema Declaration | element cb { att.global.attributes, att.typed.attributes, att.edition.attributes, att.spanning.attributes, att.breaking.attributes, att.cmc.attributes, empty } |
Processing Model | <model behaviour="break"> |
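The numbering policy discussed in the note above can be illustrated with two alternative fragments. Both are sketches with invented page and column numbers; a project would adopt one policy and apply it throughout.

```xml
<!-- Policy A (assumed): columns are numbered afresh on each page -->
<p><pb n="12"/><cb n="1"/>First column of page 12 ...
  <cb n="2"/>second column of page 12 ...
  <pb n="13"/><cb n="1"/>first column of page 13 ...</p>

<!-- Policy B (assumed): columns are numbered in one sequence through the whole text -->
<p><pb n="12"/><cb n="23"/>First column of page 12 ...
  <cb n="24"/>second column of page 12 ...
  <pb n="13"/><cb n="25"/>first column of page 13 ...</p>
```

In either case each cb is placed at the head of the column it numbers, so the value of n always describes the text that follows.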
<cell> (cell) contains one cell of a table. [15.1.1. TEI Tables] | |||||||||||
Module | figures | ||||||||||
Attributes |
| ||||||||||
Contained by | figures: row | ||||||||||
May contain | core: abbr add address bibl cb choice cit corr date del desc email expan foreign gap graphic hi l label lb lg list listBibl measure milestone name note num orig p pb q quote ref reg rs sic sp stage term time title unclear drama: castList gaiji: g namesdates: listPerson listPlace tagdocs: code textstructure: floatingText verse: rhyme character data | ||||||||||
Example | <row> <cell role="label">General conduct</cell> <cell role="data">Not satisfactory, on account of his great unpunctuality and inattention to duties</cell> </row> | ||||||||||
Content model | <content> | ||||||||||
Schema Declaration | element cell { att.global.attributes, att.tableDecoration.attribute.rows, att.tableDecoration.attribute.cols, attribute role { "data" | "label" | "sum" | "total" }?, macro.specialPara } | ||||||||||
Processing Model | <model behaviour="cell"> |
<change> (change) documents a change or set of changes made during the production of a source document, or during the revision of an electronic file. [2.6. The Revision Description 2.4.1. Creation 12.7. Identifying Changes and Revisions] | |||||||||||||||||
Module | header | ||||||||||||||||
Attributes |
| ||||||||||||||||
Contained by | header: listChange revisionDesc | ||||||||||||||||
May contain | core: abbr add address bibl cb choice cit corr date del desc email expan foreign gap graphic hi l label lb lg list listBibl measure milestone name note num orig p pb q quote ref reg rs sic sp stage term time title unclear drama: castList gaiji: g namesdates: listPerson listPlace tagdocs: code textstructure: floatingText verse: rhyme character data | ||||||||||||||||
Note | The who attribute may be used to point to any other element, but will typically specify a respStmt or person element elsewhere in the header, identifying the person responsible for the change and their role in making it. It is recommended that changes be recorded with the most recent first. The status attribute may be used to indicate the status of a document following the change documented. | ||||||||||||||||
Example | <titleStmt> <title> ... </title> <editor xml:id="LDB">Lou Burnard</editor> <respStmt xml:id="BZ"> <resp>copy editing</resp> <name>Brett Zamir</name> </respStmt> </titleStmt> <!-- ... --> <revisionDesc status="published"> <change who="#BZ" when="2008-02-02" status="public">Finished chapter 23</change> <change who="#BZ" when="2008-01-02" status="draft">Finished chapter 2</change> <change n="P2.2" when="1991-12-21" who="#LDB">Added examples to section 3</change> <change when="1991-11-11" who="#MSM">Deleted chapter 10</change> </revisionDesc> | ||||||||||||||||
Example | <profileDesc> <creation> <listChange> <change xml:id="DRAFT1">First draft in pencil</change> <change xml:id="DRAFT2" notBefore="1880-12-09">First revision, mostly using green ink</change> <change xml:id="DRAFT3" notBefore="1881-02-13">Final corrections as supplied to printer.</change> </listChange> </creation> </profileDesc> | ||||||||||||||||
Content model | <content> | ||||||||||||||||
Schema Declaration | element change { att.ascribed.attributes, att.datable.attributes, att.docStatus.attributes, att.global.attributes, att.typed.attributes, attribute calendar { list { teidata.pointer+ } }?, attribute target { list { teidata.pointer+ } }?, macro.specialPara } |
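The note above recommends recording changes with the most recent first. A minimal Python sketch of how that ordering could be checked is given here; it is an illustration, not part of the TEI simplePrint schema. The function name `changes_most_recent_first` is hypothetical, and the sketch assumes every when value is a full ISO date (YYYY-MM-DD), so that plain string comparison matches chronological order.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"


def changes_most_recent_first(xml_text):
    """Return True if the @when values of <change> elements never increase in document order."""
    root = ET.fromstring(xml_text)
    # Only dated changes are compared; undated ones are skipped (an assumption of this sketch).
    dates = [c.get("when") for c in root.iter(f"{{{TEI_NS}}}change") if c.get("when")]
    # Full ISO dates (YYYY-MM-DD) sort chronologically as plain strings.
    return all(newer >= older for newer, older in zip(dates, dates[1:]))


# The revisionDesc from the first example above, with the TEI namespace declared.
revision_desc = f"""<revisionDesc xmlns="{TEI_NS}" status="published">
  <change who="#BZ" when="2008-02-02" status="public">Finished chapter 23</change>
  <change who="#BZ" when="2008-01-02" status="draft">Finished chapter 2</change>
  <change n="P2.2" when="1991-12-21" who="#LDB">Added examples to section 3</change>
  <change when="1991-11-11" who="#MSM">Deleted chapter 10</change>
</revisionDesc>"""

print(changes_most_recent_first(revision_desc))
```

Changes dated only with notBefore or similar attributes, as in the second example, are ignored by this sketch; a fuller check would need a policy for partial and approximate dates.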
<char> (character) provides descriptive information about a character. [5.2. Markup Constructs for Representation of Characters and Glyphs] | |
Module | gaiji |
Attributes |
|
Contained by | gaiji: charDecl |
May contain | |
Example | <char xml:id="circledU4EBA"> <localProp name="Name" value="CIRCLED IDEOGRAPH 4EBA"/> <localProp name="daikanwa" value="36"/> <unicodeProp name="Decomposition_Mapping" value="circle"/> <mapping type="standard">人</mapping> </char> |
Content model | <content> |
Schema Declaration | element char { att.global.attributes, ( unicodeProp | unihanProp | localProp | mapping | figure | model.graphicLike | model.noteLike | model.descLike )* } |
<charDecl> (character declarations) provides information about nonstandard characters and glyphs. [5.2. Markup Constructs for Representation of Characters and Glyphs] | |
Module | gaiji |
Attributes |
|
Member of | |
Contained by | header: encodingDesc |
May contain | |
Example | <charDecl> <char xml:id="aENL"> <unicodeProp name="Name" value="LATIN LETTER ENLARGED SMALL A"/> <mapping type="standard">a</mapping> </char> </charDecl> |
Content model | <content> |
Schema Declaration | element charDecl { att.global.attributes, ( desc?, ( char | glyph )+ ) } |
<choice> (choice) groups a number of alternative encodings for the same point in a text. [3.5. Simple Editorial Changes] | |
Module | core |
Attributes |
|
Member of | |
Contained by | core: abbr add addrLine author bibl biblScope choice corr date del desc editor email expan foreign head hi item l label measure name note num orig p pubPlace publisher q quote ref reg resp rs sic speaker stage term time title unclear header: catDesc change classCode creation distributor edition extent language licence rendition tagUsage textstructure: byline closer dateline docAuthor docDate docEdition docImprint imprimatur opener salute signed titlePart trailer verse: rhyme |
May contain | |
Note | Because the children of a choice element all represent alternative ways of encoding the same sequence, it is natural to think of them as mutually exclusive. However, there may be cases where a full representation of a text requires the alternative encodings to be considered as parallel. Note also that choice elements may self-nest. Where the purpose of an encoding is to record multiple witnesses of a single work, rather than to identify multiple possible encoding decisions at a given point, the <app> element and associated elements discussed in section 13.1. The Apparatus Entry, Readings, and Witnesses should be preferred. |
Example | An American encoding of Gulliver's Travels which retains the British spelling but also provides a version regularized to American spelling might be encoded as follows. <p>Lastly, That, upon his solemn oath to observe all the above articles, the said man-mountain shall have a daily allowance of meat and drink sufficient for the support of <choice> <sic>1724</sic> <corr>1728</corr> </choice> of our subjects, with free access to our royal person, and other marks of our <choice> <orig>favour</orig> <reg>favor</reg> </choice>.</p> |
Schematron | <sch:rule context="tei:choice"> <sch:assert test="( tei:corr and tei:sic ) or ( tei:expan and tei:abbr ) or ( tei:reg and tei:orig )" role="ERROR"> Element "<sch:name/>" must have corresponding corr/sic, expan/abbr, reg/orig </sch:assert> </sch:rule> |
Content model | <content> |
Schema Declaration | element choice { att.global.attributes, att.cmc.attributes, ( model.choicePart | choice ), ( model.choicePart | choice ), ( model.choicePart | choice )* } |
Processing Model | <model output="plain" |
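The Schematron constraint on choice can be approximated outside a Schematron processor. The following Python sketch mirrors the assertion's test expression using only the standard library; the function name `invalid_choices` and the sample paragraph are hypothetical, and the third choice in the sample is deliberately wrong to show a failure.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"


def tei(name):
    """Return the namespace-qualified tag that ElementTree uses for a TEI element."""
    return f"{{{TEI_NS}}}{name}"


# Child pairs accepted by the Schematron assertion on tei:choice.
REQUIRED_PAIRS = [("corr", "sic"), ("expan", "abbr"), ("reg", "orig")]


def invalid_choices(xml_text):
    """Return the <choice> elements that contain none of the required pairs."""
    root = ET.fromstring(xml_text)
    failures = []
    for choice in root.iter(tei("choice")):
        # Only direct children count, as in the XPath test.
        children = {child.tag for child in choice}
        if not any(tei(a) in children and tei(b) in children for a, b in REQUIRED_PAIRS):
            failures.append(choice)
    return failures


sample = f"""<p xmlns="{TEI_NS}">
  <choice><sic>1724</sic><corr>1728</corr></choice>
  <choice><orig>favour</orig><reg>favor</reg></choice>
  <choice><sic>teh</sic><reg>the</reg></choice>
</p>"""

for choice in invalid_choices(sample):
    print(sorted(child.tag.split("}")[-1] for child in choice))
```

Like the rule itself, the sketch only requires that one matching pair be present; it does not reject additional children.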
<cit> (cited quotation) contains a quotation from some other document, together with a bibliographic reference to its source. In a dictionary it may contain an example text with at least one occurrence of the word form, used in the sense being described, or a translation of the headword, or an example. [3.3.3. Quotation 4.3.1. Grouped Texts 10.3.5.1. Examples] | |
Module | core |
Attributes |
|
Member of | |
Contained by | analysis: s core: abbr add addrLine author biblScope cit corr del desc editor email expan foreign head hi item l label measure name note num orig p pubPlace publisher q quote ref reg rs sic sp speaker stage term title unclear textstructure: argument body div docAuthor docDate docEdition epigraph imprimatur postscript salute signed titlePart trailer verse: rhyme |
May contain | |
Example | <cit> <quote>and the breath of the whale is frequently attended with such an insupportable smell, as to bring on disorder of the brain.</quote> <bibl>Ulloa's South America</bibl> </cit> |
Example | <entry> <form> <orth>horrifier</orth> </form> <cit type="translation" xml:lang="en"> <quote>to horrify</quote> </cit> <cit type="example"> <quote>elle était horrifiée par la dépense</quote> <cit type="translation" xml:lang="en"> <quote>she was horrified at the expense.</quote> </cit> </cit> </entry> |
Example | <cit type="example"> <quote xml:lang="mix">Ka'an yu tsa'a Pedro.</quote> <media url="soundfiles-gen:S_speak_1s_on_behalf_of_Pedro_01_02_03_TS.wav" mimeType="audio/wav"/> <cit type="translation"> <quote xml:lang="en">I'm speaking on behalf of Pedro.</quote> </cit> <cit type="translation"> <quote xml:lang="es">Estoy hablando de parte de Pedro.</quote> </cit> </cit> |
Content model | <content> |
Schema Declaration | element cit { att.global.attributes, att.typed.attributes, att.cmc.attributes, ( model.biblLike | model.egLike | model.entryPart | model.global | model.graphicLike | model.ptrLike | model.attributable | pc | q )+ } |
Processing Model | <model predicate="child::quote and child::bibl" |
<classCode> (classification code) contains the classification code used for this text in some standard classification system. [2.4.3. The Text Classification] | |||||||
Module | header | ||||||
Attributes |
| ||||||
Contained by | header: textClass | ||||||
May contain | |||||||
Example | <classCode scheme="http://www.udc.org">410</classCode> | ||||||
Content model | <content> | ||||||
Schema Declaration | element classCode { att.global.attributes, attribute scheme { teidata.pointer }, macro.phraseSeq.limited } |
<classDecl> (classification declarations) contains one or more taxonomies defining any classificatory codes used elsewhere in the text. [2.3.7. The Classification Declaration 2.3. The Encoding Description] | |
Module | header |
Attributes |
|
Member of | |
Contained by | header: encodingDesc |
May contain | header: taxonomy |
Example | <classDecl> <taxonomy xml:id="LCSH"> <bibl>Library of Congress Subject Headings</bibl> </taxonomy> </classDecl> <!-- ... --> <textClass> <keywords scheme="#LCSH"> <term>Political science</term> <term>United States -- Politics and government — Revolution, 1775-1783</term> </keywords> </textClass> |
Content model | <content> |
Schema Declaration | element classDecl { att.global.attributes, taxonomy+ } |
<closer> (closer) groups together salutations, datelines, and similar phrases appearing as a final group at the end of a division, especially of a letter. [4.2.2. Openers and Closers 4.2. Elements Common to All Divisions] | |
Module | textstructure |
Attributes |
|
Member of | |
Contained by | |
May contain | |
Example | <div type="letter"> <p> perhaps you will favour me with a sight of it when convenient.</p> <closer> <salute>I remain, &c. &c.</salute> <signed>H. Colburn</signed> </closer> </div> |
Example | <div type="chapter"> <p> <!-- ... --> and his heart was going like mad and yes I said yes I will Yes.</p> <closer> <dateline> <name type="place">Trieste-Zürich-Paris,</name> <date>1914–1921</date> </dateline> </closer> </div> |
Content model | <content> |
Schema Declaration | element closer { att.global.attributes, att.written.attributes, att.cmc.attributes, ( text | model.gLike | signed | dateline | salute | model.phrase | model.global )* } |
Processing Model | <model behaviour="block"> |
<code> contains literal code from some formal language such as a programming language. [23.1.1. Phrase Level Terms] | |||||||
Module | tagdocs | ||||||
Attributes |
| ||||||
Member of | |||||||
Contained by | analysis: s core: abbr add addrLine author bibl biblScope corr date del desc editor email expan foreign head hi item l label measure name note num orig p pubPlace publisher q quote ref reg resp rs sic speaker stage term time title unclear header: catDesc change classCode creation distributor edition extent language licence rendition tagUsage textstructure: byline closer dateline docAuthor docDate docEdition docImprint imprimatur opener salute signed titlePart trailer verse: rhyme | ||||||
May contain | Character data only | ||||||
Example | <code lang="JAVA"> Size fCheckbox1Size = new Size(); fCheckbox1Size.Height = 500; fCheckbox1Size.Width = 500; xCheckbox1.setSize(fCheckbox1Size); </code> | ||||||
Content model | <content> | ||||||
Schema Declaration | element code { att.global.attributes, attribute lang { teidata.word }?, text } | ||||||
Processing Model | <model behaviour="inline"> |
<corr> (correction) contains the correct form of a passage apparently erroneous in the copy text. [3.5.1. Apparent Errors] | |
Module | core |
Attributes |
|
Member of | |
Contained by | core: abbr add addrLine author bibl biblScope choice corr date del editor email expan foreign head hi item l label lg measure name note num orig p pubPlace publisher q quote ref reg rs sic speaker stage term time title unclear figures: cell header: change distributor edition extent licence textstructure: byline closer dateline docAuthor docDate docEdition docImprint imprimatur opener salute signed titlePart trailer verse: rhyme |
May contain | core: abbr add address bibl cb choice cit corr date del desc email expan foreign gap graphic hi l label lb lg list listBibl measure milestone name note num orig pb q quote ref reg rs sic stage term time title unclear drama: castList gaiji: g namesdates: listPerson listPlace tagdocs: code textstructure: floatingText verse: rhyme character data |
Example | If all that is desired is to call attention to the fact that the copy text has been corrected, corr may be used alone: I don't know, Juan. It's so far in the past now — how <corr>can we</corr> prove or disprove anyone's theories? |
Example | It is also possible, using the choice and sic elements, to provide an uncorrected reading: I don't know, Juan. It's so far in the past now — how <choice> <sic>we can</sic> <corr>can we</corr> </choice> prove or disprove anyone's theories? |
Content model | <content> |
Schema Declaration | element corr { att.global.attributes, att.editLike.attributes, att.typed.attributes, att.cmc.attributes, macro.paraContent } |
Processing Model | <model predicate="parent::choice and count(parent::*/*) gt 1" |
<creation> (creation) contains information about the creation of a text. [2.4.1. Creation 2.4. The Profile Description] | |||||||||||
Module | header | ||||||||||
Attributes |
| ||||||||||
Member of | |||||||||||
Contained by | header: profileDesc | ||||||||||
May contain | |||||||||||
Note | The creation element may be used to record details of a text's creation, e.g. the date and place it was composed, if these are of interest. It may also contain a more structured account of the various stages or revisions associated with the evolution of a text; this should be encoded using the listChange element. It should not be confused with the publicationStmt element, which records date and place of publication. | ||||||||||
Example | <creation> <date>Before 1987</date> </creation> | ||||||||||
Example | <creation> <date when="1988-07-10">10 July 1988</date> </creation> | ||||||||||
Content model | <content> | ||||||||||
Schema Declaration | element creation { att.global.attributes, att.datable.attributes, attribute calendar { list { teidata.pointer+ } }?, ( text | model.limitedPhrase | listChange )* } |
<date> (date) contains a date in any format. [3.6.4. Dates and Times 2.2.4. Publication, Distribution, Licensing, etc. 2.6. The Revision Description 3.12.2.4. Imprint, Size of a Document, and Reprint Information 16.2.3. The Setting Description 14.4. Dates] | |
Module | core |
Attributes |
|
Member of | |
Contained by | analysis: s core: abbr add addrLine author bibl biblScope corr date del desc editor email expan foreign head hi item l label measure name note num orig p pubPlace publisher q quote ref reg resp rs sic speaker stage term time title unclear header: catDesc change classCode creation distributor edition extent language licence publicationStmt rendition tagUsage textstructure: byline closer dateline docAuthor docDate docEdition docImprint imprimatur opener salute signed titlePart trailer verse: rhyme |
May contain | |
Example | <date when="1980-02">early February 1980</date> |
Example | Given on the <date when="1977-06-12">Twelfth Day of June in the Year of Our Lord One Thousand Nine Hundred and Seventy-seven of the Republic the Two Hundredth and first and of the University the Eighty-Sixth.</date> |
Example | <date when="1990-09">September 1990</date> |
Content model | <content> |
Schema Declaration | element date { att.global.attributes, att.canonical.attributes, att.datable.attributes, att.calendarSystem.attributes, att.editLike.attributes, att.dimensions.attributes, att.typed.attributes, att.cmc.attributes, ( text | model.gLike | model.phrase | model.global )* } |
Processing Model | <model output="print" predicate="text()" |
<dateline> (dateline) contains a brief description of the place, date, time, etc. of production of a letter, newspaper story, or other work, prefixed or suffixed to it as a kind of heading or trailer. [4.2.2. Openers and Closers] | |
Module | textstructure |
Attributes |
|
Member of | |
Contained by | |
May contain | |
Example | <dateline>Walden, this 29. of August 1592</dateline> |
Example | <div type="chapter"> <p> <!-- ... --> and his heart was going like mad and yes I said yes I will Yes.</p> <closer> <dateline> <name type="place">Trieste-Zürich-Paris,</name> <date>1914–1921</date> </dateline> </closer> </div> |
Content model | <content> |
Schema Declaration | element dateline { att.global.attributes, att.cmc.attributes, ( text | model.gLike | model.phrase | model.global | docDate )* } |
Processing Model | <model behaviour="block"/> |
<del> (deletion) contains a letter, word, or passage deleted, marked as deleted, or otherwise indicated as superfluous or spurious in the copy text by an author, scribe, or a previous annotator or corrector. [3.5.3. Additions, Deletions, and Omissions] | |
Module | core |
Attributes |
|
Member of | |
Contained by | core: abbr add addrLine author bibl biblScope corr date del editor email expan foreign head hi item l label lg measure name note num orig p pubPlace publisher q quote ref reg rs sic speaker stage term time title unclear figures: cell header: change distributor edition extent licence textstructure: byline closer dateline docAuthor docDate docEdition docImprint imprimatur opener salute signed titlePart trailer verse: rhyme |
May contain | core: abbr add address bibl cb choice cit corr date del desc email expan foreign gap graphic hi l label lb lg list listBibl measure milestone name note num orig pb q quote ref reg rs sic stage term time title unclear drama: castList gaiji: g namesdates: listPerson listPlace tagdocs: code textstructure: floatingText verse: rhyme character data |
Note | This element should be used for deletion of shorter sequences of text, typically single words or phrases. The <delSpan> element should be used for longer sequences of text, for those containing structural subdivisions, and for those containing overlapping additions and deletions. The text deleted must be at least partially legible in order for the encoder to be able to transcribe it (unless it is restored in a supplied tag). Illegible or lost text within a deletion may be marked using the gap tag to signal that text is present but has not been transcribed, or is no longer visible. Attributes on the gap element may be used to indicate how much text is omitted, the reason for omitting it, etc. If text is not fully legible, the unclear element (available when using the additional tagset for transcription of primary sources) should be used to signal the areas of text which cannot be read with confidence in a similar way. Degrees of uncertainty over what can still be read, or whether a deletion was intended may be indicated by use of the <certainty> element (see 22. Certainty, Precision, and Responsibility). There is a clear distinction in the TEI between del and <surplus> on the one hand and gap or unclear on the other. del indicates a deletion present in the source being transcribed, which states the author's or a later scribe's intent to cancel or remove text. <surplus> indicates material present in the source being transcribed which should have been so deleted, but which is not in fact. gap or unclear, by contrast, signal an editor's or encoder's decision to omit something or their inability to read the source text. See sections 12.3.1.7. Text Omitted from or Supplied in the Transcription and 12.3.3.2. Use of the gap, del, damage, unclear, and supplied Elements in Combination for the relationship between these and other related elements used in detailed transcription. |
Example | <l> <del rend="overtyped">Mein</del> Frisch <del rend="overstrike" type="primary">schwebt</del> weht der Wind </l> |
Example | <del rend="overstrike"> <gap reason="illegible" quantity="5" unit="character"/> </del> |
Content model | <content> |
Schema Declaration | element del { att.global.attributes, att.transcriptional.attributes, att.typed.attributes, att.dimensions.attributes, att.cmc.attributes, macro.paraContent } |
Processing Model | <model behaviour="inline"> |
<desc> (description) contains a short description of the purpose, function, or use of its parent element, or when the parent is a documentation element, describes or defines the object being documented. [23.4.1. Description of Components] | |||||||||||
Module | core | ||||||||||
Attributes |
| ||||||||||
Member of | |||||||||||
Contained by | |||||||||||
May contain | |||||||||||
Note | When used in a specification element such as <elementSpec>, TEI convention requires that this be expressed as a finite clause, beginning with an active verb. | ||||||||||
Example | Example of a desc element inside a documentation element. <dataSpec module="tei" ident="teidata.point"> <desc versionDate="2010-10-17" xml:lang="en">defines the data type used to express a point in cartesian space.</desc> <content> <dataRef name="token" restriction="(-?[0-9]+(\.[0-9]+)?,-?[0-9]+(\.[0-9]+)?)"/> </content> <!-- ... --> </dataSpec> | ||||||||||
Example | Example of a desc element in a non-documentation element. <place xml:id="KERG2"> <placeName>Kerguelen Islands</placeName> <!-- ... --> <terrain> <desc>antarctic tundra</desc> </terrain> <!-- ... --> </place> | ||||||||||
Schematron | A desc with a type of deprecationInfo should only occur when its parent element is being deprecated. Furthermore, it should always occur in an element that is being deprecated when desc is a valid child of that element. <sch:rule context="tei:desc[ @type eq 'deprecationInfo']"> <sch:assert test="../@validUntil">Information about a deprecation should only be present in a specification element that is being deprecated: that is, only an element that has a @validUntil attribute should have a child <desc type="deprecationInfo">.</sch:assert> </sch:rule> | ||||||||||
Content model | <content> | ||||||||||
Schema Declaration | element desc { att.global.attributes, att.typed.attribute.subtype, att.cmc.attributes, attribute type { teidata.enumerated }?, macro.limitedContent } | ||||||||||
Processing Model | <model behaviour="inline"/> |
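The deprecation rule above depends on the parent of a desc element carrying a validUntil attribute. A rough Python equivalent of that check is sketched below, assuming the standard library's ElementTree; `misplaced_deprecation_info` and the sample specification (including its ident values and date) are hypothetical illustrations.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"
DESC = f"{{{TEI_NS}}}desc"


def misplaced_deprecation_info(xml_text):
    """Return desc[@type='deprecationInfo'] elements whose parent has no @validUntil."""
    root = ET.fromstring(xml_text)
    failures = []
    # ElementTree has no parent pointers, so visit each parent and inspect its children.
    # A desc that is itself the root element is not examined by this sketch.
    for parent in root.iter():
        for child in parent:
            if (
                child.tag == DESC
                and child.get("type") == "deprecationInfo"
                and parent.get("validUntil") is None
            ):
                failures.append(child)
    return failures


sample = f"""<schemaSpec xmlns="{TEI_NS}" ident="demo">
  <elementSpec ident="oldElement" validUntil="2030-01-01">
    <desc type="deprecationInfo">Use newElement instead.</desc>
  </elementSpec>
  <elementSpec ident="currentElement">
    <desc type="deprecationInfo">Stray deprecation note.</desc>
    <desc>describes a current element.</desc>
  </elementSpec>
</schemaSpec>"""

for desc in misplaced_deprecation_info(sample):
    print(desc.text)
```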
<distributor> (distributor) supplies the name of a person or other agency responsible for the distribution of a text. [2.2.4. Publication, Distribution, Licensing, etc.] | |
Module | header |
Attributes |
|
Member of | |
Contained by | core: bibl header: publicationStmt |
May contain | |
Example | <distributor>Oxford Text Archive</distributor> <distributor>Redwood and Burn Ltd</distributor> |
Content model | <content> |
Schema Declaration | element distributor { att.global.attributes, att.canonical.attributes, macro.phraseSeq } |
<div> (text division) contains a subdivision of the front, body, or back of a text. [4.1. Divisions of the Body] | |
Module | textstructure |
Attributes |
|
Member of | |
Contained by | |
May contain | |
Example | <body> <div type="part"> <head>Fallacies of Authority</head> <p>The subject of which is Authority in various shapes, and the object, to repress all exercise of the reasoning faculty.</p> <div n="1" type="chapter"> <head>The Nature of Authority</head> <p>With reference to any proposed measures having for their object the greatest happiness of the greatest number [...]</p> <div n="1.1" type="section"> <head>Analysis of Authority</head> <p>What on any given occasion is the legitimate weight or influence to be attached to authority [...] </p> </div> <div n="1.2" type="section"> <head>Appeal to Authority, in What Cases Fallacious.</head> <p>Reference to authority is open to the charge of fallacy when [...] </p> </div> </div> </div> </body> |
Schematron | <sch:rule context="tei:div"> <sch:report test="(ancestor::tei:l or ancestor::tei:lg) and not(ancestor::tei:floatingText)"> Abstract model violation: Lines may not contain higher-level structural elements such as div, unless div is a descendant of floatingText. </sch:report> </sch:rule> |
Schematron | <sch:rule context="tei:div"> <sch:report test="(ancestor::tei:p or ancestor::tei:ab) and not(ancestor::tei:floatingText)"> Abstract model violation: p and ab may not contain higher-level structural elements such as div, unless div is a descendant of floatingText. </sch:report> </sch:rule> |
Content model | <content> |
Schema Declaration | element div { att.global.attributes, att.divLike.attributes, att.typed.attributes, att.written.attributes, ( ( model.divTop | model.global )*, ( ( ( ( ( ( model.divLike | model.divGenLike ), model.global* )+ ) | ( ( ( ( schemaSpec | model.common ), model.global* )+ ), ( ( ( model.divLike | model.divGenLike ), model.global* )* ) ) ), ( ( model.divBottom, model.global* )* ) )? ) ) } |
Processing Model | <model predicate="@type='title_page'" |
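The two abstract-model rules for div share one shape: a div is reported when it has an l, lg, p, or ab ancestor and no floatingText ancestor. The Python sketch below combines both rules into a single pass under that reading; `misplaced_divs` is a hypothetical name, and the sample deliberately contains one invalid div inside a line.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"


def tei(name):
    """Return the namespace-qualified tag that ElementTree uses for a TEI element."""
    return f"{{{TEI_NS}}}{name}"


# Ancestors named in the two Schematron reports.
FORBIDDEN_ANCESTORS = {tei("l"), tei("lg"), tei("p"), tei("ab")}


def misplaced_divs(xml_text):
    """Return div elements inside l, lg, p or ab that have no floatingText ancestor."""
    root = ET.fromstring(xml_text)
    failures = []

    def walk(element, ancestors):
        if element.tag == tei("div"):
            inside_forbidden = any(tag in FORBIDDEN_ANCESTORS for tag in ancestors)
            inside_floating = tei("floatingText") in ancestors
            if inside_forbidden and not inside_floating:
                failures.append(element)
        # Pass the extended ancestor list down to each child.
        for child in element:
            walk(child, ancestors + [element.tag])

    walk(root, [])
    return failures


sample = f"""<body xmlns="{TEI_NS}">
  <div n="outer">
    <p>
      <floatingText><body><div n="embedded"><p>Embedded text</p></div></body></floatingText>
    </p>
    <lg><l><div n="misplaced"/></l></lg>
  </div>
</body>"""

print([div.get("n") for div in misplaced_divs(sample)])
```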
<docAuthor> (document author) contains the name of the author of the document, as given on the title page (often but not always contained in a byline). [4.6. Title Pages] | |
Module | textstructure |
Attributes |
|
Member of | |
Contained by | |
May contain | |
Note | The document author's name often occurs within a byline, but the docAuthor element may be used whether the byline element is used or not. It should be used only for the author(s) of the entire document, not for author(s) of any subset or part of it. (Attributions of authorship of a subset or part of the document, for example of a chapter in a textbook or an article in a newspaper, may be encoded with byline without docAuthor.) |
Example | <titlePage> <docTitle> <titlePart>Travels into Several Remote Nations of the World, in Four Parts.</titlePart> </docTitle> <byline> By <docAuthor>Lemuel Gulliver</docAuthor>, First a Surgeon, and then a Captain of several Ships</byline> </titlePage> |
Content model | <content> |
Schema Declaration | element docAuthor { att.global.attributes, att.canonical.attributes, att.cmc.attributes, macro.phraseSeq } |
Processing Model | <model behaviour="inline"/> |
<docDate> (document date) contains the date of a document, as given on a title page or in a dateline. [4.6. Title Pages] | |
Module | textstructure |
Attributes |
|
Member of | |
Contained by | |
May contain | |
Note | Cf. the general date element in the core tag set. This specialized element is provided for convenience in marking and processing the date of the document, since it is likely to require specialized handling for many applications. It should be used only for the date of the entire document, not for any subset or part of it. |
Example | <docImprint>Oxford, Clarendon Press, <docDate>1987</docDate> </docImprint> |
Content model | <content> |
Schema Declaration | element docDate { att.global.attributes, att.cmc.attributes, att.datable.attributes, att.calendarSystem.attributes, macro.phraseSeq } |
Processing Model | <model behaviour="inline"/> |
<docEdition> (document edition) contains an edition statement as presented on a title page of a document. [4.6. Title Pages] | |
Module | textstructure |
Attributes |
|
Member of | |
Contained by | |
May contain | core: abbr add address bibl cb choice cit corr date del desc email expan foreign gap graphic hi l label lb lg list listBibl measure milestone name note num orig pb q quote ref reg rs sic stage term time title unclear drama: castList gaiji: g namesdates: listPerson listPlace tagdocs: code textstructure: floatingText verse: rhyme character data |
Note | Cf. the edition element of bibliographic citation. As usual, the shorter name has been given to the more frequent element. |
Example | <docEdition>The Third edition Corrected</docEdition> |
Content model | <content> |
Schema Declaration | element docEdition { att.global.attributes, macro.paraContent } |
Processing Model | <model behaviour="inline"/> |
<docImprint> (document imprint) contains the imprint statement (place and date of publication, publisher name), as given (usually) at the foot of a title page. [4.6. Title Pages] | |
Module | textstructure |
Attributes |
|
Member of | |
Contained by | |
May contain | |
Note | Cf. the <imprint> element of bibliographic citations. As with title, author, and editions, the shorter name is reserved for the element likely to be used more often. |
Example | <docImprint>Oxford, Clarendon Press, 1987</docImprint> Imprints may be somewhat more complex: <docImprint> <pubPlace>London</pubPlace> Printed for <name>E. Nutt</name>, at <pubPlace>Royal Exchange</pubPlace>; <name>J. Roberts</name> in <pubPlace>wick-Lane</pubPlace>; <name>A. Dodd</name> without <pubPlace>Temple-Bar</pubPlace>; and <name>J. Graves</name> in <pubPlace>St. James's-street.</pubPlace> <date>1722.</date> </docImprint> |
Content model | <content> |
Schema Declaration | element docImprint { att.global.attributes, ( text | model.gLike | model.phrase | pubPlace | docDate | publisher | model.global )* } |
Processing Model | <model behaviour="inline"/> |
<docTitle> (document title) contains the title of a document, including all its constituents, as given on a title page. [4.6. Title Pages] | |
Module | textstructure |
Attributes |
|
Member of | |
Contained by | |
May contain | |
Example | <docTitle> <titlePart type="main">The DUNCIAD, VARIORVM.</titlePart> <titlePart type="sub">WITH THE PROLEGOMENA of SCRIBLERUS.</titlePart> </docTitle> |
Content model | <content> |
Schema Declaration | element docTitle { att.global.attributes, att.canonical.attributes, ( model.global*, ( ( titlePart, model.global* )+ ) ) } |
Processing Model | <model behaviour="block" |
<edition> (edition) describes the particularities of one edition of a text. [2.2.2. The Edition Statement] | |
Module | header |
Attributes |
|
Member of | |
Contained by | core: bibl header: editionStmt |
May contain | |
Example | <edition>First edition <date>Oct 1990</date> </edition> <edition n="S2">Students' edition</edition> |
Content model | <content> |
Schema Declaration | element edition { att.global.attributes, macro.phraseSeq } |
<editionStmt> (edition statement) groups information relating to one edition of a text. [2.2.2. The Edition Statement 2.2. The File Description] | |
Module | header |
Attributes |
|
Contained by | |
May contain | |
Example | <editionStmt> <edition n="S2">Students' edition</edition> <respStmt> <resp>Adapted by </resp> <name>Elizabeth Kirk</name> </respStmt> </editionStmt> |
Example | <editionStmt> <p>First edition, <date>Michaelmas Term, 1991.</date> </p> </editionStmt> |
Content model | <content> |
Schema Declaration | element editionStmt { att.global.attributes, ( model.pLike+ | ( edition, model.respLike* ) ) } |
<editor> contains a secondary statement of responsibility for a bibliographic item, for example the name of an individual, institution or organization, (or of several such) acting as editor, compiler, translator, etc. [3.12.2.2. Titles, Authors, and Editors] | |||||||||||
Module | core | ||||||||||
Attributes |
| ||||||||||
Member of | |||||||||||
Contained by | core: bibl header: editionStmt seriesStmt titleStmt | ||||||||||
May contain | |||||||||||
Note | A consistent format should be adopted. Particularly where cataloguing is likely to be based on the content of the header, it is advisable to use generally recognized authority lists for the exact form of personal names. | ||||||||||
Example | <editor role="Technical_Editor">Ron Van den Branden</editor> <editor role="Editor-in-Chief">John Walsh</editor> <editor role="Managing_Editor">Anne Baillot</editor> | ||||||||||
Schematron | <sch:rule context="tei:*[@calendar]"> <sch:assert test="string-length( normalize-space(.) ) gt 0"> @calendar indicates one or more systems or calendars to which the date represented by the content of this element belongs, but this <sch:name/> element has no textual content.</sch:assert> </sch:rule> | ||||||||||
Content model | <content> | ||||||||||
Schema Declaration | element editor { att.global.attributes, att.naming.attributes, att.datable.attributes, attribute calendar { list { teidata.pointer+ } }?, macro.phraseSeq } | ||||||||||
Processing Model | <model predicate="ancestor::teiHeader" |
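The Schematron rule listed for editor applies to any TEI element bearing a calendar attribute: such an element must have some textual content. A minimal Python sketch of the same test follows; `empty_calendar_elements` is a hypothetical name, the pointer `#gregorian` is an invented value for illustration, and Python's `strip()` is used as an approximation of XPath's `normalize-space()`.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"
TEI_PREFIX = f"{{{TEI_NS}}}"


def empty_calendar_elements(xml_text):
    """Return TEI elements that carry @calendar but contain no non-whitespace text."""
    root = ET.fromstring(xml_text)
    failures = []
    for element in root.iter():
        if not element.tag.startswith(TEI_PREFIX):
            continue  # the rule's context is tei:*, so other namespaces are skipped
        if element.get("calendar") is None:
            continue
        # itertext() gathers the text of the element and all of its descendants.
        if not "".join(element.itertext()).strip():
            failures.append(element)
    return failures


sample = f"""<titleStmt xmlns="{TEI_NS}">
  <editor calendar="#gregorian">Lou Burnard</editor>
  <editor calendar="#gregorian" role="empty">   </editor>
  <editor/>
</titleStmt>"""

print([e.get("role") for e in empty_calendar_elements(sample)])
```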
<editorialDecl> (editorial practice declaration) provides details of editorial principles and practices applied during the encoding of a text. [2.3.3. The Editorial Practices Declaration 2.3. The Encoding Description 16.3.2. Declarable Elements] | |
Module | header |
Attributes |
|
Member of | |
Contained by | header: encodingDesc |
May contain | |
Example | <encodingDesc> <editorialDecl> <p>EEBO-TCP is a partnership between the Universities of Michigan and Oxford and the publisher ProQuest to create accurately transcribed and encoded texts based on the image sets published by ProQuest via their Early English Books Online (EEBO) database (http://eebo.chadwyck.com). The general aim of EEBO-TCP is to encode one copy (usually the first edition) of every monographic English-language title published between 1473 and 1700 available in EEBO.</p> <p>EEBO-TCP aimed to produce large quantities of textual data within the usual project restraints of time and funding, and therefore chose to create diplomatic transcriptions (as opposed to critical editions) with light-touch, mainly structural encoding based on the Text Encoding Initiative (http://www.tei-c.org).</p> <p>The EEBO-TCP project was divided into two phases. The 25,363 texts created during Phase 1 of the project have been released into the public domain as of 1 January 2015. Anyone can now take and use these texts for their own purposes, but we respectfully request that due credit and attribution is given to their original source.</p> <p>Users should be aware of the process of creating the TCP texts, and therefore of any assumptions that can be made about the data.</p> <p>Text selection was based on the New Cambridge Bibliography of English Literature (NCBEL). If an author (or for an anonymous work, the title) appears in NCBEL, then their works are eligible for inclusion. Selection was intended to range over a wide variety of subject areas, to reflect the true nature of the print record of the period. In general, first editions of works in English were prioritized, although there are a number of works in other languages, notably Latin and Welsh, included and sometimes a second or later edition of a work was chosen if there was a compelling reason to do so.</p> <p>Image sets were sent to external keying companies for transcription and basic encoding. Quality assurance was then carried out by editorial teams in Oxford and Michigan. 5% (or 5 pages, whichever is the greater) of each text was proofread for accuracy and those which did not meet QA standards were returned to the keyers to be redone. After proofreading, the encoding was enhanced and/or corrected and characters marked as illegible were corrected where possible up to a limit of 100 instances per text. Any remaining illegibles were encoded as <gap>s. Understanding these processes should make clear that, while the overall quality of TCP data is very good, some errors will remain and some readable characters will be marked as illegible. Users should bear in mind that in all likelihood such instances will never have been looked at by a TCP editor.</p> <p>The texts were encoded and linked to page images in accordance with level 4 of the TEI in Libraries guidelines.</p> <p>Copies of the texts have been issued variously as SGML (TCP schema; ASCII text with mnemonic sdata character entities); displayable XML (TCP schema; characters represented either as UTF-8 Unicode or text strings within braces); or lossless XML (TEI P5, characters represented either as UTF-8 Unicode or TEI g elements).</p> <p>Keying and markup guidelines are available at the <ref target="http://www.textcreationpartnership.org/docs/.">Text Creation Partnership web site</ref>.</p> </editorialDecl> </encodingDesc> |
Content model | <content> |
Schema Declaration | element editorialDecl { att.global.attributes, ( model.pLike | model.editorialDeclPart )+ } |
<email> (electronic mail address) contains an email address identifying a location to which email messages can be delivered. [3.6.2. Addresses] | |
Module | core |
Attributes |
|
Member of | |
Contained by | analysis: s core: abbr add addrLine author bibl biblScope corr date del desc editor email expan foreign head hi item l label measure name note num orig p pubPlace publisher q quote ref reg resp rs sic speaker stage term time title unclear header: catDesc change classCode creation distributor edition extent language licence rendition tagUsage textstructure: byline closer dateline docAuthor docDate docEdition docImprint imprimatur opener salute signed titlePart trailer verse: rhyme |
May contain | |
Note | The format of a modern Internet email address is defined in RFC 2822 |
Example | <email>membership@tei-c.org</email> |
Content model | <content> |
Schema Declaration | element email { att.global.attributes, att.cmc.attributes, macro.phraseSeq } |
Processing Model | <model behaviour="inline"> |
<encodingDesc> (encoding description) documents the relationship between an electronic text and the source or sources from which it was derived. [2.3. The Encoding Description 2.1.1. The TEI Header and Its Components] | |
Module | header |
Attributes |
|
Member of | |
Contained by | header: teiHeader |
May contain | |
Example | <encodingDesc> <p>Basic encoding, capturing lexical information only. All hyphenation, punctuation, and variant spellings normalized. No formatting or layout information preserved.</p> </encodingDesc> |
Content model | <content> |
Schema Declaration | element encodingDesc { att.global.attributes, ( model.encodingDescPart | model.pLike )+ } |
Processing Model | <model behaviour="omit"/> |
<epigraph> (epigraph) contains a quotation, anonymous or attributed, appearing at the start or end of a section or on a title page. [4.2.3. Arguments, Epigraphs, and Postscripts 4.2. Elements Common to All Divisions 4.6. Title Pages] | |
Module | textstructure |
Attributes |
|
Member of | |
Contained by | |
May contain | |
Example | <epigraph> <bibl>Deut. Chap. 5.</bibl> <q>11 Thou ſhalt not take the name of the Lord thy God in vaine, for the Lord will not hold him guiltleſſe which ſhall take his name in vaine.</q> </epigraph> |
Content model | <content> |
Schema Declaration | element epigraph { att.global.attributes, att.cmc.attributes, ( model.common | model.global )* } |
Processing Model | <model behaviour="block"/> |
<expan> (expansion) contains the expansion of an abbreviation. [3.6.5. Abbreviations and Their Expansions] | |
Module | core |
Attributes |
|
Member of | |
Contained by | core: abbr add addrLine author bibl biblScope choice corr date del desc editor email expan foreign head hi item l label measure name note num orig p pubPlace publisher q quote ref reg resp rs sic speaker stage term time title unclear header: catDesc change classCode creation distributor edition extent language licence rendition tagUsage textstructure: byline closer dateline docAuthor docDate docEdition docImprint imprimatur opener salute signed titlePart trailer verse: rhyme |
May contain | |
Note | The content of this element should be the expanded abbreviation, usually (but not always) a complete word or phrase. The <ex> element provided by the transcr module may be used to mark up sequences of letters supplied within such an expansion. If abbreviations are expanded silently, this practice should be documented in the editorialDecl, either with a <normalization> element or a p. |
Example | The address is Southmoor <choice> <expan>Road</expan> <abbr>Rd</abbr> </choice> |
Example | <choice xml:lang="la"> <abbr>Imp</abbr> <expan>Imp<ex>erator</ex> </expan> </choice> |
Content model | <content> |
Schema Declaration | element expan { att.global.attributes, att.editLike.attributes, att.cmc.attributes, macro.phraseSeq } |
Processing Model | <model behaviour="inline"/> |
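Pairing <abbr> with <expan> inside <choice> lets a processor pick either reading. The following is a minimal sketch of one way to do that, not the behaviour defined by the simplePrint processing model. It assumes Python's standard xml.etree.ElementTree module, a hypothetical helper named expanded_text, and namespace-free input that mirrors the first example above rather than a complete TEI document.

```python
import xml.etree.ElementTree as ET


def expanded_text(elem):
    """Return the text of elem, preferring <expan> over <abbr> in each <choice>.

    expanded_text is a hypothetical helper name used only for this sketch.
    """
    parts = [elem.text or ""]
    for child in elem:
        if child.tag == "choice":
            expan = child.find("expan")
            # Fall back to the abbreviation when no expansion was supplied.
            chosen = expan if expan is not None else child.find("abbr")
            if chosen is not None:
                # itertext() also picks up letters wrapped in <ex>.
                parts.append("".join(chosen.itertext()))
        else:
            parts.append(expanded_text(child))
        # Text that follows the child belongs to the parent element.
        parts.append(child.tail or "")
    return "".join(parts)


sample = ET.fromstring(
    "<p>The address is Southmoor <choice><expan>Road</expan>"
    "<abbr>Rd</abbr></choice></p>"
)
print(expanded_text(sample))
```

Swapping the order of the two lookups would give the diplomatic reading (the abbreviation as printed) instead of the expanded one.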
<extent> (extent) describes the approximate size of a text stored on some carrier medium or of some other object, digital or non-digital, specified in any convenient units. [2.2.3. Type and Extent of File 2.2. The File Description 3.12.2.4. Imprint, Size of a Document, and Reprint Information 11.7.1. Object Description] | |
Module | header |
Attributes |
|
Member of | |
Contained by | |
May contain | |
Example | <extent>3200 sentences</extent> <extent>between 10 and 20 Mb</extent> <extent>ten 3.5 inch high density diskettes</extent> |
Example | The measure element may be used to supply normalized or machine-tractable versions of the size or sizes concerned. <extent> <measure unit="MiB" quantity="4.2">About four megabytes</measure> <measure unit="pages" quantity="245">245 pages of source material</measure> </extent> |
Content model | <content> |
Schema Declaration | element extent { att.global.attributes, macro.phraseSeq } |
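The unit and quantity attributes on <measure> are what make an <extent> usable by software. As an illustration only, the sketch below reads them with Python's standard xml.etree.ElementTree module. The function name extent_measures is hypothetical, the input is namespace-free for brevity, and the sketch assumes every quantity is a plain decimal number.

```python
import xml.etree.ElementTree as ET


def extent_measures(extent):
    """Map each <measure> unit inside an <extent> to its numeric quantity.

    extent_measures is a hypothetical helper; measures lacking either
    attribute are skipped rather than guessed at.
    """
    return {
        m.get("unit"): float(m.get("quantity"))
        for m in extent.iter("measure")
        if m.get("unit") is not None and m.get("quantity") is not None
    }


extent = ET.fromstring(
    "<extent>"
    '<measure unit="MiB" quantity="4.2">About four megabytes</measure>'
    '<measure unit="pages" quantity="245">245 pages of source material</measure>'
    "</extent>"
)
print(extent_measures(extent))
```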
<facsimile> contains a representation of some written source in the form of a set of images rather than as transcribed or encoded text. [12.1. Digital Facsimiles] | |
Module | transcr |
Attributes |
|
Member of | |
Contained by | |
May contain | |
Example | <facsimile> <graphic url="page1.png"/> <surface> <graphic url="page2-highRes.png"/> <graphic url="page2-lowRes.png"/> </surface> <graphic url="page3.png"/> <graphic url="page4.png"/> </facsimile> |
Example | <facsimile> <surface ulx="0" uly="0" lrx="200" lry="300"> <graphic url="Bovelles-49r.png"/> </surface> </facsimile> |
Schematron | <sch:rule context="tei:facsimile//tei:line | tei:facsimile//tei:zone"> <sch:report test="child::text()[ normalize-space(.) ne '']"> A facsimile element represents a text with images, thus transcribed text should not be present within it. </sch:report> </sch:rule> |
Content model | <content> |
Schema Declaration | element facsimile { att.global.attributes, ( front?, ( model.graphicLike | surface | surfaceGrp )+, back? ) } |
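The Schematron rule above reports any <line> or <zone> inside <facsimile> that directly contains non-whitespace text. The same check can be approximated outside a Schematron processor. The sketch below assumes Python's standard xml.etree.ElementTree module and uses hypothetical function names. It treats str.strip() as a stand-in for XPath normalize-space(), which is close but not identical for unusual whitespace characters.

```python
import xml.etree.ElementTree as ET

TEI = "{http://www.tei-c.org/ns/1.0}"


def has_direct_text(elem):
    """True if elem has a non-whitespace text node as a direct child."""
    # In ElementTree, an element's own text nodes are its .text plus the
    # .tail of each of its children.
    chunks = [elem.text] + [child.tail for child in elem]
    return any(chunk and chunk.strip() for chunk in chunks)


def stray_text_in_facsimile(root):
    """Return <line> and <zone> elements inside <facsimile> that hold text.

    Hypothetical helper mirroring the rule's context and test.
    """
    offenders = []
    for facs in root.iter(TEI + "facsimile"):
        for elem in facs.iter():
            if elem.tag in (TEI + "line", TEI + "zone") and has_direct_text(elem):
                offenders.append(elem)
    return offenders


doc = ET.fromstring(
    '<facsimile xmlns="http://www.tei-c.org/ns/1.0">'
    "<surface><zone>stray words</zone>"
    '<zone><graphic url="page1.png"/></zone></surface>'
    "</facsimile>"
)
print(len(stray_text_in_facsimile(doc)))
```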
<figDesc> (description of figure) contains a brief prose description of the appearance or content of a graphic figure, for use when documenting an image without displaying it. [15.4. Specific Elements for Graphic Images] | |
Module | figures |
Attributes |
|
Contained by | figures: figure |
May contain | |
Note | This element is intended for use as an alternative to the content of its parent figure element; for example, to display when the image is required but the equipment in use cannot display graphic images. It may also be used for indexing or documentary purposes. |
Example | <figure> <graphic url="emblem1.png"/> <head>Emblemi d'Amore</head> <figDesc>A pair of naked winged cupids, each holding a flaming torch, in a rural setting.</figDesc> </figure> |
Content model | <content> |
Schema Declaration | element figDesc { att.global.attributes, macro.limitedContent } |
Processing Model | <model behaviour="inline"> |
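The note on <figDesc> describes it as a textual stand-in for a figure whose image cannot be shown. The sketch below shows one way a processor might retrieve that fallback text. It assumes Python's standard xml.etree.ElementTree module, namespace-free input modelled on the example above, and a hypothetical function name, figure_fallback_text.

```python
import xml.etree.ElementTree as ET


def figure_fallback_text(figure):
    """Return the whitespace-normalized <figDesc> text of a figure, or None.

    figure_fallback_text is a hypothetical helper for this sketch only.
    """
    desc = figure.find("figDesc")
    if desc is None:
        return None
    # Collapse runs of whitespace so the result suits an alt-text slot.
    text = " ".join("".join(desc.itertext()).split())
    return text or None


figure = ET.fromstring(
    '<figure><graphic url="emblem1.png"/><head>Emblemi d\'Amore</head>'
    "<figDesc>A pair of naked winged cupids, each holding a flaming torch, "
    "in a rural setting.</figDesc></figure>"
)
print(figure_fallback_text(figure))
```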
<figure> (figure) groups elements representing or containing graphic information such as an illustration, formula, or figure. [15.4. Specific Elements for Graphic Images] | |
Module | figures |
Attributes |
|
Member of | |
Contained by | core: abbr add addrLine address author bibl biblScope cit corr date del editor email expan foreign head hi item l label lg list measure name note num orig p pubPlace publisher q quote ref reg resp rs sic sp speaker stage term time title unclear namesdates: person textstructure: argument back body byline closer dateline div docAuthor docDate docEdition docImprint docTitle epigraph floatingText front group imprimatur opener postscript salute signed text titlePage titlePart trailer verse: rhyme |
May contain | |
Example | <figure> <head>The View from the Bridge</head> <figDesc>A Whistleresque view showing four or five sailing boats in the foreground, and a series of buoys strung out between them.</figDesc> <graphic url="http://www.example.org/fig1.png" scale="0.5"/> </figure> |
Content model | <content> |
Schema Declaration | element figure { att.global.attributes, att.placement.attributes, att.typed.attributes, att.written.attributes, att.cmc.attributes, ( model.headLike | model.common | figDesc | model.graphicLike | model.global | model.divBottom )* } |
Processing Model | <model predicate="head or @rendition='simple:display'" |
<fileDesc> (file description) contains a full bibliographic description of an electronic file. [2.2. The File Description 2.1.1. The TEI Header and Its Components] | |
Module | header |
Attributes |
|
Contained by | |
May contain | header: editionStmt extent notesStmt |