<?xml version="1.0"?>
<TEI xmlns="http://www.tei-c.org/ns/1.0" xmlns:rng="http://relaxng.org/ns/structure/1.0" xml:lang="en">
  <teiHeader>
    <fileDesc>
      <titleStmt>
        <title>Encoding for Interchange: an introduction to the TEI</title>
      </titleStmt>
      <publicationStmt>
	<availability>
	  <p><hi>Copyright 2010 TEI Consortium.</hi></p>
	  <p>This is free software; you can redistribute it and/or
	    modify it under the terms of the GNU General Public
	    License as published by the Free Software Foundation;
	    either version 2 of the License, or (at your option) any
	    later version.</p>
	  <p>This material is distributed in the hope that it will be
	    useful, but <emph>without any warranty</emph>; without even the implied
	    warranty of <emph>merchantability</emph> or 
	    <emph>fitness for a particular
	    purpose.</emph> See the GNU General Public License for more
	    details.</p>
	  <p>A copy of the GNU General Public
	    License is stored on the TEI web site
	    along with this file; you can also contact the Free
	    Software Foundation, Inc., 59 Temple Place, Suite 330,
	    Boston, MA 02111-1307, USA, for a copy.</p>
	</availability>
      </publicationStmt>
      <sourceDesc>
        <bibl>TEI U5 (derived from TEI U1: An
	Introduction to TEI Tagging (derived from TEI ED W21: Living with the
	Guidelines))</bibl>
      </sourceDesc>
    </fileDesc>
    <revisionDesc>
      <change when="2008-02-01" who="SPQR">remove some 
      unreachable elements, and remove @rendition (since it's not
      usable without <gi>tagsDecl</gi>)</change>
      <change when="2007-01-25" who="LB">restored quote (under protest)
</change>
      <change when="2006-12-06" who="LB">fix titlePage problem
</change>
      <change when="2006-02-08" who="LB">
more substantive changes: add elements from tagdocs module; kill
numbered divs
</change>
      <change when="2006-01-29" who="LB">
first cut conversion to P5
</change>
      <change when="2004-10" who="SPQR">
reformat, clean up, remove list of translations
      </change>
      <change when="2002-08-07" who="LB">
Correct blunder in Gifford example
      </change>
      <change when="2002-05-18" who="LB">
First pass  for P4/XML revision
      </change>
      <change when="2001-01-21" who="LB">
Added IDs to div1s and checked links
      </change>
      <change when="2001-01-04" who="SPQR">
Remove TOC, fix a couple of links, change preface to div1 not div.
 Add stylesheet PI, make parse
      </change>
      <change when="2000-06-21" who="LB">
Add preface; modify links
      </change>
      <change when="1995-09-09" who="CMSMcQ">
fix Oxford links
      </change>
      <change when="1995-06-08" who="CMSMcQ">
install on TEI web server (changing DTD subset slightly)
      </change>
      <change when="1995-06-07" who="CMSMcQ">
Bring TeX and Script spelling corrections, etc. into SGML form.
      </change>
      <change when="1995-06-03" who="CMSMcQ">
Spellcheck, final (! ha!) changes, format, and print.  Many changes made only in
TeX and Script versions.
      </change>
      <change when="1995-05-30" who="LB">
Last (ha!) pass. Cut down intro section. Moved divgen again.  Revised
interp and index sections extensively and generally
hacked.      </change>
      <change when="1995-05-25" who="CMSMcQ">
changes as agreed with LB at ExCommittee meeting: interp section,
rev. editorial tags, add def of TEI Lite, add section on Making It
Work with software, resettle divGen and index, begin continuous pass
through working from LB's notes
      </change>
      <change when="1995-05-15" who="CMSMcQ">
begin last push prior to publication
      </change>
      <change when="1994-12-01" who="LB">
retagged using TEI Lite
      </change>
      <change when="1994-06-23" who="LB">
change to use ODD-style tagdescs
      </change>
      <change when="1993-07-20" who="CMSMcQ">
made file from old ED W21
      </change>
    </revisionDesc>
  </teiHeader>
  <text>
    <front>
      <titlePage>
        <docTitle>
          <titlePart type="main">TEI Lite: 
Encoding for Interchange: an introduction to the TEI
</titlePart>
          <titlePart type="sub">Revised for TEI P5 release</titlePart>
        </docTitle>
        <docAuthor>Lou Burnard</docAuthor>
        <docAuthor>C. M. Sperberg-McQueen</docAuthor>
        <docDate>February 2006</docDate>
      </titlePage>
      <div xml:id="U5-pref">
        <head>Prefatory note</head>
<p>TEI Lite was the name adopted for what the TEI editors originally
conceived of as a simple demonstration of how the TEI encoding scheme
might be adopted to meet 90% of the needs of 90% of the TEI user
community. In retrospect, it was predictable that many people should
imagine TEI Lite to be all there is to TEI, or find TEI Lite to be far
too heavy for their needs.</p>

<p>The original TEI Lite was based largely on  observations of existing and
previous practice in the encoding of texts, particularly as manifest
in the collections of the <ref target="http://ota.ahds.ac.uk">Oxford
Text Archive</ref> and in our own experience. It is therefore
unsurprising that it seems to have become, if not a de facto standard,
at least a common point of departure for electronic text centres and
encoding projects worldwide. Perhaps the fact that we actually produced
this shortish, readable manual for it also helped.</p>

<p>Early adopters of TEI Lite included a number of
<soCalled>Electronic Text Centers</soCalled>, many of which produced
their own documentation and tutorial materials (some examples are
listed in <ref target="http://www.tei-c.org/Tutorials">the TEI
Tutorials pages</ref>). It was also widely adopted as the basis for
TEI-conformant authoring systems. Documentation introducing TEI Lite
has been much used for tutorial purposes and has been translated into
many languages (see the list of versions at <ptr
target="http://www.tei-c.org/Lite/"/>).</p>

<p>TEI P4, the XML version of the TEI Guidelines, uses the generation
of TEI Lite as an example of the modification mechanism built into the
Guidelines. With its publication, the opportunity was taken to produce
a lightly revised XML-conformant version of TEI Lite, but the
present revision is the first substantively changed version since its
first appearance in 1997. This revision takes advantage of the many
new features introduced into the TEI Guidelines at release P5. A brief
list of those changes likely to affect users of previous versions of
this document is given below (<ptr target="#changes"/>). </p>
        <trailer>Lou Burnard, February 2006</trailer>
      </div>
    </front>

<body><p>This document provides an introduction to the recommendations
of the Text Encoding Initiative (TEI), by describing a specific subset
of the full TEI encoding scheme. The scheme documented here can be
used to encode a wide variety of commonly encountered textual
features, in such a way as to maximize the usability of electronic
transcriptions and to facilitate their interchange among scholars
using different computer systems. It is fully compatible with the full
TEI scheme, as defined by TEI document P5, <title>Guidelines for
Electronic Text Encoding and Interchange</title>, as of February 2006,
and available from the TEI Consortium website at <ptr
target="http://www.tei-c.org"/>.
</p>

<div xml:id="U5-Intro">
        <head>Introduction</head>

<p>The Text Encoding Initiative (TEI) Guidelines are addressed to
anyone who wants to interchange information stored in an electronic
form. They emphasize the interchange of textual information, but other
forms of information such as images and sound are also addressed. The
Guidelines are equally applicable in the creation of new resources and
in the interchange of existing ones.</p>
        <p>The Guidelines provide a means of making explicit certain
        features of a text in such a way as to aid the processing of
        that text by computer programs running on different
        machines. This process of making explicit we call
        <term>markup</term> or <term>encoding</term>.  Any textual
        representation on a computer uses some form of markup; the TEI
        came into being partly because of the enormous variety of
        mutually incomprehensible encoding schemes currently besetting
        scholarship, and partly because of the expanding range of
        scholarly uses now being identified for texts in electronic
        form.</p>
        <p>The TEI Guidelines describe an encoding scheme which can be
        expressed using a number of different formal languages. The
        first editions of the Guidelines used the <term>Standard
        Generalized Markup Language</term> (SGML); since 2002, this
        has been replaced by the use of the Extensible Markup Language
        (XML). These markup languages have in common the definition of
        text in terms of <term>elements</term> and
        <term>attributes</term>, and rules governing their appearance
        within a text. The TEI's use of XML is ambitious in its
        complexity and generality, but it is fundamentally no
        different from that of any other XML markup scheme, and so any
        general-purpose XML-aware software is able to process
        TEI-conformant texts.</p>
        <p>The TEI was sponsored by the Association for Computers and
        the Humanities, the Association for Computational Linguistics,
        and the Association for Literary and Linguistic Computing, and
        is now maintained and developed by an independent membership
        consortium, hosted by four major Universities. Funding has
        been provided in part from the U.S. National Endowment for the
        Humanities, Directorate General XIII of the Commission of the
        European Communities, the Andrew W. Mellon Foundation, and the
        Social Science and Humanities Research Council of Canada. The
        Guidelines were first published in May 1994, after six years
        of development involving many hundreds of scholars from
        different academic disciplines worldwide. During the years
        that followed, the Guidelines were increasingly influential in
        the development of the digital library, in the language
        industries, and even in the development of the World Wide Web
        itself. The TEI Consortium was set up in January 2001, and a
        year later produced an edition of the
        Guidelines entirely revised for XML
        compatibility. In 2004, it set about a major revision of the
	Guidelines to take full advantage of new schema
	languages, the first release of which appeared in 2005. This
	revision of the TEI Lite manual conforms to version 0.3 of
	this most recent edition of the Guidelines, TEI P5.</p>
        <p>At the outset of its work, the overall goals of the TEI
        were defined by the closing statement of a planning conference
        held at Vassar College, N.Y., in November, 1987; these
        <soCalled>Poughkeepsie Principles</soCalled> were further
        elaborated in a series of design documents.  The Guidelines,
        say these design documents, should:

<list>
<item>suffice to represent the textual features needed for
research;</item>
<item>be simple, clear, and concrete;</item>
<item>be easy for researchers to use without special-purpose
software;</item>
<item>allow the rigorous definition and efficient processing of
texts;</item>
<item>provide for user-defined extensions;</item>
<item>conform to existing and emergent standards.</item>
</list></p>
        <p>The world of scholarship is large and diverse. For the Guidelines
to have wide acceptability, it was important to ensure that:
<list type="ordered">
<item>the common core of textual features should be easily shared;</item>
<item>additional specialist features should be easy to add to (or remove
from) a text;</item>
<item>multiple parallel encodings of the same feature should be
possible;</item>
<item>the richness of markup should be user-defined, with a very
small minimal requirement;</item>
<item>adequate documentation of the text and its encoding should be
provided.</item>
</list></p>
        <p>The present document describes a manageable selection from the
extensive set of elements and recommendations resulting from those
design goals, which is called <title>TEI Lite</title>.</p>
        <p>In selecting from the  several hundred elements defined by
the full TEI scheme, we have tried to identify a useful <soCalled>starter
set</soCalled>, comprising the elements which almost every user should
know about.  Experience working with TEI Lite will be invaluable in
understanding the full TEI scheme and in knowing how to integrate
specialized parts of it into the general TEI framework.</p>
        <p>Our goals in defining this subset may be summarized as follows:
<list type="simple">
<item>it should be able to handle adequately a reasonably wide variety
of texts, at the level of detail found in existing practice (as
demonstrated in, for example, the holdings of the Oxford Text
Archive);</item>
<item>it should be useful for the production of new documents (such as
this one) as well as the encoding of existing texts;</item>
<item>it should be usable with a wide range of existing XML
software;</item>
<item>it should be derivable from the full TEI scheme using the
extension mechanisms described in the TEI Guidelines;</item>
<item>it should be as small and simple as is consistent with the
other goals.</item>
</list> </p>
        <p>Readers may judge for themselves how far we have succeeded in
meeting these goals. At the time of first writing (1995), our confidence that we
had at least partially done so was borne out by the use of TEI Lite in practice
for the encoding of real texts.  The Oxford Text Archive uses TEI Lite
when it translates texts from its holdings from their original markup
schemes into SGML; the Electronic Text Centers at the University of
Virginia and the University of Michigan have used TEI Lite to encode
their holdings. And the Text Encoding Initiative itself uses TEI Lite
in its current technical documentation, including this
document.
       </p>
        <p>Although we have tried to make this document self-contained, as
suits a tutorial text, the reader should be aware that it does not
cover every detail of the TEI encoding scheme. All of the elements
described here are fully documented in the TEI Guidelines themselves,
which should be consulted for authoritative reference information on
these, and on the many others which are not described here.  Some
basic knowledge of XML is assumed.</p>
      </div>
      <div xml:id="U5-eg">
        <head>A Short Example</head>
        <p>We begin with a short example, intended to show what happens when
a passage of prose is typed into a computer by someone with little
sense of the purpose of markup, or the potential of electronic texts.
In an ideal world, such output might be generated by a very accurate
optical scanner.  It attempts to be faithful to the appearance of the
printed text, by retaining the original line breaks, by introducing
blanks to represent the layout of the original headings and page
breaks, and so forth. Where characters not available on the keyboard
are needed (such as the accented letter <mentioned>a</mentioned> in
<mentioned>faàl</mentioned> or the long dash), it attempts to
mimic their appearance.</p>
        <p>
          <eg>                          CHAPTER 38

READER, I married him. A quiet wedding we had: he and I, the par-
son and clerk, were alone present. When we got back from church, I
went into the kitchen of the manor-house, where Mary was cooking
the dinner, and John cleaning the knives, and I said --
  'Mary, I have been married to Mr Rochester this morning.' The
housekeeper and her husband were of that decent, phlegmatic
order of people, to whom one may at any time safely communicate a
remarkable piece of news without incurring the danger of having
one's ears pierced by some shrill ejaculation and subsequently stunned
by a torrent of wordy wonderment. Mary did look up, and she did
stare at me; the ladle with which she was basting a pair of chickens
roasting at the fire, did for some three minutes hang suspended in air,
and for the same space of time John's knives also had rest from the
polishing process; but Mary, bending again over the roast, said only --
   'Have you, miss? Well, for sure!'
   A short time after she pursued, 'I seed you go out with the master,
but I didn't know you were gone to church to be wed'; and she
basted away. John, when I turned to him, was grinning from ear to
ear.
   'I telled Mary how it would be,' he said: 'I knew what Mr Ed-
ward' (John was an old servant, and had known his master when he
was the cadet of the house, therefore he often gave him his Christian
name) -- 'I knew what Mr Edward would do; and I was certain he
would not wait long either: and he's done right, for aught I know. I
wish you joy, miss!' and he politely pulled his forelock.
   'Thank you, John. Mr Rochester told me to give you and Mary
this.'
   I put into his hand a five-pound note.  Without waiting to hear
more, I left the kitchen. In passing the door of that sanctum some time
after, I caught the words --
   'She'll happen do better for him nor ony o' t' grand ladies.' And
again, 'If she ben't one o' th' handsomest, she's noan faa\l, and varry
good-natured; and i' his een she's fair beautiful, onybody may see
that.'
   I wrote to Moor House and to Cambridge immediately, to say what
I had done: fully explaining also why I had thus acted. Diana and

                            474

                 JANE EYRE                      475

Mary approved the step unreservedly. Diana announced that she
would just give me time to get over the honeymoon, and then she
would come and see me.
   'She had better not wait till then, Jane,' said Mr Rochester, when I
read her letter to him; 'if she does, she will be too late, for our honey-
moon will shine our life long: its beams will only fade over your
grave or mine.'
   How St John received the news I don't know: he never answered
the letter in which I communicated it: yet six months after he wrote
to me, without, however, mentioning Mr Rochester's name or allud-
ing to my marriage. His letter was then calm, and though very serious,
kind. He has maintained a regular, though not very frequent correspond-
ence ever since: he hopes I am happy, and trusts I am not of those who
live without God in the world, and only mind earthly
things.</eg>
        </p>
        <p>This transcription suffers from a number of shortcomings:
<list>
<item>the page numbers and running titles are intermingled with the
text in a way which makes it difficult for software to disentangle
them;</item>
<item>no distinction is made between single quotation marks and
apostrophes, so it is difficult to know exactly which passages are in
direct speech;</item>
<item>the preservation of the copy text's hyphenation means that
simple-minded search programs will not find the broken words;</item>
<item>the accented letter in <mentioned>fa&#xE0;l</mentioned> and
the long dash have been rendered by ad hoc keying conventions which
follow no standard pattern and will be processed correctly only if the
transcriber remembers to mention them in the documentation;</item>
<item>paragraph divisions are marked only by the use of white space,
and hard carriage returns have been introduced at the end of each
line. Consequently, if the size of type used to print the text
changes, reformatting will be problematic.</item>
</list></p>
        <p>We now present the same passage, as it might be encoded using the
TEI Guidelines. As we shall see, there are many ways in which this
encoding could be extended, but as a minimum, the TEI approach allows
us to represent the following distinctions:
<list>
<item>Paragraph and chapter divisions are now marked explicitly.</item>
<item>Apostrophes are distinguished from quotation marks; direct
speech is explicitly marked.</item>
<item>The accented letter and the long dash are correctly represented.</item>
<item>Page divisions have been marked with an empty <gi>pb</gi>
element alone.</item>
<item>The lineation of the
original has not been retained and words broken by typographic
accident at the end of a line have been re-assembled without comment.</item>
<item>For convenience of proof reading, a new line has been
introduced at the start of each paragraph, but the indentation is
removed.</item>
</list>

<egXML xmlns="http://www.tei-c.org/ns/Examples">
<pb n='474'/>
<div type="chapter" n='38'>

<p>Reader, I married him.  A quiet wedding we had: he and I,
the parson and clerk, were alone present.  When we got back
from church, I went into the kitchen of the manor-house,
where Mary was cooking the dinner, and John cleaning the
knives, and I said —</p>

<p><q>Mary, I have been married to Mr Rochester this
morning.</q> The housekeeper and her husband were of that
decent, phlegmatic order of people, to whom one may at any
time safely communicate a remarkable piece of news without
incurring the danger of having one's ears pierced by some
shrill ejaculation and subsequently stunned by a torrent of
wordy wonderment.  Mary did look up, and she did stare at
me; the ladle with which she was basting a pair of chickens
roasting at the fire, did for some three minutes hang
suspended in air, and for the same space of time John's
knives also had rest from the polishing process; but Mary,
bending again over the roast, said only —</p>

<p><q>Have you, miss? Well, for sure!</q></p>

<p>A short time after she pursued, <q>I seed you go out with
the master, but I didn't know you were gone to church to be
wed</q>; and she basted away.  John, when I turned to him,
was grinning from ear to ear.  <q>I telled Mary how it would
be,</q> he said: <q>I knew what Mr Edward</q> (John was an
old servant, and had known his master when he was the cadet
of the house, therefore he often gave him his Christian
name) — <q>I knew what Mr Edward would do; and I was
certain he would not wait long either: and he's done right,
for aught I know.  I wish you joy, miss!</q> and he politely
pulled his forelock.</p>

<p><q>Thank you, John.  Mr Rochester told me to give you and
Mary this.</q></p>

<p>I put into his hand a five-pound note.  Without waiting
to hear more, I left the kitchen.  In passing the door of
that sanctum some time after, I caught the words —</p>

<p><q>She'll happen do better for him nor ony o' t' grand
ladies.</q> And again, <q>If she ben't one o' th'
handsomest, she's noan faàl, and varry good-natured;
and i' his een she's fair beautiful, onybody may see
that.</q></p>

<p>I wrote to Moor House and to Cambridge immediately, to
say what I had done: fully explaining also why I had thus
acted.  Diana and <pb n='475'/> Mary approved the step
unreservedly.  Diana announced that she would just give me
time to get over the honeymoon, and then she would come and
see me.</p>

<p><q>She had better not wait till then, Jane,</q> said Mr
Rochester, when I read her letter to him; <q>if she does,
she will be too late, for our honeymoon will shine our life
long: its beams will only fade over your grave or mine.</q></p>

<p>How St John received the news I don't know: he never
answered the letter in which I communicated it: yet six
months after he wrote to me, without, however, mentioning Mr
Rochester's name or alluding to my marriage.  His letter was
then calm, and though very serious, kind.  He has maintained
a regular, though not very frequent correspondence ever
since: he hopes I am happy, and trusts I am not of those who
live without God in the world, and only mind earthly things.</p></div></egXML></p>
        <p>This particular encoding represents a set of choices or priorities.
The decision to focus on Brontë's text, rather than on the
printing of it in this particular edition, is an instance of the
fundamental <emph>selectivity</emph> of any encoding. An encoding makes
explicit only those textual features of importance to the encoder.  It
is not difficult to think of ways in which the encoding of even this
short passage might readily be extended. For example:
<list type="simple">
<item>a regularized form of the passages in dialect could be
provided;</item>
<item>footnotes glossing or commenting on any passage could be
added;</item>
<item>pointers linking parts of this text to others could be
added;</item>
<item>proper names of various kinds could be distinguished from the
surrounding text;</item>
<item>detailed bibliographic information about the text's provenance
and context could be prefixed to it;</item>
<item>a linguistic analysis of the passage into sentences, clauses,
words, etc., could be provided, each unit being associated with
appropriate category codes;</item>
<item>the text could be segmented into narrative or discourse
units;</item>
<item>systematic analysis or interpretation of the text could be
included in the encoding, with potentially complex alignment or
linkage between the text and the analysis, or between the text and one
or more translations of it;</item>
<item>passages in the text could be linked to images or sound held on
other media.</item>
</list></p>
        <p>A TEI-recommended way of carrying out most of these is described
in the remainder of this document. The TEI scheme as a whole also
provides for an enormous range of other possibilities, of which we
cite only a few:
<list type="simple">
<item>detailed analysis of the components of names;</item>
<item>detailed meta-information providing thesaurus-style information
about the text's origins or topics;</item>
<item>information about the printing history or manuscript variations
exhibited by a particular series of versions of the text.</item>
</list> For recommendations on these and many other possibilities, the
full Guidelines should be consulted.</p>
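        <p>As a minimal sketch of how two of the extensions listed above
might look, one short paragraph of the passage is repeated here with
proper names distinguished and an explanatory note added. The
<gi>name</gi> and <gi>note</gi> elements are defined by the TEI
Guidelines; the <att>type</att> values and the wording of the note are
assumptions made for this illustration only:
<egXML xmlns="http://www.tei-c.org/ns/Examples">
<p><q>Thank you, <name type="person">John</name>.
<name type="person">Mr Rochester</name> told me to give you and
<name type="person">Mary</name> this.</q>
<!-- illustrative note: its wording and type value are hypothetical -->
<note type="editorial">John and Mary are the servants introduced at
the start of the chapter.</note></p>
</egXML></p>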
      </div>
      <div xml:id="U5-struc">
        <head>The Structure of a TEI Text</head>
        <p>All TEI-conformant texts contain (a) a <term>TEI header</term>
(marked up as a <gi>teiHeader</gi> element) and (b) the transcription
of the text proper (marked up as a <gi>text</gi> element). These two
elements are combined to form a single <gi>TEI</gi> element.
</p>
        <p>The TEI header provides information analogous to that provided by
the title page of a printed text.  It has up to four parts: a
bibliographic description of the machine-readable text, a description
of the way it has been encoded, a non-bibliographic description of the
text (a <term>text profile</term>), and a revision history. The
header is described in more detail in section <ptr target="#U5-header"/>.</p>
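        <p>The four parts of the header can be sketched as follows. The
element names shown (<gi>fileDesc</gi>, <gi>encodingDesc</gi>,
<gi>profileDesc</gi>, and <gi>revisionDesc</gi>) are those defined by
the TEI Guidelines; the bracketed comments are placeholders rather than
real content, and only the first part is required in every header:
<egXML xmlns="http://www.tei-c.org/ns/Examples">
<teiHeader>
  <fileDesc><!-- [ bibliographic description of the machine-readable text ] --></fileDesc>
  <encodingDesc><!-- [ description of the way the text has been encoded ] --></encodingDesc>
  <profileDesc><!-- [ text profile: non-bibliographic description ] --></profileDesc>
  <revisionDesc><!-- [ revision history ] --></revisionDesc>
</teiHeader>
</egXML></p>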
        <p>A TEI text may be <term>unitary</term> (a single work) or
<term>composite</term> (a collection of single works, such as an
anthology). In either case, the text may have optional <term>front</term>
or <term>back</term> matter. In between is the <term>body</term> of the
text, which, in the case of a composite text, may consist of
<term>group</term>s, each containing more groups or texts.</p>
        <p>A unitary text will be encoded using an overall structure like
this:
<egXML xmlns="http://www.tei-c.org/ns/Examples">
<TEI>
  <teiHeader><!-- [ TEI Header information ] --></teiHeader>
  <text>
    <front><!-- [ front matter ... ] --></front>
    <body><!-- [ body of text ... ] --></body>
    <back><!-- [ back matter ... ] --></back>
  </text>
</TEI>
</egXML></p>
        <p>A composite text also has an optional front and back. In between
occur one or more groups of texts, each with its own optional front
and back matter. A composite text will thus be encoded using an
overall structure like this:
<egXML xmlns="http://www.tei-c.org/ns/Examples">
<TEI>
  <teiHeader><!--[ header information for the composite ]--></teiHeader>
  <text>
    <front><!--[ front matter for the composite ]--></front>
    <group>
      <text>
        <front><!--[ front matter of first text ]--></front>
        <body><!--[ body of first text ]--></body>
        <back><!--[ back matter of first text ]--></back>
      </text>
      <text>
        <front><!--[ front matter of second text ]--></front>
        <body><!--[ body of second text ]--></body>
        <back><!--[ back matter of second text ]--></back>
      </text>
      <!--[ more texts or groups of texts here ]-->
    </group>
    <back><!--[ back matter for the composite ]--></back>
  </text>
</TEI>
</egXML></p>
        <p>It is also possible to define a composite of TEI texts, each with
its own header. Such a collection is known as a <term>TEI corpus</term>,
and may itself have a header:
<egXML xmlns="http://www.tei-c.org/ns/Examples">
<teiCorpus>
  <teiHeader><!--[header information for the corpus]--></teiHeader>
  <TEI>
    <teiHeader><!--[header information for first text]--></teiHeader>
    <text><!--[first text in corpus]--></text>
  </TEI>
  <TEI>
    <teiHeader><!--[header information for second text]--></teiHeader>
    <text><!--[second text in corpus]--></text>
  </TEI>
</teiCorpus>
</egXML>
It is not, however, possible to create a composite of corpora (that
is, a number of <gi>teiCorpus</gi> elements combined
and treated as a single object). This is a restriction of the current
version of the TEI Guidelines.</p>
        <p>In the remainder of this document, we discuss chiefly simple text
structures. The discussion in each case consists of a short list of
relevant TEI <term>elements</term> with a brief definition of each,
followed by definitions for any <term>attributes</term> specific to
that element, and a reference to any <term>classes</term> of which the
element is a member. These references are linked to full
specifications for each object, as given in the TEI
<title>Guidelines</title>. In most cases, short examples are also given.</p>
        <p>For example, here are the elements discussed so far:
<specList><specDesc key="TEI"/><specDesc key="teiHeader"/><specDesc key="text"/></specList></p>
      </div>
      <div xml:id="U5-body">
        <head>Encoding the Body</head>
        <p>As indicated above, a simple TEI document at the textual level
consists of the following elements:

<specList><specDesc key="front"/><specDesc key="group"/><specDesc key="body"/><specDesc key="back"/></specList>


Elements specific to front and back matter are described
below in section <ptr target="#U5-fronbac"/>. In this section we discuss
the elements making up the body of a text. </p>
        <div xml:id="divs">
          <head>Text Division Elements</head>
          <p>The body of a prose text may be just a series of paragraphs, or
these paragraphs may be grouped together into chapters, sections,
subsections, etc. Each paragraph is tagged using
the <gi>p</gi> tag. The <gi>div</gi> element is used to represent any
such grouping of paragraphs. 
<specList><specDesc key="p"/><specDesc key="div"/></specList>
  </p>
          <p>The <att>type</att> attribute on the <gi>div</gi> element may be
used to supply a conventional name for a category of text division,
or otherwise to distinguish one kind of division from another.  Typical
values might be <q>book</q>,
<q>chapter</q>, <q>section</q>, <q>part</q>, <q>poem</q>, <q>song</q>,
etc.  For a given project, it will usually be advisable to define and
adhere to a specific list of such values.  </p>
          <p>A <gi>div</gi> element may itself contain further, nested,
<gi>div</gi>s, thus mimicking the traditional structure of a book,
which can be decomposed hierarchically into units such as parts,
containing chapters, containing sections, and so on. TEI texts in general
conform to this simple hierarchic model.</p>
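          <p>A minimal sketch of such a hierarchy, assuming a work divided
into parts, chapters, and sections, might look like this. The
<att>type</att> values are examples of the kind a project might choose,
not a fixed list, and the bracketed comments stand in for real text:
<egXML xmlns="http://www.tei-c.org/ns/Examples">
<body>
  <div type="part">
    <div type="chapter">
      <div type="section">
        <p><!-- [ paragraphs of the first section ] --></p>
      </div>
      <div type="section">
        <p><!-- [ paragraphs of the second section ] --></p>
      </div>
    </div>
    <!-- [ further chapters of this part ] -->
  </div>
  <!-- [ further parts ] -->
</body>
</egXML></p>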
          <p> The <att>xml:id</att> attribute may be used to supply a unique
identifier for the division, which may be used for cross references or
other links to it, such as a commentary, as further discussed in
section <ptr target="#U5-ptrs"/>. It is often useful to provide an
<att>xml:id</att> attribute for every major structural unit in a
text, and to derive its values in some systematic way, for example
by appending a section number to a short code for the title of the
work in question, as in the examples below.</p>
          <p>The <att>n</att> attribute may be used to supply (additionally or
alternatively) a short mnemonic name
or number for the division.  If a conventional form of reference or
abbreviation for the parts of a work already exists (such as the
book/chapter/verse pattern of Biblical citations), the <att>n</att>
attribute is the place to record it.</p>
          <p>The <att>xml:lang</att> attribute may be used to specify the
language of the division.  Languages are identified by an
internationally defined code, as further discussed in section <ptr target="#z636"/> below.</p>
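          <p>For instance, a division written in French within an otherwise
English text might be marked as in the following sketch. The
<att>type</att> value and the elided content are purely illustrative;
the code <val>fr</val> is one of those listed in section <ptr target="#z636"/>:
<egXML xmlns="http://www.tei-c.org/ns/Examples"><div type="chapter" xml:lang="fr"><p><!-- French text of the chapter --></p></div></egXML></p>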
          <p>The <att>rend</att> attribute may be used to supply information
about the rendition (appearance) of a division, or any other element,
as further discussed in section <ptr target="#U5-hilites"/> below. As with the
<att>type</att> attribute, a project will often find it useful to
predefine the possible values for this attribute, but TEI Lite does
not constrain it in any way. </p>
          <p> These four attributes, <att>xml:id</att>, <att>n</att>,
<att>xml:lang</att>, and <att>rend</att> are so widely useful that
they are allowed on any element in any TEI schema: they are
<term>global attributes</term>.  Other global attributes defined in
the TEI Lite scheme are discussed in section <ptr target="#xatts"/>.</p>
          <p>The value of every <att>xml:id</att> attribute should be unique
within a document. One simple way of ensuring that this is so is to
make it reflect the hierarchic structure of the document. For example,
Smith's <title>Wealth of Nations</title> as first published consists
of five books, each of which is divided into chapters, while some
chapters are further subdivided into parts. We might define
<att>xml:id</att> values for this structure as follows:

<egXML xmlns="http://www.tei-c.org/ns/Examples"><body><div xml:id="WN1" n="I" type="book"><div xml:id="WN101" n="I.1" type="chapter">
   <!-- ... --> </div><div xml:id="WN102" n="I.2" type="chapter">
   <!-- ... --> </div>
   <!-- ... -->
  <div xml:id="WN110" n="I.10" type="chapter"><div xml:id="WN1101" n="I.10.1" type="part">
      <!-- ... --> </div><div xml:id="WN1102" n="I.10.2" type="part">
      <!-- ... --> </div></div>
  <!-- ... -->
 </div><div xml:id="WN2" n="II" type="book">
   <!-- ... -->
</div></body></egXML></p>
          <p>A different numbering scheme may be used for <att>xml:id</att> and
<att>n</att> attributes: this is often useful where a canonical
reference scheme is used which does not tally with the structure of
the work.  For example, in a novel divided into books each containing
chapters, where the chapters are numbered sequentially through the
whole work, rather than within each book, one might use a scheme such
as the following:

<egXML xmlns="http://www.tei-c.org/ns/Examples"><body><div xml:id="TS01" n="1" type="Volume"><div xml:id="TS011" n="1" type="Chapter">
      <!-- ... --> </div><div xml:id="TS012" n="2" type="Chapter">
      <!-- ... --></div></div><div xml:id="TS02" n="2" type="Volume"><div xml:id="TS021" n="3" type="Chapter">
      <!-- ... --></div><div xml:id="TS022" n="4" type="Chapter">
      <!-- ... --></div></div></body></egXML>
Here the work has two volumes, each containing two chapters.
The chapters are numbered conventionally 1 to 4, but the <att>xml:id</att>
values specified allow them to be regarded additionally as if they
were numbered 1.1, 1.2, 2.1, 2.2.</p>
        </div>
        <div xml:id="h25">
          <head>Headings and Closings</head>
          <p>Every <gi>div</gi>  may
have a title or heading at its start, and (less commonly) a closing
such as <q>End of Chapter 1</q>.  The
following elements may be used to transcribe them:

<specList><specDesc key="head"/><specDesc key="trailer"/></specList>

Some other elements which may be necessary at the beginning or ending
of text divisions are discussed below in section <ptr target="#h52"/>.</p>
          <p>Whether or not headings and trailers are included in a
transcription is a matter for the individual transcriber to decide.
Where a heading is completely regular (for example <q>Chapter 1</q>)
or may be automatically constructed from attribute values
(e.g. <gi>div type="Chapter" n="1"</gi>), it may be omitted; where it
contains otherwise unrecoverable text it should always be included.
For example, the start of Hardy's <title>Under the Greenwood
Tree</title> might be encoded as follows:

<egXML xmlns="http://www.tei-c.org/ns/Examples"><div xml:id="UGT1" n="Winter" type="Part"><div xml:id="UGT11" n="1" type="Chapter"><head>Mellstock-Lane</head><p>To dwellers in a wood almost every species of tree ... 
</p></div></div></egXML></p>
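          <p>A closing may be transcribed in the same way. The following
sketch assumes, purely for illustration, a chapter which ends with the
words <q>End of Chapter 1</q>; the identifier, the heading, and the
elided content are hypothetical:
<egXML xmlns="http://www.tei-c.org/ns/Examples"><div xml:id="X1" n="1" type="Chapter"><head>Chapter 1</head><p><!-- ... --></p><trailer>End of Chapter 1</trailer></div></egXML></p>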
        </div>
        <div xml:id="vedr">
          <head>Prose, Verse and Drama</head>
          <p>As noted above, the paragraphs making up a textual division should
be tagged with the <gi>p</gi> tag. For example:
<egXML xmlns="http://www.tei-c.org/ns/Examples"><p>I fully appreciate Gen. Pope's splendid achievements
with their invaluable results; but you must know that
Major Generalships in the Regular Army, are not as
plenty as blackberries.
</p></egXML>
<!-- Is this quote right? shouldn't it be "plentIFUL"? -->
 <!-- A. Lincoln to Richard Yates and William Butler, 10 Apr 1862,
 --><!--  Library of America, Lincoln, v. 2 p. 315.
 --></p>
          <p>A number of different tags are provided for the encoding of the
structural components of verse and performance texts (drama, film,
etc.):
<specList><specDesc key="l"/><specDesc key="lg"/><specDesc key="sp"/><specDesc key="speaker"/><specDesc key="stage"/></specList>


</p>
          <p>Here, for example, is the start of a poetic text in which verse
lines and stanzas are tagged:
<egXML xmlns="http://www.tei-c.org/ns/Examples"><lg n="I"><l>I Sing the progresse of a
   deathlesse soule,</l><l>Whom Fate, which God made,
  but doth not controule,</l><l>Plac'd in most shapes; all times
  before the law</l><l>Yoak'd us, and when, and since,
  in this I sing.</l><l>And the great world to his aged evening;</l><l>From infant morne, through manly noone I draw.</l><l>What the gold Chaldee, or silver Persian saw,</l><l>Greeke brass, or Roman iron, is in this one;</l><l>A worke t'out weare Seths pillars, bricke and stone,</l><l>And (holy writs excepted) made to yeeld to none,</l></lg></egXML></p>
          <p>Note that the <gi>l</gi> element marks verse lines, not typographic
lines: the original lineation of the first few lines above has not
therefore been made explicit by this encoding, and may be lost. The
<gi>lb</gi> element described in section <ptr target="#U5-pln"/> may be
used to mark typographic lines if so desired.</p>
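          <p>As a sketch of this, and assuming that the line break shown
within the first verse line of the example above reproduces the
typographic lineation of the source, that line might be encoded as
follows:
<egXML xmlns="http://www.tei-c.org/ns/Examples"><l>I Sing the progresse of a<lb/>
   deathlesse soule,</l></egXML></p>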
          <p>Sometimes, particularly in dramatic texts, verse lines are split
between speakers. The easiest way of encoding this is to use the
<att>part</att> attribute to indicate that the lines so
fragmented are incomplete, as in this example:
<egXML xmlns="http://www.tei-c.org/ns/Examples"><div type="Act" n="I"><head>ACT I</head><div type="Scene" n="1"><head>SCENE I</head><stage rend="italic">
Enter Barnardo and Francisco, two Sentinels, at several doors</stage><sp><speaker>Barn</speaker><l part="Y">Who's there?</l></sp><sp><speaker>Fran</speaker><l>Nay, answer me. Stand and unfold 
  yourself.</l></sp><sp><speaker>Barn</speaker><l part="I">Long live the King!</l></sp><sp><speaker>Fran</speaker><l part="M">Barnardo?</l></sp><sp><speaker>Barn</speaker><l part="F">He.</l></sp><sp><speaker>Fran</speaker><l>You come most carefully upon 
  your hour.</l></sp><!-- ... --> </div></div></egXML></p>
          <p>The same mechanism may be applied to stanzas which are divided
between two speakers:
<egXML xmlns="http://www.tei-c.org/ns/Examples"><div><sp><speaker>First Voice</speaker><lg type="stanza" part="I"><l>But why drives on that ship so fast</l><l>Withouten wave or wind?</l></lg></sp><sp><speaker>Second Voice</speaker><lg part="F"><l>The air is cut away before,</l><l>And closes from behind.</l></lg></sp><!-- ... --> </div></egXML> </p>
          <p>The following example shows how dialogue presented in a prose work as if it
were drama should be encoded. It also demonstrates the use of the
<att>who</att> attribute to bear a code identifying the speaker
of the piece of dialogue concerned:
<egXML xmlns="http://www.tei-c.org/ns/Examples"><div><sp who="OPI"><speaker>The Reverend Doctor Opimian</speaker><p>I do not think I have named a single unpresentable fish.</p></sp><sp who="GRM"><speaker>Mr Gryll</speaker><p>Bream, Doctor: there is not much to be said for bream.</p></sp><sp who="OPI"><speaker>The Reverend Doctor Opimian</speaker><p>On the contrary, sir, I think there is much to be said for him.
 In the first place....</p><p>Fish, Miss Gryll -- I could discourse to you on fish by
the hour:  but for the present I will forbear.</p></sp></div></egXML></p>
        </div>
      </div>
      <div xml:id="U5-pln">
        <head>Page and Line Numbers</head>
        <p>Page and line breaks may be marked with the following empty
elements.
<specList><specDesc key="pb"/><specDesc key="lb"/><specDesc key="milestone"/></specList>

These elements mark a single point in the text, not a span
of text. The global <att>n</att> attribute should be used to
supply the number of the page or line beginning at the tag. 
</p>
        <p>When working from a paginated original, it is often useful to
record its pagination, if only to simplify later proof-reading.
Recording the line breaks may be useful for the same reason; treatment
of end-of-line hyphenation in printed source texts will require some
consideration.</p>
        <p>If pagination, etc., are marked for more than one edition, specify
the edition in question using the <att>ed</att> attribute, and
supply as many tags as are necessary. For example, in the following
passage we indicate where the page breaks occur in two different
editions (<val>ED1</val> and <val>ED2</val>):
<egXML xmlns="http://www.tei-c.org/ns/Examples"><p>I wrote to Moor House and to Cambridge immediately, to
say what I had done: fully explaining also why I had thus
acted.  Diana and <pb ed="ED1" n="475"/> Mary approved the
step unreservedly.  Diana announced that she would
<pb ed="ED2" n="485"/>just give me time to get over the
honeymoon, and then she would come and see me.</p></egXML></p>
        <p>The <gi>pb</gi> and <gi>lb</gi> elements are special cases of
the general class of <term>milestone</term> elements which mark
reference points within a text.  TEI Lite also includes a generic
<gi>milestone</gi> element, which is not restricted to special cases
but can  mark any kind of reference point:  for example, a column
break, the start of a new kind of section not otherwise tagged, or in
general any significant change in the text not marked by an
XML element. The names used for types of unit and for editions referred to by the
<att>ed</att> and <att>unit</att> attributes may be chosen freely, but
should be documented in the header. The <gi>milestone</gi> element may
be used to replace the others, or the others may be used as a set;
they should not be mixed arbitrarily.</p>
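        <p>For example, the start of the second column of a page might be
marked as in the following sketch. The edition code <val>ED1</val>, the
unit name <val>column</val>, and the numbering are illustrative choices
rather than values prescribed by TEI Lite, and would need to be
documented in the header:
<egXML xmlns="http://www.tei-c.org/ns/Examples"><p> ... <milestone ed="ED1" unit="column" n="2"/> ... </p></egXML></p>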
      </div>
      <div xml:id="U5-hilites">
        <head>Marking Highlighted Phrases</head>
        <div xml:id="faces">
          <head>Changes of Typeface, etc.</head>
          <p>Highlighted words or phrases are those made visibly different from
the rest of the text, typically by a change of type font, handwriting
style, ink colour etc., which is intended to draw the reader's attention to
some associated change.</p>
          <p>The global <att>rend</att> attribute can be attached to any
element, and used wherever necessary to specify details of the
highlighting used for it. For example, a heading rendered in bold
might be tagged <gi>head rend="bold"</gi>, and one in
italic <gi>head rend="italic"</gi>.</p>
          <p>It is not always possible or desirable to interpret the reasons
for such changes of rendering in a text.  In such cases, the element
<gi>hi</gi> may be used to mark a sequence of  highlighted text
without making any claim as to its status.
<specList><specDesc key="hi"/></specList></p>

          <p>In the following example, the use of a distinct typeface for the
subheading and for the included name is recorded but not interpreted:
<egXML xmlns="http://www.tei-c.org/ns/Examples"><p><hi rend="gothic">And this Indenture further witnesseth</hi>
that the said <hi rend="italic">Walter Shandy</hi>, merchant,
in consideration of the said intended marriage ...</p></egXML></p>
          <p>Alternatively, where the cause for the highlighting can be
identified with confidence, a number of other, more specific, elements
are available.
<specList><specDesc key="emph"/><specDesc key="foreign"/><specDesc key="gloss"/><specDesc key="label"/><specDesc key="mentioned"/><specDesc key="term"/><specDesc key="title"/></specList></p>

          <p>Some features (notably quotations and glosses) may be found in a
text either marked by highlighting, or with quotation marks.  In
either case, the elements <gi>q</gi> and <gi>gloss</gi> (as
discussed in the following section) should be used. If the rendition
is to be recorded, use the global <att>rend</att> attribute.</p>
          <p>As an example of the elements defined here, consider the following
sentence:
<q rend="display">On the one hand the <title>Nibelungenlied</title>
is associated with the new rise of romance of twelfth-century France,
the <hi rend="italic">romans d'antiquité</hi>, the romances of Chrétien
de Troyes, and the German adaptations of these works by Heinrich van
Veldeke, Hartmann von Aue, and Wolfram von Eschenbach.</q> If the role of the highlighting is interpreted, the encoded sentence might
look like this:
<egXML xmlns="http://www.tei-c.org/ns/Examples"><p>On the one hand the <title>Nibelungenlied</title> is associated
with the new rise of romance of twelfth-century France, the
<foreign>romans d'antiquité</foreign>, the romances of
Chrétien de Troyes, ...</p></egXML>
If only the appearance of the original is described, it might look
like this:
<egXML xmlns="http://www.tei-c.org/ns/Examples"><p>On the one hand the <hi rend="italic">Nibelungenlied</hi>
is associated with the new rise of romance of twelfth-century
France, the <hi rend="italic">romans
d'antiquité</hi>, the romances of
Chrétien de Troyes, ...</p></egXML></p>
        </div>
        <div xml:id="z635">
          <head>Quotations and Related Features</head>
          <p>Like changes of typeface, quotation marks are conventionally used
to denote several different features within a text, of which the most
frequent is quotation.  When possible, we recommend that the
underlying feature be tagged, rather than the simple fact that
quotation marks appear in the text, using the following elements:

<specList><specDesc key="q"/><specDesc key="quote"/>
<specDesc key="mentioned"/><specDesc key="soCalled"/><specDesc key="gloss"/></specList>
  </p>

          <p>Here is a simple example of a quotation:
<egXML xmlns="http://www.tei-c.org/ns/Examples"><p>Few dictionary makers are likely to forget
Dr. Johnson's description of the
lexicographer as <q>a harmless drudge.</q></p></egXML> </p>
          <p>To record how a quotation was printed (for example,
<term>in-line</term> or set off as a <term>display</term> or
<term>block quotation</term>), the <att>rend</att> attribute
should be used. This may also be used to indicate the kind of
quotation marks used.</p>
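          <p>Supposing, purely for illustration, that the quotation in the
previous example had been set off from the surrounding text, this might
be recorded as in the following sketch. The value <val>display</val> is
only one possible convention, which a project would need to document:
<egXML xmlns="http://www.tei-c.org/ns/Examples"><p>Few dictionary makers are likely to forget
Dr. Johnson's description of the
lexicographer as <q rend="display">a harmless drudge.</q></p></egXML></p>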
          <p>Direct speech interrupted by a narrator can be represented simply
by ending the quotation and beginning it again after the interruption,
as in the following example:
<egXML xmlns="http://www.tei-c.org/ns/Examples"><p><q>Who-e debel you?</q> — he at last said — <q>you
no speak-e, damme, I kill-e.</q>  And so saying, the lighted
tomahawk began flourishing about me in the dark.</p></egXML>
If it is important to convey the idea that the two <gi>q</gi>
elements together make up a single speech, the linking attributes
<att>next</att> and
<att>prev</att> may be used, as described in section <ptr target="#xatts"/>.</p>
          <p>Quotations may be accompanied by a reference to the source or
speaker, using the <att>who</att> attribute, whether or not the
source is given in the text, as in the following example:
<egXML xmlns="http://www.tei-c.org/ns/Examples"><q who="Wilson">Spaulding, he came down into the office just this
day eight weeks with this very paper in his hand, and he
says:—<q who="Spaulding">I wish to the Lord, Mr. Wilson, that
I was a red-headed man.</q></q></egXML>
 This example also demonstrates how quotations may be embedded
within other quotations: one speaker (Wilson) quotes another speaker
(Spaulding).</p>
          <p>The creator of the electronic text must decide whether quotation
marks are replaced by the tags or whether the tags are added and the
quotation marks kept. If the quotation marks are removed from the
text, the <att>rend</att> attribute may be used to record the way
in which they were rendered in the copy text.</p>
          <p>As with highlighting, it is not always possible and may not be
considered desirable to interpret the function of quotation marks in a
text in this way.  In such cases, the tag <gi>hi rend="quoted"</gi>
might be used to mark quoted text without making any claim as to its
status.</p>
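          <p>A minimal sketch of this more cautious approach, reusing the
words of an earlier example, might look like this:
<egXML xmlns="http://www.tei-c.org/ns/Examples"><p>... the
lexicographer as <hi rend="quoted">a harmless drudge.</hi></p></egXML></p>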
        </div>
        <div xml:id="z636">
          <head>Foreign Words or Expressions</head>
          <p>Words or phrases which are not in the main language of the text
may be tagged as such in one of two ways. If the word or phrase is
already tagged for some reason, the element indicated should bear a
value for the global <att>xml:lang</att> attribute indicating the
language used. Where there is no applicable element, the element
<gi>foreign</gi> may be used, again using the <att>xml:lang</att>
attribute.  For example:
<egXML xmlns="http://www.tei-c.org/ns/Examples"><p>John has real
<foreign xml:lang="fra">savoir-faire</foreign>.</p><p>Have you read <title xml:lang="deu">Die Dreigroschenoper</title>?</p><p><mentioned xml:lang="fra">Savoir-faire</mentioned> is French for
know-how.</p><p>The court issued a writ of <term xml:lang="lat">mandamus</term>.</p></egXML></p>
          <p>As these examples show, the <gi>foreign</gi> element should not
be used to tag foreign words if some other more specific element such
as <gi>title</gi>, <gi>mentioned</gi>, or <gi>term</gi> applies.
The global <att>xml:lang</att> attribute may be attached to any
element to show that it uses some other language than that of the
surrounding text.</p>
          <p>The codes used to identify languages, supplied on the
<att>xml:lang</att> attribute, must be constructed in a particular
way, and must conform to common Internet standards<note place="foot">The relevant standards are RFC 3066 and the lists of two-
and three-letter language identifiers maintained as part of ISO 639 (see http://www.w3.org/WAI/ER/IG/ert/iso639.htm)</note>, as further
explained in the relevant section of the TEI Guidelines. Some simple example
codes for a few languages are given here:
<table><row><cell>zh or zho</cell><cell>Chinese</cell><cell>grc</cell><cell>Ancient Greek</cell></row><row><cell>en</cell><cell>English</cell><cell>ell or el</cell><cell>Greek</cell></row><row><cell>enm</cell><cell>Middle English</cell><cell>ja or jpn</cell><cell>Japanese</cell></row><row><cell>fr or fra</cell><cell>French</cell><cell>la or lat</cell><cell>Latin</cell></row><row><cell>de or deu</cell><cell>German</cell><cell>sa or san</cell><cell>Sanskrit</cell></row></table>
</p>
        </div>
      </div>
      <div xml:id="U5-notes">
        <head>Notes</head>
        <p>All notes, whether printed as footnotes, endnotes, marginalia, or
elsewhere, should be marked using the same element:
<specList><specDesc key="note"/></specList>

Where possible, the body of a note should be inserted in the
text at the point at which its identifier or mark first appears. This
may not be possible, for example, with marginalia, which may not be
anchored to an exact location.  For simplicity, it may be adequate to
position marginal notes before the relevant paragraph or other
element.  Notes may also be placed in a separate division of the text
(as end-notes are, in printed books) and linked to the relevant
portion of the text using their <att>target</att> attribute.</p>
        <p>The <att>n</att> attribute may be used to supply the number
or identifier of a note if this is required.  The <att>resp</att>
attribute should be used consistently to distinguish between authorial
and editorial notes, if the work has both kinds; otherwise, the TEI
header should state which kind they are.</p>
        <p>Examples:
<egXML xmlns="http://www.tei-c.org/ns/Examples"><p>Collections are ensembles of distinct
entities or objects of any sort.
<note place="foot" n="1">
We explain below why we use the uncommon term
<mentioned>collection</mentioned>
instead of the expected <mentioned>set</mentioned>.
Our usage corresponds to the <mentioned>aggregate</mentioned>
of many mathematical writings and to the sense of
<mentioned>class</mentioned> found
in older logical writings.
</note>
The elements ...</p></egXML>
<egXML xmlns="http://www.tei-c.org/ns/Examples"><lg xml:id="RAM609"><note place="margin">The curse is finally expiated</note><l>And now this spell was snapt: once more</l><l>I viewed the ocean green,</l><l>And looked far forth, yet little saw</l><l>Of what had else been seen —</l></lg></egXML></p>
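        <p>Where notes are instead collected in a separate division, the
arrangement might look like the following sketch. The identifiers
<val>N1</val> and <val>N1ref</val>, the <att>type</att> value
<val>notes</val>, and the <att>resp</att> value (which assumes that the
editor is identified elsewhere in the document by an element with the
identifier <val>ED</val>) are hypothetical, chosen only for illustration:
<egXML xmlns="http://www.tei-c.org/ns/Examples"><div><p>Collections are ensembles of distinct
entities or objects of any sort.<anchor xml:id="N1ref"/>
The elements ...</p></div>
<div type="notes"><note xml:id="N1" n="1" target="#N1ref" resp="#ED"><!-- text of an editorial note --></note></div></egXML></p>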
      </div>
      <div xml:id="U5-ptrs">
        <head>Cross References and Links</head>
        <p>Explicit cross references or links from one point in a text to
another in the same or  another document may be encoded using the elements
described in this section.   Implicit links (such as
the association between two parallel texts, or that between a text and
its interpretation) may be encoded using the linking attributes
discussed in section <ptr target="#xatts"/>.</p>
        <div xml:id="ptrs">
          <head>Simple Cross References</head>
          <p>A cross reference from one point within a single document to
another can be encoded using either of the following elements:

<specList><specDesc key="ref"/><specDesc key="ptr"/></specList>

</p>
          <p>The difference between these two elements is that <gi>ptr</gi> is
an empty element, simply marking a point from which a link is to be
made, whereas <gi>ref</gi> may contain some text as well —
typically the text of the cross-reference itself. The <gi>ptr</gi>
element would be used for a cross reference which is to be indicated by
some non-verbal means such as a symbol or icon, or in an electronic
text by a button. It is also useful in document production systems,
where the formatter can generate the correct verbal form of the cross
reference.</p>
          <p>The following two forms, for example, are logically equivalent
(assuming we have documented somewhere the exact verbal form of cross
references represented by <gi>ptr</gi> elements):
<egXML xmlns="http://www.tei-c.org/ns/Examples">See especially <ref target="#SEC12">section 12 on page
       34</ref>.</egXML>
<egXML xmlns="http://www.tei-c.org/ns/Examples">See especially <ptr target="#SEC12"/>.</egXML>  The value of the
      <att>target</att> attribute must have been used as the 
identifier of some other element within the current document.  This implies that the
passage or phrase being pointed at must bear an identifier, and must
therefore be tagged as an element of some kind. In the following
example, the cross reference is to a
<gi>div</gi> element:
<egXML xmlns="http://www.tei-c.org/ns/Examples">    ...
    see especially <ptr target="#SEC12"/>.
    ...
    <div xml:id="SEC12"><head>Concerning Identifiers</head>
     <!-- ... --></div></egXML>
 </p>
          <p>Because the <att>xml:id</att> attribute is global, any element in
a document may be pointed to in this way. In the following example, a
paragraph has been given an identifier so that it may be pointed at:
<egXML xmlns="http://www.tei-c.org/ns/Examples">    ...
    this is discussed in <ref target="#pspec">the paragraph on links</ref>
    ...
    <p xml:id="pspec">Links may be made to any kind of element
    ...</p></egXML></p>

<p>Sometimes the target of a cross reference  does not correspond
with any particular feature of a text, and so may not be tagged as an
element of some kind. If the desired target is simply a point in the
current document, the easiest way to mark it is by introducing an
<gi>anchor</gi> element at the appropriate spot. If the target is
some sequence of words not otherwise tagged, the <gi>seg</gi> element
may be introduced to mark them. These two elements are described as
follows:
<specList><specDesc key="anchor"/><specDesc key="seg"/></specList>

</p>
<p>In the following (imaginary) example, <gi>ref</gi> elements have
been used to represent points in this text which are to be linked in
some way to other parts of it; in the first case to a point, and in
the second, to a sequence of words:
<egXML xmlns="http://www.tei-c.org/ns/Examples">  Returning to <ref target="#ABCD">the point where I dozed
  off</ref>, I noticed that <ref target="#EFGH">three
  words</ref> had been circled in red by a previous reader</egXML></p>
          <p>This encoding requires that elements with the specified
identifiers (<val>ABCD</val> and <val>EFGH</val> in this
example) exist somewhere else in the current document.
Assuming that no element already exists to carry these identifiers,
the <gi>anchor</gi> and
<gi>seg</gi> elements may be used:
<egXML xmlns="http://www.tei-c.org/ns/Examples">  .... <anchor type="bookmark" xml:id="ABCD"/> ....
   ....<seg type="target" xml:id="EFGH"> ... </seg> ...</egXML></p>
          <p>The <att>type</att> attribute should be used (as above) to
distinguish amongst different purposes for which these general purpose
elements might be used in a text. Some other uses are  discussed in
section <ptr target="#xatts"/> below.</p>
        </div>
        <div xml:id="xptrs">
          <head>Pointing to other documents</head>
<p>So far, we have shown how the elements <gi>ptr</gi> and
<gi>ref</gi> may be used for cross-references or links whose targets
occur within the same document as their source. However, the same
elements may also be used to refer to elements in any other XML
document or resource, such as a document on the web, or a database
component.  This is possible because the value of the
<att>target</att> attribute may be any valid <term>uniform resource
identifier</term> (URI). A full definition of this term, defined by the
W3C (the consortium which manages the development and maintenance of
the World Wide Web), is beyond the scope of this tutorial: however,
the most frequently encountered version of a URI is the familiar
<soCalled>URL</soCalled> used to indicate a web page, such as
<code>http://www.tei-c.org/index.xml</code>. </p>

          <p>A URL may reference a web page or just a part of one, for example
<code>http://www.tei-c.org/index.xml#SEC2</code>.  The sharp sign
indicates that what follows it is the identifier of an element to be
located within the XML document identified by what precedes it: this
example will therefore locate an element which has an
<att>xml:id</att> attribute value of <val>SEC2</val> within the
document retrieved from <code>http://www.tei-c.org/index.xml</code>.
In the examples we have discussed so far, the part to the left of the
sharp sign has been omitted: this is understood to mean that the
referenced element is to be located within the current document.</p>
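          <p>A cross reference to that element from some other document
might therefore be encoded in either of the following ways; the wording
of the reference is illustrative only:
<egXML xmlns="http://www.tei-c.org/ns/Examples">See <ref target="http://www.tei-c.org/index.xml#SEC2">the relevant section</ref>.</egXML>
<egXML xmlns="http://www.tei-c.org/ns/Examples">See <ptr target="http://www.tei-c.org/index.xml#SEC2"/>.</egXML></p>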
          
<p>Within a URL, parts of an XML document can be specified by means of
other more sophisticated mechanisms, using a special language called
XPath, also defined by the W3C.  This is particularly useful where the
elements to be linked to do not bear identifiers and must therefore be
located by some other means.  A full specification of the language is
well beyond the scope of this document; here we provide only a flavour
of its power. </p>
          <p>In the XPath language, locations are defined as a series of
<term>steps</term>, each one identifying some part of the document,
often in terms of the locations identified by the previous step.  For
example, you would point to the third sentence of the second paragraph
of chapter two by selecting chapter two in the first step, the second
paragraph in the second step, and the third sentence in the last step.
A step can be defined in terms of the document tree itself, using such
concepts as <val>parent</val>, <val>descendant</val>,
<val>preceding</val>, etc. or, more loosely, in terms of text
patterns, word or character positions. </p>
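          <p>As a rough sketch only, the location just described might be
expressed by an XPath such as the following. It assumes that chapters
are encoded as <gi>div</gi> elements with a <att>type</att> of
<val>chapter</val> and an <att>n</att> of <val>2</val>, that paragraphs
are direct children of the chapter, and that sentences have been tagged
with an <gi>s</gi> element; namespaces are ignored for simplicity:
<eg>//div[@type='chapter'][@n='2']/p[2]/s[3]</eg>
Reading from left to right, the expression selects the chapter, then
its second paragraph, then the third sentence within that paragraph.</p>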

        </div>
        <div xml:id="xatts">
          <head>Special kinds of Linking</head>
          <p>The following special purpose <term>linking</term> attributes are
defined for every element in the TEI Lite scheme:
<list type="gloss"><label><att>ana</att></label><item>links an element with its interpretation.</item><label><att>corresp</att></label><item>links an element with one or more other corresponding elements.</item><label><att>next</att></label><item>links an element to the next element in an aggregate.</item><label><att>prev</att></label><item>links an element to the previous element in an aggregate.</item></list></p>
          <p>The <att>ana</att> (analysis) attribute is intended for use
where a set of abstract analyses or interpretations have been defined
somewhere within a document, as further discussed in section
<ptr target="#U5-anal"/>. For example, a linguistic analysis of the sentence
<q>John loves Nancy</q> might be encoded as follows:
<egXML xmlns="http://www.tei-c.org/ns/Examples"><seg type="sentence" ana="#SVO"><seg type="lex" ana="#NP1">John</seg><seg type="lex" ana="#VVI">loves</seg><seg type="lex" ana="#NP1">Nancy</seg></seg></egXML>  This encoding implies the existence elsewhere in the
document of elements with identifiers <val>SVO</val>, <val>NP1</val>,
and
<val>VVI</val> where the significance of these particular codes
is explained. Note the use of the <gi>seg</gi> element to mark
particular components of the analysis, distinguished by the <att>type</att>
attribute.</p>
          <p>The <att>corresp</att> (corresponding) attribute provides a
simple way of representing some form of correspondence between two
elements in a text. For example, in a multilingual text, it may be
used to link translation equivalents, as in the following example
<egXML xmlns="http://www.tei-c.org/ns/Examples"><seg xml:lang="fra" xml:id="FR1" corresp="#EN1">Jean aime Nancy</seg><seg xml:lang="en" xml:id="EN1" corresp="#FR1">John loves Nancy</seg></egXML></p>
          <p>The same mechanism may be used for a variety of purposes. In the
following example, it has been used to represent anaphoric
correspondences between <q rend="inline" type="inline">the show</q>
and <q>Shirley</q>, and between
<q>NBC</q> and <q>the network</q>:
<egXML xmlns="http://www.tei-c.org/ns/Examples"><p><title xml:id="shirley">Shirley</title>, which made
its Friday night debut only a month ago, was
not listed on <name xml:id="nbc">NBC</name>'s new schedule,
although <seg xml:id="network" corresp="#nbc">the network</seg>
says <seg xml:id="show" corresp="#shirley">the show</seg>
still is being considered.</p></egXML></p>
          <p>The <att>next</att> and <att>prev</att> attributes
provide a simple way of linking together the components of a
discontinuous  element, as in the following example:
<egXML xmlns="http://www.tei-c.org/ns/Examples"><q xml:id="Q1a" next="#Q1b">Who-e debel you?</q>
— he at last said — <q xml:id="Q1b" prev="#Q1a">you no speak-e,
damme, I kill-e.</q>  And so saying,
the lighted tomahawk began flourishing
about me in the dark.</egXML></p>
        </div>
      </div>
      <div xml:id="U5-edit1">
        <head>Editorial  Interventions</head>
        <p>The process of encoding an electronic text has much in common with
the process of editing a manuscript or other text for printed
publication. In either  case a conscientious editor may  wish to record
both the original state of the source and any editorial correction or
other change made in it. The elements discussed in this and the next
section provide some facilities for meeting these needs.</p>
        <div>
          <head>Correction and Normalization</head>
          <p>The following elements may be used to mark
<term>correction</term>, that is editorial changes introduced where
the editor believes the original to be erroneous:
<specList><specDesc key="corr"/><specDesc key="sic"/></specList>
</p>
          <p>The following elements may be used to mark
<term>normalization</term>, that is editorial changes introduced for
the sake of consistency or modernization of a text:
<specList><specDesc key="orig"/><specDesc key="reg"/></specList></p>

          <p>As an example, consider this extract from the quarto printing of
Shakespeare's <title>Henry V</title>. 
<eg> ... for his nose was as sharp as a pen and a table of green
feelds</eg>
</p>
          <p>A modern editor might wish to make a number of interventions here,
specifically to modernize (or normalize) the Elizabethan spellings
<mentioned>a</mentioned> and
<mentioned>feelds</mentioned> for <mentioned>he</mentioned> and
<mentioned>fields</mentioned> respectively. He or she might also want
to emend  <mentioned>table</mentioned> to
<mentioned>babbl'd</mentioned>, following an editorial tradition that
goes back to the eighteenth-century Shakespearean scholar Theobald. The
following encoding would then be appropriate:

<egXML xmlns="http://www.tei-c.org/ns/Examples">... for his nose was as sharp as a pen and <reg>he</reg>
 <corr resp="#Theobald">babbl'd</corr> of green
<reg>fields</reg></egXML></p>
          <p>A more conservative or source-oriented editor, however, might want
to retain the original, but at the same time signal that some
of the readings it contains are in some sense anomalous:
<egXML xmlns="http://www.tei-c.org/ns/Examples">... for his nose was as sharp as a pen and <orig>a</orig>
 <sic>table</sic> of green
<orig>feelds</orig></egXML></p>
          <p>Finally, a modern digital editor may decide to combine both
possibilities in a single composite text, using the <gi>choice</gi>
element.

<specList><specDesc key="choice"/></specList>


This allows an editor to mark where alternative readings are possible:
<egXML xmlns="http://www.tei-c.org/ns/Examples">... for his nose was
as sharp as a pen and 
<choice><orig>a</orig><reg>he</reg></choice>
<choice><corr resp="#Theobald">babbl'd</corr><sic>table</sic></choice>
 of green
<choice><orig>feelds</orig><reg>fields</reg></choice>
</egXML></p>
        </div>
        <div xml:id="U5-edit2">
          <head>Omissions, Deletions, and  Additions</head>
          <p>In addition to correcting or normalizing words and phrases,
editors and transcribers may also supply missing material, omit
material, or transcribe material deleted or crossed out in the source.
In addition, some material may be particularly hard to transcribe
because it is hard to make out on the page.  The following elements
may be used to record such phenomena:
<specList><specDesc key="add"/><specDesc key="gap"/><specDesc key="del"/><specDesc key="unclear"/></specList>
</p>

          <p>These elements may be used to record changes made by an editor, by
the transcriber, or (in manuscript material) by the author or scribe.
For example, if the source for an electronic text read

<eg>The following elements are provided for for simple editorial
interventions.</eg> then it might be felt desirable to correct the
obvious error, but at the same time to record the deletion of the
superfluous second <mentioned>for</mentioned>, thus:

<egXML xmlns="http://www.tei-c.org/ns/Examples">The following elements are provided for
<del resp="#LB">for</del> simple editorial interventions.</egXML> The value <code>#LB</code> of the <att>resp</att>
attribute indicates that the person identified as <q>LB</q>
corrected the duplication of <mentioned>for</mentioned>.</p>
          <p>If the source read <eg>The following elements provided for
simple editorial interventions.</eg> (i.e. if the verb had been
inadvertently dropped) then the corrected text might read:
<egXML xmlns="http://www.tei-c.org/ns/Examples">The following elements <add resp="#LB">are</add> provided for
simple editorial interventions.</egXML></p>
          <p>These elements are not limited to changes made by an editor; they
can also be used to record authorial changes in manuscripts.  A
manuscript in which the author has first written <q>How it galls me,
what a galling shadow</q>, then crossed out the word
<mentioned>galls</mentioned> and inserted <mentioned>dogs</mentioned>
might be encoded thus:
<egXML xmlns="http://www.tei-c.org/ns/Examples">How it <del hand="#DHL" type="overstrike">galls</del>
<add hand="#DHL" place="supralinear">dogs</add> me,
what a galling shadow</egXML></p>
          <p>Similarly, the <gi>unclear</gi> and <gi>gap</gi> elements may be
used together to indicate the omission of illegible material; the
following example also shows the use of <gi>add</gi> for a
conjectural emendation:
<egXML xmlns="http://www.tei-c.org/ns/Examples">One hundred &amp; twenty good regulars joined to me
<unclear><gap reason="indecipherable"/></unclear>
&amp; instantly, would aid me signally <add hand="#ed">in?</add>
an enterprise against Wilmington.</egXML></p>
          <p>The <gi>del</gi> element marks material which is transcribed as
part of the electronic text despite being marked as deleted, while
<gi>gap</gi> marks the location of material which is omitted from the
electronic text, whether it is legible or not.  A language corpus, for
example, might omit long quotations in foreign languages:
<egXML xmlns="http://www.tei-c.org/ns/Examples"><p> ... An example of a list appearing in a fief ledger of
<name type="place">Koldinghus</name> <date>1611/12</date>
is given below. It shows cash income from a sale of
honey.</p><gap><desc>quotation from ledger (in Danish)</desc></gap><p>A description of the overall structure of the account is
once again ... </p></egXML></p>
          <p>Other corpora (particularly those constructed before the widespread
use of scanners) systematically omit figures and 
mathematics: <egXML xmlns="http://www.tei-c.org/ns/Examples"><p>At the bottom of your screen below the mode line is the
<term>minibuffer</term>.  This is the area where Emacs
echoes the commands you enter and where you specify
filenames for Emacs to find, values for search and replace,
and so on.
<gap reason="graphic"><desc>diagram of Emacs screen</desc></gap>
</p></egXML></p>

        </div>
        <div>
          <head>Abbreviations and their Expansion</head>
          <p>Like names, dates, and numbers, abbreviations may be transcribed
as they stand or expanded; they may be left unmarked, or encoded using
the following elements:
<specList><specDesc key="abbr"/><specDesc key="expan"/></specList>

</p>
          <p>The <gi>abbr</gi> element is useful as a means of distinguishing
semi-lexical items such as acronyms or jargon:
<egXML xmlns="http://www.tei-c.org/ns/Examples">We can sum up the above discussion as follows:  the identity of a
<abbr>CC</abbr> is defined by that calibration of values which
motivates the elements of its <abbr>GSP</abbr>;</egXML>
<egXML xmlns="http://www.tei-c.org/ns/Examples">Every manufacturer of <abbr>3GL</abbr> or <abbr>4GL</abbr>
languages is currently nailing on <abbr>OOP</abbr> extensions</egXML> </p>
          <p>The <att>type</att> attribute may be used to distinguish
types of abbreviation by their function. </p>
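          <p>As a minimal sketch of this usage, the following reworks one of
the examples above. The value <val>initialism</val> is an illustrative
assumption only; a project is free to define its own typology, and
should document whatever values it adopts:
<egXML xmlns="http://www.tei-c.org/ns/Examples">Every manufacturer of <abbr type="initialism">3GL</abbr> or
<abbr type="initialism">4GL</abbr> languages is currently nailing on
<abbr type="initialism">OOP</abbr> extensions</egXML></p>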

          <p>The <gi>expan</gi> element is used to mark an expansion supplied by
an encoder. This element is particularly useful in the transcription of manuscript
materials. For example, the
character p with a bar through its descender as a conventional
representation for the word <val>per</val> is commonly encountered in
Medieval European manuscripts. An encoder may choose to expand this as
follows:
<egXML xmlns="http://www.tei-c.org/ns/Examples"><expan>per</expan></egXML>
</p>
          <p>The expansion corresponding to an abbreviated form may not
always contain the same letters as the abbreviation. Where it does,
however, common editorial practice is  to italicize or otherwise
signal which letters have been supplied. The <gi>expan</gi> element
should not be used for this purpose since its function is to indicate an
expanded form, not a part of one. 
For example, consider the common
abbreviation <val>wt</val> (for <val>with</val>) found in medieval
texts. In a modern edition, an editor might wish to represent this as
<soCalled>w<hi>i</hi>t<hi>h</hi></soCalled>, italicizing the
letters not found in the source. An appropriate encoding for this
purpose would be 
<egXML xmlns="http://www.tei-c.org/ns/Examples"><expan>w<hi>i</hi>t<hi>h</hi></expan></egXML>
</p>
          <p>To record both an
abbreviation and its expansion, the <gi>choice</gi> element mentioned
above may be used to group the abbreviated form with its proposed
expansion: 
<egXML xmlns="http://www.tei-c.org/ns/Examples"><choice><abbr>wt</abbr><expan>with</expan></choice></egXML>
  </p>
        </div>
      </div>
      <div xml:id="U5-names">
        <head>Names, Dates, and  Numbers</head>
        <p>The TEI scheme defines elements for a large number of
<soCalled>data-like</soCalled> features which may appear almost
anywhere within almost any kind of text. These features may be of
particular interest in a range of disciplines; they all relate to
objects external to the text itself, such as the names of persons and
places, numbers and dates. They also pose particular problems for many
natural language processing (NLP) applications because of the variety
of ways in which they may be presented within a text. The elements
described here, by making such features explicit, reduce the
complexity of processing texts containing them.</p>
        <div xml:id="nomen">
          <head>Names and Referring Strings</head>
          <p>A <term>referring string</term> is a phrase which refers to some
person, place, object, etc. Two elements are provided to mark such
strings:

<specList><specDesc key="rs"/><specDesc key="name"/></specList>

  </p>
          <p>  The <att>type</att> attribute is used to distinguish
amongst (for example) names of persons, places and organizations,
where this is possible:
<egXML xmlns="http://www.tei-c.org/ns/Examples"><q>My dear <rs type="person">Mr. Bennet</rs>, </q>
said his lady to him one day, <q>have you heard
that <rs type="place">Netherfield Park</rs> is let
at last?</q></egXML>
<egXML xmlns="http://www.tei-c.org/ns/Examples">It being one of the principles of the
<rs type="organization">Circumlocution Office</rs> never,
on any account whatsoever, to give a straightforward answer,
<rs type="person">Mr Barnacle</rs> said, <q>Possibly.</q></egXML></p>
          <p>As the following example shows, the <gi>rs</gi> element may be
used for any reference to a person, place, etc, not necessarily one in
the form of a proper noun or noun phrase.
<egXML xmlns="http://www.tei-c.org/ns/Examples"><q>My dear <rs type="person">Mr. Bennet</rs>,</q>
said <rs type="person">his lady</rs> to him
one day...</egXML></p>
          <p>The <gi>name</gi> element by contrast is provided for the special
case of referring strings which consist only of proper nouns; it may
be used synonymously with the <gi>rs</gi> element, or nested within
it if a referring string contains a mixture of common and proper
nouns.</p>
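          <p>A sketch of such nesting follows. The phrase is an invented
illustration, not a quotation from the novel: the outer <gi>rs</gi>
refers to one person by means of a common noun, and contains a
<gi>name</gi> which refers to another by means of a proper noun:
<egXML xmlns="http://www.tei-c.org/ns/Examples"><rs type="person">the wife of
<name type="person">Mr. Bennet</name></rs></egXML></p>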
          <p>Simply tagging something as a name is rarely enough to
enable automatic processing of personal names into the canonical forms
usually required for reference purposes. The name as it appears in the
text may be inconsistently spelled, partial, or vague.  Moreover, name
prefixes such as <mentioned>van</mentioned> or <mentioned>de la</mentioned>
may or may not be included as part of the reference form of a name,
depending on the language and country of origin of the bearer.</p>
          <p>The <att>key</att>  attribute provides an alternative normalized identifier for the object being named,
like  a database record key. It  may thus be useful as a means
of gathering together all references to the same individual or
location scattered throughout a document:
<egXML xmlns="http://www.tei-c.org/ns/Examples"><q>My dear <rs type="person" key="BENM1">Mr. Bennet</rs>,
  </q> said <rs type="person" key="BENM2">his lady</rs>
  to him one day, <q>have you heard that
  <rs type="place" key="NETP1">Netherfield Park</rs>
  is let at last?</q></egXML></p>
          <p>This use should be distinguished from the case of the
<gi>reg</gi> (regularization) element, which provides a means
of marking the standard form of a referring string as demonstrated
below:
<egXML xmlns="http://www.tei-c.org/ns/Examples"><name type="person" key="WADLM1"><choice><orig>Walter de la Mare</orig><reg>de la Mare, Walter</reg></choice></name> was born at
  <name key="Ch1" type="place">Charlton</name>, in
  <name key="KT1" type="county">Kent</name>, in 1873.</egXML></p>
          <p>The <gi>index</gi> element discussed in <ptr target="#indexing"/> may be
more appropriate if the function of the regularization is to provide a
consistent index:
<egXML xmlns="http://www.tei-c.org/ns/Examples"><p><name type="place">Montaillou</name> is not a large parish.
At the time of the events which led to
<name type="person">Fournier</name>'s <index><term>Benedict XII, Pope of Avignon (Jacques Fournier)</term></index>
investigations, the local population consisted of between 200 and 250 inhabitants.</p></egXML>

	

 Although adequate for many simple applications, these methods have
 two inconveniences: if the name occurs many times, then its
 regularized form must be repeated many times; and the burden of
 additional XML markup in the body of the text may be inconvenient to
 maintain and complex to process. For applications such as onomastics, for those
 concerned with the persons or places named rather than with the names themselves, or wherever a detailed
analysis of the component parts of a name is needed, the full TEI
Guidelines provide a range of other solutions.</p>
        </div>
        <div>
          <head>Dates and Times</head>
          <p>Tags for the more detailed encoding of times and dates include the
following:
<specList><specDesc key="date"/><specDesc key="time"/></specList>
</p>
          <p>The <att>when</att> attribute specifies a normalized form
for the date or time, using one of the standard formats defined by  ISO 8601.
Partial dates or times (e.g. <q>1990</q>,
<q>September 1990</q>,
<q>twelvish</q>) can  be expressed
by  omitting a part of the value supplied, as in the following examples:
<egXML xmlns="http://www.tei-c.org/ns/Examples"><date when="1980-02-21">21 Feb 1980</date><date when="1990">1990</date><date when="1990-09">September 1990</date><date when="--09">September</date><date when="2001-09-11T08:48:00">Sept 11th, 12 minutes before 9 am</date></egXML> Note in the last example the use of a normalized representation for
the date string which includes a time: this example could thus equally
well be tagged using the <gi>time</gi> element. 
</p>
          <p>
            <egXML xmlns="http://www.tei-c.org/ns/Examples">Given on the <date when="1977-06-12">Twelfth Day of June
in the Year of Our Lord One Thousand Nine Hundred and
Seventy-seven of the Republic the Two Hundredth and first
and of the University the Eighty-Sixth.</date></egXML>
            <egXML xmlns="http://www.tei-c.org/ns/Examples">
              <l>specially when it's nine below zero</l>
              <l>and <time when="15:00:00">three o'clock in the
       afternoon</time></l>
            </egXML>
          </p>
        </div>
        <div>
          <head>Numbers </head>
          <p>Numbers can be written with either letters or digits (<code>twenty-one</code>,
<code>xxi</code>, and <code>21</code>) and their presentation is
language-dependent (e.g. English <mentioned>5th</mentioned> becomes
Greek <mentioned>5.</mentioned>; English <mentioned>123,456.78</mentioned>
equals French
<mentioned>123.456,78</mentioned>). In natural-language processing or
machine-translation applications, it is often helpful to distinguish
them from other, more <soCalled>lexical</soCalled> parts of the text.
In other applications, the ability to record a number's value in
standard notation is important. The <gi>num</gi> element provides
this possibility:
<specList><specDesc key="num"/></specList>

</p>
          <p>For example:
<egXML xmlns="http://www.tei-c.org/ns/Examples"><num value="33">xxxiii</num><num type="cardinal" value="21">twenty-one</num><num type="percentage" value="10">ten percent</num><num type="percentage" value="10">10%</num><num type="ordinal" value="5">5th</num></egXML></p>
        </div>
      </div>
      <div xml:id="U5-lists">
        <head>Lists</head>
        <p>The element <gi>list</gi> is used to mark any kind of
<term>list</term>.  A list is a sequence of text items, which may be
ordered, unordered, or a glossary list.  Each item may be preceded by
an item label (in a glossary list, this label is the term being
defined):
<specList><specDesc key="list"/><specDesc key="item"/><specDesc key="label"/></specList>

</p>
        <p>Individual list items are tagged with <gi>item</gi>.  The first
<gi>item</gi> may optionally be preceded by a <gi>head</gi>, which
gives a heading for the list.  The numbering of a list may be omitted, indicated using the <att>n</att> attribute
on each item, or (rarely) tagged as content using the <gi>label</gi>
element.  The following are all thus equivalent:
<egXML xmlns="http://www.tei-c.org/ns/Examples"><list><head>A short list</head><item>First item in list.</item><item>Second item in list.</item><item>Third item in list.</item></list><list><head>A short list</head><item n="1">First item in list.</item><item n="2">Second item in list.</item><item n="3">Third item in list.</item></list><list><head>A short list</head><label>1</label><item>First item in list.</item><label>2</label><item>Second item in list.</item><label>3</label><item>Third item in list.</item></list></egXML> The styles should not be mixed in the same list.</p>
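        <p>The kind of list may also be stated explicitly, by means of the
<att>type</att> attribute. The following sketch assumes the values
<val>ordered</val> and <val>bulleted</val>; these are illustrative
conventions rather than requirements, and a project may prefer other
values, provided that it documents them:
<egXML xmlns="http://www.tei-c.org/ns/Examples"><list type="ordered"><head>A short list</head><item>First item in list.</item><item>Second item in list.</item></list><list type="bulleted"><item>An item whose position carries no significance.</item><item>Another such item.</item></list></egXML></p>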
        <p>A simple two-column table may be treated as a <term>glossary
list</term>, tagged <tag>list type="gloss"</tag>.  Here, each item
comprises a <term>term</term> and a <term>gloss</term>, marked with
<gi>label</gi> and <gi>item</gi> respectively.  These correspond to
the elements <gi>term</gi> and <gi>gloss</gi>, which can occur
anywhere in prose text.

<egXML xmlns="http://www.tei-c.org/ns/Examples"><list type="gloss"><head>Vocabulary</head><label xml:lang="enm">nu</label><item>now</item><label xml:lang="enm">lhude</label><item>loudly</item><label xml:lang="enm">bloweth</label><item>blooms</item><label xml:lang="enm">med</label><item>meadow</item><label xml:lang="enm">wude</label><item>wood</item><label xml:lang="enm">awe</label><item>ewe</item><label xml:lang="enm">lhouth</label><item>lows</item><label xml:lang="enm">sterteth</label><item>bounds, frisks</item><label xml:lang="enm">verteth</label><item xml:lang="lat">pedit</item><label xml:lang="enm">murie</label><item>merrily</item><label xml:lang="enm">swik</label><item>cease</item><label xml:lang="enm">naver</label><item>never</item></list></egXML></p>
        <p>Where the internal structure of a list item is more complex, it
may be preferable to regard the list as a <term>table</term>, for
which special-purpose tagging is defined below (<ptr target="#U5-tables"/>).
</p>
        <p>Lists of whatever kind can, of course, nest within list items to
any depth required. Here, for example, a glossary list contains two
items, each of which is itself a simple list:
<egXML xmlns="http://www.tei-c.org/ns/Examples"><list type="gloss"><label>EVIL</label><item><list type="simple"><item>I am cast upon a horrible desolate island, void
          of all hope of recovery.</item><item>I am singled out and separated as it were from
         all the world to be miserable.</item><item>I am divided from mankind — a solitaire; one
           banished from human society.</item></list></item><label>GOOD</label><item><list type="simple"><item>But I am alive; and not drowned, as all my
              ship's company were.</item><item>But I am singled out, too, from all the ship's
             crew, to be spared from death...</item><item>But I am not starved, and perishing on a barren place,
            affording no sustenance....</item></list></item></list></egXML></p>
        <p>A list need not necessarily be displayed in list format.  For
example,
<egXML xmlns="http://www.tei-c.org/ns/Examples"><p>On those remote pages it is written that animals are
divided into <list rend="run-on"><item n="a">those that belong to the
Emperor,</item><item n="b"> embalmed ones, </item><item n="c"> those
that are trained, </item><item n="d"> suckling pigs, </item><item n="e">
mermaids, </item><item n="f"> fabulous ones, </item><item n="g"> stray
dogs, </item><item n="h"> those that are included in this
classification, </item><item n="i"> those that tremble as if they
were mad, </item><item n="j"> innumerable ones, </item><item n="k"> those
drawn with a very fine camel's-hair brush, </item><item n="l">
others, </item><item n="m"> those that have just broken a flower
vase, </item><item n="n"> those that resemble flies from a
distance.</item></list></p></egXML></p>
        <p>Lists of bibliographic items should be tagged using the <gi>listBibl</gi>
element, described in the next section.</p>
      </div>
      <div xml:id="U5-bibls">
        <head>Bibliographic Citations</head>
        <p>It is often useful to distinguish bibliographic citations where
they occur within texts being transcribed for research, if only so
that they will be properly formatted when the text is printed out. The
element <gi>bibl</gi> is provided for this purpose. 
Where the components of a bibliographic reference are to be
distinguished, the following elements may be used as appropriate. It
is generally useful to mark at least those parts (such as the titles
of articles, books, and journals) which will need special formatting. 
The other elements are provided for cases where particular interest
attaches to such details.
<specList><specDesc key="bibl"/><specDesc key="author"/><specDesc key="biblScope"/><specDesc key="date"/><specDesc key="editor"/><specDesc key="publisher"/><specDesc key="pubPlace"/><specDesc key="title"/></specList></p>
        <p>For example, the following editorial note might be transcribed as
shown:
<q rend="display">He was a member of Parliament for Warwickshire in
1445, and died March 14, 1470 (according to Kittredge, <title>Harvard
Studies</title> 5. 88ff).</q>
<egXML xmlns="http://www.tei-c.org/ns/Examples">He was a member of Parliament for Warwickshire in 1445, and died
March 14, 1470 (according to <bibl><author>Kittredge</author>,
<title>Harvard Studies</title> <biblScope>5. 88ff</biblScope></bibl>).</egXML></p>
        <p>For lists of bibliographic citations, the <gi>listBibl</gi>
element should be used; it may contain a series of <gi>bibl</gi>
elements.  </p>
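        <p>A minimal sketch of such a list follows. It reuses the citation
shown above together with a work quoted elsewhere in this document; the
heading and the selection of entries are invented for illustration only:
<egXML xmlns="http://www.tei-c.org/ns/Examples"><listBibl><head>Works cited in the examples</head><bibl><author>Kittredge</author>,
<title>Harvard Studies</title> <biblScope>5. 88ff</biblScope></bibl><bibl><author>Defoe</author>,
<title level="m">Journal of the Plague Year</title></bibl></listBibl></egXML></p>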
      </div>
      <div xml:id="U5-tables">
        <head>Tables</head>
        <p>Tables represent a  challenge for any text processing
system, but simple tables, at least, appear in so many texts that even
in the simplified TEI tag set presented here, markup for tables is
necessary.  The following  elements are provided for this purpose:
<specList><specDesc key="table"/><specDesc key="row"/><specDesc key="cell"/></specList></p>
        <p>For example, Defoe uses mortality tables like the following in the
<title level="m">Journal of the Plague Year</title> to show the rise
and ebb of the epidemic:<egXML xmlns="http://www.tei-c.org/ns/Examples"><p>It was indeed coming on amain, for the burials that
same week were in the next adjoining parishes thus:—
<table rows="3" cols="4"><row role="data"><cell role="label">St. Leonard's, Shoreditch</cell><cell>64</cell><cell>84</cell><cell>119</cell></row><row role="data"><cell role="label">St. Botolph's, Bishopsgate</cell><cell>65</cell><cell>105</cell><cell>116</cell></row><row role="data"><cell role="label">St. Giles's, Cripplegate</cell><cell>213</cell><cell>421</cell><cell>554</cell></row></table></p><p>This shutting up of houses was at first counted a very cruel
and unchristian method, and the poor people so confined made
bitter lamentations. ... </p></egXML></p>
      </div>
      <div xml:id="U5-figs">
        <head>Figures and Graphics</head>
        <p>Not all the components of a document are necessarily textual. The
most straightforward text will often contain diagrams or
illustrations, to say nothing of documents in which image and text are
inextricably intertwined, or electronic resources in which the two are
complementary. </p>
        <p>The encoder may simply record the presence of a graphic within the
text, possibly with a brief description of its content, by using the
elements described in this section. The same elements may also be used
to embed digitized versions of the graphic within an electronic
document.
<specList><specDesc key="graphic"/><specDesc key="figure"/><specDesc key="figDesc"/></specList></p>
        <p>Any textual information accompanying the graphic, such as a
heading and/or caption, may be included within the <gi>figure</gi>
element itself, in a <gi>head</gi> and one or more <gi>p</gi>
elements, as may also any text appearing within the graphic itself. It
is strongly recommended that a prose description of the image be
supplied, as the content of a <gi>figDesc</gi> element, for the use
of applications which are not able to render the graphic, and to
render the document accessible to vision-impaired readers. (Such text
is not normally considered part of the document proper.)</p>
        <p>The simplest use for these elements is to mark the position of a
graphic and provide a link to it, as in this example:
<egXML xmlns="http://www.tei-c.org/ns/Examples"><pb n="412"/><graphic url="p412fig.png"/><pb n="413"/></egXML>
This indicates that the graphic contained by the file
<ident>p412fig.png</ident> appears between pages 412 and 413. </p>
        <p>The <gi>graphic</gi> element can appear
anywhere  that textual content is permitted, within but not between
paragraphs or headings. In the following example, the encoder has
decided to treat a specific printer's ornament as a heading:
<egXML xmlns="http://www.tei-c.org/ns/Examples"><head><graphic url="http://www.iath.virginia.edu/gants/Ornaments/Heads/hp-ral02.gif"/></head></egXML> </p>
        <p>More usually, a graphic will have at the
least an identifying title, which may be encoded using the <gi>head</gi>
element, or a number of figures may be grouped together in a
particular structure. It is also often convenient to include a brief description of
the image.  The <gi>figure</gi> element provides a means of wrapping
one or more such elements together as a kind of graphic
<soCalled>block</soCalled>:
<egXML xmlns="http://www.tei-c.org/ns/Examples"><figure><graphic url="fessipic.png"/><head>Mr Fezziwig's Ball</head><figDesc>A Cruikshank engraving showing Mr Fezziwig leading
       a group of revellers.</figDesc></figure></egXML></p>
        <p>When a digitized version of the graphic concerned is available, it
may be  embedded  at the appropriate point within the
document in this way.</p>
      </div>
      <div xml:id="U5-anal">
        <head>Interpretation and  Analysis</head>
        <p>It is often said that <emph>all</emph> markup is a form of
interpretation or analysis.  While it is certainly difficult, and may
be impossible, to distinguish firmly between
<soCalled>objective</soCalled> and <soCalled>subjective</soCalled>
information in any universal way, it remains true that judgments
concerning the latter are typically regarded as more likely to provoke
controversy than those concerning the former.  Many scholars
therefore prefer to record such interpretations only if it is possible
to alert the reader that they are considered more open to dispute
than the rest of the markup. This section describes some of the
elements provided by the TEI scheme to meet this need. </p>
        <div>
          <head>Orthographic Sentences</head>
          <p>Interpretation typically ranges across the whole of a text, with
no  particular respect to other structural units. A useful preliminary
to intensive interpretation is therefore to segment the text into
discrete and identifiable units, each of which can then bear a label
for use as a sort of <soCalled>canonical reference</soCalled>.  To
facilitate such uses, these units may not cross each other, nor nest
within each other. They may conveniently be represented using the
following element:
<specList><specDesc key="s"/></specList></p>
          <p>As the name suggests, the <gi>s</gi> element is most commonly
used (in linguistic applications at least) for marking <term>orthographic
sentences</term>, that is, units defined by orthographic features such
as punctuation.  For example, the passage from
<title>Jane Eyre</title> discussed earlier might be divided into
s-units as follows:<egXML xmlns="http://www.tei-c.org/ns/Examples"><pb n="474"/><div type="chapter" n="38"><p><s n="001">Reader, I married him.</s><s n="002">A quiet wedding we had:</s><s n="003">he and I, the parson and clerk, were alone present.</s><s n="004">When we got back from church, I went
into the kitchen of the manor-house, where Mary was cooking 
the dinner, and John cleaning the knives, 
and I said —</s></p><p><q><s n="005">Mary, I have been married to Mr Rochester
this morning.</s></q> ... </p></div></egXML> 
Note that  <gi>s</gi>
elements cannot nest: the beginning of one <gi>s</gi> element implies
that the previous one has finished. When s-units are tagged as shown
above, it is advisable to tag the entire text end-to-end, so that
every word in the text being analysed will be contained by exactly one
<gi>s</gi> element, whose identifier can then be used to specify a
unique reference for it. If the identifiers used are unique within the
document, then the <att>xml:id</att> attribute might be used in
preference to the <att>n</att> used in the above  example.</p>
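          <p>A minimal sketch of that alternative follows. The identifier
scheme (<val>JE38.001</val> and so on) is a hypothetical one; any scheme
will serve, provided that each value is unique within the document and
is a valid XML name, which rules out values beginning with a digit such
as <val>001</val>:
<egXML xmlns="http://www.tei-c.org/ns/Examples"><p><s xml:id="JE38.001">Reader, I married him.</s><s xml:id="JE38.002">A quiet wedding we had:</s><s xml:id="JE38.003">he and I, the parson and clerk, were alone present.</s></p></egXML></p>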
        </div>
        <div>
          <head>General-Purpose Interpretation Elements</head>
          <p>A more general-purpose segmentation element, the <gi>seg</gi> element, has
  already been introduced for use in identifying otherwise unmarked
  targets of cross references and hypertext links (see section <ptr target="#U5-ptrs"/>); it identifies some phrase-level portion of text
  to which the encoder may assign a user-specified <att>type</att>, as
  well as a unique identifier; it may thus be used to tag textual
  features for which there is no provision in the published TEI
  Guidelines.</p>
          <p>For example, the Guidelines provide no
  <soCalled>apostrophe</soCalled> element to mark parts of a literary
  text in which the narrator addresses the reader (or hearer)
  directly. One approach might be to regard these as instances of the
  <gi>q</gi> element, distinguished from others by an appropriate
  value for the <att>who</att> attribute. A possibly simpler, and
  certainly more general, solution would however be to use the
  <gi>seg</gi> element as follows:

<egXML xmlns="http://www.tei-c.org/ns/Examples"><div type="chapter" n="38"><p><seg type="apostrophe">Reader, I married him.</seg>
A quiet wedding we had: ...</p></div></egXML> 

The <att>type</att> attribute on the <gi>seg</gi> element
can take any value, and so can be used to record phrase-level
phenomena of any kind; it is good practice to record the values used
and their significance in the header.</p>
          <p>A <gi>seg</gi> element of one type (unlike the <gi>s</gi>
element which it superficially resembles) can be nested within a <gi>seg</gi>
element of the same or another type. This enables quite complex
structures to be represented; some examples were given in section
<ptr target="#xatts"/> above. However, because it must respect the
requirement that elements be properly nested, and may not cut
across each other, it cannot cope with the common requirement to
associate an interpretation with arbitrary segments of a text which
may completely ignore the document hierarchy. It also requires that
the interpretation itself be represented by a single coded value in
the <att>type</att> attribute.</p>
          <p>Neither restriction applies to the <gi>interp</gi> element, which
provides powerful features for the encoding of quite complex
interpretive information in a relatively straightforward manner.

<specList><specDesc key="interp"/><specDesc key="interpGrp"/></specList>


These elements allow the encoder to specify both the class of
an interpretation, and the particular instance of that class which the
interpretation involves. Thus, whereas with <gi>seg</gi> one can say
simply that something is an apostrophe, with
<gi>interp</gi> one can say that it is an instance (apostrophe) of a
larger class (rhetorical figures).</p>
          <p>Moreover, <gi>interp</gi> does not itself enclose the passage it
describes: it must be linked to that passage either by means of the
<att>ana</att> attribute discussed in section <ptr target="#xatts"/>
above, or by means of its own <att>inst</att> attribute. This
means that any kind of analysis can be represented, with no need to
respect the document hierarchy, and also facilitates the grouping
of analyses of a particular type together. A special purpose <gi>interpGrp</gi>
element is provided for the latter purpose.</p>
          <p>For example, suppose that you wish to mark such diverse aspects of
a text as  themes or subject matter, rhetorical figures, and the
locations of individual scenes of the narrative. Different portions of
our sample passage from <title>Jane Eyre</title> might
be associated with the rhetorical figures of apostrophe, hyperbole,
and metaphor; with subject-matter references to churches, servants,
cooking, postal service, and honeymoons; and with scenes located in
the church, in the kitchen, and in an unspecified location (drawing
room?).</p>
          <p>These interpretations could be placed anywhere within the <gi>text</gi>
element; it is however good practice to put them all in the same place
(e.g. a separate section of the front or back matter), as in the
following example:
<egXML xmlns="http://www.tei-c.org/ns/Examples"><back><div type="Interpretations"><p><interp xml:id="fig-apos-1" resp="#LB-MSM" type="figureOfSpeech">apostrophe</interp><interp xml:id="fig-hyp-1" resp="#LB-MSM" type="figureOfSpeech">hyperbole</interp><interp xml:id="set-church-1" resp="#LB-MSM" type="scene-setting">church</interp><interp xml:id="ref-church-1" resp="#LB-MSM" type="reference">church</interp><interp xml:id="ref-serv-1" resp="#LB-MSM" type="reference">servants</interp></p></div></back></egXML></p>
          <p>The evident redundancy of this encoding can be considerably
reduced by using the <gi>interpGrp</gi> element to group together all
those <gi>interp</gi> elements which share common attribute values,
as follows: 
<egXML xmlns="http://www.tei-c.org/ns/Examples"><back><div type="Interpretations"><p><interpGrp type="figureOfSpeech" resp="#LB-MSM"><interp xml:id="fig-apos">apostrophe</interp><interp xml:id="fig-hyp">hyperbole</interp><interp xml:id="fig-meta">metaphor</interp></interpGrp><interpGrp type="scene-setting" resp="#LB-MSM"><interp xml:id="set-church">church</interp><interp xml:id="set-kitch">kitchen</interp><interp xml:id="set-unspec">unspecified</interp></interpGrp><interpGrp type="reference" resp="#LB-MSM"><interp xml:id="ref-church">church</interp><interp xml:id="ref-serv">servants</interp><interp xml:id="ref-cook">cooking</interp></interpGrp></p></div></back></egXML></p>
          <p>Once these interpretation elements have been defined, they can be
linked with the parts of the text to which they apply in either or
both of two ways. The <att>ana</att> attribute can be used on
whichever element is appropriate:
<egXML xmlns="http://www.tei-c.org/ns/Examples"><div type="chapter" n="38"><p xml:id="P38.1" ana="#set-church #set-kitch"><s xml:id="P38.1.1" ana="#fig-apos">Reader, I married him.</s></p>
</div></egXML> 
Note in this example that since the paragraph has two settings
(in the church  and in the kitchen), the identifiers of both have been
supplied.</p>
          <p>Alternatively, the <gi>interp</gi> elements can point to all the
parts of the text to which they apply, using their <att>inst</att>
attribute:
<egXML xmlns="http://www.tei-c.org/ns/Examples"><interp xml:id="fig-apos-2" type="figureOfSpeech" resp="#LB-MSM" inst="#P38.1.1">apostrophe</interp><interp xml:id="set-church-2" type="scene-setting" inst="#P38.1" resp="#LB-MSM">church</interp><interp xml:id="set-kitchen-2" type="scene-setting" inst="#P38.1" resp="#LB-MSM">kitchen</interp></egXML></p>
          <p>The <gi>interp</gi> element is not limited to any particular type of
analysis. The literary analysis shown above is but one possibility;
one could equally well use <gi>interp</gi> to capture a linguistic
part-of-speech analysis. For example, the example sentence given in
section <ptr target="#xatts"/> assumes a linguistic analysis which
might be represented as follows:
<egXML xmlns="http://www.tei-c.org/ns/Examples"><interp xml:id="NP1" type="pos">noun phrase, singular</interp><interp xml:id="VV1" type="pos">inflected verb, present-tense singular</interp>
...
</egXML></p>
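          <p>To connect such codes to the words they describe, the
<att>ana</att> attribute shown earlier in this section could be used.
The following is only a sketch: the sentence is invented for
illustration, and the use of <gi>seg</gi> to delimit each unit is an
assumption rather than a recommendation:
<egXML xmlns="http://www.tei-c.org/ns/Examples"><s><seg type="phrase" ana="#NP1">The cat</seg> <seg type="word" ana="#VV1">sleeps</seg>.</s></egXML></p>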
        </div>
      </div>
      <div xml:id="U5-techdoc">
        <head>Technical Documentation</head>
        <p>Although the focus of this document is on the use of the TEI
scheme for the encoding of existing <soCalled>pre-electronic</soCalled>
documents,  the same scheme may also be used for the encoding of new
documents. In the preparation of new documents (such as this one),
XML has much to recommend it: the document's structure can be clearly
represented, and the same electronic text can be re-used for many
purposes — to provide both online hypertext or browsable versions
and well-formatted typeset versions from a common source for
example. </p>
        <p>To facilitate this, the TEI Lite schema includes some elements for
marking features of technical documents in general, and of
XML-related documents in particular.</p>
        <div>
          <head>Additional Elements for Technical Documents</head>
          <p>The following elements may be used to mark particular features of
technical documents:
<specList><specDesc key="eg"/><specDesc key="code"/><specDesc key="ident"/><specDesc key="gi"/><specDesc key="att"/><specDesc key="formula"/><specDesc key="val"/></specList></p>
          <p>The following example shows how these elements might be used to
encode a passage from a tutorial introducing the Fortran programming
language:
<egXML xmlns="http://www.tei-c.org/ns/Examples"><p>It is traditional to introduce a language with a program like the
following:
<eg>
   CHAR*12 GRTG
   GRTG = 'HELLO WORLD'
   PRINT *, GRTG
   END
</eg></p><p>This simple example first declares a variable <ident>GRTG</ident>, in
the line <code>CHAR*12 GRTG</code>, which identifies <ident>GRTG</ident>
as consisting of 12 bytes of type <ident>CHAR</ident>.  To this variable,
the value <val>HELLO WORLD</val>
is then assigned.</p></egXML></p>
          <p>A formatting application, given a text like that above, can be
instructed to format examples appropriately (e.g. to preserve line
breaks, or to use a distinctive font). Similarly, the use of tags such
as <gi>ident</gi> greatly facilitates the
construction of a useful index.</p>
          <p>The <gi>formula</gi> element should be used to enclose a
mathematical or chemical formula presented within the text as a
distinct item. Since formulae generally include a large variety of
special typographic features not otherwise present in ordinary text,
it will usually be necessary to present the body of the formula in a
specialized notation. The notation used should be specified by the
<att>notation</att> attribute, as in the following example:
<egXML xmlns="http://www.tei-c.org/ns/Examples"><formula notation="tex">
  \begin{math}E = mc^{2}\end{math}
</formula></egXML></p>

          <p>A particular problem arises when XML encoding is the
subject of discussion within a technical document, itself encoded in
XML. In such a document, it is essential to distinguish clearly
the markup occurring within examples from that marking up the
document itself, since tags are highly likely to occur within the
examples. One
      simple solution is to use the predefined entity reference
      <code>&amp;lt;</code> to represent each &lt; character which marks
      the start of an XML tag within the examples. A more
general solution is to mark off the whole body of each example as
containing data which is not to be scanned for XML mark-up by the
parser. This is achieved by enclosing it within  a special XML
construct called a <term><code>CDATA</code> marked section</term>,  as
in the following example:
<eg>&lt;p&gt;A list should be encoded as follows:
&lt;eg&gt;&lt;![CDATA[
   &lt;list&gt;
   &lt;item&gt;First item in the list&lt;/item&gt;
   &lt;item&gt;Second item&lt;/item&gt;
   &lt;/list&gt;
]]&gt;
&lt;/eg&gt;
The &lt;gi&gt;list&lt;/gi&gt; element consists of a series of &lt;gi&gt;item&lt;/gi&gt;
elements.</eg></p>
          <p>The <gi>list</gi> element used within the example above will not
be regarded as forming part of the document proper, because it is
embedded within a marked section (beginning with the special markup
declaration <val>&lt;![CDATA[</val>, and ending with
<val>]]&gt;</val>).</p>
          <p>Note also the use of the <gi>gi</gi> element to tag references to
element names (or <term>generic identifiers</term>) within the
body of the text.</p>
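          <p>For comparison, the simpler approach mentioned above, in which
each &lt; character within the example is typed as <code>&amp;lt;</code>,
might be sketched for the same list as follows. The text shown is the
source as it would be typed, with no <code>CDATA</code> section:
<eg>&lt;eg&gt;
   &amp;lt;list&gt;
   &amp;lt;item&gt;First item in the list&amp;lt;/item&gt;
   &amp;lt;item&gt;Second item&amp;lt;/item&gt;
   &amp;lt;/list&gt;
&lt;/eg&gt;</eg>
This form becomes tedious for long examples, which is one reason to
prefer the marked section.</p>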
        </div>
        <div>
          <head>Generated Divisions</head>
          <p>Most modern document production systems have the ability to
generate automatically whole sections such as a table of contents or
an index. The TEI Lite scheme provides an element to mark the location
at which such a generated section should be placed.
<specList><specDesc key="divGen"/></specList></p>
          <p>The <gi>divGen</gi> element can be placed anywhere that a
division element would be legal, as in the following example:
<egXML xmlns="http://www.tei-c.org/ns/Examples"><front><titlePage> <!-- ... --> </titlePage><divGen type="toc"/><div><head>Preface</head> <!-- ... --> </div></front><body> <!-- ... --> </body><back><div><head>Appendix</head> <!-- ... --> </div><divGen type="index" n="Index"/></back></egXML></p>
          <p>This example also demonstrates the use of the <att>type</att>
attribute to distinguish the different kinds of division to be
generated: in the first case a table of contents (a
<mentioned>toc</mentioned>) and in the second an index.</p>
          <p>When an existing index or table of contents is to be encoded
(rather than one being generated) for some reason, the
<gi>list</gi> element discussed in section <ptr target="#U5-lists"/>
should be used. </p>
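          <p>By way of illustration, a table of contents transcribed from a
printed source might be encoded along the following lines. This is only a
sketch: the headings are invented, and the <att>type</att> value
<val>contents</val> is simply one of the values suggested for front
matter elsewhere in this document:
<egXML xmlns="http://www.tei-c.org/ns/Examples"><div type="contents"><head>Contents</head><list><item>Preface</item><item>Chapter 1</item><item>Chapter 2</item><item>Appendix</item></list></div></egXML></p>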
        </div>
        <div xml:id="indexing">
          <head>Index Generation</head>
          <p>While production of a table of contents from a properly tagged
document is generally unproblematic for an automatic processor, the
production of a good quality index will often require more careful
tagging. It may not be enough simply to produce a list of all parts
tagged in some particular way, although extracting (for example) all
occurrences of elements such as <gi>term</gi> or <gi>name</gi> will
often be a good starting point for an index. </p>
          <p>The TEI schema provides a special purpose <gi>index</gi> tag which
may be used to mark both the parts of the document which should be
indexed, and how the indexing should be done.
<specList><specDesc key="index"/></specList></p>
          <p>For example, the second paragraph of this section might include
the following:<egXML xmlns="http://www.tei-c.org/ns/Examples">...
The TEI schema provides a special purpose <gi>index</gi> tag
<index><term>indexing</term></index>
<index><term>index (tag)</term><index><term>use in index generation</term></index></index>
which may be used ...</egXML></p>
          <p>The <gi>index</gi> element can also be used to provide a form of
interpretive or analytic information.  For example, in a study of
Ovid, it might be desired to record all the poet's references to
different figures, for comparative stylistic study.  In the following
lines of the <title>Metamorphoses</title>, such a study would record
the poet's references to Jupiter (as
<mentioned>deus</mentioned>, <mentioned>se</mentioned>, and as the
subject of <mentioned>confiteor</mentioned> [in inflectional form
number 227]), to Jupiter-in-the-guise-of-a-bull (as
<mentioned>imago tauri fallacis</mentioned> and the subject of
<mentioned>teneo</mentioned>), and so on.<note place="foot">The
analysis is taken, with permission, from Willard McCarty and Burton
Wright, <title>An Analytical Onomasticon to the Metamorphoses of Ovid</title>
(Princeton: Princeton University Press, forthcoming).  Some
simplifications have been undertaken.</note>
<egXML xmlns="http://www.tei-c.org/ns/Examples"><l n="3.001">iamque deus posita fallacis imagine tauri</l><l n="3.002">se confessus erat Dictaeaque rura tenebat</l></egXML> This need might be met using the <gi>note</gi> element
discussed in section <ptr target="#U5-notes"/>, or with the <gi>interp</gi>
element discussed in section <ptr target="#U5-anal"/>. Here we demonstrate
how it might also be satisfied by using the <gi>index</gi> element.</p>
          <p>We assume that the object is to generate more than one index: one
for names of deities (called <val>dn</val>), another for
onomastic references (called <val>on</val>), a third for pronominal
references (called <val>pr</val>) and so forth. One way of
achieving this might be as follows:
<egXML xmlns="http://www.tei-c.org/ns/Examples"><l n="3.001">iamque deus posita fallacis imagine tauri
     <index indexName="dn"><term>Iuppiter</term><index><term>deus</term></index></index>
     <index indexName="on"><term>Iuppiter (taurus)</term><index><term>imago tauri fallacis</term></index></index></l><l n="3.002">se confessus erat Dictaeaque rura tenebat
     <index indexName="pr"><term>Iuppiter</term><index><term>se</term></index></index>
     <index indexName="v"><term>Iuppiter</term><index><term>confiteor (v227)</term></index></index></l></egXML>  For each <gi>index</gi> element above, an entry will be
generated in the appropriate index, using  as headword the content of
the <gi>term</gi> element it contains; the <gi>term</gi> elements
nested within the secondary <gi>index</gi> element in each case
provide a secondary keyword. The actual reference will be taken from the context
in which the <gi>index</gi> element appears, i.e. in this case the
identifier of the <gi>l</gi> element containing it. </p>
        </div>
        <div>
          <head>Addresses</head>
          <p>The <gi>address</gi> element is used to mark a postal address of
any kind. It contains one or more <gi>addrLine</gi> elements, one for
each line of the address.
<specList><specDesc key="address"/><specDesc key="addrLine"/></specList>
</p>
          <p>Here is a simple example:
<egXML xmlns="http://www.tei-c.org/ns/Examples"><address><addrLine>Computer Center (M/C 135)</addrLine><addrLine>1940 W. Taylor, Room 124</addrLine><addrLine>Chicago, IL 60612-7352</addrLine><addrLine>U.S.A.</addrLine></address></egXML></p>
          <p>The individual parts of an address may be further distinguished by
 using the <gi>name</gi> element discussed above (section <ptr target="#nomen"/>).
<egXML xmlns="http://www.tei-c.org/ns/Examples"><address><addrLine>Computer Center (M/C 135)</addrLine><addrLine>1940 W. Taylor, Room 124</addrLine><addrLine><name type="city">Chicago</name>, IL 60612-7352</addrLine><addrLine><name type="country">USA</name></addrLine></address></egXML></p>
        </div>
      </div>
      <div xml:id="U5-chars">
        <head>Character Sets, Diacritics, etc.</head>
        <p>With the advent of XML and its adoption of Unicode as the required
     character set for all documents, most problems previously
     associated with the representation of the diverse languages and
     writing systems of the world are greatly reduced. For those
     working with standard forms of the European languages in
     particular, almost no special action is needed: any XML editor
     should enable you to input accented letters or other <soCalled>non-ASCII</soCalled>
     characters directly, and they should be stored in the resulting
     file in a way which is transferable directly between different
     systems. </p>
        <p>There are two important exceptions: the characters &amp; and &lt; may not be
entered directly in an XML document, since they have a special
significance as initiating markup. They must always be represented as
<term>entity references</term>, like this: <code>&amp;amp;</code> or
<code>&amp;lt;</code>. Other characters may also be represented by
means of entity reference where necessary, for example to retain
compatibility with a pre-Unicode processing system. </p>
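        <p>As a small sketch of how this looks in practice, the source of a
paragraph containing both special characters might be typed as shown
below. The sentence is invented, and the numeric character reference
<code>&amp;#xE9;</code> (for the letter é) is included only as an
illustration of representing a character without typing it directly:
<eg>&lt;p&gt;Fish &amp;amp; chips: the price is &amp;lt; five pounds at the caf&amp;#xE9;.&lt;/p&gt;</eg></p>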







      </div>
      <div xml:id="U5-fronbac">
        <head>Front and Back Matter</head>
        <div>
          <head>Front Matter</head>
          <p>For many purposes, particularly in older texts, the preliminary
material such as title pages, prefatory epistles, etc., may provide
very useful additional linguistic or social information. P5 provides a
set of recommendations for distinguishing the textual elements most
commonly encountered in front matter, which are summarized
      here.</p>
          <div xml:id="h51">
            <head>Title Page</head>
            <p>The start of a title page should be marked with the element
<gi>titlePage</gi>.  All text contained on the page should be
transcribed and tagged with the appropriate element from the following
list:
<specList><specDesc key="titlePage"/><specDesc key="docTitle"/><specDesc key="titlePart"/><specDesc key="byline"/><specDesc key="docAuthor"/><specDesc key="docDate"/><specDesc key="docEdition"/><specDesc key="docImprint"/><specDesc key="epigraph"/></specList></p>
            <p>Typeface distinctions should be marked with the <att>rend</att>
attribute when necessary, as described above. Very detailed
description of the letter spacing and sizing used in ornamental titles
is not as yet provided for by the Guidelines. Changes of language
should be marked by appropriate use of the <att>xml:lang</att>
attribute or the
<gi>foreign</gi> element, as necessary. Names, wherever they appear,
should be tagged using the <gi>name</gi> element, as elsewhere.</p>
            <p>Two example title pages follow:
<egXML xmlns="http://www.tei-c.org/ns/Examples"><titlePage rend="Roman"><docTitle><titlePart type="main">
    PARADISE REGAIN'D. A POEM In IV <hi>BOOKS</hi>.
    </titlePart><titlePart>
    To which is added <title>SAMSON AGONISTES</title>.
    </titlePart></docTitle><byline>The Author <docAuthor>JOHN MILTON</docAuthor></byline><docImprint><name>LONDON</name>,
    Printed by <name>J.M.</name>
    for <name>John Starkey</name>
    at the <name>Mitre</name>
    in <name>Fleetstreet</name>,
    near <name>Temple-Bar.</name></docImprint><docDate>MDCLXXI</docDate></titlePage></egXML><egXML xmlns="http://www.tei-c.org/ns/Examples"><titlePage><docTitle><titlePart type="main">
  Lives of the Queens of England, from the Norman
    Conquest;</titlePart><titlePart type="sub">with anecdotes of their courts.
  </titlePart></docTitle><titlePart>Now first published from Official Records
    and other authentic documents private as well as
    public.</titlePart><docEdition>New edition, with corrections and
    additions</docEdition><byline>By <docAuthor>Agnes Strickland</docAuthor></byline><epigraph><q>The treasures of antiquity laid up in old
       historic rolls, I opened.</q><bibl>BEAUMONT</bibl></epigraph><docImprint>Philadelphia: Blanchard and Lea</docImprint><docDate>1860.</docDate></titlePage></egXML></p>
          </div>
          <div xml:id="h52">
            <head>Prefatory Matter</head>
            <p>Major blocks of text within the front matter should be marked as
<gi>div</gi> elements; the following suggested
values for the <att>type</att> attribute may be used to
distinguish various common types of prefatory matter:
<list type="gloss"><label>foreword</label><item>a text addressed to the reader, by the author, editor or
publisher, possibly in the form of a letter.</item><label>preface</label><item>a text addressed to the reader, by the author, editor or
publisher, possibly in the form of a letter.</item><label>dedication</label><item>a text (often a letter) addressed to someone other than the
reader in which the author typically commends the work in hand to the
attention of the person concerned.</item><label>abstract</label><item>a prose argument summarizing the content of the work.</item><label>ack</label><item>Acknowledgements.</item><label>contents</label><item>a table of contents (typically this should be tagged as a
<gi>list</gi>).</item><label>frontispiece</label><item>a pictorial frontispiece, possibly including some text.</item></list></p>
            <p>Like any text division, those in front matter may contain low
level structural or non-structural elements as described elsewhere.
They will generally begin with a heading or title of some kind which
should be tagged using the <gi>head</gi> element. Epistles will
contain the following additional elements:
<specList><specDesc key="salute"/><specDesc key="signed"/><specDesc key="byline"/><specDesc key="dateline"/><specDesc key="argument"/><specDesc key="cit"/><specDesc key="opener"/><specDesc key="closer"/></specList></p>

            <p> Epistles which appear elsewhere in a text will, of course,
contain these same elements.</p>
            <p>As an example, the dedication at the start of Milton's
<title>Comus</title> should be marked up as follows:
<egXML xmlns="http://www.tei-c.org/ns/Examples">
<div type="dedication">
<head>To the Right Honourable <name>JOHN Lord Viscount
BRACLY</name>, Son and Heir apparent to the Earl of
Bridgewater, &amp;c.</head><salute>MY LORD,</salute><p>THis <hi>Poem</hi>, which receiv'd its first occasion of
Birth from your Self, and others of your Noble Family ....
and as in this representation your attendant
<name>Thyrsis</name>, so now in all reall expression</p>
<closer><salute>Your faithfull, and most humble servant</salute><signed><name>H. LAWES.</name></signed></closer>
</div></egXML></p>
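            <p>The <gi>opener</gi> and <gi>dateline</gi> elements do not
appear in the example above. The following sketch, which uses an invented
letter and a hypothetical <att>type</att> value <val>letter</val>,
suggests how they might be combined with <gi>salute</gi>,
<gi>closer</gi>, and <gi>signed</gi>:
<egXML xmlns="http://www.tei-c.org/ns/Examples"><div type="letter"><opener><dateline><name type="city">Oxford</name>, <date>12 May 1850</date></dateline><salute>Dear Sir,</salute></opener><p>I send herewith the corrected sheets ...</p><closer><salute>Your obedient servant,</salute><signed><name>A. Writer</name></signed></closer></div></egXML></p>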
          </div>
        </div>
        <div>
          <head>Back Matter</head>
          <div>
            <head>Structural Divisions of Back Matter</head>
            <p>Because of variations in publishing practice, back matter can
contain virtually any of the elements listed above for front matter,
and the same elements should be used where this is so.  Additionally,
back matter may contain the following types of matter within the
<gi>back</gi> element.  Like the structural divisions of the body,
these should be marked as  <gi>div</gi> elements,
and distinguished by the following suggested values of the
<att>type</att> attribute:
<list type="gloss"><label>appendix</label><item>an
appendix.</item><label>glossary</label><item>a list of
words and definitions, typically marked up as a <gi>list type="gloss"</gi>
element.</item><label>notes</label><item>a series of <gi>note</gi>
elements.</item><label>bibliography</label><item>a series of bibliographic references, typically in the form of
a special bibliographic-list element <gi>listBibl</gi>, whose items
are individual <gi>bibl</gi> elements.</item><label>index</label><item>a set of index entries, possibly represented as a structured
list or glossary list, with optional leading
<gi>head</gi> and perhaps some paragraphs of introductory or closing
text (An index may also be generated for a document by using the
<gi>index</gi> element, described above in section 
<ptr target="#indexing"/>).</item><label>colophon</label><item>a description at the back of the book describing where, when,
and by whom it was printed; in modern books it also often gives
production details and identifies the type faces used.</item></list></p>
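            <p>Taken together, these suggestions might yield back matter along
the following lines. This is a sketch only: the glossary entry is
invented for illustration, and the choice and order of divisions are
assumptions rather than requirements:
<egXML xmlns="http://www.tei-c.org/ns/Examples"><back><div type="glossary"><head>Glossary</head><list type="gloss"><label>codex</label><item>a manuscript in book form.</item></list></div><div type="bibliography"><head>Works Cited</head><listBibl><bibl>The first folio of Shakespeare, prepared by Charlton
          Hinman (The Norton Facsimile, 1968)</bibl></listBibl></div><divGen type="index"/></back></egXML></p>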
          </div>
        </div>
      </div>
      <div xml:id="U5-header">
        <head>The Electronic Title Page</head>
        <p>Every TEI text has a header which provides information analogous
to that provided by the title page of printed text. The header is
introduced by the element <gi>teiHeader</gi> and has four major
parts:
<specList><specDesc key="fileDesc"/><specDesc key="encodingDesc"/><specDesc key="profileDesc"/><specDesc key="revisionDesc"/></specList></p>
        <p>    A corpus or collection of texts, which share many
characteristics, may have one header for the corpus and individual
headers for each component of the corpus.  In this case the <att>type</att>
attribute indicates the type of header.
<gi>teiHeader type="corpus"</gi> introduces the header for corpus-level information.</p>
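        <p>The overall shape of such a corpus might be sketched as follows.
The <gi>teiCorpus</gi> wrapper and the value <val>text</val> for the
individual headers are assumptions made for this illustration; neither
is described in this section:
<egXML xmlns="http://www.tei-c.org/ns/Examples"><teiCorpus><teiHeader type="corpus"> <!-- information common to the whole corpus --> </teiHeader><TEI><teiHeader type="text"> <!-- information specific to the first text --> </teiHeader><text> <!-- ... --> </text></TEI><TEI><teiHeader type="text"> <!-- information specific to the second text --> </teiHeader><text> <!-- ... --> </text></TEI></teiCorpus></egXML></p>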
        <p>Some of the header elements contain running prose which consists
of one or more <gi>p</gi> elements.  Others are grouped:
<list><item>Elements whose names end in <mentioned>Stmt</mentioned> (for
statement) usually enclose a group of elements recording some
structured information.</item><item>Elements whose names end in <mentioned>Decl</mentioned> (for
declaration) enclose information about specific encoding practices.</item><item>Elements whose names end in <mentioned>Desc</mentioned> (for
description) contain a prose description.</item></list></p>
        <div>
          <head>The File Description</head>
          <p>The <gi>fileDesc</gi> element is mandatory. It contains a full
bibliographic description of the file with the following elements:
<specList><specDesc key="titleStmt"/><specDesc key="editionStmt"/><specDesc key="extent"/><specDesc key="publicationStmt"/><specDesc key="seriesStmt"/><specDesc key="notesStmt"/><specDesc key="sourceDesc"/></specList></p>
          <p>   A minimal header has the following structure:
<egXML xmlns="http://www.tei-c.org/ns/Examples"><teiHeader><fileDesc><titleStmt> <!-- ... --> </titleStmt><publicationStmt> <!-- ... --> </publicationStmt><sourceDesc> <!-- ... --> </sourceDesc></fileDesc></teiHeader></egXML></p>
          <div>
            <head>The Title Statement</head>
            <p>The following elements can be used in the <gi>titleStmt</gi>:
<specList><specDesc key="title"/><specDesc key="author"/><specDesc key="sponsor"/><specDesc key="funder"/><specDesc key="principal"/><specDesc key="respStmt"/></specList></p>
            <p>   It is recommended that the title should distinguish the
computer file from the source text, for example:
<eg>[title of source]: a machine readable transcription
[title of source]: electronic edition
A machine readable version of: [title of source]</eg> The <gi>respStmt</gi> element contains the following
subcomponents:
<specList><specDesc key="resp"/><specDesc key="name"/></specList>   Example:
<egXML xmlns="http://www.tei-c.org/ns/Examples"><titleStmt><title>Two stories by Edgar Allan Poe: a machine readable
               transcription</title><author>Poe, Edgar Allan (1809-1849)</author><respStmt><resp>compiled by</resp><name>James D. Benson</name></respStmt></titleStmt></egXML></p>
          </div>
          <div>
            <head>The Edition Statement</head>
            <p>The <gi>editionStmt</gi> groups information relating to one
edition of a text (where <mentioned>edition</mentioned> is used as
elsewhere in bibliography), and may include the following elements:
<specList><specDesc key="edition"/><specDesc key="respStmt"/></specList></p>
            <p>Example:
<egXML xmlns="http://www.tei-c.org/ns/Examples"><editionStmt><edition n="U2">Third draft, substantially revised
     <date>1987</date>
     </edition></editionStmt></egXML></p>
            <p>Determining exactly what constitutes a new edition of an
electronic text is left to the encoder.</p>
          </div>
          <div>
            <head>The Extent Statement</head>
            <p>The <gi>extent</gi> element describes the approximate size of a
file.</p>
            <p>Example:
<egXML xmlns="http://www.tei-c.org/ns/Examples"><extent>4532 bytes</extent></egXML></p>
          </div>
          <div>
            <head>The Publication Statement</head>
            <p>The <gi>publicationStmt</gi> is mandatory. It may contain a
simple prose description or groups of the elements described below:
<specList><specDesc key="publisher"/><specDesc key="distributor"/><specDesc key="authority"/></specList></p>
            <p>At least one of these three elements must be present, unless the
entire publication statement is in prose. The following elements may
occur within them:
<specList><specDesc key="pubPlace"/><specDesc key="address"/><specDesc key="idno"/><specDesc key="availability"/><specDesc key="date"/></specList></p>
            <p>Example:
<egXML xmlns="http://www.tei-c.org/ns/Examples"><publicationStmt><publisher>Oxford University Press</publisher><pubPlace>Oxford</pubPlace><date>1989</date><idno type="ISBN"> 0-19-254705-5</idno><availability><p>Copyright 1989, Oxford University
          Press</p></availability></publicationStmt></egXML></p>
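            <p>Where the publication details are not worth structuring, the
prose alternative mentioned above might be sketched as follows (the
wording is invented for illustration):
<egXML xmlns="http://www.tei-c.org/ns/Examples"><publicationStmt><p>Distributed privately as an unpublished working draft.</p></publicationStmt></egXML></p>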
          </div>
          <div>
            <head>Series and Notes Statements</head>
            <p>The <gi>seriesStmt</gi> element groups information about the series, if
any, to which a publication belongs. It may contain <gi>title</gi>,
<gi>idno</gi>, or <gi>respStmt</gi> elements.</p>
            <p>The <gi>notesStmt</gi>, if used, contains one or more <gi>note</gi>
elements which contain a note or annotation. Some information found in
the notes area in conventional bibliography has been assigned specific
elements in the TEI scheme.</p>
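            <p>A sketch of how these two statements might appear follows. The
series title, editor, volume number, and note are all invented for
illustration, and the <att>type</att> value <val>vol</val> is a
hypothetical label:
<egXML xmlns="http://www.tei-c.org/ns/Examples"><seriesStmt><title>Hypothetical Series of Electronic Texts</title><respStmt><resp>edited by</resp><name>A. N. Editor</name></respStmt><idno type="vol">3</idno></seriesStmt><notesStmt><note>Transcribed from a microfilm copy; some pages are barely
          legible.</note></notesStmt></egXML></p>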
          </div>
          <div>
            <head>The Source Description</head>
            <p>The <gi>sourceDesc</gi> is a mandatory element which records
details of the source or sources from which the computer file is
derived. It may contain simple prose or a bibliographic citation,
using one or more of the following elements:
<specList><specDesc key="bibl"/><specDesc key="biblFull"/><specDesc key="listBibl"/></specList></p>
            <p>Examples:
<egXML xmlns="http://www.tei-c.org/ns/Examples"><sourceDesc><bibl>The first folio of Shakespeare, prepared by Charlton
          Hinman (The Norton Facsimile, 1968)</bibl></sourceDesc></egXML>
<egXML xmlns="http://www.tei-c.org/ns/Examples"><sourceDesc><bibl><author>CNN Network News</author><title>News headlines</title><date>12 Jun 1989</date></bibl></sourceDesc></egXML></p>
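            <p>The prose form might be used, for instance, where a text has no
source other than itself. A minimal sketch, with wording chosen for
illustration:
<egXML xmlns="http://www.tei-c.org/ns/Examples"><sourceDesc><p>No source: this text was created in machine-readable
          form.</p></sourceDesc></egXML></p>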
          </div>
        </div>
        <div>
          <head>The Encoding Description</head>
          <p>The <gi>encodingDesc</gi> element specifies the methods and
editorial principles which governed the transcription of the text. Its
use is highly recommended.  It may be prose description or may contain
elements from the following list:
<specList><specDesc key="projectDesc"/><specDesc key="samplingDecl"/><specDesc key="editorialDecl"/><specDesc key="refsDecl"/><specDesc key="classDecl"/></specList></p>
          <div>
            <head>Project and Sampling Descriptions</head>
            <p>Examples of <gi>projectDesc</gi> and <gi>samplingDecl</gi>:
<egXML xmlns="http://www.tei-c.org/ns/Examples"><encodingDesc><projectDesc><p>Texts collected for use in the Claremont
          Shakespeare Clinic, June 1990.
    </p></projectDesc></encodingDesc></egXML>
<egXML xmlns="http://www.tei-c.org/ns/Examples"><encodingDesc><samplingDecl><p>Samples of 2000 words taken from the beginning
          of the text</p>
     </samplingDecl></encodingDesc></egXML></p>
          </div>
          <div>
            <head>Editorial Declarations</head>
            <p>The <gi>editorialDecl</gi> contains a prose description of the
practices used when encoding the text. Typically this description
should cover such topics as the following, each of which may
conveniently be given as a separate paragraph. 
<list type="gloss"><label>correction </label><item>how and under what circumstances corrections have been made in
the text.</item><label>normalization</label><item>the extent to which the original source has been regularized or
normalized.</item><label>quotation</label><item>what has been done with quotation marks in the original -- have
they been retained or replaced by entity references, are opening and
closing quotes distinguished, etc. </item><label>hyphenation</label><item>what has been done with hyphens (especially end-of-line
hyphens)  in the original -- have they been retained, replaced by
entity references, etc.</item><label>segmentation</label><item>how the text has been segmented, for example into
sentences, tone-units, graphemic strata, etc.</item><label>interpretation</label><item>what analytic or interpretive information has been added to the
text. </item></list></p>
            <p>Example:
<egXML xmlns="http://www.tei-c.org/ns/Examples"><editorialDecl><p>The part of speech analysis applied throughout
               section 4 was added by hand and has not been
               checked.</p><p>Errors in transcription controlled by using the
               WordPerfect spelling checker.</p><p>All words converted to Modern American spelling
               using Webster's 9th Collegiate dictionary.</p><p>All quotation marks converted to entity
               references odq and cdq.</p></editorialDecl></egXML></p>
          </div>
          <div>
            <head>Reference and Classification Declarations</head>

            <p>The <gi>refsDecl</gi> element is used to document the way in
which any standard referencing scheme built into the encoding works.
In its simplest form, it consists of prose description.</p>
            <p>Example:
<egXML xmlns="http://www.tei-c.org/ns/Examples"><refsDecl><p>The <att>n</att> attribute on each <gi>div</gi> contains the
     canonical reference for each such division in the form
     XX.yyy, where XX is the book number in Roman numerals and
     yyy is the section number in Arabic numerals.</p></refsDecl></egXML></p>
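            <p>A division encoded according to such a declaration might look like
the following sketch, in which the reference <code>II.12</code> (book II,
section 12) and the content of the division are invented for illustration:
<egXML xmlns="http://www.tei-c.org/ns/Examples"><div n="II.12"><p>Text of the twelfth section of the second book.</p></div></egXML></p>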
            <p>The <gi>classDecl</gi> element groups together definitions or
sources for any descriptive classification schemes used by other parts
of the header. At least one such scheme must be provided, encoded
using the following elements:
<specList><specDesc key="taxonomy"/><specDesc key="bibl"/><specDesc key="category"/><specDesc key="catDesc"/></specList></p>
            <p>In the simplest case, the taxonomy may be defined by a
bibliographic reference, as in the following example:
<egXML xmlns="http://www.tei-c.org/ns/Examples"><classDecl><taxonomy xml:id="LC-SH"><bibl>Library of Congress Subject Headings
          </bibl></taxonomy></classDecl></egXML></p>
            <p>Alternatively, or in addition, the encoder may define a special
purpose classification scheme, as in the following example:
<egXML xmlns="http://www.tei-c.org/ns/Examples"><taxonomy xml:id="B"><bibl>Brown Corpus</bibl><category xml:id="B.A"><catDesc>Press Reportage</catDesc><category xml:id="B.A1"><catDesc>Daily</catDesc></category><category xml:id="B.A2"><catDesc>Sunday</catDesc></category><category xml:id="B.A3"><catDesc>National</catDesc></category><category xml:id="B.A4"><catDesc>Provincial</catDesc></category><category xml:id="B.A5"><catDesc>Political</catDesc></category><category xml:id="B.A6"><catDesc>Sports</catDesc></category></category><category xml:id="B.D"><catDesc>Religion</catDesc><category xml:id="B.D1"><catDesc>Books</catDesc></category><category xml:id="B.D2"><catDesc>Periodicals and tracts</catDesc></category></category>
</taxonomy></egXML></p>
            <p>Linkage between a particular text and a category within such a
taxonomy is made by means of the <gi>catRef</gi> element within the
<gi>textClass</gi> element, as further described
       below.</p>
          </div>
        </div>
        <div>
          <head>The Profile Description</head>
          <p>The <gi>profileDesc</gi> element enables information
characterizing various descriptive aspects of a text to be recorded
within a single framework. It has three optional components:
<specList><specDesc key="creation"/><specDesc key="langUsage"/><specDesc key="textClass"/></specList></p>
          <p>The <gi>creation</gi> element is useful for documenting where a
work was created, even though it may not have been published or
recorded there.</p>
          <p>Example:
<egXML xmlns="http://www.tei-c.org/ns/Examples"><creation><date when="1992-08">August 1992</date><name type="place">Taos, New Mexico</name></creation></egXML></p>
          <p>The <gi>langUsage</gi> element is useful where a text contains many
different languages. It may contain <gi>language</gi> elements to
document each particular language used:
<specList><specDesc key="language"/></specList>
</p>
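          <p>As a minimal sketch, assuming a hypothetical text written mainly in
English with some passages in French and Latin, the languages might be
documented as follows. The <att>ident</att> attribute supplies the language
tag used as the value of <att>xml:lang</att> elsewhere in the document; the
particular languages chosen here are invented for illustration:
<egXML xmlns="http://www.tei-c.org/ns/Examples"><langUsage><language ident="en">English</language><language ident="fr">French</language><language ident="la">Latin</language></langUsage></egXML></p>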
          <p>The <gi>textClass</gi> element classifies a text by reference to
the system or systems defined by the <gi>classDecl</gi> element, and
contains one or more of the following elements:
<specList><specDesc key="keywords"/><specDesc key="classCode"/><specDesc key="catRef"/></specList>
</p>
          <p>The element <gi>keywords</gi> contains a list of keywords or
phrases identifying the topic or nature of a text. The attribute
<att>scheme</att> links these to the classification system
defined in
<gi>taxonomy</gi>.
<egXML xmlns="http://www.tei-c.org/ns/Examples"><textClass><keywords scheme="#LC-SH"><list><item>English literature -- History and criticism --
               Data processing.</item><item>English literature -- History and criticism --
               Theory etc.</item><item>English language -- Style -- Data
               processing.</item></list></keywords></textClass></egXML></p>
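          <p>Where a taxonomy has been defined using <gi>category</gi> elements,
a <gi>catRef</gi> may point directly to the relevant category by means of its
<att>target</att> attribute. The following sketch assumes the special purpose
Brown Corpus taxonomy shown in the discussion of <gi>classDecl</gi> above, and
classifies a hypothetical text as national press reportage:
<egXML xmlns="http://www.tei-c.org/ns/Examples"><textClass><catRef target="#B.A3"/></textClass></egXML></p>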
        </div>
        <div>
          <head>The Revision Description</head>
          <p>The <gi>revisionDesc</gi> element provides a change log in which
each change made to a text may be recorded. The log may be recorded as
a sequence of <gi>change</gi> elements each of which contains
a brief description of the change. The attributes <att>when</att> and
<att>who</att> may be used to identify when the change was carried out
and the agency responsible for it. </p>
          <p>Example:
<egXML xmlns="http://www.tei-c.org/ns/Examples"><revisionDesc><change when="1991-03-06" who="EMB">File format updated</change><change when="1990-05-25" who="EMB">Stuart's corrections entered</change></revisionDesc></egXML></p>
        </div>
      </div>
      <div xml:id="U5-taglist">
        <head>List of Elements Described</head>
        <p>The following list shows all the elements defined for the TEI
  Lite schema, with a brief description of each, and a link to its full
specification in the Appendix.
<specList rend="noatts"><specDesc rend="noatts" key="abbr"/><specDesc rend="noatts" key="add"/><specDesc rend="noatts" key="address"/><specDesc rend="noatts" key="addrLine"/><specDesc rend="noatts" key="anchor"/><specDesc rend="noatts" key="argument"/><specDesc rend="noatts" key="author"/><specDesc rend="noatts" key="authority"/><specDesc rend="noatts" key="availability"/><specDesc rend="noatts" key="back"/><specDesc rend="noatts" key="bibl"/><specDesc rend="noatts" key="biblFull"/><specDesc rend="noatts" key="biblScope"/><specDesc rend="noatts" key="body"/><specDesc rend="noatts" key="byline"/><specDesc rend="noatts" key="catDesc"/><specDesc rend="noatts" key="category"/><specDesc rend="noatts" key="catRef"/><specDesc rend="noatts" key="cell"/><specDesc rend="noatts" key="change"/><specDesc rend="noatts" key="choice"/><specDesc rend="noatts" key="cit"/><specDesc rend="noatts" key="classCode"/><specDesc rend="noatts" key="classDecl"/><specDesc rend="noatts" key="closer"/><specDesc rend="noatts" key="code"/><specDesc rend="noatts" key="corr"/><specDesc rend="noatts" key="creation"/><specDesc rend="noatts" key="date"/><specDesc rend="noatts" key="dateline"/><specDesc rend="noatts" key="del"/><specDesc rend="noatts" key="distributor"/><specDesc rend="noatts" key="div"/><specDesc rend="noatts" key="divGen"/><specDesc rend="noatts" key="docAuthor"/><specDesc rend="noatts" key="docDate"/><specDesc rend="noatts" key="docEdition"/><specDesc rend="noatts" key="docImprint"/><specDesc rend="noatts" key="docTitle"/><specDesc rend="noatts" key="edition"/><specDesc rend="noatts" key="editionStmt"/><specDesc rend="noatts" key="editor"/><specDesc rend="noatts" key="editorialDecl"/><specDesc rend="noatts" key="eg"/><specDesc rend="noatts" key="emph"/><specDesc rend="noatts" key="encodingDesc"/><specDesc rend="noatts" key="epigraph"/><specDesc rend="noatts" key="extent"/><specDesc rend="noatts" key="figure"/><specDesc rend="noatts" key="fileDesc"/><specDesc rend="noatts" 
key="foreign"/><specDesc rend="noatts" key="formula"/><specDesc rend="noatts" key="front"/><specDesc rend="noatts" key="funder"/><specDesc rend="noatts" key="gap"/><specDesc rend="noatts" key="gi"/><specDesc rend="noatts" key="gloss"/><specDesc rend="noatts" key="group"/><specDesc rend="noatts" key="head"/><specDesc rend="noatts" key="hi"/><specDesc rend="noatts" key="ident"/><specDesc rend="noatts" key="idno"/><specDesc rend="noatts" key="index"/><specDesc rend="noatts" key="interp"/><specDesc rend="noatts" key="interpGrp"/><specDesc rend="noatts" key="item"/><specDesc rend="noatts" key="keywords"/><specDesc rend="noatts" key="l"/><specDesc rend="noatts" key="label"/><specDesc rend="noatts" key="language"/><specDesc rend="noatts" key="langUsage"/><specDesc rend="noatts" key="lb"/><specDesc rend="noatts" key="lg"/><specDesc rend="noatts" key="list"/><specDesc rend="noatts" key="listBibl"/><specDesc rend="noatts" key="mentioned"/><specDesc rend="noatts" key="milestone"/><specDesc rend="noatts" key="name"/><specDesc rend="noatts" key="note"/><specDesc rend="noatts" key="notesStmt"/><specDesc rend="noatts" key="num"/><specDesc rend="noatts" key="opener"/><specDesc rend="noatts" key="orig"/><specDesc rend="noatts" key="p"/><specDesc rend="noatts" key="pb"/><specDesc rend="noatts" key="principal"/><specDesc rend="noatts" key="profileDesc"/><specDesc rend="noatts" key="projectDesc"/><specDesc rend="noatts" key="ptr"/><specDesc rend="noatts" key="publicationStmt"/><specDesc rend="noatts" key="publisher"/><specDesc rend="noatts" key="pubPlace"/><specDesc rend="noatts" key="q"/><specDesc rend="noatts" key="ref"/><specDesc rend="noatts" key="refsDecl"/><specDesc rend="noatts" key="reg"/><specDesc rend="noatts" key="resp"/><specDesc rend="noatts" key="respStmt"/><specDesc rend="noatts" key="revisionDesc"/><specDesc rend="noatts" key="row"/><specDesc rend="noatts" key="rs"/><specDesc rend="noatts" key="s"/><specDesc rend="noatts" key="salute"/><specDesc rend="noatts" 
key="samplingDecl"/><specDesc rend="noatts" key="seg"/><specDesc rend="noatts" key="seriesStmt"/><specDesc rend="noatts" key="sic"/><specDesc rend="noatts" key="signed"/><specDesc rend="noatts" key="soCalled"/><specDesc rend="noatts" key="sourceDesc"/><specDesc rend="noatts" key="sp"/><specDesc rend="noatts" key="speaker"/><specDesc rend="noatts" key="sponsor"/><specDesc rend="noatts" key="stage"/><specDesc rend="noatts" key="table"/><specDesc rend="noatts" key="taxonomy"/><specDesc rend="noatts" key="TEI"/><specDesc rend="noatts" key="teiHeader"/><specDesc rend="noatts" key="text"/><specDesc rend="noatts" key="term"/><specDesc rend="noatts" key="textClass"/><specDesc rend="noatts" key="time"/><specDesc rend="noatts" key="title"/><specDesc rend="noatts" key="titlePage"/><specDesc rend="noatts" key="titlePart"/><specDesc rend="noatts" key="titleStmt"/><specDesc rend="noatts" key="trailer"/><specDesc rend="noatts" key="unclear"/></specList>
  </p>
      </div>
    </body>
    <back>
      <head>Appendixes</head>
      <div xml:id="changes">
        <head>Substantive changes from the P4
version</head>
        <p>This revision of the TEI Lite schema conforms to the TEI P5
Guidelines, which make a number of changes from the TEI P4 Guidelines
underlying earlier versions of TEI Lite. The following brief list
indicates some of the major changes which will be needed in existing
TEI P4-conformant documents before they can be used with the new
schema. A fuller list is in preparation for publication as a part of
TEI P5; the items listed here relate only to changes in TEI Lite.</p>
        <list>
          <item>At P5, a TEI document must declare a namespace of
<code>http://www.tei-c.org/ns/1.0</code></item>
          <item>The attributes <att>id</att> and <att>lang</att> are replaced by
the attributes <att>xml:id</att> and <att>xml:lang</att>
respectively. Values for the latter attribute must conform to RFC
3066</item>
          <item>The element <gi>choice</gi> must be used to wrap <gi>reg</gi>
and <gi>orig</gi> if both are supplied. Similarly for
<gi>sic</gi> and <gi>corr</gi>, and for <gi>abbr</gi> and
<gi>expan</gi>.</item>
          <item><soCalled>numbered divs</soCalled> (<gi>div1</gi>,
<gi>div2</gi>, etc.) are not supported in this version of TEI Lite</item>
          <item>all pointing and linking mechanisms now use the same W3C-defined
mechanism: there is no longer any distinction between internal and
external pointing elements</item>
          <item>the content model of <gi>change</gi> has changed
significantly</item>
          <item>
            <hi>hic desunt multa</hi>
          </item>
        </list>
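        <p>The following sketch illustrates several of these changes together:
the namespaced identification and language attributes, the use of
<gi>choice</gi> to wrap <gi>sic</gi> and <gi>corr</gi>, and a pointer expressed
as a URI reference. The identifier and the text are hypothetical:
<egXML xmlns="http://www.tei-c.org/ns/Examples"><p xml:id="P1" xml:lang="en">I shall <choice><sic>recieve</sic><corr>receive</corr></choice> it tomorrow.</p><p>See <ref target="#P1">the first paragraph</ref>.</p></egXML></p>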
      </div>
      <div>
        <head>Formal specification</head>
        <p>TEI Lite is a pure subset of the TEI. All of the elements
defined in it are taken from the following standard TEI
modules:

 <ident type="module">tei</ident>, <ident type="module">core</ident>, <ident type="module">header</ident>, <ident type="module">textstructure</ident>, <ident type="module">figures</ident>, <ident type="module">linking</ident>, <ident type="module">analysis</ident>, and <ident type="module">tagdocs</ident>.
</p>
        <p>The following elements from those modules are excluded from the
schema:
<gi>ab</gi>, 
<gi>alt</gi>, 
<gi>altGrp</gi>, 
<gi>altIdent</gi>, 
<gi>analytic</gi>, 
<gi>attDef</gi>, 
<gi>attList</gi>, 
<gi>attRef</gi>, 
<gi>biblItem</gi>, 
<gi>biblStruct</gi>, 
<gi>binaryObject</gi>, 
<gi>broadcast</gi>, 
<gi>c</gi>, 
<gi>cb</gi>, 
<gi>cl</gi>, 
<gi>classSpec</gi>, 
<gi>classes</gi>, 
<gi>content</gi>, 
<gi>correction</gi>, 
<gi>datatype</gi>, 
<gi>defaultVal</gi>, 
<gi>desc</gi>, 
<gi>distinct</gi>, 
<gi>div1</gi>, 
<gi>div2</gi>, 
<gi>div3</gi>, 
<gi>div4</gi>, 
<gi>div5</gi>, 
<gi>div6</gi>, 
<gi>div7</gi>, 
<gi>egXML</gi>, 
<gi>elementSpec</gi>, 
<gi>equipment</gi>, 
<gi>equiv</gi>, 
<gi>exemplum</gi>, 
<gi>fsdDecl</gi>, 
<gi>floatingText</gi>, 
<gi>headItem</gi>, 
<gi>headLabel</gi>, 
<gi>hyphenation</gi>, 
<gi>imprimatur</gi>, 
<gi>interpretation</gi>, 
<gi>join</gi>, 
<gi>joinGrp</gi>, 
<gi>link</gi>, 
<gi>linkGrp</gi>, 
<gi>listRef</gi>, 
<gi>m</gi>, 
<gi>macroSpec</gi>, 
<gi>measure</gi>, 
<gi>meeting</gi>, 
<gi>memberOf</gi>, 
<gi>metDecl</gi>, 
<gi>metSym</gi>, 
<gi>moduleRef</gi>, 
<gi>moduleSpec</gi>, 
<gi>monogr</gi>, 
<gi>normalization</gi>, 
<gi>phr</gi>, 
<gi>postBox</gi>, 
<gi>postCode</gi>, 
<gi>quotation</gi>, 
<!--<gi>quote</gi>, -->
<gi>recording</gi>, 
<gi>recordingStmt</gi>, 
<gi>remarks</gi>, 
<gi>schemaSpec</gi>, 
<gi>scriptStmt</gi>, 
<gi>segmentation</gi>, 
<gi>series</gi>, 
<gi>span</gi>, 
<gi>spanGrp</gi>, 
<gi>specDesc</gi>, 
<gi>specGrp</gi>, 
<gi>specGrpRef</gi>, 
<gi>specList</gi>, 
<gi>state</gi>, 
<gi>stdVals</gi>, 
<gi>street</gi>, 
<gi>stringVal</gi>, 
<gi>tag</gi>, 
<gi>timeline</gi>, 
<gi>valDesc</gi>, 
<gi>valItem</gi>, 
<gi>valList</gi>, 
<gi>variantEncoding</gi>, 
<gi>w</gi>, 
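<gi>when</gi>. The schema specification below additionally deletes
<gi>classRef</gi>, <gi>constraint</gi>, <gi>constraintSpec</gi>,
<gi>elementRef</gi>, <gi>email</gi>, <gi>macroRef</gi>, and
<gi>tagsDecl</gi>, together with the otherwise unreachable elements
<gi>handNote</gi>, <gi>imprint</gi>, <gi>namespace</gi>,
<gi>rendition</gi>, and <gi>tagUsage</gi>, and removes the
<att>rendition</att> attribute from the global attribute class.</p>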
        <p>Here is the TEI Lite schema itself:</p>
        <schemaSpec ident="teilite" start="TEI teiCorpus">

          <moduleRef key="tei"/>
          <moduleRef key="core"/>
          <moduleRef key="header"/>
          <moduleRef key="textstructure"/>
          <moduleRef key="figures"/>
          <moduleRef key="linking"/>
          <moduleRef key="analysis"/>
          <moduleRef key="tagdocs"/>

          <elementSpec  module="linking" ident="ab" mode="delete"/>
          <elementSpec  module="linking" ident="altGrp" mode="delete"/>
          <elementSpec  module="tagdocs" ident="altIdent" mode="delete"/>
          <elementSpec  module="linking" ident="alt" mode="delete"/>
          <elementSpec  module="core" ident="analytic" mode="delete"/>
          <elementSpec  module="tagdocs" ident="attDef" mode="delete"/>
          <elementSpec  module="tagdocs" ident="attList" mode="delete"/>
          <elementSpec  module="tagdocs" ident="attRef" mode="delete"/>
          <elementSpec  module="core" ident="biblItem" mode="delete"/>
          <elementSpec  module="core" ident="biblStruct" mode="delete"/>
          <elementSpec  module="core" ident="binaryObject" mode="delete"/>
          <elementSpec  module="header" ident="broadcast" mode="delete"/>
          <elementSpec  module="core" ident="cb" mode="delete"/>
          <elementSpec  module="tagdocs" ident="classes" mode="delete"/>
          <elementSpec  module="tagdocs" ident="classSpec" mode="delete"/>
          <elementSpec  module="analysis" ident="cl" mode="delete"/>
          <elementSpec  module="analysis" ident="c" mode="delete"/>
          <elementSpec  module="tagdocs" ident="content" mode="delete"/>
          <elementSpec  module="tagdocs" ident="classRef" mode="delete"/>
          <elementSpec ident="constraint" mode="delete" module="tagdocs"/>
          <elementSpec ident="constraintSpec" mode="delete" module="tagdocs"/>
          <elementSpec  module="header" ident="correction" mode="delete"/>
          <elementSpec  module="tagdocs" ident="datatype" mode="delete"/>
          <elementSpec  module="tagdocs" ident="defaultVal" mode="delete"/>
          <elementSpec  module="core" ident="distinct" mode="delete"/>
          <elementSpec  module="textstructure" mode="delete" ident="div1"/>
          <elementSpec  module="textstructure" mode="delete" ident="div2"/>
          <elementSpec  module="textstructure" mode="delete" ident="div3"/>
          <elementSpec  module="textstructure" mode="delete" ident="div4"/>
          <elementSpec  module="textstructure" mode="delete" ident="div5"/>
          <elementSpec  module="textstructure" mode="delete" ident="div6"/>
          <elementSpec  module="textstructure" mode="delete" ident="div7"/>
          <elementSpec  module="tagdocs" ident="egXML" mode="delete"/>
          <elementSpec  module="tagdocs" ident="elementRef" mode="delete"/>
          <elementSpec  module="core" ident="email" mode="delete"/>
          <elementSpec  module="tagdocs" ident="elementSpec" mode="delete"/>
          <elementSpec  module="header" ident="equipment" mode="delete"/>
          <elementSpec  module="core" ident="equiv" mode="delete"/>
          <elementSpec  module="tagdocs" ident="exemplum" mode="delete"/>
          <elementSpec  module="textstructure" ident="floatingText" mode="delete"/>

          <elementSpec  module="header" ident="fsdDecl" mode="delete"/>
          <elementSpec  module="core" ident="headItem" mode="delete"/>
          <elementSpec  module="core" ident="headLabel" mode="delete"/>
          <elementSpec  module="header" ident="hyphenation" mode="delete"/>
          <elementSpec  module="textstructure" ident="imprimatur" mode="delete"/>
          <elementSpec  module="header" ident="interpretation" mode="delete"/>
          <elementSpec  module="linking" ident="joinGrp" mode="delete"/>
          <elementSpec  module="linking" ident="join" mode="delete"/>
          <elementSpec  module="linking" ident="linkGrp" mode="delete"/>
          <elementSpec  module="linking" ident="link" mode="delete"/>
          <elementSpec  module="tagdocs" ident="listRef" mode="delete"/>
          <elementSpec  module="tagdocs" ident="macroRef" mode="delete"/>
          <elementSpec  module="tagdocs" ident="macroSpec" mode="delete"/>
          <elementSpec  module="core" ident="measure" mode="delete"/>
          <elementSpec  module="core" ident="meeting" mode="delete"/>
          <elementSpec  module="tagdocs" ident="memberOf" mode="delete"/>
          <elementSpec  module="header" ident="metDecl" mode="delete"/>
          <elementSpec  module="header" ident="metSym" mode="delete"/>
          <elementSpec  module="analysis" ident="m" mode="delete"/>
          <elementSpec  ident="moduleRef" mode="delete" module="tagdocs"/>
          <elementSpec  ident="moduleSpec" mode="delete" module="tagdocs"/>
          <elementSpec  module="core" ident="monogr" mode="delete"/>
          <elementSpec  module="header" ident="normalization" mode="delete"/>
          <elementSpec  module="analysis" ident="phr" mode="delete"/>
          <elementSpec  module="core" ident="postBox" mode="delete"/>
          <elementSpec  module="core" ident="postCode" mode="delete"/>
          <elementSpec  module="header" ident="quotation" mode="delete"/>
          <elementSpec  module="header" ident="recording" mode="delete"/>
          <elementSpec  module="header" ident="recordingStmt" mode="delete"/>
          <elementSpec  module="tagdocs" ident="remarks" mode="delete"/>
          <elementSpec  module="tagdocs" ident="schemaSpec" mode="delete"/>
          <elementSpec  module="header" ident="scriptStmt" mode="delete"/>
          <elementSpec  module="header" ident="segmentation" mode="delete"/>
          <elementSpec  module="core" ident="series" mode="delete"/>
          <elementSpec  module="analysis" ident="spanGrp" mode="delete"/>
          <elementSpec  module="analysis" ident="span" mode="delete"/>
          <elementSpec  module="tagdocs" ident="specDesc" mode="delete"/>
          <elementSpec  module="tagdocs" ident="specGrp" mode="delete"/>
          <elementSpec  module="tagdocs" ident="specGrpRef" mode="delete"/>
          <elementSpec  module="tagdocs" ident="specList" mode="delete"/>
          <elementSpec  module="header" ident="state" mode="delete"/>
          <elementSpec  module="header" ident="stdVals" mode="delete"/>
          <elementSpec  module="core" ident="street" mode="delete"/>
          <elementSpec  module="tagdocs" ident="stringVal" mode="delete"/>
          <elementSpec  module="tagdocs" ident="tag" mode="delete"/>
          <elementSpec  module="header" ident="tagsDecl" mode="delete"/>
          <elementSpec  module="linking" ident="timeline" mode="delete"/>
          <elementSpec  module="tagdocs" ident="valDesc" mode="delete"/>
          <elementSpec  module="tagdocs" ident="valItem" mode="delete"/>
          <elementSpec  module="tagdocs" ident="valList" mode="delete"/>
          <elementSpec  module="header" ident="variantEncoding" mode="delete"/>
          <elementSpec  module="linking" ident="when" mode="delete"/>
          <elementSpec  module="analysis" ident="w" mode="delete"/>

<!-- added 2008/01/30, after discovering these are all unreachable -->

          <elementSpec  module="header" ident="handNote" mode="delete"/>
          <elementSpec  module="header" ident="tagUsage" mode="delete"/>
          <elementSpec  module="core" ident="imprint" mode="delete"/>
          <elementSpec  module="header" ident="rendition" mode="delete"/>
          <elementSpec  module="header" ident="namespace" mode="delete"/>

	  <classSpec ident="att.global" type="atts" mode="change" module="tei">
	    <attList>
	      <attDef ident="rendition" mode="delete"/>
	    </attList>
	  </classSpec>

        </schemaSpec>
      </div>
    </back>
  </text>
</TEI>
