4 A brief summary of the MASTER proposals

One of the consequences of the decision to define the MASTER encoding scheme as an application of the TEI document type definition is the availability of a number of standard TEI elements; another, perhaps less convenient, is that an awareness of the scope and idioms of the TEI scheme is necessary to fully understand the functionality of the proposed scheme. We do not attempt to provide such background here: the interested reader should consult the TEI web site for pointers to such information; perhaps of particular interest to francophone readers is the special edition of Cahiers Gutenberg published in October 1996, which provides an excellent introduction.

An <msDescription> element may reasonably appear either within the body or within the header of a TEI-conformant document. In the former case, where the document being encoded is essentially a collection of manuscript descriptions, the <msDescription> element may be placed anywhere that a paragraph might appear. In the latter case, where the description forms part of the metadata to be associated with a digital representation of some manuscript original, whether as a transcription, as a collection of digital images, or as some combination of the two, the <msDescription> should appear within the <sourceDesc> element of the header.
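
As a rough sketch of the second placement, assuming the usual TEI header layout, a description embedded in the header might be arranged as follows. The comments are placeholders for content not discussed here, and the identifier anticipates the element described in section 4.1:

```xml
<teiHeader>
  <fileDesc>
    <titleStmt>
      <title><!-- title of the electronic transcription or image set --></title>
    </titleStmt>
    <publicationStmt>
      <p><!-- publication details --></p>
    </publicationStmt>
    <sourceDesc>
      <!-- the manuscript description acts as metadata for the digital text -->
      <msDescription>
        <msIdentifier>
          <repository>Huntington Library</repository>
          <idNo>EL 26 C 9</idNo>
        </msIdentifier>
      </msDescription>
    </sourceDesc>
  </fileDesc>
</teiHeader>
```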

The <msDescription> element may contain up to eight of the following different components, of which only the first is mandatory:

4.1 Status and Identification

In addition to the global attributes available to all TEI elements, the <msDescription> element carries a special purpose STATUS attribute which defines the compositional status of the manuscript being described, i.e. whether it is a complete unitary object, a composite of fragments, or an isolated group of fragments. This attribute is intended solely as an indication of the overall status of the manuscript being described; details of its composition are documented in various other ways, more fully described elsewhere. For example, as noted above, a composite manuscript, in which several originally distinct complete or fragmentary manuscripts have been physically combined, may be described by a manuscript description whose internal structure mirrors that of the manuscript, with distinct <msPart> elements for each of the identifiable fragments.
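
A minimal sketch of a composite manuscript described in this way might look as follows. The attribute value "compo" is an assumption made for illustration only; the values actually permitted for STATUS should be checked against the DTD:

```xml
<!-- status value is hypothetical: assumed here to mean "composite" -->
<msDescription status="compo">
  <!-- identification and other components common to the whole volume -->
  <msPart>
    <!-- description of the first constituent manuscript or fragment -->
  </msPart>
  <msPart>
    <!-- description of the second constituent manuscript or fragment -->
  </msPart>
</msDescription>
```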

The <msIdentifier> element is used, as the name suggests, to supply the identifier used for a manuscript within the holding institution. As a minimum, the identifier should supply the name of the repository and an identifying number such as a shelfmark, as in this example:

<msIdentifier>
<repository>Huntington Library</repository>
<idNo>EL 26 C 9</idNo>
</msIdentifier>

Further elements may additionally be supplied to specify the country, region etc. of the repository, and also to supply any alternative names or former shelfmarks associated with the manuscript in this repository, as in the following (fictitious) example:

<msIdentifier>
<institution>The University of Oxford</institution>
<repository>Bodleian Library</repository>
<idNo type="BOD">Rawl. MS poet. 176</idNo>
<altName type="nick">The Tungebryht manuscript</altName>
</msIdentifier>

As both the above examples demonstrate, we have not yet agreed on a way of normalizing such features as the names of institutions or places by the use of authority lists or similar features. We anticipate that this will be an important further stage during the development of the standard, for which, fortunately, a range of appropriate mechanisms already exists within the TEI scheme.

The <msHeading> element is provided to make life easier for the cataloguer who wishes to use a standardised or summary name for a manuscript, which might contain some elements derived from both its shelf number and its intellectual content (for example) but not all of them, perhaps including additional information not strictly present in either. It is an optional element, which does not strictly form part of the <msIdentifier>, since it is a form of supplied title for the whole description.

4.2 The Manuscript Summary

Manuscript cataloguing practices vary widely, not only in the scope of the information included, but also in the amount of detail which it is feasible or desirable to record for each item. Particularly when handling legacy data, there is a frequent demand for some kind of `minimal' level record, whether as an end in itself or as an initial stage in the creation of a more complete record.

The MASTER DTD makes it possible for cataloguers to select fairly freely from a very wide range of detailed cataloguing possibilities, which can be tailored to specific project needs, and moreover later be expanded as appropriate. The DTD described here is designed precisely to facilitate such incremental enrichment. However, there is also a frequently voiced desire for some specific recommendation concerning the minimum practical level of detail to be recorded. The <msSummary> element is provided to meet this need, and also that for rapid conversion of legacy data. It should not be used simply to hold a short summary title, `docket', or `tombstone' specifying a supplied title or heading applicable to the whole of a manuscript: the <msHeading> element is provided for this purpose. Moreover, it should not be used where more detail is required than can be accommodated by the <msSummary> element; indeed, good practice may suggest the removal of this element from the record when more detailed information has been included in the remainder of the manuscript description.

The <msSummary> element is unlike other components of the MASTER manuscript description in that its contents are relatively constrained. It is rather more like a traditional database record than a piece of free prose in which certain words happen to be tagged. It allows the cataloguer to specify and to mark unambiguously information about authorship, origin (place and date), titles, and languages used within a manuscript, but little else. These components are signalled by using a mixture of pre-defined TEI elements (such as <author>, <langUsage>, and <title>) and newly-defined specialist elements (such as <origPlace> and <origDate>). The constituents of a <msSummary> must be supplied in a prescribed order, and only the <author>, <title> and <note> elements may be repeated.

Here is a simple example of the use of this element:

<msSummary>
<author>Domenico Cavalca</author>
<title>Vite dei santi padri</title>
</msSummary>

As previously noted, we recognize the need for additional mechanisms such as authority lists and controlled thesauri to facilitate standardization of the content of these elements; these will probably involve the definition of additional attributes to hold normalized forms, or links to other components in a knowledge base. At this stage in the project, however, our focus is on defining an appropriate structure, adequate to the needs we have so far identified as common to all projects. It may well be that the elements described so far will be all that some cataloguing projects ever find necessary or feasible to supply for an initial cataloguing exercise. However, we also anticipate a need for far more detailed cataloguing, using one or more of the remaining components of the <msDescription> element.

4.3 Intellectual Content

A <msSummary> will typically supply only a single main author and possibly a supplied title. In the common case that a manuscript contains many distinct works, or parts of works, a more complex description is likely to be desirable. Rather than try to complicate the structure of the <msSummary> element, we propose the use of a specialized <msContents> element in which the structure and organization of the intellectual content of a manuscript can be faithfully and accurately recorded to whatever degree of complexity is appropriate.

A <msContents> element consists of one or more <msItem> elements, each of which describes a single distinct "item" of intellectual content, as determined by the cataloguer. It is a matter for individual cataloguing practice to decide whether, for example, each poem in a miscellany of poems, each life in a collection of saints' lives, or each charter in a cartulary, should be regarded as a discrete item. The purpose of the tagging scheme we describe here is to provide a method by which the results of that decision can be communicated unambiguously, not to provide guidance on how it should be made, except insofar as that is implied by the definition we propose for distinct <msItem> elements.

A <msItem> element may simply contain running text with no further tagging; more usually, however, it will contain additional identifiable components. Many of these are standard TEI elements: these are <author>, <respStmt>, <title>, <langUsage>, <q> (for quotations), and <bibl> (for conventional bibliographic description, for example of a modern edition). These may however be combined with any appropriate combination of the following more specialized and manuscript-specific elements: <colophon>, <incipit>, <explicit>, <rubric>, and <finalRubric>.

Finally, a special purpose <locus> element is provided to specify the location in the manuscript occupied by the element within which it appears. Locations are conventionally specified as a sequence of folio numbers, but may also be a discontinuous list, or a combination of the two. This specification should be given as the content of the <locus> element. It may also be specified in a normalised form using special purpose attributes on the <locus> element. To avoid ambiguity, a <locus> element should be the first component element of the item whose location is being specified, and it should not normally be repeated within that element.
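
As a sketch of the normalised form just mentioned, the content of the element can be left as the cataloguer wrote it while attributes carry the start and end points in a regular form. The attribute names "from" and "to" are assumptions made for illustration, since the attributes are not named in this summary:

```xml
<!-- content: the location as conventionally written -->
<!-- attributes (names assumed): the same range in normalised form -->
<locus from="1r" to="223v">ff. 1-223v</locus>
```

Keeping both forms allows the original wording to be displayed while searches operate on the normalised values.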

Each element within <msItem> has the same substructure, in which (following an optional <locus> element) any combination of low-level TEI phrase elements and plain text may appear.

Here is a simple example, for a manuscript containing a single item which occupies folios 1 to 223, with an incipit containing the words ``Forte Hervei monachi'' on folio 1, and an explicit on the verso of f. 223 which reads ``Benedictio salis et aquae'':

<msItem><locus>ff. 1-223v</locus>
<author>Radulphus Flaviacensis</author>
<title>Expositio super Leviticum</title>
cf. <bibl>Stegmüller, RB 7093</bibl>
<incipit><locus>f. 1</locus>
Forte Hervei monachi</incipit>
<explicit><locus>f. 223v</locus>
Benedictio salis et aquae</explicit>
</msItem>

Here is a more complex and perhaps more typical example:

<msItem n="1"><title><locus>fols. 5r-7v</locus>An ABC</title>:
<ref>IMEV 239</ref></msItem>
<msItem n="2"><title lang="FRA"><locus>fols. 7v-8v</locus>
Lenvoy de Chaucer a Scogan</title>: <ref>IMEV 3747</ref></msItem>
<msItem n="3"><title><locus>fol. 8v</locus>Truth</title>:
<ref>IMEV 809</ref></msItem>
<msItem n="4"><title><locus>fols. 8v-10v</locus>
Birds Praise of Love</title>: <ref>IMEV 1506</ref></msItem>
<msItem n="5">
  <title type="supp"><locus>fols. 10v-11v</locus>Two Latin poems</title>
  <msItem><title lang="LAT">De amico ad amicam</title><ref>IMEV 16</ref></msItem>
  <msItem><title lang="LAT">Responcio</title><ref>IMEV 19</ref></msItem>
</msItem>
<msItem n="6"><title><locus>fols. 14r-126v</locus>
Troilus and Criseyde</title>
(Bk. 1:71-Bk. 5:1701, with additional losses due to
mutilation throughout)</msItem>

As shown in the fifth item above, a manuscript item may itself contain further nested manuscript items, for example where a title is supplied for a group of works each of which is also titled. More complex situations where groups of items are nested arbitrarily deep are also imaginable.

4.4 Physical Description

Of the three remaining chief groups of the standard manuscript description, that concerned with its physical description is probably the most complex, and also that in which cataloguing practice tends to be most divergent. Recognising this, the MASTER DTD does not currently impose a strict model on the information to be collected under this heading; instead, it allows cataloguers simply to describe whatever information they wish as regular prose, optionally grouping one or more paragraphs under specific categories, for which specific elements are defined.

The categories for which elements have so far been identified and which may therefore be distinguished within a full description include the format (e.g. codex, scroll, etc.), characteristics of the support (e.g. its material) and of the layout (e.g. number of columns), and scripts used. Element definitions are also provided for information relating solely to the binding of a manuscript, its collation and other paratextual features. Finally, and perhaps more controversially, discussion of decorative features of the manuscript is currently classed under physical description, as is discussion of any music contained within it.

The majority of the elements making up a physical description will consist of plain prose descriptions, expressed as one or more paragraphs. Indeed, a <physDesc> element may contain nothing but a series of paragraphs: this would be essential in the case where the description intermingles discussion of (say) collation, decorative features, and binding indiscriminately. More usually, discussion of each of the topics listed above will be confined to one or more distinct paragraphs, in which case the encoder has the option to use the more specific elements (<binding>, <music>, <collation> etc.) proposed by the MASTER DTD to distinguish them. In a few cases (notably collation) this principle applies at a further level; that is, a <collation> element may contain either simply a series of paragraphs, or it may contain one or more of the more specialized elements <catchwords> and <signatures>, each of which contains paragraphs relating specifically to catchwords and to signatures respectively.
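
The two styles might be sketched as follows. The element names are those given above; the comments stand in for the cataloguer's prose, and the order of the child elements is an assumption, since this summary does not say whether the DTD prescribes one:

```xml
<!-- Style 1: undifferentiated prose -->
<physDesc>
  <p><!-- prose mixing collation, decoration and binding --></p>
  <p><!-- further prose --></p>
</physDesc>

<!-- Style 2: the same material distributed over specific elements -->
<physDesc>
  <collation>
    <!-- collation further divided into its specialized parts -->
    <catchwords><p><!-- remarks on catchwords --></p></catchwords>
    <signatures><p><!-- remarks on signatures --></p></signatures>
  </collation>
  <binding><p><!-- remarks on the binding --></p></binding>
  <music><p><!-- remarks on any music in the manuscript --></p></music>
</physDesc>
```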

A particular strength of this approach is that the TEI <p> element can contain a wide range of useful objects other than simple running prose. These include features such as dates, numbers, titles, editorial corrections, typographic highlighting, and foreign language phrases, as well as mathematical (or other) formulae. Thus, even a <collation> element containing <p> elements only can also contain a <formula> element in which a detailed collation formula may be supplied using an appropriate notation.
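
For instance, a collation statement of this kind might be sketched as below. The formula itself is purely illustrative and does not describe any real manuscript, and the notation used is one of many a cataloguer might prefer:

```xml
<collation>
  <p>The quires are arranged as follows:
    <!-- illustrative formula only; notation is the cataloguer's choice -->
    <formula>1-3:8, 4:6, 5-13:8</formula>
  </p>
</collation>
```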

This richness of the underlying "soup" from which the components of a description are made up is one important characteristic that the MASTER DTD inherits from the TEI. Another is its tendency to apply Occam's razor. In the case of decoration, rather than proposing very specific sets of distinctions for different classes of decorative feature (marginal decoration, rubrication, illustration etc., as was done in the original Studley proposals), the MASTER DTD proposes a single category of decorative note (<decoNote>), optionally further categorized by means of type and subclass attributes.
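
A sketch of how one general-purpose element can cover several kinds of decoration is given below. The attribute values are hypothetical, chosen only to show the principle, and the comments stand in for descriptive prose:

```xml
<!-- attribute values are hypothetical examples, not a prescribed list -->
<decoNote type="initial" subclass="historiated">
  <!-- prose describing the historiated initials -->
</decoNote>
<decoNote type="border">
  <!-- prose describing the marginal decoration -->
</decoNote>
```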

4.5 Historical and Curatorial Information

The remaining parts of the MASTER manuscript description are concerned with the history and curation of the manuscript as an artefact. A <history> element is used to document firstly the origin, next the provenance, and finally the acquisition of the object concerned. Attributes may be used to associate the prose descriptions within these component elements with more directly searchable normalized information such as exact dates, whether or not the evidence for a date is internal to the object, and its reliability.

The history of a manuscript should normally be presented in the order implied above. Information about the origins of the manuscript (including any discussion of its sources) should be given as one or more paragraphs contained by a single <origin> element; any available information or discussion of distinct stages in the history of the manuscript before its arrival in its current location should be included as paragraphs within one or more <provenance> elements following this. Finally, any information specific to the means by which the manuscript was acquired by its present owners should be given as paragraphs within the <acquisition> element.
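
The ordering just described can be sketched as follows, with comments standing in for the cataloguer's prose:

```xml
<history>
  <origin>
    <p><!-- place, date and sources of the manuscript --></p>
  </origin>
  <!-- one provenance element per distinct stage in the history -->
  <provenance>
    <p><!-- earliest recorded stage of ownership --></p>
  </provenance>
  <provenance>
    <p><!-- a later stage, before arrival at the present location --></p>
  </provenance>
  <acquisition>
    <p><!-- how the present owners acquired the manuscript --></p>
  </acquisition>
</history>
```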

A variety of information relating to the curation and management of a manuscript may be recorded as simple prose narrative tagged using the standard <p> element, optionally grouped within one or more of the specific elements <recordHist> (record history), <custHist> (custodial history), and <availability> (a standard TEI element). Together these make up a group of elements referred to in the current DTD as <adminInfo>, as a gesture to the comparable element within the Encoded Archival Description (EAD) DTD.
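
A skeleton of this grouping might look as follows. Comments stand in for prose, and the order of the three child elements is an assumption:

```xml
<adminInfo>
  <recordHist>
    <source><p><!-- where the information in this record came from --></p></source>
  </recordHist>
  <custHist>
    <p><!-- significant events while in the institution's care --></p>
  </custHist>
  <availability>
    <p><!-- conditions of access to the manuscript --></p>
  </availability>
</adminInfo>
```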

The <recordHist> element is provided as a means of documenting the history of the cataloguing record itself. If supplied, it contains a <source> element, followed by an optional series of <change> elements. The latter are standard TEI elements, which may also appear within the <revisionDesc> element of the standard TEI Header; their use here is intended to signal the similarity of function between the two container elements. Where the TEI Header should be used to document the revision history of the whole electronic file to which it is prefixed, the <recordHist> element may be used to document changes at a lower level, relating to the individual record.
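
The overall shape of a record history might be sketched as below. The content of <change> follows what we take to be the usual TEI pattern of a date, a statement of responsibility, and a description of the change; this is an assumption, and the comments are placeholders:

```xml
<recordHist>
  <source>
    <p><!-- primary source of the information in this record --></p>
  </source>
  <!-- zero or more change elements, one per revision of the record -->
  <change>
    <date><!-- date of the revision --></date>
    <respStmt>
      <name><!-- person responsible --></name>
      <resp><!-- nature of their responsibility --></resp>
    </respStmt>
    <item><!-- what was changed in this record --></item>
  </change>
</recordHist>
```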

The <source> element is used to document the primary source of information for the catalogue record containing it, in a similar way to the standard TEI <sourceDesc> element within a TEI Header. If the record is a new one, it may simply contain a <p> element as in the following example:

<source><p>No source: this is an original record</p></source>

More usually however the record will be derived from some previously existing catalogue, which may be specified using the standard TEI <bibl> element, as in the following example:

<source><p>Information transcribed from 

If, as is likely, a full bibliographic description of the source from which cataloguing information was taken has already been given elsewhere in the manuscript description (for example in a <listBibl> element), then it need not be repeated here. Instead, it should be referenced using the standard TEI <ref> element, as in the following example:

<listBibl>
<bibl id="IMEV123">
<title>Index of Middle English Verse</title>
<!-- other bibliographic details for IMEV here -->
</bibl>
<!-- other bibliographic records relating to this ms here -->
</listBibl>
<!-- ... -->
<source><p>Information transcribed from
<ref target="IMEV123">IMEV 123</ref></p></source>
<!-- ... -->

The <custHist> element is used to describe the custodial history of a manuscript, recording any significant events noted during the period that it has been located within the cataloguing institution. It may contain either a series of paragraphs tagged with the standard TEI <p> element, or a series of paired <date> and <custEvent> elements, each describing a distinct incident or event, further specified by a TYPE attribute.

In the following example, the cataloguer has chosen to record the key events in a manuscript's custodial history simply as a series of paragraphs:

<p>Conserved between March 1961 and February 1963 at Birgitte Dalls </p>
<p>Photographed in May 1988 by AMI/FA.</p>
<p>Dispatched to Iceland on 13 Nov 1989.</p>

The same history might alternatively be represented in a slightly more structured and searchable form by using typed <custEvent> elements, as follows:

<custEvent type="conservation">
<date notBefore="1961-03" notAfter="1963-02"></date>
<p>Conserved between March 1961 and February 1963 at Birgitte Dalls </p></custEvent>
<custEvent type="photography">
<date notBefore="1988-05-01" notAfter="1988-05-30">May 1988</date>
<p>Photographed in May 1988 by AMI/FA.</p></custEvent>
<custEvent type="transfer/dispatch">
<date value="1989-11-13">13 November 1989</date>
<p>Dispatched to Iceland.</p></custEvent>

4.6 Other materials

A need for information under three further headings has so far been identified to complete the manuscript description: firstly, a traditional bibliography of other works describing the manuscript in hand; secondly, information about any `surrogates' (copies or photographs etc.) of the manuscript, and finally information about any materials now accompanying the manuscript but not forming part of it or its binding.

The first need is simply met by the standard TEI <listBibl> element, which is fully described in the TEI documentation referred to above and is therefore not described further here.

The second need is met by a special purpose <surrogates> element which enables cataloguers to provide information about any digital or photographic representations of the manuscript which may exist within the holding institution or elsewhere. Where such representations exist within published works, they will normally be documented within the <listBibl> element already mentioned. However, it is often also convenient to record information such as negative numbers, digital identifiers etc. for unpublished collections of manuscript images maintained within the holding institution, as well as to provide more detailed descriptive information about the surrogate itself.

In a later version, the content of the <surrogates> element is likely to be expanded to include elements more specifically intended to provide detailed information such as technical details of the process by which a digital or photographic image was made. At present, this and other information may be recorded simply as prose paragraphs.
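
In its present form, then, a record of surrogates might be sketched as follows, with comments standing in for the cataloguer's prose:

```xml
<surrogates>
  <p><!-- negative numbers or digital identifiers for unpublished
          images held by the institution --></p>
  <p><!-- prose account of how and when the images were made --></p>
</surrogates>
```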

The third need identified above arises where a manuscript has additional material, not originally part of it, which is bound with it or otherwise accompanies it. In cases where this additional material is clearly a distinct manuscript or manuscript fragment, the whole manuscript should be treated as a composite manuscript and the additional matter described in a separate <msPart>. However, there are cases where the additional matter is not self-evidently a distinct manuscript: it might be an important set of notes by a later scholar or owner, or it might be a file of correspondence relating to the manuscript. The <accMat> element is provided as a holder for this kind of information, as in the following example, describing a note by the Icelandic manuscript collector Árni Magnússon which has been bound with the manuscript:

<accMat>
<p>A slip in Árni Magnússon's hand has been stuck to the
pastedown on the inside front cover; the text reads: <q>Þidreks
Søgu þessa hefi eg feiged af Sekreterer Wielandt Anno 1715
i Kaupmanna høfn.  Hun er, sem eg sie, Copia af Austfirda
bókinni (Eidagás) en<expan>n</expan> ecki progenies
Brædratungu bokarinnar. Og er þar fyrer eigi i
allan<expan>n</expan> máta samhlioda
þ<expan>eir</expan>re er Sr Jon Erlendz son hefer ritad fyrer
Mag. Bryniolf. Þesse Þidreks Saga mun vera komin fra Sr
Vigfuse á Helgafelle.</q></p>
</accMat>