Overlapping Markup SIG Minutes
23 October, 2004
Dot Porter
November 12, 2004

Seven in attendance.

Approach: At last year's meeting (see the minutes), we discussed creating a web site to explain in some detail the many different approaches to overlapping markup. This year, we discussed two approaches currently in use by members of the SIG: milestone elements and Just In Time Trees (JITTS).

The traditional problem with using milestone elements extensively to deal with overlapping markup is that the existing support languages (XPath, XSLT) cannot readily address text that is not the content of an element (that is, text lying between two milestone elements that act as its start and end tags). However, Alex Dekhtyar and Emil Iacob at the University of Kentucky have been working on an extension of XPath (Extended XPath, or EXPath) that can search overlapping encodings represented in a GODDAG. The GODDAG can be stored in a single XML file with milestones (plus a set of DTDs, one per hierarchy) or in separate files, one XML file per hierarchy. In fact, the storage method is not important as long as there are parsers that build the GODDAG from it. The GODDAG implementation provides a DOM-like API, which an XML editor can use as well. They have also begun working on an extension of XSLT.

Patrick Durusau also cited a paper presented by Steven DeRose at the Extreme Markup Languages 2004 conference, Markup Overlap: A Review and a Horse. In this paper, DeRose outlines a system of milestone elements, similar to the one already implemented at the University of Kentucky, which he calls clix (not to be confused with Constraint Language in XML (CLIX)).

The SIG proposes to investigate the possibility of implementing within TEI a system for dealing with overlapping markup through a system of milestone elements based on clix, JITTS, and the EXPath and EXSLT support being developed at the University of Kentucky.

  • Provide several examples on the OM SIG website and invite TEI users to comment and criticize the approach.
  • Invite examples of overlapping markup from the user community.
  • Finally, make a recommendation to the TEI editors: either to look into making the milestone approach an integrated part of P5 (if it appears to handle most instances of OM), or not.

Examples

Example 1
<p><q who="Wilson" sID="001"/>The first thing that put us out was that advertisement. Spaulding, he came down into the office just this day eight weeks with this very paper in his hand, and he says:&mdash;</p>
<p><q who="Spaulding" sID="002"/>I wish to the Lord, Mr. Wilson, that I was a red-headed man.<q eID="002"/></p>
<p><q who="Wilson" sID="003"/>Why that?<q eID="003"/> I asks.<q eID="001"/></p>
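
Because the text of a quotation here is not the content of any single element, it has to be collected by walking the document and noting which milestones are currently open. The following is a minimal sketch in Python, not the Kentucky implementation or clix itself. The function name milestone_spans, the <text> wrapper, and the shortened sample are assumptions made for illustration, and the &mdash; entity is left out because a plain XML parser would need a DTD declaration for it.

```python
import xml.etree.ElementTree as ET

# Shortened, hypothetical copy of Example 1 wrapped in a <text> root.
SAMPLE = (
    '<text>'
    '<p><q who="Wilson" sID="001"/>He says:</p>'
    '<p><q who="Spaulding" sID="002"/>I wish I was a red-headed man.'
    '<q eID="002"/></p>'
    '<p><q who="Wilson" sID="003"/>Why that?<q eID="003"/> I asks.'
    '<q eID="001"/></p>'
    '</text>'
)


def milestone_spans(root, tag):
    """Return {sID: text} for each sID/eID milestone pair named `tag`."""
    open_ids = []  # IDs whose start milestone has been seen, end not yet
    spans = {}     # sID -> list of text chunks found while it was open

    def add(chunk):
        # Credit a piece of text to every milestone span currently open.
        if chunk:
            for open_id in open_ids:
                spans[open_id].append(chunk)

    def walk(elem):
        if elem.tag == tag and elem.get("sID") is not None:
            open_ids.append(elem.get("sID"))
            spans[elem.get("sID")] = []
        elif elem.tag == tag and elem.get("eID") in open_ids:
            open_ids.remove(elem.get("eID"))
        add(elem.text)
        for child in elem:
            walk(child)
            add(child.tail)  # text following the child, up to the next tag

    walk(root)
    return {sid: "".join(chunks) for sid, chunks in spans.items()}


if __name__ == "__main__":
    for sid, text in milestone_spans(ET.fromstring(SAMPLE), "q").items():
        print(sid, repr(text))
```

Note that quotation 001 crosses all three paragraphs, so its text is gathered from three different <p> elements; this is the case that ordinary element content cannot express.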
Example 2
This example shows how we can use milestones to show, at the same time, five different and overlapping organizational sections:
  • <line> = folio line
  • <vline> = verse line (TEI <l> )
  • <HL> = half line
  • <oecno> = lines according to the Old English Corpus
  • <vsection> = verse section (TEI <lg> )

I used the same id for sID and eID as we use for the regular ID.

<p> <oecno sID="boe014000005002" n="66"/> ... <line sID="oa6003r12" n="12"/> ...
<vsection sID="oa6m05"/>
<vline sID="oa6m05001" n="m5.1"/> <HL sID="oa6m05001a"/>&ETH;V meaht be &eth;&aelig;re sunnan<HL eID="oa6m05001a"/> <line eID="oa6003r12"/> <line sID="oa6003r13" n="13"/> <HL sID="oa6m05001b"/>sweotole ge&thorn;encean<HL eID="oa6m05001b"/> <vline eID="oa6m05001"/>
<vline sID="oa6m05002" n="m5.2"/> <HL sID="oa6m05002a"/>7 be &aelig;ghwel- <line eID="oa6003r13"/> <line sID="oa6003r14" n="14"/>cum <HL eID="oa6m05002a"/> <HL sID="oa6m05002b"/>o&eth;rum steorran<HL eID="oa6m05002b"/> <vline eID="oa6m05002"/>
<vline sID="oa6m05003" n="m5.3"/> <HL sID="oa6m05003a"/>&thorn;ara <line eID="oa6003r14"/> <line sID="oa6003r15" n="15"/>&thorn;e &aelig;fter burgum <HL eID="oa6m05003a"/> <HL sID="oa6m05003b"/>beorhtost <line eID="oa6003r15"/> <line sID="oa6003r16" n="16"/>scine&eth;. <HL eID="oa6m05003b"/>
<oecno eID="boe014000005002"/> <oecno sID="boe014000005004" n="67"/> <vline eID="oa6m05003"/>
<vline sID="oa6m05004" n="m5.4"/> <HL sID="oa6m05004a"/>gif him wan fore<HL eID="oa6m05004a"/> <HL sID="oa6m05004b"/>wolcen <line eID="oa6003r16"/> <line sID="oa6003r17" n="17"/>hanga&eth; <HL eID="oa6m05004b"/> <vline eID="oa6m05004"/>
<vline sID="oa6m05005" n="m5.5"/> <HL sID="oa6m05005a"/>ne m&aelig;gen hi swa leohtne<HL eID="oa6m05005a"/> <HL sID="oa6m05005b"/>leo- <line eID="oa6003r17"/> <line sID="oa6003r18" n="18"/>man ansendan <HL eID="oa6m05005b"/> <vline eID="oa6m05005"/>
<vline sID="oa6m05006" n="m5.6"/> <HL sID="oa6m05006a"/>&aelig;r se &thorn;icca mist<HL eID="oa6m05006a"/> <line eID="oa6003r18"/> <line sID="oa6003r19" n="19"/> <HL sID="oa6m05006b"/>&thorn;ynra weor&eth;e<HL eID="oa6m05006b"/>
<oecno eID="boe014000005004"/> <oecno sID="boe014000005007" n="68"/> <vline eID="oa6m05006"/>
<vline sID="oa6m05007" n="m5.7"/> <HL sID="oa6m05007a"/>swa oft smylte<HL eID="oa6m05007a"/> <HL sID="oa6m05007b"/>s&aelig; <line eID="oa6003r19"/> ... <HL eID="oa6m05007b"/> <vline eID="oa6m05007"/> ...
<oecno eID="boe014000005007"/> ... <vsection eID="oa6m05"/></p>
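
With this many hand-typed milestones, an XML parser gives no help in pairing them: a start milestone typed where an end was intended is still well-formed XML. A small consistency check can catch such slips. The sketch below is one possible approach in Python; the function name check_milestones, the problem labels, and the short sample with its IDs are all hypothetical, and the sample deliberately contains mistakes.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment with three deliberate mistakes: the HL span is
# "started" twice, line l1 is never ended, and line l2 ends unopened.
SAMPLE = (
    '<p><vline sID="v1"/><HL sID="h1"/>swa oft smylte<HL sID="h1"/>'
    '<line sID="l1"/><vline eID="v1"/><line eID="l2"/></p>'
)


def check_milestones(root):
    """Return a list of (tag, id, problem) tuples for unpaired milestones."""
    open_ids = {}  # (tag, id) -> True while a start awaits its end
    problems = []
    for elem in root.iter():  # visits elements in document order
        sid, eid = elem.get("sID"), elem.get("eID")
        if sid is not None:
            if (elem.tag, sid) in open_ids:
                problems.append((elem.tag, sid, "started twice"))
            open_ids[(elem.tag, sid)] = True
        elif eid is not None:
            if open_ids.pop((elem.tag, eid), None) is None:
                problems.append((elem.tag, eid, "end without start"))
    # Anything still open at the end of the document was never closed.
    problems.extend((tag, mid, "never ended") for tag, mid in open_ids)
    return problems


if __name__ == "__main__":
    for problem in check_milestones(ET.fromstring(SAMPLE)):
        print(problem)
```

The check treats each element name as its own hierarchy, so a <line> and a <vline> may overlap freely while still being paired correctly within their own kind.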