Intended for distribution to TEI Migration Task Force members for corrections; corrected version to TEI Secretary for archiving.
No source; this electronic version is the source.
The notation [?...?] is for "I missed something here".
All times are local to MITH (i.e., -05).
Started late (due to weather) at ~14:15 with CR, SS, TE, FW, LB, JH, AB, SW, JW, SB. CP arrived @ ~15:07.
Group extends a gracious thank-you to SS, MITH, and also to Amit Kumar. SS and AK have gone above and beyond to see that this meeting works despite the closing of the University of Maryland due to snow.
Discussion of survey project; little done so far.
LB announces Alan Morrison looking into using new projects TEI webpage as springboard for survey.
Consensus is to perform a small survey using the list of projects available on the aforementioned TEI applications page; perhaps do larger survey later on a different or extended grant.
Strategic document should have an overview of how the process should be performed — who does what, whether to stop production, how to change workflow, etc.
FW: DTD extension problems; reports a bug in tei2tei.xsl that leaves the attribute name but not the value for a defaulted attr. SDATA problems: using SDATA entity references for Renaissance musical notation, some of which are not in Unicode.
Discussion of whether to discuss SX or osx — consensus is to discuss both, including difficulties of building osx.
LB reports that CE WG is working on this [FW's SDATA problem]. LB thinks only solution is going to be PUA use, and that WG is going to recommend the encoding thereof.
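A minimal sketch of what PUA use could look like, assuming a hypothetical list of entity names for glyphs that have no Unicode code point. The entity names and the choice to start at U+E000 are illustrative assumptions, not the CE WG's recommendation. The sketch assigns each name a code point in the Basic Multilingual Plane Private Use Area (U+E000 to U+F8FF) and renders XML entity declarations for them.

```python
# Sketch only: the entity names used here are hypothetical.
PUA_START = 0xE000  # first code point of the BMP Private Use Area
PUA_END = 0xF8FF    # last code point of the BMP Private Use Area


def assign_pua(entity_names, start=PUA_START):
    """Assign each distinct entity name a PUA code point, in sorted name order."""
    mapping = {}
    code_point = start
    for name in sorted(set(entity_names)):
        if code_point > PUA_END:
            raise ValueError("ran out of BMP Private Use Area code points")
        mapping[name] = code_point
        code_point += 1
    return mapping


def entity_declarations(mapping):
    """Render XML internal entity declarations that point at the PUA code points."""
    return [
        '<!ENTITY {} "&#x{:04X};">'.format(name, code_point)
        for name, code_point in sorted(mapping.items())
    ]


if __name__ == "__main__":
    names = ["notation.glyph2", "notation.glyph1"]  # hypothetical names
    for line in entity_declarations(assign_pua(names)):
        print(line)
```

Any such assignment is private to the project: the declarations only become useful once a font that draws those code points is agreed on, which is the question FW raises next.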
FW asks how one can use a font to represent a PUA char. No one actually knows.
Discussion of fonts to be incorporated into SDATA section of technical document.
TE asks about deprecation of named entities; big discussion on whether XML requires named entities to be declared or not. Consensus is that we will discuss the disadvantages of using named entities in SDATA section.
SB suggests tools section admits that JH works on osx.
FW points out that XMetal can convert SGML to XML. (Discussion as to whether it does or not to be discussed on list later.)
We should include in our survey a question or small section asking people about tools they use.
TE: points out discrepancy in tools (sx v osx); also he felt there was nowhere to start, so he used the checklist.
Group considers software for out-of-the-box TEI Lite (P3) to TEI Lite (P4 XML) something we'd like to be able to recommend.
SB suggests that MI W 04 be rolled into an appendix of MI W 03 and, similar to SS's suggestion, be referred to by the first steps of AB's list … "if you have complicated data, lots of it, anything you don't know [e.g., data that was created before you started working on the project]."
TE: documents don't discuss marked sections! (Marked sections
in document instance, that is; LB suggests using
general entity references declared based on the value of
TE: nowhere are SGML declarations mentioned. (Need to mention using your local declaration for SGML processing.)
ACTION: LB to check whether osx reads SGML declaration, in particular whether it acts on NAMECASE GENERAL NO. Answer: 1.5 seems to do it right if you specify the SGML declaration on the command line as you're supposed to.
The tech document should discuss that osx will only preserve case if you specify an SGML declaration that specifies case sensitivity.
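As a hedged illustration of the point about case sensitivity, the sketch below checks what a local SGML declaration says for NAMECASE GENERAL before the file is handed to osx. The function name and the sample declaration fragment are assumptions made for this example. The check is a plain text search and does not strip comments from the declaration.

```python
import re

# Finds the NAMECASE GENERAL setting in the text of an SGML declaration.
_NAMECASE_GENERAL = re.compile(r"NAMECASE\s+GENERAL\s+(YES|NO)", re.IGNORECASE)


def general_names_case_sensitive(declaration_text):
    """Return True for NAMECASE GENERAL NO, False for YES, None if not found.

    Simplification: comments inside the declaration are not stripped first.
    """
    match = _NAMECASE_GENERAL.search(declaration_text)
    if match is None:
        return None
    return match.group(1).upper() == "NO"


if __name__ == "__main__":
    # Hypothetical fragment of a local declaration.
    sample = "NAMING\n  NAMECASE GENERAL NO\n           ENTITY  NO\n"
    print(general_names_case_sensitive(sample))
```

A result of False or None suggests that element and attribute names may be case-folded on conversion, so the output would need checking against the mixed-case names the XML DTD expects.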
LB says we should point out the disadvantages of using dirty hacks. We should put some effort into overcoming the obvious reasonable objections to the off-the-shelf tools.
Discussion on XSLT engines. Consensus is to state that we have successfully transformed an X big document with tei2tei.xsl using [software we use, probably xsltproc].
SB points out that the
SB wonders why osx doesn't fix case. After this is explained to JH, she thinks this feature might be added in future.
LB reminds us (JH in particular) that fix to
CR & SB point out that the discussion of that batch
script should be more plug & play. If you need to do
.
TE points out that we do not mention anything about public identifiers; SB adds DOCTYPEs in general. LB points out that this is mentioned on sgml2xml page on site, could be used as a starting point.
Catalogs:
At the very least, we'll need some sort of discussion of catalogs.
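By way of illustration for such a discussion, here is a minimal sketch of what a catalog does: it maps a public identifier to a system identifier. It assumes a catalog containing only PUBLIC entries with double-quoted literals. The file paths and the second public identifier in the sample are hypothetical, and comments and other entry types are ignored.

```python
import re

# Matches entries of the form: PUBLIC "public identifier" "system identifier"
_PUBLIC_ENTRY = re.compile(r'PUBLIC\s+"([^"]*)"\s+"([^"]*)"', re.IGNORECASE)


def parse_catalog(catalog_text):
    """Map public identifiers to system identifiers for PUBLIC entries."""
    entries = {}
    for public_id, system_id in _PUBLIC_ENTRY.findall(catalog_text):
        # Collapse runs of whitespace inside the public identifier.
        key = " ".join(public_id.split())
        # In this sketch the first entry for an identifier wins.
        entries.setdefault(key, system_id)
    return entries


def resolve(entries, public_id):
    """Return the system identifier for a public identifier, or None."""
    return entries.get(" ".join(public_id.split()))


if __name__ == "__main__":
    sample = (
        'PUBLIC "-//TEI P4//DTD Main Document Type//EN" "dtd/tei2.dtd"\n'
        'PUBLIC "-//EXAMPLE//ENTITIES Local Characters//EN" "ents/local.ent"\n'
    )
    print(resolve(parse_catalog(sample), "-//TEI P4//DTD Main Document Type//EN"))
```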
Commenced 10:30
JH reports on osx updates.
CR raises issues with osx: [?...?]. Input files in EUC (a Unix
Japanese double-byte encoding). osx can process them (with a
-b switch); problem is that it gave an error message, even
though it seemed to work. JH points out that this is
technically an OpenSP issue, not an osx issue. Which is to
say, I will definitely not be able to fix it. I will take
care of harassing the OpenSP folks about it, though.
In some other case got gibberish out
Suppressing output of built-in entity references has been written
but not checked in; suppression of default attributes is on JH's
CP: using SX, has been relatively smooth as pretty simple
data pretty well normalized. Points out that her parser
complains about "<l/>" in the output.
JW asks if putting up lists of entity names & Unicode codepoints for n2x would be helpful. SB says yes, but not much. SB points out that users find it difficult to find the ISO entity sets for Unicode on the website. Consensus is that MI W 03 should contain an explicit reference.
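To show how a list of entity names and Unicode code points might be put to use, here is a minimal sketch that rewrites named entity references as numeric character references. It is not the n2x tool itself. The three-entry table is illustrative only, and a real run would need to load the full ISO entity sets.

```python
import re

# Small illustrative table; a real conversion would load the full ISO sets.
ENTITY_CODEPOINTS = {
    "eacute": 0x00E9,  # LATIN SMALL LETTER E WITH ACUTE
    "agrave": 0x00E0,  # LATIN SMALL LETTER A WITH GRAVE
    "mdash": 0x2014,   # EM DASH
}

# The five entities predefined by XML are left alone.
XML_PREDEFINED = {"amp", "lt", "gt", "quot", "apos"}

_ENTITY_REF = re.compile(r"&([A-Za-z][A-Za-z0-9.-]*);")


def to_numeric_refs(text, table=ENTITY_CODEPOINTS):
    """Replace known named entity references with hexadecimal character references."""

    def replace(match):
        name = match.group(1)
        if name in XML_PREDEFINED or name not in table:
            return match.group(0)  # leave unknown or predefined references as-is
        return "&#x{:X};".format(table[name])

    return _ENTITY_REF.sub(replace, text)


if __name__ == "__main__":
    print(to_numeric_refs("caf&eacute; &amp; cr&egrave;me"))
```

References missing from the table are passed through unchanged, so they remain visible for a later check.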
NS: TEILite, quite well normalized. Easy translation. Had used osx & xsltproc.
CP & CR bring up a company called Intellex; NS mentions
Apex. CR thinks they might be helpful in taking a look at our
documents and providing feedback. Perhaps raw SGML with lots
of minimizations.
JW: VWWP is also TEILite, well normalized; created new entity sets using XHTML versions as a base and adding Greek with diacritics by themselves.
SB reports on WWP extension experience. FW & SB point out
that
should be explicitly mentioned in TR's section of MI W 03. LB points out some of this is already mentioned in P4:2002 Appendix C.2.
LB reports that information about the BNC migration is now on
website (
CR: main barrier to conversion is error-prone SGML on
input. LB suggests that we have a discussion of document
management issues: e.g., keeping track of changes (e.g., in
CR discusses what a repository rep report should contain.
MI W 06 will be the collected case reports, "Migration Case
Study Reports".
Next meeting tentatively
On the subject of MI W 06, conversion that maximizes XML tool usability seems to mean whether you convert from external entities to XInclude type of stuff.
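A rough sketch of what converting from external entities to XInclude could involve, assuming external parsed entities declared with double-quoted SYSTEM identifiers in the DOCTYPE subset. The function names and sample file names are hypothetical. Parameter entities and NDATA entities are deliberately not matched, and a real conversion would use a parser, not regular expressions.

```python
import re

XI_NS = "http://www.w3.org/2001/XInclude"

# External parsed general entities only: <!ENTITY name SYSTEM "uri">
_EXTERNAL_ENTITY = re.compile(r'<!ENTITY\s+([^\s%]+)\s+SYSTEM\s+"([^"]*)"\s*>')
_ENTITY_REF = re.compile(r"&([^;&\s]+);")


def external_entities(doctype_subset):
    """Collect entity name -> system identifier from a DOCTYPE internal subset."""
    return dict(_EXTERNAL_ENTITY.findall(doctype_subset))


def entity_refs_to_xinclude(body, entities):
    """Replace references to the collected entities with xi:include elements."""

    def replace(match):
        name = match.group(1)
        if name not in entities:
            return match.group(0)  # not an external entity we know about
        return '<xi:include xmlns:xi="{}" href="{}"/>'.format(XI_NS, entities[name])

    return _ENTITY_REF.sub(replace, body)


if __name__ == "__main__":
    subset = '<!ENTITY chap1 SYSTEM "chap1.xml">'  # hypothetical file name
    print(entity_refs_to_xinclude("<body>&chap1;</body>", external_entities(subset)))
```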
Current survey plans (aka
Adjourned ~16:15.