The goal of the workgroup is to recommend strategies, procedures, and tools for converting SGML TEI data to P4 XML. The workgroup consists of two subgroups: technical experts, who will recommend specific tools and procedures for data conversion, and repository representatives, who will test those tools and procedures on their own SGML data and document the results. The group will produce two reports (a strategic document and a technical document) and a series of case studies describing specific migration projects in detail.
Each repository group member was asked to identify a few SGML data samples from their holdings that present particular migration challenges. The samples, along with DTDs, associated extension files, and readme statements, were made available to the whole task force via an FTP site. At the same time, the technical group began using an artificial data sample to experiment with currently available migration strategies, using the TEI conversion FAQ as a starting point. In addition to the tools mentioned in the FAQ, the group discussed OpenSP, a currently supported toolkit whose version of the sx converter has been modified to provide more options for entity handling. The group also discussed whether to recommend further modifications to OpenSP.
The technical group also developed a list of survey questions for TEI data managers. The survey queries managers about their encoding practices, especially with regard to SGML-specific features, as well as their attitudes toward and experience with XML migration. This survey was sent to the TEI-L list but prompted few initial responses.
The official meeting minutes (TEI MI M 01) are publicly available on the TEI website.
The technical group began the meeting by reviewing the draft charge (TEI ED W72) and revising the objectives and the proposed timeline. During the course of the meeting, the group 1) defined the scope of the workgroup activities; 2) developed a plan to survey the TEI user community and elicit SGML data samples; 3) continued its discussion (initiated via email before the meeting) of technical recommendations and tools; and 4) developed a basic structure for the final reports, with each group member accepting responsibility for a section of the technical report.
The group decided to focus primarily on strategies for migrating P3 SGML document instances to P4 XML, although the reports will provide some general discussion about migrating DTD extensions, catalog files, and the processing environment.
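As a small illustration of what instance-level migration involves, the sketch below flags one SGML-specific feature that XML does not permit: attribute values written without quotation marks. It is a minimal sketch and not one of the group's recommended tools. The function name find_unquoted_attributes and the sample markup are hypothetical, and the pattern matching assumes that attribute values never contain angle brackets.

```python
import re

# Hypothetical helper for illustration; not part of any TEI or OpenSP tool.
# A start tag: "<name", optional attribute text, then ">".
# Simplifying assumption: attribute values never contain "<" or ">".
START_TAG = re.compile(r"<([A-Za-z][\w.\-]*)((?:\s[^<>]*)?)>")
QUOTED_VALUE = re.compile(r"\"[^\"]*\"|'[^']*'")
UNQUOTED_ATTR = re.compile(r"([\w.\-]+)\s*=\s*([^\s\"'=]+)")


def find_unquoted_attributes(sgml_text):
    """Return (element, attribute, value) triples for unquoted attribute values."""
    findings = []
    for tag in START_TAG.finditer(sgml_text):
        element, attr_text = tag.group(1), tag.group(2)
        # Blank out quoted values so their contents are never read as attributes.
        attr_text = QUOTED_VALUE.sub('""', attr_text)
        for attr in UNQUOTED_ATTR.finditer(attr_text):
            findings.append((element, attr.group(1), attr.group(2)))
    return findings


# Made-up sample: type=chapter is legal in SGML but must be quoted in XML.
sample = '<div1 type=chapter n="1"><p rend="a=b">Text</p></div1>'
print(find_unquoted_attributes(sample))
```

A report of this kind only locates candidates for repair. A real conversion would still need a full SGML parser to deal with omitted end tags, entity handling, and the other features discussed in this report.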
Advocacy is not an explicit part of the group's charge. However, the recommendations will point out the advantages of conversion to P4, particularly the fact that P3 is no longer supported.
While the group itself will not undertake any software development, it may express the need for new tools, or modifications to existing tools.
Because data samples from the workgroup members may not represent a broad enough range of current encoding practices, the group will need to survey the larger TEI user community. Since email questionnaires to TEI-L and other lists tend to be ignored, the group devised a more targeted approach, which begins with compiling a master list of TEI projects and contacting each project's representatives.
If the survey fails to generate much response, or the data samples provided are inadequate, the group may develop fabricated samples for testing purposes. The group will also attempt to locate survey information on DTD practices that was previously gathered by the TEI Consortium.
Following is a brief summary of specific topics discussed by the technical group.
The strategic report will discuss migration issues from a managerial perspective, with an emphasis on planning and decision-making. The technical report will describe the mechanics of conversion in fine detail; it will provide solutions to specific conversion problems as well as a recommended conversion workflow.
A tentative structure for the final reports has been established:
The technical subgroup will be responsible for drafting the technical report; the main sections of the report have already been assigned to individual members. The workgroup chair will draft a skeletal version of the strategic report, which will be fleshed out by the repository representatives.
Each repository representative will also write up a case study, based on his or her experience testing the group's draft recommendations for migration. A generic template will be provided for reporting these results.
In the next two months, the technical group will complete its draft report and circulate it to the repository group, who will test the report's recommendations on their own data and begin writing up the results according to the case study template. Also during this period, the workgroup will begin the survey process described above by compiling the master list of TEI projects and making initial contact with the project representatives.
The second workgroup meeting, which will include both the technical and repository subgroups, will take place in late January or early February 2003 at the University of Maryland. At the meeting, the repository group will present their case studies and suggest any necessary modifications or enhancements to the technical report. A portion of this meeting will also be devoted to developing a draft of the strategic report.
After the second meeting, the workgroup members will continue the survey by analyzing the query responses and data samples and requesting additional information when it is needed. The technical group members will revise their report based on both the survey results and the feedback from the repository group. The repository group will complete their case studies and continue to develop the strategic report.
The third and final workgroup meeting has not yet been planned, but will be scheduled to coincide with the spring meeting of the TEI Council if possible. According to the workgroup charge, this meeting is intended for the technical group only, but if funding permits, the repository group will be invited to attend as well. The attendees will finalize the two reports and discuss possible future TEI migration efforts.