TEI: Multilingual Text Tools and Corpora (MULTEXT)


For inclusion in the TEI Application Page

E-Mail from Jean Veronis, June, 1996; additional suggestions from Nancy Ide, August, 1996
Dutch; English (including Old/Middle English); French (including dialects); German; Italian (including dialects); Kikongo; Multilingual; Spanish (including Catalan); Swahili; Swedish (including Old/Medieval Swedish)

Language Corpora

21 September 2007

Chris Ruotolo: Converted to TEI P5

13 December 2001

Stuart Brown: Minor edit; URLs checked and OK.

13 August 1996

WP: Separated MULTEXT and MULTEXT-EAST, Nancy Ide’s suggestion.

13 August 1996

WP: Added section on Corpus Encoding Standard, with link, in description.

15 July 1996

WP: Created file

Description:

MULTEXT encompasses a series of projects that aim to develop standards and specifications for the encoding and processing of linguistic corpora, and to build tools, corpora, and linguistic resources embodying these standards. These resources cover a wide variety of languages, including Bambara, Bulgarian, Catalan, Czech, Dutch, English, Estonian, French, German, Hungarian, Italian, Kikongo, Occitan, Romanian, Slovenian, Spanish, Swedish, and Swahili. All MULTEXT results are made freely and publicly available for non-commercial, non-military purposes.

Corpus Encoding Standard:

MULTEXT, along with EAGLES and the Vassar/CNRS collaboration (supported by the U.S. National Science Foundation), has developed a Corpus Encoding Standard that will “serve as a widely accepted set of encoding standards for corpus-based work”.

Funding:

The MULTEXT effort has been supported by the European Commission, under the Linguistic Research and Engineering, Copernicus, and Langues regionales et minoritaires programmes; the U.S. National Science Foundation, under the Vassar/CNRS collaboration; the Fonds Francophone pour la Recherche (AUPELF-UREF); the Centre National de la Recherche Scientifique (CNRS); and the Universite de Provence.

Contact:

Dr. Jean Veronis (coordinator)
Laboratoire Parole et Langage
CNRS & Universite de Provence
29, Av. Robert Schuman
13621 Aix-en-Provence Cedex 1, France
Tel: (+33) 42 95 36 33
Fax: (+33) 42 59 50 96
E-mail: Jean.Veronis@lpl.univ-aix.fr