TEI: JOS corpora of Slovene

For inclusion in the TEI Application Page

Form posted from TEI website, 2011-12-02
Language Corpora, Slovene, Created using webform
  • Host: Jožef Stefan Institute
  • Other institutions involved: Faculty of Arts, University of Ljubljana
  • URL:
  • Main language: Slovene

General description: The JOS project developed annotated corpora of Slovene and associated resources to facilitate the development of human language technologies for Slovene. Its main results are the JOS morphosyntactic specifications (a tagset definition), two annotated corpora, and two Web services. The resources are available under Creative Commons licences.

Implementation description: The corpora and morphosyntactic specifications are encoded in TEI P5 using the additional modules for corpora, linking, analysis and iso-fs plus a few local extensions.
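
To illustrate what reading word-level annotation from a TEI P5 encoded corpus might look like, here is a minimal Python sketch. The sample sentence, the use of the lemma and ana attributes on w elements, and the tag values are assumptions made for illustration only; they are not taken from the JOS distribution, whose actual markup (including its local extensions) may differ. The function name read_tokens is likewise hypothetical.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"

# Hypothetical sample sentence. The attribute choices (lemma, ana) and the
# tag values are illustrative assumptions, not copied from the JOS corpora.
SAMPLE = """<s xmlns="http://www.tei-c.org/ns/1.0" xml:id="s1">
  <w lemma="pes" ana="#Ncmsn">Pes</w>
  <w lemma="spati" ana="#Vmpr3s">spi</w>
  <pc>.</pc>
</s>"""


def read_tokens(xml_text):
    """Return (word form, lemma, tag) triples for every w element."""
    root = ET.fromstring(xml_text)
    tokens = []
    for w in root.iter(f"{{{TEI_NS}}}w"):
        # The ana attribute is a pointer, so drop the leading '#'.
        tag = w.get("ana", "").lstrip("#")
        tokens.append((w.text, w.get("lemma"), tag))
    return tokens


if __name__ == "__main__":
    for form, lemma, tag in read_tokens(SAMPLE):
        print(form, lemma, tag)
```

Punctuation (the pc element) is skipped here because only w elements are iterated; a real reader would probably also need to resolve each tag pointer against the morphosyntactic specifications.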

Related resources: Links to papers describing the corpora are given at http://nl.ijs.si/jos/index-en.html#bib

Copyright information: The corpora are distributed under the Creative Commons Attribution-NonCommercial licence.


Tomaž Erjavec
Department of Knowledge Technologies
Jožef Stefan Institute
Jamova cesta 39
1000 Ljubljana
Slovenia
Email: tomaz.erjavec@ijs.si