<?xml version="1.0" encoding="UTF-8"?>
<!-- © TEI Consortium. Dual-licensed under CC-by and BSD2 licenses; see the file COPYING.txt for details. -->
<?xml-model href="https://jenkins.tei-c.org/job/TEIP5-dev/lastSuccessfulBuild/artifact/P5/release/xml/tei/odd/p5.nvdl" type="application/xml" schematypens="http://purl.oclc.org/dsdl/nvdl/ns/structure/1.0"?>
<classSpec xmlns="http://www.tei-c.org/ns/1.0" module="analysis" xml:id="ATTNORM" type="atts" ident="att.lexicographic.normalized">
  <desc versionDate="2021-08-22" xml:lang="en">provides attributes for usage within word-level elements in the analysis module and within lexicographic microstructure in the dictionaries module.</desc>
  <classes/>
  <attList>
    <attDef ident="norm" usage="opt">
      <gloss versionDate="2007-07-04" xml:lang="en">normalized</gloss>
      <gloss versionDate="2007-12-20" xml:lang="ko"> 표준화</gloss>
      <gloss versionDate="2008-04-06" xml:lang="es">normalizado</gloss>
      <gloss versionDate="2008-03-30" xml:lang="fr">normalisé</gloss>
      <gloss versionDate="2007-11-06" xml:lang="it">normalizzato</gloss>
      
      <desc versionDate="2018-03-17" xml:lang="en">provides the normalized/standardized form of information present in the source text in a non-normalized form.</desc>
      <desc versionDate="2007-12-20" xml:lang="ko">비표준형으로 원본 텍스트에서 제시된 정보의 표준형을 제시한다.</desc>
      <desc versionDate="2007-05-02" xml:lang="zh-TW">提供一規格化形式的資訊，其來源文件為非規格化形式</desc>
      <desc versionDate="2008-04-05" xml:lang="ja">当該テキストを、正規形を示す。</desc>
      <desc versionDate="2009-05-28" xml:lang="fr">donne une forme normalisée de l'information fournie par le texte source sous une forme non normalisée.</desc>
      <desc versionDate="2007-05-04" xml:lang="es">proprorciona de manera normalizada información dada en el texto fuente de manera no normalizada</desc>
      <desc versionDate="2007-01-21" xml:lang="it">fornisce la forma normalizzata delle informazioni contenute nel testo di origine in forma non normalizzata.</desc>
      <datatype><dataRef key="teidata.text"/></datatype>
      <exemplum xml:lang="en">
        <p>Normalization of part-of-speech information within a dictionary entry.</p>
        <egXML xmlns="http://www.tei-c.org/ns/Examples" xml:id="ATTNORM-egXML-gi">
          <gramGrp>
            <pos norm="noun">n</pos>
          </gramGrp>
        </egXML>
      </exemplum>
      <exemplum versionDate="2008-04-06" xml:lang="fr">
        <egXML xmlns="http://www.tei-c.org/ns/Examples" xml:id="ATTNORM-egXML-lk" source="#fr-ex-Grand-Robert">
          <gramGrp>
            <pos norm="nom">n.</pos>
          </gramGrp>
        </egXML>
      </exemplum>
      <exemplum xml:lang="en">
        <p>Normalization of a source form in a tokenized historical corpus.</p>
        <egXML xmlns="http://www.tei-c.org/ns/Examples" xml:id="ATTNORM-egXML-pk">
          <s xmlns="">
            <w>for</w>
            <w norm="virtue's">vertues</w>
          <w>sake</w></s>
        </egXML>
      </exemplum>
      <exemplum xml:lang="en">
        <egXML xmlns="http://www.tei-c.org/ns/Examples" xml:id="ATTNORM-egXML-wd">
          <s xmlns="">
            <w norm="persuasion">perswasion</w>
            <w>of</w>
            <w norm="Unity">Vnitie</w>
          </s>
        </egXML>
      </exemplum>

      <exemplum xml:lang="en">
        <p>Example of normalization from <ref target="http://www.deutschestextarchiv.de/anonym_aviso_1609/258">Aviso. Relation oder Zeitung. Wolfenbüttel, 1609. In: Deutsches Textarchiv</ref>.</p>
        <egXML xmlns="http://www.tei-c.org/ns/Examples" xml:id="ATTNORM-egXML-xl">
          <s xmlns="">
            <w norm="freiwillig">freywillig</w>
            <pc norm="," join="left">/</pc>
            <w norm="unbedrängt">vnbedraͤngt</w>
            <w norm="und">vnd</w>
            <w norm="unverhindert">vnuerhindert</w>
          </s>
        </egXML>
      </exemplum>

      <exemplum xml:lang="en">
        <egXML xmlns="http://www.tei-c.org/ns/Examples" xml:id="ATTNORM-egXML-xv">
          <w xmlns="" norm="Teil">Theyll</w>
        </egXML>
      </exemplum>
      <exemplum xml:lang="en">
        <egXML xmlns="http://www.tei-c.org/ns/Examples" xml:id="ATTNORM-egXML-le">
          <w xmlns="" norm="Freude">Frewde</w>
        </egXML>
      </exemplum>
    </attDef>
    
    <attDef ident="orig" usage="opt">
      <gloss versionDate="2007-07-04" xml:lang="en">original</gloss>
      <gloss versionDate="2007-12-20" xml:lang="ko">원본</gloss>
      <gloss versionDate="2007-11-06" xml:lang="it">originale</gloss>
      <desc versionDate="2005-10-10" xml:lang="en">gives the original string or is the empty string when the element does not appear in the source text.</desc>
      <desc versionDate="2007-12-20" xml:lang="ko">원본 문자열을 제시하거나 요소가 원본텍스트에 나타나지 않을 때는 공백 문자열을 제시한다.</desc>
      <desc versionDate="2009-05-28" xml:lang="fr">indique la chaîne originale ou contient une chaîne vide si l'élément n'apparaît pas dans le texte source.</desc>
      <desc versionDate="2007-05-02" xml:lang="zh-TW">當該元素未出現於來源文件中，提供其原文字串或空白字串。</desc>
      <desc versionDate="2008-04-05" xml:lang="ja">元の文字列を示す。当該要素が元資料に無い場合には空になる。</desc>
      <desc versionDate="2007-05-04" xml:lang="es">da la serie original o la serie vacía cuando el elemento no aparece en el texto fuente</desc>
      <desc versionDate="2007-01-21" xml:lang="it">fornisce la stringa originale o la srtinga vuota quando l'elemento non compare nel testo di origine.</desc>
      <datatype><dataRef key="teidata.text"/></datatype>
      
      <exemplum xml:lang="en">
        <p>Example from a language documentation project of the Mixtepec-Mixtec language (ISO 639-3:
        'mix'). This is a use case where speakers spell something incorrectly but we would
        like to preserve it for any number of reasons, the use of <att>orig</att> is essential and could
        have uses for both the speaker to see past mistakes, researchers to get insight into how
        untrained speakers write their language instinctually (in contrast to prescribed
        convention), etc.:</p>
        <egXML xmlns="http://www.tei-c.org/ns/Examples" xml:id="ATTNORM-egXML-qs">
          <w orig="ntsa sia'i">ntsasia'i</w>
        </egXML>
      </exemplum>
      <exemplum xml:lang="en">
        <p>Example from the <ref target="https://earlyprint.org">EarlyPrint</ref> project. Fragment
        of text where obvious errors have been corrected but the original forms remain
        recorded:</p>
        <egXML xmlns="http://www.tei-c.org/ns/Examples" xml:id="ATTNORM-egXML-ah">
          <w lemma="he" pos="pns" xml:id="b1afj-003-a-0950">he</w>
          <w lemma="have" pos="vvz" xml:id="b1afj-003-a-0960">hath</w>
          <w lemma="bring" pos="vvn" xml:id="b1afj-003-a-0970">brought</w>
          <w lemma="forth" pos="av" xml:id="b1afj-003-a-0980" orig="sorth">forth</w>
        </egXML>
      </exemplum>
      <exemplum xml:lang="en">
        <p>An example from the EarlyPrint project showing the use of both <att>norm</att> and <att>orig</att>. The <att>orig</att> attribute preserves the original version (sometimes with spelling errors, often with printer abbreviations), the element content resolves printer abbreviations but retains the original orthography, and the <att>norm</att> attribute holds normalized values:</p>
        <egXML xmlns="http://www.tei-c.org/ns/Examples" xml:id="ATTNORM-egXML-sb">
          <w lemma="commandment" pos="n1" norm="commandment" xml:id="b9avr-018-a-7720" orig="commandemēt">commandement</w>
        </egXML>
      </exemplum>
    </attDef>

  </attList>
  <remarks versionDate="2020-02-12" xml:lang="en">
    <p>It needs to be stressed that the two attributes in this class are meant for strictly
    lexicographic and linguistic uses, and not for editorial interventions. For the latter, the
    mechanism based on <gi>choice</gi>, <gi>orig</gi>, and <gi>reg</gi> needs to be employed.</p>
  </remarks>
</classSpec>
