The internet and social media have given rise to a broad range of new communicative genres which are subsumed under the term computer-mediated communication (CMC). These genres include chats, forums, text messaging (SMS, WhatsApp), interaction on wiki talk pages and in blog comments, via Twitter, on social network sites, and in multimodal 3D environments. A TEI standard for the representation of those genres and their structural and linguistic peculiarities is a desideratum in both the digital humanities and computer science. Such a standard would foster interoperability between language resources as well as the analysis and automatic exploitation of such resources in several respects:
- It would allow scholars to build interoperable CMC corpora for different languages and thus enhance the empirical basis for CMC research across languages and cultures.
- It would allow scholars to build CMC resources which are interoperable with text and speech corpora that are already represented in TEI and thus pave the way for corpus-based research on language use across different types of corpora (i.e., comparative analysis of language use in CMC, in edited text and in spoken language).
- By including models for the description of not only verbal but also non-verbal acts, it would allow scholars to describe and analyse CMC across different modalities.
The TEI special interest group (SIG) “computer-mediated communication” is developing and discussing suggestions for adapting the TEI guidelines for the representation of genres of computer-mediated communication. The group’s work focuses on, but is not limited to, tasks such as:
- modeling user contributions (posts) to written CMC interactions (which share features both with written and spoken language) as well as the interplay of written posts, spoken utterances and non-verbal acts in multimodal CMC environments;
- modeling CMC document structures (“CMC macrostructures”, e.g., forum threads, wiki talk pages, chat logfiles, Twitter timelines);
- annotating linguistic features within user posts (“CMC microstructures”, i.e., elements such as emoticons, addressing terms, hashtags, and quotes from prior posts);
- representing linked data and media objects connected with/embedded in CMC discourse;
- developing metadata schemas for the description of CMC resources;
- developing perspectives for the representation of discourse in multimodal CMC environments in which the participants in one interaction space combine a variety of modalities from written, spoken and non-verbal modes.
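To make the first tasks in the list above more concrete, the following is a minimal sketch of how a single user post with a hashtag might be turned into TEI-like XML. It is an illustration only and not the SIG's schema: the element name `post`, the use of `rs` with `type="hashtag"`, the attribute choices, and the function name `encode_post` are all assumptions made for this example.

```python
import re
import xml.etree.ElementTree as ET


def encode_post(author_id, timestamp, text):
    """Wrap one user post in a hypothetical <post> element and mark hashtags.

    All element and attribute names are illustrative assumptions,
    not taken from the SIG's customizations.
    """
    post = ET.Element("post", {"who": "#" + author_id, "when": timestamp})
    post.text = ""
    pos = 0
    last = None  # most recently added hashtag element, if any
    for match in re.finditer(r"#\w+", text):
        chunk = text[pos:match.start()]
        if last is None:
            post.text = chunk   # text before the first hashtag
        else:
            last.tail = chunk   # text between two hashtags
        last = ET.SubElement(post, "rs", {"type": "hashtag"})
        last.text = match.group()
        pos = match.end()
    rest = text[pos:]
    if last is None:
        post.text = rest        # no hashtag found: plain text post
    else:
        last.tail = rest        # text after the last hashtag
    return post


# A thread as a simple container for posts (a "macrostructure" in the
# sense used above); the div type value is again an assumption.
thread = ET.Element("div", {"type": "thread"})
thread.append(encode_post("u1", "2015-10-28T14:03:00", "See you at #tei2015 in Lyon"))
print(ET.tostring(thread, encoding="unicode"))
```

The sketch keeps the post text intact and only adds inline markup around the hashtag, so the original wording can always be recovered by reading the text content of the element.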
Michael Beißwenger, University of Duisburg-Essen
Wiki space and mailing lists
For exchange on the issues and tasks listed above, the SIG uses the talk pages in the TEI wiki and a mailing list.
Visit the SIG’s space in the TEI wiki.
Mailing list: firstname.lastname@example.org (For subscription please visit https://groups.google.com/d/forum/tei-cmc)
Previous meetings and panels of the SIG have been held at the TEI Conference in Rome (2013), at the 2nd CMC Corpora Conference in Dortmund (2014), as part of the 4th DARIAH-EU VCC meeting in Rome (2014), at the 3rd CMC Corpora Conference in Rennes (2015) and at the TEI Conference in Lyon (2015). Activities of the SIG in 2016 include meetings at the 4th CMC Corpora Conference in Ljubljana and at the TEI Conference in Vienna.
Documentation of the SIG’s work and activities (including TEI customizations and schemas for CMC) can be found in the SIG space of the TEI wiki.