Computable Text and Media SIG
The TEI Special Interest Group "Computable Text and Media" aims to explore and provide encoding strategies for documents that are designed to be processed—whether by humans, machines, or both. While many such documents are digital and require computation for interaction or interpretation, the notion of "computability" is not limited to electronic media. Rather, it encompasses any textual form governed by a predefined ruleset that can be executed manually, mechanically, or electronically.
Relations to other SIGs
The Computable Text and Media SIG is related to other SIGs, such as the Correspondence SIG (for email) and the Computer Mediated Communication SIG (for social media posts).
Convener
The SIG is convened by Torsten Roeder (Center for Philology and Digitality, University of Würzburg).
Background
In everyday understanding, “text” typically refers to inscriptions—material or digital—intended for human reading. Much of our textual heritage consists of works like literature, correspondence, journals, and newspapers, grounded in natural language. The TEI Guidelines are well-suited to model such complexity, offering robust tools for representing a wide array of human-authored texts.
Yet even traditional texts may be, in principle, processed by humans or machines according to explicit or implicit rules. This raises several foundational questions: Is “text”—and by extension, the TEI—necessarily tied to natural language and human readability? Are there forms of textual heritage whose full significance depends on rule-based processing, whether before, during, or after interpretation? If so, how can these forms be represented within the TEI framework—and what adaptations might be needed? Moreover, can the rulesets that govern such processing themselves be treated as part of cultural heritage?
While the questions above may seem abstract, they become clearer when grounded in real-world examples. The examples below show that the key characteristic of "computable" text lies not in its material form but in its processability. Although the focus here is on born-digital heritage, the boundary between digital and material documents is often fluid. Moreover, as born-digital cultural heritage continues to grow, and as it faces threats from hardware decay, software obsolescence, legal restrictions, and censorship, developing sustainable encoding strategies for computable media becomes increasingly urgent.
Examples
Formalized Communication: Consider a standardized multiple-choice exam: the printed form encodes a ruleset guiding both how answers are marked and how scores are calculated. Though on paper, it exemplifies "computable" text—its meaning unfolds through a process governed by formal logic, not merely by human reading. A similarly structured form of communication appears in correspondence chess, where players exchange standardized postcards or messages encoding each move using algebraic notation. These communications are fully intelligible only within the framework of the game’s ruleset, and can be processed—by a person or a machine—to reconstruct the state of play. Other examples include library catalog cards that encode bibliographic metadata for rule-based indexing and retrieval. In each case, the text serves not merely as a communicative artifact but as a set of (usually implicit) instructions intended to be parsed and performed, either mentally, manually, or mechanically.
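As a rough illustration of the multiple-choice case, the following Python sketch treats the answer key and the marking rule as an explicit ruleset and applies it to one filled-in sheet. All names here (Question, score_sheet, ANSWER_KEY, SHEET) and the rule that blank or multiply-marked questions earn no points are assumptions made for this example; they are not drawn from any particular exam format or from the TEI Guidelines.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Question:
    """One entry of the (hypothetical) answer key."""
    number: int
    correct: str   # the single correct option, e.g. "B"
    points: int = 1


def score_sheet(key, marks):
    """Apply the ruleset to one sheet and return the total score.

    `marks` maps a question number to the set of options marked on the form.
    """
    total = 0
    for question in key:
        chosen = marks.get(question.number, set())
        # Assumed rule: exactly one mark is required; blank or
        # multiply-marked questions earn no points.
        if len(chosen) == 1 and question.correct in chosen:
            total += question.points
    return total


# Hypothetical answer key and one filled-in sheet.
ANSWER_KEY = [Question(1, "B"), Question(2, "D"), Question(3, "A", points=2)]
SHEET = {1: {"B"}, 2: {"A", "D"}, 3: {"A"}}

print(score_sheet(ANSWER_KEY, SHEET))
```

The same marks yield a different score under a different ruleset (for example, one that deducts points for wrong answers), which is why the ruleset arguably needs to be documented alongside the form itself.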
Text in Digital Environments: Digital correspondence, such as email, further complicates conventional textual models. Beyond sender and recipient data, the creation, transmission, and rendering of emails depend on specific client software and server-side behaviors—introducing elements of performance and variation that are essential to capture. (cf. Beshero-Bondar and Bauman 2024)
Interactive Text: Early works of electronic literature created in HyperCard, for instance, depend on specific software and hardware environments (e.g., pre-OS X Macintosh systems) to function as intended. User interaction is shaped by these environments, prompting us to ask whether and how the TEI might capture not only the textual content but also the interactional logic and interface conditions. (cf. Ensslin 2007)
Obsolete or Destandardized Formats: Early digital periodicals—such as “diskmags” distributed on floppy disks—pose related challenges. These media often rely on custom fonts, visual layouts, or interactive elements that resist standard textual extraction. Here, understanding the original processing context is crucial for recovering the intended presentation of text, images, and sound. (cf. Shtohryn 2025)
Program Code: Program code occupies a dual status as both human-readable document and machine-executable instruction. Source code files, even before compilation, already carry formal structures—syntax, control flow, comments—that are legible to trained readers and interpretable by machines. Similarly, scripting languages used in digital art installations or generative literature are often intended to be performed at runtime, foregrounding their status as both process and product. As with other forms of computable text, software challenges the boundary between static representation and dynamic behavior, inviting us to consider how TEI might model code not merely as text, but as a form of action embedded in cultural and technical ecosystems. (cf. Montfort 2024)
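The dual status of source code can be made concrete with a small Python sketch that approaches the same few lines in three ways: as comments addressed to human readers, as formal structure that can be parsed, and as instructions that can be executed. The sample program and the helper names (SOURCE, comments, function_names, run) are invented for illustration, and the sketch uses only Python's standard library; it says nothing about how such code should be encoded in TEI.

```python
import ast
import contextlib
import io
import tokenize

# A tiny invented program, held as plain text.
SOURCE = '''\
# Print a short generated line.
def greet(name):
    return "Hello, " + name  # concatenate

print(greet("TEI"))
'''


def comments(source):
    """Return the comment tokens, i.e. the layer written for human readers."""
    tokens = tokenize.generate_tokens(io.StringIO(source).readline)
    return [tok.string for tok in tokens if tok.type == tokenize.COMMENT]


def function_names(source):
    """Return the names of all functions defined in the source."""
    tree = ast.parse(source)
    return [node.name for node in ast.walk(tree)
            if isinstance(node, ast.FunctionDef)]


def run(source):
    """Execute the source and return whatever it prints."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        exec(compile(source, "<sample>", "exec"), {})
    return buffer.getvalue()


print(comments(SOURCE))
print(function_names(SOURCE))
print(run(SOURCE), end="")
```

A transcription that records only the characters of SOURCE preserves the first two views but not the third, since the output depends on an interpreter being available.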
Paperware: Printed program code—found in books, journals, or private notes—underscores the hybridity of computable text. Such "paperware" can be read by humans and machines alike, with the intent of execution central to its form. Punchcards offer a compelling historical parallel, combining machine-readable perforations with human-readable text—testament to the enduring entanglement of material and digital representation. (cf. Höltgen 2014; Feichtinger 2023)
References
Beshero-Bondar, Elisa; Bauman, Syd (2024): Can we apply the new CMC chapter to the TEI Listserv Archives? An experiment with TEI for Correspondence and Computer-Mediated Communication, TEI Conference 2024. https://www.conftool.pro/tei2024/…
Ensslin, Astrid (2007): Canonizing Hypertext. Explorations and Constructions.
Feichtinger, Moritz (2023): Annotation, Simulation und Analyse eines historischen Datenbanksystems. In: Burghardt, M. & Weiß, C. (eds.): Lecture Notes in Informatics (LNI), Gesellschaft für Informatik. DOI: 10.18420/inf2023_96
Höltgen, Stefan (2014): Humanities of the Digital. Philologische Perspektiven auf Source Codes als Beitrag einer computerarchäologischen Knowledge Preservation. In: Bartelmus, M. & Nebrig, A. (eds.): Digitale Schriftlichkeit. Programmieren, Prozessieren und Codieren von Schrift, pp. 207–229.
Montfort, Nick (ed.) (2024): Our Generation: Programs + Computer-Generated Texts. Bad Quarto. https://badquar.to/publications/our_generation.html
Shtohryn, Tomash (2025): Semi-automatisierte Erzeugung eines Textkorpus von deutschsprachigen Diskettenmagazinen für das Heimcomputersystem Commodore 64 im TEI-Format, Universität Würzburg. DOI: 10.25972/OPUS-40282