Downs and Acrosses: Textual Markup on a Stroke Level
Authors: Melissa Terras (1); Paul Robertson (2)
Source: Literary and Linguistic Computing, Volume 19, Number 3, September 2004, pp. 397-414(18)
Publisher: Oxford University Press
Abstract:
Textual encoding is one of the main focuses of Humanities Computing. However, existing encoding schemes and initiatives focus on text from the character level upwards, and are of little use to scholars, such as papyrologists and palaeographers, who study the constituent strokes of individual characters. This paper discusses the development of a markup system used to annotate a corpus of images of Roman texts, resulting in an XML representation of each character on a stroke by stroke basis. The XML data generated allows further interrogation of the palaeographic data, increasing the knowledge available regarding the palaeography of the documentation produced by the Roman Army. Additionally, the corpus was used to train an Artificial Intelligence system to effectively read in stroke data of unknown text and output possible, reliable, interpretations of that text: the next step in aiding historians in the reading of ancient texts. The development and implementation of the markup scheme is introduced, the results of our initial encoding effort are presented, and it is demonstrated that textual markup on a stroke level can extend the remit of marked-up digital texts in the humanities.
Document Type: Research article
Affiliations: 1: University College London, UK; 2: Massachusetts Institute of Technology, USA