Digging into Signs: Towards a gloss annotation standard for sign language corpora

Author(s): CRASBORN, Onno; BANK, Richard; CORMIER, Kearsy
Year: 2015
Publisher: Technical Report, 2015
License type: Copyright
Format: Digital


Linguistics » Sign language transcription systems, Linguistics » Signed corpora


The Digging into Signs project, a one-year project that ran from mid-2014 to mid-2015, was a joint effort of University College London (PI Kearsy Cormier) and Radboud University (PI Onno Crasborn). The project developed standard annotation protocols for glossing sign language corpora, and included enhancements to the ELAN annotation software and the development of the Signbank lexical database system (Johnston, 2010; Cormier et al., 2012). Relatively recent advances in computer technology and digital video have made it possible to collect and store large datasets of sign language video recordings. As Johnston (2014) emphasises, annotation should be prioritised over transcription. Partly because sign languages lack a commonly used writing system, annotation of lexical signs involves assigning a unique gloss to each sign: the ID-gloss (Johnston, 2008). These ID-glosses are stored in a computerized lexical database so that signs in the corpus can be identified consistently. However, this leaves many complexities to deal with in annotation, as not all signs (or manual articulations more generally) are lexicalized. Although several sign language corpus projects have provided guidelines for annotation (e.g. Crasborn, Mesch, Waters, Nonhebel, Van der Kooij, Woll, & Bergman, 2007; Crasborn & Zwitserlood, 2008; Johnston, 2014; Cormier & Fenlon, 2014), there is no general agreement on annotation standards. Arguments for standardising sign language corpus annotation have been made by Johnston (2008) and Schembri and Crasborn (2010). This document is a firm first step towards providing such general annotation standards for sign language corpora. This first step has included finding the common ground in the existing corpus glossing practices for BSL and NGT, identifying areas where change was needed, and outlining the motivations for choosing between different alternatives.