PeruSIL: A Framework to Build a Continuous Peruvian Sign Language Interpretation Dataset

Author(s): BEJARANO, Gissella Maria; HUAMANI-MALCA, Joe; CERNA-HERRERA, Francisco; ALVA-MANCHEGO, Fernando; RIVAS, Pablo
Year: 2022
Publisher: European Language Resources Association
Code type: Copyright
Format: Digital


Media and access to information » New Technologies


Video-based datasets for continuous sign language are scarce because recording videos of native signers is challenging and few people are able to annotate sign language. COVID-19 has highlighted the key role of sign language interpreters in delivering nationwide health messages to deaf communities. In this paper, we present a framework for creating a multi-modal sign language interpretation dataset based on videos, and we use it to create the first dataset for Peruvian Sign Language (LSP) interpretation. The dataset was annotated by hearing volunteers with intermediate knowledge of LSP, guided by the audio of the videos. We rely on hearing people to produce a first version of the annotations, which should be reviewed by native signers in the future. Our contributions are: i) we design a framework to annotate a sign language dataset; ii) we release the first annotated LSP multi-modal interpretation dataset (AEC); iii) we evaluate the annotation done by hearing people by training a sign language recognition model. Our model reaches up to 80.3% accuracy among a minimum of five classes (signs) in the AEC dataset, and 52.4% on a second dataset. Nevertheless, an analysis by subject in the second dataset shows variations worth discussing.

In Proceedings of the LREC2022 10th Workshop on the Representation and Processing of Sign Languages: Multilingual Sign Language Resources.