RWTH-Phoenix: Analysis of the German Sign Language Corpus

Author: STEIN, Daniel; FORSTER, Jens; ZELLE, Uwe; DREUW, Philippe; NEY, Hermann
Year: 2010
Publisher: RWTH Aachen University, Germany
Code type: Copyright
Format: Digital


Linguistics » Linguistics of other sign languages, Linguistics » Sign language transcription systems, Linguistics » Signed corpora


For data-driven automatic sign language processing, finding a suitable corpus is still one of the main obstacles. Most available data collections focus on linguistic issues and have a domain that is too broad to be suitable for these approaches. In (Bungeroth et al., 2006), the RWTH-Phoenix corpus was described, a collection of richly annotated video data from the domain of German weather forecasting. It includes a bilingual text-based sentence corpus and a collection of monolingual data of the German sentences. This domain was chosen since it is easily extendable, has a limited vocabulary and features real-life data rather than material made under lab conditions. In this work, we analyse the recent additions made to the existing corpus and their impact on automatic machine translation. We also apply some recent advancements in the field of statistical machine translation and analyse whether they work on tiny data collections.