Machine Learning Tools for Exploring Compositionality in the American SL Lexicon
Topics
Details
Accurate, precise models of sub-lexical compositionality with high coverage over the lexicon could be a valuable tool for studying processes like acquisition, lexicalization, and specific cases of comprehension (such as neologisms and classifier predicates). However, characterizing the systematic relationships between form and meaning in sign language lexicons is challenging, not simply because human annotations are expensive to obtain, but also because those relationships are often probabilistic and vary across individuals and contexts. This type of pattern recognition is well within the scope of machine learning (ML) methodology, and for American SL we now have sufficient data to empirically test whether, and to what extent, ML methods can learn compositionality as a task unto itself. This seminar will introduce several open-source tools designed for linguists and cognitive scientists interested in studying compositionality at scale, with an emphasis on models that automatically identify the phonological and lexical-semantic features of isolated sign productions, especially signs the ML model has never seen before. Experimental results support the hypothesis that machine learning models can internalize certain form-meaning relationships, and that this ability is helpful in (a) predicting average age of acquisition, (b) predicting free associations from 41 early-acquisition deaf signers, (c) approximating the meaning of unseen signs (in isolation), and (d) reproducing the process of new sign creation. Future work will aim to improve the operationalization of lexical semantics and apply these tools to sentence-level data.
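To make the zero-shot setting concrete, here is a minimal, hypothetical sketch (not the seminar's actual tools; the data, feature sets, and model choice are all illustrative assumptions): a linear map is fit from phonological feature vectors to semantic embeddings on a training set of signs, then used to approximate the meaning of signs held out of training.

```python
# Illustrative sketch only: synthetic stand-ins for phonological features
# (e.g. handshape, location, movement) and per-sign semantic embeddings.
import numpy as np

rng = np.random.default_rng(0)
n_signs, n_phon, n_sem = 200, 30, 16

# Binary phonological feature matrix and dense semantic embeddings,
# generated from a hidden linear relationship plus noise.
X = rng.integers(0, 2, size=(n_signs, n_phon)).astype(float)
W_true = rng.normal(size=(n_phon, n_sem))
Y = X @ W_true + 0.1 * rng.normal(size=(n_signs, n_sem))

# Hold out "unseen" signs for zero-shot evaluation.
train, test = slice(0, 160), slice(160, None)

# Ridge regression in closed form: W = (X'X + lam*I)^-1 X'Y
lam = 1.0
Xtr, Ytr = X[train], Y[train]
W = np.linalg.solve(Xtr.T @ Xtr + lam * np.eye(n_phon), Xtr.T @ Ytr)

# Score: cosine similarity between predicted and true embeddings
# for signs the model never saw during fitting.
pred = X[test] @ W
cos = (pred * Y[test]).sum(axis=1) / (
    np.linalg.norm(pred, axis=1) * np.linalg.norm(Y[test], axis=1))
print("mean cosine on held-out signs:", round(float(cos.mean()), 3))
```

In real lexicons the form-meaning relationship is only partially systematic, so held-out performance would sit well below this synthetic ceiling; the point of the sketch is just the evaluation protocol, fit on seen signs and score on unseen ones.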
