This dataset was designed primarily for work on isolated sign language recognition (ISLR), and within that space we recommend using this dataset for the task of dictionary retrieval. We define the task of video-based dictionary retrieval as: given a video of a person demonstrating a single sign through a webcam, the system retrieves a ranked list of dictionary entries that match that sign. This task is useful for creating reliable ASL-to-ASL or ASL-to-English dictionaries, which are essential tools for language learners and users.
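As a rough illustration of the retrieval task defined above, the sketch below ranks dictionary entries by cosine similarity between a query embedding and one embedding per dictionary entry. It assumes that some ISLR model (not shown here, and not prescribed by this dataset) has already mapped each video to a fixed-length vector. The function names, the toy glosses, and the three-dimensional vectors are all hypothetical and chosen only for illustration.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_dictionary_entries(
    query_embedding: Sequence[float],
    dictionary_embeddings: Dict[str, Sequence[float]],
    top_k: int = 3,
) -> List[Tuple[str, float]]:
    """Return up to top_k (entry, score) pairs, best match first."""
    scored = [
        (entry, cosine_similarity(query_embedding, embedding))
        for entry, embedding in dictionary_embeddings.items()
    ]
    # Higher similarity means a closer match, so sort in descending order.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Toy data: hypothetical embeddings standing in for model outputs.
toy_dictionary = {
    "APPLE": [1.0, 0.0, 0.0],
    "ONION": [0.7, 0.7, 0.0],
    "BOOK": [0.0, 0.0, 1.0],
}
toy_query = [1.0, 0.0, 0.1]  # embedding of one webcam video of a single sign

for entry, score in rank_dictionary_entries(toy_query, toy_dictionary):
    print(f"{entry}: {score:.3f}")
```

Returning a ranked list rather than a single prediction matches the task definition: a dictionary user can scan the top few results and pick the intended sign even when the best match is not ranked first.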
We caution against using this dataset for understanding continuous signing by tokenizing sequences of signs that might map to our dataset. Continuous signing (full sentences and longer signed content) introduces many grammatical and structural complexities not present in isolated signs (e.g., modulation of signs, co-articulation effects, and context-dependent changes in the meaning of signs). Many of these complexities are also absent from spoken/written languages. At a minimum, this dataset would need to be used in conjunction with other datasets and/or domain knowledge about sign language in order to tackle continuous recognition or translation.
We ask that this dataset be used with an aim of making the world more equitable and just for deaf people, and with a commitment to do no harm. In that spirit, this dataset should not be used to develop technology that purports to replace sign language interpreters, fluent signing educators, and/or other hard-won accommodations for deaf people. We also ask that users of this dataset make no attempt to identify participants, or to use this dataset for applications that might exploit participant identity or appearance, including (but not limited to) facial recognition, deepfakes, or identification of sensitive attributes like race.
For whichever application you choose, we recommend using this data with meaningful involvement from Deaf community members at every step. As we describe in our linked paper, research and development of sign language technologies that involves Deaf community members in leadership roles with decision-making authority increases the quality of the work, and can help to ensure technologies are relevant and wanted. Historically, projects developed without meaningful Deaf involvement have not been well received and have damaged relationships between technologists and Deaf communities.
Please see the links below for an entry point to more information about respectful sign language technology development.
- Disability Dongle – Liz Jackson, Alex Haagaard, Rua Williams
- SignAloud Open Letter – Lance Forshay, Kristi Winter, Emily M. Bender
- Is “good enough” good enough? Ethical and responsible development of sign language technologies – Maartje De Meulder
- Nothing About Us Without Us: Disability Oppression and Empowerment – James I. Charlton
- The FATE Landscape of Sign Language AI Datasets: An Interdisciplinary Perspective – Danielle Bragg, Naomi Caselli, Julie A. Hochgesang, Matt Huenerfauth, Leah Katz-Hernandez, Oscar Koller, Raja Kushalnagar, Christian Vogler, Richard E. Ladner