Doodling: A Gaming Paradigm for Generating Language Data

Proceedings of the Human Computation Workshop 2012.

With the advent of the increasingly participatory Internet and the growing power of the crowd, “Serious Games” have proven to be a fertile approach for gathering task-specific natural language data at very low cost. In this paper we outline a game we call Doodling, based on the sketch-and-convey metaphor used in the popular board game Pictionary®, with the goal of generating useful natural language data. We explore whether this paradigm can be extended to convey more complex syntactic and semantic constructs than the words or short phrases typically used in the board game. Through a series of user experiments, we show that this is indeed the case, and that valuable parallel language data can be produced as a byproduct. In addition, we explore extensions to this paradigm along two axes: going online (vs. face-to-face) and going cross-lingual. The results of each set of experiments confirm the potential of the Doodling game to generate data in large quantities and across languages, and thus to provide a new means of developing data sets and technologies for resource-poor languages.