Crowdsourcing Subjective Image Quality Evaluation
Subjective tests are generally regarded as the most reliable and definitive methods for assessing image quality. Nevertheless, laboratory studies are time-consuming and expensive. Thus, researchers often choose to run informal studies or use objective quality measures, producing results that may not correlate well with human perception. In this paper we propose a cost-effective and convenient subjective quality measure called crowdMOS, obtained by having internet workers participate in MOS (mean opinion score) subjective quality studies. Since these workers cannot be supervised, we propose methods for detecting and discarding inaccurate or malicious scores. To facilitate this process, we offer an open source set of tools for Amazon Mechanical Turk, which is an internet marketplace for crowdsourcing. These tools completely automate the test design, score retrieval and statistical analysis, abstracting away the technical details of Mechanical Turk and ensuring a user-friendly, affordable and consistent test methodology. We demonstrate crowdMOS using data from the LIVE subjective image quality dataset, showing that it delivers accurate and repeatable results.