Preserving Privacy in Crowd-Powered Systems

In Proceedings of the Workshop on Human-Agent Interaction Design and Models (HAIDM 2015) at AAMAS 2015

It can be hard to automatically identify sensitive content in images or other media because significant context is often necessary to interpret noisy content and complex notions of sensitivity. Online crowds can help computers interpret information that cannot be understood algorithmically. However, systems that use this approach can unwittingly show workers information that should remain private. For instance, images sent to the crowd may accidentally include faces or geographic identifiers in the background, and information pertaining to a task (e.g., the amount of a bill) may appear alongside private information (e.g., an account number). This paper introduces an approach for using crowds to filter out information in sensory data that should remain private while retaining the information needed to complete a specified task. The pyramid workflow that we introduce allows crowd workers to identify private information without ever having complete access to the (potentially private) information they are filtering. Our approach is flexible, easily configurable, and can protect user information in settings where automated approaches fail. Our experiments with 4685 crowd workers show that it performs significantly better than previous approaches.