Tracing Data Errors with View-Conditioned Causality
- Alexandra Meliou
- Wolfgang Gatterbauer
- Suman Nath
- Dan Suciu
SIGMOD'11: Proceedings of the 2011 ACM SIGMOD International Conference on Management of Data
Published by ACM
A surprising query result is often an indication of errors in
the query or the underlying data. Recent work suggests using
causal reasoning to find explanations for the surprising
result. In practice, however, one often has multiple queries
and/or multiple answers, some of which may be considered
correct and others unexpected. In this paper, we focus on
determining the causes of a set of unexpected results, possibly
conditioned on some prior knowledge of the correctness
of another set of results. We call this problem
View-Conditioned Causality. We adapt the definitions of causality
and responsibility for the case of multiple answers/views
and provide a non-trivial algorithm that reduces the problem
of finding causes and their responsibility to a satisfiability
problem that can be solved with existing tools. We evaluate
both the accuracy and effectiveness of our approach on a real
dataset of user-generated mobile device tracking data, and
demonstrate that it can identify causes of error more effectively
than static Boolean influence and alternative notions
of causality.
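
To make the idea of conditioning on correct results concrete, the following is a minimal brute-force sketch, not the paper's satisfiability-based algorithm. It assumes a simplified reading of the definitions: inputs are Boolean variables, each view is a Boolean function of them, a variable is a cause if flipping it together with some contingency set of other variables makes every view match its expected value while flipping the contingency set alone does not, and its responsibility is 1 / (1 + size of the smallest such set). The function name `responsibilities` and the toy views `v1` and `v2` are hypothetical.

```python
from itertools import combinations


def responsibilities(assignment, views, expected):
    """Brute-force responsibility of each input variable (illustrative only).

    assignment: dict name -> bool, the current (possibly erroneous) inputs
    views:      list of functions taking such a dict and returning a bool
    expected:   list of bool, the output each view should have produced
    """
    names = list(assignment)

    def matches(flipped):
        # Flip the chosen variables and check every view against its expected value.
        trial = {n: (not v if n in flipped else v) for n, v in assignment.items()}
        return all(view(trial) == want for view, want in zip(views, expected))

    result = {}
    for x in names:
        others = [n for n in names if n != x]
        # Try contingency sets in order of increasing size, so the first hit is minimal.
        for size in range(len(others) + 1):
            if any(matches(set(g) | {x}) and not matches(set(g))
                   for g in combinations(others, size)):
                result[x] = 1 / (1 + size)
                break
    return result


# Toy instance (assumed for illustration): v1 is unexpectedly True, v2 is known to be correct.
inputs = {"a": True, "b": True, "c": False}
v1 = lambda t: t["a"] and t["b"]   # observed True, expected False
v2 = lambda t: t["a"] or t["c"]    # observed True, expected True

conditioned = responsibilities(inputs, [v1, v2], [False, True])
unconditioned = responsibilities(inputs, [v1], [False])
```

In this toy instance, looking at `v1` alone makes `a` and `b` equally responsible, whereas conditioning on the correct view `v2` singles out `b`: flipping `a` by itself would break `v2`. The enumeration is exponential in the number of variables, which is why a reduction to a satisfiability problem, as the abstract describes, is attractive in practice.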