Co-audit: tools to help humans double-check AI-generated content
- Andy Gordon
- Carina Negreanu
- José Cambronero
- Rasika Mudumbai Chakravarthy
- Ian Drosos
- Hao Fang
- Bhaskar Mitra
- Hannah Richardson (nee Murfet)
- Advait Sarkar
- Stephanie Simmons
- Jack Williams
- Ben Zorn
Users are increasingly being warned to check AI-generated content for correctness. Still, as LLMs (and other generative models) generate more complex output, such as summaries, tables, or code, it becomes harder for the user to audit or evaluate the output for quality or correctness. Hence, we are seeing the emergence of tool-assisted experiences to help the user double-check a piece of AI-generated content. We refer to these as co-audit tools. Co-audit tools complement prompt engineering techniques: prompt engineering helps the user construct the input prompt, while co-audit helps them check the output response. As a specific example, this paper describes recent research on co-audit tools for spreadsheet computations powered by generative models. We explain why co-audit experiences are essential for any application of generative AI where quality is important and errors are consequential (as is common in spreadsheet computations). We propose a preliminary list of principles for co-audit and outline research challenges.