Supporting exploratory data analysis with live programming
- Robert DeLine
- Danyel Fisher
Visual Languages and Human-Centric Computing (VL/HCC), 2015 IEEE Symposium on
Published by IEEE
Data scientists often conduct exploratory data analysis in scripting environments with a read-eval-print loop (REPL), like R, IPython or MATLAB. This user experience requires diligent management of execution and generates lengthy histories of unwanted command responses. This paper explores the alternative of live programming, a user experience in which the user’s edits immediately and automatically update the script results, a “ripple” effect familiar from spreadsheets. Which user experience provides better support for exploratory data analysis, REPL or ripple? We conducted a controlled lab study with 15 data-experienced professionals. Each participant explored four datasets, two in each experience. The REPL sessions left histories with both significantly more data results and significantly more errors than the live sessions. However, both experiences produced comparable numbers of data results that participants self-rated as insightful. Participants largely preferred the live experience for its responsiveness and ability to keep the script content clean, but missed the visible history that a REPL provides.
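
The "ripple" behavior described above can be illustrated with a toy model. The following is a minimal sketch, not the tool studied in the paper: the `LiveScript` class, its `edit` method, and the step names are all hypothetical, chosen only to show how an edit to one step automatically refreshes every result without leaving a command history.

```python
class LiveScript:
    """Hypothetical toy model of a live script: every edit re-runs all steps."""

    def __init__(self):
        self.steps = {}    # step name -> expression source, in script order
        self.results = {}  # step name -> latest value (or an error message)

    def edit(self, name, expr):
        """Add or change a step, then immediately re-evaluate the whole script."""
        self.steps[name] = expr
        self._rerun()

    def _rerun(self):
        # Start from a clean slate: stale results and old errors are discarded,
        # which is why no history accumulates (unlike a REPL transcript).
        env = {}
        self.results = {}
        allowed = {"__builtins__": {}, "sum": sum, "len": len}
        for name, expr in self.steps.items():
            try:
                env[name] = eval(expr, allowed, env)
                self.results[name] = env[name]
            except Exception as exc:  # record the error in place of a result
                self.results[name] = f"error: {exc}"


script = LiveScript()
script.edit("data", "[3, 5, 8]")
script.edit("total", "sum(data)")
script.edit("mean", "total / len(data)")

# Editing the first step ripples through the dependent steps automatically.
script.edit("data", "[3, 5, 10]")
print(script.results)
```

In a REPL, the user would have to re-issue the `total` and `mean` commands after changing `data`, and the earlier outputs would remain in the session history. Here the script holds only the current results, which mirrors both the cleanliness participants liked and the missing history they noted.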