What makes an Expert? Comparing Problem-solving Practices in Data Science Notebooks
Experts use short, iterative workflows, while novices follow long, linear processes, a new analysis of 440 notebooks finds.
Researchers Manuel Valle Torre, Marcus Specht, and Catharine Oertel analyzed 440 Jupyter notebooks to compare expert and novice data science practices. Their multi-level sequence analysis found that experts and novices do not differ in high-level phase transitions (such as Data Import to EDA). Instead, expertise shows up as shorter, more iterative workflows and as efficient, context-specific action sequences at the cell level. These findings give educators empirical grounding for curricula that emphasize flexible, iterative thinking.
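To make the idea of phase-level sequence comparison concrete, here is a minimal Python sketch. It is not the authors' method: the phase labels beyond "Data Import" and "EDA", the two example notebooks, and the helper names (`phase_transitions`, `backward_steps`, `summarize`) are all assumptions made for illustration. It treats a notebook as an ordered list of per-cell phase labels, counts transitions between phases, and uses returns to an earlier phase as a rough proxy for iteration.

```python
from collections import Counter

# Hypothetical phase ordering. Only "Data Import" and "EDA" appear in the
# summary above; "Modeling" and "Evaluation" are assumed for illustration.
PHASE_ORDER = ["Data Import", "EDA", "Modeling", "Evaluation"]


def phase_transitions(cells):
    """Count transitions between consecutive cells whose phase label differs."""
    return Counter((a, b) for a, b in zip(cells, cells[1:]) if a != b)


def backward_steps(cells):
    """Count moves back to an earlier phase (a crude proxy for iteration)."""
    rank = {phase: i for i, phase in enumerate(PHASE_ORDER)}
    return sum(1 for a, b in zip(cells, cells[1:]) if rank[b] < rank[a])


def summarize(cells):
    """Bundle workflow length, phase transitions, and backward steps."""
    return {
        "length": len(cells),
        "transitions": phase_transitions(cells),
        "backward_steps": backward_steps(cells),
    }


# Two invented notebooks: one short and iterative, one long and linear.
short_iterative = ["Data Import", "EDA", "Modeling", "EDA", "Modeling", "Evaluation"]
long_linear = (
    ["Data Import"] * 2 + ["EDA"] * 5 + ["Modeling"] * 4 + ["Evaluation"] * 2
)

if __name__ == "__main__":
    for name, cells in [("short_iterative", short_iterative), ("long_linear", long_linear)]:
        print(name, summarize(cells))
```

In this toy example both notebooks pass through the same forward phase transitions, yet they differ in length and in how often they loop back, which mirrors the kind of distinction the study reports. A real analysis would need a labeling scheme for cells and a finer-grained, cell-level action vocabulary, neither of which is specified here.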
Why It Matters
This research provides a data-driven framework for training the next generation of data scientists, especially as AI tools reshape the field.