What makes an Expert? Comparing Problem-solving Practices in Data Science Notebooks

Experts use short, iterative workflows, while novices follow long, linear processes, a new analysis of 440 notebooks finds.

Deep Dive

Researchers Manuel Valle Torre, Marcus Specht, and Catharine Oertel analyzed 440 Jupyter notebooks to compare how experts and novices practice data science. Their multi-level sequence analysis found that the two groups do not differ in high-level phase transitions (such as moving from Data Import to exploratory data analysis, or EDA). Instead, expertise shows up in shorter, more iterative workflows and in efficient, context-specific action sequences at the cell level. These findings give educators empirical grounding for curricula that emphasize flexible, iterative thinking.
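To make the idea of phase-level sequence analysis concrete, here is a minimal Python sketch. It is not the authors' method or code. The phase labels, the two example notebooks, and the helper names `phase_transitions` and `revisit_count` are all hypothetical, chosen only to illustrate how a notebook can be treated as a sequence of labeled cells and how transitions and returns to earlier phases might be counted.

```python
from collections import Counter
from itertools import groupby


def collapse_phases(cell_phases):
    """Collapse consecutive cells with the same phase into one entry."""
    return [phase for phase, _ in groupby(cell_phases)]


def phase_transitions(cell_phases):
    """Count transitions between consecutive, distinct phases."""
    collapsed = collapse_phases(cell_phases)
    return Counter(zip(collapsed, collapsed[1:]))


def revisit_count(cell_phases):
    """Count how often a notebook returns to a phase it had already left."""
    collapsed = collapse_phases(cell_phases)
    return len(collapsed) - len(set(collapsed))


# Hypothetical notebooks, one phase label per cell (illustrative data only).
linear_notebook = ["import", "import", "eda", "eda", "eda", "model", "model", "eval"]
iterative_notebook = ["import", "eda", "model", "eda", "model", "eval"]

for name, cells in [("linear", linear_notebook), ("iterative", iterative_notebook)]:
    print(name, "cells:", len(cells), "revisits:", revisit_count(cells))
    print(dict(phase_transitions(cells)))
```

In this toy data the iterative notebook is shorter yet loops back to earlier phases twice, while the linear notebook never does. A real analysis would need a labeling scheme for cells and a far larger sample, neither of which this sketch attempts to reproduce.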

Why It Matters

This research provides a data-driven framework for training the next generation of data scientists, especially as AI tools reshape the field.