Some data on the shape of the forgetting curve
Ebbinghaus's classic forgetting curve may not match real flashcard performance data.
A recent LessWrong post by user nwm challenges the widely accepted exponential forgetting curve, originally derived from Hermann Ebbinghaus's 1885 experiments. The classic curve, often depicted as a steep drop in retention over time, is based on 'savings' (the time saved when relearning information) rather than on direct measures of recall probability. nwm analyzes their own spaced-repetition data, specifically performance on the fourth response after three consecutive correct answers (the CCC prefix), and finds that the data does not clearly follow an exponential curve.
Using Bayesian information criterion (BIC) analyses, which penalize a model for each additional parameter, nwm shows that models with fewer parameters, including linear fits, perform nearly as well as more complex ones. This suggests that the data does not strongly favor any one curve shape. The author's pragmatic takeaway is to prioritize the ergonomics of the broader learning system over fine-tuning algorithmic details. The post adds to ongoing debates in the spaced-repetition community, where some advocate for power laws or other models, but the core insight remains: real-world forgetting curves may not match textbook schematics.
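The kind of BIC comparison described above can be sketched in a few lines. This is a minimal illustration, not nwm's actual analysis: the `days` and `recall` values are invented for the example, the helper names `linear_fit` and `bic` are hypothetical, and the exponential and power-law curves are fitted by least squares in log space, which is a simplification of a full nonlinear fit. BIC is computed under a Gaussian-error assumption as n·ln(RSS/n) + k·ln(n), where lower is better.

```python
import math

def linear_fit(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b

def bic(ys, preds, k):
    """BIC under Gaussian errors: n*ln(RSS/n) + k*ln(n). Lower is better."""
    n = len(ys)
    rss = sum((y - p) ** 2 for y, p in zip(ys, preds))
    return n * math.log(rss / n) + k * math.log(n)

# ASSUMPTION: made-up illustrative data, not taken from the post.
days = [1, 2, 4, 7, 14, 30, 60]                      # interval before 4th review
recall = [0.95, 0.93, 0.92, 0.90, 0.86, 0.80, 0.71]  # fraction answered correctly

results = {}

# Constant model (1 parameter): recall does not depend on the interval.
mean = sum(recall) / len(recall)
results["constant"] = bic(recall, [mean] * len(recall), k=1)

# Linear model (2 parameters): recall = a + b * t.
a, b = linear_fit(days, recall)
results["linear"] = bic(recall, [a + b * t for t in days], k=2)

# Exponential model (2 parameters): recall = exp(a + b * t), fitted in log space.
a, b = linear_fit(days, [math.log(r) for r in recall])
results["exponential"] = bic(recall, [math.exp(a + b * t) for t in days], k=2)

# Power-law model (2 parameters): recall = exp(a) * t**b, fitted in log-log space.
a, b = linear_fit([math.log(t) for t in days], [math.log(r) for r in recall])
results["power_law"] = bic(recall, [math.exp(a) * t ** b for t in days], k=2)

for name, score in sorted(results.items(), key=lambda kv: kv[1]):
    print(f"{name:12s} BIC = {score:.2f}")
```

When several models land within a few BIC points of each other, the data gives little reason to prefer one curve shape over another, which is the situation the post describes. Residuals are measured in the original recall scale for every model so that the scores are comparable.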
- Ebbinghaus's forgetting curve is based on a 'savings' metric, not on modern recall probability.
- nwm's CCC-prefix data shows no clear exponential fit; linear models perform similarly.
- The author suggests focusing on learning-system ergonomics over precise algorithmic models.
Why It Matters
The analysis challenges a foundational assumption in spaced-repetition software design, namely that forgetting follows a known exponential curve, and argues for shifting effort from algorithmic precision to user experience.