Deep dive

Spaced repetition & SM-2

The science and engineering behind Educator's adaptive card picker: why reviewing at the right moment matters, how the SM-2 algorithm calculates that moment, and exactly how Educator implements it.

What is spaced repetition?

The forgetting curve

In the 1880s, German psychologist Hermann Ebbinghaus memorised lists of nonsense syllables and then tested himself at intervals to measure how much he retained. The result, the “forgetting curve”, showed that memory decays exponentially after learning stops. Within 24 hours of a single study session, most people retain only about 40% of what they encountered. Within a week, that falls below 25% without any follow-up.

The shape of the curve is predictable: it drops steeply at first, then flattens. Each time you successfully recall something, the curve resets, but it also becomes shallower. A memory you've retrieved three times decays far more slowly than one you've only encountered once.

Reviewing at exactly the right moment

Spaced repetition exploits this: the optimal time to review a card is just before you would forget it. Review too soon and you waste the session on something already secure. Review too late and the memory has already decayed, so you are essentially relearning from scratch. Schedule the review at the edge of forgetting and you get the maximum strengthening effect for the minimum time investment.

Spaced practice vs. massed practice

Decades of cognitive science confirm that spreading practice across multiple sessions (“spaced practice”) produces dramatically better long-term retention than the same total study time crammed into one sitting (“massed practice”, or simply cramming). A student who revises proteins across four sessions of 15 minutes over two weeks will remember significantly more at exam time than one who spends a single hour on the same material the night before.

This effect is robust across ages, subjects, and difficulty levels. It is one of the most replicated findings in educational psychology.

Short sessions daily beat one long session weekly. A 10-minute Educator session every day is more valuable than a 70-minute session on Sunday. The daily session reinforces memories at the moment they are fading; the weekly marathon mostly re-encounters things already forgotten. Educator's daily streak mechanic is designed specifically to encourage this pattern.

The SM-2 algorithm

SuperMemo 2 (SM-2) was developed by Polish researcher Piotr Woźniak in 1987 as part of the SuperMemo software project, one of the first computer programs dedicated to spaced repetition. It remains the most widely deployed spaced-repetition algorithm in the world: variants of it power Anki (the dominant flashcard app among university students), Duolingo's early word-practice engine, and countless other tools. Its longevity reflects a core insight that has held up under decades of real-world use.

Three key concepts

Ease factor (EF): A per-card multiplier that controls how fast its review interval grows. A card with a high ease factor is reviewed infrequently because you find it easy; a low ease factor keeps the card appearing frequently because it keeps tripping you up.
Interval: The number of days until the card is due for review again. The first successful review uses a fixed step; after that, each successful review multiplies the interval by the ease factor, pushing the card further into the future.
Quality grade (q): A 0–5 score describing how well you answered the card. In classic SM-2, the learner self-rates after seeing the answer. Educator assigns this automatically from correctness and response time, as described below.

Educator's exact implementation

Quality grades

Educator never asks you to rate how well you remembered a card: that interrupts the flow and most people rate themselves inaccurately. Instead, it infers quality from two signals: whether the answer was correct, and how quickly it arrived.

q = 0 Skipped: Card was skipped during the session. Treated as unknown.
q = 1 Wrong: An incorrect answer was submitted.
q = 3 Correct but slow: Answered correctly but took more than 8 seconds. The information was retrieved, but with effort, so it is still fragile.
q = 4 Correct at normal pace: Answered correctly in 2.5–8 seconds. Solid recall; the most common grade.
q = 5 Correct and fast: Answered correctly in under 2.5 seconds. Effortless recall: the card is well-embedded in long-term memory.

Note: there is no q = 2 grade. SM-2 defines it as “incorrect but nearly remembered”, a distinction that cannot be inferred from timing alone, so Educator omits it.
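The grade table above can be sketched as a single function. This is a minimal illustration, not Educator's actual source: the function name, the outcome strings, and the constant names are assumptions, while the 2.5-second and 8-second thresholds come from the table.

```python
# Thresholds from the grade table; the names are hypothetical.
FAST_SECONDS = 2.5
SLOW_SECONDS = 8.0


def quality_grade(outcome: str, seconds: float = 0.0) -> int:
    """Map a card outcome to a quality grade. q = 2 is never produced."""
    if outcome == "skipped":
        return 0
    if outcome == "wrong":
        return 1
    if outcome != "correct":
        raise ValueError(f"unknown outcome: {outcome!r}")
    if seconds < FAST_SECONDS:
        return 5  # correct and fast: under 2.5 s
    if seconds <= SLOW_SECONDS:
        return 4  # correct at normal pace: 2.5 s to 8 s
    return 3      # correct but slow: more than 8 s


# Correct answers at 1.2 s, 2.5 s, 8.0 s and 9.3 s.
print([quality_grade("correct", s) for s in (1.2, 2.5, 8.0, 9.3)])
# prints [5, 4, 4, 3]
```

Response time is ignored for skipped and wrong cards, matching the table: only correct answers are split by speed.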

Ease factor

Every card starts with an ease factor of 2.50 (stored internally as the integer 250). After each review, the ease factor is updated:

EF delta = 100 × (0.1 − (5 − q) × (0.08 + (5 − q) × 0.02))
New EF = max(130, currentEF + EF delta)

The minimum ease factor is 1.30. Even a card you consistently struggle with will never be scheduled so frequently that it dominates every session. At the other end, the ease factor itself has no hard ceiling: a card you repeatedly answer fast spaces out more and more quickly, until its interval reaches the one-year cap described under Interval rules.

Concretely, in the stored integer units: a correct answer at normal pace (q = 4) adds exactly 0 to the ease factor, so it is neutral. A fast correct answer (q = 5) adds 10 points (0.10). A slow correct answer (q = 3) subtracts 14 points, a wrong answer (q = 1) subtracts 54, and a skip (q = 0) subtracts 80. Wrong answers compound quickly: starting from the default 250, three wrong answers in a row take the ease factor to 196, then 142, then the 130 floor.
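The ease-factor update can be checked with a few lines of code. This sketch assumes the integer storage described above and rewrites the delta formula in whole hundredths to avoid floating-point error; the function and constant names are illustrative, not taken from Educator's codebase.

```python
EF_DEFAULT = 250  # 2.50, stored as an integer
EF_FLOOR = 130    # 1.30


def ef_delta(q: int) -> int:
    """100 * (0.1 - (5 - q) * (0.08 + (5 - q) * 0.02)), in integer hundredths."""
    miss = 5 - q
    return 10 - miss * (8 + miss * 2)


def next_ef(current_ef: int, q: int) -> int:
    """Apply the delta and clamp at the floor."""
    return max(EF_FLOOR, current_ef + ef_delta(q))


# Delta for each grade Educator can assign.
print({q: ef_delta(q) for q in (0, 1, 3, 4, 5)})
# prints {0: -80, 1: -54, 3: -14, 4: 0, 5: 10}

# Repeated wrong answers, starting from the default.
ef = EF_DEFAULT
trace = [ef]
for _ in range(4):
    ef = next_ef(ef, 1)
    trace.append(ef)
print(trace)
# prints [250, 196, 142, 130, 130]
```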

Interval rules

Wrong or skipped (q < 3): interval resets to 1 day. The card comes back tomorrow regardless of how long it had been spaced out. Start over.
First successful review (card's current interval is 1 day or less): next interval steps to 6 days. This fixed step mirrors the original SM-2 bootstrap sequence.
All subsequent successful reviews: new interval = min(365, round(currentInterval × newEF / 100)). The interval grows exponentially, capped at one year.
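The three interval rules translate directly into a small function. This is a sketch under stated assumptions: the names are hypothetical, and the rounding mode of round() is not specified above, so round-half-up in integer arithmetic is assumed here.

```python
MAX_INTERVAL_DAYS = 365


def next_interval(current_interval: int, new_ef: int, q: int) -> int:
    """Return the next interval in days. new_ef is the integer EF (250 = 2.50)."""
    if q < 3:
        return 1  # wrong or skipped: back to tomorrow
    if current_interval <= 1:
        return 6  # first successful review: fixed bootstrap step
    # Assumed round-half-up of current_interval * new_ef / 100.
    grown = (current_interval * new_ef + 50) // 100
    return min(MAX_INTERVAL_DAYS, grown)


print(next_interval(1, 250, 4))    # prints 6
print(next_interval(6, 250, 4))    # prints 15
print(next_interval(238, 250, 4))  # prints 365
print(next_interval(95, 196, 1))   # prints 1
```

Note that the multiplication uses the ease factor after it has been updated for the current review, as the formula's newEF indicates.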

Example progression

A card you always answer at normal pace (q = 4, EF stays at 2.50):

Review	Interval	Next due
1st correct	1 day → 6 days	6 days from now
2nd correct	6 × 2.50 = 15 days	15 days from now
3rd correct	15 × 2.50 → 38 days	~5 weeks from now
4th correct	38 × 2.50 → 95 days	~3 months from now
5th correct	95 × 2.50 → 238 days	~8 months from now
6th correct	238 × 2.50 → 595 days, capped at 365	up to 1 year from now

A card you consistently get wrong never reaches step 2 of that table: every wrong answer resets it to a 1-day interval and cuts its ease factor by 0.54 until the 1.30 floor is reached, so it resurfaces frequently until you nail it reliably.
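The table, and the effect of a lapse, can be reproduced by combining the ease-factor and interval rules into one review step. The sketch below is illustrative only: the function names are hypothetical, a new card is assumed to start at a 1-day interval, and round-half-up is assumed for the rounding.

```python
from typing import List, Tuple


def review(interval: int, ef: int, q: int) -> Tuple[int, int]:
    """Apply one review and return (new_interval_days, new_ef)."""
    miss = 5 - q
    new_ef = max(130, ef + 10 - miss * (8 + miss * 2))
    if q < 3:
        return 1, new_ef
    if interval <= 1:
        return 6, new_ef
    return min(365, (interval * new_ef + 50) // 100), new_ef


def simulate(grades: List[int], interval: int = 1, ef: int = 250) -> List[Tuple[int, int]]:
    """Run a sequence of grades and record (interval, ef) after each review."""
    history = []
    for q in grades:
        interval, ef = review(interval, ef, q)
        history.append((interval, ef))
    return history


# Six answers at normal pace: the intervals in the table.
print([days for days, _ in simulate([4] * 6)])
# prints [6, 15, 38, 95, 238, 365]

# Two correct answers, one wrong answer, then a correct one.
print(simulate([4, 4, 1, 4]))
# prints [(6, 250), (15, 250), (1, 196), (6, 196)]
```

In the second run the card recovers its 6-day step after the lapse, but the lower ease factor means every later interval grows more slowly than before.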

Mastery levels

Educator summarises a card's review history into one of five mastery levels. These are shown on your profile page and in the Class accuracy by topic section on the class page.

0 Unseen: No review exists for this card yet.
1 Seen: The card has been reviewed at least once.
2 Familiar: Last answer was correct and at least 2 correct answers total. You know it, but only just.
3 Proficient: Last answer correct, at least 3 correct answers total, and current interval ≥ 7 days. The card has survived at least one full week of spacing.
4 Mastered: Last answer correct, at least 4 correct answers total, and current interval ≥ 14 days. The card has been correctly recalled across roughly half a school term of consistent spacing: a strong signal that it is in long-term memory.

Why Mastered doesn't mean finished. A Mastered card is still scheduled for future review. The interval grows longer each time, but the card never disappears from the queue entirely. In practice, a Mastered card might reappear only once every few months, a light-touch check that the memory is holding up.

Getting a card wrong drops mastery to Seen (level 1) regardless of prior history. A card that was Proficient and is answered incorrectly immediately reverts to Seen and must re-earn Familiar, Proficient, and Mastered through subsequent correct answers at growing intervals. This is intentional: it prevents mastery levels from becoming a misleading indicator when retention has slipped.
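The mastery rules above can be expressed as a short classifier. This is a sketch, not Educator's implementation: the CardStats fields are hypothetical names for the quantities the rules mention (review count, total correct answers, whether the last answer was correct, and the current interval).

```python
from dataclasses import dataclass


@dataclass
class CardStats:
    reviews: int         # total reviews of any outcome
    correct_total: int   # lifetime count of correct answers
    last_correct: bool   # was the most recent answer correct?
    interval_days: int   # current scheduled interval


def mastery_level(s: CardStats) -> int:
    """Return 0 (Unseen) to 4 (Mastered), checking the strictest level first."""
    if s.reviews == 0:
        return 0
    if not s.last_correct:
        return 1  # any miss drops the card to Seen
    if s.correct_total >= 4 and s.interval_days >= 14:
        return 4
    if s.correct_total >= 3 and s.interval_days >= 7:
        return 3
    if s.correct_total >= 2:
        return 2
    return 1


# Proficient card, then a wrong answer, then one correct answer.
print(mastery_level(CardStats(3, 3, True, 15)))   # prints 3
print(mastery_level(CardStats(4, 3, False, 1)))   # prints 1
print(mastery_level(CardStats(5, 4, True, 6)))    # prints 2
```

The last line shows why a lapse is costly: the card has four correct answers in total, but its interval has reset to 6 days, so it is only Familiar until the spacing is rebuilt.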

How the card picker uses this

At the start of each session, Educator builds the card queue from five priority buckets, worked through in order. When a bucket runs out of candidates, the queue continues filling from the next one:

1 Skipped + due: cards you skipped in a previous session whose review interval has also elapsed. Highest priority because skipping signals difficulty, and the interval has now lapsed too.
2 Wrong + due: cards answered incorrectly whose dueAt timestamp is in the past. Your weakest overdue material, tackled while attention is sharpest.
3 Unseen: cards never reviewed before. Unseen material is prioritised ahead of cards you have already answered correctly, so new content reaches you promptly rather than sitting behind a backlog of familiar cards.
4 Due and correct: cards whose interval has elapsed and that you previously answered correctly.
5 Not yet due: cards whose interval has not elapsed. Drawn only when all other buckets are empty; reviewing them early offers no retention benefit.

The key design decision is bucket 3 (unseen) ranking above bucket 4 (due-and-correct). This prevents the pattern where a student who has answered 15 cards correctly keeps seeing the same 15 every session while 200 unseen cards wait indefinitely behind them.
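One way to picture the bucket ordering is as a sort key. The sketch below is an assumption-laden illustration: the Card fields, the function names, and the choice to keep input order within a bucket are all hypothetical, since the ordering inside a bucket is not specified here.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional


@dataclass
class Card:
    card_id: str
    last_grade: Optional[int]     # None means never reviewed
    due_at: Optional[datetime]    # None means never scheduled


def bucket(card: Card, now: datetime) -> int:
    """Return the priority bucket, 1 (highest) to 5 (lowest)."""
    if card.last_grade is None:
        return 3  # unseen
    if card.due_at is None or card.due_at > now:
        return 5  # not yet due, whatever the last grade was
    if card.last_grade == 0:
        return 1  # skipped + due
    if card.last_grade < 3:
        return 2  # wrong + due
    return 4      # due and correct


def build_queue(cards: List[Card], now: datetime, size: int) -> List[Card]:
    """Fill the queue bucket by bucket; sorted() is stable within a bucket."""
    return sorted(cards, key=lambda c: bucket(c, now))[:size]


now = datetime(2024, 1, 10)
cards = [
    Card("a", 4, datetime(2024, 1, 9)),
    Card("b", None, None),
    Card("c", 1, datetime(2024, 1, 9)),
    Card("d", 0, datetime(2024, 1, 8)),
    Card("e", 5, datetime(2024, 2, 1)),
    Card("f", 1, datetime(2024, 1, 11)),
]
print("".join(c.card_id for c in build_queue(cards, now, 6)))
# prints dcbaef
```

Card "b" (unseen) lands ahead of card "a" (due and correct), which is the design decision described above.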

Speed Round and Marathon use the same adaptive picker. In a Speed Round (15 cards, 8-second timer) the clock changes what happens at the extremes: if you run out of time the card is marked wrong (q = 1) and its interval resets to 1 day. A correct answer submitted inside the 8-second limit is graded on timing exactly as in a normal session: q = 5 under 2.5 seconds, otherwise q = 4. An incorrect answer is q = 1 as usual. There is no Speed-Round-specific q = 3: the timer expires at the same 8-second mark that separates q = 4 from q = 3, so a correct answer either beats the clock (and grades as 4 or 5) or the card runs out of time (and grades as wrong).
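The Speed Round grading can be sketched as follows. The function name and the use of None to represent a timeout are assumptions made for illustration; skipping is left out because its handling in a Speed Round is not described here.

```python
from typing import Optional

SPEED_ROUND_LIMIT = 8.0  # seconds, from the Speed Round description
FAST_SECONDS = 2.5


def speed_round_grade(correct: bool, seconds: Optional[float]) -> int:
    """Grade one Speed Round card. seconds=None means the timer ran out."""
    if seconds is None or seconds > SPEED_ROUND_LIMIT:
        return 1  # timeout is treated as a wrong answer
    if not correct:
        return 1
    return 5 if seconds < FAST_SECONDS else 4


print(speed_round_grade(True, 1.9))    # prints 5
print(speed_round_grade(True, 7.5))    # prints 4
print(speed_round_grade(True, None))   # prints 1
```

No input produces q = 3, because the "slower than 8 seconds" case is exactly the case the timer turns into a timeout.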

Why not use stock SM-2 exactly?

Classic SM-2 requires the learner to self-grade after seeing each answer, typically a button row from “Complete blackout” to “Perfect response”. This works well for deliberate study with physical flashcards, but creates friction in a fast, gamified app used by 13-year-olds. The pause interrupts flow, and most learners tend to overrate themselves (especially after a correct answer that took 15 seconds of searching).

Educator removes this friction with three changes:

Timing replaces self-grading. Response time is a reliable proxy for retrieval ease: fast = effortless, slow = effortful, no answer = not retrieved. The 2.5s and 8s thresholds were calibrated against the distribution of response times in early Educator sessions.
Skipped cards get q = 0. A skipped card is treated as unknown: its interval resets to 1 day and, once that day has elapsed, it returns in the highest-priority bucket. Students who use “skip” as an escape hatch quickly learn it keeps the card coming back.
No q = 2 grade. In SM-2, q = 2 separates “wrong but nearly there” from “completely wrong”, a judgment that requires conscious self-assessment. Since Educator cannot infer this from timing, the grade is omitted rather than guessed.

The underlying math (ease factor, interval multiplication, the 1-day reset, the 6-day bootstrap step) otherwise follows Woźniak's 1987 specification, apart from the 365-day interval cap.