Desirable Difficulties: Why Struggle Is the Study

In 1979, two motor-learning researchers at the University of Colorado, John Shea and Robyn Morgan, ran what is now a canonical experiment. They taught subjects three short movement patterns, each involving knocking down a specific set of barriers in a specific order. Half the subjects practiced in blocks: all repetitions of pattern A, then all of pattern B, then all of pattern C. The other half practiced the three patterns in a random, interleaved order. During practice, the blocked group looked like it was winning: its performance improved faster and its movements felt smoother. Then the researchers tested retention ten minutes later and again ten days later. The blocked group had regressed. The interleaved group had kept what it learned. On the delayed test, the gap was large enough to be unmistakable.

Robert Bjork, a cognitive psychologist at UCLA, had been collecting results like this from memory research for a few years. By the 1990s, he and his wife Elizabeth, also a cognitive scientist, had assembled them into a framework they called desirable difficulties. The core idea reads like a contradiction until you sit with it. Conditions of practice that slow down acquisition and make the practice feel less successful often produce better long-term retention and transfer than conditions that make acquisition feel smooth and fast. The student who practices the hard way learns more slowly but ends up further ahead.

The Bjorks named four categories of desirable difficulty, and thirty years of research has mostly confirmed all four. Each one is worth looking at in turn, because the practical implications contradict what most students and most teachers intuitively do.

The first is spacing. Cramming six hours of study into the night before an exam feels efficient. It is not. The same six hours distributed across six study sessions over two weeks produce substantially better retention in nearly every experimental comparison psychologists have run. Hermann Ebbinghaus found this in 1885; Harry Bahrick extended it to retention intervals of years in studies beginning in the 1980s; the effect has survived every serious attempt to undermine it. The reason cramming feels more efficient is that it produces a surge of familiarity that the brain reads as mastery. That familiarity evaporates on a timescale of days. The spacing effect is why every serious spaced repetition system is built around distributing exposures rather than concentrating them.
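
For readers who plan their study time in a script, the arithmetic of spacing is simple enough to sketch. The Python below is only an illustration and is not drawn from the research: the function name spaced_schedule, the roughly even gaps between sessions, and the choice to end the day before the exam are all assumptions made for the example.

```python
from datetime import date, timedelta


def spaced_schedule(exam_day, total_hours=6.0, sessions=6, window_days=14):
    """Spread total_hours over `sessions` study days in the window before exam_day.

    Hypothetical helper: even spacing is an assumption, not a research finding.
    Returns a list of (date, hours) pairs in chronological order.
    """
    if sessions < 1 or window_days < sessions:
        raise ValueError("need at least one session and one day per session")

    start = exam_day - timedelta(days=window_days)
    last_offset = window_days - 1  # final session falls the day before the exam

    if sessions == 1:
        offsets = [last_offset]
    else:
        # Roughly equal gaps from the first day of the window to the last.
        offsets = [round(i * last_offset / (sessions - 1)) for i in range(sessions)]

    hours_each = total_hours / sessions
    return [(start + timedelta(days=off), hours_each) for off in offsets]


if __name__ == "__main__":
    for day, hours in spaced_schedule(date(2025, 5, 15)):
        print(day.isoformat(), f"{hours:.1f} h")
```

With the defaults, the six hours become six one-hour sessions that fall every two or three days across the two weeks before the exam, which is the distributed pattern described above rather than a single night of cramming.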

The second is interleaving, which is the effect Shea and Morgan isolated. When you practice a single skill in a block — fifty problems of the same type from a textbook, or a hundred free throws in a row — your brain gets very good at the specific mental motion for that session. It stops having to select which method applies, because every problem uses the same method. When you interleave multiple types of problems, each one forces a small decision: is this a related-rates calculus problem or a chain-rule problem? The decision costs effort, and the effort feels like poor performance, but the decision is exactly what you will need to make on a real exam or in real work. Doug Rohrer and Kelli Taylor at the University of South Florida have done the most careful empirical work on interleaving in mathematics education, and their results show the same pattern Shea and Morgan first saw in motor learning: students who interleave practice get worse scores in the moment and better scores weeks later.
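
The difference between the two practice orders is easy to see when written out. The following Python is a minimal sketch, not a reconstruction of Shea and Morgan's procedure: the function names, the 18 repetitions per pattern, and the use of a plain random shuffle are assumptions chosen for illustration.

```python
import random


def blocked_order(patterns, reps):
    """All repetitions of the first pattern, then all of the second, and so on."""
    return [p for p in patterns for _ in range(reps)]


def interleaved_order(patterns, reps, seed=None):
    """The same trials as blocked_order, shuffled into a random order."""
    order = blocked_order(patterns, reps)
    random.Random(seed).shuffle(order)  # seed only makes the example repeatable
    return order


def count_switches(order):
    """Count how often the learner must change pattern between adjacent trials."""
    return sum(1 for a, b in zip(order, order[1:]) if a != b)


if __name__ == "__main__":
    patterns = ["A", "B", "C"]
    blocked = blocked_order(patterns, reps=18)
    mixed = interleaved_order(patterns, reps=18, seed=0)
    print("blocked switches:    ", count_switches(blocked))
    print("interleaved switches:", count_switches(mixed))
```

Both schedules contain exactly the same trials. The blocked order asks the learner to switch patterns only twice in the whole session, while the shuffled order forces a switch on most trials, and each switch is one of the small decisions about which method applies.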

The third is variation. Practicing a skill under exactly the conditions it was taught (same room, same lighting, same textbook font, same order of problems) produces narrower learning than practicing the same skill under varied conditions. A student who studies anatomy only in her dorm room tends to perform worse on an exam held in a different building than a student who studies in several locations. The effect sounds silly but has been replicated many times. It appears to be a contextual-binding phenomenon: when learning is tightly bound to a single set of environmental cues, it becomes harder to access when those cues are absent. Varied practice binds the learning to multiple contexts, which means it can be retrieved more reliably when a new context shows up.

The fourth is testing, and this is where desirable difficulties shade into the testing effect directly. Testing yourself feels like failure in a way rereading does not, because the failures are visible. The student who rereads a chapter and nods is not confronted with her gaps. The student who closes the book and tries to summarize is. The discomfort of that confrontation is the learning moment, not an obstacle to it. Kornell and Bjork, running a series of experiments in the mid-2000s, showed repeatedly that students would rate their own learning as higher under conditions that were producing less actual retention, because the smoother experience was mistaken for mastery.

What unites all four is that they trade short-term performance for long-term learning, at a cost the learner almost always misreads. Bjork distinguishes between performance, meaning how well you can do the thing right now, and learning, meaning how durably the capacity has been laid down. Desirable difficulties worsen performance in the moment in exchange for better learning over time. Most students, left to their own devices, will optimize for performance, because performance is visible and learning is not. A student who reads fluently feels she has studied. A student who struggles through a practice problem and gets it wrong feels she has not. The intuitions are backwards.

The pedagogical implication is uncomfortable. Teaching that optimizes for student satisfaction in the moment, and for high in-class quiz scores, is probably sacrificing long-term retention. The course that feels hard, where students cannot tell how well they are doing in the middle, is often the course from which they leave actually knowing the material three months later. Teachers have known this intuitively for decades. The research now gives them permission to resist the student-evaluation incentives that push teaching toward fluency.

A caveat the Bjorks have always stressed: not all difficulties are desirable. An illegible textbook is difficult. So is a poorly organized lecture. So is a problem set with errors in the answer key. These are what they call undesirable difficulties, and they interfere with learning without adding any of the benefits. The distinction is between difficulty that forces productive cognitive work and difficulty that wastes working memory on decoding the material rather than engaging with it. Extraneous cognitive load is undesirable difficulty in a different vocabulary.

In practice, how does a student apply this? Space study sessions across days rather than packing them into a single night. Mix problem types within a session rather than doing a block of one type. Study in more than one location. Replace some of your rereading time with self-testing, even when the testing reveals that you know less than you thought. Use the feeling of confusion as a signal that learning might be happening, not that it has failed.

There is one last move that the literature supports, which is to forgive yourself for the feeling of floundering. The floundering is not wasted time. It is often the exact interval during which the brain is doing the structural work that produces durable knowledge. The smooth session that felt productive is frequently the one you will remember nothing from. The rough session where nothing seemed to stick is often the one that did.

A version of this truth shows up across the learning-science literature: in the case for drawing over rereading, in the retrieval-practice work, in the research on note-taking systems that force rewriting, in the design principles of good flashcard decks. Each of these findings, taken alone, looks narrow. Taken together, they are one finding: effort that the learner has to generate herself, against some resistance, is what produces durable knowledge. Effort that the material provides for her, however beautifully presented, does not.

This is why the old-fashioned advice — practice hard, practice variably, test yourself often, space it out, and do not mistake ease for progress — has survived decades of experimental scrutiny. It turns out the folk wisdom was closer to right than the later generations of study-smart-not-hard self-help. The studying is the hard part, and the hard part is what works.
