What Cognitive Load Theory Gets Right About Studying

March 13, 2026 • 6 min read

In the mid-1980s, an Australian educational psychologist named John Sweller kept running into a strange result. He was studying how students learned algebra, and he noticed that the harder he made them work on conventional problems, the less they seemed to learn. Students who solved stacks of equations appeared to be mastering the material, yet on transfer tests they flailed. Students who studied worked examples, with all the steps laid out, did better on new problems they had never seen.

Something about conventional practice was burning through the students’ mental bandwidth without depositing anything into long-term memory. Sweller, working at the University of New South Wales, began building a theory around that observation. He called it cognitive load theory. Almost forty years later, it remains one of the few frameworks in education that actually predicts, with some reliability, whether a given lesson or textbook will work.

The core claim is almost embarrassingly simple. Human working memory is tiny. It can hold only a handful of novel items at once, and it holds them for a matter of seconds before they decay or get overwritten. Long-term memory is effectively unlimited, but everything that lands there has to pass through that narrow aperture first. Learning is the process of building schemas in long-term memory that let you chunk information and treat complicated patterns as single units. A chess master does not see thirty-two pieces. She sees a handful of tactical configurations she has encountered before. Her working memory is doing the same amount of work as a novice’s. It is just working on bigger pieces.

This matters because every study task imposes some combination of three kinds of load on that narrow working-memory channel. Sweller and his collaborators distinguished them carefully, and the distinctions are worth knowing.

Intrinsic load is the difficulty that belongs to the material itself. Learning the names of the noble gases is low intrinsic load. Learning to balance a redox equation in acidic solution is high intrinsic load, because you have to hold several interacting rules in mind simultaneously. You cannot make intrinsic load disappear with a cleverer presentation. You can only manage it by breaking the material into smaller pieces and sequencing them, so the student’s schemas are ready before the harder material arrives.

Extraneous load is the difficulty that comes from how the material is presented rather than from the material itself. A textbook that puts a diagram on page 47 and its caption on page 48 is imposing extraneous load. So is a slide with a paragraph of text the lecturer is also reading aloud, forcing students to process the same information through two channels that interfere with each other. Extraneous load is the villain of the story. It eats working-memory capacity without contributing anything to learning.

Germane load is the useful kind. It is the effort that goes into actually building schemas, noticing patterns, connecting a new idea to something you already know. The goal of good instruction, in Sweller’s framing, is to minimize extraneous load so that as much working-memory capacity as possible is free for germane load to do its work.

This framework explains something every student has felt but rarely names. Some textbooks flow. You read a page and the concepts click into place with almost no resistance. Other textbooks, covering the same material, leave you exhausted after three paragraphs, your eyes tracking words while nothing enters your head. The difference is rarely the author’s intelligence. It is usually the author’s instinct for extraneous load. A good textbook puts the example next to the concept. A good textbook introduces one variable at a time. A good textbook knows which schemas the reader already has and which ones need to be built from scratch. A bad textbook assumes too much, or explains the wrong things, or presents a diagram the reader cannot yet parse because she has not been given the vocabulary.

The practical implication for students is less obvious than it sounds. You cannot simply “reduce cognitive load” by studying less, because the intrinsic load of genuinely hard material is not going anywhere. What you can do is stop adding extraneous load to your own study sessions.

Rereading is a good example of a low-germane-load activity dressed up to look productive. When you reread a chapter, your eyes pass over words your brain has already half-processed. The material feels familiar, which feels like understanding. But the work of actually building a schema, of taking the ideas apart and putting them back together in your own structure, is not happening. A student who closes the book and tries to redraw the concept on a blank page is forcing her working memory to do germane work. The research on drawing as a study strategy compared with rereading points in exactly this direction. The drawing is not more effective because it is prettier. It is more effective because it strips away the illusion of surface familiarity and forces the learner to reconstruct, which is the behavior that builds schemas.

Sequencing matters too. Cramming fails because it tries to push too much novel material through working memory in too short a window. The encoding stage, the first time you meet an idea, is when working memory is under the greatest strain. Spacing study across sessions is partly about long-term retention, but it is also about giving working memory room to breathe during the initial encoding. Even the common mistakes in how spaced repetition gets implemented come back to this. Stacking dozens of new cards in a single session turns the spacing into something that looks like cramming again, and the working-memory channel clogs.
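For readers who build their own flashcard decks, the point about stacking new cards can be made concrete. The sketch below is only an illustration, not how any particular flashcard app works: the function name schedule_new_cards, the max_new_per_day parameter, and the default cap of 10 are all assumptions chosen for the example. It spreads first exposures across consecutive days so that no single session introduces more than the cap.

```python
from datetime import date, timedelta


def schedule_new_cards(cards, start, max_new_per_day=10):
    """Assign each new card a first-study date, at most max_new_per_day per day.

    Hypothetical helper for illustration; the cap of 10 is an assumption,
    not a research-backed number.
    """
    if max_new_per_day < 1:
        raise ValueError("max_new_per_day must be at least 1")
    plan = {}
    for index, card in enumerate(cards):
        # Integer division pushes every full batch of cards to the next day.
        day = start + timedelta(days=index // max_new_per_day)
        plan.setdefault(day, []).append(card)
    return plan


if __name__ == "__main__":
    deck = [f"card {n}" for n in range(1, 26)]  # 25 new cards
    for day, batch in schedule_new_cards(deck, date(2026, 3, 13)).items():
        print(day.isoformat(), len(batch), "new cards")
```

With 25 cards and a cap of 10, the first exposures land on three consecutive days in batches of 10, 10, and 5. Reviews of cards already seen would be scheduled separately; this sketch only limits how much novel material arrives at once.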

Self-explanation works for the same reason. When a student reads a worked example and stops at each line to articulate why that step follows from the last, she is doing germane work. The effort to generate the explanation forces the working-memory channel to engage with the logical structure of the material, rather than passively absorbing it. The worked example itself has already taken care of the extraneous load. The student is free to spend her capacity on schema construction.

None of this is a trick. Cognitive load theory does not give you a shortcut. It gives you a reason to be suspicious of study activities that feel productive but don’t seem to stick, and a reason to trust activities that feel effortful but leave something behind. A practice problem where you get stuck for six minutes and then see the solution is, in cognitive-load terms, almost ideal, and it is a close cousin of what researchers call the testing effect. You have held the problem in working memory long enough to build partial schemas, and the solution arrives in time to cement them. A practice problem where you glance at the answer after ten seconds is, in the same terms, almost worthless. Your working memory never did the work.

The theory is now old enough to have its critics, and the field has spent the last decade arguing about whether germane load is really a separate category or just intrinsic load that happens to be productive. The arguments are worth having. But the central picture has held up in study after study, across countries and subjects and age groups. Working memory is the bottleneck. The question is not whether you can expand it. You cannot. The question is what you are spending it on.
