Prompting Patterns That Help Students Learn vs. Ones That Substitute
A second-year biochemistry student I know keeps two browser tabs open while she studies. The first is a PDF of her lecture slides. The second is ChatGPT. For months, her default prompt was some version of “explain the electron transport chain to me like I’m five.” She’d read the answer, feel satisfied, close the tab, and fail the quiz the next morning. When she changed one thing, the shape of her prompts rather than the model behind them, her quiz scores climbed about a letter grade. She didn’t switch to Claude. She didn’t upgrade her subscription. She stopped asking the model to do her thinking.
The gap between prompts that help students learn and prompts that quietly replace learning has almost nothing to do with the tool. It has to do with who is doing the cognitive work in the exchange. When a student types “explain X to me,” the model does the retrieval, the model does the structuring, the model does the synthesis. The student reads. Reading feels like studying. It isn’t, or at least not much of it is, which is the premise behind decades of work on the testing effect and active recall. The worst default prompt for a student is the one that turns the model into an audiobook.
Consider three prompt patterns that invert this. The first is what I’ve started calling explain-first-ask-second. Instead of asking the model to explain, the student writes out their own understanding of a concept, pastes it in, and asks the model to identify what’s missing, wrong, or unclear. This forces the student to retrieve before they receive. The model becomes an auditor. The cognitive labor stays with the learner, and the feedback is specific to their actual gaps, not the average gaps of the textbook’s implied reader.
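If you wanted to make the pattern a habit, the prompt can be built mechanically. A minimal Python sketch; the `audit_prompt` helper and its exact wording are my own illustration, not a canonical template:

```python
def audit_prompt(topic: str, student_explanation: str) -> str:
    """Build an explain-first-ask-second prompt: the student's own
    retrieval attempt goes in first, and the model is asked to audit
    it rather than re-teach the topic."""
    return (
        f"Here is my current understanding of {topic}:\n\n"
        f"{student_explanation}\n\n"
        "Do not re-explain the topic from scratch. Instead, list "
        "(1) anything I stated that is wrong, (2) anything important "
        "I left out, and (3) anything I stated vaguely enough to "
        "suggest I don't really understand it."
    )

# The student writes first, then pastes the result into the model:
prompt = audit_prompt(
    "the electron transport chain",
    "Electrons from NADH pass through protein complexes, which pump "
    "protons across the inner mitochondrial membrane.",
)
```

The point of templating it is that the hard part, writing the explanation from memory, stays un-automatable.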
The second pattern is the Socratic chain. The student asks the model to question them on a topic rather than teach it. A good version looks like: “Ask me five increasingly hard questions about mitochondrial ATP synthesis. Wait for each answer. Don’t tell me if I’m right until I ask.” Models trained after 2024 are reasonably good at this if you pin them down. They drift toward lecturing otherwise, which is their native mode and the path of least resistance. The prompt has to hold them in the Socratic posture. Students who use this pattern report that the first few questions feel easy and the last two feel like an oral exam, which is exactly the productive frustration that learning-sciences researchers call a desirable difficulty.
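Pinning the model in that posture is mostly a matter of the system prompt and a turn structure that sends one answer at a time. A sketch, assuming a generic chat API with role-tagged messages; the `complete` call in the comments is a stand-in for whatever client you actually use, and the wording is illustrative:

```python
# System prompt that holds the model in the Socratic posture:
# one question per turn, no verdicts unless asked, no lecturing.
SOCRATIC_SYSTEM = (
    "You are quizzing me on {topic}. Ask exactly one question at a "
    "time, each harder than the last, five in total. After I answer, "
    "ask the next question. Do not say whether I was right or wrong "
    "unless I explicitly ask. Never lecture or explain unprompted."
)

def socratic_messages(topic: str) -> list[dict]:
    """Seed a conversation pinned to question-asking mode."""
    return [
        {"role": "system", "content": SOCRATIC_SYSTEM.format(topic=topic)},
        {"role": "user", "content": "I'm ready. Ask the first question."},
    ]

messages = socratic_messages("mitochondrial ATP synthesis")

# Turn loop, sketched around a hypothetical API call:
# while quizzing:
#     reply = complete(messages)          # model asks one question
#     messages.append({"role": "assistant", "content": reply})
#     answer = input("> ")                # student retrieves from memory
#     messages.append({"role": "user", "content": answer})
```

The loop matters as much as the prompt: if the student pastes all five answers at once, the model drifts back into grading-and-lecturing mode.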
The third is worked-example critique. The student pastes a solved problem (their own, or one from the textbook) and asks the model to evaluate the reasoning step by step. Not “is this right,” which invites a yes-or-no rubber stamp, but “walk through each line and say what assumption it depends on and where a student typically goes wrong at this step.” This turns a passive example into an active diagnosis. It works particularly well in physics, statistics, and organic chemistry, where the expert eye sees structure that the novice misses.
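The step-by-step framing can also be templated, so the student doesn't fall back into “is this right.” A sketch under the same caveat, that the helper and wording are illustrative:

```python
def critique_prompt(solution_steps: list[str]) -> str:
    """Build a worked-example critique prompt: number each step and
    ask for its hidden assumption and the typical novice error,
    rather than a yes/no verdict on the whole solution."""
    numbered = "\n".join(
        f"{i}. {step}" for i, step in enumerate(solution_steps, start=1)
    )
    return (
        "Here is a worked solution:\n\n"
        f"{numbered}\n\n"
        "For each numbered step, state the assumption it depends on "
        "and the mistake a student typically makes at that step. "
        "Do not just say whether the final answer is correct."
    )

prompt = critique_prompt([
    "Set kinetic energy at the bottom equal to potential energy at the top",
    "Solve for v = sqrt(2gh)",
])
```

Numbering the steps is the whole trick: it gives the model an address for each piece of reasoning, so its critique can't collapse into a verdict.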
Compare these to the outsourcing patterns. “Summarize this chapter.” “Write a study guide for this topic.” “Give me ten practice questions and the answers.” The last one sounds like retrieval practice but isn’t, because reading a question and then reading the answer below it is just reading. The model has done both halves of the exchange. The student’s brain has done the work of a spectator at a tennis match.
There’s a subtler failure mode that shows up with high-achieving students, which is what you might call the infinite-clarification loop. A student asks a good question, gets a reasonable answer, and then asks a follow-up, and another, and another. Each response feels productive. Two hours later, they’ve read about twenty pages of generated prose and retained almost none of it because they never had to reconstruct any of it from memory. Roediger and Karpicke’s 2006 work on testing versus restudying found, famously, that students who restudied material rated their own learning higher than students who tested themselves, and learned less. Long conversations with LLMs trigger the same illusion of fluency. The text on the screen feels like knowledge. It isn’t yours yet.
What separates a substitutive prompt from a generative one, in practice, is whether the student could close the browser and still answer the question. A good study session with a model should end with the student, not the transcript, holding the concept. Which is the whole argument for treating the model as a study partner rather than an outsourcing service: a study partner expects you to do your share of the thinking, and the best ones will refuse to just hand you the answer.
A few concrete prompts worth stealing. For difficult readings: “I’ve just read this paper. Before I look at it again, I’m going to write out what I think the main argument is and the two pieces of evidence I remember. Then I want you to tell me what I’ve gotten wrong or left out, citing the page.” For problem sets: “I got this problem wrong. Here’s what I did. Don’t tell me the right answer. Ask me one question that, if I can answer it, will show me where I went off.” For exam prep: “Quiz me on this unit. Mix question types. Give me no feedback until the end, then tell me which conceptual categories I got wrong and which I got right.”
None of these are clever. They’re boring, in the way that effective study is usually boring. They also work against the grain of what the model wants to do, which is to be helpful in the thickest, most immediate sense: to answer, to explain, to resolve. A student who wants to learn has to keep pulling the model back from that instinct.
One more thing worth saying. The best single prompt I’ve seen a student use, from a graduate student in linguistics, was a permission: “For the next hour, refuse to give me any direct answer. If I ask for one, respond with a question that would help me find it myself. Only break this rule if I say the word ‘stop’.” She told me the first twenty minutes were painful. The last forty felt like the kind of tutoring she couldn’t afford. The model was the same one everyone uses. The difference was who was holding the pencil.