AI Tools

Using AI as a Study Partner Without Outsourcing Your Thinking

A student seated at a cluttered desk, laptop open, handwritten notes covered in crossed-out phrases, with a coffee cup casting a long shadow.

A chemistry student, two weeks before her organic final, opens ChatGPT and types: “Explain SN1 and SN2 reactions to me.” The model produces a clean, well-structured explanation. She reads it, nods along, copies the key points into her notes, and closes the tab. Forty minutes later she cannot reconstruct the difference between the two mechanisms without going back to the notes. A week later, sitting in the exam, she stares at a substrate and cannot predict which pathway it will favor. She studied. She used a powerful tool. She remembers almost nothing.

This is the failure mode that most worries teachers watching their students use AI. It is not cheating in the traditional sense. The student was not trying to evade work. She was trying to study, and she used the tool in the most obvious way available, and the tool performed beautifully. The problem is that its performance replaced hers. The cognitive labor that was supposed to build her understanding happened inside the model instead of inside her head.

The research on this failure is still young, but the pattern it describes is old. In 2011, Betsy Sparrow and colleagues at Columbia published a paper in Science showing that people who expected to be able to look up information later remembered that information less well. They called it the Google effect, and the finding has been debated, retested, and refined many times since. When an external system reliably holds knowledge for us, our own memories quietly decline to hold it. ChatGPT is Google with better prose. The cognitive offloading is the same, only smoother.

But there is a second way to use the tool, and it produces almost the opposite result. Instead of asking the model to explain something, the student explains it to the model. She types out her current understanding of SN1 and SN2, stumbling through the parts she is unsure about, making guesses. Then she asks the model whether her explanation is correct, and where the gaps are. The model responds, and her understanding gets revised. This is not a minor variation. It is a different activity entirely.

In learning-science terms, the first version is rereading, dressed up in machine intelligence. The student passively receives a polished account of the material. She feels productive because she has collected information. But she has done no retrieval, which is to say she has not pulled the concept out of her own memory and held it up for inspection. The second version is retrieval practice, and retrieval practice is one of the few study behaviors with overwhelming experimental support. Henry Roediger and Jeffrey Karpicke at Washington University spent most of the 2000s showing that students who take practice tests learn more than students who restudy, often dramatically more, and the effect shows up at every level from word lists to medical school curricula. The common failure of spaced repetition systems in actual student practice usually comes back to this. Students replace retrieval with re-exposure, and the tool stops working.

An AI partner, used well, turns every study session into a low-stakes practice test with an infinitely patient examiner. Used badly, it turns every study session into a private lecture the student will forget on the way out of the library.

The difference in practice can be as small as a change in the first sentence of the prompt. “Explain the central dogma of molecular biology to me” is the passive move. “I think the central dogma goes DNA to RNA to protein, with transcription and then translation, and there are some exceptions like reverse transcriptase. Am I missing anything?” is the retrieval move. The student has already done the work of pulling the idea out of her own memory. The model is now useful, because it can confirm, correct, and extend. It has become a check on her thinking rather than a replacement for it.

The move generalizes. A physics student stuck on a problem about rotational inertia can ask the AI for the solution, in which case she has outsourced the problem. Or she can describe what she has tried, where she got stuck, and what she thinks the next step might be. The AI can then ask her a pointed question, or offer a hint just specific enough to unblock her. This is what good human tutors do, and it is what the model is surprisingly capable of doing if the student sets up the conversation correctly. There is even a name for it in the tutoring literature, going back to Arthur Graesser’s work at the University of Memphis in the 1990s: expectation- and misconception-tailored dialogue. The tutor does not lecture. The tutor listens, identifies the student’s specific confusion, and intervenes at exactly the point where intervention matters.

The same logic applies to summaries. Asking an AI to summarize a chapter is almost always a mistake, because summaries are a particularly efficient form of passive study. They let the student feel she has processed the material without actually processing it. A summary you read is a summary you did not write, and writing it is where the learning lives. The problem is closely related to the testing-versus-restudying comparison above. Passive consumption of pre-organized material looks like studying but rarely builds the schemas that allow you to reason about new cases. The AI summary is cleaner than a textbook passage, and it is more dangerous for the same reason. It flows too well.

A better move is to write your own summary first, even a bad one, and ask the AI to critique it. Where is the logic weak? What did you leave out that matters? What would a specialist in this field disagree with? Now the AI is doing the work only it can do, which is acting as a knowledgeable second reader, while the student keeps doing the work only she can do, which is holding the material in her own head long enough to put it back together.

None of this is especially complicated. It rests on a single distinction: in every exchange with the tool, who is generating the thinking? If the student types a question and reads the answer, the model is thinking and the student is spectating. If the student types an argument, an attempt, a messy partial answer, and asks the model to respond to it, the student is thinking and the model is sharpening. The same tool, the same interface, the same few minutes. A different trajectory of learning.

Students who figure this out early tend to describe their AI use in oddly specific ways. They talk about arguing with the model, explaining things to it, getting caught when they have been lazy. They do not talk about asking it for answers. The conversations get longer and messier, not shorter and cleaner. That is usually a good sign. A clean conversation with an AI tutor is often one in which no learning happened.

This is what makes the worry about AI in education both more and less serious than it sometimes sounds. The tool is not inherently corrosive to thinking. It is simply amplifying whatever habit the student already has. A student who already studies by rereading will use the AI to reread faster. A student who already studies by retrieval will use the AI to retrieve better. The AI does not create the pattern. It just makes the existing pattern louder.

Which leaves teachers and students with a more uncomfortable task than banning the tool or surrendering to it. The task is teaching the move. Not the technology. The move.

Photo via Unsplash.