AI can give students answers quickly. That is useful, but it is not enough.
If AI-supported learning only helps students produce better answers, the work may look stronger while the student’s judgement remains underdeveloped. The more important question is whether students are learning how to evaluate what AI gives them. Can they compare it with what they already know? Can they question its assumptions? Can they verify its accuracy? Can they adapt it for purpose and audience? Can they decide what to accept, reject or revise, and what they will take responsibility for?
To design AI tasks that require student judgement, teachers need to make students compare, question, verify, adapt and decide rather than simply accept or improve AI-generated answers.
This is a central part of Visible Agency. When AI is involved in the learning process, student agency cannot be assumed from the final product alone. It has to become visible through the decisions students make along the way. Judgement is one of the clearest forms of that evidence.
This matters because generative AI can both support and complicate learner agency. Roe and Perkins (2024) found that generative AI may enhance agency through personalisation and support, while also raising concerns around learner autonomy, equitable access and changing notions of agency.
The goal is not for students to get better answers from AI. The goal is for students to become better judges of answers.
That shift matters because AI can produce fluent, confident and useful responses even when the student has not yet done much thinking. If the task asks only for a final product, students may learn to become better users of AI without becoming more discerning learners. If the task requires judgement, students have to stay intellectually active. They must decide what is useful, what is limited, what is accurate, what is relevant and what they are prepared to stand behind.
This is where AI can become educationally powerful. It can give students more material to think with, but the task must require them to think.
In traditional classroom tasks, students often had to generate much of the content themselves. They had to search for information, organise ideas, draft responses, create examples and develop explanations. Those tasks still required judgement, but the act of production often carried much of the visible cognitive effort.
AI changes that balance.
When students can generate a paragraph, summary, argument, image, explanation, code sample or set of ideas in seconds, the cognitive demand shifts. The challenge is no longer only whether students can produce something. The challenge is whether they can evaluate what has been produced.
Is it accurate? Is it relevant? Is it complete? Is it appropriate for the audience? Does it match the criteria? Does it oversimplify the concept? Does it sound convincing without being well supported? Does it represent what the student actually understands?
These are judgement questions. They are not secondary to the learning. In AI-supported learning, they are often the learning.
In the language of the revised Bloom’s taxonomy, this work draws on the upper levels of cognitive process. Students analyse when they compare AI output with their own thinking, task criteria or trusted sources. They evaluate when they judge usefulness, accuracy, relevance and quality. They create when they adapt, revise or produce a stronger final response. The revised taxonomy also identifies metacognitive knowledge as a distinct category within its knowledge dimension, which matters here because judging AI output well depends on students knowing how they reach their judgements (Anderson & Krathwohl, 2001; Krathwohl, 2002).
The deeper opportunity is metacognitive.
Students are not only judging AI output. They are learning to notice how they judge, why they trust, what they question and when they need to verify before deciding. This is consistent with metacognition and self-regulated learning research, which emphasises students learning to plan, monitor and evaluate their own learning rather than simply complete tasks (Education Endowment Foundation, 2025).
Students evaluate when they judge the quality of an AI response. They become metacognitive when they can explain how they made that judgement.
This connects directly to Visible Agency: How to Design AI-Supported Learning Without Outsourcing Student Thinking. The central idea of that flagship article is that AI should not make student thinking disappear. This article focuses on one practical way to keep that thinking visible: design the task so students must judge AI output before they use it.
Judgement also protects the learning from becoming passive. If the student’s role is simply to prompt, receive, copy and submit, the task has not required enough discernment. But if the student must compare, question, verify, adapt and decide, AI becomes part of a thinking process rather than a substitute for it.
The most common risk in AI-supported learning is not that students use AI. The risk is that they accept AI output too quickly.
AI responses often sound confident. They may be well organised, grammatically fluent and plausible enough to feel correct. For students, especially those still developing background knowledge, this creates a problem. Fluency can be mistaken for accuracy. Clarity can be mistaken for depth. Completion can be mistaken for understanding.
Passive acceptance occurs when students treat AI output as the answer rather than as something to be examined. They may paste an AI response into their work, lightly edit the wording, or use AI suggestions without asking whether they are accurate, relevant or appropriate. The task is complete, but the student may not have practised much judgement.
This is not a character flaw in students. It is a design issue.
If the task can be completed by accepting AI’s first useful response, then the task has not made judgement necessary. Students usually respond to the structure of the task. If the task rewards polished completion, they will optimise for polished completion. If the task rewards discernment, they will need to practise discernment.
The better question is not, “How do we stop students from using AI?” It is, “How do we design the task so AI output has to be interrogated before it can be used?”
This is why AI literacy cannot be reduced to prompt writing. Foundational AI literacy includes the capacity to understand, use and critically evaluate AI technologies, including their limits and ethical implications (Long & Magerko, 2020). More recent work on generative AI literacy makes a similar case, arguing that students need the capacity to evaluate generative AI critically and use it effectively, ethically and responsibly (Zhang & Magerko, 2025).
The concern is not only accuracy. It is also cognitive offloading: when AI does too much of the thinking, students may lose opportunities to monitor, test and regulate their own judgement. Viberg et al. (2026) identify cognitive offloading, transparency and human oversight as central dilemmas for protecting and promoting human agency in AI-supported education.
This is where the connection to How to Stop AI From Replacing Student Thinking becomes important. Passive acceptance allows AI to take over too much of the cognitive work. Judgement brings the student back into the centre of the process.
Judgement can sound abstract, but in task design it can become very practical. Students need repeated opportunities to practise five judgement moves: compare, question, verify, adapt and decide.
These moves help students work with AI output without surrendering their thinking to it. They also give teachers visible evidence of the student’s reasoning.
The important point is that each move has two layers. There is the cognitive action students perform, and there is the metacognitive awareness students develop as they notice how they performed it.
Students practise judgement when they compare AI output with something else.
They might compare an AI-generated explanation with their own first attempt. They might compare two AI responses to the same question. They might compare AI output with a rubric, source text, worked example, peer response, teacher model or class success criteria. They might compare a simple answer with a more complex one, or a general explanation with one written for a specific audience.
Comparison slows the process down. It asks students to notice differences rather than accept the first response that sounds reasonable. Cognitively, students are analysing. Metacognitively, they are asking: what am I noticing, and why does that difference matter?
A useful comparison task might ask:
Comparison makes thinking visible because students have to identify the basis for their preference. They are not simply saying, “This one is better.” They are explaining what makes it better.
Students do not practise judgement by receiving better answers. They practise judgement by deciding what makes an answer better.
This is where How to Make Student Thinking Visible When AI Is Part of the Process becomes a natural next step. Judgement should leave traces. Students need to show what they compared and why the comparison changed, confirmed or challenged their thinking.
Students practise judgement when they question AI output rather than treating it as neutral or complete.
Questioning means looking for assumptions, omissions, weaknesses, bias, overgeneralisation, unsupported claims or misleading confidence. It asks students to move from receiving an answer to interrogating an answer.
A student might ask:
Cognitively, students are analysing the answer’s structure, limits and assumptions. Metacognitively, they are asking: what made me suspicious, uncertain or curious?
That is important. We are not simply asking students to find faults. We are helping them become aware of the cues that tell them an answer needs closer examination. That is a form of intellectual discipline.
Questioning is especially important because AI can produce responses that sound authoritative even when they are incomplete or inaccurate. Students need to learn that a fluent response is not the same as a trustworthy response.
Questioning also develops intellectual independence. It teaches students that useful support does not remove the need for careful thought. AI may give them a starting point, but it should not get the final word without challenge.
Students practise judgement when they verify AI-supported work against evidence, criteria or trusted sources.
Verification is more than checking whether something “sounds right”. It asks students to test claims against something outside the AI response itself. That might be a source document, textbook, experiment result, dataset, assessment criteria, teacher explanation, approved website, class notes or expert model.
Verification matters because AI-generated work can contain errors, invented details, weak evidence or oversimplified explanations. It can also give students answers that are broadly plausible but not suitable for the specific task.
A verification task might ask students to:
Cognitively, verification asks students to evaluate accuracy, evidence and reliability. Metacognitively, it asks them to notice what they knew enough to check, and what they needed help or evidence to confirm.
This is where students learn a crucial habit: confidence is not evidence. A response can sound polished and still need verification. An answer can be useful and still be incomplete. A suggestion can be helpful and still require responsibility.
Verification gives students an important message: AI can assist, but it does not remove responsibility. If students use an AI-supported answer, they must know how to check the parts that matter.
Students practise judgement when they adapt AI output for purpose, audience, accuracy, context or meaning.
Adaptation is different from cosmetic editing. It is not just making the response sound better. It is changing the response so it becomes more appropriate, precise or useful for the learning purpose.
Students might adapt an AI-generated explanation so it suits younger learners. They might revise a generic response so it connects to a local example. They might change the tone of an argument for a particular audience. They might add missing evidence, remove unsupported claims, simplify unnecessary language or make the reasoning more explicit.
A strong adaptation task might ask:
Cognitively, adaptation moves students towards creating because they are shaping something new from the material they have evaluated. Metacognitively, they are asking: why am I changing this, and what does that change improve?
Adaptation helps students see that AI output is not finished work. It is material for thinking. The student’s role is to shape that material into something accurate, purposeful and responsible.
Students practise judgement when they make a decision and take responsibility for it.
This is the move that completes the process. After comparing, questioning, verifying and adapting, students still need to decide what they will accept, reject, revise or ignore. They need to explain why.
Decision-making is where judgement becomes ownership.
A student might decide that an AI suggestion is useful but incomplete. They might decide that one explanation is clearer but another is more accurate. They might decide that AI is helpful for generating examples but not for forming the final argument. They might decide not to use AI for a particular part of the task because the learning purpose is to practise independent recall, original interpretation or personal reflection.
Cognitively, students are evaluating and justifying. Metacognitively, they are asking: what am I prepared to stand behind?
The decision matters because it places responsibility back with the learner. They are not submitting AI’s thinking. They are submitting their own decision about what to do with AI-supported material.
This connects directly to How to Design AI-Rich Tasks That Still Require Student Ownership. Ownership is not proved by refusing support. It is shown when students can explain and stand behind the decisions they made while using support.
A task requires judgement when students cannot complete it by copying, lightly editing or submitting AI output. They have to evaluate and justify what they do with it.
That does not mean every AI-supported task needs to become complex. It means the task needs to include a clear moment where students must make thinking visible. The judgement should be built into the learning design, not added as a decorative reflection after the work is already finished.
This is also an assessment-design issue. Zaphir et al. (2024) argue that educators need ways to examine how vulnerable assessment questions are to generative AI and to redesign tasks around the critical thinking students are expected to demonstrate. In the same spirit, Ding and Magerko (2025) argue that educational AI evaluation needs to move beyond technical performance and output quality to include learner agency, context, ethics, explainability and human-centred outcomes.
There are several practical task structures that can help.
In an AI comparison task, students compare AI output with another version, source, model or set of criteria.
For example, students might write their own explanation of a concept before asking AI for a second explanation. They then compare the two versions and identify where their own explanation was stronger, where AI was clearer and what they would change in their final version.
The learning is not in asking AI for an explanation. The learning is in comparing explanations and deciding what makes one stronger than another.
In an AI critique task, students examine an AI response for strengths, weaknesses, omissions or inaccuracies.
For example, students might ask AI to produce a persuasive argument, then critique the quality of the reasoning, evidence and audience awareness. They could identify unsupported claims, weak transitions, vague language or assumptions that need to be challenged.
The value of this task is that students are not positioned as passive receivers. They are positioned as evaluators.
In an AI verification task, students check AI output against evidence.
For example, students might ask AI to summarise a historical event, then verify the summary using approved sources. They could mark which claims are confirmed, which are incomplete and which need correction.
This kind of task is especially useful when students are learning research habits. It makes verification part of the process rather than a teacher warning given after the fact.
In an AI revision task, students use AI feedback or suggestions to improve work, but they must decide which suggestions to accept.
For example, students might draft a paragraph, ask AI for feedback on clarity and structure, then choose two suggestions to apply and one to reject. They must explain each decision.
This teaches students that feedback, whether from AI, peers or teachers, is not something to follow blindly. It is something to judge.
In an AI limitation task, students identify where AI is not sufficient.
For example, students might ask AI to respond to a local community issue, then identify where the response lacks local knowledge, cultural understanding, emotional nuance or direct evidence. They then explain what human judgement or contextual knowledge would be needed.
This kind of task is important because students need to learn that AI can be useful and limited at the same time.
In an AI decision task, students decide whether, where or how AI should be used.
For example, students might be given several parts of a project and asked to decide where AI could be helpful, where it could interfere with the learning goal and where human judgement should lead. They must justify their choices.
This moves AI use from habit to intention.
The Visible Agency Design Test can help teachers review whether these tasks genuinely require judgement or only appear to. If the task does not require students to compare, question, verify, adapt or decide, it may not yet make judgement visible enough.
One of the most important forms of judgement is knowing when not to use AI.
This is easy to overlook. Many classroom conversations focus on how students can use AI effectively, ethically or efficiently. Those conversations matter, but they are incomplete. Students also need to understand when AI may be unhelpful, inappropriate or counterproductive for the learning purpose.
AI may not be the right tool when the purpose is to practise recall. It may not be the right tool when students need to wrestle with their own first ideas before receiving support. It may not be the right tool when the task requires personal experience, ethical reflection, cultural knowledge, emotional nuance or original interpretation. It may not be the right tool when using it would remove the productive struggle the task was designed to create.
This does not mean AI should be excluded from all of these moments. It means students need to understand the purpose of the learning before choosing the tool.
One sign of student agency is not simply using AI well. It is knowing when not to use it.
Teachers can help students develop this judgement by asking tool-choice questions before the task begins:
These questions help students see AI as a tool to be chosen intentionally, not a default response to difficulty. They also connect AI use to ownership. If students can explain when and why they used AI, they are more likely to remain responsible for the learning.
The easiest way to strengthen judgement is to build discernment into the prompt or task instructions. Instead of asking students to generate an answer with AI, ask them to do something with the answer that requires thinking.
The following prompts can be adapted across year levels and subject areas.
Use these when students need to examine differences between versions, sources or explanations.
Use these when students need to interrogate assumptions, omissions or weaknesses.
Use these when students need to test accuracy or reliability.
Use these when students need to revise for purpose, audience or context.
Use these when students need to make and justify a choice.
Use these when students need to think about appropriate use.
These prompts should not become a worksheet for every task. They are design options. The teacher’s role is to choose the prompt that matches the learning demand.
If the goal is accuracy, use verification. If the goal is discernment, use comparison and questioning. If the goal is communication, use adaptation. If the goal is ownership, use decision-making. If the goal is agency, ask students when AI is and is not the right tool.
When students are practising judgement, teachers should be able to see more than the final product. They should be able to see the student’s reasoning.
Useful evidence might include a comparison between the student’s first idea and AI’s response, annotations showing what the student questioned, notes showing which claims were verified, a revision explanation showing what changed and why, a short justification for accepting or rejecting AI suggestions, a reflection on when AI was useful or limited, or an explanation of why the student chose not to use AI for part of the task.
This evidence does not need to be elaborate. In many cases, a few sentences are enough. The point is not to create more paperwork. The point is to make the student’s judgement visible enough to support better feedback and deeper learning.
A useful test is whether the evidence helps the learning conversation. Can the teacher see what the student noticed? Can the student explain what they decided? Can both teacher and student identify how the work improved because of judgement rather than because of passive AI use?
If so, the task is doing more than producing an answer. It is developing a learner.
This article is part of the Visible Agency series.
For the broader framework, read Visible Agency: How to Design AI-Supported Learning Without Outsourcing Student Thinking.
For the evidence side of the work, read How to Make Student Thinking Visible When AI Is Part of the Process.
For the reflective side of AI-supported learning, read How to Use AI to Strengthen Metacognition.
For the ownership side of the work, read How to Design AI-Rich Tasks That Still Require Student Ownership.
To review a task before using it with students, use The Visible Agency Design Test.
AI should not simply give students more answers. It should create better opportunities for students to practise judgement.
When students compare, they learn to notice quality. When they question, they learn to recognise assumptions and gaps. When they verify, they learn that confidence is not evidence. When they adapt, they learn to shape ideas for purpose and context. When they decide, they learn to take responsibility.
This is how AI-supported learning can strengthen agency rather than weaken it.
The aim is not for students to become dependent on AI for better responses. The aim is for students to become more discerning, more responsible and more capable because of the thinking the task required them to do.
The question for teachers is not only, “Can students use AI for this task?”
The better question is, “What judgement will this task require students to practise?”
If the task requires students to compare, question, verify, adapt and decide, AI can become more than a shortcut. It can become a powerful surface for thinking.
That is where student judgement becomes visible.