Why AI Is a Different Kind of Tool: (1) The Judgment Gap

AI gives you the conclusion, pre-formatted as authoritative, rather than just giving you the data to decide for yourself

May 31, 2026

A hiring manager opens a candidate file. The hiring system has already reviewed the application, the CV, and the structured interview notes from the previous rounds. It returns a recommendation, “not a strong match,” along with a confidence score and a structured summary: communication style flagged as misaligned with the team profile, technical screen below threshold for the role’s seniority band, two competencies marked uncertain in the absence of further evidence.

The manager reads through the file. They look at the writing samples, the screener’s notes, the candidate’s prior roles. The manager has made dozens of hiring calls before; some they were proud of, some they weren’t. After a few minutes, they make a decision: decline, move to the next candidate.

Here is what I want to ask about that moment. What kind of act was it?

The manager would say they reviewed the evidence and made a hiring call. And in one sense, that’s correct: they looked at the file, they applied their experience, they decided. But if we look at this case more carefully, if we let the structure of what happened reveal itself, we will see that the question they were actually answering was not “is this person right for the role?” It was “does the AI’s assessment seem right here?” Those are, obviously, not the same question. The first requires you to reason from the evidence to a conclusion. The second requires you to evaluate a conclusion that arrived before you did. The manager was performing an act of ratification, not a genuine act of judgment. The difference, I argue, is real, even if the outcome is identical.

This is the thing I want to make precise in this post. I’ve been calling it the judgment gap.

The simplest way into this distinction is through what makes a calculator different from the kind of AI system I’m describing.

When you use a calculator, you get a number, and that number is data. You still have to decide what to do with it: what it means in context, how it bears on the decision you’re making, whether it changes anything. The calculator does the arithmetic while you do the rest. This is true of most tools we use: a thermometer gives you a temperature, a scale gives you a weight, a search engine gives you results. The output is raw material that you reason through yourself toward a conclusion that is genuinely yours, even if it is not unique or especially original. The tool provides a premise; you provide the judgment.

AI outputs in judgment domains don’t work this way. The hiring system doesn’t give the manager a tabulation of competency scores to reason about; it gives them “not a strong match.” The clinical decision support system doesn’t give the physician a blood pressure reading; it gives them “consider aggressive treatment given patient profile.” The legal AI doesn’t return the relevant statutes; it returns “the strongest argument available is X.” The content moderation tool doesn’t return a list of policy clauses the post might engage; it returns “remove: violates policy.” You get the picture. In each case, the output already incorporates what deliberation would have produced. It arrives formatted as a conclusion, carrying the implicit authority of a system that processed more signals than any human could review. The output is not a premise. It is the product of reasoning, packaged for uptake.

The distinction I’m drawing is between an output that enters your thinking as something to reason with and one that enters your thinking as the output your reasoning would have reached. Prior tools, almost without exception, produced the first kind of output. AI, in judgment domains, produces the second.

Of course, the obvious response from my learned friends is this: humans can override. The manager, doctor, lawyer, or support specialist can disagree with the recommendation, push back, and make their case against the score. The AI is not making the final call, after all; a person is. And this is observably true. The formal architecture does preserve human decision-making authority.

But the question I am asking isn’t whether override is technically available. It’s what kind of cognitive act is being performed when the manager, lawyer, or doctor looks at the file. If the conclusion arrived before they did, their task is to evaluate the conclusion rather than form one. These look the same from outside: in both cases, a human reviews information and makes a call. The structure, however, is different: one involves reasoning from evidence to judgment; the other involves assessing a judgment that was already reached. The first is the act we mean when we say a person decided. The second is something else (call it review, endorsement, or auditing), and it’s not quite decision-making in its most genuine sense, even though it produces an output that looks exactly like one.

The override possibility doesn’t close the gap. If anything, it makes the gap harder to see, because the formal architecture of human authority is preserved while the substance of the reasoning act changes underneath it.

Now, here’s the part that matters for the longer argument.

A radiologist who has spent five years reviewing AI analyses of imaging studies develops something real, even if intangible. They learn when the AI is right and when it’s off, what kinds of cases it tends to miss, how to read the gap between the AI output and the clinical picture. That is genuine expertise, and it is valuable. It is not the same expertise as five years of reading radiology films directly. Both practitioners are competent. However, the character of their competence is different.

The one who read radiology films directly for five years developed the capacity to form independent judgments: to look at an image, without prior framing, and arrive at a clinical conclusion through their own perceptual and interpretive work. The one who reviewed AI analyses for five years developed the capacity to evaluate judgments already formed: to recognize when the AI was confident and wrong, when the exception should override the pattern, when to escalate. The second capacity is useful but only while the first one remains available as a check on the AI. It is less useful if the first capacity was never developed in the first place.

The skill that develops from five years of reviewing AI conclusions is not the same skill as the one that develops from five years of forming the underlying judgments. We are making a large, quiet substitution between them, and we are already doing it across radiology, law, hiring, moderation, clinical medicine, financial decision-making, and dozens of other judgment-intensive domains without clearly tracking which skill we’re building and which we’re declining to build. The professional identity says “I make the decisions.” The structure of the work increasingly says “I ratify someone else’s decisions.” The gap between those two descriptions is where something important is happening, and it is happening without anyone having decided it should.

The manager who opens that file tomorrow will be slightly more fluent at evaluating AI recommendations than the manager who opened one last year. They will also have had one fewer occasion to form the underlying judgment themselves. Neither change is dramatic. Neither is visible. But they compound: daily, across careers, across an entire professional generation.

This is the first of four things I want to say about what makes AI specifically different from prior tools. Last month I described the overall structure: the difference between technologies that give you more to work with and ones that do the work. This post goes deeper into what exactly that substitution is: not the replacement of a function but the pre-emption of the reasoning act that produces the output. The reasoning still happens, but it happens elsewhere, before the human even enters the picture.

Next month, I want to explore why scope matters: what changes when the substitution doesn’t stay in one domain but follows you into every domain simultaneously.

Justas Petronis
