
Administrative Law and AI's Overconfidence: What Regulated Organizations Need to Know


Jared Clark

March 29, 2026

Most discussions about AI risk in regulated industries circle around the same concerns: bias, data privacy, explainability, validation. Those are legitimate problems. But Cary Coglianese's March 2026 analysis in The Regulatory Review identifies something more specific — and more legally consequential — that deserves serious attention from every compliance officer, general counsel, and regulatory affairs team working in a regulated sector.

The problem isn't just that AI can be wrong. The problem is that AI is confidently wrong in precisely the ways that administrative law prohibits. And organizations that don't understand that distinction are building compliance exposure without realizing it.

What the Coglianese Analysis Actually Says — and Why It's Different From the Usual AI Risk Talk

Coglianese, a Penn Law professor and one of the more serious regulatory scholars working on AI governance, makes a deceptively simple argument: large language models generate plausible-sounding answers to questions they cannot actually answer with evidentiary grounding. He illustrates this with a striking example. When asked where the EPA should set ozone standards, ChatGPT confidently recommended 60 parts per billion. That recommendation has no evidentiary backing whatsoever. It is a word-prediction output dressed as a policy conclusion.

This isn't a story about hallucination in the abstract. It's a story about what happens when an organization — public or private — relies on that kind of output to justify a consequential decision and then has to defend that decision under legal scrutiny.

Google Gemini reportedly predicted college football game outcomes before the games were played — another illustration that LLMs generate text based on probability patterns, not factual grounding. But football games are low stakes. What happens when the same architecture generates a confident product safety determination, a formulary recommendation, or a contract risk assessment?

What makes Coglianese's analysis distinctive is that it anchors AI risk not in abstract ethics language but in the specific legal standards that govern consequential decisions. For regulated organizations, that matters because those same standards — or their private-sector analogs — likely apply to you too.

The DOGE/VA Case: A Compliance Failure in Real Time

The clearest illustration of what Coglianese is describing isn't hypothetical. ProPublica's investigation revealed that the Department of Government Efficiency deployed an AI tool built in roughly two days — by a software engineer with no healthcare experience — to flag VA contracts for cancellation. The results were a documented compliance disaster.

The tool inflated contract values dramatically; in some cases, contracts worth roughly $35,000 registered as $34 million. It flagged cancer treatment gene sequencers and patient care systems as cuttable. It identified contracts that, if terminated, would have locked veterans out of benefits systems permanently. Congressional oversight panels from both chambers cited the arbitrary nature of the cancellations. One expert's assessment was blunt: "AI is absolutely the wrong tool for this."

What's most instructive about this case isn't the scale of the error — it's the developer's own post-hoc admission. The person who built the tool later stated: "I would never recommend someone run my code and do what it says." That statement describes the gap that Coglianese is pointing to. An AI tool can generate confident, decisive-looking outputs that the person who built it does not believe should be acted on directly. When an organization does act on those outputs directly — without validation, without expert review, without structured analysis of alternatives — it has substituted AI confidence for institutional judgment.

That substitution is not a technology problem. It is a governance failure.

The Administrative Procedure Act's arbitrary-and-capricious standard requires federal agencies to do three things when making consequential decisions: consider the important aspects of the problem, assess policy alternatives, and make forecasts about how those alternatives would change real-world outcomes. Agencies cannot satisfy these requirements by pointing to an AI tool and saying "the model said so."

As the Harvard Law Review noted in its analysis of "Machine Rulemaking: Arbitrary and Capricious Review in the Age of AI," AI-assisted decisions require validation of algorithmic reliability and comparative analysis of alternatives — not just confidence in the output. The Yale Journal on Regulation's work on minimum administrative law standards for agency AI use goes further, establishing that when AI tools can be validated to perform reliably, failure to use them may also be arbitrary and capricious. The standard cuts both ways.

The Administrative Conference of the United States, in Statement 20 on Agency Use of Artificial Intelligence, addresses the disclosure requirements directly: agencies must disclose AI methods, assumptions, and data used in consequential decisions — analogous to the notice requirements under Section 553 of the APA. "Rubber-stamping the output of a tool known to be prone to error without explanation," as one legal commentator framed it, "would be arbitrary and capricious."

What this legal architecture describes is a documentation and reasoning standard. Consequential decisions require demonstrated consideration of evidence, not just a confident output from a model that has no evidentiary basis for its confidence.

This Isn't Just a Government Problem — Implications for Regulated Industries

Here is where most analysis of the Coglianese piece stops short. It focuses on federal agencies because the APA is a federal statute. But the same structural problem applies throughout regulated industries, and the legal exposure is real.

Consider pharma. FDA requires that regulatory decisions — product labeling changes, safety reporting determinations, manufacturing deviation assessments — be supported by documented analysis. The agency does not accept "our AI system flagged this" as a substitute for qualified expert judgment documented in your quality system. If a pharmacovigilance team uses an LLM to generate a safety narrative and submits it without qualified medical review, the question isn't whether the AI was right or wrong. The question is whether the required human review actually occurred and was documented. If it wasn't, you have a GxP compliance gap regardless of whether the output was accurate.

Consider medical devices. FDA's Software as a Medical Device framework and the Predetermined Change Control Plan approach both require that AI/ML-based decisions be traceable, validated, and subject to ongoing performance monitoring. An AI tool that generates confident-sounding clinical decision support recommendations without validated performance data in the intended patient population is not a validated device function. It is an uncontrolled process variation.

Consider financial services. Model risk management guidance from the OCC and Federal Reserve (SR 11-7) requires that models be validated, that their limitations be documented, and that their outputs not be used without appropriate challenge. An LLM generating credit risk assessments or AML determinations that bypasses the model validation framework — even as an "internal tool" — may create regulatory examination findings and enforcement exposure.

Consider federal contractors. When AI tools inform cost proposals, compliance certifications, or statements of work, the False Claims Act doesn't care that a model generated the numbers. The contractor signed the certification. The DOGE/VA case is instructive here precisely because the accountability problem doesn't disappear when the AI is on the contractor side of the relationship.

The Two-Direction Risk

The Yale Journal on Regulation's point about the failure to use AI deserves equal weight. The risk isn't only that organizations use AI irresponsibly — it is also that organizations categorically avoid AI tools that, if properly validated, could improve the quality and consistency of regulated decisions.

An organization that refuses to deploy a validated adverse event signal detection algorithm because it involves AI is making a different kind of error. If the tool can be shown to reliably detect signals that manual review misses, then not using it may itself represent a failure of required diligence. Regulators are not looking for AI abstinence. They are looking for defensible governance — which means knowing when AI is appropriate, deploying it properly when it is, and refusing to deploy it when it isn't.

The trap to avoid is binary thinking: either AI is fine or AI is dangerous. The actual standard is more precise. Is this tool validated for this use? Is it being used within its validated scope? Is human review occurring at the points where the APA — or its regulatory analog — requires documented expert judgment? Are the AI's assumptions, training data, and limitations disclosed to decision-makers?

These are governance questions, not technology questions. And they have answers.

What a Defensible AI Governance Framework Looks Like

Coglianese draws a useful line between permissible and impermissible AI uses. Drafting emails, processing public comments, administrative support — these are appropriate. Replacing required regulatory impact analysis or serving as the sole justification for a consequential policy decision — these are not.

Translating that principle into an organizational governance framework requires several specific components.

Use-Case Classification Before Deployment

Every proposed AI use case should be mapped against a classification framework that distinguishes administrative support from decision-support from decision-making. The further down that spectrum, the more validation, documentation, and human oversight the use case requires. This classification should happen before deployment, not after an audit finding.
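
To make that classification concrete, here is a minimal sketch in Python of what a pre-deployment classification record could look like. The tier names, control lists, and the AIUseCase structure are illustrative assumptions for this article, not a prescribed standard or any regulator's terminology.

```python
from dataclasses import dataclass, field
from enum import Enum

class UseTier(Enum):
    ADMINISTRATIVE_SUPPORT = 1   # e.g., drafting emails, summarizing meetings
    DECISION_SUPPORT = 2         # e.g., flagging records for qualified human review
    DECISION_MAKING = 3          # output would directly drive a consequential outcome

# Illustrative mapping of tiers to minimum controls; the control names are
# assumptions for this sketch, not regulatory requirements.
REQUIRED_CONTROLS = {
    UseTier.ADMINISTRATIVE_SUPPORT: ["usage policy", "output spot-checks"],
    UseTier.DECISION_SUPPORT: [
        "documented validation",
        "qualified human review",
        "limitation disclosure to reviewers",
    ],
    UseTier.DECISION_MAKING: [
        "legal/regulatory sign-off before deployment",
        "full validation package",
        "audit trail",
        "periodic performance review",
    ],
}

@dataclass
class AIUseCase:
    name: str
    tier: UseTier
    description: str
    controls: list = field(default_factory=list)

    def missing_controls(self) -> list:
        """Return required controls not yet documented for this use case."""
        return [c for c in REQUIRED_CONTROLS[self.tier] if c not in self.controls]

# Example: classify a proposed use case before deployment, not after an audit.
use_case = AIUseCase(
    name="LLM drafting of safety assessment narratives",
    tier=UseTier.DECISION_SUPPORT,
    description="LLM produces a first draft; a qualified medical reviewer finalizes it.",
    controls=["documented validation"],
)
print(use_case.missing_controls())
# ['qualified human review', 'limitation disclosure to reviewers']
```

The point of forcing the tier decision into a record like this is that the gap analysis becomes mechanical: a use case cannot move to deployment while its required controls are still missing.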

Validation Proportionate to Stakes

AI tools used in regulated contexts need validation that corresponds to the regulatory stakes involved. This does not mean every AI tool needs a 21 CFR Part 820-level validation package. It means the validation evidence needs to be sufficient to defend the claim that the tool performs reliably for its intended use. For an LLM used to draft internal memos, that bar is low. For an LLM used to generate safety assessment narratives, the bar is high — and human expert review is not optional.

Documented Human Oversight at Decision Points

The APA's arbitrary-and-capricious standard requires that decision-makers actually consider the important aspects of the problem. In practice, this means AI outputs need to flow through a documented review process where a qualified human either accepts, modifies, or overrides the AI recommendation — and that decision is recorded. A signature on a form that no one reads is not a governance control. It is liability without protection.
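
As a minimal sketch of what that record could capture, assuming a simple append-only audit log, the Python below defines a review record and writes it as a JSON line. The field names, decision values, tool identifiers, and SOP reference are hypothetical illustrations, not a validated format.

```python
from dataclasses import dataclass, asdict
from datetime import datetime, timezone
from enum import Enum
import json

class ReviewDecision(Enum):
    ACCEPTED = "accepted"
    MODIFIED = "modified"
    OVERRIDDEN = "overridden"

@dataclass
class HumanReviewRecord:
    ai_tool: str                 # which system produced the output
    ai_output_ref: str           # pointer to the stored AI output
    reviewer: str                # reviewer of record
    reviewer_qualification: str  # basis for the reviewer's qualification
    decision: ReviewDecision     # accepted, modified, or overridden
    rationale: str               # the independent judgment that was applied
    timestamp: str               # when the review occurred (UTC, ISO 8601)

def record_review(log_path: str, record: HumanReviewRecord) -> None:
    """Append one review decision to a JSON-lines audit log."""
    entry = asdict(record)
    entry["decision"] = record.decision.value  # store the enum as plain text
    with open(log_path, "a", encoding="utf-8") as f:
        f.write(json.dumps(entry) + "\n")

# Hypothetical example: a pharmacovigilance reviewer modifies an AI draft.
record_review("ai_review_log.jsonl", HumanReviewRecord(
    ai_tool="internal-llm-drafting-v2",
    ai_output_ref="doc-2026-0418-rev1",
    reviewer="j.smith",
    reviewer_qualification="PV physician, delegated reviewer per internal SOP",
    decision=ReviewDecision.MODIFIED,
    rationale="Causality wording revised; the AI draft overstated certainty.",
    timestamp=datetime.now(timezone.utc).isoformat(),
))
```

Whatever system actually holds the record, the essential elements are the same: who reviewed, on what authority, what they decided, and why.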

Disclosure of AI Involvement and Limitations

When AI contributes to a regulatory submission, a safety report, or any externally consequential document, the organization should be able to describe what AI was used, what it was used for, what its known limitations are, and what human review occurred. This is not yet uniformly required across regulated sectors, but the trajectory is clear. Regulators are moving toward disclosure expectations. Organizations that build disclosure discipline now will have an advantage over those that retrofit it under regulatory pressure.
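
One way to build that discipline early is to assemble the disclosure from structured fields every time AI contributes to a regulated document. The sketch below uses assumed field names and wording to show the idea; the content should be adapted to the expectations of the relevant regulator and submission type.

```python
def ai_disclosure(tool: str, version: str, role: str,
                  limitations: list, human_review: str) -> str:
    """Assemble a plain-language AI-involvement disclosure for a regulated document.

    The field names and sentence structure here are illustrative, not a
    required or endorsed disclosure format.
    """
    lims = "; ".join(limitations)
    return (
        f"AI involvement: {tool} (version {version}) was used for {role}. "
        f"Known limitations considered: {lims}. "
        f"Human review: {human_review}."
    )

# Hypothetical example for a submission cover memo.
print(ai_disclosure(
    tool="general-purpose LLM",
    version="2026-03",
    role="first-draft generation of the background section",
    limitations=[
        "may assert facts without evidentiary support",
        "not validated for clinical or legal judgment",
    ],
    human_review="full review and revision by the responsible regulatory scientist",
))
```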

AI Governance Integrated Into Existing Quality Systems

The most durable governance structures don't create a parallel AI oversight system. They integrate AI governance into existing quality management frameworks — change control, CAPA, document control, training, supplier qualification. An LLM used in a GxP-adjacent process is a software system with a supplier, a validated scope, a change control history, and a periodic review cycle. Treating it as anything less is how governance gaps form.

What Regulated Organizations Should Do Right Now

The Coglianese analysis is a useful forcing function. It names the specific legal standard — arbitrary and capricious — that AI overconfidence violates, and it connects that standard to real-world examples that regulators and courts will find instructive. For regulated organizations, the immediate actions are straightforward:

Inventory your AI use cases. Start with any AI tool that touches a regulatory submission, a safety determination, a compliance certification, or a contract obligation. These are the use cases where the arbitrary-and-capricious logic applies most directly. If you don't know what AI is being used in these areas, that gap is itself a governance finding.
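
If the inventory lives in something as simple as a spreadsheet export, even a short script can surface the high-exposure use cases. The sketch below assumes a hypothetical CSV layout with name, touches, and owner columns; the trigger categories mirror the areas listed above.

```python
import csv

# Areas where the arbitrary-and-capricious logic applies most directly;
# the category labels are assumptions matching the list in the text.
HIGH_EXPOSURE = {
    "regulatory submission",
    "safety determination",
    "compliance certification",
    "contract obligation",
}

def high_exposure_use_cases(inventory_path: str) -> list:
    """Return inventory rows whose AI use touches a high-exposure area.

    Assumes a CSV with 'name', 'touches' (semicolon-separated), and 'owner' columns.
    """
    flagged = []
    with open(inventory_path, newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f):
            touches = {t.strip().lower() for t in row["touches"].split(";")}
            if touches & HIGH_EXPOSURE:
                flagged.append(row)
    return flagged

# Example row in a hypothetical ai_inventory.csv:
# name,touches,owner
# LLM adverse event narrative drafting,safety determination,Pharmacovigilance
#
# flagged = high_exposure_use_cases("ai_inventory.csv")
```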

Assess each use case against the permissible/impermissible line. Coglianese's framework is a reasonable starting point: is the AI providing administrative support, or is it substituting for required expert analysis? If the latter, what validation exists, and what documented human review is occurring?

Document the human judgment layer. For any AI-assisted decision that could face regulatory scrutiny, the documentation trail should show that a qualified human reviewed the AI output, applied independent judgment, and made the final determination. The AI's role should be described accurately — as a drafting aid, as a signal generator, as an analytical starting point — not as the decision-maker.

Build AI disclosure into your submission and reporting practices. Where AI contributes to regulated documents, describe its role. Regulators are increasingly asking. Organizations that have thought through their answer in advance are in a stronger position than those that haven't.

Connect your AI governance to existing compliance frameworks. ISO 42001 provides a management system structure designed for exactly this kind of AI governance work. The NIST AI Risk Management Framework offers a complementary risk assessment methodology. Neither replaces sector-specific regulatory requirements — but both provide defensible scaffolding for demonstrating that your AI use is governed, not improvised.

The underlying question that Coglianese's analysis surfaces is not whether AI is useful. It clearly is, across regulated industries. The question is whether organizations can demonstrate, when scrutinized, that their AI use meets the same evidentiary and reasoning standards that apply to every other consequential decision they make. The organizations that answer yes — with documentation to back it up — are in a defensible position. The ones that can't are carrying exposure they may not have mapped.

At Regulated AI Consulting, this is the work we do with regulated organizations: mapping AI use against governance requirements, identifying the gaps between current practice and defensible practice, and building the documentation infrastructure that lets you use AI confidently without creating new liability. If you're working through any of the questions this article raises, a consultation is the right place to start.


Jared Clark

AI Governance Advisor, JD, MBA, RAC

Jared Clark advises regulated organizations on AI governance, risk management, and compliance frameworks. He is the founder of Regulated AI Consulting and Certify Consulting, and works with clients in pharma, medical devices, financial services, and federal contracting.