January 17, 2026

AI doesn't hallucinate. You do.

I fired a consultant last week.

He built an automation for us that was supposed to summarize industry trends. On day three, the bot reported a "major acquisition" between two of my client’s competitors. The report set off huge confusion, dozens of unnecessary emails, and hours of costly strategic panic.

There was just one problem: The acquisition never happened.

When I confronted the consultant, he shrugged and said, "Yeah, well, you know how it is. The AI hallucinated. It’s a black box."

I fired him.

I didn't fire him because the AI made a mistake. I fired him because he blamed the tool.

(Rule #1 of the AI OS: Don't use AI if you can't OWN the final result.)

We need to have a serious, "adults-in-the-room" conversation about the "H-word."

We keep calling incorrect outputs "hallucinations." It’s the most dangerous word in the industry right now because it gives you a pass. It implies the AI has a psyche that is "breaking," or that it’s having a fever dream.

If you want to build a real AI Operating System, you need to stop thinking like a psychologist and start thinking like an engineer.

Here is the cold, hard truth: AI models cannot hallucinate, because they do not know what the truth is.

And if you don't understand how they lie, you can never stop them.

The Technical Reality: The "Blurry JPEG"

To fix this, we have to get slightly technical.

In 2023, sci-fi writer Ted Chiang gave the perfect analogy for Large Language Models (LLMs): A blurry JPEG of the web.

Imagine you take a screenshot of a New York Times article. Then you compress it. You zip it, shrink it, and lower the resolution until it is a blurry mess of pixels.

Now, you ask a computer: "Reconstruct the article from this blur."

The computer looks at the fuzzy pixels. It sees a shape that looks like the word "President." It sees a blur that looks like "signed." It sees a blob that looks like "Treaty."

It reconstructs the sentence: "The President signed the treaty."

It looks perfect. But maybe the original article said, "The President vetoed the treaty."

The computer didn't "hallucinate." It didn't have a mental breakdown. It successfully reconstructed a plausible pattern from lossy data.

LLMs are not databases. They are probabilistic engines. They don't look up facts; they predict the next most likely token (word-part) based on the statistical patterns they learned during training.

If you ask an LLM about a court case that doesn't exist, and the statistical weight of the words "v." and "Supreme Court" is high enough in that context, it will invent a case citation.

It is choosing Plausibility over Accuracy.
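
To make the mechanism concrete, here is a toy sketch in Python. The tokens and probabilities are invented purely for illustration; real models work over vast vocabularies, but the core move is the same: sample whatever continuation carries the most statistical weight, without ever consulting a source.

```python
import random

# Toy illustration only: these tokens and probabilities are made up to show
# the mechanism, not taken from any real model.
next_token_probs = {
    "signed": 0.46,    # the most statistically plausible continuation
    "vetoed": 0.31,    # what the source actually said
    "ignored": 0.14,
    "shredded": 0.09,
}

def predict_next_token(probs):
    """Pick the next token by statistical weight, not by checking any source."""
    tokens, weights = zip(*probs.items())
    return random.choices(tokens, weights=weights, k=1)[0]

print("The President", predict_next_token(next_token_probs), "the treaty.")
# Nearly half the time this prints "signed" -- plausible, fluent, and wrong.
```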

The "BS" of Prompting

The vast majority of people try to fix this by "begging" the AI.

"Please don't lie."

"Only tell the truth."

"Be accurate."

This is "Vibe Hoping."

It’s nonsense. You cannot ask a probabilistic engine to stop using probability. If you want accuracy, you have to build a Constraint Architecture.

The Solution: The "Truth Architecture"

In the AI OS, we don't trust. We verify.

If you are using AI for anything that matters—legal, financial, medical, or strategic—you must implement this 4-step protocol.

Let me be clear: you can never make a probabilistic engine deterministic. You can only make accuracy a highly probable event.

Phase 1: The Context Cage (RAG)

Never ask an AI to answer a question using its "brain" (its training data). Its brain is the Blurry JPEG.

Instead, force it to answer using only the data you provide. This is called "Grounding."

  • The Amateur Prompt: "Write a report on the Q3 financial performance of Tesla." (The AI will pull from its blurry memory, which cuts off at its training date, and likely invent numbers).
  • The AI OS Prompt: "I have pasted the Q3 Earnings Transcript below. You are to answer the user's question using ONLY this text. Do not use outside knowledge. If the answer is not in the text, state 'Data Not Available'."

You are putting the AI in a cage. You are limiting its ability to "guess" by restricting its source material.
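
In code, the cage is just disciplined prompt assembly. Here is a minimal sketch; `call_llm` is a hypothetical stand-in for whichever model client you actually use, and the exact wording is adaptable.

```python
# Minimal sketch of a "Context Cage" prompt. `call_llm` is a hypothetical helper
# that wraps whatever model API you use; swap in your own client.

GROUNDED_PROMPT = """You are a financial analyst.
Answer the user's question using ONLY the source text below.
Do not use outside knowledge.
If the answer is not in the source text, reply exactly: Data Not Available.

SOURCE TEXT:
{source}

QUESTION:
{question}
"""

def ask_grounded(call_llm, source: str, question: str) -> str:
    """Force the model to answer from the provided source, not its training data."""
    prompt = GROUNDED_PROMPT.format(source=source, question=question)
    return call_llm(prompt)
```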

Phase 2: The "Null Token" Requirement

The biggest reason AI lies is that it thinks it must answer you to be helpful. You have to explicitly program the "escape hatch."

In every single prompt I write, I include this line:

"If you are not 100% certain of the answer based on the provided context, you must reply with 'NULL'. Do not attempt to derive an answer."

This sounds simple, but it changes everything. You are adjusting the "incentive structure" of the prompt. You are rewarding the AI for silence rather than fluency.
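
Here is a rough sketch of how that clause can be wired in. Again, `call_llm` is a hypothetical stand-in, and the NULL convention is just one way to implement the escape hatch.

```python
# Sketch of the "Null Token" escape hatch. The NULL convention and the
# `call_llm` helper are assumptions; adapt them to your own stack.

NULL_CLAUSE = (
    "If you are not 100% certain of the answer based on the provided context, "
    "you must reply with 'NULL'. Do not attempt to derive an answer."
)

def ask_with_escape_hatch(call_llm, prompt: str):
    """Append the escape hatch and treat NULL as 'no answer', never as content."""
    answer = call_llm(prompt + "\n\n" + NULL_CLAUSE).strip()
    if answer.upper() == "NULL":
        return None  # route to a human or a better source instead of publishing
    return answer
```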

Phase 3: The Split (Generator vs. Verifier)

This is the most common mistake I see: people use the same prompt to both generate the content and check the facts.

An LLM is terrible at checking its own work. It has a "consistency bias"—it tends to double down on its own errors to maintain coherence.

You need to split the roles into two separate agents (or prompt steps). We call this Adversarial Verification.

Let’s say you are generating a newsletter draft.

  • Agent 1 (The Writer): Writes the draft.
  • Agent 2 (The Lawyer): You feed the draft and the original source material to a separate prompt.

The "Lawyer" Prompt:

"You are an Auditor. Your goal is to find errors. Compare the Draft Text below against the Source Text below. List every claim made in the Draft. Verify if it is supported by the Source. If a claim is unsupported, flag it as a HALLUCINATION."

By separating the "Writer" from the "Critic" (the "Red Teaming" step), our error rate dropped from ~15% to near 0%.
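
A sketch of that split as two separate calls is below. The auditor wording is adaptable and `call_llm` is again a hypothetical stand-in, but the rule holds: the draft and the audit never come from the same call.

```python
# Sketch of Adversarial Verification: one prompt writes, a separate prompt audits.

AUDITOR_PROMPT = """You are an Auditor. Your goal is to find errors.
Compare the DRAFT below against the SOURCE below.
List every claim made in the DRAFT and state whether it is supported by the SOURCE.
If a claim is unsupported, flag it as a HALLUCINATION.

SOURCE:
{source}

DRAFT:
{draft}
"""

def write_then_audit(call_llm, writer_prompt: str, source: str) -> dict:
    """Run the Writer and the Lawyer as two separate calls, never one."""
    draft = call_llm(writer_prompt)                                      # Agent 1: the Writer
    audit = call_llm(AUDITOR_PROMPT.format(source=source, draft=draft))  # Agent 2: the Lawyer
    return {"draft": draft, "audit": audit}  # publish only after reviewing the audit
```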

Phase 4: The "Calculator Clause" (Deterministic Fallback)

Finally, if you are doing math, data analysis, or logic, do not let the LLM do the math in its head.

LLMs are bad at math for the same reason they are good at poetry: they predict the next word; they don't calculate values.

  • The BS: Asking ChatGPT "What is 13,490 divided by 23?" and trusting the text it spits out.
  • The AI OS: Asking ChatGPT to "Write and execute a Python script to calculate 13,490 divided by 23."

When the AI writes code, it shifts from Probabilistic (guessing the word) to Deterministic (running the math). Code doesn't hallucinate. It either runs or it errors.
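
For illustration, this is the kind of script the model should be writing and executing instead of guessing. How you sandbox and run model-generated code depends on your stack; the point is that the number comes from arithmetic, not from next-word prediction.

```python
# The deterministic path: the model writes code like this, and the code -- not
# the model's next-word guess -- produces the number. In production, run
# model-generated code in a sandboxed interpreter, never a blind exec().

def divide(numerator: float, denominator: float) -> float:
    """Plain arithmetic: it either runs correctly or raises an error."""
    return numerator / denominator

result = divide(13_490, 23)
print(round(result, 4))  # 586.5217 -- the same answer every run, no probability involved
```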

The New Standard

Stop accepting "hallucination" as the cost of doing business. It is a sign of lazy architecture.

If your system is making things up:

  1. You aren't grounding it (Context Cage).
  2. You aren't giving it an escape hatch (Null Token).
  3. You aren't auditing it (Red Team).
  4. You are asking a poet to do a calculator's job (Code Fallback).

We don't need "smarter" models to fix this. We need smarter operators.

Next Step:

Take your most complex prompt. Add a "Verification Step" to the end of it. Paste the output back into the chat and say: "Compare your response above to the source text provided. List any sentence that is not explicitly supported by the source text."

Until next time.

— Charafeddine (CM)
